CHALLENGE: Unpivot financial data (reverse pivot)
I have a challenge for this awesome community.... has anyone successfully found a way to unpivot data? I need to take columnar data ($ per month with each month in a column) to become row data (each project and month on a row).
I need to do this in order to mesh user-entered horizontal forecasts, with Oracle-sourced vertical actuals. I do realize I could pivot the Oracle vertical data and then run some comparisons, but I feel like there's a more elegant way to do this that wouldn't rely on hard-coding months or other pieces. I can think of other use cases for this, like a timecard entry for example.
Starting Data
The GOAL!
Best Answer
-
Ok I got this figured out thanks to @Genevieve P. - a method for unpivoting any data without having to preset an unpivot sheet with values. It's a bit of a long setup, but it works! You do have to manipulate the source data and the result sheet a little bit to get this to work, so it's not quite as easy as a simple ingest and go. But it's close.
The method below uses a bunch of helper columns. There's probably a cleaner method for combining these helper formulas into one big ol' nasty formula to do it all...but I prefer to have it broken out so I can see where things break.
Step 1: Add reference row to data
To your data sheet with the pivoted data on it (Projects and Months in this example), add a row to the top that duplicates the column names on the row
Step 2: Setup the Unpivot sheet
- Add a Checkbox or Text column. You'll need to put something in this column, or check the checkboxes, to create rows to receive the data. You'll need to create the number of rows that equals the number of value columns (months) X the number of data row (projects). 9 rows for this example (3 projects X 3 months).
- Add an Auto-Number column called Auto
- Add a Text/Number column called Row with a column formula: =MATCH(Auto@row, Auto:Auto, 0)
- Add a Text/Number column called Value Column Count with a column formula: =COUNT({Top Row}) - 1 . The {Top Row} reference is the full 1st row of the data sheet (the row that says "Top Row" and is in grey in the screenshot above).
- Add a Text/Number column called Data Row with a column formula: =ROUNDUP(Row@row / [Value Column Count]@row) + 1
- Add a Text/Number column called Data Column Number with a column formula: =COUNTIFS([Data Row]:[Data Row], [Data Row]@row, Row:Row, <=Row@row) + 1
- Add a Text/Number column to hold your primary row value, Project in this case, with a column formula: =INDEX({Full Range}, [Data Row]@row, 1). The {Full Range} reference is all columns of the data sheet (Project, January, February, March).
- Add a Text/Number column to hold your column names, Month in this case, with a column formula: =INDEX({Top Row}, 1, [Data Column Number]@row)
- Add a Text/Number column called Value to hold your values with a column formula: =INDEX({Full Range}, [Data Row]@row, [Data Column Number]@row)
If all is setup correctly you should see results like this:
Explanation
- Row: this formula gives you a row number that is always accurate, even if you move or remove rows.
- Value Column count: this is counting the number of columns in the data sheet, using the 1st row that you setup. Because the first column in the data is the row value (Project), we subtract 1 from the count to get the number of columns with actual values in them.
- Data Row: this is creating a repeating number for us to use in the final INDEX formula. It takes the row number and divides by the number of value columns to get a fraction, then rounds that up. If you add more columns to the data (say-more months) this will auto-adjust to give you each data row number repeated for the number of columns. We add +1 to start the data row count on row 2 of the data, since row 1 is the "Top Row" line that doesn't have any data in it.
- Data Column Number: this creates a sequence of column numbers for each data row set. For example, we want to get data row 2, column 2 on the first row of results, then data row 2, column 3 on the next row of results, and so on. This works by counting up the number of result rows that have already been established for the current data row and adding 1.
- Project: This is an INDEX formula that uses our Data Row column to find the Project name on that row in the data. Because we know Project is always the first column in the data, I've "fixed" the column index to 1. You could also create a standalone reference just to the Project column, instead of reusing the {Full Range} reference if you wanted to.
- Month: This INDEX formula reads the helper "Top Row" and reads the names of each value column (the months in this example)
- Value: Finally, we're able to use the Data Row and Data Column numbers to get the actual values that we're looking for.
Answers
-
If you manually set up the Project and Month columns, then you could use an INDEX(MATCH(MATCH to look into your original sheet and bring back the appropriate data in your VALUE column.
To do this, you'd need to add one redundant row in your first sheet, a Top Row, that re-writes the column names (the months) so that the formula can read that cell and match what column data to bring back.
Then you could do something like this:
=INDEX({Full Sheet}, MATCH(Project@row, {Project Column}, 0), MATCH(Month@row, {Top Row}, 0)
The matching Project finds the correct row to look in, and the helper top row allows the second Match to find what column to bring back.
There's an example of this type of formula in the Project Management Office Template set, where the "Project Intake Sheet" has the helper top row and the "Project Metadata" has the formula.
Cheers,
Genevieve
Join us at Smartsheet ENGAGE 2024 🎉
October 8 - 10, Seattle, WA | Register now -
@Genevieve P. Brilliant! I thought Match always gave a row number, but based on this, it looks like it's will give the location in a row? This solution seems really great, thank you!
I guess the next question is how to do this without having to pre-set the data sheet with the project names and months by hand. Or if the data isn't projects and months, and is changing a lot, some way to set the data sheet automatically.
-
Ok I got this figured out thanks to @Genevieve P. - a method for unpivoting any data without having to preset an unpivot sheet with values. It's a bit of a long setup, but it works! You do have to manipulate the source data and the result sheet a little bit to get this to work, so it's not quite as easy as a simple ingest and go. But it's close.
The method below uses a bunch of helper columns. There's probably a cleaner method for combining these helper formulas into one big ol' nasty formula to do it all...but I prefer to have it broken out so I can see where things break.
Step 1: Add reference row to data
To your data sheet with the pivoted data on it (Projects and Months in this example), add a row to the top that duplicates the column names on the row
Step 2: Setup the Unpivot sheet
- Add a Checkbox or Text column. You'll need to put something in this column, or check the checkboxes, to create rows to receive the data. You'll need to create the number of rows that equals the number of value columns (months) X the number of data row (projects). 9 rows for this example (3 projects X 3 months).
- Add an Auto-Number column called Auto
- Add a Text/Number column called Row with a column formula: =MATCH(Auto@row, Auto:Auto, 0)
- Add a Text/Number column called Value Column Count with a column formula: =COUNT({Top Row}) - 1 . The {Top Row} reference is the full 1st row of the data sheet (the row that says "Top Row" and is in grey in the screenshot above).
- Add a Text/Number column called Data Row with a column formula: =ROUNDUP(Row@row / [Value Column Count]@row) + 1
- Add a Text/Number column called Data Column Number with a column formula: =COUNTIFS([Data Row]:[Data Row], [Data Row]@row, Row:Row, <=Row@row) + 1
- Add a Text/Number column to hold your primary row value, Project in this case, with a column formula: =INDEX({Full Range}, [Data Row]@row, 1). The {Full Range} reference is all columns of the data sheet (Project, January, February, March).
- Add a Text/Number column to hold your column names, Month in this case, with a column formula: =INDEX({Top Row}, 1, [Data Column Number]@row)
- Add a Text/Number column called Value to hold your values with a column formula: =INDEX({Full Range}, [Data Row]@row, [Data Column Number]@row)
If all is setup correctly you should see results like this:
Explanation
- Row: this formula gives you a row number that is always accurate, even if you move or remove rows.
- Value Column count: this is counting the number of columns in the data sheet, using the 1st row that you setup. Because the first column in the data is the row value (Project), we subtract 1 from the count to get the number of columns with actual values in them.
- Data Row: this is creating a repeating number for us to use in the final INDEX formula. It takes the row number and divides by the number of value columns to get a fraction, then rounds that up. If you add more columns to the data (say-more months) this will auto-adjust to give you each data row number repeated for the number of columns. We add +1 to start the data row count on row 2 of the data, since row 1 is the "Top Row" line that doesn't have any data in it.
- Data Column Number: this creates a sequence of column numbers for each data row set. For example, we want to get data row 2, column 2 on the first row of results, then data row 2, column 3 on the next row of results, and so on. This works by counting up the number of result rows that have already been established for the current data row and adding 1.
- Project: This is an INDEX formula that uses our Data Row column to find the Project name on that row in the data. Because we know Project is always the first column in the data, I've "fixed" the column index to 1. You could also create a standalone reference just to the Project column, instead of reusing the {Full Range} reference if you wanted to.
- Month: This INDEX formula reads the helper "Top Row" and reads the names of each value column (the months in this example)
- Value: Finally, we're able to use the Data Row and Data Column numbers to get the actual values that we're looking for.
-
Oh my goodness! I stepped away for a moment and came back and WOW! 🤩
Very impressive - thanks so much for sharing your solution.
Join us at Smartsheet ENGAGE 2024 🎉
October 8 - 10, Seattle, WA | Register now -
@Genevieve Choquette thanks! It was your solution that broke through the issue for me. So thank YOU.
Help Article Resources
Categories
- All Categories
- 14 Welcome to the Community
- Smartsheet Customer Resources
- 62.9K Get Help
- 378 Global Discussions
- 208 Industry Talk
- 440 Announcements
- 4.5K Ideas & Feature Requests
- 139 Brandfolder
- 129 Just for fun
- 130 Community Job Board
- 449 Show & Tell
- 30 Member Spotlight
- 1 SmartStories
- 289 Events
- 33 Webinars
- 7.3K Forum Archives
Check out the Formula Handbook template!