CHALLENGE: Unpivot financial data (reverse pivot)
I have a challenge for this awesome community.... has anyone successfully found a way to unpivot data? I need to take columnar data ($ per month with each month in a column) to become row data (each project and month on a row).
I need to do this in order to mesh user-entered horizontal forecasts, with Oracle-sourced vertical actuals. I do realize I could pivot the Oracle vertical data and then run some comparisons, but I feel like there's a more elegant way to do this that wouldn't rely on hard-coding months or other pieces. I can think of other use cases for this, like a timecard entry for example.
Starting Data
The GOAL!
Best Answer
-
Ok I got this figured out thanks to @Genevieve P. - a method for unpivoting any data without having to preset an unpivot sheet with values. It's a bit of a long setup, but it works! You do have to manipulate the source data and the result sheet a little bit to get this to work, so it's not quite as easy as a simple ingest and go. But it's close.
The method below uses a bunch of helper columns. There's probably a cleaner method for combining these helper formulas into one big ol' nasty formula to do it all...but I prefer to have it broken out so I can see where things break.
Step 1: Add reference row to data
To your data sheet with the pivoted data on it (Projects and Months in this example), add a row to the top that duplicates the column names on the row
Step 2: Setup the Unpivot sheet
- Add a Checkbox or Text column. You'll need to put something in this column, or check the checkboxes, to create rows to receive the data. You'll need to create the number of rows that equals the number of value columns (months) X the number of data row (projects). 9 rows for this example (3 projects X 3 months).
- Add an Auto-Number column called Auto
- Add a Text/Number column called Row with a column formula: =MATCH(Auto@row, Auto:Auto, 0)
- Add a Text/Number column called Value Column Count with a column formula: =COUNT({Top Row}) - 1 . The {Top Row} reference is the full 1st row of the data sheet (the row that says "Top Row" and is in grey in the screenshot above).
- Add a Text/Number column called Data Row with a column formula: =ROUNDUP(Row@row / [Value Column Count]@row) + 1
- Add a Text/Number column called Data Column Number with a column formula: =COUNTIFS([Data Row]:[Data Row], [Data Row]@row, Row:Row, <=Row@row) + 1
- Add a Text/Number column to hold your primary row value, Project in this case, with a column formula: =INDEX({Full Range}, [Data Row]@row, 1). The {Full Range} reference is all columns of the data sheet (Project, January, February, March).
- Add a Text/Number column to hold your column names, Month in this case, with a column formula: =INDEX({Top Row}, 1, [Data Column Number]@row)
- Add a Text/Number column called Value to hold your values with a column formula: =INDEX({Full Range}, [Data Row]@row, [Data Column Number]@row)
If all is setup correctly you should see results like this:
Explanation
- Row: this formula gives you a row number that is always accurate, even if you move or remove rows.
- Value Column count: this is counting the number of columns in the data sheet, using the 1st row that you setup. Because the first column in the data is the row value (Project), we subtract 1 from the count to get the number of columns with actual values in them.
- Data Row: this is creating a repeating number for us to use in the final INDEX formula. It takes the row number and divides by the number of value columns to get a fraction, then rounds that up. If you add more columns to the data (say-more months) this will auto-adjust to give you each data row number repeated for the number of columns. We add +1 to start the data row count on row 2 of the data, since row 1 is the "Top Row" line that doesn't have any data in it.
- Data Column Number: this creates a sequence of column numbers for each data row set. For example, we want to get data row 2, column 2 on the first row of results, then data row 2, column 3 on the next row of results, and so on. This works by counting up the number of result rows that have already been established for the current data row and adding 1.
- Project: This is an INDEX formula that uses our Data Row column to find the Project name on that row in the data. Because we know Project is always the first column in the data, I've "fixed" the column index to 1. You could also create a standalone reference just to the Project column, instead of reusing the {Full Range} reference if you wanted to.
- Month: This INDEX formula reads the helper "Top Row" and reads the names of each value column (the months in this example)
- Value: Finally, we're able to use the Data Row and Data Column numbers to get the actual values that we're looking for.
Answers
-
If you manually set up the Project and Month columns, then you could use an INDEX(MATCH(MATCH to look into your original sheet and bring back the appropriate data in your VALUE column.
To do this, you'd need to add one redundant row in your first sheet, a Top Row, that re-writes the column names (the months) so that the formula can read that cell and match what column data to bring back.
Then you could do something like this:
=INDEX({Full Sheet}, MATCH(Project@row, {Project Column}, 0), MATCH(Month@row, {Top Row}, 0)
The matching Project finds the correct row to look in, and the helper top row allows the second Match to find what column to bring back.
There's an example of this type of formula in the Project Management Office Template set, where the "Project Intake Sheet" has the helper top row and the "Project Metadata" has the formula.
Cheers,
Genevieve
Need more help? 👀 | Help and Learning Center
こんにちは (Konnichiwa), Hallo, Hola, Bonjour, Olá, Ciao! 👋 | Global Discussions
-
@Genevieve P. Brilliant! I thought Match always gave a row number, but based on this, it looks like it's will give the location in a row? This solution seems really great, thank you!
I guess the next question is how to do this without having to pre-set the data sheet with the project names and months by hand. Or if the data isn't projects and months, and is changing a lot, some way to set the data sheet automatically.
-
Ok I got this figured out thanks to @Genevieve P. - a method for unpivoting any data without having to preset an unpivot sheet with values. It's a bit of a long setup, but it works! You do have to manipulate the source data and the result sheet a little bit to get this to work, so it's not quite as easy as a simple ingest and go. But it's close.
The method below uses a bunch of helper columns. There's probably a cleaner method for combining these helper formulas into one big ol' nasty formula to do it all...but I prefer to have it broken out so I can see where things break.
Step 1: Add reference row to data
To your data sheet with the pivoted data on it (Projects and Months in this example), add a row to the top that duplicates the column names on the row
Step 2: Setup the Unpivot sheet
- Add a Checkbox or Text column. You'll need to put something in this column, or check the checkboxes, to create rows to receive the data. You'll need to create the number of rows that equals the number of value columns (months) X the number of data row (projects). 9 rows for this example (3 projects X 3 months).
- Add an Auto-Number column called Auto
- Add a Text/Number column called Row with a column formula: =MATCH(Auto@row, Auto:Auto, 0)
- Add a Text/Number column called Value Column Count with a column formula: =COUNT({Top Row}) - 1 . The {Top Row} reference is the full 1st row of the data sheet (the row that says "Top Row" and is in grey in the screenshot above).
- Add a Text/Number column called Data Row with a column formula: =ROUNDUP(Row@row / [Value Column Count]@row) + 1
- Add a Text/Number column called Data Column Number with a column formula: =COUNTIFS([Data Row]:[Data Row], [Data Row]@row, Row:Row, <=Row@row) + 1
- Add a Text/Number column to hold your primary row value, Project in this case, with a column formula: =INDEX({Full Range}, [Data Row]@row, 1). The {Full Range} reference is all columns of the data sheet (Project, January, February, March).
- Add a Text/Number column to hold your column names, Month in this case, with a column formula: =INDEX({Top Row}, 1, [Data Column Number]@row)
- Add a Text/Number column called Value to hold your values with a column formula: =INDEX({Full Range}, [Data Row]@row, [Data Column Number]@row)
If all is setup correctly you should see results like this:
Explanation
- Row: this formula gives you a row number that is always accurate, even if you move or remove rows.
- Value Column count: this is counting the number of columns in the data sheet, using the 1st row that you setup. Because the first column in the data is the row value (Project), we subtract 1 from the count to get the number of columns with actual values in them.
- Data Row: this is creating a repeating number for us to use in the final INDEX formula. It takes the row number and divides by the number of value columns to get a fraction, then rounds that up. If you add more columns to the data (say-more months) this will auto-adjust to give you each data row number repeated for the number of columns. We add +1 to start the data row count on row 2 of the data, since row 1 is the "Top Row" line that doesn't have any data in it.
- Data Column Number: this creates a sequence of column numbers for each data row set. For example, we want to get data row 2, column 2 on the first row of results, then data row 2, column 3 on the next row of results, and so on. This works by counting up the number of result rows that have already been established for the current data row and adding 1.
- Project: This is an INDEX formula that uses our Data Row column to find the Project name on that row in the data. Because we know Project is always the first column in the data, I've "fixed" the column index to 1. You could also create a standalone reference just to the Project column, instead of reusing the {Full Range} reference if you wanted to.
- Month: This INDEX formula reads the helper "Top Row" and reads the names of each value column (the months in this example)
- Value: Finally, we're able to use the Data Row and Data Column numbers to get the actual values that we're looking for.
-
Oh my goodness! I stepped away for a moment and came back and WOW! 🤩
Very impressive - thanks so much for sharing your solution.
Need more help? 👀 | Help and Learning Center
こんにちは (Konnichiwa), Hallo, Hola, Bonjour, Olá, Ciao! 👋 | Global Discussions
-
@Genevieve Choquette thanks! It was your solution that broke through the issue for me. So thank YOU.
-
Hello @Brian_Richardson @Genevieve P. , I am astonished you found a solution to achieve unpivot in Smartsheet, I can't wait to implement this feature.
I would really like to pick your brain.in regards to step 1, adding reference row to data, do i need to add a reference below every column? I only want to the data i highlighted in my screenshot below?
Please can you advise
-
You can have more columns than just the ones you are unpivoting, but the columns to be unpivoted must be all lumped together. Blank columns in the middle will be included in the unpivot, even if you don't give the column a name on the first row, because the "Data Column Number" formula has no way of knowing that there's an unrelated column in the middle of the data. It's doing a count based on the number of rows in the results sheet, not performing any lookups to the data sheet.
-
How do i generate more rows please?
-
Press the insert key on your keyboard or right click a row and choose Insert Below/Above.
Not e you also need to make all of the formulas to be column formulas by right clicking the formula that you entered and choose "Convert to Column Formula"
-
@Brian_Richardson in my data file, there are 246 unique rows and 41 columns across, which would mean i would have create around 10,000 rows, but there are around 30-40 rows with data across the 41 columns.
sorry to be bothersome here, can you suggest any workaround for this please? really appreciate your responses so far. This model you created is really good nonetheless. I can't wait to utilise on future reports.
-
It would probably be best to copy the rows that actually have data to another sheet and use that consolidated sheet as your source rather than the original sheet with all the hundreds of blank rows. Because adding 10k rows to your results sheet is going to be very painful. You can copy the rows manually, or setup an automation to copy a row when it has data on it and then run that automation.
Help Article Resources
Categories
- All Categories
- 14 Welcome to the Community
- Customer Resources
- 64.8K Get Help
- 434 Global Discussions
- 138 Industry Talk
- 470 Announcements
- 4.9K Ideas & Feature Requests
- 129 Brandfolder
- 148 Just for fun
- 65 Community Job Board
- 486 Show & Tell
- 33 Member Spotlight
- 2 SmartStories
- 300 Events
- 36 Webinars
- 7.3K Forum Archives
Check out the Formula Handbook template!