# Collect data from the latest 5 table entries with a specific criteria

✭✭✭

Hello all,

Does anyone have any bright ideas on how to solve the puzzle I'm currently struggling with?))

I have a source table with multiple entries from different projects. The entries for each specific project are added in a random order. Each entry has the following columns: Row#, Created, Project ID, and Flow Score columns.

I need to build a formula that makes the following:

• Pulls the Flow Score numbers from the above-source table for a specific ProjectID
• Calculate the average number of the the Flow Score
• Only the last/latest five entries should be considered

It's relatively straightforward to achieve the first two items by using AVG(Collect… ) with ProjectID as criteria. But I'm struggling to find the right way to get only the latest entries, either by Row# since the entries don't come in sequence, or by the Created column values.

When I attempt to use two nested Collect functions it returns an error.

Any suggestions on how to overcome this?

Thank you!

• ✭✭✭✭✭✭

1.) You need 2 helper columns. You can name them whatever you want but the ones I did are "FirstOfLastFive" and "LastOfLastFive".

2.) I used the Row# column as an auto number.

3.) Below is a link to a published version of this solution that I've tested on a BASIC level and it seems to work.

4.) After you created your two helper columns put these formulas in them:

FirstOfLastFive:

=LARGE(COLLECT([Row#]:[Row#], ProjectID:ProjectID, ProjectID@row), 5)

LastOfLastFive:

=LARGE(COLLECT([Row#]:[Row#], ProjectID:ProjectID, ProjectID@row), 1)

Average Workflow:

=AVG(COLLECT([Flow Score]:[Flow Score], [Row#]:[Row#], @cell >= FirstOfLastFive@row, [Row#]:[Row#], @cell <= LastofLastFive@row, ProjectID:ProjectID, ProjectID@row))

Published Version:

https://app.smartsheet.com/b/publish?EQBCT=7e800f20a56644889c54c0472d2e9e16

Certified Platinum Partner

2023 Partner of the Year

PrimeConsulting.com

• ✭✭✭✭✭✭

The dilemma of finding the top x numbers was exactly the problem I faced when I converted the timestamp to numeric values. The likelihood that the timestamp (I'm not sure how many milliseconds are actually stored) is exactly the same is why I went with the RankEQ equation. I could then, either by formula or report, bring in the values that I needed. I chose to rank in descending order so that #1 was always the max. This way, if there were only 3 entries even though I needed, say 5, I was always going to find the max timestamp. I used sheet summary fields with individual index/match formulas for each of my 1-5 rankings to pull in the values I needed. This approach worked for me. I hope you can find an approach that works easily for you.

Kelly

• ✭✭✭✭✭

I am pretty sure I can type something up but I am not sure that I can do it using the Time stamp with a date. Is it fine to simply use the date without the time stamp to find the top 5?

So long as the Project ID is a day by day top 5 that's pretty straight forward, it just seems like you would need to use the time stamp as well.

• ✭✭✭

Hi @Cory Page ! Thanks for checking my question :) Some time people can miss to enter their input so it can be in a random order and random time :)

• ✭✭✭✭✭

I found a post in the community that's similar to this request or at least the part about finding the most recent data points. I am trying to convert it over to your request but maybe someone can smarter than I can do it quicker for you as I don't typically use timestamps. I will keep plugging away,

• ✭✭✭

Indeed, it seems to be the right direction, but I'm unsure how to proceed from here. Alternatively, the `Row#` can be used to identify the latest entry by applying the `MAX` function. However, I don't know how to make the formula calculate only a specific number of the latest entries.

• ✭✭✭✭✭✭

1.) You need 2 helper columns. You can name them whatever you want but the ones I did are "FirstOfLastFive" and "LastOfLastFive".

2.) I used the Row# column as an auto number.

3.) Below is a link to a published version of this solution that I've tested on a BASIC level and it seems to work.

4.) After you created your two helper columns put these formulas in them:

FirstOfLastFive:

=LARGE(COLLECT([Row#]:[Row#], ProjectID:ProjectID, ProjectID@row), 5)

LastOfLastFive:

=LARGE(COLLECT([Row#]:[Row#], ProjectID:ProjectID, ProjectID@row), 1)

Average Workflow:

=AVG(COLLECT([Flow Score]:[Flow Score], [Row#]:[Row#], @cell >= FirstOfLastFive@row, [Row#]:[Row#], @cell <= LastofLastFive@row, ProjectID:ProjectID, ProjectID@row))

Published Version:

https://app.smartsheet.com/b/publish?EQBCT=7e800f20a56644889c54c0472d2e9e16

Certified Platinum Partner

2023 Partner of the Year

PrimeConsulting.com

• ✭✭✭✭✭✭

The dilemma of finding the top x numbers was exactly the problem I faced when I converted the timestamp to numeric values. The likelihood that the timestamp (I'm not sure how many milliseconds are actually stored) is exactly the same is why I went with the RankEQ equation. I could then, either by formula or report, bring in the values that I needed. I chose to rank in descending order so that #1 was always the max. This way, if there were only 3 entries even though I needed, say 5, I was always going to find the max timestamp. I used sheet summary fields with individual index/match formulas for each of my 1-5 rankings to pull in the values I needed. This approach worked for me. I hope you can find an approach that works easily for you.

Kelly

• ✭✭✭✭✭✭

Yes that idea above would work also. My suggestion above wouldn't require any sorting or selecting and should also work with any number of values between 1 and 5 from the last and also doesn't need timestamps etc.

Certified Platinum Partner

2023 Partner of the Year

PrimeConsulting.com

• ✭✭✭

Hello All,

Matt, Kelly and Cory - thank you for your bright and creative ideas!

Matt, choose your suggestion because for my case it seems a bit simpler and easier to adjust or modify in the future :)

It works great except in case when there are fewer than 5 entries for a particular Project ID. Then FirstOfLastFive returns an error. I added the following adjustment to make it work regardless of the number of entries:

• COUNTIFS(…) calculates the total number of entries for each Project ID
• IF(...) checks whether the number of entries is less than five. If it is the formula returns the correct number of entries, otherwise it returns 5

=LARGE(COLLECT([Row#]:[Row#], ProjectID:ProjectID, ProjectID@row), IF(COUNTIFS(ProjectID:ProjectID, @cell = ProjectID@row) < 5, COUNTIFS(ProjectID:ProjectID, @cell = ProjectID@row), 5))

Once again - thanks everyone for help!