Collect data from the latest 5 table entries with a specific criteria

Hello all,

Does anyone have any bright ideas on how to solve the puzzle I'm currently struggling with?))

I have a source table with multiple entries from different projects. The entries for each specific project are added in a random order. Each entry has the following columns: Row#, Created, Project ID, and Flow Score columns.

I need to build a formula that makes the following:

  • Pulls the Flow Score numbers from the above-source table for a specific ProjectID
  • Calculate the average number of the the Flow Score
  • Only the last/latest five entries should be considered

It's relatively straightforward to achieve the first two items by using AVG(Collect… ) with ProjectID as criteria. But I'm struggling to find the right way to get only the latest entries, either by Row# since the entries don't come in sequence, or by the Created column values.

When I attempt to use two nested Collect functions it returns an error.

Any suggestions on how to overcome this?

Thank you!

Best Answers

  • Matt Lynn-PCG
    Matt Lynn-PCG ✭✭✭✭✭✭
    Answer ✓

    @Andrii

    1.) You need 2 helper columns. You can name them whatever you want but the ones I did are "FirstOfLastFive" and "LastOfLastFive".

    2.) I used the Row# column as an auto number.

    3.) Below is a link to a published version of this solution that I've tested on a BASIC level and it seems to work.

    4.) After you created your two helper columns put these formulas in them:

    FirstOfLastFive:

    =LARGE(COLLECT([Row#]:[Row#], ProjectID:ProjectID, ProjectID@row), 5)

    LastOfLastFive:

    =LARGE(COLLECT([Row#]:[Row#], ProjectID:ProjectID, ProjectID@row), 1)

    Average Workflow:

    =AVG(COLLECT([Flow Score]:[Flow Score], [Row#]:[Row#], @cell >= FirstOfLastFive@row, [Row#]:[Row#], @cell <= LastofLastFive@row, ProjectID:ProjectID, ProjectID@row))

    Published Version:

    https://app.smartsheet.com/b/publish?EQBCT=7e800f20a56644889c54c0472d2e9e16

    Certified Platinum Partner

    2023 Partner of the Year

    PrimeConsulting.com

  • Kelly Moore
    Kelly Moore ✭✭✭✭✭✭
    edited 08/27/24 Answer ✓

    Hey @Andrii , @Cory Page and @Matt Lynn-PCG

    The dilemma of finding the top x numbers was exactly the problem I faced when I converted the timestamp to numeric values. The likelihood that the timestamp (I'm not sure how many milliseconds are actually stored) is exactly the same is why I went with the RankEQ equation. I could then, either by formula or report, bring in the values that I needed. I chose to rank in descending order so that #1 was always the max. This way, if there were only 3 entries even though I needed, say 5, I was always going to find the max timestamp. I used sheet summary fields with individual index/match formulas for each of my 1-5 rankings to pull in the values I needed. This approach worked for me. I hope you can find an approach that works easily for you.

    Kelly

Answers

  • Cory Page
    Cory Page ✭✭✭✭✭

    I am pretty sure I can type something up but I am not sure that I can do it using the Time stamp with a date. Is it fine to simply use the date without the time stamp to find the top 5?

    So long as the Project ID is a day by day top 5 that's pretty straight forward, it just seems like you would need to use the time stamp as well.

  • Andrii
    Andrii ✭✭✭

    Hi @Cory Page ! Thanks for checking my question :) Some time people can miss to enter their input so it can be in a random order and random time :)

  • Cory Page
    Cory Page ✭✭✭✭✭

    I found a post in the community that's similar to this request or at least the part about finding the most recent data points. I am trying to convert it over to your request but maybe someone can smarter than I can do it quicker for you as I don't typically use timestamps. I will keep plugging away,

  • Andrii
    Andrii ✭✭✭

    Indeed, it seems to be the right direction, but I'm unsure how to proceed from here. Alternatively, the Row# can be used to identify the latest entry by applying the MAX function. However, I don't know how to make the formula calculate only a specific number of the latest entries.

  • Matt Lynn-PCG
    Matt Lynn-PCG ✭✭✭✭✭✭
    Answer ✓

    @Andrii

    1.) You need 2 helper columns. You can name them whatever you want but the ones I did are "FirstOfLastFive" and "LastOfLastFive".

    2.) I used the Row# column as an auto number.

    3.) Below is a link to a published version of this solution that I've tested on a BASIC level and it seems to work.

    4.) After you created your two helper columns put these formulas in them:

    FirstOfLastFive:

    =LARGE(COLLECT([Row#]:[Row#], ProjectID:ProjectID, ProjectID@row), 5)

    LastOfLastFive:

    =LARGE(COLLECT([Row#]:[Row#], ProjectID:ProjectID, ProjectID@row), 1)

    Average Workflow:

    =AVG(COLLECT([Flow Score]:[Flow Score], [Row#]:[Row#], @cell >= FirstOfLastFive@row, [Row#]:[Row#], @cell <= LastofLastFive@row, ProjectID:ProjectID, ProjectID@row))

    Published Version:

    https://app.smartsheet.com/b/publish?EQBCT=7e800f20a56644889c54c0472d2e9e16

    Certified Platinum Partner

    2023 Partner of the Year

    PrimeConsulting.com

  • Kelly Moore
    Kelly Moore ✭✭✭✭✭✭
    edited 08/27/24 Answer ✓

    Hey @Andrii , @Cory Page and @Matt Lynn-PCG

    The dilemma of finding the top x numbers was exactly the problem I faced when I converted the timestamp to numeric values. The likelihood that the timestamp (I'm not sure how many milliseconds are actually stored) is exactly the same is why I went with the RankEQ equation. I could then, either by formula or report, bring in the values that I needed. I chose to rank in descending order so that #1 was always the max. This way, if there were only 3 entries even though I needed, say 5, I was always going to find the max timestamp. I used sheet summary fields with individual index/match formulas for each of my 1-5 rankings to pull in the values I needed. This approach worked for me. I hope you can find an approach that works easily for you.

    Kelly

  • Matt Lynn-PCG
    Matt Lynn-PCG ✭✭✭✭✭✭

    Yes that idea above would work also. My suggestion above wouldn't require any sorting or selecting and should also work with any number of values between 1 and 5 from the last and also doesn't need timestamps etc.

    Certified Platinum Partner

    2023 Partner of the Year

    PrimeConsulting.com

  • Andrii
    Andrii ✭✭✭

    Hello All,

    Matt, Kelly and Cory - thank you for your bright and creative ideas!

    Matt, choose your suggestion because for my case it seems a bit simpler and easier to adjust or modify in the future :)

    It works great except in case when there are fewer than 5 entries for a particular Project ID. Then FirstOfLastFive returns an error. I added the following adjustment to make it work regardless of the number of entries:

    • COUNTIFS(…) calculates the total number of entries for each Project ID
    • IF(...) checks whether the number of entries is less than five. If it is the formula returns the correct number of entries, otherwise it returns 5

    =LARGE(COLLECT([Row#]:[Row#], ProjectID:ProjectID, ProjectID@row), IF(COUNTIFS(ProjectID:ProjectID, @cell = ProjectID@row) < 5, COUNTIFS(ProjectID:ProjectID, @cell = ProjectID@row), 5))

    Once again - thanks everyone for help!