Collect data from the latest 5 table entries with a specific criteria

Options
Andrii
Andrii ✭✭✭

Hello all,

Does anyone have any bright ideas on how to solve the puzzle I'm currently struggling with?))

I have a source table with multiple entries from different projects. The entries for each specific project are added in a random order. Each entry has the following columns: Row#, Created, Project ID, and Flow Score columns.

image.png

I need to build a formula that makes the following:

  • Pulls the Flow Score numbers from the above-source table for a specific ProjectID
  • Calculate the average number of the the Flow Score
  • Only the last/latest five entries should be considered

It's relatively straightforward to achieve the first two items by using AVG(Collect… ) with ProjectID as criteria. But I'm struggling to find the right way to get only the latest entries, either by Row# since the entries don't come in sequence, or by the Created column values.

When I attempt to use two nested Collect functions it returns an error.

Any suggestions on how to overcome this?

Thank you!

Best Answers

  • Matt Lynn ACT
    Matt Lynn ACT Community Champion
    Answer βœ“

    @Andrii

    1.) You need 2 helper columns. You can name them whatever you want but the ones I did are "FirstOfLastFive" and "LastOfLastFive".

    2.) I used the Row# column as an auto number.

    3.) Below is a link to a published version of this solution that I've tested on a BASIC level and it seems to work.

    4.) After you created your two helper columns put these formulas in them:

    FirstOfLastFive:

    =LARGE(COLLECT([Row#]:[Row#], ProjectID:ProjectID, ProjectID@row), 5)

    LastOfLastFive:

    =LARGE(COLLECT([Row#]:[Row#], ProjectID:ProjectID, ProjectID@row), 1)

    Average Workflow:

    =AVG(COLLECT([Flow Score]:[Flow Score], [Row#]:[Row#], @cell >= FirstOfLastFive@row, [Row#]:[Row#], @cell <= LastofLastFive@row, ProjectID:ProjectID, ProjectID@row))

    image.png

    Published Version:

    https://app.smartsheet.com/b/publish?EQBCT=7e800f20a56644889c54c0472d2e9e16

    Matt Lynn

    How can I help? Schedule some time on my calendar: CLICK HERE

  • Kelly Moore
    Kelly Moore Community Champion
    edited 08/27/24 Answer βœ“

    Hey @Andrii , @Cory Page and @Matt Lynn-PCG

    The dilemma of finding the top x numbers was exactly the problem I faced when I converted the timestamp to numeric values. The likelihood that the timestamp (I'm not sure how many milliseconds are actually stored) is exactly the same is why I went with the RankEQ equation. I could then, either by formula or report, bring in the values that I needed. I chose to rank in descending order so that #1 was always the max. This way, if there were only 3 entries even though I needed, say 5, I was always going to find the max timestamp. I used sheet summary fields with individual index/match formulas for each of my 1-5 rankings to pull in the values I needed. This approach worked for me. I hope you can find an approach that works easily for you.

    Kelly

Answers

  • Cory Page
    Cory Page ✭✭✭✭✭

    I am pretty sure I can type something up but I am not sure that I can do it using the Time stamp with a date. Is it fine to simply use the date without the time stamp to find the top 5?

    So long as the Project ID is a day by day top 5 that's pretty straight forward, it just seems like you would need to use the time stamp as well.

  • Andrii
    Andrii ✭✭✭

    Hi @Cory Page ! Thanks for checking my question :) Some time people can miss to enter their input so it can be in a random order and random time :)

  • Cory Page
    Cory Page ✭✭✭✭✭

    I found a post in the community that's similar to this request or at least the part about finding the most recent data points. I am trying to convert it over to your request but maybe someone can smarter than I can do it quicker for you as I don't typically use timestamps. I will keep plugging away,

  • Andrii
    Andrii ✭✭✭

    Indeed, it seems to be the right direction, but I'm unsure how to proceed from here. Alternatively, the Row# can be used to identify the latest entry by applying the MAX function. However, I don't know how to make the formula calculate only a specific number of the latest entries.

  • Matt Lynn ACT
    Matt Lynn ACT Community Champion
    Answer βœ“

    @Andrii

    1.) You need 2 helper columns. You can name them whatever you want but the ones I did are "FirstOfLastFive" and "LastOfLastFive".

    2.) I used the Row# column as an auto number.

    3.) Below is a link to a published version of this solution that I've tested on a BASIC level and it seems to work.

    4.) After you created your two helper columns put these formulas in them:

    FirstOfLastFive:

    =LARGE(COLLECT([Row#]:[Row#], ProjectID:ProjectID, ProjectID@row), 5)

    LastOfLastFive:

    =LARGE(COLLECT([Row#]:[Row#], ProjectID:ProjectID, ProjectID@row), 1)

    Average Workflow:

    =AVG(COLLECT([Flow Score]:[Flow Score], [Row#]:[Row#], @cell >= FirstOfLastFive@row, [Row#]:[Row#], @cell <= LastofLastFive@row, ProjectID:ProjectID, ProjectID@row))

    image.png

    Published Version:

    https://app.smartsheet.com/b/publish?EQBCT=7e800f20a56644889c54c0472d2e9e16

    Matt Lynn

    How can I help? Schedule some time on my calendar: CLICK HERE

  • Kelly Moore
    Kelly Moore Community Champion
    edited 08/27/24 Answer βœ“

    Hey @Andrii , @Cory Page and @Matt Lynn-PCG

    The dilemma of finding the top x numbers was exactly the problem I faced when I converted the timestamp to numeric values. The likelihood that the timestamp (I'm not sure how many milliseconds are actually stored) is exactly the same is why I went with the RankEQ equation. I could then, either by formula or report, bring in the values that I needed. I chose to rank in descending order so that #1 was always the max. This way, if there were only 3 entries even though I needed, say 5, I was always going to find the max timestamp. I used sheet summary fields with individual index/match formulas for each of my 1-5 rankings to pull in the values I needed. This approach worked for me. I hope you can find an approach that works easily for you.

    Kelly

  • Matt Lynn ACT
    Matt Lynn ACT Community Champion

    Yes that idea above would work also. My suggestion above wouldn't require any sorting or selecting and should also work with any number of values between 1 and 5 from the last and also doesn't need timestamps etc.

    Matt Lynn

    How can I help? Schedule some time on my calendar: CLICK HERE

  • Andrii
    Andrii ✭✭✭

    Hello All,

    Matt, Kelly and Cory - thank you for your bright and creative ideas!

    Matt, choose your suggestion because for my case it seems a bit simpler and easier to adjust or modify in the future :)

    It works great except in case when there are fewer than 5 entries for a particular Project ID. Then FirstOfLastFive returns an error. I added the following adjustment to make it work regardless of the number of entries:

    • COUNTIFS(…) calculates the total number of entries for each Project ID
    • IF(...) checks whether the number of entries is less than five. If it is the formula returns the correct number of entries, otherwise it returns 5

    =LARGE(COLLECT([Row#]:[Row#], ProjectID:ProjectID, ProjectID@row), IF(COUNTIFS(ProjectID:ProjectID, @cell = ProjectID@row) < 5, COUNTIFS(ProjectID:ProjectID, @cell = ProjectID@row), 5))

    Once again - thanks everyone for help!