AVG, COLLECT, and LARGE cross-sheet formula

Options

I would like to pull the average test score based on multiple criteria, but instead of a specified date range I would like to use the top 10 most recent entries as not all candidates have the same number of tests taken within the same date range. I started with the following date range formula:


=AVG(COLLECT({score}, {criteria}, Type@row, {duedate}, AND(@cell >= TODAY(-10), @cell <= TODAY())))


Then adjusted my data sheet to include a YEARDAY column as a way to organize the dates into smallest

and largest values so I could pull only the top 10 values and that’s where I’m stuck.


My questions are:

1)     How can I pull multiple LARGE values in one formula to represent the most recent 10 tests:

=LARGE{yearday}, 1-10

2)     Then add that criteria to my existing formula above in place of the date range:

=AVG(COLLECT({score}, {criteria}, Type@row, LARGE{yearday}, 1-10))


Thank you!

Best Answer

Answers

  • Genevieve P.
    Genevieve P. Employee Admin
    Options

    Hi @C. Powell

    I'll answer your second question first:

    In your AVG(COLLECT formula, you'll want to list the YearDay range first, then your criteria is

    @cell >= LARGE({YearDay}, 10)

    This will take the 10th largest, then 9th largest, then 8th, etc. I've added an IFERROR around the LARGE() function in case there are less than 10 rows for that particular Type.

    Try this full formula:

    =AVG(COLLECT({score}, {criteria}, Type@row, {YearDay}, @cell >= IFERROR(LARGE({YearDay}, 10), 0)))


    For your first question, you can use a JOIN(COLLECT to bring back the Names of the most recent tests, like so:

    =JOIN(COLLECT({Names to Return}, {YearDay}, @cell >= IFERROR(LARGE({YearDay}, 10), 0)), " / ")

    This will list the top 10 in one cell.

    Let me know if this makes sense and works for you!

    Cheers,

    Genevieve

  • C. Powell
    C. Powell ✭✭✭
    Options

    Thank you, Genevieve! This is very helpful and we're definitely getting closer!

    When applying your full formula for items with 10 rows or more the result is off by a few points (6%) and for those items that have less than 10 rows I'm getting #DIVIDE BY ZERO despite the IFERROR addition. This should produce the average score for any number of entries less than 10 rows or should it simply not show an error?

    For the other portion, I don't require a list of the results in the cell (although your JOIN formula worked perfectly to achieve this), I wanted to use something to this effect just to ensure I'm actually averaging the top ten values:

    =AVG(COLLECT({YearDay}, {YearDay}, @cell >= IFERROR(LARGE({YearDay}, 10), 0)))

    but the result was still slightly off. So I used your join formula to see which results were being pulled and it's only returning 9 values instead of 10 - I couldn't get it to pull all 10 values until I changed the formula to say "13" instead of "10". I doubled checked my source sheet and while there are items listed with a higher yearday value, they would be excluded due to the "{criteria}, Type@row" portion of the formula so I'm still unable to reconcile the results.

    Do you have any other thoughts that might be contributing to the skewed results?

    Thank you again for all your help!

  • Genevieve P.
    Genevieve P. Employee Admin
    Options

    Hi @C. Powell

    Ah of course! Our LARGE formula is not filtering down the list in the same way that your COLLECT function is for the AVG function. This means it's just looking at the top 10, regardless of your Criteria & Type, my apologies for missing this.

    So the way I can think to get around this is actually by adding another Helper Column into your Source sheet. Add a column that RANKS each of your Type Criteria by the YearDay column you have, like so:

    =RANKEQ(YearDay@row, COLLECT(YearDay:YearDay, Type:Type, Type@row), 1)

    This way you'll have a top 10 per type in your source sheet.

    Now you can use this helper column in your cross-sheet formula to look for the top 10, instead of using the YearDay, like so:

    =AVG(COLLECT({score}, {criteria}, Type@row, {New Rank Column}, @cell <= 10))

    Let me know if this now works!

    Cheers,

    Genevieve

  • C. Powell
    C. Powell ✭✭✭
    Options

    Hi Genevieve,

    Do I need to use LARGE with the new formula in order to pull the 10 largest rank values instead of rank 1-10? As is, the oldest tests scores are being pulled instead of the most recent.

    Could you also advise on where to add back in the IFERROR statement? For those test types that have less than 10 entries I'm still getting #DIVIDE BY ZERO (which is expected) but I'm hoping we can find a solution that pulls whatever values are present and not just 10+ entries (or produce blanks or zero)?

    I really appreciate the help!

  • Genevieve P.
    Genevieve P. Employee Admin
    Options

    Hi @C. Powell

    Sorry for the delay!

    No, we're essentially using the RANK column instead of using LARGE. The RANK column in your sheet will create a number per-row based on the YearDay.

    Then in the second formula we're using this helper number column to look for anything = or below 10. This means that if there's a Type that only has 3 rows, then the Rank would Rank them as 1, 2, 3... and the next formula shouldn't error at all because 1, 2, and 3 are all less than 10, so it should just count those three.

    {New Rank Column}, @cell <= 10

    I would only expected a Divide By 0 if you actually have 0 entries for that type, is that possible? You can wrap the IFERROR around the whole thing:

    =IFERROR(AVG(COLLECT({score}, {criteria}, Type@row, {New Rank Column}, @cell <= 10)), "No Type Listed")

    Cheers!

    Genevieve

  • C. Powell
    C. Powell ✭✭✭
    Options

    Hi Genevieve,

    Thank you for the explanation-makes sense! I think the only component we're missing is that when we rank the type criteria by YearDay it's returning 1-10 of the lowest YearDay values but those are actually the older test scores and not the most recent. So for example I'm getting an average score based on 10 tests taken in September vs December.

    So we need to rank by the type criteria based on the top 10 highest YearDay values to reflect the average of the most recent test scores.

  • Genevieve P.
    Genevieve P. Employee Admin
    Answer ✓
    Options

    Hi @C. Powell

    Ah! That means we're close!

    So all we need to do now is adjust the Order specified within the RANKEQ function: https://help.smartsheet.com/function/rankeq

    =RANKEQ(YearDay@row, COLLECT(YearDay:YearDay, Type:Type, Type@row), 1)

    That 1 at the end should be 0:

    =RANKEQ(YearDay@row, COLLECT(YearDay:YearDay, Type:Type, Type@row), 0)

    Cheers,

    Genevieve

  • C. Powell
    C. Powell ✭✭✭
    Options

    That did it! Thank you for all your guidance - happy holidays!

  • Genevieve P.
    Genevieve P. Employee Admin
    Options

    No problem at all!

    Happy holidays 🙂

Help Article Resources

Want to practice working with formulas directly in Smartsheet?

Check out the Formula Handbook template!