Adding Multiple Criteria to a COLLECT Function

I have a working formula gathering the MEDIAN of a group of numbers that fall within a date range used in a trend chart in a dashboard. I now want to add two additional criteria for cases when the dashboard user would like to calculate the MEDIAN but exclude certain values (referred to as 'HOLD' and 'Strategic' data) from the median calculation. The working formula is the following but I'm not getting the correct result:

=IF(ISERROR(MEDIAN(COLLECT({Resi - CPA Executed Duration}, {Resi - Enter Stage 7 Date}, AND(@cell >= DATE (2024, 6, 1), @cell <= DATE(2024, 6, 30)), {Resi - HOLD}, [Exclude HOLD Data Value]1, {Resi - Strategic}, [Exclude Strategic Dealer Data Value]1))), "", MEDIAN(COLLECT({Resi - CPA Executed Duration}, {Resi - Enter Stage 7 Date}, AND(@cell >= DATE (2024, 6, 1), @cell <= DATE(2024, 6, 30)), {Resi - HOLD}, [Exclude HOLD Data Value]1, {Resi - Strategic}, [Exclude Strategic Dealer Data Value]1)))

The last part ({Resi - HOLD}, [Exclude HOLD Data Value]1, {Resi - Strategic}, [Exclude Strategic Dealer Data Value]1) is the new portion where I'm not exactly getting the correct result but I can see what is happening when I did some testing and diagnosis (below):

Current Calc Results 3 LH columns Desired Calc Results 3 RH Columns

My current formula is giving me the median calculations on the the points with the checkmark or without it (LH beige and blue columns) but the results that I am expecting (needing) are on the RH white and blue columns where I am looking for the median of all of the data OR, if the 'Resi - HOLD' boxes are checked in an adjacent column, exclude those data points from the median calculation. How might I need to modify my formula to include all data in the date range or exclude any points that have a checkmark in the 'Resi - HOLD' or 'Resi - Strategic' columns?

Answers

  • KPH
    KPH ✭✭✭✭✭✭

    Hi @GMichal

    Your COLLECT function is collecting the values in

    {Resi - CPA Executed Duration}

    where

    • {Resi - Enter Stage 7 Date} is between 1 and 30th June 2024
    • {Resi - HOLD} is equal to [Exclude HOLD Data Value]1
    • {Resi - Strategic} is equal to [Exclude Strategic Dealer Data Value]1

    I don't know what is in those two cells the formula references and am not sure I understand the ask exactly but maybe you just need to replace [Exclude HOLD Data Value]1 and [Exclude Strategic Dealer Data Value]1 with <>1 (which means unchecked)

    COLLECT({Resi - CPA Executed Duration}, {Resi - Enter Stage 7 Date}, AND(@cell >= DATE (2024, 6, 1), @cell <= DATE(2024, 6, 30)), {Resi - HOLD}, <>1, {Resi - Strategic}, <>1)

    I hope this helps.

  • GMichal
    GMichal ✭✭✭✭

    [Exclude HOLD Data Value]1 and [Exclude Strategic Dealer Data Value]1 are values that are coming from a checkbox on a dashboard. The intent is to allow the dashboard user to exclude the 'HOLD' data and/or the 'strategic' data from the median calculation if checked.

    Not clearly enough, I guess, I was trying to explain that my formula as written above with [Exclude HOLD Data Value]1 and [Exclude Strategic Dealer Data Value]1 = 0 (Resi - HOLD = checked) was not allowing all of the data to be used in the median calculation but giving me this result:

    when I need this (below) result from my formula with [Exclude HOLD Data Value]1 and [Exclude Strategic Dealer Data Value]1 = 0 (Resi - HOLD = checked):

    Simply, I need the formula to exclude data from the median calculation if either of the [Exclude HOLD Data Value]1 and [Exclude Strategic Dealer Data Value]1 checkboxes are checked but, if they are not checked, all of the date range data should be used in the median calculation.

  • KPH
    KPH ✭✭✭✭✭✭

    What does

    {Resi - Strategic} and

    {Resi - HOLD}

    reference?

  • GMichal
    GMichal ✭✭✭✭

    These are the columns where users of the Smartsheet have determined that the prospect has been put on HOLD previously or the prospect is a strategic one for us and the users have checked a box in the row to indicate that.

    For background info, if a prospect was put on HOLD or is a strategic one, the durations captured in the {Resi - CPA Executed Duration} would be unusually long and therefore skew the data which is why I want the dashboard user to be able to omit that data from the median calculation seen in the dashboard.

  • KPH
    KPH ✭✭✭✭✭✭

    Is this true:

    • If [Exclude Strategic Dealer Data Value]1 is checked then you only want to find the median of rows where {Resi - Strategic} is checked.
    • If [Exclude HOLD Data Value]1 is checked then you only want to find the median of rows where {Resi - HOLD} is checked.
    • If both are checked, you want to find the median of all rows where {Resi - HOLD} and {Resi - Strategic} are checked
    • If neither are checked, you want to include all in the median calculation.

  • GMichal
    GMichal ✭✭✭✭

    No, just a bit different on some of these:

    • If [Exclude Strategic Dealer Data Value]1 is checked then you only want to find the median of rows where {Resi - Strategic} is NOT checked.
    • If [Exclude HOLD Data Value]1 is checked then you only want to find the median of rows where {Resi - HOLD} is NOT checked.
    • If both are checked, you want to find the median of all rows where {Resi - HOLD} and {Resi - Strategic} are NOT checked
    • If neither are checked, you want to include all in the median calculation.

    The 'checks' would exclude that data from the calculation.

  • KPH
    KPH ✭✭✭✭✭✭

    Can you try:

    =MEDIAN(COLLECT({Resi - CPA Executed Duration}, {Resi - Enter Stage 7 Date}, AND(@cell >= DATE (2024, 6, 1), @cell <= DATE(2024, 6, 30)), {Resi - HOLD}, <>[Exclude HOLD Data Value]1, {Resi - Strategic}, <>[Exclude Strategic Dealer Data Value]1))

  • GMichal
    GMichal ✭✭✭✭

    The result is #UNPARSEABLE. Just thinking…if we are indicating NOT [Exclude HOLD Data Value]1 as the criteria, what value might it be using in the formula? Maybe that's the #UNPARSEABLE message?

  • KPH
    KPH ✭✭✭✭✭✭

    #UNPARSEABLE suggests a problem with the way the formula is written. Can you check for miss spellings, or misplaced commas? Is it nicely color coded when you type it in?

    The checkboxes will either be true or blank. So I was using <> (not equal). NOT could be an option. But the error you have does not suggest trying that just yet. I made a copy of your sheet and the formula is working for me, assuming I understood the requirement. So it could just be a typo.

  • GMichal
    GMichal ✭✭✭✭
    edited 08/29/24

    Magically, when I went back to the sheet with your formula, it did not have an error and displayed the median value '0', which is not correct. It appears as though it is not ignoring the two rows checked in the {Resi - HOLD} range ('Has been in HOLD Stage' column name). This is the same result that I was getting previously but the median of the entire column of values in the CPA Executed Duration…' column equals '1'. Here is a look at the cross-sheet references in the formula:

    With the formula results we are getting, it's almost as if we need the combination of data (not results) from [Exclude HOLD Data Value]1 = 0 and [Exclude HOLD Data Value]1 = 1 to get all 9 data points included in the median calculation but then still have the ability to exclude some data from the calculation if either {Resi - HOLD} or {Resi - Strategic} checkboxes have been checked.

  • KPH
    KPH ✭✭✭✭✭✭

    Glad that the formula is returning a value. That part will give you the rows where both things have the opposite check/uncheck to {Resi - HOLD} and {Resi - Strategic}. That should cover situations 3 and 4 from your list. Then we can create another COLLECT for situation 1 and 2 from your list.

    COLLECT({Resi - CPA Executed Duration}, {Resi - Enter Stage 7 Date}, AND(@cell >= DATE (2024, 6, 1), @cell <= DATE(2024, 6, 30)), {Resi - HOLD}, <>[Exclude HOLD Data Value]1, {Resi - Strategic}, [Exclude Strategic Dealer Data Value]1)

    and

    COLLECT({Resi - CPA Executed Duration}, {Resi - Enter Stage 7 Date}, AND(@cell >= DATE (2024, 6, 1), @cell <= DATE(2024, 6, 30)), {Resi - HOLD}, [Exclude HOLD Data Value]1, {Resi - Strategic}, <>[Exclude Strategic Dealer Data Value]1)

    And combine the three in the MEDIAN function.

    But to be sure this is doing what your want, how about creating a temporary column and using it to output the values in the COLLECT. That way you can see what is or is not included and confirm. You would need to temporarily change your data to have different values in each row. The use a JOIN function with each of the 3 COLLECTS to see that the rows you expect to be included are and the ones you don't want are not, and there are no duplicates.

  • GMichal
    GMichal ✭✭✭✭

    My mistake on the result of that formula. Somehow the result I was looking at was from the original formula that gave the '0'. Putting the suggested formula in again:

    =MEDIAN(COLLECT({Resi - CPA Executed Duration}, {Resi - Enter Stage 7 Date}, AND(@cell >= DATE (2024, 6, 1), @cell <= DATE(2024, 6, 30)), {Resi - HOLD}, <>[Exclude HOLD Data Value]1, {Resi - Strategic}, <>[Exclude Strategic Dealer Data Value]1))

    gave me a #UNPARSEABLE result. Then I finally found the problem; a space after the first DATE function which left me with this formula:

    =MEDIAN(COLLECT({Resi - CPA Executed Duration}, {Resi - Enter Stage 7 Date}, AND(@cell >= DATE(2024, 6, 1), @cell <= DATE(2024, 6, 30)), {Resi - HOLD}, <>[Exclude HOLD Data Value]1, {Resi - Strategic}, <>[Exclude Strategic Dealer Data Value]1))

    which now gave me an #INVALID ARGUMENT SET result. Is it the '<>' the expression that is invalid?

  • KPH
    KPH ✭✭✭✭✭✭

    Invalid argument set suggest a problem with commas or parentheses. But the formula looks fine as you pasted it above. Can you try typing it in rather than copy/pasting as we've seen that the new comment box here does add spaces, so it might be a similar problem.

    I've made a little mock up to illustrate what the COLLECT is doing in a similar style to the examples you have been sharing. This is my dummy data:

    Instead of the date being between two values, I am using "Inc rows if 1" just to simplify things. Then I have a column with a different number in each row so I can see which are included and excluded (easier than having lots of 0s and not knowing which is which). Then I have HOLD or Strategic in each of the possible combinations (twice for good measure).

    Then I paste the COLLECT function into a JOIN function rather than MEDIAN just so I can see the data it returns:

    =JOIN(COLLECT([Numbers to inc]:[Numbers to inc], [Inc rows if 1]:[Inc rows if 1], 1, [Resi - HOLD]:[Resi - HOLD], <>[Exclude HOLD Data Value]@row, [Resi - Strategic]:[Resi - Strategic], <>[Exclude Strategic Dealer Data Value]@row), ",")

    And I put that on 4 rows with each of the combinations of HOLD and Strategic.

    That gives me this:

    So when neither are checked I get rows 4 and 8 (both are checked), when both are checked I get rows 1 and 5 (neither are checked), when HOLD is checked I get rows 3 and 7 (only strategic is checked) and when Strategic is checked I get rows 2 and 6 (only hold is checked).

    Which I think is what you want in the case of both being checked or unchecked.

    But when HOLD is checked you want rows 3 and 7 and also 1 and 5.

    When Strategic is checked you want rows 2 and 6 and also 1 and 5.

    Is that correct?

Help Article Resources

Want to practice working with formulas directly in Smartsheet?

Check out the Formula Handbook template!