MAX and COLLECT to return most recent value not the highest

Options

Hello!


I am currently trying (and failing) to create a cross-sheet formula that will look for the most recent date for a given lot and then return a specific value from that date. Sounds simple enough right?

I have one column on my metrics sheet that correctly pulls the most recent date for a given lot:

=MAX(COLLECT({Assay Date}, {Raw Lot}, [Raw Lot ID]@row))

Where,

{Assay Date} is the date column on another sheet

{Raw Lot} is the unique identifier on another sheet

[Raw Lot ID]@row is the Raw lot I am looking for, AT4175 which is on the metric sheet. Metric sheet not shown since it is needed for this discussion.

This all works just fine and will return 4/30/21 from the data set below. Now on top of looking for the most recent assay date, I also would like to get the data value from the most recent assay of the specific lot. Here is a screenshot of the data set as an example:



My first attempt is as follows:

=VLOOKUP(MAX(COLLECT({Assay Date}, {Raw Lot}, [Raw Lot ID]@row)), {Characterization Experiments Range 3}, 2)

Where,

{Characterization Experiments Range 3} is the 2nd and 3rd columns shown above here.

This will return "99" since it is the highest value that matches the criteria and not 5 which is what I want it to return.

My second attempt was to two-fold in my aims; first to not use Vlookup since I plan on introducing and removing columns from the source sheet with some regularity and isnt a scalable option. The second aim was to have it return the desired 5.

=JOIN(COLLECT({SG1441 55C}, {Raw Lot}, [Raw Lot ID]@row, {Assay Date}, MAX(COLLECT({Assay Date}, {Raw Lot}, [Raw Lot ID]@row))), "-")

Where,

{SG1441 55C} is the 3rd column shown above

This will return "99-5".

Getting closer but not quite what I need. I at least got the 5 to be returned and knew that I would get both values. I just cant make the next leap of syntax understanding to get ONLY the 5 to be returned.


Any help would be greatly appreciated!

Tags:

Best Answer

  • Paul Newcome
    Paul Newcome ✭✭✭✭✭✭
    Answer ✓
    Options

    So going with the form populating at the top of the sheet, the Assay Date at the top of the sheet will always be the greatest date and the entry for that particular ID that is closest to the top of the sheet would be the one you want to pull?


    If that is correct, then a standard INDEX/MATCH on the ID should work without the need to pull dates or reference anything else. The INDEX/MATCH works from the top down and stops at the first match.


    =INDEX({Source Sheet Column To Pull}, MATCH([Raw Lot ID]@row, {Source Sheet Lot ID column}, 0))

Answers

  • Paul Newcome
    Paul Newcome ✭✭✭✭✭✭
    Options

    How is each new row added? Is it ALWAYS added at the bottom of the sheet?

  • Joseph Adams
    Joseph Adams ✭✭✭✭
    edited 05/20/21
    Options

    Currently it is at the top of the sheet via a form but that can be easily changed to be the bottom of the sheet. When I was testing the formula I just happened to be updating the bottom ones manually. The data set is a test only as the sheets I am creating are not yet live so I can have the data go in anyway that gets me what I need.

  • Paul Newcome
    Paul Newcome ✭✭✭✭✭✭
    Answer ✓
    Options

    So going with the form populating at the top of the sheet, the Assay Date at the top of the sheet will always be the greatest date and the entry for that particular ID that is closest to the top of the sheet would be the one you want to pull?


    If that is correct, then a standard INDEX/MATCH on the ID should work without the need to pull dates or reference anything else. The INDEX/MATCH works from the top down and stops at the first match.


    =INDEX({Source Sheet Column To Pull}, MATCH([Raw Lot ID]@row, {Source Sheet Lot ID column}, 0))

  • Joseph Adams
    Joseph Adams ✭✭✭✭
    edited 05/20/21
    Options

    Not necessarily, the team the sheet is for is R&D and that arent always the best at inputting data right away and so the top (or bottom) date will not always be the most recent. This is part of why I wanted to look up the most recent date first and go from there. Not just a training issue, its how they operate and when they decide to call a data set valid enough to input it.

    However, within a given date, the most recent assay will always be the top (or bottom). For example, I can have multiple dates of multiple assays in the column but for any given date, the most recent will always be at a top (or bottom). Those shown in bold will always be the most recent (bottom)

    4/20

    4/26

    4/26

    4/26

    4/22

    4/23

    4/23

    4/23

  • Paul Newcome
    Paul Newcome ✭✭✭✭✭✭
    Options

    I would suggest having the most recent at the top. That way we can use the INDEX/MATCH and just let it works the way it is designed to instead of having to build in extras to skip the top rows and look a the bottom.


    It can be done that way if you prefer, but it is much easier if the most recent is always at the top.

  • Joseph Adams
    Joseph Adams ✭✭✭✭
    Options

    Will this still work if someone sorts the source data sheet? Or will basically not allow someone to sort the data? What about filters?

    It will be helpful to sort the data based on some of the other columns like the SG1441@55C, % or other data values found on the source sheet. Granted the idea of the metrics sheet is so they dont have to reference the source data too often unless they want to look at each individual assay run.

  • Paul Newcome
    Paul Newcome ✭✭✭✭✭✭
    Options

    Filters won't affect it, but sorting will.


    Insert a system generated Auto-number column with no special formatting. Then instead of pulling the MAX date, pull the MAX auto-number. Then we can use:

    =INDEX({Column To Pull From}, MATCH([Max Auto-Number]@row, {Auto-Number Column}, 0))

  • Joseph Adams
    Joseph Adams ✭✭✭✭
    Options

    Hello,


    I still need to reference the Raw lot since the source sheet wont just have a single lot on it so I modified the formula and came up with this.


    =IFERROR(INDEX({SG1441 55C}, MATCH(MAX(COLLECT({Auto-Number}, {Raw Lot}, [Raw Lot ID]@row)), {Auto-Number}, 0)), "")


    Adding the auto-number column was a great idea and will take care of my issue. Thanks!

  • Paul Newcome
    Paul Newcome ✭✭✭✭✭✭
    Options

    You shouldn't have had to reference the lot number since the max auto-number column was doing that. The INDEX formula just went off of the max auto-number which was meant to replace the max date. Normally I would have done what you did above because that means you don't need the max auto-number column on the metrics sheet, but I was going with what you already seemed to be familiar with. Using the formula above, you should now be able to delete that max auto-number column.

  • Joseph Adams
    Joseph Adams ✭✭✭✭
    Options

    I dont follow then how you intended I use the Max auto number formula. As I understood it, as each new row gets added to the source sheet, the auto number column would do its thing and generate an ever increasing integer which can then be referenced in the MAX formula to pull the most recent assay.


    I initially did just that but then when I had multiple raw lots in the source sheet I would only be pulling the most recent overall assay and not the most recent from a given lot. This is why I updated it to include the raw lot in the formula. This allows me to always pull the most recent assay of any given taw lot.


    Did I understand your solution correctly?

  • Paul Newcome
    Paul Newcome ✭✭✭✭✭✭
    Options

    Right. You were originally using a MAX/COLLECT to pull the max date based on the lot number (original post). The Auto-number was meant to replace the date, so a MAX/COLLECT on the auto-number.

    I thought you had a "Max Date" column on the metrics sheet where you pulled in the most recent date based on the Lot ID? That was to be replaced by the auto-number with a "Max Auto-Number" column that pulled the most recent auto-number based on the Lot ID.


    Or were you not pulling the MAX/COLLECT for the date into a separate date column?


    Either way... The Auto-number was intended to replace the date in the MAX/COLLECT portion which you were able to figure out despite me very poorly explaining the process.

  • Joseph Adams
    Joseph Adams ✭✭✭✭
    Options

    With potential multiple assays on the same date, the Max auto number is more useful than a Max date for the purposes of this formula. I do use the MAx date in another column to generate that most recent assay date but now that is just so we know if we have recently tested something.


    Thanks for all the help. It works very well now!

  • Paul Newcome
    Paul Newcome ✭✭✭✭✭✭
    Options

    Ah. Ok. I thought the max date was specifically for trying to pull the most recent. I see now that you wouldn't want to replace that data. Happy to help. 👍️

Help Article Resources

Want to practice working with formulas directly in Smartsheet?

Check out the Formula Handbook template!