INDEX & MATCH using multiple criteria

Hi and thank you!

I'm using Smartsheet to track and complete medical cases. I'm using two sheets, one patient-facing that I've called the Intake sheet, and one for me, which I call the Tracking sheet. I'm using INDEX and MATCH to copy basic demographic data from the intake sheet to the tracking sheet. Since the same person can have more than one case, the one field that would be different and desired is the date of the case (for my purposes, multiple cases on the same date are treated as the same case). I'm using email as the unique identifier of the person, and would like to use the date of the case as the case identifier for each case that needs to be tracked. I've coded a flag on the intake sheet to let me know when there are duplicates, and I bring that result onto the tracking sheet using INDEX and MATCH. In the case of duplicates (only the date field of the case matters), I want to bring only the most recent case date forward onto the tracking sheet.

I have the INDEX and MATCH working for the other fields, but need to compare dates to get the most recent case.

=IFERROR(INDEX({Intake Date of Exposure}, MATCH([Email]@row, {Email}, 0)), "Email Not on Intake")

«1

Answers

  • Andrée Starå
    Andrée Starå ✭✭✭✭✭✭

    Hi @Brent Guhl

    I hope you're well and safe!

    You need to use an INDEX/COLLECT structure instead to only show the latest case.

    Did that work/help?

    I hope that helps!

    Have a fantastic week & Happy New Year!

    Best,

    Andrée Starå | Workflow Consultant / CEO @ WORK BOLD

    Did my post(s) help or answer your question or solve your problem? Please support the Community by marking it Insightful/Vote Up or/and as the accepted answer. It will make it easier for others to find a solution or help to answer!

    SMARTSHEET EXPERT CONSULTANT & PARTNER

    Andrée Starå | Workflow Consultant / CEO @ WORK BOLD

    W: www.workbold.com | E:andree@workbold.com | P: +46 (0) - 72 - 510 99 35

    Feel free to contact me for help with Smartsheet, integrations, general workflow advice, or anything else.

  • Hi @Brent Guhl

    To add to @Andrée Starå's answer, here are some other Community posts with examples of an INDEX(COLLECT formula

    If this hasn't helped, it would be useful to see screen captures of your sheet and the formula you're currently working with, but please block out sensitive data.

    Cheers,

    Genevieve

    Need more help? 👀 | Help and Learning Center

    こんにちは (Konnichiwa), Hallo, Hola, Bonjour, Olá, Ciao! 👋 | Global Discussions

  • Hi @Andrée Starå & @Genevieve P. Thank you both for the help.

    The INDEX COLLECT I've got is:

    =IFERROR(INDEX(COLLECT({Intake Date of Case}, {Intake Email}, Email@row), {Intake Multiple Cases}, 1), "Email not on Intake")

    However, it's not running correctly, only returning "Email not on Intake" rather than looking and returning the most recent.

    Another possible problem is overwriting dates. I don't want that. I only want to bring in the new case date, not erase the old one when duplicates are present.

  • Hi @Brent Guhl

    It looks like you have a closing parentheses too early in your formula! You'll want the COLLECT function to contain all of your columns and criteria, then end off with another 1 in the INDEX function to identify you want the first row match to be brought back, like so:

    =IFERROR(INDEX(COLLECT({Intake Date of Case}, {Intake Email}, Email@row, {Intake Multiple Cases}, 1), 1), "Email not on Intake")

    Would you mind clarifying your last sentence? I presume your {Intake Multiple Cases} is the flag column and you're looking for only the row that's flagged (or 1), which will indicate that it's the most recent out of all of the duplicates. Is that correct, or do you also have a date column that you need to evaluate?

    It would be helpful to see screen captures of both sheets with sensitive data blocked out, if possible.

    Cheers,

    Genevieve

    Need more help? 👀 | Help and Learning Center

    こんにちは (Konnichiwa), Hallo, Hola, Bonjour, Olá, Ciao! 👋 | Global Discussions

  • Thank you so much @Genevieve P.

    I do indeed use a flag with 1 indicating duplicates:

    =IF(COUNTIF(Email:Email, Email@row) > 1, 1)

    Moving the parentheses and putting in the 1), 1) gives me a formula that now works, but returns the first case date rather than the last one:

    =IFERROR(INDEX(COLLECT({Intake Date of Case}, {Intake Email}, Email@row, {Intake Multiple Cases}, 1), 1), "Email not on Intake")

    Rather than the first match, how can I bring forward the latest case date? Would the best solution be to fix the flag formula from a binary to a counting system?

    I was playing with the MAX function yesterday, and could get the highest date returned, but noticed at least one case where it updated a past case's date to the most recent date. Each case needs to be independent, so the date of each becomes the unique identifier when the same person has multiple cases.

    Intake sheet, with the flag formula:

    Tracking sheet, using 3 different formulas. Yellow indicates errors:

    Rows 1-5 are using index and match. These are pulling only the first date, so line 3's date is an error. Rows 6-8 are using the formula: =MAX(COLLECT({Intake Date of Case}, {Intake Multiple Cases}, 1)) , which is returning the most recent date, so all of them are wrong, even line 7, which just happens to fall on the max date. Lines 9-11 are using the INDEX(COLLECT formula, with the parentheses moved and the 1), 1) you helped me with. However, the results returned are the first date, not the most recent. Line 9 is also wrong, even after saving and refreshing on the Intake then Tracking sheets.

    How can I modify any or all of these to return the most recent case for each cartoon character?

  • Hi @Brent Guhl

    Thank you for clarifying! Yes, a MAX function is needed here.

    What I would suggest doing is actually change up what your Flag column looks for. Instead of looking for duplicates, have it flag the Oldest Case per Person. This means there will be a flag even if the email only has one row associated with it, so the Collect function will still find that row (this is why ppig was returning an incorrect result; our formula is filtering out all rows that aren't checked).

    Try using this as your flag formula:

    =IF([Date of Case]@row = MAX(COLLECT([Date of Case]:[Date of Case], Email:Email, Email@row)), 1, 0)

    Then once your flag is only highlighting the Max date rows, you can use the same INDEX(COLLECT formula for your cross-sheet formula:

    =IFERROR(INDEX(COLLECT({Intake Date of Case}, {Intake Email}, Email@row, {Intake Multiple Cases}, 1), 1), "Email not on Intake")

    Keep in mind if you list the email twice in your second sheet, you will see the same date repeated as it will only bring back the Max date for that email. Does that make sense?

    Cheers,

    Genevieve

    Need more help? 👀 | Help and Learning Center

    こんにちは (Konnichiwa), Hallo, Hola, Bonjour, Olá, Ciao! 👋 | Global Discussions

  • Thank you @Genevieve P.

    I'm mulling over the implications and am concerned about the date thing. In the situation of tracking a single case/event, if we get only the most recent date, and it updates all entries of the same person, we won't be able to use all the automations and calculations I have set up based on the date of the incident. One example is the automated reminders, of which there can be many. This could also be a problem for coding stats such as turn-around-time, types of cases in a given day/month/year, and expected duration of cases. Is there a way to keep that case date in tact without resorting to copy and paste or going back to assigning case numbers? Is there a way to count instances of the same person, then return the highest number, using that to bring in the date?

  • Hi @Brent Guhl

    I think I've been misunderstanding what it is you're tracking on your tracking sheet. Re-reading your original post, it sounds like you actually do want duplicates of the same person, as long as the date is different, is that correct?

    INDEX(MATCH or INDEX(COLLECT functions need some sort of unique identifier to filter down and find the matching rows across sheets. In your case, we were using the email address as the unique identifier, however if you want to bring back multiple, different rows with the same email, this is no longer unique. The date, which makes that email unique, is the value you're trying to bring back, so we can't use this to Match across sheets.

    Can I ask why you are looking to use a formula to gather this data? It sound like perhaps a Copy Row automation would be better to copy over the information from the source to the tracking sheet, or if you need the data to be dynamic (so the date will change) then perhaps a Report that clarifies the information you need to see (we could exclude duplicates using a formula and then filter by this in the Report).

    Need more help? 👀 | Help and Learning Center

    こんにちは (Konnichiwa), Hallo, Hola, Bonjour, Olá, Ciao! 👋 | Global Discussions

  • Correct. The same person can have many cases, both of the same type and of different types. No other piece of demographic information will work since they will all be the same for that individual. The date of the case is what is needed.

    My idea was simply to build in both the date and the email criteria, thus making the combination unique. I just need to keep learning how Smartsheet coding works to do this. (If we have to, we can create a join field).

    Automations such as copying or moving a row would be great, but can't be used because they will copy the entire row. Due to the way Smartsheet is designed, (it's a flat table), there are or can be a lot of columns. In the specific application I've built, I ask questions such as job type, and have options such as doctor, nurse, tech, advanced practice, and other. For each of these general options there are sub options (such as what kind of doctor (resident, fellow, faculty), or what type of nurse (RN, LVN, CNA/NA)). Due to these options, the Intake table has 72 columns, and the Tracking table has 64.

    The logic in forms is fantastic, so docs only see doctor stuff and nurses only see nurse stuff, but due to the large number of automations I'm running (20 on the Tracker and 5 on the Intake), plus the outward vs. inward facing design, I and others I've asked, don't see a way to combine the tables--even if that were desirable (which it isn't due to the sheer size).

    Reports won't work due to the need for automations. Each type of case requires different types of follow ups, and most of the automations are built around automatically sending reminders of one type and another. Reports don't work well for dashboards either, since I want to screen stuff (by month, by company, by body part, by job class, etc.), so I've done a fair bit of coding on the underlying grid to make the dashboard easy.

    So I have two ideas: First, can we combine the email and the date, thereby creating a unique identifier for each case (since multiple cases that happen to the same person on the same day are treated as the same case for my purpose here), or, can we use the auto date/time field on the Intake (which will be unique) instead on the date of the case? That will work in all cases except the ones where someone comes in today to report something that happened last week. Those are rare (one a month maybe), and combining that with the rare multiple cases to the same person, we might get lucky and have no mistakes. It's got to be something like this or go back to manual copy and paste! Let me work on this last one as it should work.

  • The piece of the puzzle I'm missing is the comparison. I should be able to build onto the flag column by doing a simple nested IF(AND statement to compare the Email@row with the Email:Email column, and if true that there are more than one of the emails in the email column that match the one at the row, then compare the dates and return the highest.

    The problem here is that it will update all the dates. To get around that, I created a helper column in the Intake that JOINS the the email and the date, making a unique identifier to the minute--then I asked why I needed to do that: Just use the auto date/time column. However, I keep getting errors back, both on the auto date/time and the join columns. I'm pretty sure I've got the syntax right, since it's working on the email data. It can't be the data types or the auto portion, since I've got the combined working. GRR. I'm starting to run low on creative workarounds here!

    =IFERROR(INDEX({Intake Date of Case}, MATCH([Created + Email]@row, {Intake Created + Email}, 0)), "Email Not on Intake")

    This formula results in "Email Not on Intake"

    =INDEX({Intake Date of Case}, MATCH(Created@row, {Intake Created}))

    This one results in a blank until the match field is filled in, and when it is, the result is "No Match," though clearly there is, since it's blank until the match field is entered. Wrapping with the IFERROR does not change these results.

    What's vexing at this point is that it looks like contradictory behavior: Why is the email field working, but not the auto date or the join? I presume it's me that has done something wrong here ;-)

  • Hum. Copying and pasting a working formula in one cell results in an error in another.

    =IFERROR(INDEX(Intake Date of Case}, MATCH(Email@row, {Intake Email}, 0)), "Email Not on Intake")

    I won't be able to find the problem while this is going on...


    @Genevieve P. , is it possible to modify the formula you posted on 11/2/21 to @Reza Djangi so that the latest result gets returned? Instead of the 1 at the end, have a series and use a MAX statement or nested IF? Or use the JOIN(COLLECT then DATEONLY or something to bring back just the date? Here's what you posted on 11/2/21:

    =INDEX(COLLECT({Column To Return}, {Criteria Column 1}, "Criteria 1", {Criteria Column 2}, "Criteria 2"), 1)

    You need the 1 at the end of the INDEX function to identify what row to bring back. In this instance, the first match for all those criteria.

    If you may have multiple matches for the same criteria, you can use JOIN(COLLECT

    =JOIN(COLLECT({Column To Return}, {Criteria Column 1}, "Criteria 1", {Criteria Column 2}, "Criteria 2"), ", ")

  • Hi @Brent Guhl

    You've done a lot of work since I was last on this thread!

    The first thing to note is that with Created System Column dates, this does also include a timestamp. This means that if you create a row on the same date but at different times (even within a minute different), it won't find a match. You could use the DATEONLY function to grab just the date... perhaps like this:

    =INDEX(COLLECT({Date of Cases}, {Created}, DATEONLY(Created@row), {Email}, Email@row), 1)


    If you're not using the Copy Row automation, how are you getting other data into this second sheet? For example, the email address that you're using to match and bring back dates... how is this populated? Are you manually copying all the emails that come in each day to this second sheet, then wanting the formula to populate the rest?


    Your last comment...you may be on to something here.

    What I've never thought about before is using a formula to change out the 1 at the end of the INDEX function in an INDEX(COLLECT... if we could replace that with a COUNTIF formula which counts how many times this email has appeared in this sheet, then use that number as the row to bring back... this could work!


    Try this:

    =INDEX(COLLECT({Date of Cases}, {Email}, Email@row), COUNTIF(Email$1:Email@row, Email@row))


    This should bring back the correct Date for each instance of the same email! It would only match up correctly though if you had the exact same number of emails in this second sheet as you do in the first sheet.

    (@Paul Newcome have you used INDEX(COLLECT like this before to fill out duplicate rows, where you don't have a unique identifier? Am I missing something crucial?)


    So since this needs the same number of emails, I would set up the same INDEX(COLLECT formula but in the Flag column, so you can use a filter on the second sheet to ignore all the duplicates and focus just on the first entry.

    =INDEX(COLLECT({Flag Column}, {Email}, Email@row), COUNTIF(Email$1:Email@row, Email@row))


    Is this getting closer to what you need?

    Need more help? 👀 | Help and Learning Center

    こんにちは (Konnichiwa), Hallo, Hola, Bonjour, Olá, Ciao! 👋 | Global Discussions

  • Paul Newcome
    Paul Newcome ✭✭✭✭✭✭

    @Genevieve P. I have, but it is rare (it does work though). 99% of the time I try to find some way to create a unique ID.


    I would also suggest using the 2 column method for generating the row number on the sheet so the COUNTIF could be changed to a COUNTIFS to pull in lower numbered rows and can then convert the formula into a column formula.

  • Hi @Genevieve P. ,

    It seems that there has to be a manual component to getting things from one sheet to another due to this primary key. Last year I was copy and pasting the demographic data. This year I'm using the index and match for all demo data except the date of the case. For that, I need a unique identifier because of this required manual component. Once that key is in hand, I just copy that key from the Intake to the Tracking sheet, and once the formula(s) are set up correctly, the rest of the data will populate (after saving on the Intake and refreshing on the Tracker). If there's a way to avoid any copy and paste, I'm all ears!

    I have to confess some frustration that the date/time stamp won't work. Since this would be the most unique of all identifiers (it takes a while to complete 72 columns!), it would seem the natural choice. It shouldn't be a problem to have the email numbers match, since every case that is entered on the Intake is tracked on the Tracker. Still, this seems like another unnecessary limitation. I suppose we could use the auto number feature. I don't want to because in the bad old days we used an assigned number for each case, and I've only just completed getting the program completely off of paper. Dropping that number was the final step, as the number was used on paper labels.

    If we go back to one of the initial ideas I mentioned, we should be able to use the flag formula to count the number of email duplicates, assigning a simple 1-10 or whatever to each case based on the number of times the email shows up in the intake. Then that number becomes the key that links the correct date--but we'd still need a key to link the sheets, so that may not be the best answer. As long as we need a unique identifier to link the sheets, that would shortcut all the rest of this. I don't see why the JOIN of the date stamp + email isn't working.

    Putting in the second formula you've mentioned, it's reporting the correct dates (once the column property is set to date). YES! An answer that works! I am concerned, however, by your "focus on the first entry" statement. I need the dates to be locked on the Tracker, as I have 20 automations running based on these, as well as several other in-sheet formulas. The date of the case cannot change without undoing all the efficiency Smartsheet offers.

    To explain a bit more about the process, every case is entered and tracked by definition: If it's not a case, it's not a case; if it is, it's entered and tracked. The end point of all cases is "completed." I've set up a nested IF statement to turn Harvey balls different colors to reflect the stage of the case. Since the end point of all cases are to be completed, the tracker requires only a simple filter, set to show only "Active Cases." This means I wouldn't need the flag column at all. I set it up as a stop-gap measure until I could solve the unique identifier/date-of-the-case problem.

    @Paul Newcome Thanks for your insight and continuing the discussion. I agree that a unique identifier would seem a to be a simpler solution. Can you help me understand why the date/time + email isn't working? This would seem to be a unique identifier.

  • Hi @Brent Guhl

    How exactly are you using the Date/Time column? You can use a DATEONLY function to remove out the timestamp so you just have the Date associated with that row, and then combine just the date with the email address:

    =DATEONLY(Created@row) + " - " + Email@row

    Or when you add a Count as well:

    =DATEONLY(Created@row) + " - " + Email@row + [Helper Count]@row

    However that same value would need to then be on your second sheet, so would you be copy/pasting this identifier into the second sheet?


    It sounds like the =INDEX(COLLECT({Flag Column}, {Email}, Email@row), COUNTIF(Email$1:Email@row, Email@row))

    formula is working for you, which is great!

    The "focus on the first entry" comment was in regards to how you want to ignore duplicate entries on the exact same day. You would need to have an indicator on the source sheet (the flag) which identifies that this entry is the first entry today, so it's the one you want to pull from (and then all the other entries on the same day from that same email are ignored). This is based on your initial question: (for my purposes, multiple cases on the same date are treated as the same case)

    I do think at this point we may be at a roadblock for what we can discuss in writing without seeing your actual sheet set-up and getting a feel for the entire process, including automations, etc. I would recommend booking a 30-minute screen share coaching session through Pro Desk, here, since your account has access to it.

    Cheers,

    Genevieve

    Need more help? 👀 | Help and Learning Center

    こんにちは (Konnichiwa), Hallo, Hola, Bonjour, Olá, Ciao! 👋 | Global Discussions

Help Article Resources

Want to practice working with formulas directly in Smartsheet?

Check out the Formula Handbook template!