How to check the Duplicate rows except the first occurance

Hi. I have been tring to implement this with a column formula. I was able to check all the occurances but I want to ignore the first occurance.

I used the following formula

=IF(COUNTIFS([Conference Room]:[Conference Room], [Conference Room]@row, [From:]:[From:], [From:]@row, [To:]:[To:], [To:]@row, [Time:]:[Time:], [Time:]@row) > 1, 1, 0)

can someone please let me know on how I can modify this formula such that I can get my desired output.


Thank you

Best Answer

  • Scott Orsey
    Scott Orsey ✭✭✭✭✭
    Answer ✓

    Hi @malanto,

    Since ROWID is always ascending, the most recent will have the largest value. So Just swap the MAX() function for the INDEX() function. In your case:

    =IF([ROWID]@row <> MAX(COLLECT([ROWID]:[ROWID], [Task Number]:[Task Number], [Task Number]@row, [Task Desc]:[Task Desc], [Task Desc]@row), 1, 0)

    Hope that helps,

    Be well

    If my response was helpful or answered your question please be sure to upvote it, mark it asawesome, or mark it as the accepted answer!

Answers

  • Scott Orsey
    Scott Orsey ✭✭✭✭✭
    edited 03/14/24

    I think I understand. You only want to return a "1" for the second duplicate row (not both rows that are duplicates), correct?

    If so, one way you can do this is to leverage a unique identifier for each row. If you don't have one already, you can use auto number to create a column of unique ROWIDs. Then, instead of COUNTIFS, you can use INDEX()/COLLECT() like this:

    =IF([ROWID]@row <> INDEX(COLLECT([ROWID]:[ROWID], [Conference Room]:[Conference Room], [Conference Room]@row, [From:]:[From:], [From:]@row, [To:]:[To:], [To:]@row, [Time:]:[Time:], [Time:]@row),1), 1, 0)

    The COLLECT() function is gathering all of the ROWIDs that meet the set of criteria.

    The INDEX() Function returns the first ROWID from the list you've just collected.

    The IF() function only sets it to 1 if the current ROWID isn't the same as the 1st ROWID in the collected list.

    In the end, you will get a "1" on all of the duplicate rows but the first duplicate. You'll get 0's everywhere else.

    Hopefully I didn't miss a comma or parentheses in my formula.

    Good luck!

    If my response was helpful or answered your question please be sure to upvote it, mark it asawesome, or mark it as the accepted answer!

  • malanto
    malanto ✭✭

    Hi Scott,

    can you help me with a similar case?

    How can I do the opposite?

    In case of duplicate, have a different value that allows me to delete the row automatically.

    For example

    I need

    delete rows ID 1 and 2 (because they are the oldest)

    don't delete row 3 (because it has no duplicates)

    don't delete lines 4 and 5 (they are duplicates of lines 1 and 2 but are the most updated)

    Do you know an easy way to proceed?


    Thank you!

  • Scott Orsey
    Scott Orsey ✭✭✭✭✭
    Answer ✓

    Hi @malanto,

    Since ROWID is always ascending, the most recent will have the largest value. So Just swap the MAX() function for the INDEX() function. In your case:

    =IF([ROWID]@row <> MAX(COLLECT([ROWID]:[ROWID], [Task Number]:[Task Number], [Task Number]@row, [Task Desc]:[Task Desc], [Task Desc]@row), 1, 0)

    Hope that helps,

    Be well

    If my response was helpful or answered your question please be sure to upvote it, mark it asawesome, or mark it as the accepted answer!

Help Article Resources

Want to practice working with formulas directly in Smartsheet?

Check out the Formula Handbook template!