Formula for Duplicate Entries

2»

Answers

  • Yes, that makes sense - depends on how often these rows/duplicates are coming in.

    ^ Natalie, will your sheet always be ordered in a specific way? E.g. new rows at the bottom?

    If so, Paul what do you think about an auto-number column then a Row number column, using the MIN row number as the one to keep?

    Join us at Smartsheet ENGAGE 2024 🎉
    October 8 - 10, Seattle, WA | Register now

  • Paul Newcome
    Paul Newcome ✭✭✭✭✭✭

    @Genevieve P. I was thinking about that too. Leveraging an auto-number column could help, but that is assuming the first (or last) entry would be the one most recently updated.


    I assume that since Natalie (I can't tag her for some reason) is using the Modified as opposed to the Created that is not the case though.


    @Natalie (still can't tag her), can you confirm exactly what you mean by "latest update"?

  • PJ Beaudry
    PJ Beaudry ✭✭✭
    edited 06/06/23

    What about if I want to narrow the search for duplicates by reference multiple columns? That is, if I only want to flag a duplicate if the "Request Name" AND the "Date" columns match for multiple rows (those are the column names)? For example, if it wouldn't be enough for the "Request Name" to match to flag the potential duplicate.

    I assume it would be some variation of the formula that has AND() built in, but I'm not sure how to write it. I don't need to auto-delete or anything else mentioned above... I'd just like to have the formula check a "flag" column as mentioned.

    Apologies if this has been asked previously... I tried to scan the posts here, but didn't see anything referencing this.

  • PJ Beaudry
    PJ Beaudry ✭✭✭

    Never mind, figured it out through trial and error.

    Using the AND() function worked, but I ran into an issue where if one of the two reference columns were blank, it would flag, so I wrote a nested IF() that fixed it. Here's the result (which only flags a potential duplicate if both fields are filled out AND both match):

    =IF(OR(ISBLANK(Date@row), ISBLANK(Name@row)), 0, IF(AND(COUNTIF(Name:Name, Name@row) > 1, COUNTIF(Date:Date, Date@row) > 1), 1, 0))

  • Paul Newcome
    Paul Newcome ✭✭✭✭✭✭

    @PJ Beaudry Here is a simplified version:

    =IF(AND(Date@row <> "", Name@row <> ""), IF(COUNTIFS(Name:Name, Name@row, Date:Date, Date@row)> 1, 1))

  • @Paul Newcome Can you explain the formula you wrote?

    =IF(AND(Date@row <> "", Name@row <> ""), IF(COUNTIFS(Name:Name, Name@row, Date:Date, Date@row)> 1, 1))

    (From another post I submitted earlier) I would like to compare submissions such that older submissions from same field "Name" are moved to different sheet. I can add a helper column called "Older" and would like to either populate the older rows with a number to signify it is older, or a simple yes/no dropdown. I currently have a submission date (Created date) column.

  • Paul Newcome
    Paul Newcome ✭✭✭✭✭✭

    @mrodri90 That formula basically says this:

    If the Date and Name column "@row" are not blank, then we want to continue to the next piece of the formula.

    The next piece of the formulas says that if the count of rows that have the same Name AND Date "@row" is more than 1 then output 1 (flag or check a box).


    This is close in logic but not quite what it sounds like you need though. To flag older fields, you want to COUNTIFS the name is the same but the [Created Date] is "greater than" the [Created Date]@row and if there is at least one row with a greater [Created Date], then "@row" is an old row.

    We can also simplify just a little since we already know the [Created Date] column will never be blank (since it is system generated).

    =IF(Name@row <> "", IF(COUNTIFS(Name:Name, Name@row, [Created Date]:[Created Date], @cell> [Created Date])> 0, 1))

  • @Paul Newcome I got an #UNPARSEABLE


    Also is this a correct interpretation of the formula? If so, what does the purple part mean?


  • Paul Newcome
    Paul Newcome ✭✭✭✭✭✭
    edited 06/27/23

    @mrodri90 Sorry. That was supposed to be

    [Created Date]@row


    (the purple part should be the above)

  • RAllen
    RAllen ✭✭
    edited 03/14/24

    We have an imported project list and a duplicate check as we amend the list very similar to above

    =COUNTIF([Project Number]:[Project Number], [Project Number]@row)

    If the DupeCheck is greater than one it can be filtered and is a duplicate. I was copying 1, 2, in a column ahead, sorting and deleting all the 'duplicate' and '1's

    Now for you GURUs out there - would there be a column formula that could ID the duplicate relative to the list? I.E.:

    Dup Name

    1 SameName

    2 SameName

    3 SameName

    So we could sort by "dup" and delete everything greater than one? Pre-sorting would determine order preference...

    BUT- Smart sheets doesn't allow relative row reference in a column function?! or I have my function incorrect :(

    =COUNTIF([Project Number]:[Project Number]@row, [Project Number]@row)

  • Paul Newcome
    Paul Newcome ✭✭✭✭✭✭

    @RAllen You would need to insert an auto-number column (called "Auto" in this example) and a text/number column (called "Row" in this example) with the following column formula:

    =MATCH(Auto@row, Auto:Auto, 0)


    Then your dupe column formula would be something more along the lines of

    =COUNTIFS([Project Number]:[Project Number], @cell = [Project Number]@row, Row:Row, @cell<= Row@row)

  • RAllen
    RAllen ✭✭
    edited 03/18/24

    @Paul Newcome I'll try that! Unfortunate to have to add another column but every little bit helps. Thanks!

    Works - VERY CPU intensive with over 10K entries! I wonder if this us why SS 'doesn't handle' partial row indexing in their functions? I'll add a on/off when the page comes back : )

Help Article Resources

Want to practice working with formulas directly in Smartsheet?

Check out the Formula Handbook template!