Formula for Duplicate Entries
Answers
-
Can someone give an example of a formula you would use to catch duplicate entries in a sheet? If I have a "Name" column for example, how would that be implemented?
-
You can find duplicates by using a COUNTIF function to find if something appears more than once, and if it does, return a specific value (such as a checkmark in a check box column).
Try something like this:
=IF(COUNTIF(Name:Name, Name@row) > 1, 1, 0)
I would apply this as a Column Formula (see here).
Here are some other Community Posts where finding duplicates is discussed:
- Is there a way to catch duplicates?
- Cross referencing for duplicate numbers (cross-sheet formula)
- Finding duplicates and checking blank cells
Cheers!
Genevieve
Join us at Smartsheet ENGAGE 2024 🎉
October 8 - 10, Seattle, WA | Register now -
Here is another approach:
=IF((COUNT([Column A]1:[Column A]12)) = (COUNT(DISTINCT([Column A]1:[Column A]12))), "No Dups", "Duplicates exist")
The formula compares a count of all entries in a column (limited range in this case)to a count if distinct values from the same range.
You can put this at the top of a sheet, or in your sheet summary, or drop the cell into a report.
-
Is there a way to only check the box if the duplicate isn't the first in the list? I have been using =IF(COUNTIF(Resources$1:Resources@row, Resources@row) <> 1, 0, 1) as a formula but I need to make the formula a column formula. Is there any way to do this? @Genevieve P.
-
Hi @Sarah123
Since you're using an absolute reference with a row number (Resources$1) then no, you won't be able to make this into a column formula.
That said, if you lock the column, then as long as new rows are added either between two rows with that same formula or immediately below content in your sheet, the formula will auto-fill down your sheet automatically. See: Use or Override Automatic Formatting and Formula Autofill
Cheers,
Genevieve
Join us at Smartsheet ENGAGE 2024 🎉
October 8 - 10, Seattle, WA | Register now -
Hi @Genevieve P. - I was wondering. I have a row id and created the formula to count for duplicates. Mine is a bit different:
My formula allows me to convert to column formula, but it is just blank? =IF(Duplicates@row > 1, [Row ID]@row = MAX(COLLECT([Row ID]:[Row ID], 0, IF(Duplicates@row > 1, [Row ID]@row <> MAX(COLLECT([Row ID]:[Row ID], 10)))))) Can you tell me what I am doing wrong?
-
Hey! Happy to help.
The first thing I notice is that you have two statements that say the same thing... "If the Duplicates cell to the right is greater than 1..."
The second thing is that your MAX(COLLECT( statements are missing their closing parentheses, and potentially another range? The MAX function won't be able to read the text in the Row ID, as MAX only looks at numbers and dates.
Would you be able to explain what you want it to do? If you want to bring back all the duplicate row names, I would suggest using JOIN collect, like so:
=IF(Duplicates@row > 1, JOIN(COLLECT([Column to Return]:[Column to Return], [Row ID]:[Row ID], [Row ID]@row), ", "))
I'm not sure what column you want to bring back, so I've put in [Column to Return] as an example. If this hasn't helped, it would be useful to see our formula in the Duplicates column and hear an explanation of what you want the current formula to do.
Cheers,
Genevieve
Join us at Smartsheet ENGAGE 2024 🎉
October 8 - 10, Seattle, WA | Register now -
I would insert an Auto-number column (called "Auto" in this example). Ten you can adjust your COUNTIFS like so:
=IF(COUNTIFS(Resources:Resources, @cell = Resources@row, Auto:Auto, @cell< Auto@row)> 0, 1)
Come see me in Seattle at ENGAGE 2024! I will be at the 10xViz partner booth throughout as well as the "Lets Talk About Community" Meet & Eat table on Tuesday from 11:45am - 12:45pm!
-
@Genevieve P. In the duplicates for deletion column, I want to identify the old data that needs to be removed from the smartsheet and keep the latest data. Is that possible?
-
Hi Natalie,
What else do you have in your sheet? What I mean is, how would you define "old" data - do you have a Date Created column or an Auto Number like Paul is suggesting?
Can you also let us know what formula you have in the "Duplicates" column?
Join us at Smartsheet ENGAGE 2024 🎉
October 8 - 10, Seattle, WA | Register now -
I have the modified date and thought I could utilize that for the forumula to keep the latest update.
formula for finding duplicates: =IF(COUNTIF([Row ID]:[Row ID], [Row ID]@row) > 1, 1, 0)))
-
Hey Natalie,
This is great, thank you!
Here's the formula I would use:
=IF(OR(Duplicates@row = 0, Modified@row = MAX(COLLECT(Modified:Modified, Duplicates:Duplicates, 1))), "Keep", "Delete")
This says that if the Duplicate column says 0, this is automatically a "keep" row. OR if the Duplicates column has a 1 in it, but the current Modified Date is the MAX date (the newest date) then keep that as well.
In my sheet I have the "Duplicates" column as a Flag column so I don't expect numbers other than 1 to appear:
You can change the words around however you'd like, but this should make it easier to identify which ones to keep or delete. You can also use conditional formatting to highlight rows that should be removed:
Cheers,
Genevieve
Join us at Smartsheet ENGAGE 2024 🎉
October 8 - 10, Seattle, WA | Register now -
I would suggest some caution when leveraging the Modified (date) column.
In theory... If the formula changes it from unflagged to flagged, the modified column will update which in turn makes it the most recently updated row which then unflags the column and flags another, which starts the process all over again on another row.
In practice, you will end up with two rows that have the same exact modified date/time and even though they are duplicates, since they both meet the MAX criteria, they will both be flagged as "Keep" which completely defeats the purpose.
I have been trying to figure out a way around this for a while now. I think the trick is going to be very specific triggers in the Copy Row automation so that it only copies the row when certain fields are changed (to exclude the modified date/time change triggering and exclude the flag change triggering which would end up causing the same issue as above).
Come see me in Seattle at ENGAGE 2024! I will be at the 10xViz partner booth throughout as well as the "Lets Talk About Community" Meet & Eat table on Tuesday from 11:45am - 12:45pm!
-
Ah! Goodness I completely missed that the System Modified column would cause this trouble - of course.
What about a Date column that records a date with a workflow based on if specific columns are changed?
Join us at Smartsheet ENGAGE 2024 🎉
October 8 - 10, Seattle, WA | Register now -
@Genevieve P. I had thought about that too, but that only works if you won't have more than one row updated in a single day. For example... If I update the first "Row 1" in your screenshot this morning and then the second "Row 1" in your screenshot this evening, I will have two rows that are "Row 1" with the same date. If only one row will be updated each day, then that would certainly work though.
Come see me in Seattle at ENGAGE 2024! I will be at the 10xViz partner booth throughout as well as the "Lets Talk About Community" Meet & Eat table on Tuesday from 11:45am - 12:45pm!
Help Article Resources
Categories
- All Categories
- 14 Welcome to the Community
- Smartsheet Customer Resources
- 62.8K Get Help
- 376 Global Discussions
- 207 Industry Talk
- 440 Announcements
- 4.5K Ideas & Feature Requests
- 139 Brandfolder
- 129 Just for fun
- 130 Community Job Board
- 449 Show & Tell
- 30 Member Spotlight
- 1 SmartStories
- 284 Events
- 33 Webinars
- 7.3K Forum Archives
Check out the Formula Handbook template!