Highlight duplicate values with a criteria within multiple select column

Sander K · 01/06/21

Hello

I have this system where workers report with a form their done tasks and locations for those specific tasks. I am trying to avoid the option for them to do duplicate inputs.

It works like this: worker chooses a task in form cell, then the locations where he did that task. They do multiple locations per form input, up to 20 or so, that is why the location is a multiple drop down list. Total list of locations is ca 1000. If one of those locations is reported for that specific task, then it is 100% completed and SHOULD NOT be reported again, however that happens, due to human error and so.

That is why I trying to come up with a helper column with some formula where it is highlighted that work for this location and this specific task there has already been reported.

Example of the form inputs look like this: new entry is added as the top row, leftmost column is is the Locations (a, b, c etc.) with multi select column, middle column is the Task ID, and rightmost column is where duplicates are highlighted.

My goal is so that in the right column the Red flag appears automatically to highlight that a location of row number 3 has already been reported in row number 5 for that specific task, same with row number 6 and 8. In this example I have marked the flag manually.

I figured I could make something with COUNTIFS to first count all locations with specific criteria of the task and compare it within the task ID, but I could not get it to work.

Link to the sheet, feel free to modify it: https://app.smartsheet.com/b/publish?EQBCT=47d9ee0c958d43b4884212959f1cba25

I would really appreciate some anybody has some ideas on how to proceed.

Thank you

Paul Newcome · 01/06/21

You have 1000 locations. How many tasks?

Sander K · 01/06/21

33 Tasks

Paul Newcome · 01/06/21

Ok. I have a couple of ideas, but will need to do some testing first. I'll get back to you with what I find out.

Paul Newcome · 01/06/21

Would you be open to a different method for tracking duplicates? Such as listing out which tasks (if any) were duplicated for each location?

Sander K · 01/06/21

hmmmm, not exactly. But I created now a separate file where I have a table:

Here I have ca 1000 rows with locations and 33 columns tasks. In the cells where are "OK", "Double" and "1" are nested IF formulas, which highlight if for that task and that location there is a duplicate reported.

=IF(COUNTIFS({range}, task1, {range}, CONTAINS($[Location]908, @cell)) > 0, IF(COUNTIFS({range}, task1, {range}, CONTAINS($[Location]908, @cell)) > 1, "DOUBLE", "OK"), 1)

Paul Newcome · 01/06/21

Yes. That is basically where I started as well. I built out the same table with a row for each location and a task for each column then used COUNTIFS to pull the count.

The only difference is that you used the IF's to populate "OK", "Double", or "1" directly in the grid based on the count, and I used a JOIN/COLLECT to pull a list into a separate field based on the counts. So you had the same basic idea as I did. Hahaha

One thing I would suggest... Enter the text for each task in row 1 across the top of your table and use cell references instead of specific text in your formulas. This will allow you to use an even more generic COUNTIFS that can be dragfilled both down the rows and across the columns. It also means that if you need to adjust the text of the task name, you can adjust it once in the cell in row 1 instead of adjusting it twice in the formula and having to dragfill again.

Leibel S · 01/06/21

@Sander K

Here is a solution, but requires a few extra columns...

LINE-ID: SYSTEM AUTO NUMBER
ROW#: =MATCH([Row ID]@row, [Row ID]:[Row ID], 0
Collect Location: (multi select column) =JOIN(COLLECT(Location:Location, [Task ID]:[Task ID], [Task ID]@row, [ROW#]:[ROW#], <=[ROW#]@row), CHAR(10))
Location Count: =COUNTM(Location@row)

Your duplicate flag formula would then be:

=IF(SUMIFS([Location Count]:[Location Count], [Task ID]:[Task ID], [Task ID]@row, [ROW#]:[ROW#], <=[ROW#]@row) > COUNTM([Collect Location]@row), 1, 0)

Let me know if you need help with the details.

Paul Newcome · 01/06/21

@Leibel S I'm having a little trouble following how this works... If I select Location A and Task 1 then later Location A and Task 2, wouldn't this setup flag it as duplicates even though it is a different task for that location?

Leibel S · 01/06/21

@Paul Newcome

I made a slight edit to the above comment to clarify that Collect Location should be a multi select column.

The basic premise is that in collect location we pull in all the locations that have been associated with this task (including the current row). Because it is a multi select column it automatically removes duplicates.

Location count is the qty of locations for that specific row

If I sum up the location count for this task till and including the current row it will give me the total qty of locations specified for this task (till an including this row).

If that number is more then the count of locations in 'collect location', that means that collect location removed a duplicate and that you have a duplicate on your row.

Paul Newcome · 01/06/21

@Leibel S Ah. Ok. Changing it to the multi-select did the trick. That was the part I was missing.

Side note: This solution will not flag the first occurrence. So using the screenshot in the original post it would only flag rows 5 and 8. To get it to flag all rows that have the duplicate including the first occurrence (as shown in the original screenshot), you would only have to remove all references to the row #.

Leibel S · 01/06/21

@Paul Newcome

Good point. That also simplifies it...

Paul Newcome · 01/06/21

@Leibel S I do like the idea of using an additional Multi-Select column to filter out duplicates. I am going to have to remember that little trick. I hadn't thought of it before. Now I need to try to find some of the other posts that I have commented on where we parsed the options out in helper columns to see if this solution can be adapted.

Sander K · 01/07/21

@Paul Newcome and @Leibel S Thank you very much for your help, I got it working nicely. However, right now when new input is reported and it is a duplicate to some of the previous ones, then the previous one is highlighted, not the new one. How could I make it the opposite way?

Paul Newcome · 01/07/21

If you remove the references to the row number from the formulas, it will highlight all duplicates including the first occurrence.

If your new entries are at the top of the sheet and you only wanted to flag the duplicates excluding the first entry, then you could swap the row references from "less than or equal to" to "greater than or equal to".

Leibel S · 01/07/21

@Sander K @Paul Newcome

There is an issue with my process where as once there is a duplicate all of the subsequent rows will be flagged. Will look into this further.

@Sander K Regarding your question about new one vs old one. Are the rows coming into the sheet on top or on bottom?

Highlight duplicate values with a criteria within multiple select column

Answers

Help Article Resources

Categories