Number of duplicates

Pltrmgrl
Pltrmgrl
edited 12/09/19 in Formulas and Functions

Hello,

 I'm brand new to Smartsheets. Is there a way to find the number of duplicates in a column for a specific value? For example, how many "John Smiths" are in a list of names? 

Many thanks,

Pltrmgrl 

Tags:
«1

Comments

  • Mike Wilday
    Mike Wilday ✭✭✭✭✭✭

    You could do a quick formula using countif. 

    =countifs([column name]:[column name], "John Smith")

    Would count the number of times the name John Smith appears in a column. 

  • Paul Newcome
    Paul Newcome ✭✭✭✭✭✭

    Not to say Mike's formula won't work. It will. I just wanted to expand on it a little bit and maybe provide some other options.

     

    If you add a column that has a quick list of each name entered only once, you could build off of Mike's formula and use

     

    =countifs([column name]:[column name], [list column name]@row)

     

    next to each name in the list column. That would give you the same result and keep all data in a central location. Adding -1 to the end of the formula would give you how many times it shows up in the first column, not including the first occurrence.

     

    If John Smith is on the list twice, the first formula will give you a count of two total occurrences. The second formula (with -1 on the end) will give you a count of 1 duplicate entry.

     

    It just depends on exactly what you are looking for and how you want to display the data.

  • Hi,

    I can see that this is an older thread, but I have searched high and low to find a solution but haven't been successful.


    I'm trying to find a way that will identify the Second occurrence of a duplicate, only.

    For this exercise, I'm only looking at one Column [Receipit_ID]

    Row [Receipt_ID]

    1 NCC007533

    2 NCC007525

    3 NCC007533

    4 NCC007533

    5 NCC007525

    6 NCC007525



    I can get the Duplication flag, using a Checkbox column with formula

    =IF([Receipt_ID]1 = 0, 0, IF(COUNTIF([Receipt_ID]:[Receipt_ID], [Receipt_ID]1) > 1, 1))


    I have also tried using a "Duplicate Count" formula

    =COUNTIFS([Receipt_ID]:[Receipt_ID], [Receipt_ID]1) - 1


    Ideally I want to be able to flag (checkbox) every duplication occurrence, except for the first one.


    Is this possible?

    Thanks in advance for any assistance you may be able to provide.


    NB: if there is a simple "Remove Duplicates" option that I haven't come across yet, please let me know and I'll use that.


    Cheers

    Steve

  • Hey Steven,

    Here is a formula that will help you find only the 2nd duplicate.

    =IF(ISBLANK([Receipt_ID]3), "", IF(COUNTIF([Receipt_ID]$3:[Receipt_ID]@row, [Receipt_ID]3) = 2, "2nd Value - " + [Receipt_ID]3, ""))

    First Part: Ignore Blank Cells:

    =IF(ISBLANK([Receipt_ID]3), "",

    If you don't have this or another variation of this function you will notice that when the formula counts 2 blank rows that you will get a "2nd Value - " with no NC number value on the end which isn't helpful. So we say if cell is blank @ row then "".

    Second Part: The Formula

    IF(COUNTIF([Receipt_ID]$3:[Receipt_ID]@row, [Receipt_ID]3) = 2, "2nd Value - " + [Receipt_ID]3, ""))

    Next we are saying IF the COUNTIF of a specific Receipt_ID @ row equals 2 then say "2nd Value - " + value from that specific row.

    Hope this helps @StevenHawkins let me know if you have any questions.

    Best,

    Jeff

  • Paul Newcome
    Paul Newcome ✭✭✭✭✭✭

    @StevenHawkins Did you want to find ONLY the second, or were you wanting to flag every duplicate other than the 1st? Using @Jeff Zumaris's solution means you will not locate a 3rd, 4th, etc duplicate.


    If you wanted to flag all duplicates except for the first, it would require only a minor modification to your originally provided formula.

    =IF([Receipt_ID]1 = 0, 0, IF(COUNTIF([Receipt_ID]$1:[Receipt_ID]@row, [Receipt_ID]@row) > 1, 1))

  • Thank you both @Jeff Zumaris and @Paul Newcome

    Yes, I should have been clearer, I did indeed want every duplicate identified.


    Have just tested and Paul's formula adaptation works perfectly.


    Thank you so much for the assistance.

  • Paul Newcome
    Paul Newcome ✭✭✭✭✭✭
  • Hi Paul,

    What If I want to number the duplicate values by sequence like below?


    Item Duplicate Sequence

    apple 1

    banana 1

    apple 2

    apple 3

    banana 2

    apple 4


    Is this possible?


    Hope you find this post and Thank You in advance.

  • Paul Newcome
    Paul Newcome ✭✭✭✭✭✭

    @Julius Benn Sabeniano I would suggest inserting two helper columns. The first is a system generated auto-number column (no need to bother with any kind of syntax or anything/just let it go with the default). The second will be a text/number column called "Row ID" with the following column formula:

    =MATCH([Auto-Number]@row, [Auto-Number]:[Auto-Number], 0)


    Then to get the number you would enter this column formula into a text/number column:

    =COUNTIFS([Fruit Column]:[Fruit Column], @cell = [Fruit Column]@row, [Row ID]:[Row ID], @cell <= [Row ID]@row)

  • Hi Paul,

    Sorry for the late reply and thank you for helping me out.

    it looks like it did not show the expected result. It counts all the same or duplicate item. What I am needing help is to make a number system for the duplicate Items?

    See the table below. Is this possible?

  • Paul, I tried it and I forgot to put <= ROWID part, it is working now. Very Nice! Amazing!

  • @Paul Newcome

    This is perfect. I have this implemented and working.

    I am using this to track our backlog activity, specifically which stories appear in multiple Sprints.

    I am then calculating Story Points and Developer estimates by Sprint on a dashboard and reports.

    My struggle, I want to setup an automated Workflow that will only keep the highest count of the duplicate. The reason here is that the highest counted duplicate is the Sprint where the effort should show.

    I am not sure how to get the automation to find the highest count? My thought was maybe to reverse the count if and then only keep the Duplicate Count row IF=1?

    Hopefully this makes sense.



  • @Paul Newcome

    I am having trouble getting the formula to work to only identify the second duplicate. I can successfully identify all duplicates using this formula:

    I think I need to adjust the COUNTIF range to start from the first row and end at the current row but it says my formula syntax isn't right:


  • Paul Newcome
    Paul Newcome ✭✭✭✭✭✭

    @alexandromeda You would need to insert an auto-number column with no special formatting. Then you would use something like this...


    =IF([Lab Sample #]@row <> "", IF(COUNTIFS([Lab Sample #]:[Lab Sample #], @cell = [Lab Sample #]@row, [Auto-Number Column]:[Auto-Number Column], @cell <= [Auto-Number Column]@row) >= 2, "DUP")

Help Article Resources

Want to practice working with formulas directly in Smartsheet?

Check out the Formula Handbook template!