Rank themes by frequency

Kate HC
Kate HC ✭✭
edited 10/10/22 in Formulas and Functions

Hello,

I would like to count and rank themes of concerns on my dashboard so we know which issues come up the most frequently. These are not pre-populated and will change over time as new themes come up.


Example:

In these 5 rows, I would want to display: Care plan (4), Communication (1), EVS Variation (1). If there were a tag of "cleanliness" next week, I would like the new list to autopopulate with: Care plan (4), Communication (1), EVS Variation (1), Cleanliness (1). Ideally, I would like my dashboard to only show the top 10 themes.



Any recommendations? I have thought about COUNTIF and HAS formulas, but cannot figure out how to include new themes as they emerge.

Thank you

Answers

  • PWNA Sam Harwart
    PWNA Sam Harwart ✭✭✭✭✭

    How frequently will the themes change? If the list is known or wouldn't be difficult to update, there are some possible solutions.

    If it needs to be 100% dynamic, you could use a report that summarizes & groups by count but that doesn't split multi-select columns or sort only the top 10.

    It's an interesting problem. Looking forward to seeing what people can come up with to solve.

  • PWNA Sam Harwart
    PWNA Sam Harwart ✭✭✭✭✭

    Hey @Kate HC - still looking for a solution? I've got something that, albeit kind of messy, works. It uses a helper sheet and would need some manual intervention every so often to add rows, but is generally pretty automated. Let me know if it would be helpful and I can post details!

  • Kate HC
    Kate HC ✭✭

    Thanks for the response. It would need to be as dynamic as possible. We have lots of responses to sort through, so manually updating would be very difficult.

    I would appreciate your thoughts.

  • PWNA Sam Harwart
    PWNA Sam Harwart ✭✭✭✭✭

    @Kate HC

    First I made a sheet that holds the data - I've used colors where you'll have your themes.

    Everything else happens on the helper sheet:

    The [Primary Column] is 0 - N, where N is the max number of combos you think you might realistically see. I used 50 for my testing, but there's no reason you couldn't set up way more.

    The [Raw Uniques] parses the list of colors from the multi-select column on the original sheet and extracts all of the unique combinations. It creates a list of the combinations and then uses the [Primary Column] value to get the Nth item in the list. You can make this a column formula.

    =IFERROR(INDEX(DISTINCT({Colors}), [Primary Column]@row), "")

    [Combine Unique Multi] takes all the unique colors from the list of unique combinations and puts them into a single cell. You'd need to change the "50" value here if you want more rows. CHAR(10) is a return, which is what separates multi-select values.

    =JOIN([Raw Uniques]2:[Raw Uniques]50, CHAR(10))

    Next, I turned it into a normal text string, again using CHAR(10) in the [Convert to Text, What's Left] column. This is where some of the formulas are different for the first row or two.

    For Row 1, I used this formula:

    =JOIN([Combine Unique Multi]@row, ";")

    For Row 2 and beyond, we're going to use "What's Left" after we remove each unique color as we go through the list.

    =RIGHT([Convert to Text, What's Left]2, LEN([Convert to Text, What's Left]2) - [Find CHAR 10]@row)

    [Find Char 10] is going to locate the next return separating each value in our list. Row 1 is blank, Row 2 has this:

    =IF([Find CHAR 10]@row = 0, [Convert to Text, What's Left]@row, LEFT([Convert to Text, What's Left]1, [Find CHAR 10]@row))

    Eventually, we get to the bottom of the list and don't find any other CHAR(10) values, so we'll have to deal with that as well.

    [Separate Multi] extracts a unique value for each row, using the CHAR(10) as its delimiter.

    =IF([Find CHAR 10]@row = 0, [Convert to Text, What's Left]@row, LEFT([Convert to Text, What's Left]1, [Find CHAR 10]@row))

    Now we're getting to the easier part. [Color Frequency] looks at the original sheet to count the occurrences of each unique color.

    =COUNTIF({Colors}, CONTAINS([Separate Multi]@row, @cell))

    I couldn't find a better way to do this earlier in the process, so [Color Name] now eliminates the repeated final value. This is the one we'll use for our reports.

    =IFERROR(INDEX(DISTINCT([Separate Multi]:[Separate Multi]), [Primary Column]@row), "")

    From there, we can build a simple report using the [Color Name] and [Color Frequency]. Just filter out any error values from the formulas.

    Now you could also add in a column that used =LARGE to rank the frequency of the colors and then only pull in the top 10 on your report, or pull in them all to a dashboard and then make it small enough that only the top XX show.

    For example, I sized it to only show the top 5. You can scroll to see the rest if needed.

    Whew. This was a mess and I am 1000% sure this isn't the best way to do this, but I've at least got it working. The computer science classes I took back in undergrad lead me to believe this is a super inefficient way to solve the problem and a python script would have made quick work of it.

    Maybe someone like @Andrée Starå will come drop in and find a single formula to solve the whole thing 🤣

    Let me know what you think and what questions you have!

  • Kate HC
    Kate HC ✭✭

    Wow thank you for all your effort on this - It may take me a few days to sort through it! I will let you know what questions I have.

  • PWNA Sam Harwart
    PWNA Sam Harwart ✭✭✭✭✭

    Sure thing. It was a fun problem to try and solve. Hopefully it helps! I'm sure you'll come up with a more efficient solution as you try and implement it.

Help Article Resources

Want to practice working with formulas directly in Smartsheet?

Check out the Formula Handbook template!