Formulas and Functions

Formulas and Functions

Ask for help with your formula and find examples of how others use formulas and functions to solve a similar problem.

FIND Not Working because of Colon

I am trying to extract the xth word from a value in [H-Column Name-UF]. Let's say, for example, I am trying to extract the FOURTH word from the value in [H-Column Name-UF].

If [H-Column Name-UF] = "Grief and Bereavement Specialist Professional Certificate", the formula below returns "Specialist"

If [H-Column Name-UF] = "Introduction to CPA Training", the formula returns "Training".

If [H-Column Name-UF] = "Sorting the Spin: How to Spot Made-up, Misleading, and Murky Information", the formula returns "Here". (It is normally a blank but I added "Here" to test out where the formula seems to be failing.)

The only difference I see is that "Sorting the Spin: How to Spot Made-up, Misleading, and Murky Information" has a colon (but maybe any kind of punctuation) after the third word. (Note: Not every value in [H-Course Name-UF] would have a colon, and if it did, it wouldn't always appear after the third word. If the colon is after the first or second words, no problem extracting the second or third word, respectively. It's just when the colon appears after the third, fourth, fifth, etc. word.

I tested "Sorting the Spin, How to Spot Made-up, Misleading, and Murky Information" and I get "Here", so looks like any punctuation, not just a colon.

The other columns (Helper5a and Helper6a) used to find the fifth and sixth words are also blank, even though "Sorting the Spin: How to Spot Made-up, Misleading, and Murky Information" has a fifth and sixth word ("to" and "Spot").

Here is the formula to find the FOURTH WORD (Helper4a), which is returning "Here" when there is a colon after the third word. It should return "How".

=IFERROR(IF(LEN([H-Course Name-UF]@row) - LEN(SUBSTITUTE([H-Course Name-UF]@row, " ", "")) = 3, MID([H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row) + 1) + 1), LEN([H-Course Name-UF]@row) + 1 - FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row) + 1) + 1)), IF(OR(LEN([H-Course Name-UF]@row) - LEN(SUBSTITUTE([H-Course Name-UF]@row, " ", "")) = 4, LEN([H-Course Name-UF]@row) - LEN(SUBSTITUTE([H-Course Name-UF]@row, " ", "")) = 5), MID([H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row) + 1) + 1), FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row) + 1) + 1) + 1) - FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row) + 1) + 1)), "Here")), "")

=================================================

If it helps, here is the formula to find the FIRST WORD (Helper1a):

=IFERROR(MID([H-Course Name-UF]@row, 1, FIND(" ", [H-Course Name-UF]@row) - 1), "")

Looks if the FIRST WORD is all in caps (Helper1b):

=IF([H-Course Name-Helper1a]@row <> "", FIND("A", [H-Course Name-Helper1a]@row) + FIND("B", [H-Course Name-Helper1a]@row) + FIND("C", [H-Course Name-Helper1a]@row) + FIND("D", [H-Course Name-Helper1a]@row) + FIND("E", [H-Course Name-Helper1a]@row) + FIND("F", [H-Course Name-Helper1a]@row) + FIND("G", [H-Course Name-Helper1a]@row) + FIND("H", [H-Course Name-Helper1a]@row) + FIND("I", [H-Course Name-Helper1a]@row) + FIND("J", [H-Course Name-Helper1a]@row) + FIND("K", [H-Course Name-Helper1a]@row) + FIND("L", [H-Course Name-Helper1a]@row) + FIND("M", [H-Course Name-Helper1a]@row) + FIND("N", [H-Course Name-Helper1a]@row) + FIND("O", [H-Course Name-Helper1a]@row) + FIND("P", [H-Course Name-Helper1a]@row) + FIND("Q", [H-Course Name-Helper1a]@row) + FIND("R", [H-Course Name-Helper1a]@row) + FIND("S", [H-Course Name-Helper1a]@row) + FIND("T", [H-Course Name-Helper1a]@row) + FIND("U", [H-Course Name-Helper1a]@row) + FIND("V", [H-Course Name-Helper1a]@row) + FIND("W", [H-Course Name-Helper1a]@row) + FIND("X", [H-Course Name-Helper1a]@row) + FIND("Y", [H-Course Name-Helper1a]@row) + FIND("Z", [H-Course Name-Helper1a]@row), "")

=================================================

Here is the formula to find the SECOND WORD (Helper2a), which returns the second word even if the first word ends with a colon:

=IFERROR(IF([H-Course Name-Helper1a]@row <> "", MID([H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row) + 1, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row) + 1) - FIND(" ", [H-Course Name-UF]@row) - 1), ""), MID([H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row), LEN([H-Course Name-UF]@row) + 1 - FIND(" ", [H-Course Name-UF]@row)))

Helper2b is the same as Helper1b (finding it second word is ALL CAPS).

=================================================

Here is the formula to find the THIRD WORD (Helper3a), which returns the third word even if the second word ends with a colon:

=IFERROR(IF(LEN([H-Course Name-UF]@row) - LEN(SUBSTITUTE([H-Course Name-UF]@row, " ", "")) > 2, IF([H-Course Name-Helper2a]@row <> "", MID([H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row) + 1), FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row) + 1) + 1) - FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row) + 1)), ""), MID([H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row) + 1), LEN([H-Course Name-UF]@row) + 1 - FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row) + 1))), "")

Helper3b is the same as Helper1b (finding if third word is ALL CAPS).

=================================================

Here is the formula to find the FIFTH WORD (Helper5a), which normally returns the word correctly, except when there is a punctuation after the third word, and Helper4a is "Here":

=IFERROR(IF(LEN([H-Course Name-UF]@row) - LEN(SUBSTITUTE([H-Course Name-UF]@row, " ", "")) = 4, MID([H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row) + 1) + 1) + 1), LEN([H-Course Name-UF]@row) + 1 - FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row) + 1) + 1) + 1)), IF(LEN([H-Course Name-UF]@row) - LEN(SUBSTITUTE([H-Course Name-UF]@row, " ", "")) = 5, MID([H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row) + 1) + 1) + 1), FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row) + 1) + 1) + 1) + 1) - FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row) + 1) + 1) + 1)), "")), "")

=================================================

Finally, here is the formula to find the SIXTH WORD (Helper6a), which normally returns the word correctly, except when there is a punctuation after the third word, and Helper4a is "Here":

=IFERROR(IF(LEN([H-Course Name-UF]@row) - LEN(SUBSTITUTE([H-Course Name-UF]@row, " ", "")) = 5, MID([H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row) + 1) + 1) + 1) + 1), LEN([H-Course Name-UF]@row) + 1 - FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row, FIND(" ", [H-Course Name-UF]@row) + 1) + 1) + 1) + 1)), ""), "")

Best Answer

  • Community Champion
    Answer ✓

    Hi

    I was able to replicate your sheet and see the problems, although I do not think the colon is the cause, as one of my courses worked fine, with a colon:

    I could not immediately see what the issue was. However, I tried starting from scratch using a simpler formula and it worked out as expected. So rather than troubleshooting, how about switching the formula to one that uses the SUBSTITUTE function to identify the nth space and return the text between it and the nth+1 space.

    The formula is:

    =IFERROR(MID([H-Course Name-UF]@row, FIND("*", SUBSTITUTE([H-Course Name-UF]@row, " ", "*", 3)), FIND("*", SUBSTITUTE([H-Course Name-UF]@row + " ", " ", "*", 4)) - FIND("*", SUBSTITUTE([H-Course Name-UF]@row, " ", "*", 3))), "")

    For other columns you just need to change those 3 numbers in bold. If you have * in your data replace all the * with another character that is not in the data.

    So 5a would be:

    =IFERROR(MID([H-Course Name-UF]@row, FIND("*", SUBSTITUTE([H-Course Name-UF]@row, " ", "*", 4)), FIND("*", SUBSTITUTE([H-Course Name-UF]@row + " ", " ", "*", 5)) - FIND("*", SUBSTITUTE([H-Course Name-UF]@row, " ", "*", 4))), "")

    The result is:

    Breaking that formula (for the 4th word) down:

    This part/syntax will be familiar: =IFERROR(MID([H-Course Name-UF]@row, …., .… ), "")

    IFERROR returns nothing in the case of an error (which in this situation is if there is no MID text to return because there is not a 4th word)

    I am not doing any comparison of string length with or without the spaces.

    I jump straight to the MID and to the text to parse.

    Instead of multiple FIND functions to identify the start position I use just the one and look for a * rather than a space. Don't panic!

    Rather than looking in the course name column for that *, I wrap that course name column in a SUBSTITUTE function and replace the 3rd space with a * then find the position of that *. In other words this part is the start position:

    FIND("*", SUBSTITUTE([H-Course Name-UF]@row, " ", "*", 3))

    And it says start at the 3rd * in the string if all spaces were replaced by *. It is a bit convoluted but there is no way to find the nth anything without substituting. As we can't find the nth space, we change it into something else and find that.

    Then for the length of text to extract we find the position of the 4th * and subtract the position of the 3rd *. Leaving us with the length of the text in between.

    FIND("*", SUBSTITUTE([H-Course Name-UF]@row + " ", " ", "*", 4)) - FIND("*", SUBSTITUTE([H-Course Name-UF]@row, " ", "*", 3))

    I hope that all makes sense and is a useful alternative. Let me know how you get on.

Answers

  • Community Champion
    Answer ✓

    Hi

    I was able to replicate your sheet and see the problems, although I do not think the colon is the cause, as one of my courses worked fine, with a colon:

    I could not immediately see what the issue was. However, I tried starting from scratch using a simpler formula and it worked out as expected. So rather than troubleshooting, how about switching the formula to one that uses the SUBSTITUTE function to identify the nth space and return the text between it and the nth+1 space.

    The formula is:

    =IFERROR(MID([H-Course Name-UF]@row, FIND("*", SUBSTITUTE([H-Course Name-UF]@row, " ", "*", 3)), FIND("*", SUBSTITUTE([H-Course Name-UF]@row + " ", " ", "*", 4)) - FIND("*", SUBSTITUTE([H-Course Name-UF]@row, " ", "*", 3))), "")

    For other columns you just need to change those 3 numbers in bold. If you have * in your data replace all the * with another character that is not in the data.

    So 5a would be:

    =IFERROR(MID([H-Course Name-UF]@row, FIND("*", SUBSTITUTE([H-Course Name-UF]@row, " ", "*", 4)), FIND("*", SUBSTITUTE([H-Course Name-UF]@row + " ", " ", "*", 5)) - FIND("*", SUBSTITUTE([H-Course Name-UF]@row, " ", "*", 4))), "")

    The result is:

    Breaking that formula (for the 4th word) down:

    This part/syntax will be familiar: =IFERROR(MID([H-Course Name-UF]@row, …., .… ), "")

    IFERROR returns nothing in the case of an error (which in this situation is if there is no MID text to return because there is not a 4th word)

    I am not doing any comparison of string length with or without the spaces.

    I jump straight to the MID and to the text to parse.

    Instead of multiple FIND functions to identify the start position I use just the one and look for a * rather than a space. Don't panic!

    Rather than looking in the course name column for that *, I wrap that course name column in a SUBSTITUTE function and replace the 3rd space with a * then find the position of that *. In other words this part is the start position:

    FIND("*", SUBSTITUTE([H-Course Name-UF]@row, " ", "*", 3))

    And it says start at the 3rd * in the string if all spaces were replaced by *. It is a bit convoluted but there is no way to find the nth anything without substituting. As we can't find the nth space, we change it into something else and find that.

    Then for the length of text to extract we find the position of the 4th * and subtract the position of the 3rd *. Leaving us with the length of the text in between.

    FIND("*", SUBSTITUTE([H-Course Name-UF]@row + " ", " ", "*", 4)) - FIND("*", SUBSTITUTE([H-Course Name-UF]@row, " ", "*", 3))

    I hope that all makes sense and is a useful alternative. Let me know how you get on.

  • Thank you so much. I love finding more streamlined ways to write formulas. This makes adding checks for additional words so much easier.

  • Community Champion

    Glad I could help.

Help Article Resources

Want to practice working with formulas directly in Smartsheet?

Check out the Formula Handbook template!

Trending in Formulas and Functions