Analogous Strings and Validation Thereof

One of the many things that you’ll need to do when working with strings is to compare them. In a perfect world – and we all know that we don’t live in one of those – we would just have to do the simplest of comparison tools – check if they’re equal.

Checking if equal

In this comparison, we’ve checked if the values in the field City are equal to “Atlanta.” Now, there is the exact opposite of this comparison – not equal.

Two different ways to do not equal

Both ways – != or <> – work exactly the same as far as SQL Server is concerned. It may work differently if you work in other relational databases, but I digress…

One of the things you may be asked to solve is if a value is similar to another. On the surface, this does not sound technically possible – a computer will tell you if it or if it isn’t something every time. Period. Well, my friends, this one is a little different.

The function SOUNDEX, which you may have heard of or even used in Microsoft Access, is a built-in way of checking if values are similar.

SOUNDEX values

To figure out the SOUNDEX values of a string, you must remove the letters A, E, I, O, U, H, W, and Y unless they are at the start of the string. You then exchange the letters left in your string with the numbers found in this chart:

NumberRepresents the Letters
1B, F, P, V
2C, G, J, K, Q, S, X, Z
3D, T
4L
5M, N
6R
Handy Dandy SOUNDEX Chart

Any adjacent letters that have the same equivalent number are coded as one letter with a single number.

For more rules that may or may not be used in certain circumstances, please review the webpage found here.

Now, you’re probably wondering how do we do anything with this information? Very simple, my friends!

Using SOUNDEX

So we checked the SOUNDEX result of each city in the NirvanaPerformances table against the SOUNDEX result of “Tacoma.” Our process found that Tacoma and Tijuana sound very similar – at least in the English language.

You can also use the DIFFERENCE function to find the same results. It takes the SOUNDEX of each of the items and gives you a response of how close they are!

To check that the words are very similar according to their SOUNDEX results, you would check the following:

DIFFERENCE results

Notice that we’re checking if the difference is 4. The 4 means to check if the strings that are being compared should be either identical or very very similar. A 0, on the other hand, means that there is a very weak similarity between the two strings.

And that, my friends, is a fair amount of work with SOUNDEX. It is one way to find if two strings are about the same. Until next time, my friends, when we will find yet another way of comparing strings.

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.