fuzzy numbers

Sep 10, 2014 at 6:39 AM
Edited Sep 10, 2014 at 6:40 AM
Which algorithm is best suited to comparing similar numbers, e.g.
123456789
123456189
or
123456789
123465789
or
123456789
128456789
or
123456789
12345678
etc.
Feb 24, 2015 at 1:47 PM
Without knowing what you are trying to achieve, I can only offer some general advice:

http://en.wikipedia.org/wiki/String_metric

http://en.wikipedia.org/wiki/Edit_distance

http://en.wikipedia.org/wiki/Levenshtein_distance (see section 3 in particular)

Taking your second example, {123456789, 123465789}: the strings differ by one transposition of adjacent digits, so the Damerau–Levenshtein distance between them is 1, whereas the plain Levenshtein distance is 2 (two substitutions).

Damerau–Levenshtein is the better choice for finding numbers in which adjacent digits have been swapped. Levenshtein is good at finding numbers that differ by a small number of substituted, inserted, or deleted digits. Longest common substring helps find the numbers that share the largest matching section (which may be useful if these are phone numbers, for example).
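
To make the comparison concrete, here is a rough Python sketch of the three measures, assuming the numbers are compared as digit strings. The function names are my own, not from any library, and `osa_distance` implements the restricted "optimal string alignment" variant of Damerau–Levenshtein rather than the unrestricted algorithm, so treat it as an illustration, not a reference implementation.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def osa_distance(a: str, b: str) -> int:
    """Levenshtein plus swaps of adjacent characters (restricted variant)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,
                          d[i][j - 1] + 1,
                          d[i - 1][j - 1] + cost)
            # Adjacent transposition, e.g. "56" <-> "65", costs one edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def longest_common_substring(a: str, b: str) -> str:
    """Longest run of consecutive characters found in both strings."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


if __name__ == "__main__":
    reference = "123456789"
    for other in ("123456189", "123465789", "128456789", "12345678"):
        print(other,
              levenshtein(reference, other),
              osa_distance(reference, other),
              longest_common_substring(reference, other))
```

For the swapped-digit pair {123456789, 123465789} this gives a Levenshtein distance of 2 and a transposition-aware distance of 1; for the other three pairs in the question both distances are 1. Which measure suits you best depends on which kind of error you expect to see most often.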