Wednesday, January 23, 2013

[android-developers] Re: mapping of strings

This depends on the number of words and on how you define "similarity".

Number of words:
If your word list won't grow over time, and you are sure it is small enough to fit into your app's process heap, you should load the words from the word list file into an in-memory data structure for faster lookup.
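As a rough sketch of the in-memory approach, the following reads a one-word-per-line source into a HashSet. The class name WordList and the one-word-per-line layout are assumptions for illustration; on Android the Reader would typically wrap an asset or raw resource stream.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of any Android API.
final class WordList {
    // Reads one word per line, trims whitespace, lower-cases, and skips blank lines.
    static Set<String> load(Reader source) {
        Set<String> words = new HashSet<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = line.trim().toLowerCase(Locale.ROOT);
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        } catch (IOException e) {
            // Rethrow unchecked so callers do not have to declare IOException.
            throw new UncheckedIOException(e);
        }
        return words;
    }
}
```

A set gives constant-time exact lookups on average. For similarity searches you still have to iterate over the words, so a plain list would work for that part too.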

If your word list can grow over time (through user interaction or updates), or if it is large to begin with, you should consider storing the words in an SQLiteDatabase instead of a flat word list file.

Word similarity:
There are several ways of defining similarity. The simplest is letter-based. You can, for example, compute the "word distance" between two words with the Levenshtein algorithm. The Levenshtein distance is a metric: given a word A and a word B, it is the minimum number of steps necessary to change A into B, where a single step is adding, removing or replacing one letter. The lower the distance, the closer A and B are to each other in terms of spelling.
However, this is only useful for covering spelling mistakes like missing, extra or wrong letters. Note that two flipped (swapped) adjacent letters count as two steps in plain Levenshtein; the Damerau-Levenshtein variant counts such a swap as one step.
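A minimal sketch of the Levenshtein distance, using the usual dynamic-programming table reduced to two rows. The class name Levenshtein and the helper closest() are made up for this example; closest() simply scans all candidates, which should be fine for small lists but may be too slow for very large ones.

```java
// Hypothetical helper class for illustration.
final class Levenshtein {
    // Returns the minimum number of single-letter insertions, deletions
    // and replacements needed to turn a into b.
    static int distance(String a, String b) {
        int[] previous = new int[b.length() + 1];
        int[] current = new int[b.length() + 1];
        // Turning the empty prefix of a into the first j letters of b takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            previous[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            // Turning the first i letters of a into the empty string takes i deletions.
            current[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insert = current[j - 1] + 1;
                int delete = previous[j] + 1;
                int replace = previous[j - 1] + cost;
                current[j] = Math.min(Math.min(insert, delete), replace);
            }
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[b.length()];
    }

    // Returns the candidate with the smallest distance to input,
    // or null if there are no candidates. Ties go to the first one seen.
    static String closest(String input, Iterable<String> candidates) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String candidate : candidates) {
            int d = distance(input, candidate);
            if (d < bestDistance) {
                bestDistance = d;
                best = candidate;
            }
        }
        return best;
    }
}
```

You would call closest() with the auto-generated characters and the loaded word list. In practice you probably also want an upper limit on the accepted distance, so that unrelated words are not returned as matches.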

A more advanced approach is checking for phonetic similarity (similar pronunciation) with a phonetic algorithm. Unfortunately this depends heavily on the language you want to cover, because the rules for turning a written word into its phonetic counterpart differ from language to language.
If you can find a suitable algorithm that transforms written words of your language into a phonetic representation, you can look up similar-sounding words by simply matching the phonetic representations. That can give really good results. For English there is, for example, the "Soundex" algorithm, which was designed with English surnames in mind.
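To illustrate matching by phonetic representation, here is a sketch of the common American Soundex variant: keep the first letter, encode the following consonants as digits, collapse repeated codes, and pad to four characters. The class name Soundex is chosen for this example only, and other Soundex variants differ in details such as the handling of H and W.

```java
// Hypothetical helper class; implements the American Soundex variant.
final class Soundex {
    // Digit for each letter A..Z; '0' marks letters that are not encoded
    // (vowels plus H, W and Y).
    private static final String CODES = "01230120022455012623010202";

    // Returns the four-character code, or "" if the word contains no ASCII letter.
    static String encode(String word) {
        StringBuilder out = new StringBuilder();
        char lastCode = '0';
        for (int i = 0; i < word.length() && out.length() < 4; i++) {
            char c = Character.toUpperCase(word.charAt(i));
            if (c < 'A' || c > 'Z') {
                continue; // ignore anything that is not an ASCII letter
            }
            char code = CODES.charAt(c - 'A');
            if (out.length() == 0) {
                out.append(c);      // the first letter is kept as a letter
                lastCode = code;
            } else if (c == 'H' || c == 'W') {
                continue;           // H and W do not separate equal codes
            } else if (code == '0') {
                lastCode = '0';     // a vowel separates equal codes
            } else {
                if (code != lastCode) {
                    out.append(code);
                }
                lastCode = code;
            }
        }
        if (out.length() == 0) {
            return "";
        }
        while (out.length() < 4) {
            out.append('0');        // pad short codes with zeros
        }
        return out.toString();
    }
}
```

With this, "Robert" and "Rupert" both encode to R163, so they would be treated as similar sounding. If you store the code next to each word (in a map or a database column), the lookup becomes a simple equality match.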

Another way to define similarity is to create word groups that describe relationships between words. You could, for example, define a group called "car" and add all car manufacturers to that group, as well as slang terms for cars. Two words then count as similar if they share a group.
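One possible way to represent such word groups is a pair of maps, one from group name to its members and one from each word back to its groups. The class WordGroups and its method names are invented for this sketch, and the groups themselves would have to be maintained by hand or taken from some existing data source.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class for illustration.
final class WordGroups {
    private final Map<String, Set<String>> membersByGroup = new HashMap<>();
    private final Map<String, Set<String>> groupsByWord = new HashMap<>();

    // Adds the given words (case-insensitively) to the named group.
    void add(String group, String... words) {
        for (String word : words) {
            String key = word.toLowerCase(Locale.ROOT);
            membersByGroup.computeIfAbsent(group, g -> new HashSet<>()).add(key);
            groupsByWord.computeIfAbsent(key, w -> new HashSet<>()).add(group);
        }
    }

    // Returns all members of a group, or an empty set for an unknown group.
    Set<String> members(String group) {
        return membersByGroup.getOrDefault(group, Collections.<String>emptySet());
    }

    // Two words are related if they share at least one group.
    boolean related(String a, String b) {
        Set<String> groupsA = groupsByWord.getOrDefault(
                a.toLowerCase(Locale.ROOT), Collections.<String>emptySet());
        Set<String> groupsB = groupsByWord.getOrDefault(
                b.toLowerCase(Locale.ROOT), Collections.<String>emptySet());
        return !Collections.disjoint(groupsA, groupsB);
    }
}
```

For example, after add("car", "Toyota", "Ford", "ride"), the call related("toyota", "FORD") returns true, while a word that is in no group is related to nothing.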



On Wednesday, January 23, 2013 1:53:27 AM UTC-6, sam jeck wrote:
I am dealing with a mobile application where certain auto-generated characters need to be matched with the most appropriate word in a separate file (a word list file, which contains words). How do I map the auto-generated characters to the most likely word in the word list file?
