How does IndexDen support languages?

Language support is very important nowadays. Because we operate globally, we have to make sure that any language is supported by IndexDen.
What kind of language support do we provide?
1) Search in Unicode
2) Morphology
3) Phonetic or Soundex support
4) “Did you mean … ?” feature

This article explains what we have done and what we will do to support different languages.

Unicode

The two most common kinds of text encoding in use today are single-byte encodings and UTF-8.
IndexDen search supports UTF-8, which means it covers all kinds of languages, including variations of Chinese, Japanese, Korean and Vietnamese.
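
To illustrate why UTF-8 covers so many languages, here is a minimal Python sketch. It is not IndexDen code; the helper name utf8_length and the sample words are assumptions chosen for this example. It shows that the same encoding represents words from different scripts, using a varying number of bytes per character.

```python
def utf8_length(text: str) -> int:
    """Return the number of bytes `text` occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


# Hypothetical sample words, one per language (all mean roughly "search").
samples = {
    "English": "search",
    "Russian": "поиск",
    "Chinese": "搜索",
    "Korean": "검색",
}

for language, word in samples.items():
    encoded = word.encode("utf-8")
    # Decoding the bytes gives back exactly the original text.
    assert encoded.decode("utf-8") == word
    print(f"{language}: {word} = {len(word)} characters, {utf8_length(word)} bytes")
```

ASCII letters take one byte each, Cyrillic letters two, and the Chinese and Korean characters shown here three, but all of them fit into the same encoding.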

Morphology

Morphology preprocessors can be applied to the words being indexed to replace different forms of the same word with its base, normalized form. For instance, an English stemmer will normalize both “dogs” and “dog” to “dog”, so both searches return the same results.
For the time being, IndexDen supports only English and Russian stemmers.
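
The idea can be sketched in a few lines of Python. This is a deliberately naive toy stemmer, not the stemmer IndexDen uses; the function names toy_stem, build_index and search, the suffix list and the sample documents are all assumptions made for this example. A real stemmer handles many more rules and exceptions.

```python
# Toy suffix list; a real English stemmer has far more rules and exceptions.
SUFFIXES = ("ing", "ed", "s")


def toy_stem(word: str) -> str:
    """Lowercase the word and strip one known suffix, keeping at least 3 letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(documents: dict) -> dict:
    """Map each stemmed token to the set of document ids that contain it."""
    index = {}
    for doc_id, text in documents.items():
        for token in text.split():
            index.setdefault(toy_stem(token), set()).add(doc_id)
    return index


def search(index: dict, query: str) -> set:
    """Stem the query the same way the documents were stemmed, then look it up."""
    return index.get(toy_stem(query), set())


# Hypothetical sample documents.
documents = {1: "two dogs barking", 2: "my dog sleeps"}
index = build_index(documents)
print(search(index, "dog"))
print(search(index, "dogs"))
```

Because the same normalization is applied at indexing time and at query time, searching for “dog” and “dogs” returns the same documents.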

On custom request, we can also provide support for the following languages:

  • French
  • Spanish
  • Portuguese
  • Italian
  • German
  • Dutch
  • Swedish
  • Norwegian
  • Danish
  • Finnish
  • Arabic
  • Czech

Phonetic or Soundex support

Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
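
As an illustration, here is a compact Python sketch of the classic American Soundex rules (keep the first letter, encode the remaining consonants as digits, collapse repeated codes, pad to four characters). It is written for this article and is not IndexDen's implementation; the function name soundex is an assumption.

```python
# Digit codes for consonants; vowels, Y, H and W have no code.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(name: str) -> str:
    """Return the 4-character Soundex code of `name` ("" if it has no A-Z letters)."""
    letters = [ch for ch in name.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W are skipped and do not separate equal codes
        code = CODES.get(ch, "")
        if code and code != previous:
            result += code
        previous = code  # a vowel resets this, so a repeated code is kept after it
    return (result + "000")[:4]


print(soundex("Robert"), soundex("Rupert"))
```

“Robert” and “Rupert” both encode to R163, so a Soundex-based search would match one when the user types the other.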

We have received a lot of requests to support Soundex analysis of words. This is very useful when you need to search by first name, last name, street and other genealogical information.
As most of our customers do not need it, it will be supported only on a custom request basis.

Did you mean … ?

Many of our customers have requested “did you mean … ?” functionality from IndexDen.
You probably already know what it is from Google search. In short, it is based on comparing the words in the current query with words from a dictionary and measuring the difference between them.
We already have a plan for how to implement it, but it is not as easy as it sounds.
This feature will also be enabled by custom request for our customers.
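
One common way to measure the difference between a query word and a dictionary word is the Levenshtein edit distance. The Python sketch below shows that approach under stated assumptions: it is not the planned IndexDen implementation, and the names edit_distance and did_you_mean, the max_distance threshold and the sample dictionary are invented for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def did_you_mean(word, dictionary, max_distance=2):
    """Return the closest dictionary word, or None if no suggestion is needed or found."""
    if word in dictionary:
        return None  # the word is known, nothing to suggest
    best = min(dictionary, key=lambda candidate: edit_distance(word, candidate),
               default=None)
    if best is not None and edit_distance(word, best) <= max_distance:
        return best
    return None


# Hypothetical dictionary of known words.
dictionary = ["search", "index", "language"]
print(did_you_mean("serch", dictionary))
```

Scanning the whole dictionary for every query word is fine for a sketch but slow for a large vocabulary, which is one reason the feature is harder than it sounds.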

Conclusion:
Search in Unicode – works for any language, including Chinese, Japanese, Korean and Vietnamese.
Morphology – works for English and Russian by default. It can be enabled for other languages by request; see the list above.
Phonetic or Soundex support – will be enabled by request.
“Did you mean … ?” feature – will be available on all tariff plans except the Free plan.

If you want to incorporate any of these features into your application, you have to sign up for a paid plan.
See plans and pricing to find the one that fits you best.

Feel free to ask any questions or send us a support request.
