A stemmer for Esperanto
This stemmer takes these steps to produce a stem:
- Lowercase the word
- Stop stemming if the word is one of a variety of exceptions (conjunctions, interjections, prepositions, etc.)
- Stop stemming if the word is a number 1-10
- Stop stemming if the word is a larger number written as a single word (e.g. 11 can be written dek unu [ten one] or dekunu; this step catches the one-word form)
- Determine whether the word ends in the plural suffix (-j), the direct object suffix (-n), or both (-jn)
- Find the longest other suffix whose removal leaves a stem at least as long as the larger of minStemLength and the position of the first vowel
- If no such suffix exists, return the word
- Otherwise, return the word minus the found suffix
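
The steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the stemmer's actual code: the word lists (EXCEPTIONS, NUMERALS, SUFFIXES) are tiny hypothetical stand-ins, the names stem, is_compound_number and min_stem_length are made up for the example, and two points the description leaves open are filled with assumptions. The sketch assumes that -j and -n are removed together with the found suffix, and that "the position of the first vowel" means the stem must still contain that vowel.

```python
VOWELS = "aeiou"

# Hypothetical, deliberately tiny lists; the real stemmer's lists are not shown here.
EXCEPTIONS = {"kaj", "sed", "la", "en", "ho"}
NUMERALS = {"unu", "du", "tri", "kvar", "kvin", "ses", "sep", "ok", "naŭ", "dek"}
# Assumed suffix inventory: verb tenses and basic part-of-speech endings only.
SUFFIXES = ("as", "is", "os", "us", "a", "e", "i", "o", "u")


def is_compound_number(word):
    """Return True if the word can be split entirely into numerals (e.g. dekunu)."""
    if not word:
        return True
    return any(
        word.startswith(n) and is_compound_number(word[len(n):])
        for n in NUMERALS
    )


def stem(word, min_stem_length=2):
    # Lowercase the word.
    word = word.lower()

    # Stop early for exceptions, the numbers 1-10, and one-word larger numbers.
    if word in EXCEPTIONS or word in NUMERALS or is_compound_number(word):
        return word

    # Detect the direct object (-n) and plural (-j) suffixes.
    # Assumption: they are stripped before the other suffixes are examined.
    base = word
    if base.endswith("n"):
        base = base[:-1]
    if base.endswith("j"):
        base = base[:-1]

    # Assumption: the stem must keep the first vowel, so its minimum length
    # is the larger of min_stem_length and (index of first vowel + 1).
    first_vowel = next((i for i, ch in enumerate(base) if ch in VOWELS), len(base))
    shortest = max(min_stem_length, first_vowel + 1)

    # Try the longest suffixes first; return as soon as one fits.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if base.endswith(suffix) and len(base) - len(suffix) >= shortest:
            return base[: -len(suffix)]

    # No usable suffix: return the (lowercased) word unchanged.
    return word
```

With these stand-in lists, stem("Hundojn") gives "hund" and stem("parolas") gives "parol", while "kaj", "dekunu" and "io" come back unchanged (the last because removing -o would leave a stem shorter than min_stem_length).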
Copyright (C) 2018 Declan Whitford Jones
Licensed under GPL v3