The goal of morphemizer is to tokenize text into morphemes (the shortest
word pieces that still bear meaning). For example, “unaffable” should
tokenize to "un##", "affable", since “aff” does not have a meaning in
modern English, while “inescapable” should tokenize to "in##", "escape", "##able". "##" characters are used to indicate breaks in
words. Prefixes are followed by “##” (as in "in##"), suffixes are
preceded by “##” (as in "##able"), and breaks between roots are
indicated by the special "##" token.
You can install the released version of morphemizer from CRAN with:
# No you can't.
# install.packages("morphemizer")And the development version from GitHub with:
# install.packages("devtools")
devtools::install_github("jonthegeek/morphemizer")This is not an officially supported Macmillan Learning product.
Questions or comments should be directed to Jonathan Bratt (jonathan.bratt@macmillan.com) and Jon Harmon (jonthegeek@gmail.com).