Morphological Analysis
English Analyzer
The English Analyzer takes an input token together with its Penn Treebank-style part-of-speech tag, and splits the token into morphemes using inflection, derivation, and prefix rules.
Associated models: elit-morph-idprule-en
API reference: EnglishMorphAnalyzer
Supplementary documentation: English Morph Analyzer
Decode parameters:
- derivation: True (_default_) or False
- prefix: 0 (no prefix analysis; _default_), 1 (shortest preferred), 2 (longest preferred)
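The three values of the prefix parameter can be illustrated with a small standalone sketch. It does not use the analyzer's actual rules or lexicon: the function `pick_prefix` and the prefix inventory are hypothetical, and the sketch assumes that "shortest preferred" and "longest preferred" decide which candidate wins when more than one prefix matches the start of a token.

```python
def pick_prefix(token, prefixes, mode):
    """Return the prefix chosen for `token` under the given mode, or None.

    mode 0: no prefix analysis
    mode 1: prefer the shortest matching prefix
    mode 2: prefer the longest matching prefix
    """
    if mode not in (0, 1, 2):
        raise ValueError("prefix mode must be 0, 1, or 2")
    if mode == 0:
        return None
    # Keep only prefixes that match and leave a non-empty remainder.
    matches = [p for p in prefixes if token.startswith(p) and len(token) > len(p)]
    if not matches:
        return None
    return min(matches, key=len) if mode == 1 else max(matches, key=len)


# Hypothetical prefix inventory, for illustration only.
PREFIXES = ["in", "inter", "un", "under"]

for mode in (0, 1, 2):
    print(mode, pick_prefix("international", PREFIXES, mode))
```

Under these assumptions, the same token yields no prefix with mode 0, "in" with mode 1, and "inter" with mode 2, which is why the choice of mode can change the resulting morpheme sequence.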
Web API
{
    "model": "elit_morph_lexrule_en",
    "args": {"derivation": true, "prefix": 0}
}
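The configuration above can also be generated programmatically. The following is a minimal sketch that only builds and serializes the fragment shown; the helper name `build_morph_request` is hypothetical, and the endpoint and the rest of the request body are not covered here.

```python
import json


def build_morph_request(derivation=True, prefix=0):
    """Build the model configuration fragment for the English morph analyzer."""
    if prefix not in (0, 1, 2):
        raise ValueError("prefix must be 0, 1, or 2")
    return {
        "model": "elit_morph_lexrule_en",
        "args": {"derivation": bool(derivation), "prefix": prefix},
    }


# json.dumps turns Python's True/False into JSON's true/false.
payload = json.dumps(build_morph_request(), indent=2)
print(payload)
```

Validating `prefix` before serializing catches an out-of-range value locally instead of relying on the server to reject it.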
Python API
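from elit.structure import Document, Sentence, TOK, POS, MORPH
from elit.component import EnglishMorphAnalyzer

# Input tokens and their Penn Treebank part-of-speech tags, aligned by position
tokens = ['dramatized', 'ownerships', 'environmentalists', 'certifiable', 'realistically']
postags = ['VBD', 'NNS', 'NNS', 'JJ', 'RB']

# Wrap the tokens and tags in a one-sentence document
doc = Document()
doc.add_sentence(Sentence({TOK: tokens, POS: postags}))

# Run the analyzer with derivation rules on and prefix analysis off (the defaults)
morph = EnglishMorphAnalyzer()
morph.decode([doc], derivation=True, prefix=0)

# The analyses are stored under the MORPH key of the sentence
print(doc.sentences[0][MORPH])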
Output
[
    [["drama", "NN"], ["+tic", "J_IC"], ["+ize", "V_IZE"], ["+d", "I_PST"]],
    [["own", "VB"], ["+er", "N_ER"], ["+ship", "N_SHIP"], ["+s", "I_PLR"]],
    [["environ", "VB"], ["+ment", "N_MENT"], ["+al", "J_AL"], ["+ist", "N_IST"], ["+s", "I_PLR"]],
    [["cert", "NN"], ["+ify", "V_FY"], ["+iable", "J_ABLE"]],
    [["real", "NN"], ["+ize", "V_IZE"], ["+stic", "J_IC"], ["+ally", "R_LY"]]
]
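Each analysis in the output is a list of (form, tag) pairs, with the base morpheme first and the affixes after it. A minimal sketch of how such a result might be consumed is given below. It parses the output shown above as JSON instead of calling the analyzer, and it assumes that tags starting with `I_` mark inflectional suffixes, which is inferred from the sample (`I_PST`, `I_PLR`) and not confirmed here. The helper `summarize` is hypothetical.

```python
import json

# The analyzer output shown above, copied verbatim as JSON text.
raw = """
[
    [["drama", "NN"], ["+tic", "J_IC"], ["+ize", "V_IZE"], ["+d", "I_PST"]],
    [["own", "VB"], ["+er", "N_ER"], ["+ship", "N_SHIP"], ["+s", "I_PLR"]],
    [["environ", "VB"], ["+ment", "N_MENT"], ["+al", "J_AL"], ["+ist", "N_IST"], ["+s", "I_PLR"]],
    [["cert", "NN"], ["+ify", "V_FY"], ["+iable", "J_ABLE"]],
    [["real", "NN"], ["+ize", "V_IZE"], ["+stic", "J_IC"], ["+ally", "R_LY"]]
]
"""


def summarize(morphemes):
    """Split one analysis into base form, base tag, affix forms, and inflection tags."""
    base, base_tag = morphemes[0]
    affixes = [form for form, _ in morphemes[1:]]
    # Assumption: an "I_" tag prefix marks an inflectional suffix.
    inflections = [tag for _, tag in morphemes[1:] if tag.startswith("I_")]
    return base, base_tag, affixes, inflections


summaries = [summarize(m) for m in json.loads(raw)]
for base, base_tag, affixes, inflections in summaries:
    print(base, base_tag, affixes, inflections)
```

Separating the base from its affixes in this way makes it straightforward to group tokens by base morpheme or to filter out inflection when only derivational structure matters.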