Morphological Analysis

English Analyzer

The English Analyzer takes an input token together with its Penn Treebank part-of-speech tag and splits the token into morphemes using inflection, derivation, and prefix rules.

Associated models: elit_morph_lexrule_en
API reference: EnglishMorphAnalyzer
Supplementary documentation: English Morph Analyzer
Decode parameters:
- derivation: True (default) or False
- prefix: 0 (no prefix analysis; default), 1 (shortest preferred), 2 (longest preferred)

Web API

{
    "model": "elit_morph_lexrule_en",
    "args": {"derivation": true, "prefix": 0}
}
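
The request body above can also be assembled programmatically. The following is a minimal sketch that only builds and validates the JSON payload from the two decode parameters listed earlier. The helper name build_morph_request is hypothetical and not part of ELIT, and no endpoint URL is assumed.

```python
import json

# Model name as it appears in the Web API example.
MODEL = "elit_morph_lexrule_en"


def build_morph_request(derivation=True, prefix=0):
    """Hypothetical helper: return the JSON request body as a string."""
    if not isinstance(derivation, bool):
        raise TypeError("derivation must be True or False")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(prefix, bool) or prefix not in (0, 1, 2):
        raise ValueError("prefix must be 0, 1, or 2")
    payload = {"model": MODEL, "args": {"derivation": derivation, "prefix": prefix}}
    return json.dumps(payload)


if __name__ == "__main__":
    print(build_morph_request())
```

Validating the arguments before serializing keeps an out-of-range prefix value from reaching the service, where the failure would likely be harder to diagnose.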

Python API

from elit.structure import Document, Sentence, TOK, POS, MORPH
from elit.component import EnglishMorphAnalyzer

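# Tokens and their Penn Treebank part-of-speech tags, aligned by index.
tokens = ['dramatized', 'ownerships', 'environmentalists', 'certifiable', 'realistically']
postags = ['VBD', 'NNS', 'NNS', 'JJ', 'RB']
doc = Document()
doc.add_sentence(Sentence({TOK: tokens, POS: postags}))

# Decode with derivation rules on and prefix analysis off (the defaults).
morph = EnglishMorphAnalyzer()
morph.decode([doc], derivation=True, prefix=0)
# The analysis is read back from the sentence under the MORPH key.
print(doc.sentences[0][MORPH])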

Output

[
  [["drama", "NN"], ["+tic", "J_IC"], ["+ize", "V_IZE"], ["+d", "I_PST"]],
  [["own", "VB"], ["+er", "N_ER"], ["+ship", "N_SHIP"], ["+s", "I_PLR"]],
  [["environ", "VB"], ["+ment", "N_MENT"], ["+al", "J_AL"], ["+ist", "N_IST"], ["+s", "I_PLR"]],
  [["cert", "NN"], ["+ify", "V_FY"], ["+iable", "J_ABLE"]],
  [["real", "NN"], ["+ize", "V_IZE"], ["+stic", "J_IC"], ["+ally", "R_LY"]]
]
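
Each token's analysis is a list of [morpheme, tag] pairs, which makes it easy to post-process. The sketch below separates a base form from its suffixes, using the output above as literal data so it runs without ELIT installed. It rests on two assumptions inferred from this example only, not from the ELIT documentation: the first pair is the base form (as it is here with prefix=0), and tags beginning with "I_" (such as I_PST and I_PLR) mark inflectional suffixes. The function name summarize is hypothetical.

```python
# Output copied from the example above: one list of [morpheme, tag] pairs per token.
analyses = [
    [["drama", "NN"], ["+tic", "J_IC"], ["+ize", "V_IZE"], ["+d", "I_PST"]],
    [["own", "VB"], ["+er", "N_ER"], ["+ship", "N_SHIP"], ["+s", "I_PLR"]],
    [["environ", "VB"], ["+ment", "N_MENT"], ["+al", "J_AL"], ["+ist", "N_IST"], ["+s", "I_PLR"]],
    [["cert", "NN"], ["+ify", "V_FY"], ["+iable", "J_ABLE"]],
    [["real", "NN"], ["+ize", "V_IZE"], ["+stic", "J_IC"], ["+ally", "R_LY"]],
]


def summarize(analysis):
    """Split one analysis into base form, derivational and inflectional suffixes."""
    # Assumption: the first pair is the base form (holds in the example above).
    base, base_tag = analysis[0]
    suffixes = analysis[1:]
    # Assumption: an "I_" tag prefix marks an inflectional suffix.
    inflections = [m for m, tag in suffixes if tag.startswith("I_")]
    derivations = [m for m, tag in suffixes if not tag.startswith("I_")]
    return {
        "base": base,
        "base_tag": base_tag,
        "derivations": derivations,
        "inflections": inflections,
    }


if __name__ == "__main__":
    for analysis in analyses:
        print(summarize(analysis))
```

With prefix analysis enabled (prefix set to 1 or 2), the first pair may no longer be the base form, so the first assumption would need revisiting.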