English Models

  • PRE: tools that must be run before this model (see the sketch after this list).
  • DATA: the dataset used to train the model.
  • EVAL: the evaluation score on the dataset that the model is trained on.
  • BM: the score on the standard benchmark evaluation for the task.
  • tokseg: any tokenizer that performs sentence segmentation.
  • inmixed: any dataset included in the mixed corpus.
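
The following is a minimal sketch, not part of the ELIT API, of how a PRE pattern such as ✳-pos-✳-en-inmixed could be resolved against the model IDs listed below; the registry of mixed-corpus datasets and all helper names are assumptions made for illustration.

```python
# Hypothetical sketch: check whether a model ID satisfies a PRE pattern.
# Neither the helpers nor MIXED_CORPUS come from ELIT; they only
# illustrate how the wildcard (✳) and "inmixed" fields are read.

MIXED_CORPUS = {"mixed", "ontonotes"}  # assumed contents of the mixed corpus


def fields(name: str) -> list:
    """Split an ID or pattern into fields, accepting '-' or '_' as separators."""
    return name.replace("-", "_").split("_")


def satisfies(pre_pattern: str, model_id: str) -> bool:
    """True if model_id can serve as the prerequisite described by pre_pattern."""
    pat, mid = fields(pre_pattern), fields(model_id)
    if len(pat) != len(mid):
        return False
    for p, m in zip(pat, mid):
        if p == "✳":          # wildcard: any value in this field
            continue
        if p == "inmixed":    # any dataset that belongs to the mixed corpus
            if m not in MIXED_CORPUS:
                return False
        elif p != m:
            return False
    return True


# A POS model trained on the mixed corpus satisfies ✳-pos-✳-en-inmixed:
print(satisfies("✳-pos-✳-en-inmixed", "elit_pos_flair_en_mixed"))  # True
```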

Morphological Analysis

Model ID               PRE
elit_morph_idprule_en  ✳-pos-✳-en-inmixed

Part-of-Speech Tagging

Model ID                 PRE     DATA   EVAL   BM
elit_pos_cnn_en_mixed    tokseg  Mixed  97.xx
elit_pos_rnn_en_mixed    tokseg  Mixed  97.xx
elit_pos_flair_en_mixed  tokseg  Mixed  97.80  97.72
  • EVAL: accuracy.
  • BM: accuracy on the Wall Street Journal portion of the Penn Treebank using the standard split (trn: sections 0-18; dev: 19-21; tst: 22-24); the split is sketched below.
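
A minimal sketch of that section-to-split mapping; the function name is hypothetical and not part of ELIT.

```python
# Hypothetical helper mapping a WSJ section number to the standard POS split.
def wsj_pos_split(section: int) -> str:
    if 0 <= section <= 18:
        return "trn"
    if 19 <= section <= 21:
        return "dev"
    if 22 <= section <= 24:
        return "tst"
    raise ValueError(f"not a WSJ section: {section}")


assert wsj_pos_split(3) == "trn" and wsj_pos_split(20) == "dev" and wsj_pos_split(23) == "tst"
```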

Named Entity Recognition

Model ID                     PRE     DATA       EVAL   BM
elit_ner_cnn_en_ontonotes    tokseg  OntoNotes  87.xx
elit_ner_rnn_en_ontonotes    tokseg  OntoNotes  86.xx
elit_ner_flair_en_ontonotes  tokseg  OntoNotes  88.75  92.74

Dependency Parsing

Model ID                    PRE                 DATA   EVAL         BM
elit_dep_biaffine_en_mixed  ✳-pos-✳-en-inmixed  Mixed  92.26/91.03  96.08/95.02
  • EVAL: UAS (unlabeled attachment score) / LAS (labeled attachment score); a sketch of both metrics follows this list.
  • BM: UAS/LAS on the Wall Street Journal portion of the Penn Treebank using the standard split (trn: 2-21; dev: 22, 24; tst: 23) and the Stanford typed dependencies.
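
A minimal sketch of how UAS and LAS are computed from aligned gold and predicted (head, label) pairs; the example data and the function name are illustrative only.

```python
# Hypothetical helper computing UAS/LAS over one sentence.
# gold, pred: per-token lists of (head_index, dependency_label).
def uas_las(gold, pred):
    total = len(gold)
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred)) / total  # head correct
    las = sum(g == p for g, p in zip(gold, pred)) / total        # head and label correct
    return uas, las


gold = [(2, "nsubj"), (0, "root"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "iobj")]  # one label error
print(uas_las(gold, pred))  # (1.0, 0.666...)
```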

Coreference Resolution

Model ID                   PRE     DATA     EVAL
uw_coref_e2e_en_ontonotes  tokseg  CoNLL12  80.4/70.8/67.6