A minimal, pure Python library to interface with CoNLL-U format files.
Updated Jun 21, 2023 - Python
End-to-end integration of HuggingFace's models for sequence labeling.
Simple script to parse text with spaCy and print the output in CoNLL-U format.
A package for manipulating Universal Dependencies trees
A number of command-line tools for working with FoLiA (Format for Linguistic Annotation). Includes validators, converters, visualisers, and more.
ACoLi CoNLL libraries: Several tools for processing, manipulating and transforming TSV formats (CoNLL-RDF, CoNLL-Merge, CQP4RDF)
Toolkit that simplifies corpus processing
Repository for the paper "Exploring Non-Verbal Predicates in Semantic Role Labeling: Challenges and Opportunities"
A pipeline for machine translation (using OPUS-MT models) of parliamentary text collections in 30+ languages (ParlaMint corpora). The pipeline includes parsing TEI XML and CoNLL-U files, linguistic processing with the Stanza pipeline, machine translation, and word alignment with the Eflomal tool.
A Python3 package for extracting syntactic complexity measures from CoNLL-U annotations.
Small bilar packages
Tool for translating a corpus file from one language to another.
NER tagging with HMM and Viterbi algorithm - University Project
Analysing different text representations for genre identification. I parse CoNLL-U files and extract various representations of a text (running text, lemmas, part-of-speech tags), then train a fastText model on each to see which representation is most beneficial for the genre identification task.
Count bigram frequencies in a CoNLL-U format corpus