A Knowledge-Enhanced Text Representation Toolkit for Natural Language Understanding

CogKTR

Make Texts More Knowledgeable

CogKTR is a knowledge-enhanced text representation toolkit for natural language understanding.

Following our proposed Unified Knowledge-Enhanced Paradigm (UniKEP), CogKTR is organized into four key stages: knowledge acquisition, knowledge representation, knowledge injection, and knowledge application. CogKTR currently supports easy-to-use knowledge acquisition interfaces, multi-source knowledge embeddings, diverse knowledge-enhanced models, and various knowledge-intensive NLU tasks.

Get Started
Contribution

Main Features:

Unified

CogKTR is designed and built on our Unified Knowledge-Enhanced Paradigm, which consists of four stages: knowledge acquisition, knowledge representation, knowledge injection, and knowledge application.
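
The four stages can be pictured as a chain of functions. The following is only a minimal sketch under stated assumptions: the toy knowledge base, the two-dimensional embeddings, and every function name (`acquire`, `represent`, `inject`, `apply_task`, `run_pipeline`) are hypothetical illustrations and are not part of the CogKTR API.

```python
from typing import Dict, List

# Hypothetical toy knowledge source standing in for Wikidata, WordNet, etc.
TOY_KB: Dict[str, str] = {
    "bert": "a transformer-based language model",
    "paris": "the capital of France",
}

# Hypothetical 2-d embeddings, one per knowledge entry.
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "bert": [1.0, 0.0],
    "paris": [0.0, 1.0],
}


def acquire(text: str) -> List[str]:
    """Stage 1, knowledge acquisition: keep tokens that have a KB entry."""
    tokens = [tok.strip(".,!?").lower() for tok in text.split()]
    return [tok for tok in tokens if tok in TOY_KB]


def represent(keys: List[str]) -> List[List[float]]:
    """Stage 2, knowledge representation: map each entry to a vector."""
    return [TOY_EMBEDDINGS[key] for key in keys]


def inject(text_vector: List[float], knowledge: List[List[float]]) -> List[float]:
    """Stage 3, knowledge injection: add knowledge vectors to the text vector."""
    fused = list(text_vector)
    for vec in knowledge:
        fused = [a + b for a, b in zip(fused, vec)]
    return fused


def apply_task(fused: List[float]) -> str:
    """Stage 4, knowledge application: a toy two-class decision rule."""
    return "geography" if fused[1] > fused[0] else "technology"


def run_pipeline(text: str) -> str:
    """Run the four stages in order on one input text."""
    keys = acquire(text)
    knowledge = represent(keys)
    fused = inject([0.0, 0.0], knowledge)
    return apply_task(fused)
```

Calling `run_pipeline("I visited Paris.")` walks the text through all four stages in order. In a real system, each stage would be backed by the corresponding CogKTR module rather than by dictionaries.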

Knowledgeable

CogKTR integrates multiple knowledge sources, including Wikidata, Wikipedia, WordNet, and ConceptNet, and implements many knowledge-enhanced methods based on these sources.

Modular

CogKTR modularizes our proposed paradigm into four modules: Enhancer, Model, Core, and Data. Each module is highly extensible, so researchers can implement new components easily.
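
One common way to obtain this kind of extensibility is a shared base class plus a registry. The sketch below illustrates that general pattern only. `BaseEnhancerComponent`, `REGISTRY`, `register`, `build`, and `CapitalizedWordTagger` are hypothetical names invented for this example and do not describe how CogKTR itself is implemented.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Type


class BaseEnhancerComponent(ABC):
    """Hypothetical common interface that every pluggable component follows."""

    @abstractmethod
    def run(self, text: str) -> List[str]:
        """Process a text and return the extracted items."""


# Hypothetical registry mapping a short name to a component class.
REGISTRY: Dict[str, Type[BaseEnhancerComponent]] = {}


def register(name: str) -> Callable[[Type[BaseEnhancerComponent]], Type[BaseEnhancerComponent]]:
    """Class decorator that records a component class under `name`."""
    def decorator(cls: Type[BaseEnhancerComponent]) -> Type[BaseEnhancerComponent]:
        REGISTRY[name] = cls
        return cls
    return decorator


@register("capitalized")
class CapitalizedWordTagger(BaseEnhancerComponent):
    """Toy component: treats capitalized words as candidate mentions."""

    def run(self, text: str) -> List[str]:
        return [w.strip(".,") for w in text.split() if w[:1].isupper()]


def build(name: str) -> BaseEnhancerComponent:
    """Instantiate a registered component by name."""
    return REGISTRY[name]()
```

With this pattern, adding a new component means writing one subclass and decorating it. Code that calls `build("capitalized")` does not need to change.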

Enhancer module details of CogKTR

BaseModel is the base class of all models in CogKGE. It organizes code into three basic sections: (1) a forward function for training, (2) an embedding function for getting the embeddings of entities and relations, and (3) a scoring function for computing the scores of triples. The Model module consists of four parts: translation distance models, semantic matching models, graph neural network-based models, and transformer-based models. The components of the CogKTR Enhancer module are summarized in the following table:

| Components | Class | Functions | Tools |
| --- | --- | --- | --- |
| Tagger | NerTagger | identify entity mention spans | CogIE |
| | ConceptNetTagger | identify concept mention spans | spaCy |
| | WordNetTagger | identify candidate text spans | NLTK |
| | SrlTagger | tag sentences with semantic role labels | Stanza |
| | SyntaxTagger | parse sentences and get dependency trees | AllenNLP |
| Linker | WikipediaLinker | link entities to Wikipedia | CogIE |
| | ConceptNetLinker | link concepts to ConceptNet | spaCy |
| | WordNetLinker | link candidate texts to WordNet | CogIE |
| Searcher | WikipediaSearcher | query entity titles and text descriptions in Wikipedia | KILT |
| | WikidataSearcher | look up triples and subgraphs in Wikidata | qwikidata |
| | ConceptNetSearcher | search subgraphs and relation paths in ConceptNet | spaCy |
| | WordNetSearcher | retrieve synonyms, example sentences, definitions, and hypernyms | NLTK |
| Embedder | WikidataEmbedder | convert Wikidata into continuous knowledge | CogKGE |
| | ConceptNetEmbedder | convert ConceptNet into continuous knowledge | MHGRN |
| | WordNetEmbedder | convert WordNet into continuous knowledge | CogKGE |
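
The four component types in the table can be read as a chain: a Tagger finds mention spans, a Linker maps each span to a knowledge-base entry, a Searcher retrieves information about that entry, and an Embedder returns a vector for it. The following is a minimal sketch of that flow, assuming a tiny in-memory knowledge base. The classes `ToyTagger`, `ToyLinker`, `ToySearcher`, and `ToyEmbedder`, the `enhance` helper, and all data are hypothetical and do not reproduce the actual CogKTR classes or their signatures.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical in-memory stand-ins for a real knowledge source.
TOY_ALIASES: Dict[str, str] = {"big apple": "New_York_City", "nyc": "New_York_City"}
TOY_DESCRIPTIONS: Dict[str, str] = {
    "New_York_City": "Most populous city in the United States.",
}
TOY_VECTORS: Dict[str, List[float]] = {"New_York_City": [0.1, 0.9]}


@dataclass
class Mention:
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive
    text: str


class ToyTagger:
    """Identify mention spans by matching known aliases (case-insensitive)."""

    def tag(self, sentence: str) -> List[Mention]:
        lowered = sentence.lower()
        mentions = []
        for alias in TOY_ALIASES:
            pos = lowered.find(alias)
            if pos != -1:
                end = pos + len(alias)
                mentions.append(Mention(pos, end, sentence[pos:end]))
        return sorted(mentions, key=lambda m: m.start)


class ToyLinker:
    """Link a mention to a knowledge-base identifier, if one is known."""

    def link(self, mention: Mention) -> Optional[str]:
        return TOY_ALIASES.get(mention.text.lower())


class ToySearcher:
    """Look up the text description stored for an identifier."""

    def search(self, entity_id: str) -> str:
        return TOY_DESCRIPTIONS.get(entity_id, "")


class ToyEmbedder:
    """Return the stored vector for an identifier."""

    def embed(self, entity_id: str) -> List[float]:
        return TOY_VECTORS[entity_id]


def enhance(sentence: str) -> List[dict]:
    """Chain Tagger -> Linker -> Searcher -> Embedder over one sentence."""
    tagger, linker = ToyTagger(), ToyLinker()
    searcher, embedder = ToySearcher(), ToyEmbedder()
    results = []
    for mention in tagger.tag(sentence):
        entity_id = linker.link(mention)
        if entity_id is None:
            continue  # skip mentions that cannot be linked
        results.append({
            "mention": mention.text,
            "entity": entity_id,
            "description": searcher.search(entity_id),
            "embedding": embedder.embed(entity_id),
        })
    return results
```

For example, `enhance("I love the Big Apple.")` yields one record that links the span "Big Apple" to the toy identifier `New_York_City`, together with its description and vector.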