Datasets and Resources

As part of its ongoing research, the HLTCOE creates new datasets and resources and supplements existing ones. Each resource download contains detailed information about its contents, including any restrictions on use or redistribution.

HLTCOE GitHub page

XLEL-21: Cross Language Entity Linking in 21 Languages

This collection was developed to support training and evaluation of cross-language entity linking, in which named entities mentioned in non-English text are linked to an English knowledge base. It includes over 55,000 queries across twenty-one non-English languages, plus an English version of each query.

Human Language Technology Center of Excellence