TaxoEmbed
Sapienza University of Rome Linguistic Computing Laboratory

Supervised Distributional Hypernym Discovery via Domain Adaptation

About

TaxoEmbed is a supervised distributional framework for hypernym discovery that operates at the sense level, enabling large-scale automatic acquisition of disambiguated taxonomies. It exploits semantic regularities between hyponyms and hypernyms in embedding spaces to learn a hypernym transformation matrix, and integrates a domain clustering algorithm to produce domain-specific models that are sensitive to the target data. Experiments on ten different domains show that TaxoEmbed is flexible and robust enough to accommodate heterogeneous training pairs, drawn from manually curated knowledge bases as well as resources derived from Open Information Extraction (OIE).
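
As a rough illustration of the idea of a hypernym transformation matrix, the sketch below fits a linear map from hyponym vectors to hypernym vectors by least squares, trains one such map per domain, and ranks candidate hypernyms by cosine similarity. It is not the released Python API: the function names, the ridge parameter `reg`, the use of precomputed domain labels in place of the clustering step, and the toy data are all assumptions made for this example, and the objective used in the reference paper may differ in its details.

```python
import numpy as np


def learn_transformation(hyponyms, hypernyms, reg=0.0):
    """Fit a matrix psi such that psi @ hyponym approximates its hypernym.

    hyponyms, hypernyms: arrays of shape (n_pairs, dim), row i forming one pair.
    reg: optional ridge penalty (hypothetical parameter, not from the paper).
    """
    X = np.asarray(hyponyms, dtype=float)
    Y = np.asarray(hypernyms, dtype=float)
    d = X.shape[1]
    # Ridge regression written as an augmented least-squares problem:
    # minimise ||X W - Y||^2 + reg * ||W||^2, where W = psi^T.
    X_aug = np.vstack([X, np.sqrt(reg) * np.eye(d)])
    Y_aug = np.vstack([Y, np.zeros((d, Y.shape[1]))])
    W, *_ = np.linalg.lstsq(X_aug, Y_aug, rcond=None)
    return W.T


def learn_domain_models(pairs_by_domain, reg=0.0):
    """Train one transformation matrix per domain label.

    pairs_by_domain: dict mapping a domain name to a (hyponyms, hypernyms) tuple.
    """
    return {
        domain: learn_transformation(X, Y, reg)
        for domain, (X, Y) in pairs_by_domain.items()
    }


def rank_hypernyms(psi, hyponym, candidates, k=5):
    """Return indices of the k candidates closest (by cosine) to psi @ hyponym."""
    projected = psi @ np.asarray(hyponym, dtype=float)
    C = np.asarray(candidates, dtype=float)
    norms = np.linalg.norm(C, axis=1) * np.linalg.norm(projected) + 1e-12
    sims = (C @ projected) / norms
    return np.argsort(-sims)[:k]


if __name__ == "__main__":
    # Toy data: random vectors stand in for real sense embeddings.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 4))
    Y = X @ rng.normal(size=(4, 4)).T
    models = learn_domain_models({"toy_domain": (X, Y)})
    print(rank_hypernyms(models["toy_domain"], X[0], Y, k=3))
```

In practice the rows of `hyponyms` and `hypernyms` would be sense vectors looked up for each training pair, and the candidate set would be the vocabulary of sense vectors for the target domain.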

Reference Paper

Luis Espinosa-Anke, José Camacho Collados, Claudio Delli Bovi and Horacio Saggion.
Supervised Distributional Hypernym Discovery via Domain Adaptation. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 424–435, Austin, Texas, USA, 1-5 November 2016.

Contacts

Luis Espinosa-Anke
luis [dot] espinosa [at] upf [dot] edu

José Camacho Collados
collados [at] di.uniroma1 [dot] it
bn:17381131n @ BabelNet

Claudio Delli Bovi
dellibovi [at] di.uniroma1 [dot] it
bn:17381128n @ BabelNet

Horacio Saggion
horacio [dot] saggion [at] upf [dot] edu

Download

Training Data: Wikidata, KB-Unify [ zip: 26 MB ]

Nasari Domain Labels [ tsv: 46 MB ]

SensEmbed Sense Vectors [ bin: 3.0 GB ]


Python API v0.9 [ zip: 7.0 KB ]

Java API [ Coming Soon! ]

README

Updates


Last update: Nov 8th 2016 by Claudio Delli Bovi