Semantic Knowledge Acquisition and Categorisation

Alessandro Lenci, Simonetta Montemagni and Vito Pirrelli
Section: Language and Computation
Level: Workshop

Endorsed by SIGSEM, the ACL Special Interest Group in Computational Semantics


The sheer amount of knowledge necessary to shed light on the way word meanings mutually relate in context or distribute in lexico-semantic classes appears to exceed the limits of human conscious awareness and descriptive capability. Particularly at this level of linguistic analysis, then, we seem to be in need of automatic ways of filtering, structuring and classifying semantic evidence through inspection of a large number of word uses in context. Totally or partially unsupervised inductive methods of knowledge acquisition from corpus data are credited with being able to provide such ways. Yet it remains to be seen how the acquired information can best be represented in current formal models of knowledge representation, so that it can be made available to mainstream NLP applications.

There are reasons to believe that this integration will require much more than a simple extension of off-the-shelf machine learning technology. At the same time, any major breakthrough in this area is bound to have significant repercussions on the way word meanings and lexico-semantic classes in general are formally represented and used in applications. With this in mind, the workshop intends to focus on the interaction between techniques for inducing semantic information from corpus data and formal methods of linguistic knowledge representation. In particular, we encourage in-depth analysis of the assumptions underlying the proposed techniques and methods, and discussion of possible connections with cognitive, linguistic, logical and philosophical issues.

Topics of Interest

Possible themes for contributions are:

Further Particulars

Over the last ten years, the workshop organisers have been very active in the development of large lexical databases, the creation of syntactic and semantic treebanks, the definition of syntactic and semantic layers of lexical representation, the design of ontologies, the design of standards and guidelines for computational lexicons and text corpora, the coordination of standardisation projects and language resource building projects in Europe, and the design and implementation of systems, based on machine learning technology, for acquiring linguistic information from text corpora. Some of this work was carried out in the framework of projects such as ACQUILEX-I and II, Survey of Linguistic Resources for NLP, ONOMASTICA, MULTEXT, COLSIT, LS-GRAM, MEMORIA, EuroWordNet, SPARKLE, PAROLE, SIMPLE, ELSE, EAGLES/ISLE, MATE and UNL of the United Nations.



Istituto di Linguistica Computazionale (CNR)
Area della Ricerca CNR
via Alfieri 1 (San Cataldo)
I-56010 PISA
Phone: +39-050-315 2847/2850
Fax: +39-050-315 2834
E-mail: {lenci,simo,vito}