===============
ABOUT FiNER
===============
FiNER is a tool for Named Entity Recognition in Finnish running text, developed at the University of Helsinki for the FIN-CLARIN consortium. This distribution site contains development snapshots, which may work incompletely or, in some situations, not at all. Developer of the 2013 versions: juha.kuokkala AD helsinki.fi; contact for further development: krister.linden AD helsinki.fi.
The executables offered here have been compiled for 64-bit x86 Linux.
===============
FiNER TAGGING
===============
By default, FiNER splits the input into word-form tokens at whitespace and outputs the tokens one per line, each followed by additional TAB-separated data fields:
1) original word-form
2) lemma (base-form)
3) OMorFi morphological tags
4) OMorFi proper-name tags
5) FiNER entity tags
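The five fields listed above can be read back into a small record for further processing. The following is a minimal sketch, not part of FiNER itself: the names ``Token`` and ``parse_line`` are hypothetical, and the sketch assumes that trailing fields may be absent on some lines, in which case they are filled with empty strings.

```python
from typing import NamedTuple


class Token(NamedTuple):
    """One output line of the tagger (hypothetical record type)."""
    form: str    # 1) original word-form
    lemma: str   # 2) lemma (base-form)
    morph: str   # 3) OMorFi morphological tags
    proper: str  # 4) OMorFi proper-name tags
    entity: str  # 5) FiNER entity tags


def parse_line(line: str) -> Token:
    """Split one TAB-separated output line into its five fields.

    Assumption: a line may carry fewer than five fields; missing
    trailing fields are returned as empty strings.
    """
    fields = line.rstrip("\n").split("\t")
    fields = (fields + [""] * 5)[:5]
    return Token(*fields)
```

A caller would typically apply ``parse_line`` to every non-empty line of the tagger output and then inspect ``entity`` to find the tagged tokens.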
FiNER entity tags take the following general forms:
a) single-word entity
b) first word of a multi-word entity
c) last word of a multi-word entity
The following TAGs are used in the current implementation. Note that the subdivision of Org(anizations) and especially Loc(ations) is currently only a rough guide; in most practical applications it may be reasonable to combine the subtypes into the more general Org and Loc classes.
EnamexLocGpl - Geographical areas and places
EnamexLocPpl - Political and settlement areas
EnamexLocStr - Street locations
EnamexLocXxx - Locations without further specification
EnamexOrgCrp - Corporations, associations etc.
EnamexOrgAth - Athletic/sports organizations
EnamexOrgClt - Cultural organizations
EnamexOrgEdu - Educational organizations
EnamexOrgPlt - Political parties
EnamexOrgTvr - Media organizations (TV, radio, press)
EnamexPrsHum - Human persons
EnamexPrsTit - A title possibly preceding a personal name
TimexTmeDat - Time expressions
NumexMsrCur - Currency amounts
NumexMsrXxx - Other measurement expressions
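Merging the Org and Loc subtypes into general classes, as suggested before the tag list, could be done in a post-processing step. The sketch below is only an illustration: the function name ``coarsen`` is hypothetical, it expects the bare tag name exactly as written in the list above, and returning the short labels ``Org`` and ``Loc`` is a choice made here, not FiNER behaviour.

```python
def coarsen(tag: str) -> str:
    """Collapse Org and Loc subtypes into general classes.

    Assumption: `tag` is a bare tag name such as "EnamexOrgCrp".
    All other tags (persons, time, measures) are returned unchanged.
    """
    for prefix, general in (("EnamexOrg", "Org"), ("EnamexLoc", "Loc")):
        if tag.startswith(prefix):
            return general
    return tag
```

Person tags are deliberately left alone, since EnamexPrsHum and EnamexPrsTit mark different things (a name versus a title).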
===============
USAGE EXAMPLE
===============
$ cat > ex1.txt
Raisio Yhtymän pääjohtajan Matti Salmisen mukaan yhtiö on periaatteessa kiinnostunut Suomen Rehusta .
$ ./finer_dist/finer_tag < ex1.txt
Raisio Raisio [POS=NOUN][SUBCAT=PROPER][NUM=SG][CASE=NOM] [PROP=GEO][PROP=ORG]
Yhtymän Yhtymä [POS=NOUN][SUBCAT=PROPER][NUM=SG][CASE=GEN] [PROP=GEO]
pääjohtajan pääjohtaja [POS=NOUN][NUM=SG][CASE=GEN]
Matti Matti [POS=NOUN][SUBCAT=PROPER][NUM=SG][CASE=NOM] [PROP=FIRST][PROP=GEO][PROP=LAST]
Salmisen Salminen [POS=NOUN][SUBCAT=PROPER][NUM=SG][CASE=GEN] [PROP=GEO][PROP=LAST]
mukaan mukaan [POS=ADPOSITION] -
yhtiö yhtiö [POS=NOUN][NUM=SG][CASE=NOM] -
on olla [POS=VERB][VOICE=ACT][MOOD=INDV][TENSE=PRESENT][PERS=SG3] -
periaatteessa periaate [POS=NOUN][NUM=SG][CASE=INE] -
kiinnostunut kiinnostua [POS=VERB][VOICE=ACT][PCP=NUT][CMP=POS][NUM=SG][CASE=NOM] -
Suomen Suomi [POS=NOUN][SUBCAT=PROPER][NUM=SG][CASE=GEN] [PROP=GEO][PROP=LAST]
Rehusta Rehu [POS=NOUN][SUBCAT=PROPER][NUM=SG][CASE=ELA] [PROP=GEO][PROP=LAST]
. . [SUBCAT=PUNCTUATION] -
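The same call can be made from a script. This is a minimal sketch under stated assumptions: the wrapper name ``run_tagger`` is hypothetical, the executable path is the one used in the example above, and UTF-8 input and output is assumed rather than documented. The only behaviour relied on is what the example shows: text in on standard input, tagged lines out on standard output.

```python
import subprocess


def run_tagger(text, cmd=("./finer_dist/finer_tag",)):
    """Feed `text` to the tagger on stdin and return its stdout.

    Assumptions: `cmd` points at the executable from the usage
    example, and the tool reads and writes UTF-8.
    Raises CalledProcessError if the command exits with an error.
    """
    result = subprocess.run(
        list(cmd),
        input=text,
        capture_output=True,
        text=True,
        encoding="utf-8",
        check=True,
    )
    return result.stdout
```

The returned string can then be split into lines and processed one token at a time.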