=============== ABOUT FiNER ===============

FiNER is a tool for Named Entity Recognition in Finnish running text,
developed at the University of Helsinki for the FIN-CLARIN consortium.
This distribution site contains development snapshots, which may work
only partially or, in some situations, not at all.

Developer of the 2013 versions: juha.kuokkala AD helsinki.fi
Contact for further development: krister.linden AD helsinki.fi

The executables offered here have been compiled for 64-bit x86 Linux.

=============== FiNER TAGGING ===============

By default, FiNER separates word-form tokens at white space and outputs
them one per line with additional TAB-separated data fields:

  1) original word-form
  2) lemma (base-form)
  3) OMorFi morphological tags
  4) OMorFi proper-name tags
  5) FiNER entity tags

FiNER entity tags take the following general forms:

  a) single-word entity
  b) first word of a multi-word entity
  c) last word of a multi-word entity

The following tags are used in the current implementation. Note that the
subdivision of Org(anizations) and especially Loc(ations) is currently
only a rough guide, and in most practical applications it may be
reasonable to combine the subclasses into more general Org and Loc
classes.

  EnamexLocGpl - Geographical areas and places
  EnamexLocPpl - Political and settlement areas
  EnamexLocStr - Street locations
  EnamexLocXxx - Locations without further specification
  EnamexOrgCrp - Corporations, associations etc.
  EnamexOrgAth - Athletic/sports organizations
  EnamexOrgClt - Cultural organizations
  EnamexOrgEdu - Educational organizations
  EnamexOrgPlt - Political parties
  EnamexOrgTvr - Media organizations (TV, radio, press)
  EnamexPrsHum - Human persons
  EnamexPrsTit - A title possibly preceding a personal name
  TimexTmeDat  - Time expressions
  NumexMsrCur  - Currency amounts
  NumexMsrXxx  - Other measurement expressions

=============== USAGE EXAMPLE ===============

$ cat > ex1.txt
Raisio Yhtymän pääjohtajan Matti Salmisen mukaan yhtiö on periaatteessa kiinnostunut Suomen Rehusta .

$ ./finer_dist/finer_tag < ex1.txt
Raisio	Raisio	[POS=NOUN][SUBCAT=PROPER][NUM=SG][CASE=NOM]	[PROP=GEO][PROP=ORG]
Yhtymän	Yhtymä	[POS=NOUN][SUBCAT=PROPER][NUM=SG][CASE=GEN]	[PROP=GEO]
pääjohtajan	pääjohtaja	[POS=NOUN][NUM=SG][CASE=GEN]
Matti	Matti	[POS=NOUN][SUBCAT=PROPER][NUM=SG][CASE=NOM]	[PROP=FIRST][PROP=GEO][PROP=LAST]
Salmisen	Salminen	[POS=NOUN][SUBCAT=PROPER][NUM=SG][CASE=GEN]	[PROP=GEO][PROP=LAST]
mukaan	mukaan	[POS=ADPOSITION]	-
yhtiö	yhtiö	[POS=NOUN][NUM=SG][CASE=NOM]	-
on	olla	[POS=VERB][VOICE=ACT][MOOD=INDV][TENSE=PRESENT][PERS=SG3]	-
periaatteessa	periaate	[POS=NOUN][NUM=SG][CASE=INE]	-
kiinnostunut	kiinnostua	[POS=VERB][VOICE=ACT][PCP=NUT][CMP=POS][NUM=SG][CASE=NOM]	-
Suomen	Suomi	[POS=NOUN][SUBCAT=PROPER][NUM=SG][CASE=GEN]	[PROP=GEO][PROP=LAST]
Rehusta	Rehu	[POS=NOUN][SUBCAT=PROPER][NUM=SG][CASE=ELA]	[PROP=GEO][PROP=LAST]
.	.	[SUBCAT=PUNCTUATION]	-
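
=============== POST-PROCESSING SKETCH ===============

The TAB-separated output can be post-processed with a short script. The
following Python sketch is not part of the FiNER distribution; it only
illustrates one way to read the five fields described under FiNER TAGGING
and to merge the Loc and Org subclasses into the more general classes
suggested there. The names parse_line, morph_tags and coarse_class are
hypothetical helpers, and treating "-" as an empty-field placeholder is
an assumption based on the example output above.

```python
import re

# Field order as listed under FiNER TAGGING (names chosen for this sketch).
FIELD_NAMES = ("word", "lemma", "morph", "prop", "entity")

# Matches one bracketed KEY=VALUE tag such as [POS=NOUN] or [PROP=GEO].
TAG_RE = re.compile(r"\[([A-Z0-9]+)=([^\]]*)\]")


def parse_line(line):
    """Split one output line into a dict keyed by FIELD_NAMES.

    Missing trailing fields become empty strings. A lone "-" in the
    proper-name or entity field is assumed to mean "no value".
    """
    fields = line.rstrip("\n").split("\t")
    fields += [""] * (len(FIELD_NAMES) - len(fields))
    row = dict(zip(FIELD_NAMES, fields))
    for key in ("prop", "entity"):
        if row[key] == "-":
            row[key] = ""
    return row


def morph_tags(tag_string):
    """Turn "[POS=NOUN][NUM=SG]" into [("POS", "NOUN"), ("NUM", "SG")].

    A list of pairs is returned instead of a dict because a key such
    as PROP can occur several times on one token.
    """
    return TAG_RE.findall(tag_string)


def coarse_class(tag_name):
    """Collapse Loc and Org subclasses, e.g. EnamexLocGpl -> EnamexLoc.

    Other tag names are returned unchanged.
    """
    for prefix in ("EnamexLoc", "EnamexOrg"):
        if tag_name.startswith(prefix):
            return prefix
    return tag_name
```

For instance, applying parse_line to the "Raisio" line of the example
output and passing the "prop" field to morph_tags gives the pairs
("PROP", "GEO") and ("PROP", "ORG"). coarse_class expects a bare tag name
such as EnamexOrgCrp; any markup around the name in the entity field has
to be removed by the caller first.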