=== Evaluating All Scenarios ===
  
  * For constituent evaluation on gold word segmentation of bracketed output (e.g. PTB), we will use a modified version of Parseval's evalb: [[http://pauillac.inria.fr/~seddah/evalb_spmrl2013.tar.gz]]. Add -fPIC to the gcc flags to compile it on Linux.
**Update, February 2014: the evalb package that was available on Djamé's site was not the correct one; if your version does not have the -X switch, it is the buggy one.**
  * For dependency evaluation on gold word segmentation, we will use the CoNLL 2007 evaluation: [[http://nextens.uvt.nl/depparse-wiki/SoftwarePage#eval07.pl]]
  * For the fully raw scenario, we will use tedeval in the unlabeled condition ([[http://www.tsarfaty.com/unipar/index.html]]); the wrapper is here: {{:dldata:tedeval_wrapper_08192013.tar.bz2}}
  
  * French MWE Evaluation
On top of the classical evalb and eval07.pl evaluations, we will also provide results on multiword expressions (MWEs).
Thanks to Marie Candito, the evaluator for dependency output is provided with the tools (see test/tools/do_eval_dep_mwe.pl).
In the next few days we will provide the same script for MWE evaluation of constituency parses; in the meantime, here is the readme of the current tool.

SPMRL 2013 shared task dependency evaluation script for French.

EXPECTED FORMAT for marking MWEs:

   The script assumes that all MWEs are flat, with one component governing
   all the other components of the MWE with dependencies labeled <MWE_LABEL>.
   If provided, the part-of-speech of the MWE is expected to be given as the
   value of a <MWE_POS_FEAT> feature on the head token of the MWE.

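For illustration, here is a hypothetical fragment in the 10-column CoNLL-2007 layout (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL; real files are tab-separated) marking the adverbial MWE "à la fois" with the default names from the USAGE section below. The sentence, tags, and feature value are invented:

<code>
1   Il      il       CL   CLS   _             2   suj       _   _
2   chante  chanter  V    V     _             0   root      _   _
3   à       à        P    P     mwehead=ADV   2   mod       _   _
4   la      le       D    DET   _             3   dep_cpd   _   _
5   fois    fois     N    NC    _             3   dep_cpd   _   _
</code>
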
OUTPUT:

   The script always outputs two evaluations, and possibly a third one:

   - precision/recall/F-measure on components of MWEs (excluding heads of MWEs).
     A component of an MWE is counted as correct if it is attached to the same
     token as in the gold file, with label <MWE_LABEL>.

   - precision/recall/F-measure on full MWEs.
     An MWE is counted as correct if its sequence of tokens also forms
     an MWE in the gold file.

   - if both the gold file and the system file contain at least one <MWE_POS_FEAT>
     feature, then a third evaluation is also provided, which uses a stricter
     criterion for full MWEs: they have to be composed of the same tokens as in
     the gold file AND the gold and predicted parts-of-speech for the MWE have
     to match.

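To make the first two metrics concrete, here is a minimal Python sketch of the same computation. It is not the official do_eval_dep_mwe.pl: it assumes tab-separated CoNLL-2007 input with aligned sentences in both files, the default label dep_cpd, and it ignores the <MWE_POS_FEAT>-based third evaluation; all function names are ours.

<code python>
def read_conll(path):
    """Yield sentences as lists of token rows (each row a list of columns)."""
    sent = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                if sent:
                    yield sent
                sent = []
            else:
                sent.append(line.split("\t"))
    if sent:
        yield sent

def mwe_items(sent, mwe_label="dep_cpd"):
    """Return (components, full_mwes) for one sentence.

    components: set of (component_id, head_id) pairs for non-head MWE components;
    full_mwes:  set of sorted token-id tuples, one per flat MWE (head included)."""
    comps = {(int(tok[0]), int(tok[6])) for tok in sent if tok[7] == mwe_label}
    grouped = {}
    for cid, hid in comps:
        grouped.setdefault(hid, [hid]).append(cid)
    full = {tuple(sorted(ids)) for ids in grouped.values()}
    return comps, full

def prf(gold, pred):
    """Precision, recall and F-measure over two sets of items."""
    correct = len(gold & pred)
    p = correct / len(pred) if pred else 0.0
    r = correct / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

def evaluate(gold_path, sys_path):
    gold_comps, gold_full, sys_comps, sys_full = set(), set(), set(), set()
    # Sentences are paired by position, so both files must be aligned.
    for i, (g, s) in enumerate(zip(read_conll(gold_path), read_conll(sys_path))):
        c, m = mwe_items(g)
        gold_comps |= {(i, x) for x in c}
        gold_full |= {(i, x) for x in m}
        c, m = mwe_items(s)
        sys_comps |= {(i, x) for x in c}
        sys_full |= {(i, x) for x in m}
    print("MWE components P/R/F: %.4f %.4f %.4f" % prf(gold_comps, sys_comps))
    print("full MWEs      P/R/F: %.4f %.4f %.4f" % prf(gold_full, sys_full))
</code>
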
USAGE: perl do_eval_dep_mwe.pl [OPTIONS] -g <gold standard conll> -s <system output conll>

   [ -mwe_label <MWE_LABEL> ]       label used for components of MWEs. Default = dep_cpd
   [ -mwe_pos_feat <MWE_POS_FEAT> ] name of the feature that marks heads of MWEs. Default = mwehead
   [ -help ]
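For instance, a typical invocation with the default names spelled out explicitly (the file names are placeholders):

<code>
perl do_eval_dep_mwe.pl -mwe_label dep_cpd -mwe_pos_feat mwehead -g gold.conll -s system.conll
</code>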