  
=== Parseval's Evaluation ===
Note that we used a modified version of Evalb (Black et al., 1991) [[http://pauillac.inria.fr/~seddah/evalb_spmrl2013_good.tar.gz]] so that
  * (i) it can process the particular format of the SPMRL data,
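As a point of reference, the bare Parseval metric (independently of the modified Evalb above) can be sketched in a few lines. This is an illustrative reimplementation, not the scorer's code; the bracket tuples are assumed to be already extracted from the trees:

```python
from collections import Counter

def parseval_scores(gold_brackets, pred_brackets):
    """Parseval: precision, recall and F1 over labeled bracket multisets.

    Each bracket is a (label, start, end) span tuple; Counter intersection
    implements the matched-bracket count.
    """
    gold, pred = Counter(gold_brackets), Counter(pred_brackets)
    matched = sum((gold & pred).values())
    precision = matched / sum(pred.values()) if pred else 0.0
    recall = matched / sum(gold.values()) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Toy example: the prediction recovers 3 of the 4 gold brackets.
gold = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]
pred = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("PP", 3, 5)]
p, r, f = parseval_scores(gold, pred)  # 0.75 across the board
```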
====== CROSS FRAMEWORKS EVALUATION ======
We actually do have the cross-framework results (only two data points are still missing; we're recalculating them).

The evaluation protocol is the following:
  * train5k files
  * gold morphology (and predicted as well, but gold matters most here, as it alleviates the differences in predicted-morphology accuracy across the various languages)
  * evaluation on a subset of the test file: the first 5,000 tokens, with respect to sentence boundaries (so that gives 5007 tokens for French and 4983 for Arabic, for example, as in the CoNLL 2007 test files)

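A plausible reading of the truncation rule above (cut at the sentence boundary closest to 5,000 tokens, which is consistent with 5007 tokens for French and 4983 for Arabic) can be sketched as follows; the function name and the exact tie-breaking rule are our assumptions:

```python
def cut_near_token_target(sentence_lengths, target=5000):
    """Pick the sentence boundary whose cumulative token count is closest
    to `target`, never splitting a sentence.

    Returns (number of sentences kept, number of tokens kept).
    """
    best = (0, 0)
    best_gap = target  # gap of the empty prefix
    tokens = 0
    for i, n in enumerate(sentence_lengths, start=1):
        tokens += n
        gap = abs(tokens - target)
        if gap < best_gap:
            best, best_gap = (i, tokens), gap
        if tokens >= target:  # boundaries only move further away now
            break
    return best

# Toy examples: the cut may land just above or just below the target.
assert cut_near_token_target([4980, 27, 10]) == (2, 5007)
assert cut_near_token_target([4983, 40, 10]) == (1, 4983)
```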
The metrics you're seeing are:
  * Acc. (x100) -> TedEval accuracy
  * Ex. gold (%) -> exact match w.r.t. the gold (PTB for the constituency file and CoNLL for the dependency files)
  * Ex. gen (%) -> exact match w.r.t. the generalized gold (that is, the generic tree being the intersection of the two other golds)
  * Norm. -> normalization factor
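As an illustration of the two exact-match columns (not the actual TedEval code), exact match is simply the percentage of sentences whose predicted tree is identical to the reference tree, whichever gold (plain or generalized) serves as the reference:

```python
def exact_match(pred_trees, gold_trees):
    """Percentage of sentences whose predicted tree equals the reference
    exactly (any hashable tree representation works, e.g. a bracketed
    string in a canonical layout)."""
    assert len(pred_trees) == len(gold_trees) > 0
    hits = sum(p == g for p, g in zip(pred_trees, gold_trees))
    return 100.0 * hits / len(gold_trees)

# Toy example with trees as bracketed strings: 2 of 3 sentences match.
gold = ["(S (NP a) (VP b))", "(S (NP c))", "(S (VP d))"]
pred = ["(S (NP a) (VP b))", "(S (VP c))", "(S (VP d))"]
score = exact_match(pred, gold)
```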
  
[[http://pauillac.inria.fr/~seddah/official_cross_tedeval_unlabled-70-5ktok.spmrl_results.html|web]]
  
[[http://pauillac.inria.fr/~seddah/official_cross_tedeval_unlabled-70-5ktok.csv|csv]]

Given the time constraints, we only compared the IMS results (PTB vs. CoNLL) and we gave a baseline (
  
  
=== Why does it take so long to get the cross framework results? ===
Murphy's law at its most extreme. The last issue was the cluster monitoring dying because some magic numbers were exhausted by the shell, so it killed all evaluations (and most of them took more than 12 hours because of (a) a race condition somewhere and (b) the server room being too hot, so the server throttled the CPU frequency down to 1 GHz (instead of way, way more), and so did the RAM bandwidth).
  
  
official_results_pages_news.txt · Last modified: 2015/04/24 22:48 by dseddah