Machine Translation Archive

Index to evaluation measures

Publications 2000-2004

Back translation

(2004) Slaven Bilac & Hozumi Tanaka: A hybrid back-transliteration system for Japanese.  Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 999KB]

(2004) Slaven Bilac & Hozumi Tanaka: Improving back-transliteration by combining information sources. First International Joint Conference on Natural Language Processing, Hainan Island, China, March 22-24, 2004; pp.216-223. [abstract]

(2000) Jay Tucker & Ace Sarich: A voice-enabled phrase-based translation system. Translating and the Computer 22: proceedings of the Twenty-second International Conference… 16-17 November 2000 (London: Aslib, 2000); 4pp.  [PDF, 32KB]

Cloze technique

(2000) Harold Somers & Elizabeth Wild: Evaluating machine translation: the cloze procedure revisited. Translating and the Computer 22: proceedings of the Twenty-second International Conference… 16-17 November 2000 (London: Aslib, 2000); 11pp.  [PDF, 85KB]

Confidence estimation

(2002) Eiichiro Sumita, Yasuhiro Akiba, & Kenji Imamura: Reliability measures for translation quality. ICSLP 2002, Interspeech 2002: 7th International Conference on Spoken Language Processing, September 16-20, 2002, Denver, Colorado, USA; pp.1893-1896; abstract [PDF, 43KB]

 (2000) Damir Ćavar, Uwe Küssner & Dan Tidhar: From off-line evaluation to on-line selection. In: Wolfgang Wahlster (ed.) Verbmobil: foundations of speech-to-speech translation. (Berlin: Springer, 2000); pp. 597-610. [abstract]

Edit distance

(2003) Jesús Tomás, Josep Àngel Mas, & Francisco Casacuberta: A quantitative method for machine translation evaluation. EACL 2003: Proceedings of the Workshop on Evaluation Initiatives in Natural Language Processing: are evaluation methods, metrics and resources reusable? April 14th 2003, Budapest, Hungary; pp.27-34. [PDF, 6570KB]

(2001) Yasuhiro Akiba, Kenji Imamura & Eiichiro Sumita: Using multiple edit distances to automatically rank machine translation output. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp.15-20. [PDF, 271KB]

Error detection and correction

(2004) Ariadna Font Llitjós & Jaime Carbonell: The translation correction tool: English-Spanish user studies. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.347-350. [PDF, 606KB]

(2000) Jean-Marc Jutras: An automatic reviser: the TransCheck system. ANLP-NAACL-2000: proceedings of the Sixth conference on Applied Natural Language Processing and 1st Meeting of the North American Chapter of the Association for Computational Linguistics, April 29 – May 4, 2000, Seattle, Washington; pp.127-134. [PDF, 658KB]

Evaluation measures and metrics

(2004) Yasuhiro Akiba, Eiichiro Sumita, Hiromi Nakaiwa, Seiichi Yamamoto, & Hiroshi G. Okuno: Incremental methods to select test sentences for evaluating translation ability.  LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.2015-2018. [PDF, 271KB]

(2004) Yasuhiro Akiba, Eiichiro Sumita, Hiromi Nakaiwa, Seiichi Yamamoto, & Hiroshi Okuno: Using a mixture of n-best lists from multiple MT systems in rank-sum-based confidence measure for MT outputs. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 152KB]

(2004) Bogdan Babych & Anthony Hartley: Extending the BLEU MT evaluation method with frequency weightings.  ACL 2004: 42nd annual meeting of the Association for Computational Linguistics: Proceedings of the conference, 21-26 July 2004, Barcelona, Spain; pp. 621-628. [PDF, 132KB]

(2004) Bogdan Babych, Debbie Elliott, & Anthony Hartley: Extending MT evaluation tools with translation complexity metrics.  Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 68KB]

(2004) Bogdan Babych, Debbie Elliott, & Anthony Hartley: Calibrating resource-light automatic MT evaluation: a cheap approach to ranking MT systems by the usability of their output. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.2031-2034. [PDF, 237KB]

(2004) Bogdan Babych & Anthony Hartley: Modelling legitimate translation variation for automatic evaluation of MT quality. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.833-836. [PDF, 283KB]

(2004) Bogdan Babych: Weighted n-gram model for evaluating machine translation output. Proceedings of the 7th Annual Research Colloquium of the UK Special Interest Group for Computational Linguistics [CLUK-2004], Birmingham, UK, 6-7 January 2004; 8pp. [PDF, 174KB]

(2004) Robert S.Belvin, Susanne Riehemann, & Kristin Precoda: A fine-grained evaluation method for speech-to-speech translation using concept annotations. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.1427-1430. [PDF, 765KB]

(2004) Hervé Blanchon, Christian Boitet, Francis Brunet-Manquat, Mutsuko Tomokiyo, Agnès Hamon, Vo Trung Hung & Youcef Bey: Towards fairer evaluations of commercial MT systems on basic travel expressions corpora. International Workshop on Spoken Language Translation: Evaluation Campaign on Spoken Language Translation [IWSLT 2004], September 30 – October 1, 2004, Kyoto, Japan; pp. 21-26 [PDF, 563KB]

(2004) John Blatz, Erin Fitzgerald, George Foster, Simona Gandrabur, Cyril Goutte, Alex Kulesza, Alberto Sanchis, & Nicola Ueffing: Confidence estimation for machine translation. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; pp.315-321. [PDF, 1150KB]

(2004) Ray Clifford, Neil Granoien, Douglas Jones, Wade Shen, & Clifford Weinstein: The effect of text difficulty on machine translation performance: a pilot study with ILR-rated texts in Spanish, Farsi, Arabic, Russian and Korean. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.343-346. [PDF, 643KB]

(2004) Debbie Elliott, Anthony Hartley, & Eric Atwell: A fluency error categorization scheme to guide automated machine translation evaluation. Machine translation: from real users to research: 6th conference of the Association for Machine Translation in the Americas, AMTA 2004, Washington, DC, September 28 – October 2, 2004; ed. Robert E.Frederking and Kathryn B.Taylor (Berlin: Springer Verlag, 2004); pp. 64-73. [go to publisher details]

(2004) Debbie Elliott, Eric Atwell & Anthony Hartley: Compiling and using a shareable parallel corpus for machine translation evaluation. LREC-2004. Workshop, 25th May 2004: The amazing utility of parallel and comparable corpora; pp. 18-21. [PDF, 286KB]

(2004) Andrew Finch, Yasuhiro Akiba, & Eiichiro Sumita: How does automatic machine translation evaluation correlate with human scoring as the number of reference translations increases? LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.2019-2022. [PDF, 284KB]

(2004) Richard Jelinek: Modern MT systems and the myth of human translation: real world status quo. Translating and the Computer 26: proceedings of the Twenty-sixth International Conference on Translating and the Computer, 18-19 November 2004, London. (London: Aslib, 2004); 15pp. [PDF, 149KB]

(2004) Michael Kluck: Evaluation of cross-language information retrieval using the domain-specific GIRT data as parallel German-English corpus. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.1343-1346. [PDF, 1533KB]

(2004) Philipp Koehn: Statistical significance tests for machine translation evaluation. EMNLP-2004: Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, 25-26 July 2004, Barcelona, Spain; 8pp. [PDF, 137KB]

(2004) Alex Kulesza & Stuart M. Shieber: A learning approach to improving sentence-level MT evaluation; TMI-2004: proceedings of the Tenth Conference on Theoretical and Methodological Issues in Machine Translation, October 4-6, 2004, Baltimore, Maryland, USA; pp.75-84. [PDF, 108KB]

(2004) Alon Lavie, Kenji Sagae, & Shyamsundar Jayaraman: The significance of recall in automatic metrics for MT evaluation. Machine translation: from real users to research: 6th conference of the Association for Machine Translation in the Americas, AMTA 2004, Washington, DC, September 28 – October 2, 2004; ed. Robert E.Frederking and Kathryn B.Taylor (Berlin: Springer Verlag, 2004); pp. 134-143. [go to publisher details]

(2004) Chin-Yew Lin & Franz Josef Och: ORANGE: a method for evaluating automatic evaluation metrics for machine translation. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 211KB]

(2004) Chin-Yew Lin & Franz Josef Och: Automatic evaluation of machine translation quality using longest common subsequence and skip-bigram statistics.  ACL 2004: 42nd annual meeting of the Association for Computational Linguistics: Proceedings of the conference, 21-26 July 2004, Barcelona, Spain; pp.605-612. [PDF, 167KB]

(2004) Widad Mustafa El Hadi, Marianne Dabbadie, Ismail Timimi, Martin Rajman, Philippe Langlais, Anthony Hartley, & Andrei Popescu Belis: Work-in-progress project report: CESTA – machine translation evaluation campaign. Coling 2004: Proceedings of the Second International Workshop on Language Resources for Translation Work, Research and Training, 28th August, University of Geneva, Switzerland; pp. 16-25. [PDF, 151KB]

(2004) Franz Josef Och, Daniel Gildea, Sanjeev Khudanpur, Anoop Sarkar, Kenji Yamada, Alex Fraser, Shankar Kumar, Libin Shen, David Smith, Katherine Eng, Viren Jain, Zhen Jin, & Dragomir Radev: A smorgasbord of features for statistical machine translation.  HLT-NAACL 2004: Human Language Technology conference and North American Chapter of the Association for Computational Linguistics annual meeting, May 2-7, 2004, The Park Plaza Hotel, Boston, USA; pp. 161-168. [PDF, 192KB]

(2004) Carol Peters, Martin Braschler, Khalid Choukri, Julio Gonzalo, & Michael Kluck: The future of evaluation for cross-language information retrieval systems. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.841-844. [PDF, 266KB]

(2004) Christopher B. Quirk: Training a sentence-level machine translation confidence measure.  LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.825-828. [PDF, 243KB]

(2004) Florence Reeder: Investigation of intelligibility judgments. Machine translation: from real users to research: 6th conference of the Association for Machine Translation in the Americas, AMTA 2004, Washington, DC, September 28 – October 2, 2004; ed. Robert E.Frederking and Kathryn B.Taylor (Berlin: Springer Verlag, 2004); pp. 227-235. [go to publisher details]

(2004) Diana Santos, Belinda Mala, & Luís Sarmento: Gathering empirical data to evaluate MT from English to Portuguese.  LREC-2004. Workshop, 25th May 2004: The amazing utility of parallel and comparable corpora; pp. 14-17. [PDF, 317KB]

(2004) Libin Shen, Anoop Sarkar, & Franz Josef Och: Discriminative reranking for machine translation.  HLT-NAACL 2004: Human Language Technology conference and North American Chapter of the Association for Computational Linguistics annual meeting, May 2-7, 2004, The Park Plaza Hotel, Boston, USA; pp.177-184. [PDF, 106KB]

(2004) Radu Soricut & Eric Brill: A unified framework for automatic evaluation using n-gram cooccurrence statistics.  ACL 2004: 42nd annual meeting of the Association for Computational Linguistics: Proceedings of the conference, 21-26 July 2004, Barcelona, Spain; pp.613-620. [PDF, 119KB]

(2004) Aree Teeraparbseree: Qualitative evaluation of automatically calculated acception based MLDB.  Coling 2004: Proceedings of the Workshop on Multilingual Linguistic Resources (MLR2004), August 28th 2004, University of Geneva, Switzerland; pp.23-30. [PDF, 261KB]

(2004) M. Vanni, C.R.Voss, & C. Tate: Ground truth, reference truth & “omniscient truth” – parallel phrases in parallel texts for MT evaluation.  LREC-2004. Workshop, 25th May 2004: The amazing utility of parallel and comparable corpora; pp. 10-13. [PDF, 352KB]

(2004) Keiji Yasuda, Fumiaki Sugaya, Eiichiro Sumita, Toshiyuki Takezawa, Genichiro Kikui, & Seiichi Yamamoto: Automatic measuring of English language proficiency using MT evaluation technology. Coling 2004: Workshop eLearning for Computational Linguistics and Computational Linguistics for eLearning, Geneva, 28 August 2004; 8pp. [PDF, 762KB]

(2004) Ying Zhang & Stephan Vogel: Measuring confidence intervals for the machine translation evaluation metrics; TMI-2004: proceedings of the Tenth Conference on Theoretical and Methodological Issues in Machine Translation, October 4-6, 2004, Baltimore, Maryland, USA; pp.85-94. [PDF, 348KB]

(2004) Ying Zhang, Stephan Vogel & Alex Waibel: Interpreting BLEU/NIST scores: how much improvement do we need to have a better system?  LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.2051-2054. [PDF, 361KB]

(2004) The 2004 NIST machine translation evaluation plan (MT-04). [NIST, 2004]; 3pp. [PDF, 125KB]

(2003) Proceedings of the workshop Towards systematizing MT evaluation at the MT Summit IX, New Orleans, USA, 27 September 2003. [HTML]

(2003) Yasuhiro Akiba, Eiichiro Sumita, Hiromi Nakaiwa, Seiichi Yamamoto & Hiroshi G. Okuno: Experimental comparison of MT evaluation methods: RED vs. BLEU. MT Summit IX, New Orleans, USA, 23-27 September 2003 [PDF, 96KB]

(2003) Bogdan Babych, Anthony Hartley & Eric Atwell: Statistical modelling of MT output corpora for information extraction. In: D.Archer, P.Rayson, A.Wilson, T.McEnery (eds.) Proceedings of CL2003: International Conference on Corpus Linguistics, Lancaster University; pp.191-200. [PDF, 661KB]

(2003) Andreea Calude: Machine translation of various text genres. [Unpublished] Presented at 7th Language and Society Conference of the New Zealand Linguistic Society, November 2002, Hamilton, New Zealand. 12pp. [PDF, 206KB]

(2003) Michael Carl & Sisay Fissaha: Phrase-based evaluation of word-to-word alignments. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 253KB]

(2003) Nelson Correa: A fine-grained evaluation framework for machine translation system development. MT Summit IX, New Orleans, USA, 23-27 September 2003 [PDF, 60KB]

(2003) Deborah Coughlin: Correlating automated and human assessments of machine translation quality. MT Summit IX, New Orleans, USA, 23-27 September 2003 [PDF, 88KB]

(2003) Christopher Culy & Susanne Z. Riehemann: The limits of n-gram translation evaluation metrics. MT Summit IX, New Orleans, USA, 23-27 September 2003 [PDF, 338KB]

(2003) Marcello Federico: Evaluation frameworks for speech translation technologies. Eurospeech 2003 - Interspeech 2003: 8th European Conference on Speech Communication and Technology, Geneva, Switzerland, September 1-4, 2003; pp.377-380; abstract [PDF, 33KB]

(2003) Kenji Imamura, Eiichiro Sumita, & Yuji Matsumoto: Feedback cleaning of machine translation rules using automatic evaluation. ACL-2003: 41st Annual meeting of the Association for Computational Linguistics, July 7-12, 2003, Sapporo, Japan. [PDF, 62KB]

(2003) Kanayama Hiroshi: Paraphrasing rules for automatic evaluation of translation into Japanese. ACL 2003 International Workshop on Paraphrasing, July 11, 2003, Sapporo, Japan; 6pp. [PDF, 59KB]

(2003) Margaret King, Andrei Popescu-Belis, & Eduard Hovy: FEMTI: creating and using a framework for MT evaluation. MT Summit IX, New Orleans, USA, 23-27 September 2003; pp.224-231. [PDF, 163KB]

(2003) Gregor Leusch, Nicola Ueffing, & Hermann Ney: A novel string-to-string distance measure with applications to machine translation evaluation. MT Summit IX, New Orleans, USA, 23-27 September 2003; pp.240-247. [PDF, 104KB]

(2003) I.Dan Melamed, Ryan Green, & Joseph P.Turian: Precision and recall of machine translation. HLT-NAACL 2003: conference combining Human Language Technology conference series and the North American Chapter of the Association for Computational Linguistics conference series, May 27 – June 1, 2003, Edmonton, Canada; 3pp. [PDF, 153KB]

(2003) Rada Mihalcea & Ted Pedersen: An evaluation exercise for word alignment. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 131KB]

(2003) Andrei Popescu-Belis: An experiment in comparative evaluation: humans vs. computers. MT Summit IX, New Orleans, USA, 23-27 September 2003; pp.307-314. [PDF, 71KB]

(2003) Jesús Tomás, Josep Àngel Mas, & Francisco Casacuberta: A quantitative method for machine translation evaluation. EACL 2003: Proceedings of the Workshop on Evaluation Initiatives in Natural Language Processing: are evaluation methods, metrics and resources reusable? April 14th 2003, Budapest, Hungary; pp.27-34. [PDF, 6570KB]

(2003) Joseph P. Turian, Luke Shen, & I.Dan Melamed: Evaluation of machine translation and its evaluation. MT Summit IX, New Orleans, USA, 23-27 September 2003; pp.386-393. [PDF, 71KB]

(2003) John S. White: How to evaluate machine translation. In: Harold Somers (ed.) Computers and translation: a translator’s guide (Amsterdam/Philadelphia: John Benjamins Publishing Company, 2003); pp.211-244.

(2003) The 2003 NIST machine translation evaluation plan (MT-03). [NIST, 2003]; 3pp. [PDF, 173KB]

(2002) Proceedings of Workshop: Machine translation evaluation: human evaluators meet automated metrics, LREC-2002: Third International Conference on Language Resources and Evaluation, Las Palmas Canary Islands, 27 May 2002. [PDF, 249KB]

(2002) Richard Campbell, Carmen Lozano, Jessie Pinkham, & Martine Smets: Machine translation as a testbed for multilingual analysis; Coling-2002 workshop "Grammar engineering and evaluation", 1 September 2002, Taipei, Taiwan; 7pp. [PDF, 199KB]

(2002) Marianne Dabbadie, Anthony Hartley, Margaret King, Keith J.Miller, Widad Mustafa El Hadi, Andrei Popescu-Belis, Florence Reeder, & Michelle Vanni: A hands-on study of the reliability and coherence of evaluation metrics. LREC-2002: Third International Conference on Language Resources and Evaluation. Workshop: Machine translation evaluation: human evaluators meet automated metrics, Las Palmas Canary Islands, 27 May 2002; pp.8-16. [PDF, 114KB]

(2002) George Doddington: Automatic evaluation of machine translation quality using n-gram co-occurrence statistics. HLT 2002: Human Language Technology Conference: proceedings of the second international conference on human language technology research, March 24-27, 2002, San Diego, California; ed. Mitchell Marcus [San Francisco, CA: Morgan Kaufmann for DARPA]; pp. 138-145. [PDF, 344KB]

(2002) George Doddington: Automatic evaluation of language translation using n-gram cooccurrence statistics. LREC-2002: Third International Conference on Language Resources and Evaluation. Workshop: Machine translation evaluation: human evaluators meet automated metrics, Las Palmas Canary Islands, 27 May 2002; 9pp. [PDF of PPT presentation, 572KB]

(2002) Kurt Godden: Towards a speech-to-speech machine translation quality metric; ACL-2002 workshop "Speech-to-speech translation", 11 July 2002, Philadelphia, USA; pp. 117-120 [PDF, 194KB]

(2002) Eduard Hovy, Margaret King, & Andrei Popescu-Belis: Computer-aided specification of quality models for machine translation evaluation. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; pp.1239-1246. [PDF, 78KB]

(2002) Eduard Hovy, Maghi King, & Andrei Popescu-Belis: An introduction to MT evaluation. LREC-2002: Third International Conference on Language Resources and Evaluation. Workshop: Machine translation evaluation: human evaluators meet automated metrics, Las Palmas Canary Islands, 27 May 2002; pp.1-7. [PDF, 96KB]

(2002) Alon Lavie, Stephan Vogel, Alex Waibel, Ulrich Germann, Kevin Knight, Daniel Marcu, Young-Suk Lee, Kishore Papineni, Salim Roukos, Franz Josef Och, Moussa Bamba, Chris Cieri, Shudong Huang, Florence Reeder, George Doddington: DARPA TIDES MT group meeting, Marina del Rey, Jan 25, 2002; 9pp. [PDF of PPT, 540KB]

(2002) Kishore Papineni: Machine translation evaluation: n-grams to the rescue. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; 2pp. [abstract only] [PDF, 52KB]

(2002) Kishore Papineni, Salim Roukos, Todd Ward & Wei-Jing Zhu: BLEU: a method for automatic evaluation of machine translation. ACL-2002: 40th Annual meeting of the Association for Computational Linguistics, Philadelphia, July 2002; pp.311-318. [PDF, 281KB]

(2002) Kishore Papineni, Salim Roukos, Todd Ward, John Henderson, & Florence Reeder: Corpus-based comprehensive and diagnostic MT evaluation: initial Arabic, Chinese, French, and Spanish results. HLT 2002: Human Language Technology Conference: proceedings of the second international conference on human language technology research, March 24-27, 2002, San Diego, California; ed. Mitchell Marcus [San Francisco, CA: Morgan Kaufmann for DARPA]; pp. 132-137. [PDF, 198KB]

(2002) Andrei Popescu-Belis, Margaret King, & Houcine Benantar: Towards a corpus of corrected human translations. LREC-2002: Third International Conference on Language Resources and Evaluation. Workshop: Machine translation evaluation: human evaluators meet automated metrics, Las Palmas Canary Islands, 27 May 2002; pp.17-21. [PDF, 50KB]

(2002) Martin Rajman & Anthony Hartley: Automatic ranking of MT systems. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; pp.1247-1253. [PDF, 48KB]

(2002) Véronique Sauron: Tearing out the terms: evaluating terms extractors. Translating and the Computer 24: proceedings from the Aslib conference held on 21-22 November 2002 (London: Aslib, 2002); 18pp. [PDF, 142KB]

(2002) Eiichiro Sumita, Yasuhiro Akiba, & Kenji Imamura: Reliability measures for translation quality. ICSLP 2002, Interspeech 2002: 7th International Conference on Spoken Language Processing, September 16-20, 2002, Denver, Colorado, USA; pp.1893-1896; abstract [PDF, 43KB]

(2002) Michelle Vanni & Keith Miller: Scaling the ISLE framework: use of existing corpus resources for validation of MT evaluation metrics across languages. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; pp.1254-1262. [PDF, 91KB]

(2002) Jianmin Yao, Ming Zhou, Tiejun Zhao, Hao Yu & Sheng Li: An automatic evaluation method for localization oriented lexicalised EBMT system. Coling 2002, Taipei, Taiwan, 26-30 August 2002 [PDF, 320KB]

(2002) The 2002 NIST machine translation evaluation plan (MT-02). [NIST, 2002]; 2pp. [PDF, 42KB]

(2001) Proceedings of the Workshop on MT evaluation, MT Summit VIII, Santiago de Compostela, Spain, 21 September 2001.

(2001) Proceedings of MT Eval Workshop, Geneva, 19-24 April 2001.

(2001) Timothy Baldwin: Low cost, high-performance translation retrieval: dumber is better. ACL-EACL-2001: 39th Annual meeting [of the Association for Computational Linguistics] and 10th Conference of the European Chapter [of ACL], July 9th - 11th 2001, Toulouse, France; pp.18-25. [PDF, 283KB]

(2001) Martin Braschler & Carol Peters: The CLEF campaign. NTCIR Workshop 2: Proceedings of the Second NTCIR Workshop on Research in Chinese & Japanese Text retrieval and Text Summarization, March 7-9, 2001, Tokyo, Japan; 6pp. [PDF, 163KB]

(2001) Chris Callison-Burch & Raymond S. Flournoy: A program for automatically selecting the best output from multiple machine translation engines. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp. 63-66. [PDF, 155KB]

(2001) Simon Corston-Oliver, Michael Gamon, & Chris Brockett: A machine learning approach to the automatic evaluation of machine translation. ACL-EACL-2001: 39th Annual meeting [of the Association for Computational Linguistics] and 10th Conference of the European Chapter [of ACL], July 9th - 11th 2001, Toulouse, France; pp.140-147. [PDF, 87KB]

(2001) M. Fuji, N. Hatanaka, E. Ito, S. Kamei, H. Kumai, T. Sukehiro, T. Yoshimi & H. Isahara: Evaluation method for determining groups of users who find MT "useful". MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp. 103-108. [PDF, 303KB]

(2001) Michael Gamon, Hisami Suzuki & Simon Corston-Oliver: Using machine learning for system-internal evaluation of transferred linguistic representations. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp. 109-114. [PDF, 85KB]

(2001) A. Guessoum & R. Zantout: Semi-automatic evaluation of the grammatical coverage of machine translation systems. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp. 133-138. [PDF, 55KB]

(2001) Kevin Knight, Lori Levin, Young-Suk Lee, Salim Roukos, & Alex Waibel: Machine translation in TIDES. Planning Committee report, [2001]. 13 slides [PDF from PPT, 20KB]

(2001) Keith J. Miller & Michelle Vanni: Scaling the ISLE taxonomy: development of metrics for the multi-dimensional characterisation of machine translation quality. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp.229-238. [PDF, 134KB]

(2001) Hermann Ney: Stochastic modelling: from pattern classification to language translation. ACL-EACL 2001 workshop "Data-driven machine translation", July 7, 2001, Toulouse, France; pp.33-37. [PDF, 68KB]

(2001) Kishore Papineni, Salim Roukos, Todd Ward, & Wei-Jing Zhu: Bleu: a method for automatic evaluation of machine translation. IBM Research Report, RC22176, September 17, 2001. 10pp. [PDF, 379KB]

 (2001) Andrei Popescu-Belis: MT evaluation [workshop at MT Summit VIII].  In: MT News International no.29, December 2001. [PDF]

 (2001) Flo Reeder: Hands-on evaluation workshops: a report on a continuing series.  In: MT News International no.27, Spring 2001. [PDF]

(2001) Florence Reeder: Is that your final answer?  HLT-2001: Proceedings of the First International Conference on Human Language Technology Research, San Diego, CA, March 18-21, 2001; 4pp. [PDF, 55KB]

 (2001) Celia Rico: Reproducible models for CAT tools evaluation: a user-oriented perspective. Translating and the Computer 23: papers from the Aslib conference held on 29 & 30 November 2001 (London: Aslib, 2001); 12pp. [PDF, 76KB]

(2001) Fumiaki Sugaya, Keiji Yasuda, Toshiyuki Takezawa & Seiichi Yamamoto: Precise measurement method of a speech translation system's capability with a paired comparison method between the system and humans. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp.345-350. [PDF, 78KB]

(2001) Michelle Vanni & Keith J. Miller: Scaling the ISLE framework: validating tests of machine translation quality for multi-dimensional measurement. MT Summit VIII, Santiago de Compostela, Spain, 18-22 September 2001. Workshop on MT Evaluation; pp.21-27. [PDF, 121KB]

(2001) John White: Predicting intelligibility from fidelity in MT evaluation. MT Summit VIII, Santiago de Compostela, Spain, 18-22 September 2001. Workshop on MT Evaluation; pp.35-37. [PDF, 235KB]

(2001) Rick Woyde: Introduction to the SAE J2450 translation quality metric. Language International 13 (2), April 2001; pp.37-39. [PDF, 616KB]

(2001) Keiji Yasuda, Fumiaki Sugaya, Toshiyuki Takezawa, Seiichi Yamamoto & Masuzo Yanagida: An automatic evaluation method of translation quality using translation answer candidates queried from a parallel corpus. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp.373-378. [PDF, 193KB]

(2001) Shoichi Yokoyama, Hideki Kashioka, Akira Kumano, Masaki Matsudaira, Yoshiko Shirokizawa, Shuji Kodama, Terumasa Ehara, Shinichiro Miyazawa & Yuzo Murata: An automatic evaluation method for machine translation using two-way MT. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp.379-384. [PDF, 176KB]

(2001) [NIST]. Automatic evaluation of language translation using n-gram co-occurrence statistics. [NIST, 2001]; 8pp. [PDF of PPT, 40KB]

(2000) Lars Ahrenberg, Magnus Merkel, Anna Sågvall Hein, & Jörg Tiedemann: Evaluation of word alignment systems. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 1255-1261. [PDF, 124KB]

(2000) Lars Ahrenberg & Magnus Merkel: Correspondence measures for MT evaluation. LREC-2000: Second International Conference on Language Resources and Evaluation. Workshop proceedings: Evaluation of machine translation, Athens, Greece, 29 May 2000; pp. 41-45. [PDF, 219KB]

(2000) Catalina Barbu & Ruslan Mitkov: Evaluation environment for anaphora resolution. MT2000: machine translation and multilingual applications in the new millennium: international conference at the University of Exeter, 20-22 November 2000, organised by the British Computer Society. [London: BCS]; 8pp. [PDF, 1772KB]

(2000) Niamh Bohan, Elisabeth Breidt, & Martin Volk: Evaluating translation quality as input to product development. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 33-37. [PDF, 57KB]

(2000) Martin Braschler, Donna Harman, Michael Hess, Michael Kluck, Carol Peters, & Peter Schäuble: The evaluation of systems for cross-language information retrieval. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 1469-1474. [PDF, 60KB]

(2000) Michael Carl: A model of competence for corpus-based machine translation. Coling 2000 in Europe: the 18th International Conference on Computational Linguistics. Proceedings of the conference, Universität des Saarlandes, Saarbrücken, Germany, 31 July - 4 August 2000; pp. 997-1001 [PDF, 473KB]

(2000) Lynette Hirschman, Florence Reeder, John D.Burger, & Keith Miller: Name translation as a machine translation evaluation task. LREC-2000: Second International Conference on Language Resources and Evaluation. Workshop proceedings: Evaluation of machine translation, Athens, Greece, 29 May 2000; pp. 21-28?. [PDF, 189KB]

(2000) Susanne J. Jekat & Lorenzo Tessiore: End-to-end evaluation of machine interpretation systems: a graphical evaluation tool. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 1583-1588. [PDF, 69KB]

(2000) Douglas A.Jones & Gregory M.Rusk: Toward a scoring function for quality-driven machine translation. Coling 2000 in Europe: the 18th International Conference on Computational Linguistics. Proceedings of the conference, Universität des Saarlandes, Saarbrücken, Germany, 31 July – 4 August 2000; pp. 376-382 [PDF, 644KB]

(2000) Sonja Nießen, Franz Josef Och, Gregor Leusch, & Hermann Ney: An evaluation tool for machine translation: fast evaluation for MT research. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 39-45. [PDF, 1048KB]

(2000) Uwe Reinke: Evaluating the linguistic performance of translation memory systems. [abstract] In: IAI Working Paper no.36, 2000; 5pp. [PDF, 76KB]

(2000) Celia Rico: Evaluation metrics for translation memories. Language International 12 (6), December 2000; pp.36-37. [PDF, 531KB]

(2000) Harold Somers & Elizabeth Wild: Evaluating machine translation: the cloze procedure revisited. Translating and the Computer 22: proceedings of the Twenty-second International Conference… 16-17 November 2000 (London: Aslib, 2000); 11pp.  [PDF, 85KB]

 (2000) Lorenzo Tessiore & Walther v. Hahn: Functional validation of a machine interpretation system: Verbmobil. In: Wolfgang Wahlster (ed.) Verbmobil: foundations of speech-to-speech translation. (Berlin: Springer, 2000); pp. 611-631. [abstract]

(2000) Michelle Vanni & Florence Reeder: How are you doing? A look at MT evaluation. Envisioning machine translation in the information future: 4th conference of the Association for Machine Translation in the Americas, AMTA 2000, Cuernavaca, Mexico, October 2000; ed. John S. White (Berlin: Springer Verlag, 2000); pp.109-116. [go to publisher details]

(2000) Stephan Vogel, Sonja Nießen, & Hermann Ney: Automatic extrapolation of human assessment of translation quality. LREC-2000: Second International Conference on Language Resources and Evaluation. Workshop proceedings: Evaluation of machine translation, Athens, Greece, 29 May 2000; pp. 35-39. [PDF, 636KB]

(2000) John S. White: Contemplating automatic MT evaluation. Envisioning machine translation in the information future: 4th conference of the Association for Machine Translation in the Americas, AMTA 2000, Cuernavaca, Mexico, October 2000; ed. John S. White (Berlin: Springer Verlag, 2000); pp.100-108. [go to publisher details]

(2000) John White: Toward an automated, task-based MT evaluation strategy. LREC-2000: Second International Conference on Language Resources and Evaluation. Workshop proceedings: Evaluation of machine translation, Athens, Greece, 29 May 2000; p. 11. [abstract only] [PDF, 72KB]

(2000) John S.White, Jennifer B. Doyon, & Susan W. Talbott: Task tolerance of MT output in integrated text processes; ANLP/NAACL 2000 workshop: Embedded machine translation systems, May 4, 2000, Seattle, Washington, [USA]; pp.9-16. [PDF, 786KB]

(2000) John White, Jennifer Doyon, & Susan Talbott: Determining the tolerance of text-handling tasks for MT output. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 29-32. [PDF, 34KB]

Evaluations of systems (see also Product reviews, User experiences)

(2004) Heather Fulford & Joaquin Granell-Zafra: The freelance translator's workstation: an empirical investigation. 9th EAMT Workshop, "Broadening horizons of machine translation and its applications", 26-27 April 2004, Malta; pp.53-61. [PDF, 165KB]

(2004) Federico Gaspari: Online MT services and real users' needs: an empirical usability evaluation. Machine translation: from real users to research: 6th conference of the Association for Machine Translation in the Americas, AMTA 2004, Washington, DC, September 28 – October 2, 2004; ed. Robert E.Frederking and Kathryn B.Taylor (Berlin: Springer Verlag, 2004); pp. 74-85. [go to publisher details]

(2004) Federico Gaspari: Integrating on-line MT services into monolingual web-sites for dissemination purposes: an evaluation perspective. 9th EAMT Workshop, "Broadening horizons of machine translation and its applications", 26-27 April 2004, Malta; pp.62-72. [PDF, 186KB]

(2004) Nano Gough & Andy Way: Robust large-scale EBMT with marker-based segmentation; TMI-2004: proceedings of the Tenth Conference on Theoretical and Methodological Issues in Machine Translation, October 4-6, 2004, Baltimore, Maryland, USA; pp.95-104. [PDF, 94KB]

(2004) Richard Jelinek: Modern MT systems and the myth of human translation: real world status quo. Translating and the Computer 26: proceedings of the Twenty-sixth International Conference on Translating and the Computer, 18-19 November 2004, London. (London: Aslib, 2004); 15pp. [PDF, 149KB]

(2004) J. Laoudi, C.Tate, & Clare R.Voss: Towards an automated evaluation of an embedded MT system. 9th EAMT Workshop, "Broadening horizons of machine translation and its applications", 26-27 April 2004, Malta; pp. 106-115. [PDF, 250KB]

(2004) Johann Roturier: Assessing the set of controlled language rules: can they improve the performance of commercial machine translation systems? Translating and the Computer 26: proceedings of the Twenty-sixth International Conference on Translating and the Computer, 18-19 November 2004, London. (London: Aslib, 2004); 14pp. [PDF, 178KB]

(2004) Diana Santos, Belinda Mala, & Luís Sarmento: Gathering empirical data to evaluate MT from English to Portuguese.  LREC-2004. Workshop, 25th May 2004: The amazing utility of parallel and comparable corpora; pp. 14-17. [PDF, 317KB]

(2004) Kiyoshi Sudo, Satoshi Sekine, & Ralph Grishman: Cross-lingual information extraction system evaluation. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 186KB]

(2004) Gregor Thurmair: Comparing rule-based and statistical MT output. LREC-2004. Workshop, 25th May 2004: The amazing utility of parallel and comparable corpora; pp. 5-9. [PDF, 225KB]

(2004) M. Vanni, C.R.Voss, & C. Tate: Ground truth, reference truth & “omniscient truth” – parallel phrases in parallel texts for MT evaluation.  LREC-2004. Workshop, 25th May 2004: The amazing utility of parallel and comparable corpora; pp. 10-13. [PDF, 352KB]

(2004) Per Weijnitz, Eva Forsbom, Ebba Gustavii, Eva Pettersson, & Jörg Tiedemann: MT goes farming: comparing two machine translation approaches on a new domain.  LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.2043-2046. [PDF, 296KB]

(2003) Olga Bezhanova: Analysis of the translation quality of the WordMagic English-Spanish Interpreter Professional. International Journal of Translation 15 (1), Jan-June 2003; pp.71-79. [PDF, 35KB]

(2003) Katri A. Clodfelder: An LSA implementation against parallel texts in French and English. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 119KB]

(2003) John Hutchins: Has machine translation improved? Some historical comparisons. MT Summit IX, New Orleans, USA, 23-27 September 2003; pp.181-188. [PDF, 190KB]

(2003) Philipp Koehn, Franz Josef Och, & Daniel Marcu: Statistical phrase-based translation. HLT-NAACL 2003: conference combining Human Language Technology conference series and the North American Chapter of the Association for Computational Linguistics conference series, May 27 – June 1, 2003, Edmonton, Canada; pp.48-54 [PDF, 100KB]

(2003) Rob Koeling, Adam Kilgarriff, David Tugwell, & Roger Evans: An evaluation of a lexicographer's workbench: building lexicons for machine translation. 7th EAMT Workshop, "Improving machine translation through other language technology tools", 13 April 2003, Budapest, Hungary; pp. 9-16 [PDF, 250KB]

(2003) Elisabeth Maier & Anthony Clarke: Scalability in MT systems. MT Summit IX, New Orleans, USA, 23-27 September 2003; pp.248-253. [PDF, 414KB]

(2003) Harold Somers & Yuri Sugita: Evaluating commercial spoken language translation software. MT Summit IX, New Orleans, USA, 23-27 September 2003; pp. 370-377. [PDF, 174KB]

(2003) David Stallard, John Makhoul, Frederick Choi, Ehry Macrostie, Premkumar Natarajan, Richard Schwartz, & Bushra Zawaydeh: Design and evaluation of a limited two-way speech translator. Eurospeech 2003 - Interspeech 2003 8th European  Conference on  Speech Communication and Technology, Geneva, Switzerland, September 1-4, 2003; pp.2221-2224; abstract [PDF, 34KB]

(2003) Eiichiro Sumita, Yasuhiro Akiba, Takao Doi, Andrew Finch, Kenji Imamura, Michael Paul, Mitsuo Shimohata & Taro Watanabe: A corpus-centred approach to spoken language translation. EACL 2003: 10th Conference of the European Chapter of the Association for Computational Linguistics, April 12-17, 2003, Budapest, Hungary. Proceedings; pp.171-174 [PDF, 243KB]

(2003) Marina Vassiliou, Stella Markantonatou, Yanis Maistros & Vangelis Karkaletsis: Evaluating specifications for controlled Greek. Controlled language translation, EAMT-CLAW-03, Dublin City University, 15-17 May 2003; pp.185-193. [PDF, 278KB]

(2003) Ashish Venugopal, Stephan Vogel, & Alex Waibel: Effective phrase translation extraction from aligned models. ACL-2003: 41st Annual meeting of the Association for Computational Linguistics, July 7-12, 2003, Sapporo, Japan. [PDF, 134KB]

(2003) Keiji Yasuda, Fumiaki Sugaya, Toshiyuki Takezawa, Seiichi Yamamoto & Masuzo Yanagida: Applications of automatic evaluation methods to measuring a capability of speech translation system. EACL 2003: 10th Conference of the European Chapter of the Association for Computational Linguistics, April 12-17, 2003, Budapest, Hungary. Proceedings; pp.371-378 [PDF, 1659KB]

 (2002) Michael Benis: Softly spoken or hard of hearing?  Language International 14 (3), June 2002; pp.26-29. [PDF, 1061KB]

(2002) R.Cattoni, G.Lazzari, N.Mana, F.Pianesi, E.Pianta, S.Burger, D.Gates, A.Lavie, L.Levin, C.Langley, K.Peterson, T.Schultz, A.Waibel, D.Wallace, F.Metze, J.McDonough, H.Soltau, L.Besacier, H.Blanchon, D.Vaufreydaz, E.Costantini, & L.Taddei: Not only translation quality: evaluating the NESPOLE! speech-to-speech translation system along other viewpoints; ACL-2002 workshop "Speech-to-speech translation", 11 July 2002, Philadelphia, USA; 9pp. [PDF, 148KB]

(2002) Marianne Dabbadie, Widad Mustafa El Hadi, & Ismail Timimi: Terminological enrichment for non-interactive MT evaluation. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; pp.1878-1884. [PDF, 63KB]

(2002) Robert E.Frederking, Alan W.Black, Ralf D.Brown, John Moody, & Eric Steinbrecher: Field testing the Tongues speech-to-speech machine translation system. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; pp.160-164. [PDF, 130KB]

(2002) Bowen Hui: Measuring user acceptability of machine translations to diagnose system errors: an experience report; Coling-2002 workshop "Machine translation in Asia", 1 September 2002, Taipei, Taiwan; 7pp. [PDF, 254KB]

(2002) Yu-Seop Kim, Jeong-Ho Chang, Byoung-Tak Zhang: A comparative evaluation of data-driven models in translation selection of machine translation. Coling 2002, Taipei, Taiwan, 26-30 August 2002 [PDF, 170KB]

(2002) Philippe Langlais, Marie Loranger, & Guy Lapalme: Translators at work with TransType: resource and evaluation. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; pp.2128-2135. [PDF, 253KB]

(2002) Alon Lavie, Florian Metze, Roldano Cattoni, & Erica Costantini: A multi-perspective evaluation of the NESPOLE! speech-to-speech translation system; ACL-2002 workshop "Speech-to-speech translation", 11 July 2002, Philadelphia, USA; pp. 121-128 [PDF, 144KB]

(2002) Angela Moisl: A fresh look at MT. [review of Personal Translator.] Language International 14 (5), October 2002; pp.26-31. [PDF, 1631KB]

(2002) Douglas W.Oard, Frederic C. Gey, & Bonnie J. Dorr: Evaluating Arabic retrieval from English or French queries: the TREC-2001 cross-language information retrieval track. LREC-2002: Third International Conference on Language Resources and Evaluation. Workshop Arabic language resources and evaluation: status and prospects, Las Palmas de Gran Canaria, Spain, 1 June 2002; 6pp. [PDF, 183KB]

(2002) Martin Rajman & Anthony Hartley: Automatic ranking of MT systems. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; pp.1247-1253. [PDF, 48KB]

(2002) Solange Rossato, Hervé Blanchon, & Laurent Besacier: Speech-to-speech translation system evaluation: results for French for the NESPOLE! project first showcase.  ICSLP 2002, Interspeech 2002:7th International Conference on  Spoken Language Processing, September 16-20, 2002, Denver, Colorado, USA; pp.1905-1908; abstract [PDF, 68KB]

(2002) Véronique Sauron: Tearing out the terms: evaluating terms extractors. Translating and the Computer 24: proceedings from the Aslib conference held on 21-22 November 2002 (London: Aslib, 2002); 18pp. [PDF, 142KB]

(2002) Fumiaki Sugaya, Keiji Yasuda, Toshiyuki Takezawa, & Seiichi Yamamoto: Quality-sensitive test set selection for a speech translation system; ACL-2002 workshop "Speech-to-speech translation", 11 July 2002, Philadelphia, USA; pp. 109-116 [PDF, 297KB]

(2002) Keiji Yasuda, Fumiaki Sugaya, Toshiyuki Takezawa, Seiichi Yamamoto, & Masuzo Yanagida: Automatic machine translation selection scheme to output the best result. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; pp.525-528. [PDF, 113KB]

(2002) Angelika Zerfass: Evaluating translation memory systems. LREC-2002: Third International Conference on Language Resources and Evaluation. Workshop: Language resources for translation work and research, Las Palmas Canary Islands, 27 May 2002; pp.49-52. [PDF, 29KB]

(2002) Machine translation and the Virtual Museum of Canada. [2002]. 188pp. [PDF, 903KB]

 (2001) Olga Bezhanova: Software for processing Spanish: products by Word Magic Software.  International Journal of Translation 13 (1-2), Jan-Dec 2001; pp.159-166. [PDF, 35KB]

(2001) Chris Callison-Burch: Upping the ante for “best of breed” machine translation providers.  Translating and the Computer 23: papers from the Aslib conference held on 29 & 30 November 2001 (London: Aslib, 2001); 9pp. [PDF, 52KB]

(2001) Maki Darwin: Trial and error: an evaluation project on Japanese <> English MT output quality. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp. 77-82. [PDF, 248KB]

(2001) Sungryong Koh, Jinee Maeng, Ji-Young Lee, Young-Sook Chae & Key-Sun Choi: A test suite for evaluation of English-to-Korean machine translation systems. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp.191-195. [PDF, 140KB]

(2001) Derek Lewis: PC-based machine translation: an illustration of capabilities in response to submitted test sentences.  In: Machine Translation Review, issue 12: December 2001; pp.36-57.

(2001) Elisabeth Maier, Anthony Clarke & Hans-Udo Stadler: Evaluation of machine translation systems at CLS Corporate Language Services AG. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp.223-228. [PDF, 187KB]

(2001) Sébastien Sauvé, Philippe Langlais, & Guy Lapalme: User interface aspects of a translation typing system.  Advances in Artificial Intelligence: 14th Biennial Conference of the Canadian Society for Computational Studies of Intelligence, AI 2001, Ottawa, Canada, June 7-9, 2001, Proceedings; pp. 246-256 [PDF, 267KB]

(2001) Benjamin K. Tsou & Oi Yee Kwong: Evaluating Chinese-English translation systems for personal name coverage. MT Summit VIII, Santiago de Compostela, Spain, 18-22 September 2001. Towards a Road Map for MT [PDF, 197KB]

(2000) Arendse Bernth & Michael C. McCord: The effect of source analysis on translation confidence. Envisioning machine translation in the information future: 4th conference of the Association for Machine Translation in the Americas, AMTA 2000, Cuernavaca, Mexico, October 2000; ed. John S. White (Berlin: Springer Verlag, 2000); pp.89-99. [go to publisher details]

(2000) Damir Ćavar, Uwe Küssner, & Dan Tidhar: From human evaluation to automatic selection of good translations. LREC-2000: Second International Conference on Language Resources and Evaluation. Workshop proceedings: Evaluation of machine translation, Athens, Greece, 29 May 2000; pp. 29-33. [PDF, 158KB]

 (2000) Damir Ćavar, Uwe Küssner & Dan Tidhar: From off-line evaluation to on-line selection. In: Wolfgang Wahlster (ed.) Verbmobil: foundations of speech-to-speech translation. (Berlin: Springer, 2000); pp. 597-610. [abstract]

 (2000) Bert Esselink: Translators take to the Web. Language International 12 (6), December 2000; pp.34-35. [PDF, 522KB]

(2000) A.Fourla, O.Yannoutsou, I.Tsakou, S.Stamou, & A.Petrits: The contribution of a user group to the evaluation and improvement of an MT system. Translating and the Computer 22: proceedings of the Twenty-second International Conference… 16-17 November 2000 (London: Aslib, 2000); 13pp.  [PDF, 132KB]

(2000) M.Holland, C. Schlesinger, & C. Tate: Evaluating embedded machine translation in military field exercises. Envisioning machine translation in the information future: 4th conference of the Association for Machine Translation in the Americas, AMTA 2000, Cuernavaca, Mexico, October 2000; ed. John S. White (Berlin: Springer Verlag, 2000); pp.239-247. [go to publisher details]

(2000) Olivier Kraif: Evaluation of statistical tools for automatic extraction of lexical correspondences between parallel texts. MT2000: machine translation and multilingual applications in the new millennium: international conference at the University of Exeter, 20-22 November 2000, organised by the British Computer Society. [London: BCS]; 8pp. [PDF, 1579KB]

(2000) Philippe Langlais, Sébastien Sauvé, George Foster, Elliott Macklovitch, & Guy Lapalme: Evaluation of TransType, a computer-aided translation typing system: a comparison of a theoretical- and a user-oriented evaluation procedures. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 641-648. [PDF, 1302KB]

(2000) Lori Levin, Boris Bartlog, Ariadna Font Llitjos, Donna Gates, Alon Lavie, Dorcas Wallace, Taro Watanabe, & Monika Woszczyna: Lessons learned from a task-based evaluation of speech-to-speech machine translation. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 721-724. [PDF, 460KB]

(2000) Michael Malenke, Marcus Bäumler, & Erwin Paulus: Speech recognition performance assessment. In: Wolfgang Wahlster (ed.) Verbmobil: foundations of speech-to-speech translation. (Berlin: Springer, 2000); pp. 583-591. [abstract]

(2000) Rita Nübel & Jörg Schütz: Evaluation as a language technology deployment trigger. Fifth EAMT Workshop "Harvesting existing resources", May 11 - 12, 2000, Ljubljana, Slovenia; pp.69-75. [PDF, 55KB]

 (2000) Harold Somers & Elizabeth Wild: Evaluating machine translation: the cloze procedure revisited. Translating and the Computer 22: proceedings of the Twenty-second International Conference… 16-17 November 2000 (London: Aslib, 2000); 11pp.  [PDF, 85KB]

(2000) Robert Sprung: Guarding the guards. Language International 12 (1), February 2000; pp.24-25. [PDF, 340KB]

(2000) Jochen Steffens & Erwin Paulus: Speech synthesis quality assessment. In: Wolfgang Wahlster (ed.) Verbmobil: foundations of speech-to-speech translation. (Berlin: Springer, 2000); pp. 592-596.  [abstract]

(2000) Clare R.Voss & Carol Van Ess-Dykema: When is an embedded MT system “good enough” for filtering? ANLP/NAACL 2000 workshop: Embedded machine translation systems, May 4, 2000, Seattle, Washington, [USA]; pp.1-8. [PDF, 649KB]

(2000) Jean Véronis & Philippe Langlais: Evaluation of parallel text alignment systems: the ARCADE project [abstract]. In: Jean Véronis (ed.) Parallel text processing: alignment and use of translation corpora. Dordrecht/Boston/London: Kluwer Academic Publishers, 2000; pp. 369-388.

(2000) Rémi Zajac, Steve Helmreich, & Karine Megerdoomian: Black-box/glass-box evaluation in Shiraz. LREC-2000: Second International Conference on Language Resources and Evaluation. Workshop proceedings: Evaluation of machine translation, Athens, Greece, 29 May 2000; pp. 13-20. [PDF, 693KB]

Examples of MT output

(2004) W. John Hutchins: The Georgetown-IBM experiment demonstrated in January 1954. Machine translation: from real users to research: 6th conference of the Association for Machine Translation in the Americas, AMTA 2004, Washington, DC, September 28 – October 2, 2004; ed. Robert E.Frederking and Kathryn B.Taylor (Berlin: Springer Verlag, 2004); pp. 102-114.

(2002) Thei Zervaki: Online free translation services. Translating and the Computer 24: proceedings from the Aslib conference held on 21-22 November 2002 (London: Aslib, 2002); 10pp. [PDF, 33KB]

(2001) Larissa Beliaeva: Machine translation methods, text structure and translator work. International Journal of Translation 13 (1-2), Jan-Dec 2001; pp.119-146. [PDF, 217KB]

Minimum Error Rate Training [MERT]

(2004) Yuan Ding & Martha Palmer: Automatic learning of parallel dependency treelet pairs. First International Joint Conference on Natural Language Processing, Hainan Island, China, March 22-24, 2004; pp.233-243. [abstract]

(2003) Franz Josef Och: Minimum error rate training in statistical machine translation ACL-2003: 41st Annual meeting of the Association for Computational Linguistics, July 7-12, 2003, Sapporo, Japan. [PDF, 117KB]

Product reviews

(2001) Christophe Declercq: Breaking new grounds? Meeting the challenge of full integration of computer-aided translation and computer translation: the new version of SDLX. International Journal for Language and Documentation 10, August 2001; pp.31-32. [PDF, 1548KB]

 (2001) Christophe Declercq: Managing the management tools: LTC Organiser multilingual management and workflow control software system. International Journal for Language and Documentation 10, August 2001; pp.14-16. [PDF, 2210KB]

 (2001) Christophe Declercq: PASS on an other continent: is the global market ready for Passolo? Is Passolo ready for the global market? International Journal for Language and Documentation 8, December 2000/January 2001; pp.19-21. [PDF, 777KB]

(2001) Christophe Declercq: TRADOS 5: a one-stop shop for translation projects. International Journal for Language and Documentation 10, August 2001; pp.19-23. [PDF, 1876KB]

(2001) Derek Lewis: PC-based machine translation: an illustration of capabilities in response to submitted test sentences.  In: Machine Translation Review, issue 12: December 2001; pp.36-57.

 (2001) Dimitri Stoquart: Alchemy Catalyst version 3.1. International Journal for Language and Documentation 10, August 2001; pp.26-29. [PDF, 3087KB]

(2001) Reverso Pro. International Journal for Language and Documentation 9, May 2001/June 2001; pp.20-21. [PDF, 412KB]

(2000) Ted Assur: Product review: LTC Organiser. Language International 12 (3), June 2000; pp.30-31. [PDF, 418KB]

 (2000) Robert Clark: LTC Organiser review: let’s get organised! International Journal for Language and Documentation 5, June 2000; pp.22-24. [PDF, 793KB]

(2000) Bob Clark: MoBiMouse, the world’s first “no-click” dictionary program. International Journal for Language and Documentation 3, January 2000; pp.26-27. [PDF, 626KB]

 (2000) Christophe Declercq: SDLX 3.1.2: let’s get localised!  International Journal for Language and Documentation 6, August/September 2000; pp.24, 26-27. [PDF, 2042KB]

(2000) Christophe Declercq: TRADOS 3 further up on the road. International Journal for Language and Documentation 7, October/November 2000; pp.19-21. [PDF, 1960KB]

Quality assurance

(2001) Carmen Heine: Quality assurance in the technical documentation and translation process. Translating and the Computer 23: papers from the Aslib conference held on 29 & 30 November 2001 (London: Aslib, 2001); 13pp. [PDF, 74KB]

(2000) Ian Johnson & Maria-José Palos Caravina: Validation and quality control issues in a new web-based, interactive terminology database for the institutions and agencies of the European Union.  Translating and the Computer 22: proceedings of the Twenty-second International Conference… 16-17 November 2000 (London: Aslib, 2000); 9pp.  [PDF, 54KB]

(2000) Gr.Thurmair: TQPro: quality tools for the translation process. Translating and the Computer 22: proceedings of the Twenty-second International Conference… 16-17 November 2000 (London: Aslib, 2000); 7pp.  [PDF, 56KB]

Quality control

(2000) Robert Sprung: Guarding the guards. Language International 12 (1), February 2000; pp.24-25. [PDF, 340KB]

Quality improvement techniques (see also Interactive methods)

(2004) Libin Shen, Anoop Sarkar, & Franz Josef Och: Discriminative reranking for machine translation.  HLT-NAACL 2004: Human Language Technology conference and North American Chapter of the Association for Computational Linguistics annual meeting, May 2-7, 2004, The Park Plaza Hotel, Boston, USA; pp.177-184. [PDF, 106KB]

(2003) Bogdan Babych & Anthony Hartley: Improving machine translation quality with automatic named entity recognition. 7th EAMT Workshop, "Improving machine translation through other language technology tools", 13 April 2003, Budapest, Hungary; pp. 1-8 [PDF, 313KB]

(2003) Martine Smets, Michael Gamon, Jessie Pinkham, Tom Reutter, & Martine Pettenaro: High quality machine translation using a machine-learned sentence realization component. MT Summit IX, New Orleans, USA, 23-27 September 2003; pp.362-369. [PDF, 76KB]

(2003) Rémi Zajac, Elke Lange, & Jin Yang: Customizing complex lexical entries for high-quality MT. MT Summit IX, New Orleans, USA, 23-27 September 2003; pp.433-438. [PDF, 277KB]

(2002) Paisarn Charoenpornsawat, Virach Sornlertlamvanich & Thatsanee Charoenporn: Improving translation quality of rule-based machine translation; Coling-2002 workshop "Machine translation in Asia", 1 September 2002, Taipei, Taiwan; 6pp. [PDF, 355KB]

(2002) Marianne Dabbadie, Widad Mustafa El Hadi, & Ismail Timimi: Terminological enrichment for non-interactive MT evaluation. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; pp.1878-1884. [PDF, 63KB]

(2002) Hideo Watanabe, Katashi Nagao, Michael C. McCord & Arendse Bernth: An annotation system for enhancing quality of natural language processing. Coling 2002, Taipei, Taiwan, 26-30 August 2002 [PDF, 202KB]

(2001) Christian Boitet: Four technical and organizational keys to handle more languages and improve quality (on demand) in MT. MT Summit VIII, Santiago de Compostela, Spain, 18-22 September 2001. Towards a Road Map for MT [PDF, 48KB]

(2001) Daniel J. Walker, David E. Clements, Maki Darwin, & Jan W. Amtrup: Sentence boundary detection: a comparison of paradigms for improving MT quality. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp. 369-372. [PDF, 181KB]

(2001) Elliott Macklovitch & Antonio S. Valderrábanos: Rethinking interaction: the solution for high-quality MT? MT Summit VIII, Santiago de Compostela, Spain, 18-22 September 2001. Towards a Road Map for MT [PDF, 10KB]

(2000) Niamh Bohan, Elisabeth Breidt, & Martin Volk: Evaluating translation quality as input to product development. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 33-37. [PDF, 57KB]

Reading and comprehension

(2002) Li Defeng: Information technology in translator training: reflections on an aborted comprehensibility test of machine-translated texts. In: Chan Sin-wai (ed.) Translation and Information Technology (Hong Kong: Chinese University Press, 2002); pp.165-176.

(2000) Harold Somers & Elizabeth Wild: Evaluating machine translation: the cloze procedure revisited. Translating and the Computer 22: proceedings of the Twenty-second International Conference… 16-17 November 2000 (London: Aslib, 2000); 11pp.  [PDF, 85KB]

Translatability

(2004) Bogdan Babych, Debbie Elliott, & Anthony Hartley: Extending MT evaluation tools with translation complexity metrics.  Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 68KB]

(2004) Sharon O’Brien: Machine translatability and post-editing effort: how do they relate? Translating and the Computer 26: proceedings of the Twenty-sixth International Conference on Translating and the Computer, 18-19 November 2004, London. (London: Aslib, 2004); 31pp. [PDF, 38KB]

(2003) Ursula Reuther: Two in one -- can it work? Readability and translatability by means of controlled language. Controlled language translation, EAMT-CLAW-03, Dublin City University, 15-17 May 2003; pp.124-132. [PDF, 233KB]

(2001) Nancy Underwood & Bart Jongejan: Translatability checker: a tool to help decide whether to use MT. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp. 363-368. [PDF, 177KB]

Translationese

(2003) Pernilla Danielsson: Units of meaning in translation – how to make real use of corpus evidence. Translating and the Computer 25: proceedings of the Twenty-fifth International Conference on Translating and the Computer, 20-21 November 2003, London. (London: Aslib, 2003); 15pp. [PDF, 71KB]

Usability of systems

(2004) Federico Gaspari: Online MT services and real users' needs: an empirical usability evaluation. Machine translation: from real users to research: 6th conference of the Association for Machine Translation in the Americas, AMTA 2004, Washington, DC, September 28 – October 2, 2004; ed. Robert E.Frederking and Kathryn B.Taylor (Berlin: Springer Verlag, 2004); pp. 74-85. [go to publisher details]

(2004) Youngjik Lee, Jun Park, & Seung-Shin Oh: Usability considerations of speech-to-speech translation system. Interspeech 2004 – ICSLP 8th International  Conference on  Spoken Language Processing, Jeju Island, Korea, October 4-8, 2004; pp.369-372; abstract [PDF, 58KB]

(2001) Winfield Scott Bennett: Creating enterprise machine translation systems. International Journal of Translation 13 (1-2), Jan-Dec 2001; pp.209-215. [PDF,  35KB]

(2001) M. Fuji, N. Hatanaka, E. Ito, S. Kamei, H. Kumai, T. Sukehiro, T. Yoshimi & H. Isahara: Evaluation method for determining groups of users who find MT "useful". MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp.103-108. [PDF, 303KB]

(2001) Sébastien Sauvé, Philippe Langlais, & Guy Lapalme: User interface aspects of a translation typing system.  Advances in Artificial Intelligence: 14th Biennial Conference of the Canadian Society for Computational Studies of Intelligence, AI 2001, Ottawa, Canada, June 7-9, 2001, Proceedings; pp. 246-256 [PDF, 267KB]