AMTA 2014

Proceedings of the Eleventh Conference of the

Association for Machine Translation in the Americas

Vancouver, BC

October 22-26, 2014

 

[amta2014.amtaweb.org]

 

Table of Contents

 

Vol.1: MT Researchers Track

Yaser Al-Onaizan, Michel Simard, editors

 

Expressive hierarchical rule extraction for left-to-right translation

Maryam Siahbani, Anoop Sarkar…………1-14

 

Bayesian iterative-cascade framework for hierarchical phrase-based translation

Baskaran Sankaran, Anoop Sarkar…………15-27

 

Coarse “split and lump” bilingual language models for richer source information in SMT

Darlene Stewart, Roland Kuhn, Eric Joanis, George Foster……………28-41

 

Using any machine translation source for fuzzy-match repair in a computer-aided translation setting

John E. Ortega, Felipe Sánchez-Martínez, Mikel L. Forcada…………..42-53

 

Enhancing statistical machine translation with bilingual terminology in a CAT environment

Mihael Arcan, Marco Turchi, Sara Tonelli, Paul Buitelaar…………..54-68

 

Clean data for training statistical MT: the case of MT contamination

Michel Simard…………..69-82

 

Bilingual phrase-to-phrase alignment for arbitrarily-small datasets

Kevin Flanagan……………83-95

 

A probabilistic feature-based fill-up for SMT

Jian Zhang, Liangyou Li, Andy Way, Qun Liu…………..96-109

 

Document-level re-ranking with soft lexical and semantic features for statistical machine translation

Chenchen Ding, Masao Utiyama, Eiichiro Sumita……………110-123

 

A comparison of mixture and vector space techniques for translation model adaptation

Boxing Chen, Roland Kuhn, George Foster…………….124-138

 

Combining domain and topic adaptation for SMT

Eva Hasler, Barry Haddow, Philipp Koehn………….139-151

 

Online multi-user adaptive statistical machine translation

Prashant Mathur, Mauro Cettolo, Marcello Federico, José G.C. de Souza…………..152-165

 

The repetition rate of text as a predictor of the effectiveness of machine translation adaptation

Mauro Cettolo, Nicola Bertoldi, Marcello Federico…………….166-179

 

Expanding machine translation training data with an out-of-domain corpus using language modeling based vocabulary saturation

Burak Aydın, Arzucan Özgür…………..180-192

 

Comparison of data selection techniques for the translation of video lectures

Joern Wuebker, Hermann Ney, Adrià Martínez-Villaronga, Adrià Giménez, Alfons Juan, Christophe Servan, Marc Dymetman, Shachar Mirkin…………….193-207

 

Review and analysis of China workshop on machine translation 2013 evaluation

Sitong Yang, Heng Yu, Hongmei Zhao, Qun Liu, Yajuan ……………208-221

 

Combining techniques from different NN-based language models for machine translation

Jan Niehues, Alexandre Allauzen, François Yvon, Alex Waibel………….222-233

 

Japanese-to-English patent translation system based on domain-adapted word segmentation and post-ordering

Katsuhito Sudoh, Masaaki Nagata, Shinsuke Mori, Tatsuya Kawahara……………234-248

 

A discriminative framework of integrating translation memory features into SMT

Liangyou Li, Andy Way, Qun Liu…………..249-260

 

Assessing the impact of speech recognition errors on machine translation quality

Nicholas Ruiz, Marcello Federico……………261-274

 

Using noun class information to model selectional preferences for translating prepositions in SMT

Marion Weller, Sabine Schulte im Walde, Alexander Fraser………….275-287

 

Predicting human translation quality

Lucia Specia, Kashif Shah…………….288-300

 

Data selection for compact adapted SMT models

Shachar Mirkin, Laurent Besacier……………301-314

 

Pivot-based triangulation for low-resource languages

Rohit Dholakia, Anoop Sarkar……………315-328

 

An Arabizi-English social media statistical machine translation system

Jonathan May, Yassine Benjira, Abdessamad Echihabi…………..329-341

 

Automatic dialect classification for statistical machine translation

Saab Mansour, Yaser Al-Onaizan, Graeme Blackwood, Christoph Tillmann……………342-355

 

A tunable language model for statistical machine translation

Junfei Guo, Juan Liu, Qi Han, Andreas Maletti…………….356-368

 

Vol.2: MT Users Track

Olga Beregovaya, Mike Dillinger, Jennifer Doyon, Raymond Flournoy, Patti O’Neill-Brown & Chuck Simmons, editors

 

Commercial MT users

 

Linguistic QA for MT of user-generated content at eBay

Jose Sanchez, Tanya Badeka (eBay Inc.)…………..1-24

 

Reducing time and tedium with translation technology: the six-pound challenge

Scott Gaskill (Sovee)…………..25-30

 

Machine translation for global e-commerce on eBay

Jyoti Guha, Carmen Heger (eBay Inc.)…………..31-37

 

When to choose SMT: typology of documents

François Lanctôt (SilexCreations Inc.)……………38-49

 

Machine translation and post-editing for user generated content: an LSP perspective

Elaine O’Curran (Welocalize)…………..50-54

 

Challenges of machine translation for user generated content: queries from Brazilian users

Silvio Picinini…………..55-65

 

Real-world challenges in application of MT for localization: the Baltic case

Mārcis Pinnis, Raivis Skadiņš, Andrejs Vasiļjevs……………66-79

 

Machine translation is not one size fits all

Lori Thicke (LexWorks)………….80-104

 

From the lab to the market: commercialising MT research

John Tinsley (Iconic Translation Machines)…………..105-130

 

Tools-driven content curation and engine tuning

Alex Yanishevsky (Welocalize)……………..131-151

 

Term translation central: up-to-date MT without frequent retraining

Ventsislav Zhechev (Autodesk)……………152-159

 

Government MT users

 

Translation technology in action: a US government use case

Vanesa Jurica…………….160-180

 

Machine translation for e-government – the Baltic case

Andrejs Vasiļjevs, Rihards Kalniņš, Mārcis Pinnis, Raivis Skadiņš…………..181-193

 

Panel: Inserting CAT tools into a government LSP environment

Tanya Helmen, Vanesa Jurica, Danielle Silverman, Elizabeth Richerson (NVTC)………….194-202

 

A novel use of MT in the development of a text level analytic for language learning

Carol Van Ess-Dykema, Salim Roukos, Amy Weinberg…………..203-212

 

Technology showcase guide

Jennifer DeCamp……………27pp

 

Tutorials

 

Handling entities in MT/CAT/HLT

Keith Miller, Linda Moreau, Sherri Condon…………..88 slides

 

Interaction design for MT interfaces

Patricia O’Neill-Brown……………38 slides

 

MateCat: an open source CAT tool for MT post-editing

Marcello Federico, Nicola Bertoldi, Marco Trombetti, Alessandro Cattelan……………98 slides

 

Working with MateCat: user manual and installation guide…………..75 slides

 

Statistical machine translation with the Moses toolkit

Hieu Hoang, Matthias Huck, Philipp Koehn……………146 slides

 

Workshops

 

Workshop on interactive and adaptive machine translation

ed. Francisco Casacuberta, Marcello Federico, Philipp Koehn

 

Integrating online and active learning in a computer-assisted translation workbench

Vicent Alabau, Jesús González-Rubio, Daniel Ortiz-Martínez, Germán Sanchis-Trilles, Francisco Casacuberta, Mercedes García-Martínez, Bartolomé Mesa-Lao, Dan Cheung Petersen, Barbara Dragsted, Michael Carl…………1-8

 

Towards a combination of online and multitask learning for MT quality estimation: a preliminary study

José G.C. de Souza, Marco Turchi, Matteo Negri……………9-19

 

Dynamic phrase tables for machine translation in an interactive post-editing scenario

Ulrich Germann…………..20-31

 

Optimized MT online learning in computer assisted translation

Prashant Mathur, Mauro Cettolo……………32-41

 

Behind the scenes in an interactive speech translation system

Mark Seligman, Mike Dillinger……………42-50

 

Predicting post-editor profiles from the translation process

Karan Singla, David Orrego-Carmona, Ashleigh Rhea Gonzales, Michael Carl, Srinivas Bangalore……51-60

 

Third workshop on post-editing technology and practice (WPTP-3)

ed. Sharon O’Brien, Michel Simard, Lucia Specia

 

MT post-editing into the mother tongue or into a foreign language? Spanish-to-English MT translation output post-edited by translation trainees

Pilar Sánchez-Gijón, Olga Torres-Hostench……………5-19

 

Comparison of post-editing productivity between professional translators and lay users

Nora Aranberri, Gorka Labaka, Arantza Diaz de Ilarraza, Kepa Sarasola…………….20-33

 

Monolingual post-editing by a domain expert is highly effective for translation triage

Lane Schwartz…………….34-44

 

Perceived vs. measured performance in the post-editing of suggestions from machine translation and translation memories

Carlos S.C. Teixeira…………..45-59

 

Perception vs. reality: measuring machine translation post-editing productivity

Federico Gaspari, Antonio Toral, Sudip Kumar Naskar, Declan Groves, Andy Way…………..60-72

 

Cognitive demand and cognitive effort in post-editing

Isabel Lacruz, Michael Denkowski, Alon Lavie……………73-84

 

Vocabulary accuracy of statistical machine translation in the legal context

Jeffrey Killman……………85-98

 

Towards desktop-based CAT tool instrumentation

John Moran, Christian Saam, Dave Lewis…………..99-112

 

Translation quality in post-edited versus human-translated segments: a case study

Elaine O’Curran………….113-118

 

TAUS post-editing course

Attila Görög……………119

 

TAUS post-editing productivity tool

Attila Görög…………..120

 

QuEst: A framework for translation quality estimation

Lucia Specia, Kashif Shah…………..121

 

An open source desktop post-editing tool

Lane Schwartz…………..122

 

Real time adaptive machine translation: cdec and TransCenter

Michael Denkowski, Alon Lavie, Isabel Lacruz, Chris Dyer…………123

 

Post-editing user interface using visualization of a sentence structure

Yudai Kishimoto, Toshiaki Nakazawa, Daisuke Kawahara, Sadao Kurohashi…………..124

 

Kanjingo: a mobile app for post-editing

Sharon O’Brien, Joss Moorkens, Joris Vreeke……………125-127