The JHU Submission to the 2020 Duolingo Shared Task on Simultaneous Translation and Paraphrase for Language Education


This paper presents the Johns Hopkins University submission to the 2020 Duolingo Shared Task on Simultaneous Translation and Paraphrase for Language Education (STAPLE). We participated in all five language tasks, placing first in each. Our approach involved a language-agnostic pipeline of three components: (1) building strong machine translation systems on general-domain data, (2) fine-tuning on Duolingo-provided data, and (3) generating n-best lists that are then filtered with various score-based techniques. In addition to the language-agnostic pipeline, we attempted a number of linguistically motivated approaches, unfortunately with little success. We also find that improving the BLEU score of the beam-search-generated translation does not necessarily improve the task metric, which is the weighted macro F1 of an n-best list.
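As a rough illustration of step (3), the sketch below filters an n-best list by keeping every hypothesis whose model score lies within a fixed margin of the best one. The function name `filter_nbest`, the `max_gap` parameter, and the use of log-probability scores are assumptions made for this example; the abstract only states that "various score-based techniques" were used and does not specify this rule.

```python
from typing import List, Tuple


def filter_nbest(nbest: List[Tuple[str, float]], max_gap: float) -> List[str]:
    """Keep hypotheses whose score is within `max_gap` of the best score.

    `nbest` holds (hypothesis, score) pairs, where a higher score is better
    (for example, a log-probability). This thresholding rule is a hypothetical
    stand-in for the score-based filtering mentioned in the abstract.
    """
    if not nbest:
        return []
    best = max(score for _, score in nbest)
    # Preserve the original order of the surviving hypotheses.
    return [hyp for hyp, score in nbest if best - score <= max_gap]


if __name__ == "__main__":
    candidates = [
        ("I am going home", -0.5),
        ("I go home", -1.0),
        ("Home I going am", -3.0),
    ]
    print(filter_nbest(candidates, max_gap=1.0))
```

A wider `max_gap` returns more paraphrases and a narrower one returns fewer, so in practice it would be tuned on held-out data against the task metric rather than against BLEU.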

Proceedings of the Fourth Workshop on Neural Generation and Translation, Association for Computational Linguistics
Jacob Bremerman
M.S. Student

My research interests include natural language processing, artificial intelligence and linguistics.