On the Evaluation of Machine Translation n-best Lists

The standard machine translation evaluation framework measures the single-best output of machine translation systems. There are, however, many situations where n-best lists are needed, yet there is no established way of evaluating them. This paper …

Two Test Collections for Retrieval Using Named Entity Markup

Studying the effects of semantic analysis on retrieval effectiveness can be difficult using standard test collections because both queries and documents typically lack semantic markup. This paper describes extensions to two test collections, CLEF …

The JHU Submission to the 2020 Duolingo Shared Task on Simultaneous Translation and Paraphrase for Language Education

This paper presents the Johns Hopkins University submission to the 2020 Duolingo Shared Task on Simultaneous Translation and Paraphrase for Language Education (STAPLE). We participated in all five language tasks, placing first in each. Our approach …