Published in Empirical Methods in Natural Language Processing (EMNLP) (Short), 2024
We quantify the performance gaps between training on captions that reflect native German perception and training on captions that were machine-translated or human-translated from English into German. To address these gaps, we propose and evaluate caption augmentation strategies.