Translating Literature with Large Language Models: Integrating External Knowledge and Maintaining Consistency in Named Entities

Presenter
Jiarui Liu
Campus
UMass Amherst
Sponsor
Mohit Iyyer, Department of Computer Science, UMass Amherst
Schedule
Session 2, 11:30 AM - 12:15 PM
Location
Poster Board A34, Campus Center Auditorium, Row 2 (A21-A40)
Abstract
As the world grows increasingly interconnected, demand is rising for long-form documents that are accessible across languages. This research examines the capabilities of Large Language Models (LLMs) in translating literary works, with a focus on improving the consistency of named entity translations and integrating external knowledge sources.

Central to literary translation is the accurate and consistent rendering of named entities such as main character names and significant proper nouns. Recent language models, often restricted by their context window or budget, struggle to maintain this consistency when translating long-form text. This study proposes a novel approach that combines the capabilities of LLMs with traditional human translation practices, such as termbases and translation memory, and with external knowledge bases such as Wikipedia and linguistic corpora.

The methodology involves developing an auxiliary retrieval system and integrating translation models with retrieval-augmented generation (RAG) and in-context learning (ICL) approaches. The study aims to ensure that named entities are translated coherently and to improve the accuracy of these translations in context.
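
As a rough illustration of how a termbase lookup could feed an in-context prompt, the minimal sketch below retrieves the termbase entries that occur in a source passage and places them in the translation instructions. It is not the study's implementation: the names TermEntry, retrieve_terms, and build_prompt, the exact-substring matching, the prompt wording, and the Spanish-to-English example are all assumptions made for illustration, and the call to an actual translation model is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermEntry:
    """One hypothetical termbase row: a source-language name and its agreed translation."""
    source: str
    target: str


def retrieve_terms(chunk: str, termbase: list[TermEntry]) -> list[TermEntry]:
    """Return the termbase entries whose source form occurs in the chunk.

    Assumption: plain substring matching stands in for a real retrieval system.
    """
    return [entry for entry in termbase if entry.source in chunk]


def build_prompt(chunk: str, termbase: list[TermEntry],
                 source_lang: str, target_lang: str) -> str:
    """Assemble a translation prompt that lists the retrieved entity translations."""
    terms = retrieve_terms(chunk, termbase)
    lines = [f"Translate the following passage from {source_lang} to {target_lang}."]
    if terms:
        lines.append("Use these translations for named entities:")
        lines.extend(f"- {t.source} -> {t.target}" for t in terms)
    lines.extend(["", "Passage:", chunk])
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative termbase; the entries are made up for this example.
    termbase = [
        TermEntry("Caperucita Roja", "Little Red Riding Hood"),
        TermEntry("el Lobo Feroz", "the Big Bad Wolf"),
    ]
    print(build_prompt("Caperucita Roja camina por el bosque.",
                       termbase, "Spanish", "English"))
```

Because every chunk of a long text is prompted with the same termbase entry for a given name, the model is steered toward one translation per entity even when earlier chunks no longer fit in its context window.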

Preliminary results show notable improvements in both the consistency and accuracy of named entity translations. This study helps make literary works more accessible across languages, fostering a more inclusive global cultural exchange. It also encourages further exploration of how Large Language Models can be used and improved for complex tasks, bridging the gap between artificial intelligence and the nuanced art of human translation.
Keywords
Natural language processing, Retrieval-Augmented Generation, Machine translation, Named Entity Translation
Research Area
Computer Science
