Researchers Raise Alarms on Vulnerabilities of Machine Translation Systems
Researchers highlight the vulnerability of multilingual public datasets, and of the MT systems based on them, to malicious attacks, deliberately corrupting corpora to put this vulnerability to the test.
News and analysis of the latest developments in machine translation, computer-aided translation, natural language processing, and other language-related areas in artificial intelligence.
China-based researchers, including some from tech giant Huawei, publish a comprehensive overview of machine translation quality estimation.
A group of researchers propose a two-stage fine-tuning method to address off-target translations and boost LLM performance in translation.
The world’s leading machine translation tool no longer accepts human input through its Contribute feature, citing significantly evolved systems.
Researchers harness web crawls from the Internet Archive and CommonCrawl to release new language resources aimed at supporting language modeling and machine translation training.
Researchers from academia and Unbabel introduce CONTEXT-MQM, an LLM-based metric tailored for machine-translated chats. The model uses context to improve the evaluation process.
Researchers from the University of Helsinki, Silo AI, and NVIDIA introduce MAMMOTH, a toolkit designed to simplify the training of massively multilingual modular machine translation systems at scale.
Researchers from Google suggest that embracing direct inference over pre-translation enhances large language model performance and facilitates seamless multilingual communication.
Alibaba researchers set out to test a model training and fine-tuning methodology that addresses the special characteristics of e-commerce machine translation, like keyword text.
Researchers highlight the significance of accuracy and fluency in translation. Separating the two may enhance machine translation evaluation metrics and optimize training and performance.
Singapore’s NTU and NVIDIA researchers introduce GenTranslate, a model using the capabilities of large language models to serve as a generative translator for speech and text.
Johns Hopkins University and Microsoft researchers underscore that gold-standard translations are not always “gold” and propose a method to enhance the performance of LLMs in machine translation.
The latest iteration of Google’s buzzy large language model differentiates itself with its long-context ability for quality assurance, automatic speech recognition, and translation.
Academia and industry team up to launch CroissantLLM, an open-source French-English LLM that shows strong translation performance and runs well on consumer-grade local hardware.
Researchers from Pennsylvania State University introduce MT-Ranker, a machine translation evaluation system, which reportedly shows state-of-the-art correlation with human judgments.
New research explores just how well large language models perform in “classic” machine translation challenges.
While some literary translators warn colleagues of companies hiring post-editors for machine-translated texts, some publishers take pride in what they see as innovative workflows.
Researchers from Monash University and Google demonstrate the effectiveness of fine-tuned large language models in document-level machine translation.
Amazon researchers flag the “garbage in, garbage out” problem when using low-quality, web-scraped, machine-translated data to train multilingual large language models.
Researchers from Google demonstrate that machine translation is more “conservative” than human translation, with less morphosyntactic diversity, more convergent patterns, and more one-to-one alignments.