AI LQA Agents 2026: Automated Translation Quality Estimation

The 98% Problem

Here is the dirty secret of the localization industry: 95-98% of your localization budget is spent verifying content that is already correct.

Modern AI translation (GPT-4o, Claude 3.5) is accurate over 90% of the time. Yet companies still pay human reviewers to read every single word just to catch the remaining 10% or less.

This is the bottleneck. You cannot ship daily if you have a 3-day human review cycle.

The solution is AI-Powered LQA (Language Quality Assurance). In 2026, we don't just use AI to translate; we use AI to check the translation.

What is an LQA Agent?

An LQA Agent is an autonomous system that evaluates translations for accuracy, tone, glossary adherence, and style. Unlike a simple spell-checker, it "understands" the meaning of the text rather than just its surface form.

It acts as a digital editor, providing a second pair of eyes.

How it works (The Workflow)

1. Translation Agent: Generates the raw translation.
2. LQA Agent: Reads the source and target.
3. Evaluation: Scores the translation (0-100) and checks for specific error types (Mistranslation, Omission, Hallucination).
4. Decision:
   - Score >= 90 (Pass): Auto-approve and deploy.
   - Score < 90 (Fail): Auto-fix if the error is obvious (e.g., wrong terminology), or flag for human review if ambiguous.
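
The decision step can be sketched in a few lines of Python. This is a minimal illustration, not IntlPull's implementation: the `LqaResult` type, the `obvious_fix` flag, and the `route` function are hypothetical names, and treating a score of exactly 90 as a pass is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    AUTO_APPROVE = "auto_approve"
    AUTO_FIX = "auto_fix"
    HUMAN_REVIEW = "human_review"


@dataclass
class LqaResult:
    score: int               # 0-100 quality score from the LQA agent
    error_types: list[str]   # e.g. ["Mistranslation", "Omission"]
    obvious_fix: bool        # hypothetical flag: the agent judged the fix unambiguous


PASS_THRESHOLD = 90  # threshold from the workflow; a score of exactly 90 passing is an assumption


def route(result: LqaResult) -> Decision:
    """Map one LQA result to the next step in the pipeline."""
    if result.score >= PASS_THRESHOLD:
        return Decision.AUTO_APPROVE
    if result.obvious_fix:
        # Failing score, but the error is clear-cut (e.g. wrong terminology).
        return Decision.AUTO_FIX
    # Failing score and the agent cannot tell what the right fix is.
    return Decision.HUMAN_REVIEW
```

Keeping the routing as a pure function makes the threshold easy to unit-test and to tune per project.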

MTQE vs. AI LQA: Knowing the Difference

There are two primary technologies for automated quality:

1. MTQE (Machine Translation Quality Estimation)

The "Probability" Approach. This is a statistical score (usually from the translation model itself) indicating how confident the model is.

- Pros: Fast, cheap.
- Cons: Often "hallucinates" confidence. A model can be 99% confident in a statement that is completely wrong.
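
To make the "Probability" idea concrete, here is one common way a confidence score can be derived from a model's own output: the geometric mean of its per-token probabilities. This is an illustrative assumption; the source does not say how any particular MTQE system computes its score, and `sequence_confidence` is a hypothetical helper.

```python
import math


def sequence_confidence(token_logprobs: list[float]) -> float:
    """Geometric mean of token probabilities, a value in (0, 1].

    token_logprobs holds the natural-log probability the model assigned
    to each token it generated (assumed to be available from the model).
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    # Averaging log-probabilities and exponentiating gives the geometric mean.
    return math.exp(sum(token_logprobs) / len(token_logprobs))
```

Note what this number measures: how fluent the output looked to the model, not whether it matches the source. A smooth but wrong sentence can still score close to 1, which is the overconfidence problem described above.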

2. AI LQA (Agentic Review)

The "Critique" Approach. This uses a separate, often smarter, LLM to act as a critic. You prompt the model: "You are a professional editor. Find errors in this translation based on the MQM framework."

- Pros: Explains why something is wrong. Can reference external glossaries. High accuracy.
- Cons: More computationally expensive (requires more tokens).
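
A minimal sketch of the "Critique" approach, assuming the critic model is asked to reply in JSON. The prompt wording beyond the first sentence, the JSON shape, and the `build_prompt` and `parse_review` helpers are all hypothetical; the call to the LLM itself is deliberately left out because it depends on the provider.

```python
import json

PROMPT_TEMPLATE = (
    "You are a professional editor. Find errors in this translation "
    "based on the MQM framework.\n\n"
    "Source ({source_lang}): {source}\n"
    "Target ({target_lang}): {target}\n"
    "Glossary: {glossary}\n\n"
    "Reply with JSON only, shaped like "
    '{{"score": 0-100, "errors": [{{"category": "...", "explanation": "..."}}]}}'
)


def build_prompt(source, target, source_lang, target_lang, glossary):
    """Fill the critic prompt; glossary maps source terms to required target terms."""
    terms = "; ".join(f"{src} -> {tgt}" for src, tgt in glossary.items()) or "(none)"
    return PROMPT_TEMPLATE.format(
        source=source,
        target=target,
        source_lang=source_lang,
        target_lang=target_lang,
        glossary=terms,
    )


def parse_review(raw_reply: str) -> dict:
    """Validate the critic's JSON reply before anything downstream trusts it."""
    review = json.loads(raw_reply)  # raises ValueError on malformed JSON
    if not isinstance(review, dict):
        raise ValueError("critic reply must be a JSON object")
    score = review.get("score")
    if not isinstance(score, (int, float)) or not 0 <= score <= 100:
        raise ValueError(f"score out of range or missing: {score!r}")
    review.setdefault("errors", [])
    return review
```

Validating the reply matters because the critic is itself an LLM: a malformed or out-of-range answer should fall back to human review rather than be read as a pass.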

IntlPull uses the Agentic Review approach. We believe that for enterprise software, you need explainable quality, not just an opaque probability score.

The MQM Framework + LLMs

The MQM (Multidimensional Quality Metrics) framework is the industry standard for categorizing translation errors.

In 2026, we have successfully taught LLMs to use MQM. When an IntlPull LQA Agent flags a string, it categorizes it:

- Accuracy: Mistranslation, Untranslated text.
- Fluency: Grammar, Spelling, Inconsistency.
- Terminology: Non-adherence to glossary.
- Style: Tone mismatch (e.g., too formal).

This structured data allows you to track "Quality Over Time" dashboards without manual data entry.
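
As a rough sketch of how categorized findings could feed such a dashboard, the snippet below validates each finding against the categories listed above and counts errors per category per run. The `Finding` record, its fields, and `counts_over_time` are hypothetical names, not IntlPull's data model.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass

# Categories and subtypes as listed in this article.
MQM_SUBTYPES = {
    "Accuracy": {"Mistranslation", "Untranslated text"},
    "Fluency": {"Grammar", "Spelling", "Inconsistency"},
    "Terminology": {"Non-adherence to glossary"},
    "Style": {"Tone mismatch"},
}


@dataclass(frozen=True)
class Finding:
    run_date: str     # e.g. "2026-01-15", the date of the LQA pass
    string_key: str   # identifier of the flagged string
    category: str
    subtype: str

    def __post_init__(self):
        # Reject anything outside the known taxonomy so the data stays clean.
        if self.subtype not in MQM_SUBTYPES.get(self.category, set()):
            raise ValueError(f"unknown MQM label: {self.category}/{self.subtype}")


def counts_over_time(findings):
    """Return {run_date: Counter({category: error_count})}."""
    trend = defaultdict(Counter)
    for finding in findings:
        trend[finding.run_date][finding.category] += 1
    return dict(trend)
```

Because every finding carries a fixed category, the per-run counts can be plotted directly without anyone tagging errors by hand.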

Building a Hybrid Human + AI Workflow

You shouldn't fire all your human reviewers. You should promote them.

The "Exception-Based" Workflow

Instead of reading 10,000 words, your human linguist only reads the 500 words the LQA Agent flagged as "Low Confidence" or "Ambiguous."

Impact:

- Throughput: Increases by 10-20x.
- Cost: Decreases by ~60%.
- Job Satisfaction: Translators focus on tricky creative problems, not fixing typos.
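
The arithmetic behind the throughput figure is simple: 500 flagged words out of 10,000 is 5% of the content, so the linguist reads 20x fewer words, which matches the upper end of the range above. A minimal sketch of building that exception queue follows; the segment shape and the status labels are hypothetical.

```python
# Hypothetical status labels for segments that need a human.
NEEDS_HUMAN = {"low_confidence", "ambiguous"}


def build_review_queue(segments):
    """Return (flagged_segments, share_of_words_needing_review).

    Each segment is a dict with a 'text' and a 'status' key (assumed shape).
    Words are counted by splitting on whitespace, which is a rough measure
    and does not suit languages written without spaces.
    """
    flagged = [s for s in segments if s["status"] in NEEDS_HUMAN]
    total_words = sum(len(s["text"].split()) for s in segments)
    flagged_words = sum(len(s["text"].split()) for s in flagged)
    share = flagged_words / total_words if total_words else 0.0
    return flagged, share
```

Tracking the share over time also shows whether the agent is flagging too much (humans drown again) or too little (errors slip through).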

IntlPull Implementation

IntlPull is the first platform to treat LQA as a first-class citizen.

```yaml
# .intlpull.yml
lqa:
  enabled: true
  provider: claude-3-5-sonnet
  thresholds:
    auto_approve: 95
    flag_for_human: 80
  glossary_check: strict
```

When enabled, every translation pass automatically triggers an LQA pass. You see the LQA score right in the dashboard. High-scoring strings turn green instantly; low-scoring ones enter the "Review Queue."

Trusting the Machine

The biggest hurdle isn't technology; it's trust. "Can I really let AI deploy content to my production app without a human seeing it?"

In 2026, the answer is Yes, if you have the right guardrails.

- Start with low-risk languages.
- Manually audit 10% of "Auto-Approved" strings to verify agent performance.
- Use LQA Agents to catch what humans miss (consistency errors across 50 files).
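
The 10% audit guardrail can be sketched as a random sample over auto-approved strings. This is an illustration under assumptions: `sample_for_audit` is a hypothetical helper, and rounding up so that at least one string is always audited is a design choice, not something the workflow above prescribes.

```python
import math
import random


def sample_for_audit(auto_approved_keys, rate=0.10, seed=None):
    """Pick a random subset of auto-approved string keys for manual audit."""
    if not 0 < rate <= 1:
        raise ValueError("rate must be in (0, 1]")
    keys = list(auto_approved_keys)
    if not keys:
        return []
    # Round up so small batches still get at least one audited string.
    sample_size = max(1, math.ceil(len(keys) * rate))
    rng = random.Random(seed)  # pass a seed to make an audit reproducible
    return rng.sample(keys, sample_size)
```

Sampling at random, rather than auditing the first strings in a file, keeps the audit from systematically missing the same screens or languages.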

The Future is Verified

LQA Agents turn translation from a black box into a measurable, verifiable engineering process. That is the key to shipping global software at the speed of CI/CD.

Read our Technical Docs on LQA Configuration

Trusting the Machine: How LQA Agents Are Replacing Manual Review in 2026