Understanding Translation File Formats
Translation file formats are the foundation of any internationalization workflow, serving as the bridge between developers, translators, and the final localized application. Choosing the right format impacts everything from developer experience to translation quality, tool compatibility, and deployment workflows. While JSON has become ubiquitous in modern web development, formats like XLIFF (XML Localization Interchange File Format) dominate professional translation environments, PO (Portable Object) files remain the standard for open-source projects, YAML offers human-friendly syntax for configuration-heavy projects, and ARB (Application Resource Bundle) provides Flutter's native solution. Each format has distinct strengths: JSON excels in simplicity and JavaScript integration, XLIFF provides rich metadata for professional translators, PO offers robust pluralization and context handling, YAML balances readability with structure, and ARB combines JSON's simplicity with Flutter-specific features. Understanding these differences helps teams select the optimal format for their tech stack, workflow, and scalability requirements, while avoiding costly migrations later.
JSON Format Deep Dive
JSON (JavaScript Object Notation) has become the de facto standard for web application translations due to its native JavaScript support and simplicity. The basic structure uses key-value pairs:
```json
{
  "welcome": "Welcome to our app",
  "auth.login": "Log in",
  "auth.signup": "Sign up",
  "items.count": "{count, plural, one {# item} other {# items}}"
}
```
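The `items.count` entry above uses ICU MessageFormat plural syntax, which plain JSON does not interpret on its own. As a rough illustration of what a formatting library does with such a string, here is a minimal sketch built on the standard `Intl.PluralRules` API. The function name `formatPlural` is hypothetical, and the sketch assumes a single plural argument with non-nested branches; a real project would rely on a full ICU parser.

```typescript
// Minimal sketch: handles ONE plural argument with simple, non-nested branches.
// formatPlural is an illustrative helper, not part of any library.
function formatPlural(message: string, count: number, locale = 'en'): string {
  // Match "{arg, plural, <branches>}" and capture the branch list.
  const outer = /^\{\s*\w+\s*,\s*plural\s*,([\s\S]*)\}$/.exec(message.trim());
  if (!outer) return message; // not a plural pattern: return unchanged

  // Collect branches such as "one {# item}" into { one: "# item" }.
  const branches: Record<string, string> = {};
  const branchPattern = /(=\d+|\w+)\s*\{([^{}]*)\}/g;
  let match: RegExpExecArray | null;
  while ((match = branchPattern.exec(outer[1])) !== null) {
    branches[match[1]] = match[2];
  }

  // Exact matches (=0, =1) win, then the locale's plural category, then "other".
  const category = new Intl.PluralRules(locale).select(count);
  const template = branches[`=${count}`] ?? branches[category] ?? branches.other ?? '';

  // "#" stands for the formatted number inside a plural branch.
  return template.replace(/#/g, new Intl.NumberFormat(locale).format(count));
}

const itemsCount = '{count, plural, one {# item} other {# items}}';
console.log(formatPlural(itemsCount, 1)); // 1 item
console.log(formatPlural(itemsCount, 5)); // 5 items
```

Strings without a plural pattern pass through unchanged, so the same helper can be applied to every value in the file.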
Structured JSON Approach
Many teams prefer nested structures for better organization:
```json
{
  "auth": {
    "login": "Log in",
    "signup": "Sign up",
    "forgot_password": "Forgot password?"
  },
  "dashboard": {
    "welcome": "Welcome back, {name}",
    "stats": {
      "users": "Total users",
      "revenue": "Revenue"
    }
  }
}
```
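Nested files are usually addressed with dot-separated keys such as `dashboard.stats.users`. The sketch below shows one way that lookup and the `{name}` substitution could work. The helper names `lookup` and `interpolate` are hypothetical, and the placeholder handling assumes simple `{word}` tokens only.

```typescript
type Messages = { [key: string]: string | Messages };

// Walk a nested message object following a dot-separated key.
function lookup(messages: Messages, path: string): string | undefined {
  let node: string | Messages | undefined = messages;
  for (const part of path.split('.')) {
    if (typeof node !== 'object') return undefined; // hit a leaf or a missing key too early
    node = node[part];
  }
  return typeof node === 'string' ? node : undefined;
}

// Replace simple "{name}" placeholders; unknown placeholders are left untouched.
function interpolate(template: string, params: Record<string, string | number>): string {
  return template.replace(/\{(\w+)\}/g, (whole: string, key: string) =>
    key in params ? String(params[key]) : whole
  );
}

const appMessages: Messages = {
  auth: { login: 'Log in' },
  dashboard: { welcome: 'Welcome back, {name}', stats: { users: 'Total users' } }
};

console.log(lookup(appMessages, 'dashboard.stats.users')); // Total users
console.log(interpolate(lookup(appMessages, 'dashboard.welcome') ?? '', { name: 'Ada' })); // Welcome back, Ada
```

Returning `undefined` for a missing key, rather than throwing, lets the caller decide whether to fall back to the source language or to the key itself.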
JSON with Metadata
Some frameworks support metadata through extended formats:
```json
{
  "welcome": {
    "message": "Welcome to our app",
    "description": "Greeting shown on homepage",
    "maxLength": 50
  },
  "auth.login": {
    "message": "Log in",
    "description": "Main login button text",
    "screenshot": "https://example.com/screenshots/login.png"
  }
}
```
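Metadata such as `maxLength` only helps if something checks it. As a hedged sketch, a build step could compare each translation against the limit declared on its source entry. The `MessageEntry` shape mirrors the example above; the function name `findTooLong` and the German sample strings are assumptions made for illustration.

```typescript
interface MessageEntry {
  message: string;
  description?: string;
  maxLength?: number;
  screenshot?: string;
}

// Return the keys whose translated text exceeds the limit declared on the source entry.
function findTooLong(
  source: Record<string, MessageEntry>,
  translated: Record<string, string>
): string[] {
  return Object.entries(source)
    .filter(([key, entry]) => {
      const text = translated[key];
      // Array.from counts Unicode code points rather than UTF-16 units.
      return (
        entry.maxLength !== undefined &&
        text !== undefined &&
        Array.from(text).length > entry.maxLength
      );
    })
    .map(([key]) => key);
}

const sourceEntries: Record<string, MessageEntry> = {
  welcome: { message: 'Welcome to our app', description: 'Greeting shown on homepage', maxLength: 50 },
  'auth.login': { message: 'Log in', description: 'Main login button text' }
};

// Hypothetical German translations; the first one is 58 characters long.
const germanStrings: Record<string, string> = {
  welcome: 'Herzlich willkommen in unserer wunderbaren neuen Anwendung',
  'auth.login': 'Anmelden'
};

console.log(findTooLong(sourceEntries, germanStrings)); // [ 'welcome' ]
```

Entries without a `maxLength` are never reported, so the check can be introduced gradually.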
Pros and Cons of JSON
Advantages:
- Native JavaScript support (JSON.parse is built in, so no parsing library is needed)
- Simple, readable syntax familiar to all web developers
- Excellent tooling support (linters, validators, IDE features)
- Easy version control with Git (good diff visibility)
- Direct import in Node.js and modern build tools
- Lightweight file size
Disadvantages:
- No standardized metadata support (comments, context, max length)
- No built-in pluralization or gender handling (requires ICU MessageFormat)
- Flat key-value structure can become unwieldy in large projects
- No explicit source/target language markers
- Limited support in professional CAT (Computer-Assisted Translation) tools
- No translation state tracking (approved, needs review, etc.)
XLIFF Format Deep Dive
XLIFF (XML Localization Interchange File Format) is the industry standard for professional translation workflows and is supported by most major CAT tools, including SDL Trados, memoQ, and Phrase.
XLIFF 1.2 Structure
```xml
<?xml version="1.0" encoding="UTF-8"?>
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file source-language="en" target-language="es" datatype="plaintext" original="messages.json">
    <body>
      <trans-unit id="welcome" resname="welcome">
        <source>Welcome to our app</source>
        <target>Bienvenido a nuestra aplicación</target>
        <note>Greeting shown on homepage</note>
      </trans-unit>
      <trans-unit id="auth.login" resname="auth.login">
        <source>Log in</source>
        <target state="needs-translation"/>
        <note>Main login button text</note>
        <context-group purpose="location">
          <context context-type="sourcefile">src/components/LoginButton.tsx</context>
        </context-group>
      </trans-unit>
    </body>
  </file>
</xliff>
```
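When generating units like these by hand, the easiest mistake is forgetting to escape characters that are special in XML. The following sketch builds a single XLIFF 1.2 `<trans-unit>` as a string with escaping applied. The helper names `escapeXml` and `toTransUnit` are hypothetical, and a production exporter would more likely use an XML library than string concatenation.

```typescript
// Escape the characters that are special in XML text and attribute values.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build one XLIFF 1.2 <trans-unit> with an empty target marked for translation.
function toTransUnit(id: string, source: string, note?: string): string {
  const lines = [
    `<trans-unit id="${escapeXml(id)}" resname="${escapeXml(id)}">`,
    `  <source>${escapeXml(source)}</source>`,
    `  <target state="needs-translation"/>`
  ];
  if (note) lines.push(`  <note>${escapeXml(note)}</note>`);
  lines.push('</trans-unit>');
  return lines.join('\n');
}

console.log(toTransUnit('legal.terms', 'Terms & Conditions', 'Footer link'));
```

The ampersand in the source text comes out as `&amp;`, which keeps the generated file well-formed.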
XLIFF 2.0 Improvements
XLIFF 2.0 (released 2014) introduced cleaner syntax and better inline formatting:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<xliff version="2.0" srcLang="en" trgLang="es" xmlns="urn:oasis:names:tc:xliff:document:2.0">
  <file id="messages">
    <unit id="welcome">
      <notes>
        <note category="context">Greeting shown on homepage</note>
      </notes>
      <segment>
        <source>Welcome to our app</source>
        <target>Bienvenido a nuestra aplicación</target>
      </segment>
    </unit>
    <unit id="profile.bio">
      <segment>
        <source>Your bio with <pc id="1" type="fmt" subType="xlf:b">bold text</pc></source>
        <target>Tu biografía con <pc id="1" type="fmt" subType="xlf:b">texto en negrita</pc></target>
      </segment>
    </unit>
  </file>
</xliff>
```
Pros and Cons of XLIFF
Advantages:
- Industry standard with universal CAT tool support
- Rich metadata (context, notes, max length, state tracking)
- Explicit source and target language specification
- Translation state management (new, translated, reviewed, final)
- Inline formatting preservation (bold, italic, links)
- Change tracking and revision history
- Supports complex workflows (translation, review, approval)
Disadvantages:
- Verbose XML syntax reduces human readability
- Requires parsing library (not natively supported in browsers)
- Larger file sizes compared to JSON
- Complex to hand-edit
- Git diffs are harder to read
- Not commonly used in developer tooling
- Learning curve for developers unfamiliar with XML
PO Format Deep Dive
PO (Portable Object) files originated from GNU gettext and remain the standard for Linux, open-source projects, and Python applications.
PO File Structure
```po
# Translation file for MyApp
# Copyright (C) 2026 MyCompany
msgid ""
msgstr ""
"Project-Id-Version: MyApp 1.0\n"
"Language: es\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Plural-Forms: nplurals=2; plural=(n != 1);\n"

#: src/components/LoginButton.tsx:15
#. Main login button text
msgid "Log in"
msgstr "Iniciar sesión"

#: src/components/Welcome.tsx:8
#. Greeting shown on homepage
msgctxt "homepage"
msgid "Welcome"
msgstr "Bienvenido"

# Plural form example
#: src/components/ItemList.tsx:22
msgid "You have {count} item"
msgid_plural "You have {count} items"
msgstr[0] "Tienes {count} artículo"
msgstr[1] "Tienes {count} artículos"
```
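The `Plural-Forms` header determines which `msgstr[n]` is used for a given number. The sketch below hard-codes the formula from the Spanish header above, `plural=(n != 1)`, to show how the index is chosen. The `PluralEntry` shape and the function names are illustrative assumptions; a gettext runtime derives the formula from the header instead of hard-coding it.

```typescript
interface PluralEntry {
  msgid: string;
  msgidPlural: string;
  msgstr: string[]; // msgstr[0], msgstr[1], ... in file order
}

// Equivalent of "Plural-Forms: nplurals=2; plural=(n != 1);" from the header above.
// Other languages need a different formula, taken from their own header.
function pluralIndex(n: number): number {
  return n !== 1 ? 1 : 0;
}

// Pick the translated form for n, falling back to the source strings when untranslated.
function ngettext(entry: PluralEntry, n: number): string {
  const translated = entry.msgstr[pluralIndex(n)];
  const template = translated || (n === 1 ? entry.msgid : entry.msgidPlural);
  return template.replace(/\{count\}/g, String(n));
}

const itemEntry: PluralEntry = {
  msgid: 'You have {count} item',
  msgidPlural: 'You have {count} items',
  msgstr: ['Tienes {count} artículo', 'Tienes {count} artículos']
};

console.log(ngettext(itemEntry, 1)); // Tienes 1 artículo
console.log(ngettext(itemEntry, 3)); // Tienes 3 artículos
```

Languages with more than two forms declare a larger `nplurals` and a more involved formula, which is why the index function has to come from the catalog's own header.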
Context Handling in PO
PO files handle identical source strings with different meanings using msgctxt:
```po
# "Open" as in "open a file"
msgctxt "file_menu"
msgid "Open"
msgstr "Abrir"

# "Open" as in "shop is open"
msgctxt "shop_status"
msgid "Open"
msgstr "Abierto"
```
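At runtime, a contextual lookup needs a key that combines the context and the source string. Compiled MO files do this by joining `msgctxt` and `msgid` with the EOT character (U+0004), and the sketch below reuses that convention in an in-memory map. The catalog contents and the function bodies are assumptions for illustration; `pgettext` is named after the conventional gettext helper.

```typescript
// Compiled MO files key contextual entries as msgctxt + EOT (U+0004) + msgid;
// this sketch borrows the same separator for an in-memory catalog.
const CONTEXT_SEPARATOR = '\u0004';

const catalog = new Map<string, string>([
  [`file_menu${CONTEXT_SEPARATOR}Open`, 'Abrir'],
  [`shop_status${CONTEXT_SEPARATOR}Open`, 'Abierto'],
  ['Log in', 'Iniciar sesión'] // entry without a context
]);

// Look up a string within a context, falling back to the source text.
function pgettext(context: string, msgid: string): string {
  return catalog.get(`${context}${CONTEXT_SEPARATOR}${msgid}`) ?? msgid;
}

// Look up a string that has no context.
function gettext(msgid: string): string {
  return catalog.get(msgid) ?? msgid;
}

console.log(pgettext('file_menu', 'Open')); // Abrir
console.log(pgettext('shop_status', 'Open')); // Abierto
console.log(gettext('Log in')); // Iniciar sesión
```

Because the separator is a control character that does not occur in normal UI text, the two "Open" entries cannot collide with each other or with a context-free string.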
Pros and Cons of PO
Advantages:
- Built-in pluralization support (language-specific rules)
- Context disambiguation with msgctxt
- Extensive tooling (Poedit, Lokalize, Gtranslator)
- Comments and translator notes built-in
- Wide adoption in open-source ecosystem
- Machine-readable and human-readable
- Source file location tracking
Disadvantages:
- Less common in JavaScript/web development
- Requires gettext library or compatible parser
- Two separate files needed (PO for translation, MO for compiled binary)
- Limited support in modern web frameworks
- Plural syntax differs from ICU MessageFormat
- No inline formatting preservation
- Header complexity for beginners
YAML Format Deep Dive
YAML (YAML Ain't Markup Language) prioritizes human readability and is popular in Ruby on Rails and configuration-heavy projects.
YAML Structure
```yaml
en:
  auth:
    login: "Log in"
    signup: "Sign up"
    forgot_password: "Forgot password?"
  dashboard:
    welcome: "Welcome back, %{name}"
    stats:
      users: "Total users"
      revenue: "Revenue"
  items:
    count:
      one: "%{count} item"
      other: "%{count} items"
```
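The `one` and `other` subkeys under `items.count` follow the plural-category convention used by Rails. Outside Rails, the same structure can be resolved with the standard `Intl.PluralRules` API, as in this minimal sketch. It assumes the YAML has already been parsed into a plain object, and the function name `translatePlural` is hypothetical.

```typescript
// Resolve Rails-style plural subkeys (one, other, ...) for a given count.
function translatePlural(
  forms: Record<string, string>,
  count: number,
  locale = 'en'
): string {
  // Intl.PluralRules returns a CLDR category such as "one" or "other".
  const category = new Intl.PluralRules(locale).select(count);
  const template = forms[category] ?? forms.other ?? '';
  // Substitute the %{count} placeholder; other placeholders are left as-is.
  return template.replace(/%\{(\w+)\}/g, (whole: string, key: string) =>
    key === 'count' ? String(count) : whole
  );
}

// The parsed value of en.items.count from the YAML above.
const itemForms = { one: '%{count} item', other: '%{count} items' };

console.log(translatePlural(itemForms, 1)); // 1 item
console.log(translatePlural(itemForms, 0)); // 0 items
```

Falling back to `other` covers locales whose categories (for example `few` or `many`) are missing from a partially translated file.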
YAML with Metadata
```yaml
en:
  welcome:
    _message: "Welcome to our app"
    _description: "Greeting shown on homepage"
    _max_length: 50
  auth:
    login:
      _message: "Log in"
      _description: "Main login button text"
      _screenshot: "https://example.com/screenshots/login.png"
```
Multi-Line Strings in YAML
```yaml
en:
  about:
    description: |
      This is a long description
      that spans multiple lines.
      Each line will be preserved.
    terms: >
      This is another long text,
      but line breaks will be converted
      to spaces in the final output.
```
Pros and Cons of YAML
Advantages:
- Highly readable, minimal syntax
- Supports nested structures naturally
- Comments allowed (using #)
- Multi-line string support
- Popular in Ruby on Rails ecosystem (default format)
- Anchors and aliases for reusing content
- Widely supported in config management tools
Disadvantages:
- Indentation-sensitive (whitespace errors common)
- No standardized metadata format
- Security risks if parsing untrusted YAML (code execution)
- Inconsistent parser behavior across languages
- Poor Git diff visibility for nested changes
- Quotes and escaping can be confusing
- Not native to JavaScript (requires parser)
ARB Format Deep Dive
ARB (Application Resource Bundle) is Google's format for Flutter applications, combining JSON structure with ICU MessageFormat.
ARB Structure
```json
{
  "@@locale": "en",
  "@@context": "MyApp translations",
  "welcome": "Welcome to our app",
  "@welcome": {
    "type": "text",
    "description": "Greeting shown on homepage",
    "context": "Homepage header",
    "maxLength": 50
  },
  "greetUser": "Hello, {username}!",
  "@greetUser": {
    "type": "text",
    "description": "Personalized greeting",
    "placeholders": {
      "username": {
        "type": "String",
        "example": "John"
      }
    }
  },
  "itemCount": "{count, plural, one {{count} item} other {{count} items}}",
  "@itemCount": {
    "type": "text",
    "description": "Number of items in cart",
    "placeholders": {
      "count": {
        "type": "int",
        "format": "compact"
      }
    }
  }
}
```
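Because each message and its `@` metadata entry are separate keys, they can drift apart as the file is edited. A small lint step can report the mismatches, as in the sketch below. The function name `checkArbSync` and the report shape are assumptions; whether a missing metadata entry counts as an error depends on your tooling.

```typescript
interface ArbSyncReport {
  missingMetadata: string[]; // message keys without an "@key" entry
  orphanedMetadata: string[]; // "@key" entries without a message
}

// Compare message keys against their "@"-prefixed metadata keys.
function checkArbSync(arb: Record<string, unknown>): ArbSyncReport {
  // Skip file-level keys such as "@@locale".
  const keys = Object.keys(arb).filter((key) => key.indexOf('@@') !== 0);
  const messageKeys = keys.filter((key) => key.charAt(0) !== '@');
  const metadataKeys = keys
    .filter((key) => key.charAt(0) === '@')
    .map((key) => key.slice(1));

  const messageSet = new Set(messageKeys);
  const metadataSet = new Set(metadataKeys);

  return {
    missingMetadata: messageKeys.filter((key) => !metadataSet.has(key)),
    orphanedMetadata: metadataKeys.filter((key) => !messageSet.has(key))
  };
}

// Hypothetical file in which greetUser lost its metadata and @goodbye lost its message.
const sampleArb: Record<string, unknown> = {
  '@@locale': 'en',
  welcome: 'Welcome to our app',
  '@welcome': { description: 'Greeting shown on homepage' },
  greetUser: 'Hello, {username}!',
  '@goodbye': { description: 'Farewell message' }
};

console.log(checkArbSync(sampleArb));
```

Running a check like this in CI catches the drift before the file reaches translators.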
ARB Gender Support
```json
{
  "genderMessage": "{gender, select, male {He liked this} female {She liked this} other {They liked this}}",
  "@genderMessage": {
    "type": "text",
    "description": "Activity notification",
    "placeholders": {
      "gender": {
        "type": "String"
      }
    }
  }
}
```
Pros and Cons of ARB
Advantages:
- Rich metadata support with @ prefix convention
- ICU MessageFormat built-in (plurals, gender, select)
- Type information for placeholders
- Native Flutter/Dart support
- Example values for translator context
- JSON compatibility (easy to parse)
- Strong parameter validation
Disadvantages:
- Flutter/Dart ecosystem only
- Roughly doubled file size when every key carries a metadata entry
- Verbose for simple translations
- Limited tooling outside Flutter ecosystem
- Metadata syntax not standardized beyond Google
- Complex for non-technical translators
- Manual synchronization of @ keys required
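The placeholder type information listed among the advantages can also be used outside Flutter's own tooling, for example to sanity-check arguments in a test or a conversion script. The sketch below is a hedged illustration of that idea. The function name `validateArgs` is hypothetical, and only the `String`, `int`, `double`, and `num` type names are handled; any other type is accepted without checking.

```typescript
interface PlaceholderMeta {
  type?: string;
  example?: string;
}

// Check runtime arguments against the "placeholders" metadata of one ARB entry.
function validateArgs(
  placeholders: Record<string, PlaceholderMeta>,
  args: Record<string, unknown>
): string[] {
  const errors: string[] = [];
  for (const [placeholder, meta] of Object.entries(placeholders)) {
    const value = args[placeholder];
    if (value === undefined) {
      errors.push(`missing argument "${placeholder}"`);
      continue;
    }
    let matches = true; // unknown or absent type names are accepted as-is
    if (meta.type === 'String') matches = typeof value === 'string';
    else if (meta.type === 'int') matches = Number.isInteger(value);
    else if (meta.type === 'double' || meta.type === 'num') matches = typeof value === 'number';
    if (!matches) {
      errors.push(`"${placeholder}" should be ${meta.type}, got ${typeof value}`);
    }
  }
  return errors;
}

console.log(validateArgs({ username: { type: 'String', example: 'John' } }, { username: 'Ada' })); // []
console.log(validateArgs({ count: { type: 'int' } }, { count: 2.5 })); // [ '"count" should be int, got number' ]
```

An empty result means every declared placeholder was supplied with a value of the expected kind.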
Framework Support Matrix
| Format | React | Vue | Angular | Flutter | Rails | Django | iOS | Android |
|---|---|---|---|---|---|---|---|---|
| JSON | ✅ Native | ✅ Native | ✅ Native | ⚠️ Manual | ✅ i18n-js | ✅ Django | ⚠️ Manual | ⚠️ Manual |
| XLIFF | 🔧 i18next | 🔧 vue-i18n | ✅ Native | ⚠️ Plugin | ⚠️ Manual | ⚠️ Manual | ✅ Native | ✅ Native |
| PO | 🔧 i18next | 🔧 Plugin | ⚠️ Manual | ⚠️ Manual | 🔧 GetText | ✅ Native | ⚠️ Manual | ⚠️ Manual |
| YAML | 🔧 Plugin | 🔧 Plugin | ⚠️ Manual | ⚠️ Manual | ✅ Native | ⚠️ Manual | ⚠️ Manual | ⚠️ Manual |
| ARB | ⚠️ Manual | ⚠️ Manual | ⚠️ Manual | ✅ Native | ⚠️ Manual | ⚠️ Manual | ⚠️ Manual | ⚠️ Manual |
Legend:
- ✅ Native: First-class framework support
- 🔧 Plugin: Well-maintained plugin/library available
- ⚠️ Manual: Requires custom integration
Migration Between Formats
JSON to XLIFF Conversion
```typescript
import { create } from 'xmlbuilder2';

// Nested translation object: leaves are strings, branches are objects.
interface Translation {
  [key: string]: string | Translation;
}

// Object shape that xmlbuilder2 turns into one XLIFF 2.0 <unit>
// ("@" marks an attribute, other keys become child elements).
interface XliffUnit {
  '@id': string;
  segment: { '@state': string; source: string };
}

// Flatten nested JSON into a list of units with dot-separated ids.
function jsonToXliff(json: Translation, prefix: string = ''): XliffUnit[] {
  const units: XliffUnit[] = [];

  Object.entries(json).forEach(([key, value]) => {
    const fullKey = prefix ? `${prefix}.${key}` : key;

    if (typeof value === 'string') {
      // XLIFF 2.0 tracks state on <segment>; "initial" means not yet translated.
      units.push({
        '@id': fullKey,
        segment: { '@state': 'initial', source: value }
      });
    } else {
      units.push(...jsonToXliff(value, fullKey));
    }
  });

  return units;
}

// Usage
const json: Translation = {
  auth: {
    login: 'Log in',
    signup: 'Sign up'
  }
};

const xliff = create({ version: '1.0', encoding: 'UTF-8' })
  .ele('xliff', {
    version: '2.0',
    srcLang: 'en',
    trgLang: 'es',
    xmlns: 'urn:oasis:names:tc:xliff:document:2.0'
  })
  .ele('file', { id: 'messages' })
  // An array under the "unit" key produces one <unit> element per entry.
  .ele({ unit: jsonToXliff(json) })
  .end({ prettyPrint: true });

console.log(xliff);
```
PO to JSON Conversion
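```typescript
import { po } from 'gettext-parser';
import fs from 'fs';

// Convert PO content to a flat key → translation map.
// Entries with a msgctxt are keyed as "context|msgid" (an illustrative choice)
// so identical source strings do not overwrite each other.
// Only the first plural form (msgstr[0]) is kept.
function poToJson(poContent: string): Record<string, string> {
  const parsed = po.parse(poContent);
  const translations: Record<string, string> = {};

  // parsed.translations is keyed by context; '' is the default context.
  Object.entries(parsed.translations).forEach(([context, entries]) => {
    Object.entries(entries).forEach(([msgid, data]) => {
      // Skip the header entry (empty msgid) and untranslated strings.
      if (msgid && data.msgstr[0]) {
        const key = context ? `${context}|${msgid}` : msgid;
        translations[key] = data.msgstr[0];
      }
    });
  });

  return translations;
}

// Usage
const poFile = fs.readFileSync('messages.po', 'utf-8');
const json = poToJson(poFile);
fs.writeFileSync('messages.json', JSON.stringify(json, null, 2));
```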
YAML to ARB Conversion
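```typescript
import yaml from 'js-yaml';
import fs from 'fs';

type YamlNode = string | number | boolean | null | YamlNode[] | { [key: string]: YamlNode };

function yamlToArb(yamlContent: string, locale: string): string {
  const parsed = yaml.load(yamlContent) as Record<string, YamlNode> | undefined;
  const arb: Record<string, unknown> = {
    '@@locale': locale
  };

  // Recursively flatten nested keys, joining path segments with "_".
  function flatten(obj: { [key: string]: YamlNode }, prefix: string = ''): void {
    Object.entries(obj).forEach(([key, value]) => {
      const fullKey = prefix ? `${prefix}_${key}` : key;

      if (typeof value === 'string') {
        // Rails-style "%{name}" interpolation becomes ARB's "{name}".
        arb[fullKey] = value.replace(/%\{(\w+)\}/g, '{$1}');
        arb[`@${fullKey}`] = {
          type: 'text',
          description: `Translation for ${fullKey}`
        };
      } else if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
        flatten(value, fullKey);
      }
      // Other values (null, numbers, arrays) are skipped. Plural subkeys such as
      // one/other become separate keys; they are not merged into an ICU plural.
    });
  }

  // Translations are expected under a top-level locale key, as in Rails files.
  const root = parsed?.[locale];
  if (root === null || typeof root !== 'object' || Array.isArray(root)) {
    throw new Error(`No translations found under top-level key "${locale}"`);
  }

  flatten(root);
  return JSON.stringify(arb, null, 2);
}

// Usage
const yamlFile = fs.readFileSync('translations.yml', 'utf-8');
const arb = yamlToArb(yamlFile, 'en');
fs.writeFileSync('app_en.arb', arb);
```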
When to Use Each Format
Use JSON When:
- Building modern web applications (React, Vue, Angular)
- Developer experience is priority
- Simple key-value translations without complex metadata
- Native JavaScript integration matters
- Version control visibility is important
Use XLIFF When:
- Working with professional translation agencies
- Need rich metadata and translation state tracking
- Using CAT tools (SDL Trados, memoQ, Phrase)
- Complex translation workflows (translate → review → approve)
- Multi-vendor translation projects
- Need inline formatting preservation
Use PO When:
- Building open-source projects
- Using Python (Django) or PHP frameworks
- Need built-in pluralization support
- Context disambiguation is critical
- Translators prefer Poedit or similar tools
- GNU gettext ecosystem compatibility required
Use YAML When:
- Building Ruby on Rails applications
- Configuration-heavy projects
- Human readability is top priority
- Need multi-line string support
- Translators hand-edit files
- Already using YAML for configuration
Use ARB When:
- Building Flutter applications exclusively
- Need strong type safety for placeholders
- ICU MessageFormat is required
- Rich metadata for translators is important
- Google ecosystem integration matters
IntlPull's Multi-Format Support
IntlPull supports all major translation formats with intelligent conversion:
- Import: Upload JSON, XLIFF, PO, YAML, or ARB files
- Export: Download in any format regardless of import format
- Conversion: Automatic format conversion with metadata preservation
- Validation: Format-specific validation and error reporting
- Preview: Side-by-side comparison of source and target
- Metadata: Unified metadata model across all formats (context, screenshots, character limits)
When you upload a JSON file to IntlPull, translators can download XLIFF for CAT tools, while developers export back to JSON for deployment. The platform handles format nuances automatically, ensuring plural forms, context, and metadata are preserved across conversions.
FAQ
Q: Which format should I use for a new project?
A: For web projects, start with JSON for simplicity. If working with professional translators, use XLIFF. For Flutter, use ARB. For open-source or Python projects, use PO. For Rails, use YAML.
Q: Can I mix formats in the same project?
A: Yes, but it's not recommended. Use a translation management system like IntlPull to handle format conversion transparently, keeping your source format consistent while exporting to translator-preferred formats.
Q: How do I preserve metadata when converting between formats?
A: Use specialized conversion libraries or TMS platforms that understand metadata mappings. Direct conversion (e.g., JSON to XLIFF) requires custom logic to map comments, context, and constraints.
Q: Are there any formats to avoid?
A: Avoid proprietary formats that lack tooling support. Stick to standards (JSON, XLIFF, PO, YAML, ARB) with active ecosystems and library support.
Q: How do I handle plurals in JSON?
A: Use ICU MessageFormat syntax: {count, plural, one {# item} other {# items}}. FormatJS supports this syntax in JSON files out of the box; i18next supports it through the i18next-icu plugin and otherwise uses its own suffixed plural keys.
Q: Can I version control XLIFF files?
A: Yes, but Git diffs are verbose due to XML structure. Some teams export XLIFF only for translators and keep JSON in version control, using a TMS as the source of truth.
Q: What's the best format for machine translation?
A: XLIFF is best for professional MT services. JSON works fine for simple MT APIs. The format matters less than having proper context and metadata for accurate translations.
