Context Translator with Screenshot OCR: IntlPull vs Lokalise Visual Localization

The Problem: Translators Lack Context

One of the biggest challenges in localization is context. Translators often see a string like "Submit" without knowing:

Is it a button or a label?
Is it at the top or bottom of the screen?
What's the surrounding UI?
How much space is available?

Result: Generic translations that don't fit the UI, truncation issues, and endless back-and-forth between translators and developers.

The Solution: Screenshot Context

Screenshot context solves this by letting you:

Upload screenshots of your app or website
Tag translation keys with bounding boxes showing exactly where they appear
Auto-detect text using OCR to speed up the tagging process

When translators work on a key, they see the visual context immediately.

IntlPull vs Lokalise: The Screenshot Showdown

Both IntlPull and Lokalise offer screenshot features, but the implementation differs dramatically.

Feature Comparison

Feature	IntlPull	Lokalise
Screenshot upload	Included in all paid plans	Paid add-on
OCR auto-detection	Self-hosted Tesseract (free)	Cloud-based (per-request fees)
Batch upload	Yes	No
Fuzzy key matching	Levenshtein + substring matching	Basic matching
Key-to-screenshot lookup	Yes	Yes
Bounding box editor	Modern canvas-based	Basic overlay
API access	Full REST API	Limited

Cost Comparison

Here's where it gets interesting.

Lokalise's OCR pricing:

Cloud Vision API calls: $0.005-0.02 per request
For 1,000 screenshots with monthly OCR: $60-240/year in OCR fees alone
Plus the screenshot feature itself may be a paid add-on
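
The yearly range above follows from simple multiplication: screenshots × OCR runs per year × price per request. A minimal sketch of that arithmetic, where `annualOcrCost` is a hypothetical helper written for illustration and the per-request prices are the ones quoted above:

```typescript
// Hypothetical helper: estimates yearly OCR spend under per-request pricing.
function annualOcrCost(
  screenshots: number,
  runsPerYear: number,
  pricePerRequestUsd: number,
): number {
  // Assumes one OCR request per screenshot per run.
  return screenshots * runsPerYear * pricePerRequestUsd;
}

// 1,000 screenshots, re-scanned monthly, at the low and high per-request price.
const lowEstimate = annualOcrCost(1000, 12, 0.005);
const highEstimate = annualOcrCost(1000, 12, 0.02);
console.log(`$${lowEstimate.toFixed(0)}-${highEstimate.toFixed(0)} per year`);
```

Plugging in your own screenshot count and release cadence gives a rough figure for comparison.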

IntlPull's OCR pricing:

Self-hosted Tesseract: $0 per request
No additional fees
Included in all paid plans

The Self-Hosted Advantage

IntlPull uses Tesseract, a widely used open-source OCR engine, running on your own infrastructure:

Terminal
# Tesseract is installed on the server
tesseract --version
# tesseract 5.3.0

# No cloud API calls
# No per-request fees
# No data leaving your infrastructure

Benefits:

Privacy: Screenshot data never leaves your infrastructure
Speed: No network latency to cloud APIs
Cost: Zero marginal cost per OCR operation
Reliability: No third-party API outages

How Auto-Detection Works

IntlPull's OCR pipeline runs in four steps:

Step 1: Text Detection

Tesseract scans the screenshot and extracts text regions with bounding boxes:

JSON
{
  "detected_texts": [
    {
      "text": "Submit Order",
      "x": 350,
      "y": 480,
      "width": 120,
      "height": 40,
      "confidence": 0.95
    },
    {
      "text": "Cancel",
      "x": 200,
      "y": 480,
      "width": 80,
      "height": 40,
      "confidence": 0.92
    }
  ]
}
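
To make the step concrete, here is a rough sketch of how word-level Tesseract output could be folded into regions like the ones above. It assumes the server runs `tesseract <image> stdout tsv`, whose TSV rows carry `left`, `top`, `width`, `height`, `conf` (0-100) and `text` columns. The `parseTesseractTsv` function and the line-merging rule are illustrative assumptions, not IntlPull's actual implementation.

```typescript
interface DetectedText {
  text: string;
  x: number;
  y: number;
  width: number;
  height: number;
  confidence: number; // normalized to 0..1
}

interface LineBox {
  words: string[];
  confs: number[];
  left: number;
  top: number;
  right: number;
  bottom: number;
}

// Parses Tesseract TSV output and merges the words on each text line into one region.
function parseTesseractTsv(tsv: string): DetectedText[] {
  const lines = new Map<string, LineBox>();

  // slice(1) skips the header row.
  for (const row of tsv.split(/\r?\n/).slice(1)) {
    const cols = row.split('\t');
    if (cols.length < 12 || cols[0] !== '5') continue; // level 5 rows are words
    const [left, top, width, height, conf] = cols.slice(6, 11).map(Number);
    const word = cols[11].trim();
    if (word === '' || conf < 0) continue;

    const lineKey = cols.slice(1, 5).join(':'); // page:block:paragraph:line
    const box: LineBox = lines.get(lineKey) ??
      { words: [], confs: [], left: Infinity, top: Infinity, right: -Infinity, bottom: -Infinity };
    box.words.push(word);
    box.confs.push(conf);
    box.left = Math.min(box.left, left);
    box.top = Math.min(box.top, top);
    box.right = Math.max(box.right, left + width);
    box.bottom = Math.max(box.bottom, top + height);
    lines.set(lineKey, box);
  }

  return Array.from(lines.values()).map((box) => ({
    text: box.words.join(' '),
    x: box.left,
    y: box.top,
    width: box.right - box.left,
    height: box.bottom - box.top,
    // Mean word confidence, rescaled from Tesseract's 0-100 range.
    confidence: box.confs.reduce((sum, c) => sum + c, 0) / box.confs.length / 100,
  }));
}

// Hand-written sample rows in Tesseract's TSV column order.
const sampleTsv = [
  'level\tpage_num\tblock_num\tpar_num\tline_num\tword_num\tleft\ttop\twidth\theight\tconf\ttext',
  '4\t1\t1\t1\t1\t0\t350\t480\t120\t40\t-1\t',
  '5\t1\t1\t1\t1\t1\t350\t480\t60\t40\t96\tSubmit',
  '5\t1\t1\t1\t1\t2\t420\t480\t50\t40\t94\tOrder',
].join('\n');

console.log(parseTesseractTsv(sampleTsv));
```

Merging words per line is one reasonable choice for UI strings, since a button label such as "Submit Order" is usually a single translation key rather than two.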

Step 2: Fuzzy Key Matching

The detected text is matched against your translation keys using multiple strategies:

Exact match (score: 1.0)
Substring match (score: 0.9 × ratio)
Levenshtein distance (normalized similarity)
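
The three strategies can be sketched as a single scoring function. This is an illustration under stated assumptions rather than IntlPull's real scorer: the "ratio" is taken to be the length of the shorter string divided by the longer one, the comparison is case-insensitive, and the better of the substring and Levenshtein scores wins. The names `levenshtein` and `matchScore` are made up for this example.

```typescript
// Classic dynamic-programming Levenshtein distance (insert, delete, substitute).
function levenshtein(a: string, b: string): number {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Scores how well OCR-detected text matches a key's source string (0..1).
function matchScore(detected: string, source: string): number {
  const a = detected.trim().toLowerCase();
  const b = source.trim().toLowerCase();
  if (a === '' || b === '') return 0;

  // Strategy 1: exact match (after case folding).
  if (a === b) return 1.0;

  const [shorter, longer] = a.length <= b.length ? [a, b] : [b, a];

  // Strategy 2: substring match, weighted by how much of the longer string is covered.
  const substringScore = longer.includes(shorter)
    ? 0.9 * (shorter.length / longer.length)
    : 0;

  // Strategy 3: Levenshtein distance normalized to a similarity.
  const similarity = 1 - levenshtein(a, b) / longer.length;

  return Math.max(substringScore, similarity);
}

console.log(matchScore('SUBMIT', 'Submit'));
console.log(matchScore('Submitt', 'Submit'));
console.log(matchScore('Submit Order', 'Submit'));
```

Taking the maximum keeps a one-character OCR slip from being penalized just because it also happens to contain the source string.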

Combining these strategies handles common discrepancies between OCR output and source strings:

"Submit" vs "Submitt" (typo tolerance)
"SUBMIT" vs "Submit" (case insensitive)
"Submit Order" vs "Submit" (partial matching)

Step 3: Suggested Tags

The API returns suggested key-to-region mappings:

JSON
{
  "suggested_tags": [
    {
      "key_id": "abc123",
      "key_name": "buttons.submit",
      "x": 350,
      "y": 480,
      "width": 120,
      "height": 40,
      "confidence": 0.95,
      "match_score": 1.0
    }
  ]
}
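
One way to picture how such a response could be assembled: for each detected region, pick the best-scoring key and keep the pairing if both the OCR confidence and the match score clear a threshold. The sketch below is a guess at the shape of that logic, not the service's code. `suggestTags`, `TranslationKey`, and the default thresholds are hypothetical, and the scoring function is passed in so any matcher can be used.

```typescript
interface DetectedText {
  text: string;
  x: number;
  y: number;
  width: number;
  height: number;
  confidence: number;
}

interface TranslationKey {
  id: string;
  name: string;
  sourceText: string;
}

interface SuggestedTag {
  key_id: string;
  key_name: string;
  x: number;
  y: number;
  width: number;
  height: number;
  confidence: number;
  match_score: number;
}

// Pairs each sufficiently confident region with its best-matching key.
function suggestTags(
  detected: DetectedText[],
  keys: TranslationKey[],
  score: (detectedText: string, sourceText: string) => number,
  minConfidence = 0.7,
  minMatchScore = 0.7,
): SuggestedTag[] {
  const tags: SuggestedTag[] = [];
  for (const region of detected) {
    if (region.confidence < minConfidence) continue;

    let bestKey: TranslationKey | undefined;
    let bestScore = 0;
    for (const key of keys) {
      const s = score(region.text, key.sourceText);
      if (s > bestScore) {
        bestKey = key;
        bestScore = s;
      }
    }

    if (bestKey && bestScore >= minMatchScore) {
      tags.push({
        key_id: bestKey.id,
        key_name: bestKey.name,
        x: region.x,
        y: region.y,
        width: region.width,
        height: region.height,
        confidence: region.confidence,
        match_score: bestScore,
      });
    }
  }
  return tags;
}

// Usage with a trivial case-insensitive exact matcher.
const exactMatch = (a: string, b: string) =>
  a.trim().toLowerCase() === b.trim().toLowerCase() ? 1.0 : 0;

const suggestions = suggestTags(
  [
    { text: 'Submit Order', x: 350, y: 480, width: 120, height: 40, confidence: 0.95 },
    { text: 'Cancel', x: 200, y: 480, width: 80, height: 40, confidence: 0.92 },
  ],
  [{ id: 'abc123', name: 'buttons.submit', sourceText: 'Submit Order' }],
  exactMatch,
);
console.log(JSON.stringify({ suggested_tags: suggestions }, null, 2));
```

Regions with no key above the match threshold (here, "Cancel") are simply left untagged for a human to handle.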

Step 4: Human Review

Translators review and confirm the suggestions before they're applied. This prevents false positives and ensures accuracy.

Real-World Workflow

For Developers

Terminal
# Upload screenshots from CI/CD
for screenshot in screenshots/*.png; do
  curl -X POST \
    -H "X-API-Key: ip_live_xxx" \
    -F "file=@${screenshot}" \
    -F "tags=v2.5.0,checkout-flow" \
    https://api.intlpull.com/api/v1/projects/PROJECT_ID/screenshots
done

For Project Managers

Navigate to the Screenshots tab in the dashboard
Click "Auto-Detect" on each screenshot
Review suggested tags and confirm
Translators now see visual context for all tagged keys

For Translators

When translating "buttons.submit", they see:

The actual button in the UI
Surrounding context (what's above/below)
Available space for the translation
Multiple screenshots if the key appears in different places

Performance Benchmarks

We tested OCR processing times on a typical server:

Image Size	Tesseract Time	Total API Response
< 1MB	~500ms	~1 second
1-5MB	~1-2 seconds	~2-3 seconds
5-20MB	~3-5 seconds	~5-7 seconds

For most screenshots (mobile screens, web pages), processing completes in under 2 seconds.

Comparison: Lokalise's Approach

Lokalise uses a cloud-based approach:

Screenshot -> Cloud API -> Text Detection -> Response
           |
     Network latency + API costs

Drawbacks:

Per-request costs that add up
Network latency for each operation
Dependency on third-party API availability
Screenshot data sent to external servers

Setting Up Screenshots in IntlPull

Prerequisites

For self-hosted deployments, install Tesseract:

Terminal
# macOS
brew install tesseract

# Ubuntu/Debian
sudo apt-get update && sudo apt-get install -y tesseract-ocr

# Docker (these two lines are Dockerfile instructions, not shell commands)
FROM alpine:3.19
RUN apk add --no-cache tesseract-ocr

API Usage

Upload a screenshot:

Terminal
curl -X POST \
  -H "X-API-Key: YOUR_KEY" \
  -F "file=@screenshot.png" \
  -F "name=Checkout Page" \
  -F "tags=checkout,mobile" \
  https://api.intlpull.com/api/v1/projects/{projectId}/screenshots

Trigger auto-detection:

Terminal
curl -X POST \
  -H "X-API-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"min_confidence": 0.7}' \
  https://api.intlpull.com/api/v1/projects/{projectId}/screenshots/{screenshotId}/auto-detect

Bulk create key tags from OCR results:

Terminal
curl -X POST \
  -H "X-API-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "key_maps": [
      {"key_id": "abc123", "x": 100, "y": 200, "width": 80, "height": 30, "auto_detected": true},
      {"key_id": "def456", "x": 300, "y": 200, "width": 60, "height": 30, "auto_detected": true}
    ]
  }' \
  https://api.intlpull.com/api/v1/projects/{projectId}/screenshots/{screenshotId}/keys/bulk

SDK Integration

The TypeScript SDK makes it even easier:

TypeScript
import * as fs from 'node:fs';
import * as path from 'node:path';
import { IntlPull } from '@intlpullhq/sdk';

const client = new IntlPull({ apiKey: 'ip_live_xxx' });

// Upload and auto-tag in one flow
async function processScreenshot(projectId: string, filePath: string) {
  // 1. Upload
  const screenshot = await client.screenshots.upload(projectId, {
    file: fs.readFileSync(filePath),
    name: path.basename(filePath),
    tags: ['v2.5.0'],
  });

  // 2. Auto-detect
  const detection = await client.screenshots.autoDetect(
    projectId,
    screenshot.id,
    { minConfidence: 0.7 }
  );

  // 3. Apply high-confidence matches (match score of 0.9 or above)
  const highConfidence = detection.suggestedTags.filter(t => t.match_score >= 0.9);

  if (highConfidence.length > 0) {
    await client.screenshots.bulkCreateKeyMaps(projectId, screenshot.id, {
      keyMaps: highConfidence.map(t => ({
        keyId: t.key_id,
        x: t.x,
        y: t.y,
        width: t.width,
        height: t.height,
        autoDetected: true,
        confidence: t.match_score,
      })),
    });
  }

  console.log(`Tagged ${highConfidence.length} keys automatically`);
}

Best Practices

1. Capture Representative Screenshots

Include all states (empty, loading, error, success)
Capture different screen sizes (mobile, tablet, desktop)
Tag version numbers in metadata for tracking

2. Use Tags for Organization

Terminal
# Tag by feature area
-F "tags=checkout,payment"

# Tag by version
-F "tags=v2.5.0,sprint-42"

# Tag by platform
-F "tags=ios,dark-mode"

3. Set Appropriate Confidence Thresholds

0.9+: Auto-apply without review (high confidence)
0.7-0.9: Suggest but require confirmation
<0.7: Don't suggest (too uncertain)
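
These bands translate directly into a small triage step. A minimal sketch, assuming the score being tested is the match score of a suggested tag; the `triage` function and the `Decision` labels are names invented for this example:

```typescript
type Decision = 'auto-apply' | 'needs-review' | 'discard';

// Maps a score to one of the three bands described above.
function triage(score: number, autoApplyAt = 0.9, suggestAt = 0.7): Decision {
  if (score >= autoApplyAt) return 'auto-apply';
  if (score >= suggestAt) return 'needs-review';
  return 'discard';
}

// Groups a batch of suggestions by decision so each bucket can be handled separately.
function groupByDecision<T extends { match_score: number }>(
  tags: T[],
): Record<Decision, T[]> {
  const buckets: Record<Decision, T[]> = {
    'auto-apply': [],
    'needs-review': [],
    discard: [],
  };
  for (const tag of tags) {
    buckets[triage(tag.match_score)].push(tag);
  }
  return buckets;
}

const buckets = groupByDecision([
  { key_name: 'buttons.submit', match_score: 1.0 },
  { key_name: 'buttons.cancel', match_score: 0.82 },
  { key_name: 'labels.total', match_score: 0.4 },
]);
console.log(buckets);
```

Keeping the cut-offs as parameters makes it easy to tighten them for projects with stylized fonts or low-contrast designs.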

4. Review OCR Results

OCR isn't perfect. Common issues:

Similar characters (O vs 0, l vs I)
Stylized fonts
Low contrast text

Have a human review automated tags, and spot-check even the ones that were applied automatically.
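
For the similar-character problem in particular, one common mitigation is to fold look-alike characters into a single canonical form before comparing strings. The sketch below illustrates the idea; the `CONFUSABLES` table is a small hand-picked assumption (the two pairs listed above plus the digit 1) and is not something IntlPull is documented to do.

```typescript
// Look-alike characters mapped to one canonical form (after lowercasing).
const CONFUSABLES: Record<string, string> = {
  '0': 'o', // zero vs letter O
  '1': 'l', // digit one vs lowercase L
  i: 'l',   // capital I (lowercased to i) vs lowercase L
};

// Lowercases the text and folds look-alike characters together.
function canonicalize(text: string): string {
  return text
    .toLowerCase()
    .replace(/[01i]/g, (ch) => CONFUSABLES[ch] ?? ch);
}

// True when two strings differ only by case or by look-alike characters.
function looksAlike(a: string, b: string): boolean {
  return canonicalize(a) === canonicalize(b);
}

console.log(looksAlike('L0GIN', 'LOGIN'));
console.log(looksAlike('BiII', 'Bill'));
console.log(looksAlike('Submit', 'Cancel'));
```

Because this folding deliberately throws information away, it is best used as a fallback when an exact comparison fails, with the result still going to a reviewer.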

Migration from Lokalise

If you're switching from Lokalise:

Terminal
# Export screenshots from Lokalise (if available via API)
# Then bulk upload to IntlPull

npx @intlpullhq/cli migrate --screenshots --from lokalise

What transfers:

Screenshot images
Existing key-to-screenshot mappings
Tags and metadata

Conclusion: Why IntlPull Wins

Metric	IntlPull	Lokalise
OCR cost	$0	$60-240+/year
Data privacy	On-premise	Cloud (3rd party)
Setup complexity	Pre-installed	API key + billing
Batch operations	Yes	Limited
API access	Full	Partial

For teams that:

Process many screenshots
Care about data privacy
Want predictable costs
Need API automation

IntlPull's self-hosted Tesseract approach is the clear winner.

Ready to give translators the context they need? Start your free trial or read the full documentation.
