The Problem: Translators Lack Context
One of the biggest challenges in localization is context. Translators often see a string like "Submit" without knowing:
- Is it a button or a label?
- Is it at the top or bottom of the screen?
- What's the surrounding UI?
- How much space is available?
Result: Generic translations that don't fit the UI, truncation issues, and endless back-and-forth between translators and developers.
The Solution: Screenshot Context
Screenshot context solves this by letting you:
- Upload screenshots of your app or website
- Tag translation keys with bounding boxes showing exactly where they appear
- Auto-detect text using OCR to speed up the tagging process
When translators work on a key, they see the visual context immediately.
IntlPull vs Lokalise: The Screenshot Showdown
Both IntlPull and Lokalise offer screenshot features, but the implementation differs dramatically.
Feature Comparison
| Feature | IntlPull | Lokalise |
|---|---|---|
| Screenshot upload | Included in all paid plans | Paid add-on |
| OCR auto-detection | Self-hosted Tesseract (free) | Cloud-based (per-request fees) |
| Batch upload | Yes | No |
| Fuzzy key matching | Levenshtein + substring matching | Basic matching |
| Key-to-screenshot lookup | Yes | Yes |
| Bounding box editor | Modern canvas-based | Basic overlay |
| API access | Full REST API | Limited |
Cost Comparison
Here's where it gets interesting.
Lokalise's OCR pricing:
- Cloud Vision API calls: $0.005-0.02 per request
- For 1,000 screenshots with monthly OCR: $60-240/year in OCR fees alone
- Plus the screenshot feature itself may be a paid add-on
IntlPull's OCR pricing:
- Self-hosted Tesseract: $0 per request
- No additional fees
- Included in all paid plans
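The $60-240 figure is plain arithmetic, assuming every screenshot is re-processed once a month at the per-request prices quoted above. A minimal sketch of that calculation (the function and parameter names are illustrative, not part of either product):

```typescript
// Illustrative only: yearly OCR spend for a service that bills per request.
// The prices used below are the range quoted in this article, not a published rate card.
function yearlyOcrCost(
  screenshots: number,
  runsPerYear: number,
  pricePerRequest: number,
): number {
  // Assumes one OCR request per screenshot per run.
  return screenshots * runsPerYear * pricePerRequest;
}

const low = yearlyOcrCost(1000, 12, 0.005);
const high = yearlyOcrCost(1000, 12, 0.02);
console.log(`$${low.toFixed(2)} - $${high.toFixed(2)} per year`);
```

Plugging in your own screenshot count and re-run frequency gives a quick estimate of what per-request pricing would cost your team.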
The Self-Hosted Advantage
IntlPull uses Tesseract OCR, a widely used open-source OCR engine, running on your own infrastructure:

```bash
# Tesseract is installed on the server
tesseract --version
# tesseract 5.3.0

# No cloud API calls
# No per-request fees
# No data leaving your infrastructure
```
Benefits:
- Privacy: Screenshot data never leaves your infrastructure
- Speed: No network latency to cloud APIs
- Cost: Zero marginal cost per OCR operation
- Reliability: No third-party API outages
How Auto-Detection Works
IntlPull's OCR pipeline runs in four steps:
Step 1: Text Detection
Tesseract scans the screenshot and extracts text regions with bounding boxes:
```json
{
  "detected_texts": [
    {
      "text": "Submit Order",
      "x": 350,
      "y": 480,
      "width": 120,
      "height": 40,
      "confidence": 0.95
    },
    {
      "text": "Cancel",
      "x": 200,
      "y": 480,
      "width": 80,
      "height": 40,
      "confidence": 0.92
    }
  ]
}
```
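To make the shape above concrete, here is a rough sketch of how word-level Tesseract TSV output could be merged into line-level regions like these. The TSV column layout (level, page_num, block_num, par_num, line_num, word_num, left, top, width, height, conf, text) is Tesseract's own; the `parseTesseractTsv` function and the grouping strategy are assumptions for illustration, not IntlPull's actual implementation.

```typescript
interface DetectedText {
  text: string;
  x: number;
  y: number;
  width: number;
  height: number;
  confidence: number; // normalized to 0..1
}

interface LineAcc {
  words: string[];
  left: number;
  top: number;
  right: number;
  bottom: number;
  confSum: number;
}

// Hypothetical helper: merge Tesseract word rows (level 5) into one region per text line.
function parseTesseractTsv(tsv: string): DetectedText[] {
  const lines: Record<string, LineAcc> = {};
  const order: string[] = []; // preserve first-seen order of lines
  const rows = tsv.split(/\r?\n/).slice(1); // skip the header row

  for (const row of rows) {
    const cols = row.split("\t");
    if (cols.length < 12) continue;
    const conf = Number(cols[10]);
    const text = cols.slice(11).join("\t").trim();
    // Structural rows (page, block, paragraph, line) carry conf = -1 and no text.
    if (Number(cols[0]) !== 5 || conf < 0 || text === "") continue;

    const key = [cols[1], cols[2], cols[3], cols[4]].join(":"); // page:block:par:line
    const left = Number(cols[6]);
    const top = Number(cols[7]);
    const right = left + Number(cols[8]);
    const bottom = top + Number(cols[9]);

    const acc = lines[key];
    if (!acc) {
      lines[key] = { words: [text], left, top, right, bottom, confSum: conf };
      order.push(key);
    } else {
      acc.words.push(text);
      acc.left = Math.min(acc.left, left);
      acc.top = Math.min(acc.top, top);
      acc.right = Math.max(acc.right, right);
      acc.bottom = Math.max(acc.bottom, bottom);
      acc.confSum += conf;
    }
  }

  return order.map((key) => {
    const a = lines[key];
    return {
      text: a.words.join(" "),
      x: a.left,
      y: a.top,
      width: a.right - a.left,
      height: a.bottom - a.top,
      confidence: a.confSum / a.words.length / 100, // Tesseract reports 0..100
    };
  });
}

// Made-up sample rows for a two-word button label.
const sample = [
  "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num\tleft\ttop\twidth\theight\tconf\ttext",
  "4\t1\t1\t1\t1\t0\t350\t480\t120\t40\t-1\t",
  "5\t1\t1\t1\t1\t1\t350\t480\t60\t40\t96\tSubmit",
  "5\t1\t1\t1\t1\t2\t415\t480\t55\t40\t94\tOrder",
].join("\n");

const detected = parseTesseractTsv(sample);
console.log(JSON.stringify(detected, null, 2));
```

Grouping by line rather than by word keeps multi-word labels such as "Submit Order" together, which is what the key matching in the next step needs.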
Step 2: Fuzzy Key Matching
The detected text is matched against your translation keys using multiple strategies:
- Exact match (score: 1.0)
- Substring match (score: 0.9 × ratio)
- Levenshtein distance (normalized similarity)
This handles common mismatches between OCR output and source strings, such as:
- "Submit" vs "Submitt" (typo tolerance)
- "SUBMIT" vs "Submit" (case insensitive)
- "Submit Order" vs "Submit" (partial matching)
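The three strategies above can be sketched as a single scoring function. This is a hedged illustration, not IntlPull's code: the article does not define the substring "ratio", so the sketch assumes it is shorter length divided by longer length, and it assumes the best-scoring strategy wins. `levenshtein` and `matchScore` are hypothetical names.

```typescript
// Classic dynamic-programming Levenshtein distance, keeping two rows at a time.
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);

  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // deletion
        curr[j - 1] + 1, // insertion
        prev[j - 1] + cost, // substitution
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Score in 0..1 for how well OCR text matches a source string.
function matchScore(detected: string, source: string): number {
  const d = detected.trim().toLowerCase(); // case-insensitive comparison
  const s = source.trim().toLowerCase();
  if (d === "" || s === "") return 0;
  if (d === s) return 1.0; // exact match

  const shorter = d.length <= s.length ? d : s;
  const longer = d.length <= s.length ? s : d;

  // Assumption: "ratio" means shorter length / longer length.
  const substring =
    longer.indexOf(shorter) !== -1 ? 0.9 * (shorter.length / longer.length) : 0;

  // Normalized similarity: 1 means identical, 0 means nothing in common.
  const similarity = 1 - levenshtein(d, s) / longer.length;

  // Assumption: the best strategy wins.
  return Math.max(substring, similarity);
}

console.log(matchScore("SUBMIT", "Submit"));
console.log(matchScore("Submitt", "Submit"));
console.log(matchScore("Submit Order", "Submit"));
```

Note that partial matches score well below exact ones under these assumptions, so a threshold chosen for typo tolerance will not necessarily accept them.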
Step 3: Suggested Tags
The API returns suggested key-to-region mappings:
```json
{
  "suggested_tags": [
    {
      "key_id": "abc123",
      "key_name": "buttons.submit",
      "x": 350,
      "y": 480,
      "width": 120,
      "height": 40,
      "confidence": 0.95,
      "match_score": 1.0
    }
  ]
}
```
Step 4: Human Review
A reviewer (typically a project manager or translator) confirms the suggestions before they're applied. This catches false positives before translators see incorrect context.
Real-World Workflow
For Developers
```bash
# Upload screenshots from CI/CD
for screenshot in screenshots/*.png; do
  curl -X POST \
    -H "X-API-Key: ip_live_xxx" \
    -F "file=@$screenshot" \
    -F "tags=v2.5.0,checkout-flow" \
    https://api.intlpull.com/api/v1/projects/PROJECT_ID/screenshots
done
```
For Project Managers
- Navigate to the Screenshots tab in the dashboard
- Click "Auto-Detect" on each screenshot
- Review suggested tags and confirm
- Translators now see visual context for all tagged keys
For Translators
When translating "buttons.submit", they see:
- The actual button in the UI
- Surrounding context (what's above/below)
- Available space for the translation
- Multiple screenshots if the key appears in different places
Performance Benchmarks
We tested OCR processing times on a typical server:
| Image Size | Tesseract Time | Total API Response |
|---|---|---|
| < 1MB | ~500ms | ~1 second |
| 1-5MB | ~1-2 seconds | ~2-3 seconds |
| 5-20MB | ~3-5 seconds | ~5-7 seconds |
For most screenshots (mobile screens, web pages), processing completes in under 2 seconds.
Comparison: Lokalise's Approach
Lokalise uses a cloud-based approach:
Screenshot -> Cloud API (network latency + API costs) -> Text Detection -> Response
Drawbacks:
- Per-request costs that add up
- Network latency for each operation
- Dependency on third-party API availability
- Screenshot data sent to external servers
Setting Up Screenshots in IntlPull
Prerequisites
For self-hosted deployments, install Tesseract:
```bash
# macOS
brew install tesseract

# Ubuntu/Debian
sudo apt-get install -y tesseract-ocr

# Docker (these are Dockerfile instructions, not shell commands):
#   FROM alpine:3.19
#   RUN apk add --no-cache tesseract-ocr
```
API Usage
Upload a screenshot:
```bash
curl -X POST \
  -H "X-API-Key: YOUR_KEY" \
  -F "file=@screenshot.png" \
  -F "name=Checkout Page" \
  -F "tags=checkout,mobile" \
  "https://api.intlpull.com/api/v1/projects/{projectId}/screenshots"
```
Trigger auto-detection:
```bash
curl -X POST \
  -H "X-API-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"min_confidence": 0.7}' \
  "https://api.intlpull.com/api/v1/projects/{projectId}/screenshots/{screenshotId}/auto-detect"
```
Bulk create key tags from OCR results:
```bash
curl -X POST \
  -H "X-API-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "key_maps": [
      {"key_id": "abc123", "x": 100, "y": 200, "width": 80, "height": 30, "auto_detected": true},
      {"key_id": "def456", "x": 300, "y": 200, "width": 60, "height": 30, "auto_detected": true}
    ]
  }' \
  "https://api.intlpull.com/api/v1/projects/{projectId}/screenshots/{screenshotId}/keys/bulk"
```
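If you script this step, the bulk request body can be derived from the auto-detect response. The sketch below assumes the `suggested_tags` and `key_maps` field names shown in this article; `toBulkKeyMapBody` and its threshold parameter are hypothetical helpers, not part of the IntlPull API.

```typescript
// Shape of one entry in the auto-detect response, as shown earlier in this article.
interface SuggestedTag {
  key_id: string;
  key_name: string;
  x: number;
  y: number;
  width: number;
  height: number;
  confidence: number;
  match_score: number;
}

// Shape of one entry in the bulk key-tag request body.
interface KeyMap {
  key_id: string;
  x: number;
  y: number;
  width: number;
  height: number;
  auto_detected: boolean;
}

// Hypothetical helper: keep suggestions at or above a match score and
// convert them to the request body expected by the bulk endpoint.
function toBulkKeyMapBody(
  tags: SuggestedTag[],
  minMatchScore: number,
): { key_maps: KeyMap[] } {
  const key_maps = tags
    .filter((t) => t.match_score >= minMatchScore)
    .map((t) => ({
      key_id: t.key_id,
      x: t.x,
      y: t.y,
      width: t.width,
      height: t.height,
      auto_detected: true,
    }));
  return { key_maps };
}

// Example input mirroring the response format above (values are made up).
const suggestions: SuggestedTag[] = [
  { key_id: "abc123", key_name: "buttons.submit", x: 350, y: 480, width: 120, height: 40, confidence: 0.95, match_score: 1.0 },
  { key_id: "def456", key_name: "buttons.cancel", x: 200, y: 480, width: 80, height: 40, confidence: 0.92, match_score: 0.6 },
];

const body = toBulkKeyMapBody(suggestions, 0.9);
console.log(JSON.stringify(body));
```

The resulting JSON string can be passed as the `-d` payload of the bulk request.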
SDK Integration
The TypeScript SDK wraps the same flow:

```typescript
import * as fs from 'fs';
import * as path from 'path';
import { IntlPull } from '@intlpullhq/sdk';

const client = new IntlPull({ apiKey: 'ip_live_xxx' });
const projectId = 'PROJECT_ID'; // replace with your project ID

// Upload and auto-tag in one flow
async function processScreenshot(filePath: string) {
  // 1. Upload
  const screenshot = await client.screenshots.upload(projectId, {
    file: fs.readFileSync(filePath),
    name: path.basename(filePath),
    tags: ['v2.5.0'],
  });

  // 2. Auto-detect
  const detection = await client.screenshots.autoDetect(
    projectId,
    screenshot.id,
    { minConfidence: 0.7 }
  );

  // 3. Apply high-confidence matches
  const highConfidence = detection.suggestedTags.filter(t => t.match_score >= 0.9);

  if (highConfidence.length > 0) {
    await client.screenshots.bulkCreateKeyMaps(projectId, screenshot.id, {
      keyMaps: highConfidence.map(t => ({
        keyId: t.key_id,
        x: t.x,
        y: t.y,
        width: t.width,
        height: t.height,
        autoDetected: true,
        confidence: t.match_score,
      })),
    });
  }

  console.log('Tagged ' + highConfidence.length + ' keys automatically');
}
```
Best Practices
1. Capture Representative Screenshots
- Include all states (empty, loading, error, success)
- Capture different screen sizes (mobile, tablet, desktop)
- Tag version numbers in metadata for tracking
2. Use Tags for Organization
```bash
# Tag by feature area
-F "tags=checkout,payment"

# Tag by version
-F "tags=v2.5.0,sprint-42"

# Tag by platform
-F "tags=ios,dark-mode"
```
3. Set Appropriate Confidence Thresholds
- 0.9+: Auto-apply, then spot-check afterwards (high confidence)
- 0.7-0.9: Suggest but require confirmation
- <0.7: Don't suggest (too uncertain)
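These thresholds translate directly into a small triage step. The sketch below is only an illustration: the bucket names and the `triage` and `bucketSuggestions` helpers are made up, and it assumes you have already reduced each suggestion to a single score between 0 and 1.

```typescript
type Decision = "auto-apply" | "suggest" | "skip";

// Map a score in 0..1 to one of the three bands listed above.
function triage(score: number): Decision {
  if (score >= 0.9) return "auto-apply";
  if (score >= 0.7) return "suggest";
  return "skip";
}

interface ScoredSuggestion {
  keyName: string;
  score: number;
}

// Group suggestions by band so each group can be handled differently.
function bucketSuggestions(
  items: ScoredSuggestion[],
): Record<Decision, ScoredSuggestion[]> {
  const buckets: Record<Decision, ScoredSuggestion[]> = {
    "auto-apply": [],
    suggest: [],
    skip: [],
  };
  for (const item of items) {
    buckets[triage(item.score)].push(item);
  }
  return buckets;
}

// Made-up example scores.
const buckets = bucketSuggestions([
  { keyName: "buttons.submit", score: 0.97 },
  { keyName: "buttons.cancel", score: 0.82 },
  { keyName: "footer.legal", score: 0.41 },
]);
console.log(buckets);
```

Keeping the cut-offs in one function makes them easy to tune once you see how often each band produces false positives on your own screenshots.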
4. Review OCR Results
OCR isn't perfect. Common issues:
- Similar characters (O vs 0, l vs I)
- Stylized fonts
- Low contrast text
Always have a human review automated tags.
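One way to soften the look-alike character problem before matching is to fold confusable characters into a single canonical form on both sides of the comparison. The mapping below is a small illustrative set chosen for this sketch, and `normalizeForMatching` is a hypothetical helper rather than an IntlPull feature.

```typescript
// Characters OCR engines commonly confuse, folded to one canonical letter.
// Illustrative set only; extend it based on the errors you actually observe.
const CONFUSABLES: Record<string, string> = {
  "0": "o",
  "1": "l",
  "i": "l", // covers both "I" and "i" after lowercasing
  "|": "l",
};

// Use the result only for comparison, never for display:
// folding deliberately throws information away.
function normalizeForMatching(text: string): string {
  return text
    .toLowerCase()
    .split("")
    .map((ch) => CONFUSABLES[ch] || ch)
    .join("");
}

console.log(normalizeForMatching("0rder") === normalizeForMatching("Order"));
console.log(normalizeForMatching("Bi11ing") === normalizeForMatching("Billing"));
```

Folding raises the chance of false positives (for example, "10" and "lo" become identical), which is one more reason to keep the human review step.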
Migration from Lokalise
If you're switching from Lokalise:
```bash
# Export screenshots from Lokalise (if available via API)
# Then bulk upload to IntlPull

npx @intlpullhq/cli migrate --screenshots --from lokalise
```
What transfers:
- Screenshot images
- Existing key-to-screenshot mappings
- Tags and metadata
Conclusion: Why IntlPull Wins
| Metric | IntlPull | Lokalise |
|---|---|---|
| OCR cost | $0 | $60-240+/year |
| Data privacy | On-premises | Cloud (3rd party) |
| Setup complexity | Pre-installed | API key + billing |
| Batch operations | Yes | Limited |
| API access | Full | Partial |
For teams that:
- Process many screenshots
- Care about data privacy
- Want predictable costs
- Need API automation
IntlPull's self-hosted Tesseract approach is the clear winner.
Ready to give translators the context they need? Start your free trial or read the full documentation.
