An OCR and translation pipeline for manga and comics that extracts text from speech bubbles and translates it into multiple languages. Optimized for macOS, including Apple Silicon.

✨ Key Features

Dual OCR Support: EasyOCR (primary) with automatic fallback to PaddleOCR
Smart Bubble Detection: Advanced speech bubble clustering with line-level precision
Robust Text Recognition: Multi-pass preprocessing with rotation-based reread for accuracy
Intelligent Noise Filtering: Removes debug artifacts, garbage tokens, and unwanted symbols
Reading Order Detection: Automatic LTR/RTL detection for proper translation sequencing
Multi-Language Translation: Powered by Deep Translator
Structured Output: JSON metadata for bubble locations and properties
Visual Debugging: Detailed debug overlays for quality control
Batch Processing: Shell script support for processing multiple pages
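
As a rough illustration of the reading-order feature, the sketch below groups bubbles into row bands from top to bottom and then orders each band right to left (RTL) or left to right (LTR). The `(x, y, w, h)` box format, the `row_tolerance` parameter, and the function name are assumptions made for this example; the pipeline's actual detection logic may differ.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height); assumed format


def sort_reading_order(boxes: List[Box], rtl: bool = True,
                       row_tolerance: int = 40) -> List[Box]:
    """Order bubble boxes by row band, then horizontally within each band."""
    rows: List[List[Box]] = []
    # Walk boxes from top to bottom and group those whose top edge is
    # close to the top edge of the first box in the current row.
    for box in sorted(boxes, key=lambda b: b[1]):
        if rows and abs(box[1] - rows[-1][0][1]) <= row_tolerance:
            rows[-1].append(box)
        else:
            rows.append([box])
    ordered: List[Box] = []
    for row in rows:
        # RTL (typical for Japanese manga): rightmost bubble first.
        ordered.extend(sorted(row, key=lambda b: b[0], reverse=rtl))
    return ordered
```

Passing `rtl=False` gives the left-to-right order used by most Western comics.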

📋 Requirements

OS: macOS (Apple Silicon M1/M2/M3 supported)
Python: 3.11+ (recommended 3.11.x)
Package Manager: Homebrew (for Python installation)
Disk Space: ~2-3GB for dependencies (OCR models, ML libraries)

🚀 Quick Start

1. Create Virtual Environment

cd /path/to/manga-translator

# Create venv with Python 3.11
/opt/homebrew/bin/python3.11 -m venv venv

# Activate environment
source venv/bin/activate

# Verify correct Python version
python -V
# Expected output: Python 3.11.x

2. Install Dependencies

# Upgrade pip and build tools
python -m pip install --upgrade pip setuptools wheel

# Install required packages
python -m pip install -r requirements.txt

# Optional: install PaddlePaddle for the PaddleOCR fallback
# (|| true lets setup continue if no compatible build is available)
python -m pip install paddlepaddle || true

3. Prepare Your Manga

Place manga page images in a directory (e.g., your-manga-series/)

📖 Usage

Single Page Translation

python manga-translator.py --input path/to/page.png --output output_dir/

Batch Processing Multiple Pages

bash batch-translate.sh input_folder/ output_folder/
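
For anyone who prefers to drive batches from Python, the sketch below builds one `manga-translator.py` command per page using the `--input` and `--output` flags shown above. It is not a reimplementation of `batch-translate.sh`; the list of image suffixes and the one-subfolder-per-page layout are assumptions.

```python
from pathlib import Path
from typing import List

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}  # assumed page formats


def build_commands(input_dir: str, output_dir: str) -> List[List[str]]:
    """Return one manga-translator.py command per image, sorted by filename."""
    commands = []
    for page in sorted(Path(input_dir).iterdir()):
        if page.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip non-image files
        # One output subfolder per page (an assumption) so that results
        # from different pages do not overwrite each other.
        page_out = Path(output_dir) / page.stem
        commands.append(["python", "manga-translator.py",
                         "--input", str(page), "--output", str(page_out)])
    return commands
```

Each command can then be executed with `subprocess.run(cmd, check=True)`, which stops the batch at the first failing page.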

Generate Rendered Output

python manga-renderer.py --bubbles bubbles.json --original input.png --output rendered.png

📂 Project Structure

manga-translator/
├── manga-translator.py       # Main OCR + translation pipeline
├── manga-renderer.py         # Visualization & debug rendering
├── batch-translate.sh        # Batch processing script
├── requirements.txt          # Python dependencies
│
├── fonts/                    # Custom fonts for rendering
├── pages-for-tests/          # Test data
│   └── translated/           # Sample outputs
│
├── Dandadan_059/             # Sample manga series
├── Spy_x_Family_076/         # Sample manga series
│
└── older-code/               # Legacy scripts & experiments

📤 Output Files

For each processed page, the pipeline generates:

bubbles.json – Structured metadata with bubble coordinates, text, and properties
output.txt – Translated text in reading order
debug_clusters.png – Debug overlay showing the detected bubbles and text clusters
rendered_output.png – Final rendered manga with translations overlaid
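
A quick way to confirm that a page was fully processed is to check for the four files listed above. This is a minimal sketch; it assumes all outputs for a page land in the same folder, which may not match how the scripts are actually invoked.

```python
from pathlib import Path
from typing import List

EXPECTED_OUTPUTS = ("bubbles.json", "output.txt",
                    "debug_clusters.png", "rendered_output.png")


def missing_outputs(output_dir: str) -> List[str]:
    """Return the expected output files that are absent from output_dir."""
    folder = Path(output_dir)
    return [name for name in EXPECTED_OUTPUTS if not (folder / name).is_file()]
```

An empty list means every expected file is present.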

🔧 Configuration

Key processing parameters (adjustable in manga-translator.py):

OCR Engine: EasyOCR with automatic fallback to PaddleOCR
Bubble Clustering: Adaptive threshold-based grouping
Text Preprocessing: Multi-pass noise reduction and enhancement
Translation Target: Configurable language (default: English)
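
The automatic fallback between OCR engines can be pictured as a simple try-in-order loop. The sketch below is engine-agnostic: each engine is any callable that takes an image path and returns a list of text lines. The `OcrEngine` type and `run_with_fallback` name are hypothetical and not taken from `manga-translator.py`.

```python
from typing import Callable, List, Optional, Sequence

OcrEngine = Callable[[str], List[str]]  # image path -> recognized text lines


def run_with_fallback(image_path: str, engines: Sequence[OcrEngine]) -> List[str]:
    """Try each engine in order and return the first non-empty result."""
    last_error: Optional[Exception] = None
    for engine in engines:
        try:
            lines = engine(image_path)
        except Exception as exc:  # e.g. engine not installed or model failed to load
            last_error = exc
            continue
        if lines:
            return lines
    if last_error is not None:
        # No engine produced text and at least one engine raised.
        raise RuntimeError("all OCR engines failed") from last_error
    return []  # engines ran but found no text
```

Treating an empty result as a reason to fall back is a design choice: it costs a second OCR pass on pages that really have no text, but it catches cases where the primary engine silently misses everything.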

🐛 Troubleshooting

"ModuleNotFoundError" Errors

# Ensure venv is activated
source venv/bin/activate

# Reinstall dependencies
python -m pip install -r requirements.txt --force-reinstall

OCR Accuracy Issues

Ensure images are high quality (300+ DPI recommended)
Check that manga is not rotated
Try adjusting clustering parameters in the code

Out of Memory Errors

Process pages in smaller batches
Reduce image resolution before processing
Check available RAM: vm_stat on macOS
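
Splitting a chapter into smaller batches can be done with a small helper like the one below. The function name and the batch size are illustrative; a suitable size depends on image resolution and available memory.

```python
from typing import Iterator, List, Sequence


def chunked(pages: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` pages."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(pages), size):
        yield list(pages[start:start + size])
```

For example, `chunked(page_paths, 5)` yields the pages five at a time, so each group can be processed and its results written out before the next one starts.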

Translation Issues

Verify internet connection (translations require API calls)
Check language codes in Deep Translator documentation
Test with a single page first
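
Because translation depends on network calls, it helps to isolate it from OCR when debugging. The sketch below translates lines through any callable and keeps the source text when a request fails, so a single network error does not lose the whole page. The `translate_lines` helper is hypothetical and not part of the pipeline.

```python
from typing import Callable, List


def translate_lines(lines: List[str], translate: Callable[[str], str]) -> List[str]:
    """Translate non-empty lines one by one, keeping order and blank lines."""
    results = []
    for line in lines:
        if not line.strip():
            results.append(line)  # nothing to translate
            continue
        try:
            results.append(translate(line))
        except Exception as exc:  # network or API failure
            # Keep the source text so one bad request does not lose the page.
            results.append(line)
            print(f"translation failed for {line!r}: {exc}")
    return results
```

With Deep Translator installed, `translate` could be `GoogleTranslator(source="auto", target="en").translate` from the `deep_translator` package; whether the pipeline uses that particular backend is an assumption.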

🛠️ Development

Running Tests

Test pages are in pages-for-tests/, with sample outputs in pages-for-tests/translated/. To run the pipeline on a test page:

python manga-translator.py --input pages-for-tests/example.png --output test-output/

Debugging

Enable verbose output by modifying the logging level in manga-translator.py
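
How logging is set up inside `manga-translator.py` is not shown here, so the following is only a sketch of one common pattern using the standard `logging` module. The logger name `manga_translator` and the `configure_logging` helper are hypothetical.

```python
import logging


def configure_logging(verbose: bool = False) -> logging.Logger:
    """Return a logger set to DEBUG when verbose, INFO otherwise."""
    logger = logging.getLogger("manga_translator")  # hypothetical logger name
    logger.setLevel(logging.DEBUG if verbose else logging.INFO)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(levelname)s %(name)s: %(message)s"))
        logger.addHandler(handler)
    return logger
```

Calling `configure_logging(verbose=True)` once at startup makes `logger.debug(...)` messages visible on stderr.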

📝 Notes

Processing time: ~10-30 seconds per page (varies by image size and hardware)
ML models are downloaded automatically on first run
GPU acceleration is optional and requires a compatible CUDA setup (CUDA is not available on Apple Silicon Macs, where processing runs on the CPU by default)
Tested on macOS 13+ with Python 3.11

🎨 Manga Translator OCR Pipeline