# 🎨 Manga Translator OCR Pipeline

An intelligent manga/comic OCR and translation pipeline designed for accurate text extraction and multi-language translation support. Optimized for macOS with Apple Silicon support.

---

## ✨ Key Features

- **Dual OCR Support**: EasyOCR (primary) with automatic fallback to PaddleOCR
- **Smart Bubble Detection**: Advanced speech bubble clustering with line-level precision
- **Robust Text Recognition**: Multi-pass preprocessing with rotation-based reread for accuracy
- **Intelligent Noise Filtering**: Removes debug artifacts, garbage tokens, and unwanted symbols
- **Reading Order Detection**: Automatic LTR/RTL detection for proper translation sequencing
- **Multi-Language Translation**: Powered by Deep Translator
- **Structured Output**: JSON metadata for bubble locations and properties
- **Visual Debugging**: Detailed debug overlays for quality control
- **Batch Processing**: Shell script support for processing multiple pages
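
To illustrate the reading-order feature, the sketch below sorts bubble bounding boxes row by row (top to bottom), and within each row right to left for RTL manga or left to right otherwise. The `Bubble` tuple, the `row_tolerance` parameter, and the row-grouping heuristic are assumptions made for this example; the pipeline's actual implementation may differ.

```python
from typing import List, NamedTuple


class Bubble(NamedTuple):
    """Hypothetical bubble record: top-left corner plus size, in pixels."""
    x: int
    y: int
    w: int
    h: int


def sort_reading_order(bubbles: List[Bubble], rtl: bool = True,
                       row_tolerance: int = 40) -> List[Bubble]:
    """Group bubbles into rows by vertical centre, then order each row."""
    rows: List[List[Bubble]] = []
    for b in sorted(bubbles, key=lambda b: b.y + b.h / 2):
        centre_y = b.y + b.h / 2
        # Join the current row if this centre is close to the row's first bubble.
        if rows and abs(centre_y - (rows[-1][0].y + rows[-1][0].h / 2)) <= row_tolerance:
            rows[-1].append(b)
        else:
            rows.append([b])
    ordered: List[Bubble] = []
    for row in rows:
        # RTL: rightmost bubble first; LTR: leftmost bubble first.
        ordered.extend(sorted(row, key=lambda b: b.x + b.w / 2, reverse=rtl))
    return ordered


# Two bubbles on the top row, one further down the page.
page = [Bubble(50, 20, 100, 60), Bubble(400, 30, 100, 60), Bubble(220, 300, 120, 80)]
rtl_order = sort_reading_order(page, rtl=True)
ltr_order = sort_reading_order(page, rtl=False)
print([b.x for b in rtl_order])
print([b.x for b in ltr_order])
```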

---

## 📋 Requirements

- **OS**: macOS (Apple Silicon M1/M2/M3 supported)
- **Python**: 3.11+ (recommended 3.11.x)
- **Package Manager**: Homebrew (for Python installation)
- **Disk Space**: ~2-3 GB for dependencies (OCR models, ML libraries)

---

## 🚀 Quick Start

### 1. **Create Virtual Environment**

```bash
cd /path/to/manga-translator

# Create venv with Python 3.11
/opt/homebrew/bin/python3.11 -m venv venv

# Activate environment
source venv/bin/activate

# Verify correct Python version
python -V
# Expected output: Python 3.11.x
```

### 2. **Install Dependencies**

```bash
# Upgrade pip and build tools
python -m pip install --upgrade pip setuptools wheel

# Install required packages
python -m pip install -r requirements.txt

# Optional: Install PaddleOCR fallback
python -m pip install paddlepaddle || true
```

### 3. **Prepare Your Manga**

Place manga page images in a directory (e.g., `your-manga-series/`).

---

## 📖 Usage

### Single Page Translation

```bash
python manga-translator.py --input path/to/page.png --output output_dir/
```

### Batch Processing Multiple Pages

```bash
bash batch-translate.sh input_folder/ output_folder/
```

### Generate Rendered Output

```bash
python manga-renderer.py --bubbles bubbles.json --original input.png --output rendered.png
```

---

## 📂 Project Structure

```
manga-translator/
├── manga-translator.py      # Main OCR + translation pipeline
├── manga-renderer.py        # Visualization & debug rendering
├── batch-translate.sh       # Batch processing script
├── requirements.txt         # Python dependencies
│
├── fonts/                   # Custom fonts for rendering
├── pages-for-tests/         # Test data
│   └── translated/          # Sample outputs
│
├── Dandadan_059/            # Sample manga series
├── Spy_x_Family_076/        # Sample manga series
│
└── older-code/              # Legacy scripts & experiments
```

---

## 📤 Output Files

For each processed page, the pipeline generates:

- **`bubbles.json`** – Structured metadata with bubble coordinates, text, and properties
- **`output.txt`** – Translated text in reading order
- **`debug_clusters.png`** – Visual overlay showing detected bubbles and processing
- **`rendered_output.png`** – Final rendered manga with translations overlaid
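
The exact schema of `bubbles.json` is not documented in this README, so the sketch below assumes a hypothetical layout: a list of objects with `id`, `box`, `text`, and `translation` keys. It only shows how such metadata could be consumed downstream; check a real output file and adjust the key names before relying on it.

```python
import json

# Hypothetical sample; the real bubbles.json schema may differ.
SAMPLE = """
[
  {"id": 1, "box": [400, 30, 100, 60], "text": "こんにちは", "translation": "Hello"},
  {"id": 2, "box": [50, 20, 100, 60], "text": "", "translation": ""}
]
"""


def load_translations(raw_json: str) -> list[tuple[int, str]]:
    """Return (id, translation) pairs, skipping bubbles with no translated text."""
    bubbles = json.loads(raw_json)
    return [(b["id"], b["translation"]) for b in bubbles if b.get("translation")]


translations = load_translations(SAMPLE)
for bubble_id, text in translations:
    print(f"bubble {bubble_id}: {text}")
```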

---

## 🔧 Configuration

Key processing parameters (adjustable in `manga-translator.py`):

- **OCR Engine**: EasyOCR with auto-fallback to PaddleOCR
- **Bubble Clustering**: Adaptive threshold-based grouping
- **Text Preprocessing**: Multi-pass noise reduction and enhancement
- **Translation Target**: Configurable language (default: English)
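
As a rough picture of what threshold-based grouping means, the sketch below merges text-line boxes into bubble clusters whenever the gap between two boxes is at most a threshold, and derives that threshold from the median line height. The function names, the `factor` parameter, and the median-height heuristic are assumptions for illustration, not the code in `manga-translator.py`.

```python
from statistics import median
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


def box_gap(a: Box, b: Box) -> int:
    """Largest axis-aligned gap between two boxes (0 if they touch or overlap)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return max(dx, dy)


def adaptive_threshold(boxes: List[Box], factor: float = 1.0) -> int:
    """Scale the merge distance with the median box height (hypothetical heuristic)."""
    return int(factor * median(b[3] - b[1] for b in boxes))


def cluster_boxes(boxes: List[Box], threshold: int) -> List[List[Box]]:
    """Group boxes whose gap is at most `threshold`, merging transitively."""
    clusters: List[List[Box]] = []
    for box in boxes:
        near: List[List[Box]] = []
        far: List[List[Box]] = []
        for cluster in clusters:
            close = any(box_gap(box, member) <= threshold for member in cluster)
            (near if close else far).append(cluster)
        # The new box joins, and thereby fuses, every cluster it is close to.
        merged = [box] + [member for cluster in near for member in cluster]
        clusters = far + [merged]
    return clusters


# Two stacked lines in one bubble, plus a separate line far to the right.
lines = [(10, 10, 100, 30), (10, 35, 100, 55), (300, 10, 380, 30)]
threshold = adaptive_threshold(lines)
clusters = cluster_boxes(lines, threshold)
print(threshold, [len(c) for c in clusters])
```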

---

## 🐛 Troubleshooting

### "ModuleNotFoundError" Errors

```bash
# Ensure venv is activated
source venv/bin/activate

# Reinstall dependencies
python -m pip install -r requirements.txt --force-reinstall
```

### OCR Accuracy Issues

- Ensure images are high quality (300+ DPI recommended)
- Check that manga is not rotated
- Try adjusting clustering parameters in the code

### Out of Memory Errors

- Process pages in smaller batches
- Reduce image resolution before processing
- Check available RAM: `vm_stat` on macOS
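
One way to process pages in smaller batches is a small driver script that calls the single-page command a few pages at a time. The sketch below is an assumption-laden example: `list_pages`, `in_batches`, `translate_batch`, and the set of image suffixes are hypothetical helpers, and only the `--input` / `--output` flags come from the usage section of this README. Nothing runs until `translate_batch` is called.

```python
import subprocess
import sys
from pathlib import Path
from typing import Iterator, List

# Assumed set of page image formats; adjust to match your files.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def list_pages(folder: Path) -> List[Path]:
    """Return image files in `folder`, sorted by name."""
    return sorted(p for p in folder.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)


def in_batches(items: List[Path], batch_size: int) -> Iterator[List[Path]]:
    """Yield consecutive slices of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def translate_batch(pages: List[Path], output_root: Path) -> None:
    """Run the single-page pipeline once per page, one output folder per page."""
    for page in pages:
        out_dir = output_root / page.stem
        out_dir.mkdir(parents=True, exist_ok=True)
        subprocess.run(
            [sys.executable, "manga-translator.py",
             "--input", str(page), "--output", str(out_dir)],
            check=True,  # stop on the first failing page
        )


# Example wiring (not executed here):
#   for batch in in_batches(list_pages(Path("input_folder")), batch_size=5):
#       translate_batch(batch, Path("output_folder"))
```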

### Translation Issues

- Verify internet connection (translations require API calls)
- Check language codes in Deep Translator documentation
- Test with a single page first

---

## 🛠️ Development

### Running Tests

Test data is available in `pages-for-tests/`, with sample outputs in `pages-for-tests/translated/`.

```bash
python manga-translator.py --input pages-for-tests/example.png --output test-output/
```

### Debugging

Enable verbose output by modifying the logging level in `manga-translator.py`.
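
If the script uses Python's standard `logging` module (an assumption; check how `manga-translator.py` actually sets up its output), raising verbosity could look like the sketch below. The `configure_logging` helper and the `manga_translator` logger name are hypothetical.

```python
import logging


def configure_logging(verbose: bool = False) -> logging.Logger:
    """Configure root logging and return a named logger (name is hypothetical)."""
    level = logging.DEBUG if verbose else logging.INFO
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        force=True,  # replace any handlers installed earlier
    )
    return logging.getLogger("manga_translator")


log = configure_logging(verbose=True)
log.debug("debug messages are now visible")
```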

---

## 📝 Notes

- Processing time: ~10-30 seconds per page (varies by image size and hardware)
- ML models are downloaded automatically on first run
- GPU acceleration is optional and requires a compatible CUDA setup (CUDA is not available on Apple Silicon Macs)
- Tested on macOS 13+ with Python 3.11