Added all

Guillem Hernandez Sola
2026-04-23 16:20:37 +02:00
parent 3ca01dae8c
commit 243e5bad47
5 changed files with 500 additions and 579 deletions

186
README.md

@@ -1,53 +1,185 @@
# Manga Translator OCR Pipeline
# 🎨 Manga Translator OCR Pipeline
A robust manga/comic OCR + translation pipeline with:
- EasyOCR (default, reliable on macOS M1)
- Optional PaddleOCR (auto-fallback if unavailable)
- Bubble clustering and line-level boxes
- Robust reread pass (multi-preprocessing + slight rotation)
- Translation export + debug overlays
A manga/comic OCR and translation pipeline for text extraction and multi-language translation. It targets macOS, including Apple Silicon.
---
## ✨ Features
## ✨ Key Features
- OCR from raw manga pages
- Noise filtering (`BOX` debug artifacts, tiny garbage tokens, symbols)
- Speech bubble grouping
- Reading order estimation (`ltr` / `rtl`)
- Translation output (`output.txt`)
- Structured bubble metadata (`bubbles.json`)
- Visual debug output (`debug_clusters.png`)
- **Dual OCR Support**: EasyOCR (primary) with automatic fallback to PaddleOCR
- **Smart Bubble Detection**: Advanced speech bubble clustering with line-level precision
- **Robust Text Recognition**: Multi-pass preprocessing with rotation-based reread for accuracy
- **Intelligent Noise Filtering**: Removes debug artifacts, garbage tokens, and unwanted symbols
- **Reading Order Detection**: Automatic LTR/RTL detection for proper translation sequencing
- **Multi-Language Translation**: Powered by Deep Translator
- **Structured Output**: JSON metadata for bubble locations and properties
- **Visual Debugging**: Detailed debug overlays for quality control
- **Batch Processing**: Shell script support for processing multiple pages
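
To make the reading-order feature concrete, here is a minimal sketch of one way to order bubbles for `ltr` and `rtl` pages. It is not taken from `manga-translator.py`: the `Bubble` class, the `sort_reading_order` function, and the `row_tolerance` default are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Bubble:
    text: str
    x: float  # left edge of the bubble box, in pixels (assumed layout)
    y: float  # top edge of the bubble box, in pixels (assumed layout)


def sort_reading_order(bubbles, direction="rtl", row_tolerance=40.0):
    """Group bubbles into rows by vertical position, then order each row.

    Rows run top to bottom. Within a row, bubbles run right to left for
    "rtl" and left to right for any other value.
    """
    rows = []
    for bubble in sorted(bubbles, key=lambda b: b.y):
        # A bubble joins the current row if it is vertically close to the
        # first bubble of that row; otherwise it starts a new row.
        if rows and abs(bubble.y - rows[-1][0].y) <= row_tolerance:
            rows[-1].append(bubble)
        else:
            rows.append([bubble])

    ordered = []
    for row in rows:
        ordered.extend(sorted(row, key=lambda b: b.x, reverse=(direction == "rtl")))
    return ordered
```

Real pages often need panel-aware ordering, so treat a row-based sort like this as a starting point only.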
---
## 🧰 Requirements
## 📋 Requirements
- macOS (Apple Silicon supported)
- Python **3.11** recommended
- Homebrew (for Python install)
- **OS**: macOS (Apple Silicon M1/M2/M3 supported)
- **Python**: 3.11+ (recommended 3.11.x)
- **Package Manager**: Homebrew (for Python installation)
- **Disk Space**: ~2-3GB for dependencies (OCR models, ML libraries)
---
## 🚀 Setup (Python 3.11 venv)
## 🚀 Quick Start
### 1. **Create Virtual Environment**
```bash
cd /path/to/manga-translator
# 1) Create venv with 3.11
# Create venv with Python 3.11
/opt/homebrew/bin/python3.11 -m venv venv
# 2) Activate
# Activate environment
source venv/bin/activate
# 3) Verify interpreter
# Verify correct Python version
python -V
# expected: Python 3.11.x
# Expected output: Python 3.11.x
```
# 4) Install dependencies
### 2. **Install Dependencies**
```bash
# Upgrade pip and build tools
python -m pip install --upgrade pip setuptools wheel
# Install required packages
python -m pip install -r requirements.txt
# Optional Paddle runtime
# Optional: install the Paddle runtime used by the PaddleOCR fallback
python -m pip install paddlepaddle || true
```
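
After installing, it can help to confirm which OCR backend is importable in the active venv. The sketch below only checks whether the `easyocr` and `paddleocr` modules can be found; `pick_ocr_engine` is a hypothetical helper, and the pipeline's own fallback logic may work differently.

```python
import importlib.util


def pick_ocr_engine(preferred=("easyocr", "paddleocr")):
    """Return the first importable top-level module name from `preferred`, or None."""
    for name in preferred:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is not None:
            return name
    return None


# Example: print(pick_ocr_engine() or "no OCR engine installed")
```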
### 3. **Prepare Your Manga**
Place your manga page images in a directory (e.g., `your-manga-series/`).
---
## 📖 Usage
### Single Page Translation
```bash
python manga-translator.py --input path/to/page.png --output output_dir/
```
### Batch Processing Multiple Pages
```bash
bash batch-translate.sh input_folder/ output_folder/
```
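
If you prefer to drive batches from Python, the following sketch loops over a folder and runs the single-page command shown above once per image. It is an assumption about what a batch run could look like, not a port of `batch-translate.sh`; the helper names, the list of image suffixes, and the one-output-folder-per-page layout are all hypothetical.

```python
import subprocess
import sys
from pathlib import Path

# Assumed set of page formats; adjust to match your files.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def list_pages(input_dir):
    """Return the image files directly inside `input_dir`, sorted by name."""
    return sorted(
        p for p in Path(input_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def build_command(page, output_root):
    """Build the single-page command, writing to one output folder per page."""
    out_dir = Path(output_root) / page.stem
    return [sys.executable, "manga-translator.py",
            "--input", str(page), "--output", str(out_dir)]


def translate_folder(input_dir, output_root):
    """Run the translator on every page; stop on the first failing page."""
    for page in list_pages(input_dir):
        cmd = build_command(page, output_root)
        Path(cmd[-1]).mkdir(parents=True, exist_ok=True)
        subprocess.run(cmd, check=True)


# Example: translate_folder("input_folder", "output_folder")
```

Using `check=True` makes a failed page raise immediately, which is usually easier to debug than a silent partial batch.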
### Generate Rendered Output
```bash
python manga-renderer.py --bubbles bubbles.json --original input.png --output rendered.png
```
---
## 📂 Project Structure
```
manga-translator/
├── manga-translator.py # Main OCR + translation pipeline
├── manga-renderer.py # Visualization & debug rendering
├── batch-translate.sh # Batch processing script
├── requirements.txt # Python dependencies
├── fonts/ # Custom fonts for rendering
├── pages-for-tests/ # Test data
│ └── translated/ # Sample outputs
├── Dandadan_059/ # Sample manga series
├── Spy_x_Family_076/ # Sample manga series
└── older-code/ # Legacy scripts & experiments
```
---
## 📤 Output Files
For each processed page, the pipeline generates:
- **`bubbles.json`**: Structured metadata with bubble coordinates, text, and properties
- **`output.txt`**: Translated text in reading order
- **`debug_clusters.png`**: Visual overlay showing detected bubbles and processing
- **`rendered_output.png`**: Final rendered manga with translations overlaid
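
The exact schema of `bubbles.json` is not documented here, so the sketch below deliberately assumes nothing about its keys. It only reports the top-level shape of the file, which is a quick way to see what the pipeline wrote before building anything on top of it. `summarize_json` is a hypothetical helper name.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Describe the top-level shape of a JSON file without assuming a schema."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        return {"type": "object", "count": len(data), "keys": sorted(data)[:10]}
    if isinstance(data, list):
        first = data[0] if data else None
        # If entries are objects, report the keys of the first one.
        keys = sorted(first) if isinstance(first, dict) else []
        return {"type": "array", "count": len(data), "keys": keys[:10]}
    return {"type": type(data).__name__, "count": 1, "keys": []}


# Example: print(summarize_json("output_dir/bubbles.json"))
```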
---
## 🔧 Configuration
Key processing parameters (adjustable in `manga-translator.py`):
- **OCR Engine**: EasyOCR with auto-fallback to PaddleOCR
- **Bubble Clustering**: Adaptive threshold-based grouping
- **Text Preprocessing**: Multi-pass noise reduction and enhancement
- **Translation Target**: Configurable language (default: English)
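
As an illustration of what "adaptive threshold-based grouping" can mean, here is a small sketch that merges text boxes whose gap is below a threshold derived from the median box height. This is a generic technique written for this README, not the algorithm in `manga-translator.py`; `box_gap`, `cluster_boxes`, the `(x1, y1, x2, y2)` box layout, and the `factor` default are all assumptions.

```python
from statistics import median


def box_gap(a, b):
    """Gap between two (x1, y1, x2, y2) boxes; 0 if they overlap on both axes."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return max(dx, dy)


def cluster_boxes(boxes, factor=1.5):
    """Group box indices whose gap is at most factor * median box height."""
    if not boxes:
        return []
    # The threshold adapts to the page: larger lettering allows larger gaps.
    threshold = factor * median(b[3] - b[1] for b in boxes)
    parent = list(range(len(boxes)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if box_gap(boxes[i], boxes[j]) <= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(boxes)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Raising `factor` merges more aggressively, which helps with widely spaced lines but risks joining neighbouring bubbles.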
---
## 🐛 Troubleshooting
### "ModuleNotFoundError" Errors
```bash
# Ensure venv is activated
source venv/bin/activate
# Reinstall dependencies
python -m pip install -r requirements.txt --force-reinstall
```
### OCR Accuracy Issues
- Ensure images are high quality (300+ DPI recommended)
- Check that the page image is not rotated
- Try adjusting clustering parameters in the code
### Out of Memory Errors
- Process pages in smaller batches
- Reduce image resolution before processing
- Check available RAM: `vm_stat` on macOS
### Translation Issues
- Verify internet connection (translations require API calls)
- Check language codes in Deep Translator documentation
- Test with a single page first
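
To isolate translation problems from OCR problems, you can translate a few lines directly. The sketch below assumes the `GoogleTranslator` backend from the `deep_translator` package; this README does not say which Deep Translator backend the pipeline uses, so treat that choice as an assumption. `translate_lines` is a hypothetical helper.

```python
def translate_lines(lines, target="en", translator=None):
    """Translate each non-blank line; blank lines pass through unchanged.

    `translator` can be any object with a `translate(text)` method. If it is
    omitted, a deep_translator GoogleTranslator is created (assumed backend).
    """
    if translator is None:
        # Third-party import; the real call needs an internet connection.
        from deep_translator import GoogleTranslator
        translator = GoogleTranslator(source="auto", target=target)
    return [translator.translate(line) if line.strip() else line for line in lines]


# Example: print(translate_lines(["こんにちは"], target="en"))
```

If this call fails on its own, the issue is likely the network or the language code rather than the OCR stage.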
---
## 🛠️ Development
### Running Tests
Test pages live in `pages-for-tests/`; sample outputs are in `pages-for-tests/translated/`.
```bash
python manga-translator.py --input pages-for-tests/example.png --output test-output/
```
### Debugging
Enable verbose output by modifying the logging level in `manga-translator.py`
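
A minimal sketch of switching log verbosity with Python's standard `logging` module follows. It assumes the script logs through `logging`; the `configure_logging` function and the `manga_translator` logger name are hypothetical and may not match what `manga-translator.py` actually uses.

```python
import logging


def configure_logging(verbose=False):
    """Set the root log level to DEBUG when verbose, INFO otherwise."""
    level = logging.DEBUG if verbose else logging.INFO
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        force=True,  # replace any handlers configured earlier (Python 3.8+)
    )
    # Hypothetical logger name; it inherits its level from the root logger.
    return logging.getLogger("manga_translator")


# Example:
# log = configure_logging(verbose=True)
# log.debug("OCR pass started")
```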
---
## 📝 Notes
- Processing time: ~10-30 seconds per page (varies by image size and hardware)
- ML models are downloaded automatically on first run
- GPU acceleration is optional and requires a compatible CUDA setup; CUDA is not available on Apple Silicon Macs
- Tested on macOS 13+ with Python 3.11