Added all

2026-04-23 16:20:37 +02:00
parent 3ca01dae8c
commit 243e5bad47
5 changed files with 500 additions and 579 deletions
--- a/README.md
+++ b/README.md
@@ -1,53 +1,185 @@
-# Manga Translator OCR Pipeline
+# 🎨 Manga Translator OCR Pipeline

-A robust manga/comic OCR + translation pipeline with:
-
- EasyOCR (default, reliable on macOS M1)
- Optional PaddleOCR (auto-fallback if unavailable)
- Bubble clustering and line-level boxes
- Robust reread pass (multi-preprocessing + slight rotation)
- Translation export + debug overlays
+An intelligent manga/comic OCR and translation pipeline designed for accurate text extraction and multi-language translation. Optimized for macOS, including Apple Silicon.

 ---

-## ✨ Features
+## ✨ Key Features

- OCR from raw manga pages
- Noise filtering (`BOX` debug artifacts, tiny garbage tokens, symbols)
- Speech bubble grouping
- Reading order estimation (`ltr` / `rtl`)
- Translation output (`output.txt`)
- Structured bubble metadata (`bubbles.json`)
- Visual debug output (`debug_clusters.png`)
+- **Dual OCR Support**: EasyOCR (primary) with automatic fallback to PaddleOCR
+- **Smart Bubble Detection**: Advanced speech bubble clustering with line-level precision
+- **Robust Text Recognition**: Multi-pass preprocessing with rotation-based reread for accuracy
+- **Intelligent Noise Filtering**: Removes debug artifacts, garbage tokens, and unwanted symbols
+- **Reading Order Detection**: Automatic LTR/RTL detection for proper translation sequencing
+- **Multi-Language Translation**: Powered by Deep Translator
+- **Structured Output**: JSON metadata for bubble locations and properties
+- **Visual Debugging**: Detailed debug overlays for quality control
+- **Batch Processing**: Shell script support for processing multiple pages

 ---

-## 🧰 Requirements
+## 📋 Requirements

- macOS (Apple Silicon supported)
- Python **3.11** recommended
- Homebrew (for Python install)
+- **OS**: macOS (Apple Silicon M1/M2/M3 supported)
+- **Python**: 3.11+ (recommended 3.11.x)
+- **Package Manager**: Homebrew (for Python installation)
+- **Disk Space**: ~2-3GB for dependencies (OCR models, ML libraries)

 ---

-## 🚀 Setup (Python 3.11 venv)
+## 🚀 Quick Start
+
+### 1. **Create Virtual Environment**

 ```bash
 cd /path/to/manga-translator

-# 1) Create venv with 3.11
+# Create venv with Python 3.11
 /opt/homebrew/bin/python3.11 -m venv venv

-# 2) Activate
+# Activate environment
 source venv/bin/activate

-# 3) Verify interpreter
+# Verify correct Python version
 python -V
-# expected: Python 3.11.x
+# Expected output: Python 3.11.x
+```

-# 4) Install dependencies
+### 2. **Install Dependencies**
+
+```bash
+# Upgrade pip and build tools
 python -m pip install --upgrade pip setuptools wheel
+
+# Install required packages
 python -m pip install -r requirements.txt

-# Optional Paddle runtime
+# Optional: install the Paddle runtime (needed for the PaddleOCR fallback)
 python -m pip install paddlepaddle || true
+```
+
+### 3. **Prepare Your Manga**
+
+Place manga page images in a directory (e.g., `your-manga-series/`).
+
+---
+
+## 📖 Usage
+
+### Single Page Translation
+
+```bash
+python manga-translator.py --input path/to/page.png --output output_dir/
+```
+
+### Batch Processing Multiple Pages
+
+```bash
+bash batch-translate.sh input_folder/ output_folder/
+```
+
+### Generate Rendered Output
+
+```bash
+python manga-renderer.py --bubbles bubbles.json --original input.png --output rendered.png
+```
+
+---
+
+## 📂 Project Structure
+
+```
+manga-translator/
+├── manga-translator.py       # Main OCR + translation pipeline
+├── manga-renderer.py         # Visualization & debug rendering
+├── batch-translate.sh        # Batch processing script
+├── requirements.txt          # Python dependencies
+│
+├── fonts/                    # Custom fonts for rendering
+├── pages-for-tests/          # Test data
+│   └── translated/           # Sample outputs
+│
+├── Dandadan_059/             # Sample manga series
+├── Spy_x_Family_076/         # Sample manga series
+│
+└── older-code/               # Legacy scripts & experiments
+```
+
+---
+
+## 📤 Output Files
+
+For each processed page, the pipeline generates:
+
+- **`bubbles.json`** – Structured metadata with bubble coordinates, text, and properties
+- **`output.txt`** – Translated text in reading order
+- **`debug_clusters.png`** – Debug overlay showing the detected bubble clusters
+- **`rendered_output.png`** – Final rendered manga with translations overlaid
+
+---
+
+## 🔧 Configuration
+
+Key processing parameters (adjustable in `manga-translator.py`):
+
+- **OCR Engine**: EasyOCR with auto-fallback to PaddleOCR
+- **Bubble Clustering**: Adaptive threshold-based grouping
+- **Text Preprocessing**: Multi-pass noise reduction and enhancement
+- **Translation Target**: Configurable language (default: English)
+
+---
+
+## 🐛 Troubleshooting
+
+### "ModuleNotFoundError" Errors
+
+```bash
+# Ensure venv is activated
+source venv/bin/activate
+
+# Reinstall dependencies
+python -m pip install -r requirements.txt --force-reinstall
+```
+
+### OCR Accuracy Issues
+
+- Ensure images are high quality (300+ DPI recommended)
+- Check that manga is not rotated
+- Try adjusting clustering parameters in the code
+
+### Out of Memory Errors
+
+- Process pages in smaller batches
+- Reduce image resolution before processing
+- Check available RAM: `vm_stat` on macOS
+
+### Translation Issues
+
+- Verify internet connection (translations require API calls)
+- Check language codes in Deep Translator documentation
+- Test with a single page first
+
+---
+
+## 🛠️ Development
+
+### Running Tests
+
+Test pages are available in `pages-for-tests/`; sample outputs are in `pages-for-tests/translated/`.
+
+```bash
+python manga-translator.py --input pages-for-tests/example.png --output test-output/
+```
+
+### Debugging
+
+Enable verbose output by lowering the logging level (for example, to `DEBUG`) in `manga-translator.py`.
+
+---
+
+## 📝 Notes
+
+- Processing time: ~10-30 seconds per page (varies by image size and hardware)
+- ML models are downloaded automatically on first run
+- GPU acceleration is optional and requires a CUDA-capable NVIDIA GPU; CUDA is not available on Apple Silicon Macs
+- Tested on macOS 13+ with Python 3.11