Added all
186
README.md
@@ -1,53 +1,185 @@
# Manga Translator OCR Pipeline
# 🎨 Manga Translator OCR Pipeline

A robust manga/comic OCR + translation pipeline with:

- EasyOCR (default, reliable on macOS M1)
- Optional PaddleOCR (auto-fallback if unavailable)
- Bubble clustering and line-level boxes
- Robust reread pass (multi-preprocessing + slight rotation)
- Translation export + debug overlays

An intelligent manga/comic OCR and translation pipeline designed for accurate text extraction and multi-language translation support. Optimized for macOS with Apple Silicon support.

---
## ✨ Features
## ✨ Key Features

- OCR from raw manga pages
- Noise filtering (`BOX` debug artifacts, tiny garbage tokens, symbols)
- Speech bubble grouping
- Reading order estimation (`ltr` / `rtl`)
- Translation output (`output.txt`)
- Structured bubble metadata (`bubbles.json`)
- Visual debug output (`debug_clusters.png`)
- **Dual OCR Support**: EasyOCR (primary) with automatic fallback to PaddleOCR
- **Smart Bubble Detection**: Advanced speech bubble clustering with line-level precision
- **Robust Text Recognition**: Multi-pass preprocessing with rotation-based reread for accuracy
- **Intelligent Noise Filtering**: Removes debug artifacts, garbage tokens, and unwanted symbols
- **Reading Order Detection**: Automatic LTR/RTL detection for proper translation sequencing
- **Multi-Language Translation**: Powered by Deep Translator
- **Structured Output**: JSON metadata for bubble locations and properties
- **Visual Debugging**: Detailed debug overlays for quality control
- **Batch Processing**: Shell script support for processing multiple pages
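The reading-order step listed above can be illustrated with a minimal sketch. It assumes each bubble is reduced to a bounding box in pixels and that bubbles are grouped into rows by vertical position before being ordered horizontally. The `Bubble` class, the `row_tolerance` parameter, and the `sort_bubbles` function are hypothetical names for illustration, not the pipeline's actual API.

```python
from dataclasses import dataclass


@dataclass
class Bubble:
    """Hypothetical bubble record: top-left corner plus size, in pixels."""
    x: int
    y: int
    w: int
    h: int
    text: str = ""


def sort_bubbles(bubbles, direction="rtl", row_tolerance=50):
    """Order bubbles top-to-bottom, then by x within each row.

    Bubbles whose top edges fall within `row_tolerance` pixels of the first
    bubble in a row are treated as the same row (an assumed heuristic).
    """
    if direction not in ("ltr", "rtl"):
        raise ValueError("direction must be 'ltr' or 'rtl'")

    rows = []
    for bubble in sorted(bubbles, key=lambda b: b.y):
        if rows and abs(bubble.y - rows[-1][0].y) <= row_tolerance:
            rows[-1].append(bubble)
        else:
            rows.append([bubble])

    ordered = []
    for row in rows:
        # rtl reads from the right edge of the page, so larger x comes first.
        ordered.extend(sorted(row, key=lambda b: b.x, reverse=(direction == "rtl")))
    return ordered


if __name__ == "__main__":
    page = [Bubble(100, 40, 80, 60, "B"), Bubble(600, 30, 80, 60, "A"),
            Bubble(350, 400, 80, 60, "C")]
    print([b.text for b in sort_bubbles(page, "rtl")])
```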

---

## 🧰 Requirements
## 📋 Requirements

- macOS (Apple Silicon supported)
- Python **3.11** recommended
- Homebrew (for Python install)
- **OS**: macOS (Apple Silicon M1/M2/M3 supported)
- **Python**: 3.11+ (recommended 3.11.x)
- **Package Manager**: Homebrew (for Python installation)
- **Disk Space**: ~2-3GB for dependencies (OCR models, ML libraries)

---
## 🚀 Setup (Python 3.11 venv)
## 🚀 Quick Start

### 1. **Create Virtual Environment**

```bash
cd /path/to/manga-translator

# 1) Create venv with 3.11
# Create venv with Python 3.11
/opt/homebrew/bin/python3.11 -m venv venv

# 2) Activate
# Activate environment
source venv/bin/activate

# 3) Verify interpreter
# Verify correct Python version
python -V
# expected: Python 3.11.x
# Expected output: Python 3.11.x
```

# 4) Install dependencies
### 2. **Install Dependencies**

```bash
# Upgrade pip and build tools
python -m pip install --upgrade pip setuptools wheel

# Install required packages
python -m pip install -r requirements.txt

# Optional Paddle runtime
# Optional: install the Paddle runtime needed by the PaddleOCR fallback (a failed install is ignored)
python -m pip install paddlepaddle || true
```
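The optional install above relates to the fallback behaviour described under Key Features. As a rough sketch of how an availability check could work (not the project's actual logic), the standard library's `importlib.util.find_spec` reports whether a package is importable without importing it. The `pick_engine` helper and the preference order are assumptions; `easyocr` and `paddleocr` are the import names of the respective packages.

```python
from importlib.util import find_spec


def is_installed(module_name):
    """Return True if `module_name` can be imported in this environment."""
    return find_spec(module_name) is not None


def pick_engine(preferred=("easyocr", "paddleocr"), installed=is_installed):
    """Return the first importable OCR package from `preferred`.

    `installed` is injectable so the selection logic can be exercised
    without the packages being present.
    """
    for name in preferred:
        if installed(name):
            return name
    raise RuntimeError("No OCR engine found; run: pip install -r requirements.txt")


if __name__ == "__main__":
    try:
        print("Using OCR engine:", pick_engine())
    except RuntimeError as err:
        print(err)
```

Running this inside the activated venv gives a quick check that at least one engine installed correctly before processing any pages.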

### 3. **Prepare Your Manga**

Place manga page images in a directory (e.g., `your-manga-series/`).

---
## 📖 Usage

### Single Page Translation

```bash
python manga-translator.py --input path/to/page.png --output output_dir/
```

### Batch Processing Multiple Pages

```bash
bash batch-translate.sh input_folder/ output_folder/
```
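Where a shell script is inconvenient, the batch step can be approximated in Python by invoking the single-page command once per image. This is a sketch, not a port of `batch-translate.sh` (whose contents are not shown here); the per-page output layout and the set of image extensions are assumptions.

```python
import subprocess
import sys
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}  # assumed set of page formats


def build_commands(input_dir, output_dir):
    """Build one single-page command per image, in filename order."""
    commands = []
    for page in sorted(Path(input_dir).iterdir()):
        if page.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # Assumed layout: one output sub-directory per page, named after the file.
        page_out = Path(output_dir) / page.stem
        commands.append([sys.executable, "manga-translator.py",
                         "--input", str(page), "--output", str(page_out)])
    return commands


def run_batch(input_dir, output_dir):
    """Run the commands sequentially and return the pages that failed."""
    failed = []
    for cmd in build_commands(input_dir, output_dir):
        result = subprocess.run(cmd)
        if result.returncode != 0:
            failed.append(cmd[3])  # index 3 holds the input page path
    return failed


if __name__ == "__main__" and len(sys.argv) == 3:
    print("Failed pages:", run_batch(sys.argv[1], sys.argv[2]))
```

Building the command list separately from running it makes it easy to print the commands first as a dry run.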

### Generate Rendered Output

```bash
python manga-renderer.py --bubbles bubbles.json --original input.png --output rendered.png
```

---
## 📂 Project Structure

```
manga-translator/
├── manga-translator.py     # Main OCR + translation pipeline
├── manga-renderer.py       # Visualization & debug rendering
├── batch-translate.sh      # Batch processing script
├── requirements.txt        # Python dependencies
│
├── fonts/                  # Custom fonts for rendering
├── pages-for-tests/        # Test data
│   └── translated/         # Sample outputs
│
├── Dandadan_059/           # Sample manga series
├── Spy_x_Family_076/       # Sample manga series
│
└── older-code/             # Legacy scripts & experiments
```

---
## 📤 Output Files

For each processed page, the pipeline generates:

- **`bubbles.json`** – Structured metadata with bubble coordinates, text, and properties
- **`output.txt`** – Translated text in reading order
- **`debug_clusters.png`** – Debug overlay showing the detected bubble clusters
- **`rendered_output.png`** – Final rendered manga with translations overlaid
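Because the schema of `bubbles.json` is not documented here, a quick way to explore it is to load the file and summarize its top-level shape. The sketch below makes no assumption about field names; `describe_json` is a hypothetical helper.

```python
import json
from pathlib import Path


def describe_json(data, max_keys=10):
    """Return a one-line summary of a parsed JSON value's top-level shape."""
    if isinstance(data, dict):
        keys = list(data)[:max_keys]
        return f"object with {len(data)} keys, e.g. {keys}"
    if isinstance(data, list):
        inner = type(data[0]).__name__ if data else "nothing"
        return f"array of {len(data)} items, first item is {inner}"
    return f"scalar {type(data).__name__}"


if __name__ == "__main__":
    path = Path("bubbles.json")
    if path.exists():
        print(describe_json(json.loads(path.read_text(encoding="utf-8"))))
    else:
        print("bubbles.json not found; run the pipeline first")
```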

---

## 🔧 Configuration

Key processing parameters (adjustable in `manga-translator.py`):

- **OCR Engine**: EasyOCR with auto-fallback to Manga-OCR
- **Bubble Clustering**: Adaptive threshold-based grouping
- **Text Preprocessing**: Multi-pass noise reduction and enhancement
- **Translation Target**: Configurable language (default: English)

---
## 🐛 Troubleshooting

### "ModuleNotFoundError" Errors

```bash
# Ensure venv is activated
source venv/bin/activate

# Reinstall dependencies
python -m pip install -r requirements.txt --force-reinstall
```

### OCR Accuracy Issues

- Ensure images are high quality (300+ DPI recommended)
- Check that the page image is not rotated
- Try adjusting the clustering parameters in `manga-translator.py`

### Out of Memory Errors

- Process pages in smaller batches
- Reduce image resolution before processing
- Check available RAM: `vm_stat` on macOS
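For the "reduce image resolution" suggestion, the target size can be computed so that the longer side is capped while the aspect ratio is preserved. The sketch below is pure arithmetic; the `max_side` default of 2000 pixels is an arbitrary assumption, and `scaled_size` is a hypothetical helper.

```python
def scaled_size(width, height, max_side=2000):
    """Return (width, height) scaled down so the longer side is <= max_side.

    Images already within the limit are returned unchanged (never upscaled).
    """
    if width <= 0 or height <= 0 or max_side <= 0:
        raise ValueError("dimensions must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    # Round to the nearest pixel, but never collapse a side to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    print(scaled_size(3000, 4500))  # a tall page capped at 2000 px
```

The resulting size would then be passed to whatever image library is in use for the actual resize.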

### Translation Issues

- Verify internet connection (translations require API calls)
- Check language codes in Deep Translator documentation
- Test with a single page first
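Since translation depends on network calls, transient failures can be absorbed with a small retry wrapper around whatever translate function is in use. In the sketch below, `translate_fn` is any callable that takes and returns a string (for example, the `translate` method of a Deep Translator translator object); the function name, attempt count, and backoff values are assumptions.

```python
import time


def translate_with_retry(translate_fn, text, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call translate_fn(text), retrying with exponential backoff on failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return translate_fn(text)
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky(text):
        """Stand-in translator that fails once, then succeeds."""
        calls["n"] += 1
        if calls["n"] == 1:
            raise ConnectionError("simulated network error")
        return text.upper()

    print(translate_with_retry(flaky, "hello", sleep=lambda s: None))
```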

---

## 🛠️ Development

### Running Tests

Test pages are available in `pages-for-tests/`, with sample outputs in `pages-for-tests/translated/`.

```bash
python manga-translator.py --input pages-for-tests/example.png --output test-output/
```

### Debugging

Enable verbose output by modifying the logging level in `manga-translator.py`.
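Assuming the script uses Python's standard `logging` module (not confirmed by this README), raising verbosity typically means lowering the threshold to `DEBUG`. The `configure_logging` helper and the logger name `manga_translator` below are hypothetical.

```python
import logging


def configure_logging(verbose=False):
    """Set the root logging threshold: DEBUG when verbose, otherwise INFO."""
    level = logging.DEBUG if verbose else logging.INFO
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        force=True,  # replace any handlers configured earlier (Python 3.8+)
    )
    return level


if __name__ == "__main__":
    configure_logging(verbose=True)
    log = logging.getLogger("manga_translator")  # hypothetical logger name
    log.debug("debug messages are now visible")
```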

---

## 📝 Notes

- Processing time: ~10-30 seconds per page (varies by image size and hardware)
- ML models are downloaded automatically on first run
- GPU acceleration is optional and requires a compatible CUDA setup (CUDA is not available on Apple Silicon Macs)
- Tested on macOS 13+ with Python 3.11
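As a rough planning aid, the per-page range above translates into a batch estimate by simple multiplication. The sketch assumes pages are processed one after another and ignores the one-time model download; `estimate_batch_minutes` is a hypothetical helper.

```python
def estimate_batch_minutes(pages, min_seconds=10, max_seconds=30):
    """Return (low, high) estimated minutes for processing `pages` sequentially."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return pages * min_seconds / 60, pages * max_seconds / 60


if __name__ == "__main__":
    low, high = estimate_batch_minutes(20)
    print(f"20 pages: about {low:.1f} to {high:.1f} minutes")
```

Actual times depend on image size and hardware, as noted above, so the result is only a ballpark figure.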