# Manga Translator OCR Pipeline

A robust manga/comic OCR + translation pipeline with:

- EasyOCR (default, reliable on macOS M1)
- Optional PaddleOCR (auto-fallback if unavailable)
- Bubble clustering and line-level boxes
- Robust reread pass (multi-preprocessing + slight rotation)
- Translation export + debug overlays
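
The optional-engine behaviour listed above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the engine names, the `ENGINE_MODULES` mapping, and the `choose_engine` helper are all hypothetical, and it assumes the fallback target is EasyOCR.

```python
from importlib.util import find_spec

# Hypothetical engine names mapped to the module that must be importable.
ENGINE_MODULES = {"easyocr": "easyocr", "paddle": "paddleocr"}


def is_available(engine: str) -> bool:
    """Return True if the module backing `engine` can be found."""
    return find_spec(ENGINE_MODULES[engine]) is not None


def choose_engine(preferred: str = "easyocr", available=is_available) -> str:
    """Pick `preferred` if it is known and installed, else fall back to EasyOCR.

    Assumption: EasyOCR is always the fallback, since it is the default engine.
    """
    if preferred in ENGINE_MODULES and available(preferred):
        return preferred
    return "easyocr"
```

Checking for the module with `find_spec` avoids importing a heavy OCR library just to learn whether it is installed; the `available` parameter exists only so the choice can be tested without either library present.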

---

## ✨ Features

- OCR from raw manga pages
- Noise filtering (`BOX` debug artifacts, tiny garbage tokens, symbols)
- Speech bubble grouping
- Reading order estimation (`ltr` / `rtl`)
- Translation output (`output.txt`)
- Structured bubble metadata (`bubbles.json`)
- Visual debug output (`debug_clusters.png`)
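
To make the noise-filtering feature concrete, here is a small sketch of one way such a filter might work. The rules, the `min_len` threshold, and the function names are assumptions for illustration only; the pipeline's real heuristics may differ.

```python
import re

# Assumption: a token carries content if it has at least one ASCII letter,
# digit, kana, or common CJK ideograph.
_HAS_CONTENT = re.compile(r"[0-9A-Za-z\u3040-\u30ff\u4e00-\u9fff]")


def is_noise(token: str, min_len: int = 2) -> bool:
    """Return True for `BOX` debug labels, tiny tokens, and symbol-only tokens."""
    text = token.strip()
    if text.upper() == "BOX":      # leftover debug artifact
        return True
    if len(text) < min_len:        # tiny garbage token (threshold is a guess)
        return True
    return _HAS_CONTENT.search(text) is None  # symbols / punctuation only


def filter_tokens(tokens):
    """Keep only tokens that are not classified as noise."""
    return [t for t in tokens if not is_noise(t)]
```

Note that a fixed `min_len` of 2 would also drop legitimate single-character Japanese interjections, so a real filter would likely need a language-aware threshold.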
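
Reading order estimation can be illustrated with a simple row-based sort. This is a sketch under stated assumptions, not the project's algorithm: boxes are `(x_min, y_min, x_max, y_max)` tuples in pixels, `reading_order` and `row_tol` are hypothetical names, and bubbles are grouped into rows purely by vertical center.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def reading_order(boxes: List[Box], direction: str = "rtl",
                  row_tol: float = 50.0) -> List[int]:
    """Return box indices ordered top-to-bottom, then by `direction` within a row."""
    if direction not in ("ltr", "rtl"):
        raise ValueError("direction must be 'ltr' or 'rtl'")
    centers = [((x0 + x1) / 2, (y0 + y1) / 2) for x0, y0, x1, y1 in boxes]

    # Walk boxes from top to bottom; a box joins the current row when its
    # center is within row_tol pixels of that row's first (topmost) box.
    by_y = sorted(range(len(boxes)), key=lambda i: centers[i][1])
    rows: List[List[int]] = []
    for i in by_y:
        if rows and abs(centers[i][1] - centers[rows[-1][0]][1]) <= row_tol:
            rows[-1].append(i)
        else:
            rows.append([i])

    # Within each row, sort by horizontal center; reverse for right-to-left.
    order: List[int] = []
    for row in rows:
        order.extend(sorted(row, key=lambda i: centers[i][0],
                            reverse=(direction == "rtl")))
    return order
```

For example, with two bubbles side by side at the top of a page and one below, `rtl` visits the top-right bubble first, while `ltr` visits the top-left one first.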

---

## 🧰 Requirements

- macOS (Apple Silicon supported)
- Python **3.11** recommended
- Homebrew (for Python install)

---

## 🚀 Setup (Python 3.11 venv)

```bash
cd /path/to/manga-translator

# 1) Create venv with 3.11
/opt/homebrew/bin/python3.11 -m venv venv

# 2) Activate
source venv/bin/activate

# 3) Verify interpreter
python -V
# expected: Python 3.11.x

# 4) Install dependencies
python -m pip install --upgrade pip setuptools wheel
python -m pip install -r requirements.txt

# Optional Paddle runtime (`|| true` lets setup continue if the install fails)
python -m pip install paddlepaddle || true