Compare commits

15 commits: `f00647e668` ... `main`

| Author | SHA1 | Date |
|---|---|---|
| | 37bdc25bf6 | |
| | 2f61814971 | |
| | 853d497559 | |
| | 243e5bad47 | |
| | 3ca01dae8c | |
| | 037dadd920 | |
| | 285e9ca393 | |
| | d77db83cfe | |
| | b730037a06 | |
| | 7837aeaa9b | |
| | 455b4ad82c | |
| | b6b0df4774 | |
| | 512bb32f66 | |
| | 494631c967 | |
| | 27a3e6f98a | |
.gitignore (vendored) — 5 lines changed

```diff
@@ -9,6 +9,11 @@
 .venv311/
 
+#Folders to test
+Spy_x_Family_076/
+Dandadan_059/
+Lv999/
+
 # Icon must end with two \r
 Icon
```
README.md — 186 lines changed (@@ -1,53 +1,185 @@), new version:

# 🎨 Manga Translator OCR Pipeline

An intelligent manga/comic OCR and translation pipeline designed for accurate text extraction and multi-language translation support. Optimized for macOS with Apple Silicon support.

---

## ✨ Key Features

- **Dual OCR Support**: EasyOCR (primary) with automatic fallback to PaddleOCR
- **Smart Bubble Detection**: Advanced speech bubble clustering with line-level precision
- **Robust Text Recognition**: Multi-pass preprocessing with rotation-based reread for accuracy
- **Intelligent Noise Filtering**: Removes debug artifacts, garbage tokens, and unwanted symbols
- **Reading Order Detection**: Automatic LTR/RTL detection for proper translation sequencing
- **Multi-Language Translation**: Powered by Deep Translator
- **Structured Output**: JSON metadata for bubble locations and properties
- **Visual Debugging**: Detailed debug overlays for quality control
- **Batch Processing**: Shell script support for processing multiple pages

---

## 📋 Requirements

- **OS**: macOS (Apple Silicon M1/M2/M3 supported)
- **Python**: 3.11+ (3.11.x recommended)
- **Package Manager**: Homebrew (for Python installation)
- **Disk Space**: ~2–3 GB for dependencies (OCR models, ML libraries)

---

## 🚀 Quick Start

### 1. **Create Virtual Environment**

```bash
cd /path/to/manga-translator

# Create venv with Python 3.11
/opt/homebrew/bin/python3.11 -m venv venv

# Activate environment
source venv/bin/activate

# Verify correct Python version
python -V
# Expected output: Python 3.11.x
```

### 2. **Install Dependencies**

```bash
# Upgrade pip and build tools
python -m pip install --upgrade pip setuptools wheel

# Install required packages
python -m pip install -r requirements.txt

# Optional: install the PaddleOCR fallback
python -m pip install paddlepaddle || true
```

### 3. **Prepare Your Manga**

Place manga page images in a directory (e.g., `your-manga-series/`).

---

## 📖 Usage

### Single Page Translation

```bash
python manga-translator.py --input path/to/page.png --output output_dir/
```

### Batch Processing Multiple Pages

```bash
bash batch-translate.sh input_folder/ output_folder/
```

### Generate Rendered Output

```bash
python manga-renderer.py --bubbles bubbles.json --original input.png --output rendered.png
```

---

## 📂 Project Structure

```
manga-translator/
├── manga-translator.py    # Main OCR + translation pipeline
├── manga-renderer.py      # Visualization & debug rendering
├── batch-translate.sh     # Batch processing script
├── requirements.txt       # Python dependencies
│
├── fonts/                 # Custom fonts for rendering
├── pages-for-tests/       # Test data
│   └── translated/        # Sample outputs
│
├── Dandadan_059/          # Sample manga series
├── Spy_x_Family_076/      # Sample manga series
│
└── older-code/            # Legacy scripts & experiments
```

---

## 📤 Output Files

For each processed page, the pipeline generates:

- **`bubbles.json`** – Structured metadata with bubble coordinates, text, and properties
- **`output.txt`** – Translated text in reading order
- **`debug_clusters.png`** – Visual overlay showing detected bubbles and processing
- **`rendered_output.png`** – Final rendered manga with translations overlaid

---

## 🔧 Configuration

Key processing parameters (adjustable in `manga-translator.py`):

- **OCR Engine**: EasyOCR with auto-fallback to Manga-OCR
- **Bubble Clustering**: Adaptive threshold-based grouping
- **Text Preprocessing**: Multi-pass noise reduction and enhancement
- **Translation Target**: Configurable language (default: English)

---

## 🐛 Troubleshooting

### "ModuleNotFoundError" Errors

```bash
# Ensure the venv is activated
source venv/bin/activate

# Reinstall dependencies
python -m pip install -r requirements.txt --force-reinstall
```

### OCR Accuracy Issues

- Ensure images are high quality (300+ DPI recommended)
- Check that pages are not rotated
- Try adjusting clustering parameters in the code

### Out of Memory Errors

- Process pages in smaller batches
- Reduce image resolution before processing
- Check available RAM: `vm_stat` on macOS

### Translation Issues

- Verify your internet connection (translations require API calls)
- Check language codes in the Deep Translator documentation
- Test with a single page first

---

## 🛠️ Development

### Running Tests

Test data is available in `pages-for-tests/translated/`:

```bash
python manga-translator.py --input pages-for-tests/example.png --output test-output/
```

### Debugging

Enable verbose output by modifying the logging level in `manga-translator.py`.

---

## 📝 Notes

- Processing time: ~10–30 seconds per page (varies by image size and hardware)
- ML models are downloaded automatically on first run
- GPU acceleration is available with a compatible CUDA setup (optional)
- Tested on macOS 13+ with Python 3.11
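The README lists `bubbles.json` among the per-page outputs. A minimal sketch of consuming it in reading order — assuming each entry carries an `order` field and a `corrected_ocr` string, as the `draw_debug_json.py` helper in this diff expects (the sample data here is hypothetical):

```python
import json


def texts_in_reading_order(bubbles: dict) -> list:
    """Return bubble texts sorted by their 'order' field.

    Assumes the bubbles.json layout used by draw_debug_json.py in this
    repo: a dict of entries, each with 'order' and 'corrected_ocr'.
    """
    items = sorted(bubbles.values(), key=lambda b: b.get("order", 0))
    return [b.get("corrected_ocr", "") for b in items]


# Hypothetical two-bubble page for illustration
sample = {
    "b1": {"order": 2, "corrected_ocr": "Let's go!"},
    "b0": {"order": 1, "corrected_ocr": "Ready?"},
}
print(texts_in_reading_order(sample))
```

In practice the dict would come from `json.load(open("bubbles.json"))` rather than a literal.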
batch-translate.sh — new executable file, 269 lines (@@ -0,0 +1,269 @@):

```bash
#!/usr/bin/env bash
# ============================================================
# batch-translate.sh
# Batch manga OCR + translation for all images in a folder.
#
# Usage:
#   ./batch-translate.sh <folder>
#   ./batch-translate.sh <folder> --source en --target es
#   ./batch-translate.sh <folder> --start 3 --end 7
#   ./batch-translate.sh <folder> -s en -t fr --start 2
#
# Output per page lands in:
#   <folder>/translated/<page_stem>/
#     ├── bubbles.json
#     ├── output.txt
#     └── debug_clusters.png
# ============================================================

set -uo pipefail

# ─────────────────────────────────────────────────────────────
# CONFIGURATION
# ─────────────────────────────────────────────────────────────
SOURCE_LANG="en"
TARGET_LANG="ca"
START_PAGE=1
END_PAGE=999999
PYTHON_BIN="python"
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
TRANSLATOR="${SCRIPT_DIR}/manga-translator.py"

# ─────────────────────────────────────────────────────────────
# COLOURS
# ─────────────────────────────────────────────────────────────
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
CYAN='\033[0;36m'
BOLD='\033[1m'
RESET='\033[0m'

# ─────────────────────────────────────────────────────────────
# HELPERS
# ─────────────────────────────────────────────────────────────
usage() {
    echo ""
    echo -e "${BOLD}Usage:${RESET}"
    echo "  $0 <folder> [options]"
    echo ""
    echo -e "${BOLD}Options:${RESET}"
    echo "  --source, -s   Source language code (default: en)"
    echo "  --target, -t   Target language code (default: ca)"
    echo "  --start        First page number (default: 1)"
    echo "  --end          Last page number (default: all)"
    echo "  --python       Python binary (default: python)"
    echo "  --help, -h     Show this help"
    echo ""
    echo -e "${BOLD}Examples:${RESET}"
    echo "  $0 pages-for-tests"
    echo "  $0 pages-for-tests --source en --target es"
    echo "  $0 pages-for-tests --start 3 --end 7"
    echo "  $0 pages-for-tests -s en -t fr --start 2"
    echo ""
}

log_info()  { echo -e "${CYAN}ℹ️  $*${RESET}"; }
log_ok()    { echo -e "${GREEN}✅ $*${RESET}"; }
log_warn()  { echo -e "${YELLOW}⚠️  $*${RESET}"; }
log_error() { echo -e "${RED}❌ $*${RESET}"; }
log_section() {
    echo -e "\n${BOLD}${CYAN}══════════════════════════════════════════${RESET}"
    echo -e "${BOLD}${CYAN}  📖 $*${RESET}"
    echo -e "${BOLD}${CYAN}══════════════════════════════════════════${RESET}"
}

# ─────────────────────────────────────────────────────────────
# ARGUMENT PARSING
# ─────────────────────────────────────────────────────────────
if [[ $# -eq 0 ]]; then
    log_error "No folder specified."
    usage
    exit 1
fi

FOLDER="$1"
shift

while [[ $# -gt 0 ]]; do
    case "$1" in
        --source|-s) SOURCE_LANG="$2"; shift 2 ;;
        --target|-t) TARGET_LANG="$2"; shift 2 ;;
        --start)     START_PAGE="$2"; shift 2 ;;
        --end)       END_PAGE="$2"; shift 2 ;;
        --python)    PYTHON_BIN="$2"; shift 2 ;;
        --help|-h)   usage; exit 0 ;;
        *)
            log_error "Unknown option: $1"
            usage
            exit 1
            ;;
    esac
done

# ─────────────────────────────────────────────────────────────
# VALIDATION
# ─────────────────────────────────────────────────────────────
if [[ ! -d "$FOLDER" ]]; then
    log_error "Folder not found: $FOLDER"
    exit 1
fi

if [[ ! -f "$TRANSLATOR" ]]; then
    log_error "manga-translator.py not found at: $TRANSLATOR"
    exit 1
fi

if ! command -v "$PYTHON_BIN" &>/dev/null; then
    log_error "Python binary not found: $PYTHON_BIN"
    log_error "Try --python python3"
    exit 1
fi

# ─────────────────────────────────────────────────────────────
# PURGE BYTECODE CACHE
# ─────────────────────────────────────────────────────────────
log_info "🗑️  Purging Python bytecode caches..."
find "${SCRIPT_DIR}" -type d -name __pycache__ -exec rm -rf {} + 2>/dev/null || true
log_ok "Cache cleared."

# ─────────────────────────────────────────────────────────────
# DISCOVER IMAGES
# NOTE: uses a while-read loop instead of mapfile for Bash 3.2
# compatibility (macOS default shell)
# ─────────────────────────────────────────────────────────────
ALL_IMAGES=()
while IFS= read -r -d '' img; do
    ALL_IMAGES+=("$img")
done < <(
    find "$FOLDER" -maxdepth 1 -type f \
        \( -iname "*.jpg" -o -iname "*.jpeg" \
           -o -iname "*.png" -o -iname "*.webp" \) \
        -print0 | sort -z
)

TOTAL=${#ALL_IMAGES[@]}

if [[ $TOTAL -eq 0 ]]; then
    log_error "No image files found in: $FOLDER"
    exit 1
fi

# ─────────────────────────────────────────────────────────────
# SLICE TO REQUESTED PAGE RANGE (1-based)
# ─────────────────────────────────────────────────────────────
PAGES=()
for i in "${!ALL_IMAGES[@]}"; do
    PAGE_NUM=$(( i + 1 ))
    if [[ $PAGE_NUM -ge $START_PAGE && $PAGE_NUM -le $END_PAGE ]]; then
        PAGES+=("${ALL_IMAGES[$i]}")
    fi
done

if [[ ${#PAGES[@]} -eq 0 ]]; then
    log_error "No pages in range [${START_PAGE}, ${END_PAGE}] (total: ${TOTAL})"
    exit 1
fi

# ─────────────────────────────────────────────────────────────
# SUMMARY HEADER
# ─────────────────────────────────────────────────────────────
log_section "BATCH MANGA TRANSLATOR"
log_info "📂 Folder : $(realpath "$FOLDER")"
log_info "📄 Pages  : ${#PAGES[@]} of ${TOTAL} total"
log_info "🔢 Range  : ${START_PAGE} → ${END_PAGE}"
log_info "🌐 Source : ${SOURCE_LANG}"
log_info "🎯 Target : ${TARGET_LANG}"
log_info "💾 Output : ${FOLDER}/translated/<page>/"
echo ""

# ─────────────────────────────────────────────────────────────
# PROCESS EACH PAGE
# ─────────────────────────────────────────────────────────────
PASS=0
FAIL=0
FAIL_LIST=()

for i in "${!PAGES[@]}"; do
    IMAGE="${PAGES[$i]}"
    PAGE_NUM=$(( START_PAGE + i ))
    STEM="$(basename "${IMAGE%.*}")"
    WORKDIR="${FOLDER}/translated/${STEM}"

    echo ""
    echo -e "${BOLD}──────────────────────────────────────────${RESET}"
    echo -e "${BOLD} 🖼️  [${PAGE_NUM}/${TOTAL}] ${STEM}${RESET}"
    echo -e "${BOLD}──────────────────────────────────────────${RESET}"

    mkdir -p "$WORKDIR"

    OUTPUT_JSON="${WORKDIR}/bubbles.json"
    OUTPUT_TXT="${WORKDIR}/output.txt"
    OUTPUT_DEBUG="${WORKDIR}/debug_clusters.png"

    log_info "🗂️  Image : $(basename "$IMAGE")"
    log_info "📁 Out   : ${WORKDIR}"

    # ── Run the translator ────────────────────────────────────
    if "$PYTHON_BIN" "$TRANSLATOR" \
        "$IMAGE" \
        --source "$SOURCE_LANG" \
        --target "$TARGET_LANG" \
        --json "$OUTPUT_JSON" \
        --txt "$OUTPUT_TXT" \
        --debug "$OUTPUT_DEBUG"; then

        # Verify outputs exist and are non-empty
        MISSING=0
        for FNAME in "bubbles.json" "output.txt"; do
            FPATH="${WORKDIR}/${FNAME}"
            if [[ ! -f "$FPATH" || ! -s "$FPATH" ]]; then
                log_warn "${FNAME} is missing or empty."
                MISSING=$(( MISSING + 1 ))
            else
                SIZE=$(wc -c < "$FPATH" | tr -d ' ')
                log_ok "${FNAME} → ${SIZE} bytes"
            fi
        done

        if [[ -f "$OUTPUT_DEBUG" ]]; then
            log_ok "debug_clusters.png written."
        fi

        if [[ $MISSING -eq 0 ]]; then
            log_ok "Page ${PAGE_NUM} complete."
            PASS=$(( PASS + 1 ))
        else
            log_warn "Page ${PAGE_NUM} finished with warnings."
            FAIL=$(( FAIL + 1 ))
            FAIL_LIST+=("${STEM}")
        fi

    else
        log_error "Page ${PAGE_NUM} FAILED — check output above."
        FAIL=$(( FAIL + 1 ))
        FAIL_LIST+=("${STEM}")
    fi

done

# ─────────────────────────────────────────────────────────────
# FINAL SUMMARY
# ─────────────────────────────────────────────────────────────
log_section "BATCH COMPLETE"
echo -e "  ✅ ${GREEN}Passed : ${PASS}${RESET}"
echo -e "  ❌ ${RED}Failed : ${FAIL}${RESET}"

if [[ ${#FAIL_LIST[@]} -gt 0 ]]; then
    echo ""
    log_warn "Failed pages:"
    for NAME in "${FAIL_LIST[@]}"; do
        echo -e "   ❌ ${RED}${NAME}${RESET}"
    done
fi

echo ""
log_info "📦 Output folder: $(realpath "${FOLDER}/translated")"
echo ""

[[ $FAIL -eq 0 ]] && exit 0 || exit 1
```
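The page-range slice in `batch-translate.sh` is 1-based and inclusive on both ends. As a quick sanity check, the same logic can be mirrored in a few lines of Python — a sketch for illustration, not part of the repo:

```python
def slice_pages(all_images, start_page=1, end_page=999999):
    """1-based, inclusive page-range slice, mirroring the shell loop
    over ALL_IMAGES with START_PAGE/END_PAGE bounds."""
    return [
        img for i, img in enumerate(all_images, start=1)
        if start_page <= i <= end_page
    ]


# Illustrative file names only
pages = [f"page_{n:03d}.png" for n in range(1, 11)]
print(slice_pages(pages, start_page=3, end_page=7))
# → ['page_003.png', 'page_004.png', 'page_005.png', 'page_006.png', 'page_007.png']
```

With the defaults (`start_page=1`, `end_page=999999`) every page is kept, matching the script's behavior when `--start`/`--end` are omitted.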
draw_debug_json.py — new file, 94 lines (@@ -0,0 +1,94 @@):

```python
import cv2
import json
import os
import argparse


def draw_boxes_from_json(image_path: str, json_path: str, output_path: str):
    # 1. Load the image
    image_bgr = cv2.imread(image_path)
    if image_bgr is None:
        print(f"❌ Error: Cannot load image at {image_path}")
        return

    ih, iw = image_bgr.shape[:2]

    # 2. Load the JSON data
    if not os.path.exists(json_path):
        print(f"❌ Error: JSON file not found at {json_path}")
        return

    with open(json_path, 'r', encoding='utf-8') as f:
        data = json.load(f)

    # Color map for different region types (BGR format)
    COLOR_MAP = {
        "dialogue": (0, 200, 0),      # Green
        "narration": (0, 165, 255),   # Orange
        "reaction": (255, 200, 0),    # Cyan/Blue
        "sfx": (0, 0, 220),           # Red
        "unknown": (120, 120, 120),   # Gray
    }

    # 3. Iterate through the JSON and draw boxes
    # Sort by order to keep numbering consistent
    sorted_items = sorted(data.values(), key=lambda x: x.get("order", 0))

    for item in sorted_items:
        bid = item.get("order", "?")
        rtype = item.get("region_type", "unknown")
        box = item.get("box", {})
        text = item.get("corrected_ocr", "")

        if not box:
            continue

        # Extract xywh and convert to xyxy
        x1, y1 = int(box.get("x", 0)), int(box.get("y", 0))
        w, h = int(box.get("w", 0)), int(box.get("h", 0))
        x2, y2 = x1 + w, y1 + h

        color = COLOR_MAP.get(rtype, (120, 120, 120))

        # Draw the main bounding box
        cv2.rectangle(image_bgr, (x1, y1), (x2, y2), color, 2)

        # Prepare labels
        label = f"BOX#{bid} [{rtype}]"
        preview = (text[:40] + "...") if len(text) > 40 else text

        font = cv2.FONT_HERSHEY_SIMPLEX
        font_scale = 0.38
        thickness = 1

        # Draw label background
        (lw, lh), _ = cv2.getTextSize(label, font, font_scale, thickness)
        cv2.rectangle(image_bgr,
                      (x1, max(0, y1 - lh - 6)),
                      (x1 + lw + 4, y1),
                      color, -1)

        # Draw label text (box ID + type)
        cv2.putText(image_bgr, label,
                    (x1 + 2, max(lh, y1 - 3)),
                    font, font_scale, (255, 255, 255), thickness,
                    cv2.LINE_AA)

        # Draw preview text below the box
        cv2.putText(image_bgr, preview,
                    (x1 + 2, min(ih - 5, y2 + 12)),
                    font, font_scale * 0.85, color, thickness,
                    cv2.LINE_AA)

    # 4. Save the final image
    cv2.imwrite(output_path, image_bgr)
    print(f"✅ Debug image successfully saved to: {output_path}")


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Draw bounding boxes from bubbles.json onto an image.")
    parser.add_argument("image", help="Path to the original manga page image")
    parser.add_argument("json", help="Path to the bubbles.json file")
    parser.add_argument("--output", "-o", default="debug_clusters_from_json.png", help="Output image path")

    args = parser.parse_args()

    draw_boxes_from_json(args.image, args.json, args.output)
```
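The geometry in `draw_debug_json.py` is simple enough to verify in isolation. A standalone sketch of its xywh → xyxy conversion and the 40-character label preview, pulled out of the drawing code for clarity:

```python
def xywh_to_xyxy(box: dict) -> tuple:
    """Convert a {'x', 'y', 'w', 'h'} box dict (as stored in
    bubbles.json) to (x1, y1, x2, y2) corner coordinates."""
    x1, y1 = int(box.get("x", 0)), int(box.get("y", 0))
    w, h = int(box.get("w", 0)), int(box.get("h", 0))
    return x1, y1, x1 + w, y1 + h


def preview_text(text: str, limit: int = 40) -> str:
    """Truncate long OCR text the same way the drawer does for labels."""
    return (text[:limit] + "...") if len(text) > limit else text


print(xywh_to_xyxy({"x": 10, "y": 20, "w": 100, "h": 50}))
# → (10, 20, 110, 70)
```

Keeping these as pure functions makes the pixel math testable without loading an image or importing OpenCV.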
manga-translator.py — 3721 lines changed (file diff suppressed because it is too large)
older-code/patch_manga_translator.py — new file, 119 lines (@@ -0,0 +1,119 @@):

```python
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import re
from pathlib import Path

TARGET = Path("manga-translator.py")


def cut_after_first_entrypoint(text: str) -> str:
    """
    Keep only the first full __main__ block and remove the duplicated tail if present.
    """
    m = re.search(r'(?m)^if __name__ == "__main__":\s*$', text)
    if not m:
        return text

    start = m.start()
    # Keep the entrypoint block plus indented lines after it
    lines = text[start:].splitlines(True)
    keep = []
    keep.append(lines[0])  # if __name__...
    i = 1
    while i < len(lines):
        ln = lines[i]
        if ln.strip() == "":
            keep.append(ln)
            i += 1
            continue
        # If dedented back to column 0 => end of block
        if not ln.startswith((" ", "\t")):
            break
        keep.append(ln)
        i += 1

    cleaned = text[:start] + "".join(keep)
    return cleaned


def replace_bad_vars(text: str) -> str:
    text = text.replace(
        "merge_micro_boxes_relaxed(bubbles, bubble_boxes, bubble_quads, bubble_indices, ocr, image_bgr)",
        "merge_micro_boxes_relaxed(bubbles, bubble_boxes, bubble_quads, bubble_indices, filtered, image)"
    )
    text = text.replace(
        "reattach_orphan_short_tokens(bubbles, bubble_boxes, bubble_quads, bubble_indices, ocr)",
        "reattach_orphan_short_tokens(bubbles, bubble_boxes, bubble_quads, bubble_indices, filtered)"
    )
    return text


def ensure_autofix_chain(text: str) -> str:
    old = (
        "    # ── Auto-fix (split + merge) ──────────────────────────────────────────\n"
        "    if auto_fix_bubbles:\n"
        "        bubbles, bubble_boxes, bubble_quads, bubble_indices = merge_micro_boxes_relaxed(bubbles, bubble_boxes, bubble_quads, bubble_indices, filtered, image)\n"
    )
    new = (
        "    # ── Auto-fix (split + merge) ──────────────────────────────────────────\n"
        "    if auto_fix_bubbles:\n"
        "        bubbles, bubble_boxes, bubble_quads, bubble_indices = auto_fix_bubble_detection(\n"
        "            bubble_boxes, bubble_indices, bubble_quads, bubbles, filtered, image)\n"
        "        bubbles, bubble_boxes, bubble_quads, bubble_indices = merge_micro_boxes_relaxed(\n"
        "            bubbles, bubble_boxes, bubble_quads, bubble_indices, filtered, image)\n"
    )
    return text.replace(old, new)


def ensure_split_commit(text: str) -> str:
    marker = "    # ── Remove nested / duplicate boxes ──────────────────────────────────\n"
    if marker not in text:
        return text

    if "bubbles = new_bubbles" in text:
        return text

    inject = (
        "    bubbles = new_bubbles\n"
        "    bubble_boxes = new_bubble_boxes\n"
        "    bubble_quads = new_bubble_quads\n"
        "    bubble_indices = new_bubble_indices\n\n"
    )
    return text.replace(marker, inject + marker)


def ensure_rescue_pipeline(text: str) -> str:
    anchor = '    print(f"Kept: {len(filtered)} | Skipped: {skipped}")\n'
    if anchor not in text:
        return text

    if "rescue_name_and_short_tokens(raw" in text:
        return text

    block = (
        '    print(f"Kept: {len(filtered)} | Skipped: {skipped}")\n'
        '    # Protect short dialogue tokens confidence\n'
        '    tmp = []\n'
        '    for bbox, t, conf in filtered:\n'
        '        tmp.append((bbox, t, maybe_conf_floor_for_protected(t, conf, floor=0.40)))\n'
        '    filtered = tmp\n'
        '    # Rescue names/short tokens dropped by strict filters\n'
        '    rescued = rescue_name_and_short_tokens(raw, min_conf=0.20)\n'
        '    filtered = merge_rescued_items(filtered, rescued, iou_threshold=0.55)\n'
    )
    return text.replace(anchor, block)


def main():
    if not TARGET.exists():
        raise FileNotFoundError(f"Not found: {TARGET}")

    src = TARGET.read_text(encoding="utf-8")
    out = src

    out = cut_after_first_entrypoint(out)
    out = replace_bad_vars(out)
    out = ensure_autofix_chain(out)
    out = ensure_split_commit(out)
    out = ensure_rescue_pipeline(out)

    TARGET.write_text(out, encoding="utf-8")
    print("✅ Patched manga-translator.py")


if __name__ == "__main__":
    main()
```
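Each `ensure_*` helper in the patch script follows the same idempotent pattern: bail out if the anchor string is missing or the injected code is already present, otherwise splice exactly once, so the patcher is safe to re-run. A minimal standalone sketch of that pattern (the `step_*` names here are illustrative, not from the repo):

```python
def apply_once(text: str, anchor: str, injection: str) -> str:
    """Insert `injection` before `anchor`, but only if the anchor exists
    and the injection is not already present. Running it a second time
    on its own output is a no-op, like the ensure_* helpers above."""
    if anchor not in text or injection in text:
        return text
    return text.replace(anchor, injection + anchor)


src = "step_a()\nstep_c()\n"
patched = apply_once(src, "step_c()\n", "step_b()\n")
# Re-applying changes nothing
assert apply_once(patched, "step_c()\n", "step_b()\n") == patched
print(patched)
```

The "already present" guard is what keeps textual patching tolerable: without it, repeated runs would duplicate the injected block.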
@@ -1,159 +0,0 @@
|
|||||||
#!/usr/bin/env python3
|
|
||||||
"""
|
|
||||||
pipeline_render.py
|
|
||||||
───────────────────────────────────────────────────────────────
|
|
||||||
Standalone Rendering Pipeline
|
|
||||||
|
|
||||||
Usage:
|
|
||||||
python pipeline-render.py /path/to/chapter/folder
|
|
||||||
"""
|
|
||||||
|
|
||||||
import os
|
|
||||||
import sys
|
|
||||||
import argparse
|
|
||||||
import zipfile
|
|
||||||
import importlib.util
|
|
||||||
from pathlib import Path
|
|
||||||
import cv2 # ✅ Added OpenCV to load the image
|
|
||||||
|
|
||||||
# ─────────────────────────────────────────────
|
|
||||||
# CONFIG
|
|
||||||
# ─────────────────────────────────────────────
|
|
||||||
DEFAULT_FONT_PATH = "fonts/ComicNeue-Regular.ttf"
|
|
||||||
|
|
||||||
# ─────────────────────────────────────────────
|
|
||||||
# DYNAMIC MODULE LOADER
|
|
||||||
# ─────────────────────────────────────────────
|
|
||||||
def load_module(name, filepath):
|
|
||||||
spec = importlib.util.spec_from_file_location(name, filepath)
|
|
||||||
if spec is None or spec.loader is None:
|
|
||||||
raise FileNotFoundError(f"Cannot load spec for {filepath}")
|
|
||||||
module = importlib.util.module_from_spec(spec)
|
|
||||||
spec.loader.exec_module(module)
|
|
||||||
return module
|
|
||||||
|
|
||||||
# ─────────────────────────────────────────────
|
|
||||||
# HELPERS
|
|
||||||
# ─────────────────────────────────────────────
|
|
||||||
def sorted_pages(chapter_dir):
|
|
||||||
exts = {".jpg", ".jpeg", ".png", ".webp"}
|
|
||||||
pages = [
|
|
||||||
p for p in Path(chapter_dir).iterdir()
|
|
||||||
if p.is_file() and p.suffix.lower() in exts
|
|
||||||
]
|
|
||||||
return sorted(pages, key=lambda p: p.stem)
|
|
||||||
|
|
||||||
def pack_rendered_cbz(chapter_dir, output_cbz, rendered_files):
|
|
||||||
if not rendered_files:
|
|
||||||
print("⚠️ No rendered pages found — CBZ not created.")
|
|
||||||
return
|
|
||||||
|
|
||||||
with zipfile.ZipFile(output_cbz, "w", compression=zipfile.ZIP_STORED) as zf:
|
|
||||||
for rp in rendered_files:
|
|
||||||
arcname = rp.name
|
|
||||||
zf.write(rp, arcname)
|
|
||||||
|
|
||||||
print(f"\n✅ Rendered CBZ saved → {output_cbz}")
|
|
||||||
print(f"📦 Contains: {len(rendered_files)} translated pages ready to read.")
|
|
||||||
|
|
||||||
# ─────────────────────────────────────────────
|
|
||||||
# PER-PAGE PIPELINE
|
|
||||||
# ─────────────────────────────────────────────
|
|
||||||
def process_render(page_path, workdir, renderer_module, font_path):
|
|
||||||
print(f"\n{'─' * 70}")
|
|
||||||
print(f"🎨 RENDERING: {page_path.name}")
|
|
||||||
print(f"{'─' * 70}")
|
|
||||||
|
|
||||||
txt_path = workdir / "output.txt"
|
|
||||||
json_path = workdir / "bubbles.json"
|
|
||||||
out_img = workdir / page_path.name
|
|
||||||
|
|
||||||
if not txt_path.exists() or not json_path.exists():
|
|
||||||
print(" ⚠️ Missing output.txt or bubbles.json. Did you run the OCR pipeline first?")
|
|
||||||
return None
|
|
||||||
|
|
||||||
# ✅ FIX: Load the image into memory (as a NumPy array) before passing it
|
|
||||||
img_array = cv2.imread(str(page_path.resolve()))
|
|
||||||
if img_array is None:
|
|
||||||
print(f" ❌ Failed to load image: {page_path.name}")
|
|
||||||
return None
|
|
||||||
|
|
||||||
orig_dir = os.getcwd()
|
|
||||||
try:
|
|
||||||
os.chdir(workdir)
|
|
||||||
|
|
||||||
# Pass the loaded image array instead of the string path
|
|
||||||
renderer_module.render_translations(
|
|
||||||
img_array, # 1st arg: Image Data (NumPy array)
|
|
||||||
str(out_img.resolve()), # 2nd arg: Output image path
|
|
||||||
str(txt_path.resolve()), # 3rd arg: Translations text
|
|
||||||
str(json_path.resolve()), # 4th arg: Bubbles JSON
|
|
||||||
font_path # 5th arg: Font Path
|
|
||||||
)
|
|
||||||
print(" ✅ Render complete")
|
|
||||||
return out_img
|
|
||||||
|
|
||||||
except Exception as e:
|
|
||||||
print(f" ❌ Failed: {e}")
|
|
||||||
return None
|
|
||||||
|
|
||||||
finally:
|
|
||||||
os.chdir(orig_dir)

# ─────────────────────────────────────────────
# MAIN
# ─────────────────────────────────────────────
def main():
    parser = argparse.ArgumentParser(description="Manga Rendering Pipeline")
    parser.add_argument("chapter_dir", help="Path to the folder containing original manga pages")
    args = parser.parse_args()

    chapter_dir = Path(args.chapter_dir).resolve()
    output_cbz = chapter_dir.parent / f"{chapter_dir.name}_rendered.cbz"

    script_dir = Path(__file__).parent
    absolute_font_path = str((script_dir / DEFAULT_FONT_PATH).resolve())

    print("Loading renderer module...")
    try:
        renderer = load_module("manga_renderer", str(script_dir / "manga-renderer.py"))
    except Exception as e:
        print(f"❌ Could not load manga-renderer.py: {e}")
        sys.exit(1)

    pages = sorted_pages(chapter_dir)
    if not pages:
        print(f"❌ No images found in: {chapter_dir}")
        sys.exit(1)

    print(f"\n📖 Chapter : {chapter_dir}")
    print(f"   Pages   : {len(pages)}\n")

    succeeded, failed = [], []
    rendered_files = []

    for i, page_path in enumerate(pages, start=1):
        print(f"[{i}/{len(pages)}] Checking data for {page_path.name}...")
        workdir = Path(chapter_dir) / "translated" / page_path.stem

        out_file = process_render(page_path, workdir, renderer, absolute_font_path)
        if out_file:
            succeeded.append(page_path.name)
            rendered_files.append(out_file)
        else:
            failed.append(page_path.name)

    print(f"\n{'═' * 70}")
    print("RENDER PIPELINE COMPLETE")
    print(f"✅ {len(succeeded)} page(s) rendered successfully")
    if failed:
        print(f"❌ {len(failed)} page(s) skipped or failed:")
        for f in failed:
            print(f"  • {f}")
    print(f"{'═' * 70}\n")

    print("Packing final CBZ...")
    pack_rendered_cbz(chapter_dir, output_cbz, rendered_files)


if __name__ == "__main__":
    main()
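Both pipeline scripts import their sibling files (`manga-renderer.py`, `manga-translator.py`) at runtime through a `load_module` helper, since hyphenated filenames cannot be imported with a plain `import` statement. A minimal, self-contained sketch of that importlib pattern, using a hypothetical throwaway module written to a temp directory:

```python
import importlib.util
import tempfile
from pathlib import Path

def load_module(name, filepath):
    # Build a module spec from an arbitrary file path and execute it.
    spec = importlib.util.spec_from_file_location(name, filepath)
    if spec is None or spec.loader is None:
        raise FileNotFoundError(f"Cannot load spec for {filepath}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

with tempfile.TemporaryDirectory() as tmp:
    mod_path = Path(tmp) / "demo-module.py"  # hyphen blocks `import demo-module`
    mod_path.write_text("def greet():\n    return 'hello'\n")
    demo = load_module("demo_module", str(mod_path))
    print(demo.greet())  # hello
```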
@@ -1,127 +0,0 @@
#!/usr/bin/env python3
"""
pipeline-translator.py
───────────────────────────────────────────────────────────────
Translation OCR pipeline (Batch Processing Only)

Usage:
    python pipeline-translator.py /path/to/chapter/folder
"""

import os
import sys
import argparse
import importlib.util
from pathlib import Path


# ─────────────────────────────────────────────
# DYNAMIC MODULE LOADER
# ─────────────────────────────────────────────
def load_module(name, filepath):
    spec = importlib.util.spec_from_file_location(name, filepath)
    if spec is None or spec.loader is None:
        raise FileNotFoundError(f"Cannot load spec for {filepath}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# ─────────────────────────────────────────────
# HELPERS
# ─────────────────────────────────────────────
def sorted_pages(chapter_dir):
    exts = {".jpg", ".jpeg", ".png", ".webp"}
    pages = [
        p for p in Path(chapter_dir).iterdir()
        if p.is_file() and p.suffix.lower() in exts
    ]
    return sorted(pages, key=lambda p: p.stem)
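Note that sorting by the raw stem is lexicographic, so `page_10` would sort before `page_2` unless page filenames are zero-padded. If unpadded names are a concern, a natural-sort key avoids the pitfall — a sketch, not part of the pipeline:

```python
import re

def natural_key(stem):
    # Split the stem into digit / non-digit runs so numeric parts
    # compare as integers: "page_2" < "page_10".
    return [int(tok) if tok.isdigit() else tok.lower()
            for tok in re.split(r"(\d+)", stem)]

names = ["page_10", "page_2", "page_1"]
print(sorted(names))                   # ['page_1', 'page_10', 'page_2']
print(sorted(names, key=natural_key))  # ['page_1', 'page_2', 'page_10']
```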

def make_page_workdir(chapter_dir, page_stem):
    workdir = Path(chapter_dir) / "translated" / page_stem
    workdir.mkdir(parents=True, exist_ok=True)
    return workdir


# ─────────────────────────────────────────────
# PER-PAGE PIPELINE
# ─────────────────────────────────────────────
def process_page(page_path, workdir, translator_module):
    print(f"\n{'─' * 70}")
    print(f"PAGE: {page_path.name}")
    print(f"{'─' * 70}")

    orig_dir = os.getcwd()
    try:
        # Isolate execution to the specific page's folder
        os.chdir(workdir)

        print(" ⏳ Extracting text and translating...")

        # 1) Translate using ONLY the required path arguments.
        #    This forces the function to use its own internal defaults
        #    (source_lang, target_lang, confidence_threshold) from manga-translator.py
        translator_module.translate_manga_text(
            image_path=str(page_path.resolve()),
            export_to_file="output.txt",
            export_bubbles_to="bubbles.json"
        )
        print(" ✅ Translation and OCR data saved successfully")

        return True

    except Exception as e:
        print(f" ❌ Failed: {e}")
        return False

    finally:
        os.chdir(orig_dir)


# ─────────────────────────────────────────────
# MAIN
# ─────────────────────────────────────────────
def main():
    parser = argparse.ArgumentParser(description="Manga Translation OCR Batch Pipeline")
    parser.add_argument("chapter_dir", help="Path to the folder containing manga pages")
    args = parser.parse_args()

    chapter_dir = Path(args.chapter_dir).resolve()

    print("Loading translator module...")
    script_dir = Path(__file__).parent

    try:
        translator = load_module("manga_translator", str(script_dir / "manga-translator.py"))
    except Exception as e:
        print(f"❌ Could not load manga-translator.py: {e}")
        sys.exit(1)

    pages = sorted_pages(chapter_dir)
    if not pages:
        print(f"❌ No images found in: {chapter_dir}")
        sys.exit(1)

    print(f"\n📖 Chapter : {chapter_dir.name}")
    print(f"   Pages   : {len(pages)}")
    print("   Note    : Using translation settings directly from manga-translator.py\n")

    succeeded, failed = [], []

    for i, page_path in enumerate(pages, start=1):
        print(f"[{i}/{len(pages)}] Processing...")
        workdir = make_page_workdir(chapter_dir, page_path.stem)

        if process_page(page_path, workdir, translator):
            succeeded.append(page_path.name)
        else:
            failed.append(page_path.name)

    print(f"\n{'═' * 70}")
    print("PIPELINE COMPLETE")
    print(f"✅ {len(succeeded)} page(s) succeeded")
    if failed:
        print(f"❌ {len(failed)} page(s) failed:")
        for f in failed:
            print(f"  • {f}")
    print(f"{'═' * 70}\n")


if __name__ == "__main__":
    main()
requirements
@@ -1,79 +0,0 @@
aistudio-sdk==0.3.8
annotated-doc==0.0.4
annotated-types==0.7.0
anyio==4.13.0
bce-python-sdk==0.9.70
beautifulsoup4==4.14.3
certifi==2026.2.25
chardet==7.4.3
charset-normalizer==3.4.7
click==8.3.2
colorlog==6.10.1
crc32c==2.8
deep-translator==1.11.4
easyocr==1.7.2
filelock==3.28.0
fsspec==2026.3.0
future==1.0.0
h11==0.16.0
hf-xet==1.4.3
httpcore==1.0.9
httpx==0.28.1
huggingface_hub==1.10.2
idna==3.11
ImageIO==2.37.3
imagesize==2.0.0
Jinja2==3.1.6
lazy-loader==0.5
markdown-it-py==4.0.0
MarkupSafe==3.0.3
mdurl==0.1.2
modelscope==1.35.4
mpmath==1.3.0
networkx==3.6.1
ninja==1.13.0
numpy==1.26.4
opencv-contrib-python==4.10.0.84
opencv-python==4.11.0.86
opencv-python-headless==4.11.0.86
opt-einsum==3.3.0
packaging==26.1
paddleocr==3.4.1
paddlepaddle==3.3.1
paddlex==3.4.3
pandas==3.0.2
pillow==12.2.0
prettytable==3.17.0
protobuf==7.34.1
psutil==7.2.2
py-cpuinfo==9.0.0
pyclipper==1.4.0
pycryptodome==3.23.0
pydantic==2.13.1
pydantic_core==2.46.1
Pygments==2.20.0
pypdfium2==5.7.0
python-bidi==0.6.7
python-dateutil==2.9.0.post0
PyYAML==6.0.2
requests==2.33.1
rich==15.0.0
ruamel.yaml==0.19.1
safetensors==0.7.0
scikit-image==0.26.0
scipy==1.17.1
shapely==2.1.2
shellingham==1.5.4
six==1.17.0
soupsieve==2.8.3
sympy==1.14.0
tifffile==2026.3.3
torch==2.11.0
torchvision==0.26.0
tqdm==4.67.3
typer==0.24.1
typing-inspection==0.4.2
typing_extensions==4.15.0
ujson==5.12.0
urllib3==2.6.3
wcwidth==0.6.0
@@ -1,12 +0,0 @@
numpy<2.0
opencv-python>=4.8
easyocr>=1.7.1
deep-translator>=1.11.4
manga-ocr>=0.1.14
torch
torchvision
Pillow
transformers
fugashi
unidic-lite