
post2bsky

A Python-based automation tool for reposting content to Bluesky from RSS feeds and Twitter accounts. Includes a daemon mode for continuous operation with comprehensive media support, deduplication, and extensive logging.

Note: This tool is designed for content creators and maintainers who need to automatically synchronize feeds/accounts to Bluesky. Ensure you have permission to repost content and comply with all platform terms of service.

Features

  • RSS → Bluesky: Parse RSS feeds and automatically post new entries with proper formatting
  • Twitter → Bluesky: Scrape tweets from Twitter accounts and repost to Bluesky (with media)
  • Daemon Mode: Run continuously as a background service for unattended operation
  • Media Support: Handle images, videos, and other media with automatic optimization
  • Deduplication: Track posted content to prevent duplicates across runs
  • Configurable Workflows: YAML-based pipelines for each source with scheduling
  • Media Constraints: Auto-handles Bluesky's limits (300 chars, 4 images, 45MB video, etc.)
  • Error Recovery: Automatic retries with exponential backoff for transient failures
  • Comprehensive Logging: Detailed logs for monitoring and troubleshooting

📋 Prerequisites

  • Python 3.9 or higher
  • macOS, Linux, or Windows with Chromium support (for Twitter scraping)
  • Bluesky account with credentials
  • Twitter account (if using Twitter→Bluesky syncing)

🚀 Quick Start

1. Clone & Setup Environment

git clone <repository-url>
cd post2bsky
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requeriments.txt

2. Configure Credentials

Create a .env file in the project root:

# Bluesky Authentication
BSKY_USERNAME=your_bluesky_handle
BSKY_PASSWORD=your_bluesky_password

# Optional: Custom Bluesky instance (default: https://bsky.social)
BSKY_BASE_URL=https://bsky.social

# Twitter Authentication (if using Twitter syncing)
TWITTER_USERNAME=your_twitter_handle
TWITTER_PASSWORD=your_twitter_password

3. Run a Quick Test

# Test RSS feed posting
python rss2bsky.py --feed-url https://example.com/rss

# Test Twitter account scraping
python twitter2bsky_daemon.py --test

Installation

Standard Installation

  1. Clone the repository:

    git clone https://github.com/yourusername/post2bsky.git
    cd post2bsky
    
  2. Create and activate virtual environment:

    python3 -m venv venv
    source venv/bin/activate  # macOS/Linux
    # or
    venv\Scripts\activate     # Windows
    
  3. Install dependencies:

    pip install -r requeriments.txt
    
  4. Set up environment variables: Create a .env file in the root directory (see Credentials section)

⚙️ Configuration

Credentials

Store your credentials in a .env file at the project root. This file must never be committed to version control (it is already listed in .gitignore):

BSKY_USERNAME=your_bluesky_handle
BSKY_PASSWORD=your_bluesky_password

# For Twitter scraping (email or username, and password)
TWITTER_USERNAME=your_twitter_username_or_email
TWITTER_PASSWORD=your_twitter_password

Security Note: Never commit credentials to Git. The .env file is automatically ignored.
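How the scripts read these values is not shown here; python-dotenv is listed as a dependency and is presumably what they use. As a rough illustration of what loading and validating the file involves, here is a standard-library-only sketch. The helper names `parse_env_file` and `require_credentials` are hypothetical and are not part of post2bsky.

```python
from pathlib import Path


def parse_env_file(path):
    """Parse KEY=VALUE lines, skipping blank lines and # comments (hypothetical helper)."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Trim stray whitespace and optional surrounding quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_credentials(values, keys=("BSKY_USERNAME", "BSKY_PASSWORD")):
    """Return the requested credentials, or raise if any is missing or empty."""
    missing = [k for k in keys if not values.get(k)]
    if missing:
        raise RuntimeError("Missing credentials: " + ", ".join(missing))
    return {k: values[k] for k in keys}
```

Failing early with a clear message when a key is missing tends to be easier to debug than a login error later in the run.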

RSS Feed Configuration

Run rss2bsky.py to post from RSS feeds:

# Basic usage
python rss2bsky.py --feed-url https://example.com/rss --bsky-handle your_handle

# With advanced options
python rss2bsky.py \
  --feed-url https://example.com/rss \
  --bsky-handle your_handle \
  --max-posts 5 \
  --limit-age 3  # Only posts from last 3 days

State Management: The tool tracks posted entries in twitter2bsky_state.json to prevent duplicates. This file is updated automatically on each run.
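The deduplication idea can be sketched as follows. This is an illustration under assumptions, not the tool's actual code: the JSON layout (a single `posted` list) and the function names are hypothetical, and the real state file may store more than IDs.

```python
import json
from pathlib import Path


def load_state(path):
    """Load the set of already-posted entry IDs; a missing file means nothing was posted."""
    p = Path(path)
    if not p.exists():
        return set()
    return set(json.loads(p.read_text(encoding="utf-8")).get("posted", []))


def save_state(path, posted):
    """Write to a temporary file first, then rename it over the real state file."""
    p = Path(path)
    tmp = p.with_suffix(p.suffix + ".tmp")
    tmp.write_text(json.dumps({"posted": sorted(posted)}, indent=2), encoding="utf-8")
    tmp.replace(p)


def post_new_entries(entries, state_path, post):
    """Post unseen (entry_id, text) pairs; record an ID only after `post` succeeds."""
    posted = load_state(state_path)
    for entry_id, text in entries:
        if entry_id in posted:
            continue  # already posted in an earlier run
        post(text)  # if this raises, the ID is never recorded, so a later run retries it
        posted.add(entry_id)
        save_state(state_path, posted)
    return posted
```

Saving after every successful post, rather than once at the end, means an interrupted run does not forget what it already posted.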

Twitter Account Configuration

Configure Twitter accounts in twitter2bsky_daemon.py. The script uses Playwright for browser automation to scrape tweets:

# Run Twitter daemon
python twitter2bsky_daemon.py

# Run with test mode (dry-run, no posting)
python twitter2bsky_daemon.py --test

# Specify custom state file
python twitter2bsky_daemon.py --state-file custom_state.json

Twitter Scraping Details:

  • Uses Playwright Chromium for headless browser automation
  • Handles t.co URL redirects and link metadata
  • Includes screenshot capture for error debugging
  • Automatic retry with exponential backoff on failures
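As a sketch of the retry behaviour named in the last bullet: the function below doubles the wait after each failure, caps it, and adds a little random jitter. The name `retry_with_backoff` and all default values are assumptions for illustration; the daemon's actual retry parameters may differ.

```python
import random
import time


def retry_with_backoff(func, max_retries=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt (capped, jittered), then retry."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Up to 10% jitter so parallel jobs do not all retry at the same instant.
            sleep(delay + random.uniform(0, delay * 0.1))
```

Passing `sleep` as a parameter keeps the function testable without real waiting.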

Workflow Pipelines

The workflows/ directory contains YAML pipeline configurations that define:

  • Data source (RSS feed URL or Twitter handle)
  • Posting schedule and frequency
  • Content filtering rules
  • Target Bluesky account

Example: workflows/324.yml defines the pipeline for the "324" RSS feed.

Each workflow typically has a corresponding Jenkins configuration in jenkins/ for CI/CD integration.

Running Workflows:

# Manual execution
./sync_runner.sh

# Run specific workflow
python rss2bsky.py --feed-url "$(grep 'url:' workflows/324.yml | head -1 | awk '{print $2}')"

Media Handling

The tool automatically optimizes media for Bluesky's constraints:

| Constraint | Value |
|---|---|
| Image size limit | 950 KB per image |
| Image max dimension | 2000px (width or height) |
| Max images per post | 4 |
| Video size limit | 45 MB |
| Video max duration | 3 minutes |
| Thumbnail size | 950 KB |
| Text length | 300 characters (grapheme clusters) |

Images are automatically converted to JPEG, with the quality reduced as needed to meet the size limit (down to a minimum JPEG quality of 40-45).
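To make the limits concrete, here is a small sketch that checks image metadata against them and computes a resized dimension that fits. The constant and function names are hypothetical, and treating 1 KB as 1024 bytes is an assumption; the tool's own checks live in its scripts and may differ.

```python
from dataclasses import dataclass

# Limits taken from the constraints listed above (1 KB = 1024 bytes is assumed).
MAX_IMAGE_BYTES = 950 * 1024
MAX_IMAGE_DIMENSION = 2000
MAX_IMAGES_PER_POST = 4


@dataclass
class ImageInfo:
    size_bytes: int
    width: int
    height: int


def image_problems(img):
    """Return a list of the limits this image exceeds (empty if it is fine)."""
    problems = []
    if img.size_bytes > MAX_IMAGE_BYTES:
        problems.append("file size over 950 KB")
    if max(img.width, img.height) > MAX_IMAGE_DIMENSION:
        problems.append("dimension over 2000px")
    return problems


def scaled_dimensions(width, height, limit=MAX_IMAGE_DIMENSION):
    """Shrink so the longest side fits within `limit`, preserving aspect ratio."""
    longest = max(width, height)
    if longest <= limit:
        return width, height
    scale = limit / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def select_images(images):
    """Keep only as many images as a single post allows."""
    return images[:MAX_IMAGES_PER_POST]
```

For example, `scaled_dimensions(4000, 3000)` gives `(2000, 1500)`, which keeps the 4:3 aspect ratio.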

💻 Usage

RSS to Bluesky (rss2bsky.py)

Post entries from RSS feeds to Bluesky:

# Simple usage
python rss2bsky.py --feed-url https://example.com/feed.xml --bsky-handle @your_handle

# Limit to recent posts
python rss2bsky.py --feed-url https://example.com/feed.xml --limit-age 7

# Dry run (preview without posting)
python rss2bsky.py --feed-url https://example.com/feed.xml --dry-run

Output: The script logs all actions to twitter2bsky.log and maintains state in twitter2bsky_state.json.

Twitter to Bluesky (twitter2bsky_daemon.py)

Run continuously to sync tweets from specified accounts:

# Start daemon mode (continuous monitoring)
python twitter2bsky_daemon.py

# Run once and exit
python twitter2bsky_daemon.py --once

# Test mode (no actual posts to Bluesky)
python twitter2bsky_daemon.py --test

# Custom configuration
python twitter2bsky_daemon.py --max-retries 5 --timeout 30

Features:

  • Automatically fetches new tweets from configured accounts
  • Handles retweets, quotes, and threaded tweets
  • Downloads and optimizes media attachments
  • Resolves shortened t.co links to actual URLs
  • Prevents duplicate posts with state tracking

Running with Sync Runner

./sync_runner.sh

This script can orchestrate multiple sources and is suitable for integration with cron jobs or systemd timers.

Daemon Mode Setup (systemd)

To run twitter2bsky_daemon.py continuously as a system service on Linux:

  1. Create service file /etc/systemd/system/post2bsky.service:

    [Unit]
    Description=post2bsky Twitter to Bluesky Daemon
    After=network.target
    
    [Service]
    Type=simple
    User=your_user
    WorkingDirectory=/path/to/post2bsky
    Environment="PATH=/path/to/post2bsky/venv/bin:/usr/local/bin:/usr/bin:/bin"
    ExecStart=/path/to/post2bsky/venv/bin/python twitter2bsky_daemon.py
    Restart=always
    RestartSec=60
    
    [Install]
    WantedBy=multi-user.target
    
  2. Enable and start:

    sudo systemctl daemon-reload
    sudo systemctl enable post2bsky
    sudo systemctl start post2bsky
    sudo systemctl status post2bsky
    
  3. View logs:

    tail -f twitter2bsky.log
    

Cron Job Integration

Add to crontab with crontab -e:

# Run RSS sync every 30 minutes
*/30 * * * * cd /path/to/post2bsky && venv/bin/python rss2bsky.py --feed-url https://example.com/rss

# Run all workflows at 9 AM daily
0 9 * * * cd /path/to/post2bsky && ./sync_runner.sh

📦 Dependencies

All Python dependencies are listed in requeriments.txt. Key packages:

| Package | Purpose |
|---|---|
| atproto | Bluesky API client for posting |
| fastfeedparser | RSS/Atom feed parsing |
| playwright | Browser automation for Twitter scraping |
| beautifulsoup4 | HTML parsing and content extraction |
| pillow | Image optimization and processing |
| moviepy | Video processing and duration detection |
| grapheme | Unicode grapheme cluster counting for Bluesky's text limits |
| httpx | HTTP client for URL resolution and media downloads |
| python-dotenv | Environment variable management |
| arrow | Date/time handling with timezone support |

Install all dependencies with:

pip install -r requeriments.txt

📁 Project Structure

post2bsky/
├── rss2bsky.py                    # RSS feed → Bluesky posting script
├── twitter2bsky_daemon.py         # Twitter → Bluesky daemon (main logic)
├── twitter_login.py               # Twitter authentication helper
├── cookie_login.py                # Alternative login method
├── sync_runner.sh                 # Orchestration script for multiple sources
├── twitter2bsky_state.json        # State file tracking posted content (auto-generated)
├── twitter2bsky.log               # Application logs (auto-generated)
├── requeriments.txt               # Python dependencies
├── README.md                      # This file
├── LICENSE                        # GNU GPLv3 license
├── jenkins/                       # Jenkins CI/CD configurations
│   └── [account_name]Tw/         # Config for each account
├── workflows/                     # YAML pipeline definitions
│   ├── 324.yml                   # Example: RSS feed for "324"
│   ├── fcbarcelona.yml           # Example: Twitter account for FC Barcelona
│   └── ...
└── venv/                         # Python virtual environment (created during setup)

🔧 Troubleshooting

Authentication Issues

Problem: Login failed: Invalid credentials

Solution:

  1. Verify credentials in .env are correct (no extra spaces)
  2. Check if Bluesky account requires app password (Settings → App passwords)
  3. If using 2FA, generate an app-specific password
  4. For Twitter, ensure account isn't rate-limited or restricted

Twitter Scraping Issues

Problem: Playwright browser failed or screenshot errors

Solution:

  1. Ensure Chromium is properly installed: playwright install chromium
  2. Check available disk space (Playwright requires ~500MB)
  3. Run script with --debug flag for detailed output
  4. Check browser error screenshots in screenshot_*.png files

Problem: No tweets found or Tweets already posted

Solution:

  1. Verify Twitter account handle is correct in configuration
  2. Check twitter2bsky_state.json for deduplication data
  3. Delete state file to reset tracking (careful: may cause re-posting)
  4. Review twitter2bsky.log for detailed debugging

Media Processing Issues

Problem: Image upload failed or Video too large

Solution:

  1. Images are auto-optimized, but source should be <100MB
  2. Videos must be <45MB and <3 minutes
  3. Check available disk space for temporary files
  4. Enable debug logging in the script for detailed info

Performance Issues

Problem: Script runs slowly or times out

Solution:

  1. Check network connectivity
  2. Reduce SCRAPE_TWEET_LIMIT in twitter2bsky_daemon.py (default: 30)
  3. Increase timeout constants if on slow connection
  4. Run with --once instead of daemon mode to diagnose
  5. Check system resources (CPU, memory, disk I/O)

Log Analysis

Check twitter2bsky.log for detailed debugging:

# View recent errors
grep ERROR twitter2bsky.log | tail -20

# View all warnings
grep WARNING twitter2bsky.log | tail -20

# Watch logs in real-time
tail -f twitter2bsky.log

# Count posts by status
grep -c "✅ Posted to Bluesky" twitter2bsky.log
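The same kind of summary can be produced in Python when a per-level count is handier than several grep calls. This sketch assumes each log line contains a standard level name (DEBUG, INFO, WARNING, ERROR, CRITICAL) as a separate word; the exact format of twitter2bsky.log is not documented here, and `count_levels` is a hypothetical helper.

```python
import re
from collections import Counter

# Matches the first standard logging level name that appears as a whole word.
LEVEL_RE = re.compile(r"\b(DEBUG|INFO|WARNING|ERROR|CRITICAL)\b")


def count_levels(lines):
    """Count log lines per level; lines without a level name are ignored."""
    counts = Counter()
    for line in lines:
        match = LEVEL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    with open("twitter2bsky.log", encoding="utf-8") as log_file:
        for level, n in count_levels(log_file).most_common():
            print(f"{level}: {n}")
```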

🐛 Debugging

Enable debug logging by modifying the logging level in the script:

# In twitter2bsky_daemon.py, change:
level=logging.INFO,
# To:
level=logging.DEBUG,

Run with verbose output:

python twitter2bsky_daemon.py 2>&1 | tee debug.log

Error screenshots are automatically saved as screenshot_YYYYMMDD_HHMMSS.png for investigation.

📄 License

This project is licensed under the GNU General Public License v3.0. See LICENSE for details.

Summary: You are free to use, modify, and distribute this software, but if you distribute modified versions, they must also be released under GPLv3 with their source code.

🤝 Contributing

Contributions are welcome! To contribute:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes with clear commit messages
  4. Push to your fork (git push origin feature/amazing-feature)
  5. Open a Pull Request with a description of your changes

Before submitting:

  • Test your changes thoroughly
  • Ensure code follows existing style conventions
  • Add comments for complex logic
  • Update README if adding new features

FAQ

Q: Can I use this on Windows?
A: Yes, but ensure you have Python 3.9+ and Chromium/Playwright support. Use venv\Scripts\activate instead of source venv/bin/activate.

Q: How do I avoid posting duplicates?
A: The state file (twitter2bsky_state.json) tracks all posted content. It's automatically maintained; just don't delete it between runs.

Q: Can I post to multiple Bluesky accounts?
A: Currently, the tool posts to one account per instance. Run multiple instances with different .env configurations to handle multiple accounts.

Q: What happens if posting fails?
A: The script has automatic retry logic with exponential backoff. Failed posts are logged, but the state file is NOT updated, so they are retried on the next run.

Q: Is my content optimized for Bluesky?
A: Yes. The tool automatically:

  • Truncates text to 300 characters (grapheme-aware)
  • Optimizes images to Bluesky specs
  • Handles video conversion and compression
  • Resolves shortened URLs
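To show what "grapheme-aware" means for the first bullet, here is a simplified sketch. The project lists the `grapheme` package for this job; the code below is a standard-library approximation that only attaches combining marks, variation selectors, and zero-width-joiner sequences to the preceding character, so it is not a full Unicode segmentation. The function names and the trailing ellipsis are assumptions.

```python
import unicodedata

ZWJ = "\u200d"  # zero-width joiner, used in emoji sequences such as family emoji


def split_clusters(text):
    """Approximate grapheme clusters (not full Unicode segmentation)."""
    clusters = []
    for ch in text:
        # Nonspacing/enclosing marks (including variation selectors) and ZWJ extend a cluster.
        extends = unicodedata.category(ch) in ("Mn", "Me") or ch == ZWJ
        # A character right after a ZWJ belongs to the same cluster.
        joined = bool(clusters) and clusters[-1].endswith(ZWJ)
        if clusters and (extends or joined):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def truncate(text, limit=300, ellipsis="…"):
    """Cut text to at most `limit` clusters, never splitting inside a cluster."""
    clusters = split_clusters(text)
    if len(clusters) <= limit:
        return text
    return "".join(clusters[: limit - len(ellipsis)]) + ellipsis
```

Counting clusters rather than code points matters because an accented letter or a joined emoji can be several code points but is a single visible character.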

Q: How do I run this on a server?
A: Use the systemd service example in the Usage section, or set up a cron job.

Q: Can I schedule posts?
A: Not directly through this tool. Instead, use cron/scheduler to run the script at desired times.

🎯 Use Cases

  • Content Creators: Automatically repost your RSS feeds to Bluesky for wider reach
  • News Aggregation: Create Bluesky bots that share news from multiple RSS sources
  • Account Management: Keep social media accounts synchronized across platforms
  • Content Distribution: Distribute content from Twitter to Bluesky without manual copying

🔐 Security Notes

  • Never commit .env: Credentials are automatically gitignored
  • Secure your state file: twitter2bsky_state.json may contain URLs; protect it like credentials
  • Use app passwords: For Bluesky, use app-specific passwords instead of main account password
  • Monitor logs: Regularly review twitter2bsky.log for unauthorized access attempts

📞 Support

  • Issues: Open an issue on GitHub with detailed reproduction steps
  • Documentation: Check this README and inline code comments
  • Logs: Attach relevant log excerpts when reporting issues
  • Testing: Test with --test flag before running in production

📝 Changelog

See Git commit history for detailed changes. Notable versions:

  • v2.0: Added Twitter scraping with media support, daemon mode
  • v1.5: Improved RSS parsing and media handling
  • v1.0: Initial release with basic RSS→Bluesky posting

Disclaimer

This tool is for personal use and automation. Ensure compliance with the terms of service of Bluesky, Twitter, and any RSS sources you use. Respect rate limits and avoid spamming.
