post2bsky
A Python-based automation tool for reposting content to Bluesky from RSS feeds and Twitter accounts. Includes a daemon mode for continuous operation with comprehensive media support, deduplication, and extensive logging.
Note: This tool is designed for content creators and maintainers who need to automatically synchronize feeds/accounts to Bluesky. Ensure you have permission to repost content and comply with all platform terms of service.
✨ Features
- RSS → Bluesky: Parse RSS feeds and automatically post new entries with proper formatting
- Twitter → Bluesky: Scrape tweets from Twitter accounts and repost to Bluesky (with media)
- Daemon Mode: Run continuously as a background service for unattended operation
- Media Support: Handle images, videos, and other media with automatic optimization
- Deduplication: Track posted content to prevent duplicates across runs
- Configurable Workflows: YAML-based pipelines for each source with scheduling
- Media Constraints: Auto-handles Bluesky's limits (300 chars, 4 images, 45MB video, etc.)
- Error Recovery: Automatic retries with exponential backoff for transient failures
- Comprehensive Logging: Detailed logs for monitoring and troubleshooting
📋 Prerequisites
- Python 3.9 or higher
- macOS, Linux, or Windows with Chromium support (for Twitter scraping)
- Bluesky account with credentials
- Twitter account (if using Twitter→Bluesky syncing)
🚀 Quick Start
1. Clone & Setup Environment
git clone <repository-url>
cd post2bsky
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requeriments.txt
2. Configure Credentials
Create a .env file in the project root:
# Bluesky Authentication
BSKY_USERNAME=your_bluesky_handle
BSKY_PASSWORD=your_bluesky_password
# Optional: Custom Bluesky instance (default: https://bsky.social)
BSKY_BASE_URL=https://bsky.social
# Twitter Authentication (if using Twitter syncing)
TWITTER_USERNAME=your_twitter_handle
TWITTER_PASSWORD=your_twitter_password
3. Run a Quick Test
# Test RSS feed posting
python rss2bsky.py --feed-url https://example.com/rss
# Test Twitter account scraping
python twitter2bsky_daemon.py --test
Installation
Standard Installation
1. Clone the repository:
git clone https://github.com/yourusername/post2bsky.git
cd post2bsky
2. Create and activate a virtual environment:
python3 -m venv venv
source venv/bin/activate # macOS/Linux
# On Windows use: venv\Scripts\activate
3. Install dependencies:
pip install -r requeriments.txt
4. Set up environment variables: create a .env file in the root directory (see the Credentials section).
⚙️ Configuration
Credentials
Store your credentials in a .env file at the project root. This file should never be committed to version control (it is already listed in .gitignore):
BSKY_USERNAME=your_bluesky_handle
BSKY_PASSWORD=your_bluesky_password
# For Twitter scraping (email or username, and password)
TWITTER_USERNAME=your_twitter_username_or_email
TWITTER_PASSWORD=your_twitter_password
Security Note: Never commit credentials to Git. The .env file is automatically ignored.
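Before a run, it can help to confirm that the required variables are actually set. The following is a minimal sketch, assuming the variable names shown above. `missing_credentials` is a hypothetical helper and not part of post2bsky. It only inspects a mapping such as `os.environ`; loading the .env file itself is left to python-dotenv, which the project lists as a dependency.

```python
import os

# Variable names taken from the .env example above.
REQUIRED = ("BSKY_USERNAME", "BSKY_PASSWORD")
TWITTER = ("TWITTER_USERNAME", "TWITTER_PASSWORD")


def missing_credentials(env, need_twitter=False):
    """Return the names of required variables that are unset or blank."""
    names = REQUIRED + (TWITTER if need_twitter else ())
    return [name for name in names if not env.get(name, "").strip()]


if __name__ == "__main__":
    # Hypothetical pre-flight check; the real scripts may validate differently.
    missing = missing_credentials(os.environ, need_twitter=True)
    if missing:
        print("Missing credentials:", ", ".join(missing))
    else:
        print("All credentials present")
```

A value consisting only of spaces counts as missing, which catches the "extra spaces" mistake mentioned under Troubleshooting.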
RSS Feed Configuration
Run rss2bsky.py to post from RSS feeds:
# Basic usage
python rss2bsky.py --feed-url https://example.com/rss --bsky-handle your_handle
# With advanced options
python rss2bsky.py \
--feed-url https://example.com/rss \
--bsky-handle your_handle \
--max-posts 5 \
--limit-age 3 # Only posts from last 3 days
State Management: The tool tracks posted entries in twitter2bsky_state.json to prevent duplicates. This file is updated automatically on each run.
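The idea behind state-file deduplication can be sketched as follows. The actual layout of twitter2bsky_state.json is not documented here, so this example assumes a JSON object with a single `"posted"` list of entry IDs. The function names are hypothetical and not taken from the project.

```python
import json
import os
import tempfile


def load_posted(path):
    """Load the set of already-posted entry IDs; a missing file means nothing was posted yet."""
    try:
        with open(path, encoding="utf-8") as fh:
            return set(json.load(fh).get("posted", []))
    except FileNotFoundError:
        return set()


def save_posted(path, posted):
    """Write the state via a temp file and rename, so an interrupted run cannot leave a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump({"posted": sorted(posted)}, fh, indent=2)
    os.replace(tmp, path)


def new_entries(entry_ids, posted):
    """Keep only IDs that have not been posted, preserving feed order."""
    return [entry_id for entry_id in entry_ids if entry_id not in posted]
```

In this sketch an ID would be added to the set only after its post succeeds, which matches the behavior described in the FAQ: a failed post is not recorded, so it is picked up again on the next run.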
Twitter Account Configuration
Configure Twitter accounts in twitter2bsky_daemon.py. The script uses Playwright for browser automation to scrape tweets:
# Run Twitter daemon
python twitter2bsky_daemon.py
# Run with test mode (dry-run, no posting)
python twitter2bsky_daemon.py --test
# Specify custom state file
python twitter2bsky_daemon.py --state-file custom_state.json
Twitter Scraping Details:
- Uses Playwright Chromium for headless browser automation
- Handles t.co URL redirects and link metadata
- Includes screenshot capture for error debugging
- Automatic retry with exponential backoff on failures
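Retry with exponential backoff generally looks like the sketch below. This is an illustration of the technique rather than the daemon's actual code: `retry_with_backoff`, its defaults, and the 10% jitter are all assumptions.

```python
import random
import time


def retry_with_backoff(func, max_retries=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt (capped, plus jitter) and try again."""
    for attempt in range(max_retries):
        try:
            return func()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # A little random jitter avoids many clients retrying in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

With the defaults, the waits between attempts are roughly 1, 2, 4, and 8 seconds. The `sleep` parameter exists so the function can be exercised without actually waiting.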
Workflow Pipelines
The workflows/ directory contains YAML pipeline configurations that define:
- Data source (RSS feed URL or Twitter handle)
- Posting schedule and frequency
- Content filtering rules
- Target Bluesky account
Example: workflows/324.yml defines the pipeline for the "324" RSS feed.
Each workflow typically has a corresponding Jenkins configuration in jenkins/ for CI/CD integration.
Running Workflows:
# Manual execution
./sync_runner.sh
# Run specific workflow
python rss2bsky.py --feed-url $(grep 'url:' workflows/324.yml | head -1 | cut -d' ' -f2)
Media Handling
The tool automatically optimizes media for Bluesky's constraints:
| Constraint | Value |
|---|---|
| Image size limit | 950 KB per image |
| Image max dimension | 2000px (width or height) |
| Max images per post | 4 |
| Video size limit | 45 MB |
| Video max duration | 3 minutes |
| Thumbnail size | 950 KB |
| Text length | 300 characters (grapheme clusters) |
Images are automatically converted to JPEG, with quality reduced as needed down to a minimum of about 40-45.
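The limits in the table can be expressed as simple checks. This sketch is illustrative only: the `Image` class and both functions are hypothetical, and it assumes "KB" means 1024 bytes, which the table does not specify.

```python
from dataclasses import dataclass

# Limits copied from the table above (KB/MB interpreted as powers of 1024: an assumption).
MAX_IMAGE_BYTES = 950 * 1024
MAX_IMAGE_DIMENSION = 2000
MAX_IMAGES_PER_POST = 4
MAX_VIDEO_BYTES = 45 * 1024 * 1024
MAX_VIDEO_SECONDS = 3 * 60


@dataclass
class Image:
    size_bytes: int
    width: int
    height: int


def image_problems(image):
    """List the reasons an image would need optimizing before upload."""
    problems = []
    if image.size_bytes > MAX_IMAGE_BYTES:
        problems.append("file too large")
    if max(image.width, image.height) > MAX_IMAGE_DIMENSION:
        problems.append("dimensions too large")
    return problems


def scale_to_fit(width, height, limit=MAX_IMAGE_DIMENSION):
    """Return dimensions scaled down (never up) so the longer side fits the limit."""
    longest = max(width, height)
    if longest <= limit:
        return width, height
    factor = limit / longest
    return max(1, round(width * factor)), max(1, round(height * factor))
```

For example, a 4000x3000 image would be scaled to 2000x1500, preserving its aspect ratio.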
💻 Usage
RSS to Bluesky (rss2bsky.py)
Post entries from RSS feeds to Bluesky:
# Simple usage
python rss2bsky.py --feed-url https://example.com/feed.xml --bsky-handle @your_handle
# Limit to recent posts
python rss2bsky.py --feed-url https://example.com/feed.xml --limit-age 7
# Dry run (preview without posting)
python rss2bsky.py --feed-url https://example.com/feed.xml --dry-run
Output: The script logs all actions to twitter2bsky.log and maintains state in twitter2bsky_state.json.
Twitter to Bluesky (twitter2bsky_daemon.py)
Run continuously to sync tweets from specified accounts:
# Start daemon mode (continuous monitoring)
python twitter2bsky_daemon.py
# Run once and exit
python twitter2bsky_daemon.py --once
# Test mode (no actual posts to Bluesky)
python twitter2bsky_daemon.py --test
# Custom configuration
python twitter2bsky_daemon.py --max-retries 5 --timeout 30
Features:
- Automatically fetches new tweets from configured accounts
- Handles retweets, quotes, and threaded tweets
- Downloads and optimizes media attachments
- Resolves shortened t.co links to actual URLs
- Prevents duplicate posts with state tracking
Running with Sync Runner
./sync_runner.sh
This script can orchestrate multiple sources and is suitable for integration with cron jobs or systemd timers.
Daemon Mode Setup (systemd)
To run twitter2bsky_daemon.py continuously as a system service on Linux:
1. Create the service file /etc/systemd/system/post2bsky.service:
[Unit]
Description=post2bsky Twitter to Bluesky Daemon
After=network.target
[Service]
Type=simple
User=your_user
WorkingDirectory=/path/to/post2bsky
Environment="PATH=/path/to/post2bsky/venv/bin"
ExecStart=/path/to/post2bsky/venv/bin/python twitter2bsky_daemon.py
Restart=always
RestartSec=60
[Install]
WantedBy=multi-user.target
2. Enable and start:
sudo systemctl daemon-reload
sudo systemctl enable post2bsky
sudo systemctl start post2bsky
sudo systemctl status post2bsky
3. View logs:
tail -f twitter2bsky.log
Cron Job Integration
Add to crontab with crontab -e:
# Run RSS sync every 30 minutes
# (cron runs commands with /bin/sh, where `source` may be unavailable, so call the venv's Python directly)
*/30 * * * * cd /path/to/post2bsky && venv/bin/python rss2bsky.py --feed-url https://example.com/rss
# Run all workflows at 9 AM daily
0 9 * * * cd /path/to/post2bsky && ./sync_runner.sh
📦 Dependencies
All Python dependencies are listed in requeriments.txt. Key packages:
| Package | Purpose |
|---|---|
| atproto | Bluesky API client for posting |
| fastfeedparser | RSS/Atom feed parsing |
| playwright | Browser automation for Twitter scraping |
| beautifulsoup4 | HTML parsing and content extraction |
| pillow | Image optimization and processing |
| moviepy | Video processing and duration detection |
| grapheme | Unicode grapheme cluster counting for Bluesky's text limits |
| httpx | HTTP client for URL resolution and media downloads |
| python-dotenv | Environment variable management |
| arrow | Date/time handling with timezone support |
Install all dependencies with:
pip install -r requeriments.txt
📁 Project Structure
post2bsky/
├── rss2bsky.py # RSS feed → Bluesky posting script
├── twitter2bsky_daemon.py # Twitter → Bluesky daemon (main logic)
├── twitter_login.py # Twitter authentication helper
├── cookie_login.py # Alternative login method
├── sync_runner.sh # Orchestration script for multiple sources
├── twitter2bsky_state.json # State file tracking posted content (auto-generated)
├── twitter2bsky.log # Application logs (auto-generated)
├── requeriments.txt # Python dependencies
├── README.md # This file
├── LICENSE # GNU GPLv3 license
├── jenkins/ # Jenkins CI/CD configurations
│ └── [account_name]Tw/ # Config for each account
├── workflows/ # YAML pipeline definitions
│ ├── 324.yml # Example: RSS feed for "324"
│ ├── fcbarcelona.yml # Example: Twitter account for FC Barcelona
│ └── ...
└── venv/ # Python virtual environment (created during setup)
🔧 Troubleshooting
Authentication Issues
Problem: Login failed: Invalid credentials
Solution:
- Verify credentials in .env are correct (no extra spaces)
- Check if Bluesky account requires app password (Settings → App passwords)
- If using 2FA, generate an app-specific password
- For Twitter, ensure account isn't rate-limited or restricted
Twitter Scraping Issues
Problem: Playwright browser failed or screenshot errors
Solution:
- Ensure Chromium is properly installed: playwright install chromium
- Check available disk space (Playwright requires ~500MB)
- Run script with --debug flag for detailed output
- Check browser error screenshots in screenshot_*.png files
Problem: No tweets found or Tweets already posted
Solution:
- Verify Twitter account handle is correct in configuration
- Check twitter2bsky_state.json for deduplication data
- Delete state file to reset tracking (careful: may cause re-posting)
- Review twitter2bsky.log for detailed debugging
Media Processing Issues
Problem: Image upload failed or Video too large
Solution:
- Images are auto-optimized, but source should be <100MB
- Videos must be <45MB and <3 minutes
- Check available disk space for temporary files
- Enable debug logging in the script for detailed info
Performance Issues
Problem: Script runs slowly or times out
Solution:
- Check network connectivity
- Reduce SCRAPE_TWEET_LIMIT in twitter2bsky_daemon.py (default: 30)
- Increase timeout constants if on slow connection
- Run with --once instead of daemon mode to diagnose
- Check system resources (CPU, memory, disk I/O)
Log Analysis
Check twitter2bsky.log for detailed debugging:
# View recent errors
grep ERROR twitter2bsky.log | tail -20
# View all warnings
grep WARNING twitter2bsky.log | tail -20
# Watch logs in real-time
tail -f twitter2bsky.log
# Count posts by status
grep -c "✅ Posted to Bluesky" twitter2bsky.log
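The same kind of summary can be produced in Python. This sketch assumes only what the grep commands above imply, namely that level names such as ERROR and WARNING appear as uppercase words in each log line. The exact log format is not documented here, and `count_levels` is a hypothetical helper.

```python
import re
from collections import Counter

# Standard logging level names, matched as whole words.
LEVEL_RE = re.compile(r"\b(DEBUG|INFO|WARNING|ERROR|CRITICAL)\b")


def count_levels(lines):
    """Count log lines per level, using the first level name found on each line."""
    counts = Counter()
    for line in lines:
        match = LEVEL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    try:
        with open("twitter2bsky.log", encoding="utf-8", errors="replace") as fh:
            print(dict(count_levels(fh)))
    except FileNotFoundError:
        print("twitter2bsky.log not found")
```

A sudden rise in the ERROR count between runs is usually a good cue to look at the most recent entries with the `grep ERROR` command above.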
🐛 Debugging
Enable debug logging by modifying the logging level in the script:
# In twitter2bsky_daemon.py, change:
level=logging.INFO,
# To:
level=logging.DEBUG,
Run with verbose output:
python twitter2bsky_daemon.py 2>&1 | tee debug.log
Error screenshots are automatically saved as screenshot_YYYYMMDD_HHMMSS.png for investigation.
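When several screenshots accumulate, the timestamp in the filename can be parsed to find the one closest to a given log entry. A small sketch, assuming the `screenshot_YYYYMMDD_HHMMSS.png` pattern stated above; the function names are hypothetical.

```python
from datetime import datetime

# strftime/strptime pattern matching screenshot_YYYYMMDD_HHMMSS.png
PATTERN = "screenshot_%Y%m%d_%H%M%S.png"


def screenshot_name(moment):
    """Build a screenshot filename for the given datetime."""
    return moment.strftime(PATTERN)


def screenshot_time(filename):
    """Recover the capture time from a screenshot filename."""
    return datetime.strptime(filename, PATTERN)
```

Because the timestamp is zero-padded and ordered from year to second, sorting the filenames alphabetically also sorts them chronologically.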
📄 License
This project is licensed under the GNU General Public License v3.0. See LICENSE for details.
Summary: You are free to use, modify, and distribute this software, but any modifications must also be open-source under GPLv3.
🤝 Contributing
Contributions are welcome! To contribute:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Make your changes with clear commit messages
- Push to your fork (git push origin feature/amazing-feature)
- Open a Pull Request with a description of your changes
Before submitting:
- Test your changes thoroughly
- Ensure code follows existing style conventions
- Add comments for complex logic
- Update README if adding new features
❓ FAQ
Q: Can I use this on Windows?
A: Yes, but ensure you have Python 3.9+ and Chromium/Playwright support. Use venv\Scripts\activate instead of source venv/bin/activate.
Q: How do I avoid posting duplicates?
A: The state file (twitter2bsky_state.json) tracks all posted content. It's automatically maintained; just don't delete it between runs.
Q: Can I post to multiple Bluesky accounts?
A: Currently, the tool posts to one account per instance. Run multiple instances with different .env configurations to handle multiple accounts.
Q: What happens if posting fails?
A: The script has automatic retry logic with exponential backoff. Failed posts are logged, but the state file is not updated, so they are retried on the next run.
Q: Is my content optimized for Bluesky?
A: Yes. The tool automatically:
- Truncates text to 300 characters (grapheme-aware)
- Optimizes images to Bluesky specs
- Handles video conversion and compression
- Resolves shortened URLs
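Grapheme-aware truncation means counting what a reader sees as one character, not code points. The project lists the `grapheme` package for this. The sketch below instead uses a standard-library approximation so that it runs without extra installs: it attaches combining marks, variation selectors, and zero-width-joiner sequences to the preceding character, but it does not implement the full Unicode segmentation rules (flags and skin-tone modifiers, for instance, are not handled). The trailing ellipsis is also an assumption, not confirmed behavior of the tool.

```python
import unicodedata

ZWJ = "\u200d"  # zero-width joiner, used in emoji sequences


def split_clusters(text):
    """Approximate grapheme clusters: a base character plus any marks or ZWJ-joined characters."""
    clusters = []
    for ch in text:
        extends = unicodedata.category(ch) in ("Mn", "Me") or ch == ZWJ
        joins = bool(clusters) and clusters[-1].endswith(ZWJ)
        if clusters and (extends or joins):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def truncate(text, limit=300):
    """Cut text to at most `limit` clusters, ending with an ellipsis if anything was removed."""
    clusters = split_clusters(text)
    if len(clusters) <= limit:
        return text
    return "".join(clusters[: limit - 1]) + "…"
```

Cutting on cluster boundaries avoids splitting an accented letter or a joined emoji in half, which a plain `text[:300]` slice can do.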
Q: How do I run this on a server?
A: Use the systemd service example in the Usage section, or set up a cron job.
Q: Can I schedule posts?
A: Not directly through this tool. Instead, use cron/scheduler to run the script at desired times.
🎯 Use Cases
- Content Creators: Automatically repost your RSS feeds to Bluesky for wider reach
- News Aggregation: Create Bluesky bots that share news from multiple RSS sources
- Account Management: Keep social media accounts synchronized across platforms
- Content Distribution: Distribute content from Twitter to Bluesky without manual copying
🔐 Security Notes
- Never commit .env: Credentials are automatically gitignored
- Secure your state file: twitter2bsky_state.json may contain URLs; protect it like credentials
- Use app passwords: For Bluesky, use app-specific passwords instead of main account password
- Monitor logs: Regularly review twitter2bsky.log for unauthorized access attempts
📞 Support
- Issues: Open an issue on GitHub with detailed reproduction steps
- Documentation: Check this README and inline code comments
- Logs: Attach relevant log excerpts when reporting issues
- Testing: Test with the --test flag before running in production
📝 Changelog
See Git commit history for detailed changes. Notable versions:
- v2.0: Added Twitter scraping with media support, daemon mode
- v1.5: Improved RSS parsing and media handling
- v1.0: Initial release with basic RSS→Bluesky posting
Disclaimer
This tool is for personal use and automation. Ensure compliance with the terms of service of Bluesky, Twitter, and any RSS sources you use. Respect rate limits and avoid spamming.