A Reddit data scraping tool with a Streamlit web interface. It extracts posts from subreddits and comments from specific posts, and exports the results to CSV.
- 📱 User-friendly web interface
- 🔍 Scrape posts from any subreddit
- 💬 Extract comments from specific posts
- 📊 Export data to CSV
- ⏱️ Time-based filtering
- 🔄 Caching for better performance
- Python - Core programming language
- Streamlit - Web interface framework
- PRAW - Reddit API wrapper
- Pandas - Data manipulation and analysis
- python-dotenv - Environment variable management
- Python 3.9 or higher
- Reddit API credentials (client ID, client secret, and user agent)
- Clone the repository:
git clone https://github.com/pakagronglb/reddit-scraper.git
cd reddit-scraper
- Create a virtual environment:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Set up environment variables:
Create a .env file in the project root:
REDDIT_CLIENT_ID=your_client_id
REDDIT_CLIENT_SECRET=your_client_secret
REDDIT_USER_AGENT=your_user_agent
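The three variables above map directly onto the arguments PRAW expects. As a minimal sketch, the values could be loaded and validated as shown here. The names `read_credentials` and `make_reddit_client` are hypothetical and are not taken from this project's main.py.

```python
import os
from typing import Mapping

REQUIRED_KEYS = ("REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET", "REDDIT_USER_AGENT")


def read_credentials(env: Mapping[str, str]) -> dict:
    """Return the three Reddit settings from env, or raise if any is missing or empty."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {
        "client_id": env["REDDIT_CLIENT_ID"],
        "client_secret": env["REDDIT_CLIENT_SECRET"],
        "user_agent": env["REDDIT_USER_AGENT"],
    }


def make_reddit_client():
    """Build a PRAW client from .env (hypothetical helper; assumes praw and python-dotenv are installed)."""
    # Imported here so read_credentials stays usable without the third-party packages.
    import praw
    from dotenv import load_dotenv

    load_dotenv()  # adds entries from .env to os.environ; existing variables are kept
    return praw.Reddit(**read_credentials(os.environ))
```

Failing early with the list of missing names tends to be easier to debug than an authentication error raised later by the Reddit API.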
- Start the application:
streamlit run main.py
- Access the web interface at http://localhost:8501
- Choose your scraping option:
  - Subreddit Posts: Enter subreddit name, post limit, and time filter
  - Specific Post: Enter the Reddit post URL
- Click "Scrape" and download the results as CSV
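The "Subreddit Posts" option corresponds to a PRAW listing call followed by a CSV export. The following is a sketch under stated assumptions, not the code in main.py: the helper names and the chosen columns are hypothetical, and the standard-library csv module stands in for Pandas so the example runs without extra dependencies.

```python
import csv
import io
from datetime import datetime, timezone
from typing import Any, Iterable

# Values PRAW accepts for the time_filter argument of subreddit.top().
TIME_FILTERS = ("all", "day", "hour", "month", "week", "year")

# Hypothetical column selection; the real app may export different fields.
COLUMNS = ("id", "title", "score", "num_comments", "created", "url")


def submission_to_row(submission: Any) -> dict:
    """Flatten one submission-like object into a plain dict."""
    created = datetime.fromtimestamp(submission.created_utc, tz=timezone.utc)
    return {
        "id": submission.id,
        "title": submission.title,
        "score": submission.score,
        "num_comments": submission.num_comments,
        "created": created.isoformat(),
        "url": submission.url,
    }


def rows_to_csv(rows: Iterable[dict]) -> str:
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def scrape_top_posts(reddit: Any, subreddit: str, limit: int, time_filter: str) -> str:
    """Fetch top posts through a praw.Reddit instance and return them as CSV text."""
    if time_filter not in TIME_FILTERS:
        raise ValueError(f"time_filter must be one of {TIME_FILTERS}")
    listing = reddit.subreddit(subreddit).top(time_filter=time_filter, limit=limit)
    return rows_to_csv(submission_to_row(s) for s in listing)
```

With a real client, a call such as `scrape_top_posts(reddit, "python", 25, "week")` would return CSV text that can be written to a file or offered as a download.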
- Push your code to GitHub
- Visit share.streamlit.io
- Connect your repository
- Add your Reddit API credentials in Streamlit secrets
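A deployment on share.streamlit.io has no .env file, so the credentials have to come from Streamlit secrets instead. One way to support both environments is to check the secrets mapping first and fall back to environment variables. This is only a sketch: `resolve_setting` is a hypothetical helper, and whether main.py does anything similar is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    name: str,
    secrets: Mapping[str, str],
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Look up name in secrets first, then in the environment."""
    environ = os.environ if environ is None else environ
    if name in secrets:
        return secrets[name]
    if name in environ:
        return environ[name]
    raise KeyError(f"{name} is not set in secrets or environment variables")


# In a Streamlit app the first mapping would be st.secrets, for example:
#   import streamlit as st
#   client_id = resolve_setting("REDDIT_CLIENT_ID", st.secrets)
```

Accessing `st.secrets` may raise an error when no secrets file is configured, so a local run without .streamlit/secrets.toml would likely need to pass an empty dict as the first mapping.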
- Create a Heroku app:
heroku create your-app-name
- Set environment variables:
heroku config:set REDDIT_CLIENT_ID=your_client_id
heroku config:set REDDIT_CLIENT_SECRET=your_client_secret
heroku config:set REDDIT_USER_AGENT=your_user_agent
- Deploy:
git push heroku main
- requirements.txt - Project dependencies
- .env - Local environment variables
- Procfile - Heroku deployment configuration
- runtime.txt - Python runtime specification
- Never commit your .env file or .streamlit/secrets.toml
- Use environment variables for sensitive data
- Keep your Reddit API credentials secure
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
Your Name - @pakagronglb
Project Link: https://github.com/pakagronglb/reddit-scraper