This Python project generates a detailed user persona by analyzing a Reddit user's posts and comments with an LLM. The output covers interests, personality traits, writing tone, and more, and each trait is cited with the Reddit post or comment it was inferred from.
This project was built as part of the Generative AI Internship Assignment for BeyondChats.
- Scrapes recent Reddit posts and comments of a user
- Generates an LLM-backed psychological and stylistic persona
- Cites exact URLs for every inferred trait
- Clean CLI interface
- Fully PEP-8 compliant code structure
- Python 3.10+
- PRAW – Reddit API wrapper
- Google Gemini API
- python-dotenv for secure environment variable management
git clone https://github.com/your-username/reddit-user-persona-generator.git
cd reddit-user-persona-generator
We recommend using a virtual environment:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
Create a .env file in the project root directory with the following content:
# Reddit API (Create from https://www.reddit.com/prefs/apps)
REDDIT_CLIENT_ID=your_reddit_client_id
REDDIT_CLIENT_SECRET=your_reddit_client_secret
# Gemini API (Get from https://ai.google.dev/)
GOOGLE_API_KEY=your_google_gemini_api_key
You can run the script in two ways:
python main.py https://www.reddit.com/user/kojied/
python main.py
If you run it without an argument, you will be prompted to enter a Reddit profile URL or just the username.
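Because the script accepts either a profile URL or a bare username, the input has to be normalized before anything else happens. The following is a minimal sketch of how that could be done, not the project's actual code: the function name `extract_username` and the regular expressions are hypothetical, and the 3 to 20 character username pattern is an assumption about Reddit's naming rules.

```python
import re

# Assumed username shape: 3-20 letters, digits, underscores, or hyphens.
_USERNAME = r"[A-Za-z0-9_-]{3,20}"

# Matches profile URLs such as https://www.reddit.com/user/kojied/
# and the short form reddit.com/u/kojied (scheme and subdomain optional).
_PROFILE_URL = re.compile(
    rf"^(?:https?://)?(?:(?:www|old|new)\.)?reddit\.com/u(?:ser)?/({_USERNAME})/?(?:[?#].*)?$",
    re.IGNORECASE,
)

# Matches a bare username, optionally written as u/name or /u/name.
_BARE_NAME = re.compile(rf"^(?:/?u/)?({_USERNAME})$")


def extract_username(raw: str) -> str:
    """Return the username from a profile URL or a bare username.

    Hypothetical helper for illustration; raises ValueError when the
    input matches neither form.
    """
    text = raw.strip()
    match = _PROFILE_URL.match(text) or _BARE_NAME.match(text)
    if match is None:
        raise ValueError(f"Not a Reddit profile URL or username: {raw!r}")
    return match.group(1)


if __name__ == "__main__":
    print(extract_username("https://www.reddit.com/user/kojied/"))
    print(extract_username("Hungry-Move-6603"))
```

Rejecting anything that is neither form keeps a mistyped URL from being sent to the Reddit API as if it were a username.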
- Extracts the username from the provided URL (or direct input).
- Scrapes up to 30 posts and 30 comments using the Reddit API.
- Uses Google Gemini LLM to generate a structured persona.
- Saves the output in two formats:
  - 📄 kojied_persona.txt (text-based for terminal and evaluation)
  - 📝 kojied_persona.md (Markdown-formatted for GitHub)
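The scraping step can be sketched as follows. This is an illustration under stated assumptions rather than the code in persona_utils.py: `Item` and `collect_activity` are hypothetical names, and the function only relies on the `submissions.new()` and `comments.new()` listings and the `title`, `selftext`, `body`, and `permalink` attributes that PRAW exposes. It stores an absolute URL with every item so that each trait can later be cited.

```python
from dataclasses import dataclass

REDDIT_BASE = "https://www.reddit.com"


@dataclass
class Item:
    """One piece of user activity (hypothetical container)."""
    kind: str  # "post" or "comment"
    text: str  # title plus body for posts, body for comments
    url: str   # absolute URL, usable as a citation


def collect_activity(redditor, limit=30):
    """Collect up to `limit` newest posts and `limit` newest comments.

    `redditor` is expected to behave like a praw.models.Redditor.
    PRAW permalinks are site-relative, so the base URL is prepended.
    """
    items = []
    for post in redditor.submissions.new(limit=limit):
        text = f"{post.title}\n{post.selftext}".strip()
        items.append(Item("post", text, REDDIT_BASE + post.permalink))
    for comment in redditor.comments.new(limit=limit):
        items.append(Item("comment", comment.body, REDDIT_BASE + comment.permalink))
    return items


# Assumed usage with PRAW installed and the .env values loaded:
#   import os, praw
#   reddit = praw.Reddit(
#       client_id=os.environ["REDDIT_CLIENT_ID"],
#       client_secret=os.environ["REDDIT_CLIENT_SECRET"],
#       user_agent="persona-generator (placeholder)",  # value is an assumption
#   )
#   items = collect_activity(reddit.redditor("kojied"))
```

Passing the redditor object in, instead of creating the client inside the function, makes the step easy to exercise with a stub and without network access.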
Each persona file (.txt and .md) includes:
- 🎯 Interests
- 🤔 Personality traits
- 🗣️ Tone of writing
- 👨🎓 Possible profession or education
- 😂 Language style or humor
- 🌎 Political/social leanings (if any)
- 🚫 Limitations
- 🔗 Citations for each trait (Reddit post or comment URL)
- 💬 (Optional) Representative quote
- ✅ (Optional) Goals and needs
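To show how one persona could end up in both a .txt and a .md file, here is a minimal sketch of a renderer. The `sections` layout (a heading mapped to a list of claim and URL-list pairs) and the function name `render_persona` are assumptions made for illustration; the real files may be laid out differently. The trait and URL in the usage part are placeholders, not real analysis output.

```python
def render_persona(username, sections, markdown=False):
    """Render persona sections as plain text or Markdown.

    `sections` maps a heading (e.g. "Interests") to a list of
    (claim, [citation_urls]) pairs. Hypothetical structure.
    """
    title = f"Persona: u/{username}"
    lines = [f"# {title}" if markdown else title, ""]
    for heading, traits in sections.items():
        lines.append(f"## {heading}" if markdown else heading.upper())
        for claim, urls in traits:
            if markdown:
                # Numbered links keep long URLs out of the visible text.
                cites = ", ".join(f"[{i}]({u})" for i, u in enumerate(urls, 1))
            else:
                cites = ", ".join(urls)
            # Flag uncited traits instead of silently dropping the gap.
            lines.append(f"- {claim} ({cites or 'no citation found'})")
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    example = {
        "Interests": [
            ("placeholder trait", ["https://www.reddit.com/r/example/comments/abc/"]),
        ],
    }
    print(render_persona("kojied", example))
    print(render_persona("kojied", example, markdown=True))
```

Rendering both formats from one data structure keeps the citations identical across the two files.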
reddit-user-persona-generator/
│
├── main.py # Entry point for CLI or prompt-based input
├── persona_utils.py # All scraping, LLM generation, saving logic
├── requirements.txt # Python dependencies
├── .env # Stores API keys (excluded from Git)
├── kojied_persona.txt # Sample output (text format)
├── kojied_persona.md # Sample output (Markdown format)
├── Hungry-Move-6603_persona.txt
├── Hungry-Move-6603_persona.md
└── README.md # README file
All code follows PEP-8 standards for style and formatting. Verified using:
flake8 main.py persona_utils.py
- Ensure your Reddit app is created as a script app, not web or installed.
- If the Gemini API throws a quota error, try reducing the post/comment limit or using a smaller model.
- Only public Reddit data is used; no login or upvote activity is tracked.
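The quota advice above can also be automated. The sketch below retries the generation step with half as many items each time a quota error occurs. It is an assumption-laden illustration: `generate_with_backoff`, `generate`, and `is_quota_error` are hypothetical names, and because the exception raised on a quota error depends on the Gemini SDK in use, the check is passed in as a predicate instead of being hard-coded.

```python
def generate_with_backoff(generate, items, is_quota_error, min_items=5):
    """Call generate(items); on a quota error, retry with fewer items.

    generate:        callable taking a list of items, returning the persona
    is_quota_error:  callable taking an exception, returning True when it
                     signals an exhausted quota (SDK-specific, so supplied
                     by the caller)
    min_items:       stop shrinking at this size and re-raise the error
    """
    count = len(items)
    while True:
        try:
            return generate(items[:count])
        except Exception as exc:
            # Re-raise unrelated errors, or give up once the input is small.
            if not is_quota_error(exc) or count <= min_items:
                raise
            count = max(min_items, count // 2)


if __name__ == "__main__":
    # Stand-in for the LLM call: fails while the input is "too large".
    def fake_generate(batch):
        if len(batch) > 10:
            raise RuntimeError("quota exceeded")
        return f"persona from {len(batch)} items"

    result = generate_with_backoff(
        fake_generate,
        list(range(60)),
        is_quota_error=lambda exc: "quota" in str(exc),
    )
    print(result)
```

Since the items are kept newest first, truncating the list drops the oldest activity and leaves the most recent posts and comments in place.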