A fully local, end-to-end pipeline that transforms real Reddit activity into structured user personas, using the Reddit API, Python, and the Mistral LLM running via Ollama.
Imagine being able to understand any user deeply — just by reading how they talk online.
Reddit Persona Generator does exactly that.
It takes a Reddit profile, scrapes public posts/comments, and uses a local LLM to generate a structured persona — complete with:
- Behavior patterns
- Frustrations
- Goals
- Motivations
- Tone
- Interests
…and even citations for every insight from real Reddit activity.
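One way to picture the output is as a small data structure that pairs each insight with the Reddit URL it came from. The sketch below is only an illustration: the six category names come from the list above, but `Insight`, `Persona`, and the rendered text layout are hypothetical and not taken from the project's code.

```python
from dataclasses import dataclass, field

# The six categories listed above. The class layout is an assumption.
CATEGORIES = (
    "Behavior patterns",
    "Frustrations",
    "Goals",
    "Motivations",
    "Tone",
    "Interests",
)


@dataclass
class Insight:
    text: str
    source_url: str  # permalink of the post or comment backing the insight


@dataclass
class Persona:
    username: str
    insights: dict[str, list[Insight]] = field(default_factory=dict)

    def add(self, category: str, text: str, source_url: str) -> None:
        """Record one insight under a known category, with its citation."""
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        self.insights.setdefault(category, []).append(Insight(text, source_url))

    def render(self) -> str:
        """Return a plain-text persona with a citation after every insight."""
        lines = [f"Persona for u/{self.username}"]
        for category in CATEGORIES:
            lines.append(f"\n{category}:")
            for insight in self.insights.get(category, []):
                lines.append(f"- {insight.text} (source: {insight.source_url})")
        return "\n".join(lines)
```

Keeping the URL next to each insight means a citation cannot be dropped when the persona is written to disk.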
LLMs are powerful — but without real-world grounding, they're just words.
This project shows how LLMs can extract genuine human insight from everyday conversations online — for use cases like:
- Personalized marketing
- User segmentation
- Behavioral analysis
- Agent customization
And the best part? All LLM inference runs locally, using Mistral with Ollama. Only the Reddit scraping step needs network access.
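As a rough sketch of what local inference can look like, the snippet below sends a prompt to an Ollama server through its REST endpoint on the default port. This is an assumption about one possible integration: the project may instead use the `ollama` Python package or the CLI. The helper names `build_payload` and `generate` are hypothetical.

```python
import json
import urllib.request

# Default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(prompt: str, model: str = "mistral") -> bytes:
    """Encode a non-streaming generate request as JSON bytes."""
    body = {"model": model, "prompt": prompt, "stream": False}
    return json.dumps(body).encode("utf-8")


def generate(prompt: str, model: str = "mistral", timeout: float = 300.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(prompt, model),  # supplying data makes this a POST
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["response"]
```

Calling `generate("Summarize this user's tone: ...")` requires the model to be pulled and the Ollama server to be running. No request leaves the machine.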
- Full Reddit post + comment scraping via `praw`
- Structured personas with 6 rich categories
- Citations for every insight (Reddit URLs)
- LLM-based generation using `Mistral` via `Ollama` (local)
- Invalid profile handling (e.g. 404s)
- Support for multiple personas per user (prompt depth demo)
- PEP8-compliant, modular Python code
- No OpenAI or cloud dependency
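The scraping, citation, and invalid-profile features in this list could fit together roughly as follows. This is a minimal sketch, not the code in `reddit_scraper.py`: the function name `fetch_activity`, the `limit` default, and the dictionary layout are assumptions. The `reddit` argument is expected to be a `praw.Reddit` instance.

```python
def fetch_activity(reddit, username, limit=50):
    """Collect recent posts and comments, each with a permalink for citation.

    Returns an empty list if the profile cannot be fetched.
    """
    base = "https://www.reddit.com"
    items = []
    try:
        redditor = reddit.redditor(username)
        # praw listings are lazy, so request errors surface during iteration.
        for post in redditor.submissions.new(limit=limit):
            items.append({
                "kind": "post",
                "text": f"{post.title}\n{post.selftext}".strip(),
                "url": base + post.permalink,
            })
        for comment in redditor.comments.new(limit=limit):
            items.append({
                "kind": "comment",
                "text": comment.body,
                "url": base + comment.permalink,
            })
    except Exception as exc:  # e.g. a 404 for a profile that does not exist
        print(f"Error fetching data for {username}: {exc}")
        return []
    return items
```

Catching a broad `Exception` keeps the sketch short. A stricter version might catch only the specific not-found error and let other failures propagate.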
Clone the repository and install the dependencies:

    git clone https://github.com/Divyansh-git10/Reddit-Persona-Generator.git
    cd Reddit-Persona-Generator
    pip install -r requirements.txt

Add your Reddit API credentials to `.env`:

    CLIENT_ID=your_client_id
    CLIENT_SECRET=your_client_secret
    USER_AGENT=your_user_agent

Start the model:

    ollama run mistral

Update `username = "..."` in `main.py` and run:

    python main.py

Output will be saved in `output/<username>_persona.txt`.
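For orientation, the three `.env` values map directly onto the arguments of `praw.Reddit`. The sketch below shows one way to load and validate them. The helper names `read_credentials` and `make_reddit_client` are hypothetical, and the project's own loading code may differ.

```python
import os

REQUIRED_KEYS = ("CLIENT_ID", "CLIENT_SECRET", "USER_AGENT")


def read_credentials(env):
    """Map the .env variable names onto praw.Reddit keyword arguments."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing Reddit credentials: " + ", ".join(missing))
    return {
        "client_id": env["CLIENT_ID"],
        "client_secret": env["CLIENT_SECRET"],
        "user_agent": env["USER_AGENT"],
    }


def make_reddit_client():
    """Build a read-only praw client from the values in .env."""
    # Imported lazily so read_credentials works without third-party packages.
    import praw
    from dotenv import load_dotenv

    load_dotenv()  # copies the values from .env into os.environ
    return praw.Reddit(**read_credentials(os.environ))
```

Failing early on a missing key gives a clearer message than an authentication error from Reddit later in the run.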
| Reddit Username | Persona Type |
|---|---|
| kojied | Urban, reflective tech worker |
| Hungry-Move-6603 | Civic voice, realism, political concern |
| St0rmCh4ser | Tech & crypto guide |
| drywallwizard | DIY expert, knowledge sharer |
| TalesOfTheLost | (x2) Spiritual / WWII historian |
| GallowBoob | Meme king, pet lover, humorous content creator |
Multiple personas for the same user (`TalesOfTheLost`) demonstrate prompt adaptability and persona diversity.
The script gracefully exits if a profile doesn't exist or has no data:
    Error fetching data for MasterOfNone_92: received 404 HTTP response
    No data found. Skipping persona generation.

Reddit-Persona-Generator/
├── main.py
├── reddit_scraper.py
├── persona_generator.py
├── requirements.txt
├── README.md
├── .env # (excluded)
├── raw/
│ ├── kojied_raw.txt
│ ├── ...
├── output/
│ ├── kojied_persona.txt
│ ├── TalesOfTheLost_spiritual.txt
│ ├── GallowBoob_persona.txt
│ └── ...
- Python 3.11+
- `praw` for Reddit scraping
- `ollama` for local model runtime
- `Mistral` (7B open-weight LLM)
- `.env` config via `python-dotenv`
Divyansh Gautam, Internship Applicant @ BeyondChats (GitHub)
This project was built in under 48 hours as part of the BeyondChats Generative AI Internship Assignment.
The goal was to demonstrate:
- Real-world LLM application
- Prompt design intuition
- Local-first inference
- Strong persona modeling
Thanks for the opportunity 🙏