Gemini JSONL Dataset Generator

A lightweight Python script that uses the Google Gemini API to generate structured datasets in JSONL format.
It reads the base prompt from a .txt file whose path is set in your .env file, then writes each generated object as its own line in a .jsonl file.
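The line-by-line output format can be illustrated with a short sketch. The helper names `append_jsonl` and `read_jsonl` are hypothetical and are not taken from main.py; the sketch only shows what "one JSON object per line" means in practice and how such a file can be read back.

```python
import json
from typing import Any, Iterable, Iterator


def append_jsonl(path: str, objects: Iterable[dict[str, Any]]) -> None:
    """Append each object to `path` as one JSON document per line."""
    with open(path, "a", encoding="utf-8") as f:
        for obj in objects:
            # json.dumps never emits raw newlines, so each object stays on one line.
            f.write(json.dumps(obj, ensure_ascii=False) + "\n")


def read_jsonl(path: str) -> Iterator[dict[str, Any]]:
    """Yield one parsed object per non-empty line of `path`."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)
```

Opening the file in append mode means objects from successive API calls accumulate in the same file, which is one plausible reading of "line-by-line"; the real script may open the file differently.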


Installation

git clone https://github.com/erkerem2/gemini-jsonl-generator.git
cd gemini-jsonl-generator
pip install -r requirements.txt

Create a .env file

GEMINI_KEY="your_api_key"
PROMPT_FILE="path/to/prompt.txt"
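
As a rough sketch of how these two variables might be consumed, the snippet below reads them from a mapping (by default the process environment) and loads the prompt text. It assumes the .env values have already been exported into the environment, for example by a loader such as python-dotenv; the function name `load_config` is hypothetical and not part of this repository.

```python
import os
from pathlib import Path
from typing import Mapping


def load_config(env: Mapping[str, str] = os.environ) -> tuple[str, str]:
    """Return (api_key, prompt_text) from GEMINI_KEY and PROMPT_FILE."""
    api_key = env.get("GEMINI_KEY", "").strip()
    prompt_file = env.get("PROMPT_FILE", "").strip()
    if not api_key:
        raise RuntimeError("GEMINI_KEY is not set")
    if not prompt_file:
        raise RuntimeError("PROMPT_FILE is not set")
    # PROMPT_FILE holds a path; the prompt itself lives in that text file.
    prompt_text = Path(prompt_file).read_text(encoding="utf-8")
    return api_key, prompt_text
```

Failing early with a clear message when either variable is missing avoids spending an API call on an empty prompt or key.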

Usage

python main.py --total 100 --per_call 20 --outfile output.jsonl --temperature 0.7
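
With the values above, 100 objects at 20 per call works out to 5 API calls. The sketch below shows that arithmetic, including one way to handle a total that is not a multiple of the per-call size. The function `plan_batches` is hypothetical, and how main.py actually treats a remainder is not documented here.

```python
def plan_batches(total: int, per_call: int) -> list[int]:
    """Split `total` objects into per-call request sizes."""
    if total < 0 or per_call <= 0:
        raise ValueError("total must be >= 0 and per_call must be > 0")
    full_calls, remainder = divmod(total, per_call)
    sizes = [per_call] * full_calls
    if remainder:
        # Assumption: the last call asks only for the objects still missing.
        sizes.append(remainder)
    return sizes


print(plan_batches(100, 20))  # [20, 20, 20, 20, 20]
```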

Arguments

| Argument | Description | Default |
| --- | --- | --- |
| --api_key | Gemini API key (can be read from .env) | .env:GEMINI_KEY |
| --model | Model name | gemini-2.5-flash-lite |
| --total | Total number of JSON objects to generate | 6 |
| --per_call | Number of objects per API call | 2 |
| --outfile | Output file name | output.jsonl |
| --temperature | Sampling temperature | 0.8 |
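
For orientation, the arguments listed above could be declared with `argparse` roughly as follows. This is a sketch that mirrors the documented names and defaults, not a copy of main.py; the function name `build_parser` and the choice of `int`/`float` types are assumptions.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented arguments and defaults."""
    parser = argparse.ArgumentParser(description="Gemini JSONL Dataset Generator")
    # Assumption: the key falls back to the GEMINI_KEY environment variable.
    parser.add_argument("--api_key", default=os.getenv("GEMINI_KEY"),
                        help="Gemini API key")
    parser.add_argument("--model", default="gemini-2.5-flash-lite",
                        help="Model name")
    parser.add_argument("--total", type=int, default=6,
                        help="Total number of JSON objects to generate")
    parser.add_argument("--per_call", type=int, default=2,
                        help="Number of objects per API call")
    parser.add_argument("--outfile", default="output.jsonl",
                        help="Output file name")
    parser.add_argument("--temperature", type=float, default=0.8,
                        help="Sampling temperature")
    return parser
```

Any argument omitted on the command line keeps its default, so the Usage command above overrides only --total, --per_call, and --temperature, and passes the default --outfile explicitly.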