A recursive wordlist generator written in Python. For each string position, custom character sets can be defined.
Requires Python 3.0 or higher.
From PyPI:

```shell
pip install wlgen
```

From GitHub:

```shell
git clone https://github.com/tehw0lf/wlgen.git
cd wlgen
pip install .
```

The `generate_wordlist()` function automatically selects the optimal algorithm based on problem size:
```python
import wlgen

# Auto-select optimal algorithm
charset = {0: '123', 1: 'ABC', 2: 'xyz'}
wordlist = wlgen.generate_wordlist(charset)

# Use the result (list or iterator depending on size; both are iterable)
for word in wordlist:
    print(word)
```

Selection Strategy:
- Tiny (<1K combinations): uses `gen_wordlist` (fastest, low memory)
- Small (1K-100K): uses `gen_wordlist` (fast and convenient)
- Medium+ (100K+): uses `gen_wordlist_iter` (optimal throughput, constant memory)
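The thresholds above amount to a simple size-based dispatch. A hypothetical sketch of that logic (not the library's actual internals; `select_method` and the exact cutoff are illustrative):

```python
from math import prod

def select_method(charset):
    """Hypothetical dispatch mirroring the thresholds listed above."""
    # Total combinations = product of per-position character-set sizes
    size = prod(len(chars) for chars in charset.values())
    if size < 100_000:
        return 'list'  # small enough to build in memory (gen_wordlist)
    return 'iter'      # stream via gen_wordlist_iter for constant memory

print(select_method({0: '123', 1: 'ABC', 2: 'xyz'}))  # 27 combinations -> 'list'
```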
Options:

```python
# Force memory-efficient mode for any size
wordlist = wlgen.generate_wordlist(charset, prefer_memory_efficient=True)

# Manual algorithm selection
wordlist = wlgen.generate_wordlist(charset, method='iter')   # Force iterator
wordlist = wlgen.generate_wordlist(charset, method='list')   # Force list
wordlist = wlgen.generate_wordlist(charset, method='words')  # Force gen_words

# Skip input validation for pre-cleaned data (performance optimization)
wordlist = wlgen.generate_wordlist(charset, method='iter', clean_input=True)
```

Estimate wordlist size before generation:

```python
size = wlgen.estimate_wordlist_size(charset)
print(f"Will generate {size:,} combinations")
```

Three implementations are available for direct use:
- `gen_wordlist_iter`: Fast generator using `itertools.product` (~780-810K comb/s).
- `gen_wordlist`: Builds the entire list in memory. Fastest for small lists (~900K-1.6M comb/s).
- `gen_words`: Memory-efficient recursive generator (~210-230K comb/s).
All implementations calculate the n-ary Cartesian product of input character sets.
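To illustrate the recursive approach, here is a minimal generator in the spirit of `gen_words` (a sketch only, not the package's implementation):

```python
def words(charset, position=0, prefix=''):
    """Recursively yield every word, choosing one character per position, depth-first."""
    if position == len(charset):
        yield prefix
        return
    for char in charset[position]:
        yield from words(charset, position + 1, prefix + char)

print(list(words({0: '12', 1: 'AB'})))  # ['1A', '1B', '2A', '2B']
```

Each recursion level fixes one string position, so memory use is bounded by the word length rather than the number of combinations.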
```python
import wlgen

charset = {0: '123', 1: 'ABC'}

# Use a specific implementation
for word in wlgen.gen_wordlist_iter(charset):
    print(word)
```

Note: NumPy, CUDA, and multiprocessing were investigated but found to provide no performance benefit for this workload. String operations are fundamentally CPU-bound and too fast for parallelization overhead. See issues #17, #18, #20 for detailed analysis.
This project uses uv for dependency management and development workflow.
Install dependencies:

```shell
uv sync --all-extras --group lint
```

Run tests:

```shell
uv run python -m unittest discover
```

Run performance benchmarks:

```shell
uv run python wlgen/benchmarks/benchmark.py
```

Lint code:

```shell
uv run ruff check
```

Auto-fix linting issues:

```shell
uv run ruff check --fix
```

Format code:

```shell
uv run ruff format
```

Build wheel and source distribution:

```shell
uv build
```