Folder Word Counter

A Python utility that aggregates word counts across a folder of .docx files. It is built for very large collections: streaming generators keep memory usage low, and the script records text "milestones" at a fixed interval (e.g., every millionth word).

🚀 Key Features

Zero-RAM Streaming: Handles very large collections by yielding words one at a time through generators rather than loading every file's text into memory at once.
Milestone Snapshots: Automatically identifies and captures a 10-word searchable snippet every time a global word count milestone (default: 1,000,000 words) is reached.
Universal Folder Picker: Includes a cross-platform GUI (Windows/macOS) to select directories, with a terminal fallback for headless environments.
Table Extraction: Extracts and counts the text inside Word tables, which basic counters often miss.
Clean CLI Output: Generates a formatted table showing individual file counts and a final global total.
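
The streaming, milestone, and table features above can be sketched roughly as follows. This is a minimal illustration, not the code in counter.py: the function names `iter_docx_text`, `iter_words`, and `count_with_milestones` are hypothetical, and it assumes a snippet consists of the words leading up to and including the milestone word. Only `Document`, `paragraphs`, `tables`, `rows`, `cells`, and `text` come from python-docx.

```python
from collections import deque
from typing import Iterable, Iterator, List, Tuple

MILESTONE_INTERVAL = 1_000_000  # README default
SNIPPET_SIZE = 10               # README default


def iter_docx_text(path: str) -> Iterator[str]:
    """Yield paragraph text, then table-cell text, from one .docx file."""
    from docx import Document  # provided by python-docx; imported lazily

    doc = Document(path)
    for para in doc.paragraphs:
        yield para.text
    # Top-level tables only; merged cells may be reported more than once.
    for table in doc.tables:
        for row in table.rows:
            for cell in row.cells:
                yield cell.text


def iter_words(chunks: Iterable[str]) -> Iterator[str]:
    """Yield whitespace-separated words one at a time from text chunks."""
    for text in chunks:
        yield from text.split()


def count_with_milestones(
    words: Iterable[str],
    interval: int = MILESTONE_INTERVAL,
    snippet_size: int = SNIPPET_SIZE,
) -> Tuple[int, List[Tuple[int, str]]]:
    """Count words and record a snippet each time `interval` is crossed."""
    recent = deque(maxlen=snippet_size)  # only the last few words are kept
    milestones: List[Tuple[int, str]] = []
    total = 0
    for word in words:
        total += 1
        recent.append(word)
        if total % interval == 0:
            # Assumption: the snippet ends at the milestone word.
            milestones.append((total, " ".join(recent)))
    return total, milestones


total, hits = count_with_milestones(
    iter_words(["one two three", "four five six seven"]),
    interval=3,
    snippet_size=2,
)
print(total, hits)  # 7 [(3, 'two three'), (6, 'five six')]
```

Because the counter holds only a running total and a fixed-size deque, its own memory footprint does not grow with the number of words it sees.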

🛠️ Requirements

Python 3.x
Library: python-docx
OS: Windows or macOS (GUI folder picker supported on both).

Installation

pip install python-docx

💻 Configuration

You can customize the script behavior by editing the globals at the top of counter.py:

Global	Purpose	Default
`MILESTONE_INTERVAL`	Frequency of snippets (in words).	`1_000_000`
`SNIPPET_SIZE`	Length of the searchable string captured.	`10`
`COL_WIDTH_FILE`	Adjusts terminal table width for long filenames.	`45`
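
As a rough illustration of how these globals might look and how `COL_WIDTH_FILE` could shape the output table, here is a small sketch. The `format_row` helper, the truncation with "...", and the 12-character count column are assumptions made for the example, not taken from counter.py.

```python
# Defaults as listed in the table above.
MILESTONE_INTERVAL = 1_000_000  # words between snippets
SNIPPET_SIZE = 10               # words captured per snippet
COL_WIDTH_FILE = 45             # width of the filename column


def format_row(filename: str, count: int, width: int = COL_WIDTH_FILE) -> str:
    """Return one table row: left-aligned name, right-aligned count."""
    if len(filename) > width:
        # Hypothetical choice: shorten long names so columns stay aligned.
        filename = filename[: width - 3] + "..."
    return f"{filename:<{width}} {count:>12,}"


print(format_row("chapter_01.docx", 48213))
print(format_row("a_very_long_filename_" * 4 + ".docx", 1_250_000))
```

Raising `COL_WIDTH_FILE` in a layout like this widens the first column, so longer filenames fit without being shortened.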

📖 Usage

Run the script:

python counter.py

A folder selection dialog will appear. Select the folder containing your .docx files.
The script will process files alphabetically, printing a live tally to the terminal.
View your Milestone Snapshots at the end of the report to see where each milestone (by default, every million words) was reached.
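
The folder selection step, with its terminal fallback, could look something like the sketch below. It assumes the dialog is built with the standard-library `tkinter` module, which this README does not confirm. `choose_folder` and its `use_gui` and `ask` parameters are hypothetical names introduced for the example.

```python
from typing import Callable


def choose_folder(use_gui: bool = True, ask: Callable[[str], str] = input) -> str:
    """Return a folder path from a GUI dialog, or from a terminal prompt."""
    if use_gui:
        try:
            import tkinter as tk
            from tkinter import filedialog

            root = tk.Tk()
            root.withdraw()  # hide the empty main window
            try:
                path = filedialog.askdirectory(title="Select a folder of .docx files")
            finally:
                root.destroy()
            if path:  # askdirectory returns "" when the dialog is cancelled
                return path
        except Exception:
            # No display or no tkinter available: fall through to the prompt.
            pass
    return ask("Folder containing .docx files: ").strip()
```

Passing `use_gui=False` forces the terminal prompt, which is convenient on servers and in scripted runs.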

⚠️ Limitations

Encrypted Files: The script will skip password-protected .docx files as they cannot be read via XML streaming.
Non-Docx: Only files ending in .docx are processed; legacy .doc files and .rtf files are ignored.
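
Both limitations can be pictured with a short sketch: one helper that keeps only .docx files in alphabetical order, and one that skips any file whose reader raises an error. The names `find_docx_files` and `count_or_skip` are hypothetical, and matching the extension case-insensitively is an assumption rather than documented behavior.

```python
from pathlib import Path
from typing import Callable, List, Optional


def find_docx_files(folder: str) -> List[Path]:
    """Return the .docx files in `folder`, sorted alphabetically by name."""
    return sorted(
        (p for p in Path(folder).iterdir()
         if p.is_file() and p.suffix.lower() == ".docx"),
        key=lambda p: p.name.lower(),
    )


def count_or_skip(path: Path, count_file: Callable[[Path], int]) -> Optional[int]:
    """Return the word count for `path`, or None if the file is unreadable."""
    try:
        return count_file(path)
    except Exception as exc:
        # Password-protected or corrupt files end up here and are skipped.
        print(f"Skipping {path.name}: {exc}")
        return None
```

Here `count_file` stands in for whatever function actually reads a document, so one unreadable file does not stop the rest of the run.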
