Categorize

Small Python script to categorize data using RegEx.

Install

Simply clone this repo:

git clone git@gitlab.com:databulle/categorize.git

Use

You'll need:

A file containing the rules you want to apply (see syntax below),
A file containing the items you want to categorize (one item per line).

It's quite simple:

python categorize.py -r example-rules.txt -i example-data.csv

Default outputs to a CSV file with semicolon separator: categorized.csv.

Command line arguments:

-h, --help shows help message and exits
-r RULES, --rules RULES indicates rules file (required)
-i INPUT, --input INPUT indicates input file (required)
-o OUTPUT, --output OUTPUT indicates output file (optional, default: categorized.csv)
-s SEP, --sep SEP indicates output file CSV separator (optional, default: ; )

Rules syntax

This script uses standard Python RegEx syntax, and associates each pattern with a category name.
Each line must start with the regular expression pattern, followed by a semicolumn, and end with the category name:

REGEX;CATEGORY_NAME

You'll find some examples in the example-rules.txt file.

Contributing

If you wish to contribute to this repository or to report an issue, please use GitLab.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
.gitignore		.gitignore
LICENCE.txt		LICENCE.txt
README.md		README.md
categorize.py		categorize.py
example-data.csv		example-data.csv
example-rules.txt		example-rules.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Categorize

Install

Use

Rules syntax

Contributing

About

Uh oh!

Releases

Packages

Languages

License

databulle/Categorize

Folders and files

Latest commit

History

Repository files navigation

Categorize

Install

Use

Rules syntax

Contributing

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages