CSV Data Cleaner

A small Python script for cleaning CSV files. It removes duplicate rows, replaces missing values (NaN) with empty strings, and saves the cleaned data to a new file. It also reports an error when the input file is missing, empty, or corrupted.

Features

  • Removes duplicate rows
  • Replaces missing values (NaN) with empty strings
  • Handles missing or empty input files gracefully
  • Reports how many duplicate rows were removed
  • Easy to use via command-line arguments
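
The two cleaning steps and the duplicate count can be sketched roughly as follows. This is a minimal illustration, not the actual contents of clean_csv.py; the function name `clean_dataframe` is hypothetical, and the order of the steps is an assumption.

```python
import pandas as pd


def clean_dataframe(df):
    """Hypothetical helper: return (cleaned DataFrame, number of duplicates removed)."""
    # Drop rows that are exact duplicates of an earlier row.
    deduped = df.drop_duplicates()
    removed = len(df) - len(deduped)
    # Replace missing values (NaN/None) with empty strings.
    cleaned = deduped.fillna("")
    return cleaned, removed
```

For example, `cleaned, removed = clean_dataframe(pd.read_csv("input.csv"))` followed by `cleaned.to_csv("output.csv", index=False)` would write the result without the pandas index column.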

Requirements

  • Python 3.x
  • pandas library

Install dependencies with:

pip install pandas

Usage

Run the script from the command line:

python clean_csv.py input.csv output.csv

Arguments

  • input.csv: The path to the original CSV file you want to clean.
  • output.csv: The path where the cleaned CSV file should be saved.

Error Handling

  • If the input file does not exist, the script will notify you.
  • If the input file is empty or corrupted, an appropriate error message will be shown.
  • If the script is run with incorrect arguments, it will display usage instructions.
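
One way these three cases could be handled is sketched below. It is an assumption about the approach, not the script's actual code: the names `load_csv` and `main` and the exact message wording are hypothetical, while `FileNotFoundError`, `pandas.errors.EmptyDataError`, and `pandas.errors.ParserError` are the exceptions `pandas.read_csv` raises for missing, empty, and unparseable files.

```python
import pandas as pd


def load_csv(path):
    """Hypothetical helper: return (DataFrame, None) or (None, error message)."""
    try:
        return pd.read_csv(path), None
    except FileNotFoundError:
        return None, f"Error: file not found: {path}"
    except pd.errors.EmptyDataError:
        return None, f"Error: file is empty: {path}"
    except pd.errors.ParserError:
        return None, f"Error: could not parse file: {path}"


def main(argv):
    """Hypothetical entry point: return 0 on success, 1 on any error."""
    # Expect the script name plus exactly two arguments.
    if len(argv) != 3:
        print("Usage: python clean_csv.py input.csv output.csv")
        return 1
    df, error = load_csv(argv[1])
    if error:
        print(error)
        return 1
    # Remove duplicates, blank out missing values, and save the result.
    df.drop_duplicates().fillna("").to_csv(argv[2], index=False)
    return 0
```

A script built this way would typically end with `sys.exit(main(sys.argv))` under an `if __name__ == "__main__":` guard, so the return value becomes the process exit code.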

License

This project is open source and free to use. No license restrictions.