🛠️ Data Wrangling Scripts & Workarounds

Welcome to the Data Wrangling Scripts & Workarounds repo!
This repository is a curated collection of helpful scripts, clever workarounds, and creative hacks for manipulating, cleaning, transforming, and preparing data, gathered from a wide range of scenarios across my career. Whether you're untangling messy spreadsheets, reformatting inconsistent datasets, or building quick fixes for ETL pipelines, this repo has your back. Of course, there is more than one way to do something, so use what you want. I am just happy to share if I can!
📦 What's Inside

You'll find an eclectic mix of:
Data cleaning scripts - Remove duplicates, standardize values, parse malformed text, and more.
File format converters - Tools to convert between CSV, JSON, Parquet, Excel, and other formats.
String parsing and regex helpers - Handle semi-structured text with smart extraction and transformation logic.
Date and time munging - Fix timezone issues, reformat timestamps, and align inconsistent time formats.
API interaction scripts - Quick scripts to pull data from APIs and normalize it for analysis.
Shell/data pipeline hacks - Lightweight CLI utilities and awk/sed/jq one-liners to make data behave.
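To give a flavor of the cleaning and date-munging categories above, here is a minimal sketch using only the Python standard library. It is not taken from any script in this repo: the sample data, the column names (name, city, signup), and the helper clean_rows are all hypothetical and exist purely for illustration. It trims whitespace, standardizes casing, converts timestamps to UTC, and drops rows that become duplicates after cleaning.

```python
import csv
import io
import json
from datetime import datetime, timezone

# Hypothetical sample data; the column names are assumptions for illustration.
RAW_CSV = """name,city,signup
 Alice ,new york,2024-01-05T09:30:00-05:00
alice,New York,2024-01-05T09:30:00-05:00
Bob,  BOSTON,2024-02-10T18:00:00+01:00
"""


def clean_rows(text):
    """Trim whitespace, standardize casing, convert timestamps to UTC, drop duplicates."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        name = row["name"].strip().title()
        city = row["city"].strip().title()
        # fromisoformat keeps the UTC offset; astimezone converts the value to UTC.
        signup = datetime.fromisoformat(row["signup"].strip()).astimezone(timezone.utc)
        record = {"name": name, "city": city, "signup": signup.isoformat()}
        # Deduplicate on the cleaned values, not the raw ones.
        key = (record["name"], record["city"], record["signup"])
        if key not in seen:
            seen.add(key)
            cleaned.append(record)
    return cleaned


if __name__ == "__main__":
    print(json.dumps(clean_rows(RAW_CSV), indent=2))
```

Deduplicating after normalization is a deliberate choice here: the two "Alice" rows differ only in whitespace and casing, so they collapse into one record once cleaned. For larger datasets, pandas may well be a better fit than the csv module.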
🔧 Technologies Used

Depending on the script, this repo may include examples in:
R (tidyverse, lubridate, etc.)
JavaScript (Google Apps Script)
Python (pandas, numpy, json, etc.)
SQL (PostgreSQL / SQLite snippets)
🚀 Clone the repo:

    git clone https://github.com/your-username/data-wrangling-scripts.git
    cd data-wrangling-scripts
🧩 Contributing

Contributions are welcome! If you have a favorite data wrangling trick, workaround, or script that saves you time, feel free to:
Fork the repo,
Add your script in the appropriate folder, and
Submit a pull request with a clear description!
📄 This repository is a use-it, modify-it, and share-it-freely repo!
🙌 Acknowledgments

This repo exists to make messy data a little less frustrating. Thanks to all the open-source contributors who write scripts that save the rest of us hours of pain.