AI-4-UX

A Python toolkit for cleaning, transforming, and analyzing User Experience (UX) survey data. Built for UX researchers and data analysts who need to process multilingual survey responses and perform statistical analysis on Likert-scale data.

Key Features

Data Cleaning – Standardize messy survey exports and filter incomplete responses
Scale Conversion – Convert French Likert scale text to numerical values
Sample Size Calculator – Determine required participants for statistical validity
Statistical Analysis – Normality tests, correlation matrices, and distribution metrics
Qualitative Processing – Clean open-ended responses and translate from French to English
Theme Extraction – Basic keyword-based categorization of feedback
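
The Sample Size Calculator feature can be illustrated with a short standard-library sketch. It assumes Cochran's formula for estimating a proportion, with an optional finite population correction. This README does not state which formula `sample_size_calculator.py` actually uses, so treat the function name `required_sample_size` and its parameters as hypothetical.

```python
import math
from statistics import NormalDist


def required_sample_size(confidence=0.95, margin_of_error=0.05,
                         proportion=0.5, population=None):
    """Estimate how many respondents are needed (Cochran's formula, an assumption)."""
    # Two-sided z-score for the requested confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an effectively infinite population.
    n = z ** 2 * proportion * (1 - proportion) / margin_of_error ** 2
    # Optional finite population correction for small user bases.
    if population is not None:
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)


large_population = required_sample_size()
small_population = required_sample_size(population=1000)
print(large_population, small_population)
```

Using `proportion=0.5` is the conservative default because it maximizes the required sample size.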

Workflow Overview

┌─────────────────────────────────────────────────────────────────────┐
│                          DATA PIPELINE                              │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  your_data.csv ──► cleanup.py ──► convert_to_numbers.py             │
│                         │                  │                        │
│                         ▼                  ▼                        │
│                 cleaned_data.csv    numeric_likert.csv              │
│                                            │                        │
│         ┌──────────────────────────────────┼──────────────────┐     │
│         │                                  │                  │     │
│         ▼                                  ▼                  ▼     │
│   shapiro_wilk.py              skewness_kurtosis.py    iqr_median.py│
│   spearman.py                                                       │
│                                                                     │
├─────────────────────────────────────────────────────────────────────┤
│                     OPEN-ENDED RESPONSES                            │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  open_ended.csv ──► clean_open.py ──► themes.py / feature.py        │
│                          │                                          │
│                          ▼                                          │
│              open_ended_translated.csv                              │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Scripts Reference

Data Cleaning & Preparation

| Script | Description |
| --- | --- |
| `cleanup.py` | Removes invalid rows and standardizes column names from raw survey exports |
| `scale_mappings.py` | Defines French-to-number mappings for Likert scales and ordinal choices |
| `convert_to_numbers.py` | Applies scale mappings to convert text responses to numeric values |
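
As a rough illustration of how a scale mapping and its conversion step fit together, here is a minimal sketch. The French labels, the name `AGREEMENT_SCALE`, and the helper functions are all assumptions made for this example; the real mappings live in `scale_mappings.py` and depend on the survey.

```python
import unicodedata

# Hypothetical 5-point agreement scale. The actual labels are defined in
# scale_mappings.py and will differ from survey to survey.
AGREEMENT_SCALE = {
    "pas du tout d'accord": 1,
    "pas d'accord": 2,
    "neutre": 3,
    "d'accord": 4,
    "tout à fait d'accord": 5,
}


def normalize(label: str) -> str:
    """Unify Unicode form, apostrophes, case and spacing so lookups are stable."""
    text = unicodedata.normalize("NFC", label).replace("\u2019", "'")
    return " ".join(text.casefold().split())


def to_number(label, mapping=AGREEMENT_SCALE):
    """Return the numeric code for a label, or None when it is blank or unmapped."""
    if label is None:
        return None
    return mapping.get(normalize(label))


responses = ["Tout à fait d\u2019accord", "  NEUTRE ", "", "Pas d'accord"]
codes = [to_number(r) for r in responses]
print(codes)
```

Returning `None` for unmapped text, rather than raising, keeps unexpected labels visible as missing values that can be counted and reviewed.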

Statistical Analysis

| Script | Description | Output |
| --- | --- | --- |
| `sample_size_calculator.py` | Calculates required sample size for valid tests | (Console Output) |
| `shapiro_wilk.py` | Tests if data follows a normal distribution | `shapiro_wilk.csv` |
| `spearman.py` | Computes rank correlation matrix with heatmap | `spearman_correlation.csv`, `spearman_correlation_heatmap.png` |
| `skewness_kurtosis.py` | Measures distribution shape (asymmetry and tailedness) | `skewness_kurtosis_results.csv` |
| `iqr_median.py` | Calculates robust central tendency and spread metrics | `median_iqr_results.csv` |
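
To show what a rank correlation measures on Likert data, the sketch below computes Spearman's rho with the standard library only: values are replaced by their ranks (ties share the average rank) and a Pearson correlation is taken on those ranks. This is the textbook definition, not a copy of `spearman.py`, which may well rely on pandas or SciPy instead. The column names `q1` and `q2` are made up.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of positions i..j, made 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if a column is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


q1 = [1, 2, 3, 4, 5]  # made-up numeric Likert answers
q2 = [2, 3, 3, 5, 5]
rho = spearman(q1, q2)
print(round(rho, 3))
```

Because Likert answers take few distinct values, ties are the norm, which is why the tie handling in `average_ranks` matters more here than it would for continuous data.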

Qualitative Analysis

| Script | Description | Output |
| --- | --- | --- |
| `clean_open.py` | Cleans open-ended responses and translates French → English | `open_ended_cleaned.csv`, `open_ended_translated.csv` |
| `translate.py` | Standalone translation utility (re-run without re-cleaning) | `open_ended_translated.csv` |
| `feature.py` | Counts keyword mentions (e.g., "Calendar", "Mobile") | `features.csv` |
| `themes.py` | Categorizes responses into themes (e.g., "Bug", "RFE") | `themes.csv` |
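
Counting keyword mentions can be sketched in a few lines. The example below counts how many responses mention each feature at least once, using whole-word matching on lowercased text. The keywords "calendar" and "mobile" come from the table above; the function name, the sample responses, and the choice of whole-word matching are assumptions, and `feature.py` may count differently.

```python
import re
from collections import Counter

FEATURES = ["calendar", "mobile"]  # examples from this README; extend as needed


def count_features(responses, features=FEATURES):
    """Count the responses that mention each feature keyword at least once."""
    counts = Counter({feature: 0 for feature in features})
    for response in responses:
        # Split into lowercase words so "Calendar," and "calendar" both match.
        words = set(re.findall(r"[a-z]+", response.lower()))
        for feature in features:
            if feature in words:
                counts[feature] += 1
    return counts


sample_responses = [  # made-up translated responses
    "The calendar sync is slow",
    "Mobile app and calendar both crash",
    "Nothing to add",
]
feature_counts = count_features(sample_responses)
print(dict(feature_counts))
```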

Usage

1. Prepare Structured Data

# Clean the raw survey export
python cleanup.py

# Convert text responses to numbers
python convert_to_numbers.py
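
For readers adapting the cleaning step to their own export, the following sketch shows the general idea of standardizing column names and dropping incomplete rows. It uses only the standard library and an inline sample, whereas `cleanup.py` itself presumably works on `your_data.csv` with pandas. The header names, the `standardize` rule, and the `required` columns are hypothetical.

```python
import csv
import io
import re


def standardize(name: str) -> str:
    """Turn a raw header such as ' Q1 - Ease of use ' into 'q1_ease_of_use'.

    Note: this ASCII-only rule also replaces accented letters with '_'.
    """
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


def clean_rows(raw_csv: str, required: list[str]) -> list[dict]:
    """Rename columns and keep only rows where every required column is filled in."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    cleaned = []
    for row in reader:
        renamed = {standardize(k): (v or "").strip() for k, v in row.items()}
        if all(renamed.get(column) for column in required):
            cleaned.append(renamed)
    return cleaned


# Made-up export in which the second respondent left the question blank.
sample = "Respondent ID, Q1 - Ease of use\n1,D'accord\n2,\n3,Neutre\n"
rows = clean_rows(sample, required=["respondent_id", "q1_ease_of_use"])
print(rows)
```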

2. Run Statistical Analysis

# Calculate required sample size
python sample_size_calculator.py

# Test for normal distribution
python shapiro_wilk.py

# Generate correlation matrix and heatmap
python spearman.py

# Analyze distribution shape
python skewness_kurtosis.py

# Calculate median and IQR
python iqr_median.py
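
The robust and shape metrics produced in this step can be reproduced by hand to sanity-check the CSV outputs. The sketch below computes the median, IQR, skewness, and excess kurtosis for one column using the standard library. It assumes the population (divide-by-n) moment estimators and the "inclusive" quartile method; the scripts may use different conventions, so small numeric differences are possible. The `ratings` data is made up.

```python
from statistics import mean, median, quantiles


def describe(values):
    """Median, IQR, skewness and excess kurtosis for one numeric column."""
    n = len(values)
    # Quartiles by linear interpolation between data points.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    centre = mean(values)
    # Central moments with the population (divide-by-n) convention.
    # A constant column has m2 == 0 and would raise ZeroDivisionError below.
    m2 = sum((v - centre) ** 2 for v in values) / n
    m3 = sum((v - centre) ** 3 for v in values) / n
    m4 = sum((v - centre) ** 4 for v in values) / n
    return {
        "median": median(values),
        "iqr": q3 - q1,
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3,
    }


ratings = [5, 4, 4, 5, 3, 5, 2, 5]  # made-up answers on a 1-5 scale
summary = describe(ratings)
print(summary)
```

A negative skewness, as in this example, indicates that answers cluster at the high end of the scale with a tail toward low ratings.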

3. Process Open-Ended Responses

# Clean and translate responses
python clean_open.py

# Extract feature keywords
python feature.py

# Categorize into themes
python themes.py
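
If you prefer to run all three steps in one go, a small runner along these lines may help. It is not part of the toolkit: `PIPELINE` and `run_pipeline` are hypothetical names, and the order simply follows the Usage section. The sketch defaults to a dry run that only prints the commands.

```python
import subprocess
import sys

# Script order taken from the Usage section. translate.py is left out because
# it is a standalone utility for re-running translation.
PIPELINE = [
    "cleanup.py",
    "convert_to_numbers.py",
    "sample_size_calculator.py",
    "shapiro_wilk.py",
    "spearman.py",
    "skewness_kurtosis.py",
    "iqr_median.py",
    "clean_open.py",
    "feature.py",
    "themes.py",
]


def run_pipeline(scripts=PIPELINE, dry_run=True):
    """Run each script with the current interpreter, stopping at the first failure."""
    interpreter = sys.executable or "python"
    commands = [[interpreter, script] for script in scripts]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True raises CalledProcessError if a script exits non-zero.
            subprocess.run(command, check=True)
    return commands


commands = run_pipeline(dry_run=True)
```

Call `run_pipeline(dry_run=False)` from the repository root to execute the scripts for real.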

Caveats & Limitations

Warning

Keyword-based analysis has limitations. The `feature.py` and `themes.py` scripts use simple keyword matching, which can miss synonyms, misspellings, and inflected forms, and cannot detect negation or context. Review the results manually before relying on them.
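
To make the limitation concrete, here is a deliberately naive categorizer and three responses it gets wrong. The theme names "Bug" and "RFE" come from this README; the keyword lists and the use of substring matching are assumptions for illustration and may not match what `themes.py` does.

```python
# Hypothetical keyword lists; themes.py defines its own.
THEMES = {
    "Bug": ("crash", "error", "bug"),
    "RFE": ("add", "wish", "would like"),
}


def categorize(response):
    """Return every theme whose keywords appear anywhere in the response."""
    text = response.lower()
    return [theme for theme, keywords in THEMES.items()
            if any(keyword in text for keyword in keywords)]


examples = [
    "No bugs so far, works great",  # negation: flagged as a bug report
    "I updated my address",         # substring: "add" matches inside "address"
    "It keeps freezing",            # vocabulary gap: a real bug goes unnoticed
]
results = {text: categorize(text) for text in examples}
for text, themes in results.items():
    print(f"{text!r} -> {themes}")
```

Whole-word matching would fix the second case but not the other two, which is why a human pass over the output remains necessary.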

Note

Dataset-specific configuration required. The `cleanup.py` script contains a hardcoded French column name from the original survey. Update it to match your own export before running the script on a different dataset.
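
One way to avoid editing the script for every dataset is to pass the column name on the command line. The sketch below shows that pattern with `argparse`. It is a suggestion, not current behavior: `cleanup.py` does not accept these flags today, and the flag names and the example column text are hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse hypothetical command-line options for a configurable cleanup step."""
    parser = argparse.ArgumentParser(description="Clean a raw survey export.")
    parser.add_argument("--input", default="your_data.csv",
                        help="raw survey export to read")
    parser.add_argument("--output", default="cleaned_data.csv",
                        help="where to write the cleaned data")
    parser.add_argument("--key-column", required=True,
                        help="survey column that is currently hardcoded")
    return parser.parse_args(argv)


# Passing an explicit list keeps the example independent of sys.argv.
args = parse_args(["--key-column", "Quel est votre rôle ?"])
print(args.input, args.output, args.key_column)
```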

Requirements

Install dependencies with pip:

pip install pandas numpy scipy matplotlib seaborn deep_translator
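
After installing, you can confirm that every dependency is importable before running the scripts. This optional check is a sketch that is not included in the repository; it assumes each package's import name matches the name used in the `pip` command above.

```python
from importlib.util import find_spec

REQUIRED = ["pandas", "numpy", "scipy", "matplotlib", "seaborn", "deep_translator"]

# find_spec returns None for a top-level package that is not installed.
missing = [name for name in REQUIRED if find_spec(name) is None]

if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All dependencies are installed.")
```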

Data Files

| File | Description |
| --- | --- |
| `your_data.csv` | Raw survey export (input) |
| `cleaned_data.csv` | Preprocessed structured data |
| `numeric_likert.csv` | Responses converted to numbers |
| `data_no_outliers.csv` | Dataset with outliers removed (used by statistical scripts) |
| `open_ended.csv` | Raw open-ended responses |
| `open_ended_cleaned.csv` | Cleaned open-ended responses |
| `open_ended_translated.csv` | Translated responses (French → English) |
| `themes.csv` | Categorized open-ended responses |
| `features.csv` | Feature keyword counts |

License

This project is open source and released under the MIT License.
