Text Classifier: AI vs SLW (Second Language Writers)

This is a personal learning project built to explore linguistic data science and syntactic complexity modeling.

Live App

Features

Load real-world .txt samples by index
Visualize Syntactic Complexity Indices (e.g., MLS, CN_C, VP_T)
Generate classification results with confidence scores
Explore SHAP waterfall plots to interpret predictions
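
As a rough illustration of the "classification results with confidence scores" feature, the sketch below loads a pickled model and reports the most probable label with its probability. It assumes the object in clf_bin_model.pkl exposes a scikit-learn style predict_proba method and that the class order is ("SLW", "AI"). Neither assumption is confirmed by this README, and the helper names are hypothetical.

```python
import pickle
from pathlib import Path

# Assumed class order; check the real model (e.g. its classes_ attribute) before relying on it.
ASSUMED_LABELS = ("SLW", "AI")


def load_model(path="clf_bin_model.pkl"):
    """Unpickle the classifier. Only unpickle files you trust."""
    with Path(path).open("rb") as fh:
        return pickle.load(fh)


def classify(model, features, labels=ASSUMED_LABELS):
    """Return (label, confidence) for one feature vector.

    Assumes model.predict_proba accepts a list of rows and returns
    one row of class probabilities per input row.
    """
    probs = model.predict_proba([features])[0]
    best = max(range(len(probs)), key=lambda i: probs[i])
    return labels[best], float(probs[best])
```

With the real files in place, usage would look like classify(load_model(), feature_row), where feature_row holds the syntactic complexity indices for one text in the column order the model was trained on.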

About the Syntactic Complexity Indices

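The classifier uses the L2SCA (L2 Syntactic Complexity Analyzer) indices as computed by TAASSC (Tool for the Automatic Analysis of Syntactic Sophistication and Complexity).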
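
To make the index names concrete: in L2SCA, MLS is the mean length of sentence (words per sentence), CN_C is complex nominals per clause, and VP_T is verb phrases per T-unit. The sketch below computes these three ratios from raw counts. In this project the indices come from TAASSC, so the function name and the count keys are hypothetical and the numbers in the usage line are made up.

```python
def syntactic_indices(counts):
    """Compute three L2SCA-style ratios from raw structure counts.

    Expected keys (hypothetical naming): words, sentences, clauses,
    t_units, complex_nominals, verb_phrases.
    """
    def ratio(numerator, denominator):
        # Guard against empty texts so we never divide by zero.
        return counts[numerator] / counts[denominator] if counts[denominator] else 0.0

    return {
        "MLS": ratio("words", "sentences"),            # mean length of sentence
        "CN_C": ratio("complex_nominals", "clauses"),  # complex nominals per clause
        "VP_T": ratio("verb_phrases", "t_units"),      # verb phrases per T-unit
    }


# Made-up counts for a single short text, for illustration only.
example = syntactic_indices({
    "words": 120, "sentences": 8, "clauses": 15,
    "t_units": 10, "complex_nominals": 18, "verb_phrases": 22,
})
```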

Predictions are supported by SHAP contribution plots, showing how each feature influences the outcome toward AI or SLW.
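
A waterfall plot rests on the additive property of SHAP values: the model output for one sample equals a base value plus the sum of the per-feature contributions. The sketch below reproduces that bookkeeping in plain Python, without the shap library. The base value and contributions are made-up numbers, and treating positive contributions as pushing toward AI is an assumption about how the model output is oriented.

```python
def waterfall_rows(base_value, contributions):
    """Walk from the base value to the final output, largest effect first.

    Returns (rows, final_output), where each row is
    (feature, contribution, value_before, value_after).
    """
    rows = []
    running = base_value
    ordered = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    for feature, phi in ordered:
        before = running
        running += phi  # each SHAP value shifts the running output additively
        rows.append((feature, phi, before, running))
    return rows, running


# Illustrative values only; not taken from the project's model.
rows, output = waterfall_rows(0.5, {"MLS": 0.20, "CN_C": -0.10, "VP_T": 0.05})
for feature, phi, before, after in rows:
    direction = "toward AI" if phi > 0 else "toward SLW"  # assumed orientation
    print(f"{feature:5s} {phi:+.2f} ({direction}): {before:.2f} -> {after:.2f}")
```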

Data Overview

The dataset consists of 300 text samples divided into three categories:

1–100: Human-written texts by second language writers (SLW)
101–200: AI-generated texts using general prompts
201–300: AI-generated texts created by prompting large language models (LLMs) to mimic SLW writing style
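
The index ranges above can be expressed as a small lookup, which is handy when loading a sample by index. This is a minimal sketch: the function names are hypothetical, and collapsing both AI groups into a single "AI" class is an assumption based on the project title and the file name X_binary.csv.

```python
def category_for_index(index):
    """Map a 1-based sample index to its dataset category."""
    if 1 <= index <= 100:
        return "SLW"
    if 101 <= index <= 200:
        return "AI (general prompts)"
    if 201 <= index <= 300:
        return "AI (prompted to mimic SLW)"
    raise ValueError(f"index {index} is outside the 1-300 range")


def binary_label(index):
    """Collapse the three categories into SLW vs AI (assumed binary setup)."""
    return "SLW" if category_for_index(index) == "SLW" else "AI"
```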

Data preprocessing was done with TAASSC.

Data Usage Notice

The .txt files in txt_samples/ are included only for demonstration and learning purposes.
They are not licensed for reuse, redistribution, or commercial use.
See txt_samples/LICENSE.txt for full terms.
The dataset file X_binary.csv is private and is not licensed for reuse, redistribution, or modification.
It is shared solely for demonstration purposes and should not be used for any other purpose.

License

This project's code is licensed under the MIT License.

