Precondition: I assume you got the 'MAGIC Gamma Telescope Data.csv' file from my previous posting. Just copy and paste it into Notepad and save it as a .csv file.
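Before running the full script, it can help to confirm that the pasted file parses and that the 'g'/'h' mapping produces 0/1 as expected. A minimal sketch with a two-row stand-in for the real file (the feature column names here are made up; only the 'Class' column with 'g'/'h' labels is taken from the posting):

```python
import io
import pandas as pd

# Tiny stand-in for 'MAGIC Gamma Telescope Data.csv'; column names are illustrative.
csv_text = "fLength,fWidth,Class\n10.0,5.0,g\n20.0,7.5,h\n"
df = pd.read_csv(io.StringIO(csv_text))

# Same mapping as in the script below: encode 'g'/'h' as 0/1
df['Class'] = df['Class'].map({'g': 0, 'h': 1})

print(df.shape)                      # (2, 3)
print(sorted(df['Class'].unique()))  # [0, 1]
```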
from tpot import TPOTRegressor
from sklearn.model_selection import train_test_split
from pandas import *
# load the data
df=read_csv('MAGIC Gamma Telescope Data.csv')
# clean the data
df['Class'] = df['Class'].map({'g':0, 'h':1}) # encode 'g'/'h' as 0/1
features = df.drop('Class', axis=1).values # = X
target = df['Class'].values # = y
# Split the data
X_train, X_test, y_train, y_test = train_test_split(features, target, train_size=0.8, test_size=0.2)
# Let Genetic Programming find best ML model and hyperparameters
tpot = TPOTRegressor( generations=5, verbosity=2 )
# usually, the default options for generations, population_size, and offspring_size are good.
# If you want a result quickly, lower only generations.
tpot.fit(X_train, y_train)
# Score the model on the held-out test set
print("Test score: {} (TPOTRegressor's default metric is MSE; 0 means perfectly accurate)".format(tpot.score(X_test, y_test)))
# Export the generated code
tpot.export('tpot_test1_pipeline.py')
Optimization Progress: 13%|█▎ | 78/600 [27:19<3:55:49, 27.11s/pipeline]
Optimization Progress: 34%|███▎ | 202/600 [1:06:22<2:12:22, 19.96s/pipeline]Generation 1 - Current best internal CV score: 0.09374578856689733
Optimization Progress: 39%|███▉ | 233/600 [1:21:01<39:46, 6.50s/pipeline]
- It's still being processed, but you can already see that the score is almost converging to 0. (TPOTRegressor's default scoring is MSE (Mean Squared Error), and with MSE, 0 means perfectly accurate.)
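For reference, the "0 means most accurate" behaviour of MSE can be checked directly with scikit-learn's mean_squared_error (the toy labels below are made up for illustration):

```python
from sklearn.metrics import mean_squared_error

y_true = [0, 1, 1, 0]

perfect = mean_squared_error(y_true, [0, 1, 1, 0])  # identical predictions
rough = mean_squared_error(y_true, [1, 0, 1, 0])    # two of four predictions wrong

print(perfect)  # 0.0
print(rough)    # 0.5
```

So as the internal CV score in the log above approaches 0, the evolved pipelines are fitting the data better.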