Binary file added CBThallLogs.csv.xlsx
Binary file not shown.
31 changes: 31 additions & 0 deletions Data_preprocessing_Numpy.csv
@@ -0,0 +1,31 @@
Column1;Column2;Column3;Column4;Column5;Column6
;50;365;2801;3626;4676
;40;365;3121;5381;15281
;40;365;3121;4501;16381
;50;365;1700;3200;16550
;40;365;3401;4421;13101
;50;365;-1850;-530;1480
;50;365;3750;6250;20250
;50;365;3751;5251;20250
;;;5453;6953;22250
;;;5100;6400;22250
;40;365;3400;5400;16600
;40;365;3360;;
;40;365;1940;;
;50;365;5500;7500;22250
;;;;;
;50;365;1250;2650;19250
;50;365;5500;5500;22250
;40;365;3500;4400;15600
;40;365;8200;13200;16600
;40;365;3080;4600;16600
;50;365;;;20250
;40;365;3440;3440;15600
;;365;;;
;40;365;2120;3280;15600
;50;;;6900;22250
;40;365;4900;4900;16600
;40;365;4000;5000;16600
;50;365;3400;5000;20250
;125;;;;54625
;40;;3441;4541;16600
573 changes: 573 additions & 0 deletions EDA_CHATGPT.ipynb

Large diffs are not rendered by default.

Binary file added IPCbt.xlsx
Binary file not shown.
43 changes: 43 additions & 0 deletions Kaggle_advanced_GB_algorithm_4_life_expectancy.ipynb
@@ -0,0 +1,43 @@
# XGBoost
import time

import xgboost as xgb
from sklearn import preprocessing
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# df is assumed to be loaded earlier in the notebook
X = df.drop(columns='Life expectancy')
y = df['Life expectancy']

# Label-encode 'Country', 'Year' and 'Status', using a fresh encoder per column
for col in ['Country', 'Year', 'Status']:
    X[col] = preprocessing.LabelEncoder().fit_transform(X[col].astype(str))

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20,
                                                    random_state=1)

start = time.time()
xgbr = xgb.XGBRegressor()

xgbr.fit(X_train, y_train)
y_pred = xgbr.predict(X_test)

# R squared on the held-out test set
xgb_r2 = r2_score(y_test, y_pred)
print("R Squared for XGBoost: ", xgb_r2)

end = time.time()
diff = end - start
print('Execution time:', diff)

This script uses the XGBoost library to train a regression model that predicts ‘Life expectancy’ from a dataset. Here’s a step-by-step breakdown:

Import necessary libraries: The script imports xgboost, sklearn.preprocessing, and sklearn.model_selection. It also depends on the standard time module and on a DataFrame named df that must already be loaded.
Prepare the data: The target column ‘Life expectancy’ is stored in y, and all remaining columns are stored in X.
Encode categorical features: LabelEncoder from sklearn.preprocessing converts ‘Country’, ‘Year’, and ‘Status’ into integer labels after each column is cast to a string. The integers are only identifiers, so their order carries no real meaning for ‘Country’ or ‘Status’.
Split the data: train_test_split from sklearn.model_selection splits the dataset into a training set (80%) and a test set (20%). Setting random_state=1 makes the split reproducible.
Train the model: An instance of XGBRegressor (a regression model from the XGBoost library) is created with default hyperparameters and then fitted on the training data.
Make predictions: The trained model predicts ‘Life expectancy’ for the test set.
Evaluate the model: R-squared is calculated for the test predictions. R-squared is the proportion of the variance in the dependent variable that is explained by the independent variables in a regression model. scikit-learn provides it as r2_score in sklearn.metrics; a custom function such as RSquared must be defined or imported before it is called. R-squared is not an error measure, so it should be reported directly rather than square-rooted the way a mean squared error would be.
Measure execution time: The script records the time before training and after evaluation, then prints the difference, so the figure covers training, prediction, and scoring.
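
The encoding step can be illustrated without scikit-learn. The sketch below is a simplified stand-in for LabelEncoder, assuming its documented behaviour of numbering the sorted unique values. The helper name label_encode and the sample values are hypothetical and do not come from the notebook.

```python
def label_encode(values):
    """Map each distinct value to an integer code, numbering the sorted unique values."""
    classes = sorted(set(values))
    code_of = {value: code for code, value in enumerate(classes)}
    return [code_of[v] for v in values], classes


# Hypothetical samples resembling the 'Status' and 'Year' columns (as strings).
status_codes, status_classes = label_encode(["Developing", "Developed", "Developing"])
year_codes, year_classes = label_encode(["2015", "2000", "2015", "2007"])

print(status_classes, status_codes)
print(year_classes, year_codes)
```

Because the codes are plain integers, a tree-based model may treat them as ordered even where no order exists, as with country names. For a column such as ‘Year’ that is already numeric, keeping the original values may be the simpler choice.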
Please note that this is a simple example. The actual process of training an XGBoost model involves many more steps, including data preprocessing, parameter tuning, and model evaluation.
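
To make the evaluation step concrete, here is a minimal, dependency-free sketch of R-squared next to RMSE. The function names r_squared and rmse and the sample numbers are made up for illustration. The R-squared formula, 1 - SS_res / SS_tot, is the standard definition of the coefficient of determination.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Made-up values for illustration only.
y_true = [60.0, 70.0, 80.0]
y_pred = [62.0, 70.0, 78.0]
r2 = r_squared(y_true, y_pred)
error = rmse(y_true, y_pred)
print("R squared:", r2)
print("RMSE:", error)
```

The two numbers answer different questions: R-squared is a unitless share of explained variance, while RMSE is a typical prediction error in years of life expectancy. Reporting both usually gives a fuller picture than either alone.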