Est1y · May 13, 2024 · Jun 20, 2024
diff --git a/CBThallLogs.csv.xlsx b/CBThallLogs.csv.xlsx
diff --git a/Data_preprocessing_Numpy.csv b/Data_preprocessing_Numpy.csv
@@ -0,0 +1,31 @@
+Column1;Column2;Column3;Column4;Column5;Column6
+;50;365;2801;3626;4676
+;40;365;3121;5381;15281
+;40;365;3121;4501;16381
+;50;365;1700;3200;16550
+;40;365;3401;4421;13101
+;50;365;-1850;-530;1480
+;50;365;3750;6250;20250
+;50;365;3751;5251;20250
+;;;5453;6953;22250
+;;;5100;6400;22250
+;40;365;3400;5400;16600
+;40;365;3360;;
+;40;365;1940;;
+;50;365;5500;7500;22250
+;;;;;
+;50;365;1250;2650;19250
+;50;365;5500;5500;22250
+;40;365;3500;4400;15600
+;40;365;8200;13200;16600
+;40;365;3080;4600;16600
+;50;365;;;20250
+;40;365;3440;3440;15600
+;;365;;;
+;40;365;2120;3280;15600
+;50;;;6900;22250
+;40;365;4900;4900;16600
+;40;365;4000;5000;16600
+;50;365;3400;5000;20250
+;125;;;;54625
+;40;;3441;4541;16600
diff --git a/EDA_CHATGPT.ipynb b/EDA_CHATGPT.ipynb
diff --git a/IPCbt.xlsx b/IPCbt.xlsx
diff --git a/Kaggle_advanced_GB_algorithm_4_life_expectancy.ipynb b/Kaggle_advanced_GB_algorithm_4_life_expectancy.ipynb
@@ -0,0 +1,43 @@
+#XGBoost
+import xgboost as xgb
+from sklearn import preprocessing
+X = df.drop(columns='Life expectancy')
+y = df['Life expectancy']
+
+lbl = preprocessing.LabelEncoder()
+# Label-encode the categorical columns: 'Country', 'Year', 'Status'
+X['Country'] = lbl.fit_transform(X['Country'].astype(str))
+X['Year'] = lbl.fit_transform(X['Year'].astype(str))
+X['Status'] = lbl.fit_transform(X['Status'].astype(str))
+
+from sklearn.model_selection import train_test_split
+X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, 
+                                                    random_state=1)
+
+import time
+from sklearn.metrics import r2_score
+
+start = time.time()
+xgbr = xgb.XGBRegressor()
+xgbr.fit(X_train, y_train)
+y_pred = xgbr.predict(X_test)
+end = time.time()
+
+# R-squared on the held-out test set (r2_score returns a single float)
+xgb_r2 = r2_score(y_test, y_pred)
+print("R Squared for XGBoost: ", xgb_r2)
+print('Execution time:', end - start)
+
+This script uses the XGBoost library to train a regression model that predicts ‘Life expectancy’ from a dataset. Here is a step-by-step breakdown:
+
+Import necessary libraries: The script imports xgboost, sklearn.preprocessing, sklearn.model_selection, sklearn.metrics, and time. The imports appear next to the code that first uses them rather than all at the top.
+Prepare the data: The target variable ‘Life expectancy’ is stored in y and the remaining columns are stored in X. Both are taken from df, a DataFrame that must already be loaded before this cell runs.
+Encode categorical features: The LabelEncoder from sklearn.preprocessing converts ‘Country’, ‘Year’, and ‘Status’ into integer labels. Each column is cast to string first, and the encoder is refitted for each column.
+Split the data: The train_test_split function from sklearn.model_selection splits the dataset into a training set (80%) and a test set (20%). Setting random_state=1 makes the split reproducible.
+Train the model: An instance of XGBRegressor (the regression model from the XGBoost library) is created with default parameters and fitted on the training data.
+Make predictions: The trained model predicts ‘Life expectancy’ for the test data.
+Evaluate the model: The R-squared value of the test predictions is computed with r2_score from sklearn.metrics. R-squared is the proportion of the variance in the dependent variable that is explained by the independent variables in a regression model; a value of 1 means a perfect fit.
+Measure execution time: The script measures and prints the time taken to train the model and make predictions. The evaluation step is not included in the timing.
+Please note that this is a simple example. A complete XGBoost workflow usually involves further steps, including more thorough data preprocessing, hyperparameter tuning, and model evaluation.
+
+I hope this helps! If you have any other questions, feel free to ask. 😊
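
For reference, the R-squared metric from the evaluation step can be written out by hand. The following is a minimal pure-Python sketch, not the notebook's own code: `r_squared` and `rmse` are illustrative names that do not come from XGBoost or scikit-learn, and the sample values are made up rather than taken from the life expectancy data. RMSE is shown alongside for contrast, because it measures error in the units of the target instead of explained variance, so taking the square root of R-squared does not yield it.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes y_true is not constant, otherwise SS_tot is zero.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Hypothetical values for illustration only.
y_true = [65.0, 70.0, 75.0, 80.0]
y_pred = [66.0, 69.0, 76.0, 79.0]

print("R squared:", r_squared(y_true, y_pred))
print("RMSE:", rmse(y_true, y_pred))
```

In the notebook itself, the same quantity could presumably come from `sklearn.metrics.r2_score(y_test, y_pred)`, which takes the true values first and the predictions second.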