To calculate information gain, you can use the above formula in the following way:
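The formula referred to above is not reproduced in this excerpt, so the sketch below assumes the common entropy-based definition: the entropy of the parent set minus the size-weighted entropy of the two parts produced by a split. The function names `entropy` and `information_gain` and the toy labels are hypothetical, not taken from the course code.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(parent, true_part, false_part):
    """Parent entropy minus the size-weighted entropy of the two parts."""
    p = len(true_part) / len(parent)
    return entropy(parent) - p * entropy(true_part) - (1 - p) * entropy(false_part)


labels = ["scary", "scary", "calm", "calm"]
# A split that separates the classes perfectly keeps all the information.
perfect = information_gain(labels, ["scary", "scary"], ["calm", "calm"])
# A split that leaves both parts as mixed as the parent gains nothing.
useless = information_gain(labels, ["scary", "calm"], ["scary", "calm"])
print(perfect, useless)
```

A higher gain means the split leaves purer parts, which is why a tree builder would prefer the first split over the second.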
diff --git a/Horror Trees/Node/node.py b/Horror Trees/Node/node.py
index 0db1120..05056c9 100644
--- a/Horror Trees/Node/node.py
+++ b/Horror Trees/Node/node.py
@@ -4,6 +4,8 @@
 # the true and false branches.
 class Node:
     def __init__(self, column=-1, value=None, true_branch=None, false_branch=None):
+        # Implement the four attributes of the Node class:
+        # column, value, true_branch, false_branch
         self.column = column
         self.value = value
         self.true_branch = true_branch
diff --git a/Horror Trees/Node/task-info.yaml b/Horror Trees/Node/task-info.yaml
index d84a40b..c3c975a 100644
--- a/Horror Trees/Node/task-info.yaml
+++ b/Horror Trees/Node/task-info.yaml
@@ -3,9 +3,9 @@ files:
- name: node.py
visible: true
placeholders:
- - offset: 333
+ - offset: 443
length: 127
- placeholder_text: "# TODO: Implement the four attributes of the Node class"
+ placeholder_text: "# TODO"
- name: task.py
visible: true
- name: tests/test_task.py
diff --git a/Horror Trees/Predict/task.md b/Horror Trees/Predict/task.md
index bdb85d0..5190e91 100644
--- a/Horror Trees/Predict/task.md
+++ b/Horror Trees/Predict/task.md
@@ -13,10 +13,73 @@ In `classify_subtree`, you need to:
 1. Check whether `sub_tree` is an instance of the `Node` class; if it is not, return
 `sub_tree` itself, since in that case it is already a class label.
+
+
+
+If `sub_tree` is not a `Node`, return it; it already represents the class label.
+
+```python
+if not isinstance(sub_tree, Node):
+    return sub_tree
+```
+
+
2. Compare the characteristic value from the column according to which a condition is set in the given node with the threshold value.
+
+
+
+Each node evaluates a specific feature of the object being classified. The index of the feature to check is stored in `sub_tree.column`.
+You need to extract the corresponding value from `x`.
+
+```python
+v = x[sub_tree.column]
+```
+
+
+
3. Depending on the result, choose the tree branch along which you will proceed (`true_branch` or `false_branch`).
+
+
+
+For numeric features, the node evaluates a threshold (e.g., `age >= 30`).
+Compare the feature value against this threshold to determine which branch to follow.
+
+```python
+if isinstance(v, (int, float)):
+    if v >= sub_tree.value:
+        branch = sub_tree.true_branch
+    else:
+        branch = sub_tree.false_branch
+```
+
+
+
+
+
+For categorical features, the node evaluates an equality condition (e.g., `color == "red"`).
+Determine the next branch based on whether the feature value matches this criterion.
+
+```python
+else:
+    if v == sub_tree.value:
+        branch = sub_tree.true_branch
+    else:
+        branch = sub_tree.false_branch
+```
+
+
+
 4. Repeat these actions recursively until the result is a class label (a leaf node).
+
+
+Choosing a branch is only the first step – you may encounter another node.
+Apply the same logic again by calling the function on the selected branch until you reach a leaf (the final class label).
+
+```python
+return self.classify_subtree(x, branch)
+```
+
To see the results of your code, add the following lines
to the `main` block in `task.py` and run it:
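Independently of the `main` block mentioned above, the four hint fragments added in this hunk can be read as one function. The sketch below assembles them under some assumptions: `classify_subtree` is written as a free function rather than a method, the `Node` class is a minimal stand-in with the same four attributes, and the toy tree and samples are hypothetical.

```python
class Node:
    # Minimal stand-in for the course's Node class (same four attributes).
    def __init__(self, column=-1, value=None, true_branch=None, false_branch=None):
        self.column = column
        self.value = value
        self.true_branch = true_branch
        self.false_branch = false_branch


def classify_subtree(x, sub_tree):
    # Step 1: anything that is not a Node is already a class label.
    if not isinstance(sub_tree, Node):
        return sub_tree
    # Step 2: read the feature that this node tests.
    v = x[sub_tree.column]
    # Step 3: numeric features use a threshold, categorical ones use equality.
    if isinstance(v, (int, float)):
        passed = v >= sub_tree.value
    else:
        passed = v == sub_tree.value
    branch = sub_tree.true_branch if passed else sub_tree.false_branch
    # Step 4: recurse until a leaf (a class label) is reached.
    return classify_subtree(x, branch)


# Hypothetical toy tree: column 0 is an age, column 1 is a colour.
tree = Node(column=0, value=30,
            true_branch=Node(column=1, value="red", true_branch="A", false_branch="B"),
            false_branch="C")
print(classify_subtree([35, "red"], tree))
print(classify_subtree([35, "blue"], tree))
print(classify_subtree([20, "red"], tree))
```

The three calls walk the numeric test first and only reach the categorical test when the age threshold is met.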
diff --git a/Iris Network/Backpropagation/task.md b/Iris Network/Backpropagation/task.md
index 21080af..4423ca0 100644
--- a/Iris Network/Backpropagation/task.md
+++ b/Iris Network/Backpropagation/task.md
@@ -70,9 +70,42 @@ In the `network.py` file, implement only the `backward` method of the `NN` class
 - Calculate the error for the output layer (delta_l2) as the difference between the network results (output) and the real class labels (y) multiplied elementwise by the derivative of the activation function for output ($\delta_{o}$ formula).
-- Calculate the error for the hidden layer (delta_l1) as the product of input layer error matrices and the weights w2 multiplied elementwise by the derivative of the activation function wrt the output data of the hidden layer (layer1) ($\delta_{h}$ formula).
+
+
+
+```python
+delta_l2 = (y - output) * sigmoid_derivative(output)
+```
+
+
+- Calculate the error for the hidden layer (delta_l1) by taking the product of the output layer error
+and the transpose of the weight matrix w2, then multiplying element-wise by the derivative
+of the activation function with respect to the hidden layer's output (layer1) ($\delta_{h}$ formula).
+
+
+
+```python
+delta_l1 = np.dot(delta_l2, self.w2.T) * sigmoid_derivative(self.layer1)
+```
+
+
 - Adjust the weight coefficients of the output layer (w2) by calculating the vector product of the hidden layer (layer1) and the output layer error (delta_l2) multiplied elementwise by the learning rate (formula 3).
+
+
+
+```python
+self.w2 += (np.dot(self.layer1.T, delta_l2) * learning_rate)
+```
+
+
 - Adjust the weight coefficients of the hidden layer (w1) by calculating the vector product of the input layer (X) and the hidden layer error (delta_l1), multiplied elementwise by the learning rate (formula 3).
+
+
+
+```python
+self.w1 += (np.dot(X.T, delta_l1) * learning_rate)
+```
+
Before you start, delete the `pass` operator and uncomment all lines that are not task commentaries.
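To see how the four hint lines in this hunk fit together, here is a minimal sketch of a `backward` step inside a tiny network. Several things are assumptions made only for illustration: the layer sizes, the absence of bias terms, the uniform weight initialisation, `learning_rate` being a parameter of `backward`, and `sigmoid_derivative` taking an already activated value. The toy data is hypothetical as well.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def sigmoid_derivative(s):
    # Assumes s is already a sigmoid output, so the derivative is s * (1 - s).
    return s * (1.0 - s)


class NN:
    # Hypothetical minimal network: 2 inputs -> 3 hidden units -> 1 output, no biases.
    def __init__(self, n_in=2, n_hidden=3, n_out=1, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.uniform(-1, 1, (n_in, n_hidden))
        self.w2 = rng.uniform(-1, 1, (n_hidden, n_out))

    def feedforward(self, X):
        self.layer1 = sigmoid(np.dot(X, self.w1))
        return sigmoid(np.dot(self.layer1, self.w2))

    def backward(self, X, y, output, learning_rate=0.05):
        # Output-layer error, then hidden-layer error (computed before w2 changes).
        delta_l2 = (y - output) * sigmoid_derivative(output)
        delta_l1 = np.dot(delta_l2, self.w2.T) * sigmoid_derivative(self.layer1)
        # Weight updates for the output and hidden layers.
        self.w2 += np.dot(self.layer1.T, delta_l2) * learning_rate
        self.w1 += np.dot(X.T, delta_l1) * learning_rate


# Toy data: the label simply copies the first feature.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [0.], [1.], [1.]])
net = NN()
loss_before = 0.5 * np.sum((y - net.feedforward(X)) ** 2)
for _ in range(200):
    net.backward(X, y, net.feedforward(X))
loss_after = 0.5 * np.sum((y - net.feedforward(X)) ** 2)
print(loss_before, loss_after)
```

With the `(y - output)` sign convention, adding the updates with `+=` corresponds to gradient descent on the squared error, so the printed loss should shrink over the iterations.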
diff --git a/Iris Network/Train and Predict/task.md b/Iris Network/Train and Predict/task.md
index 9e0cc19..c1d2b92 100644
--- a/Iris Network/Train and Predict/task.md
+++ b/Iris Network/Train and Predict/task.md
@@ -5,13 +5,29 @@ The process of setting up a neural network involves successive implementation of
In the `network.py` file, implement the `train` method of the `NN` class. Besides data, it takes the `n_iter` parameter, which sets
the necessary number of iterations. The method should call two other (previously implemented) methods in the right order. It does not return anything.
+
+On each iteration, generate predictions via feedforward and update the model's parameters using backward propagation.
+
+```python
+for itr in range(n_iter):
+    l2 = self.feedforward(X)
+    self.backward(X, y, l2)
+```
+
+
Augment the implementation by the `predict` method, which passes all objects from the `X` matrix through the trained neural network.
-Before you start, delete the `pass` operator and uncomment all lines that are not task commentaries.
+
+The predict method is a required part of the neural network's interface.
+We will implement it here, even though it simply acts as a wrapper for the feedforward method.
+
+```python
+return self.feedforward(X)
+```
+While this case is straightforward, other scenarios may require a more complex implementation.
+
-The predict method is a part of the interface of a program the neural network is expected to include, so we will implement it
-despite the fact that it just calls the feedforward method. It's a lucky coincidence – in other cases, there might be
-something else.
+Before you begin, delete the `pass` statement and uncomment all lines that are not task-related comments.
To see the results of your code in this step, add the following lines to the `main` block in `task.py`:
diff --git a/Pima Indians Diabetes and Linear Classifier/Gradient Descent/task-info.yaml b/Pima Indians Diabetes and Linear Classifier/Gradient Descent/task-info.yaml
index bc46119..aade20a 100644
--- a/Pima Indians Diabetes and Linear Classifier/Gradient Descent/task-info.yaml
+++ b/Pima Indians Diabetes and Linear Classifier/Gradient Descent/task-info.yaml
@@ -21,7 +21,7 @@ files:
placeholder_text: "# TODO: Set it to the new ones"
- offset: 2210
length: 28
- placeholder_text: "# Return the predicted classes"
+ placeholder_text: "# TODO"
- name: loss_functions.py
visible: true
- name: task.py
diff --git a/Pima Indians Diabetes and Linear Classifier/Read data/task-info.yaml b/Pima Indians Diabetes and Linear Classifier/Read data/task-info.yaml
index d0d71a1..ca00b97 100644
--- a/Pima Indians Diabetes and Linear Classifier/Read data/task-info.yaml
+++ b/Pima Indians Diabetes and Linear Classifier/Read data/task-info.yaml
@@ -3,18 +3,18 @@ files:
- name: task.py
visible: true
placeholders:
- - offset: 314
+ - offset: 256
length: 35
placeholder_text: "# TODO"
- - offset: 802
+ - offset: 532
length: 36
- placeholder_text: "# Standardize the dataset"
- - offset: 1000
+ placeholder_text: "# TODO"
+ - offset: 703
length: 60
- placeholder_text: "# Add a column of -1 to the left of X"
- - offset: 1145
+ placeholder_text: "# TODO"
+ - offset: 812
length: 12
- placeholder_text: "# {0, 1} -> {1, -1}"
+ placeholder_text: "# TODO"
- name: tests/test_task.py
visible: false
propagatable: false
diff --git a/Pima Indians Diabetes and Linear Classifier/Read data/task.py b/Pima Indians Diabetes and Linear Classifier/Read data/task.py
index 78df4cc..7cd1146 100644
--- a/Pima Indians Diabetes and Linear Classifier/Read data/task.py
+++ b/Pima Indians Diabetes and Linear Classifier/Read data/task.py
@@ -6,23 +6,19 @@
# and returns it as a pair of arrays: features
# and diabetes presence.
def read_data(fname):
-    # The genfromtxt method loads data from a text file and splits columns
-    # based on the provided delimiter.
+    # Load data from a CSV file using numpy.genfromtxt.
     data = np.genfromtxt(fname, delimiter=',')
     # The data is split into X (all columns but the last) and
     # y (the last column).
     X, y = data[:, :-1], data[:, -1]
-    # The features are rescaled:
-    # X is standardized by centering features around the mean
-    # with a unit standard deviation. This means that the mean
-    # and standard deviation of the standard scores are 0 and 1, respectively.
-    # This procedure is recommended for data that follows a normal distribution.
+    # Standardize features: subtract the mean
+    # and divide by the standard deviation for each column.
     X = (X - X.mean(axis=0)) / X.std(axis=0)
-    # A column of -1s is prepended to the left of the X array.
+    # Prepend a column of -1s to X.
     # It acts as a pseudo-feature that simplifies our vector
     # calculations later on.
     X = np.concatenate((-np.ones(len(X)).reshape(-1, 1), X), axis=1)
-    # y is standardized: centered around 0 with a standard deviation of 1.
+    # Map labels from {0,1} to {1,-1}.
     y = -(y * 2 - 1)
     return X, y
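A quick way to check what the rewritten comments in `read_data` claim is to run the same steps on a tiny in-memory CSV. The sketch below repeats the function body from this hunk; the sample data is hypothetical, and passing an `io.StringIO` object instead of a file name is a convenience assumed here, not something the course code does.

```python
import io

import numpy as np


def read_data(source):
    # Same steps as in the hunk above; `source` may be a path or a file-like object.
    data = np.genfromtxt(source, delimiter=',')
    X, y = data[:, :-1], data[:, -1]
    # Standardize each feature column to zero mean and unit standard deviation.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    # Prepend the pseudo-feature column of -1s.
    X = np.concatenate((-np.ones(len(X)).reshape(-1, 1), X), axis=1)
    # Map labels from {0, 1} to {1, -1}.
    y = -(y * 2 - 1)
    return X, y


# Hypothetical sample: two feature columns followed by a 0/1 label.
csv_text = "1,10,0\n2,20,1\n3,30,0\n4,40,1\n"
X, y = read_data(io.StringIO(csv_text))
print(X.shape)
print(X[:, 0])
print(y)
```

The first column of `X` holds the -1 pseudo-feature, the remaining columns are standardized, and a label of 0 becomes 1 while a label of 1 becomes -1.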
diff --git a/Pima Indians Diabetes and Linear Classifier/Stochastic Gradient Descent/task-info.yaml b/Pima Indians Diabetes and Linear Classifier/Stochastic Gradient Descent/task-info.yaml
index f31373a..f847ae2 100644
--- a/Pima Indians Diabetes and Linear Classifier/Stochastic Gradient Descent/task-info.yaml
+++ b/Pima Indians Diabetes and Linear Classifier/Stochastic Gradient Descent/task-info.yaml
@@ -5,14 +5,14 @@ files:
placeholders:
- offset: 1114
length: 32
- placeholder_text: "# TODO: Generate the batch"
+ placeholder_text: "# TODO"
- offset: 1172
length: 60
placeholder_text: "# TODO: Calculate the gradient using the current weights, X\
\ and y batches"
- offset: 2183
length: 25
- placeholder_text: "# TODO: Initialize it here"
+ placeholder_text: "# TODO"
- name: gradient_descent.py
visible: true
- name: loss_functions.py