"""
Unsupervised Learning
We have data, but no labels for it.
k-means Clustering :)
k is the number of clusters we want to form.
X   Y   P
1   1   A
1   0   B
0   2   C
2   4   D
3   5   E
Step1:
Pick 2 initial centroids at random, one for each cluster
e.g. A(1,1) and C(0,2)
Step2:
Compute the distance of every point from each centroid
Euclidean Distance Formula -> sqrt((x2-x1)**2 + (y2-y1)**2)
-------------------------------------
P    C1(1,1)           C2(0,2)
-------------------------------------
A    0                 1.4
B    1                 2.2
C    1.4 (sqrt(2))     0
D    3.2 (sqrt(10))    2.8
E    4.5 (sqrt(20))    4.2
Step3:
Assign each point to its nearest centroid
P    NearestTo
A    C1
B    C1
C    C2
D    C2
E    C2
Step4:
X   Y   P   NearestTo
1   1   A   C1
1   0   B   C1
0   2   C   C2
2   4   D   C2
3   5   E   C2
Recompute the centroid of each new cluster as the mean of its points, then re-check the assignments
Cluster Mean
CM1 = ((1+1)/2, (1+0)/2)
CM2 = ((0+2+3)/3, (2+4+5)/3)
CM1 = (1, 0.5)
CM2 = (1.7, 3.7)
-------------------------------------
P    CM1(1,0.5)          CM2(1.7,3.7)
-------------------------------------
A    0.5                 2.7
B    0.5                 3.7
C    1.8 (sqrt(3.25))    2.4
D    3.6                 0.5
E    4.9                 1.9
X   Y   P   NearestTo
1   1   A   CM1
1   0   B   CM1
0   2   C   CM1
2   4   D   CM2
3   5   E   CM2
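Repeat the same steps until the clusters stop changing
CM3 = ((1+1+0)/3, (1+0+2)/3)
CM4 = ((2+3)/2, (4+5)/2)
CM3 = (0.67, 1)
CM4 = (2.5, 4.5)
P    CM3     CM4     NearestTo
A    0.33    3.81    CM3
B    1.05    4.74    CM3
C    1.20    3.54    CM3
D    3.28    0.71    CM4
E    4.63    0.71    CM4
The clusters are the same as in the previous pass, so the algorithm has converged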
"""
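
# A minimal sketch of Step2 and Step3 from the notes above: measure the
# Euclidean distance from every point to each starting centroid, then assign
# each point to the closer one. The names POINTS, euclidean, nearest_centroid,
# initial_centroids and first_assignment are illustrative assumptions and are
# not part of the original session.
import math

# The five sample points from the notes, keyed by their label
POINTS = {"A": (1, 1), "B": (1, 0), "C": (0, 2), "D": (2, 4), "E": (3, 5)}


def euclidean(p, q):
    """Straight-line distance between two 2-D points p and q."""
    return math.sqrt((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)


def nearest_centroid(point, centroids):
    """Index of the centroid closest to point (a tie goes to the first one)."""
    distances = [euclidean(point, c) for c in centroids]
    return distances.index(min(distances))


# Starting centroids as chosen in Step1: C1 = A(1,1) and C2 = C(0,2)
initial_centroids = [POINTS["A"], POINTS["C"]]

# Map each label to the index of its nearest centroid (0 -> C1, 1 -> C2)
first_assignment = {
    name: nearest_centroid(p, initial_centroids) for name, p in POINTS.items()
}
print(first_assignment)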
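import matplotlib.pyplot as plt

# Coordinates of the five sample points A, B, C, D, E from the notes above
X = [1, 1, 0, 2, 3]
Y = [1, 0, 2, 4, 5]

plt.xlabel("X-axis")
plt.ylabel("Y-axis")
# Scatter plot: the points are unordered, so they are not joined by a line
plt.scatter(X, Y)
# plt.plot(X, Y)
plt.show()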
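

# The hand-worked procedure in the notes (assign points, recompute means,
# repeat until nothing changes) can be sketched as a short loop. This is an
# illustrative plain-Python version, not a library implementation; the names
# kmeans, sample_points, final_labels and final_centroids are assumptions
# introduced for this example. It needs Python 3.8+ for math.dist.
import math


def kmeans(points, centroids, max_iter=100):
    """Run k-means on 2-D points starting from the given centroids.

    Returns (labels, centroids), where labels[i] is the index of the
    centroid that points[i] ended up assigned to.
    """
    centroids = list(centroids)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid
        new_labels = [
            min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        # Stop as soon as an iteration leaves the clusters unchanged
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members
        for i in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return labels, centroids


# Same data and starting centroids as the worked example: A(1,1) and C(0,2)
sample_points = [(1, 1), (1, 0), (0, 2), (2, 4), (3, 5)]
final_labels, final_centroids = kmeans(sample_points, [(1, 1), (0, 2)])
print(final_labels)
print(final_centroids)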