_dict_Math.txt
=================
Math - calculus:
e = 2.71828... = lim (1+1/N)^N as N -> infinity
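A small numeric check of the limit above. This is only a sketch; the helper name `e_approx` is made up for illustration.

```python
import math

def e_approx(n: int) -> float:
    """Finite-N approximation of e: (1 + 1/N)^N."""
    return (1.0 + 1.0 / n) ** n

# The approximation gets closer to math.e as N grows.
for n in (10, 1_000, 1_000_000):
    approx = e_approx(n)
    print(n, approx, abs(math.e - approx))
```

The error shrinks roughly like e/(2N), so convergence is slow.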
ln, log
derivative - slope of a curve
integral - area under the curve
Dirac Delta Function
multivariate calculus
partial derivatives (derivative along one dimension)
gradient
gradient descent
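A minimal sketch of gradient descent, assuming a simple two-variable bowl-shaped function chosen only for illustration. The names `gradient_descent` and `grad_f`, the learning rate and the step count are all assumptions.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=200):
    """Repeatedly step against the gradient to move toward a minimum."""
    x = list(start)
    for _ in range(steps):
        g = grad(x)
        x = [xi - learning_rate * gi for xi, gi in zip(x, g)]
    return x

def grad_f(p):
    """Gradient (vector of partial derivatives) of f(x, y) = (x - 3)^2 + (y + 1)^2."""
    x, y = p
    return [2.0 * (x - 3.0), 2.0 * (y + 1.0)]

minimum = gradient_descent(grad_f, start=[0.0, 0.0])
print(minimum)  # close to [3.0, -1.0], the minimum of f
```

Each component of `grad_f` is a partial derivative. A learning rate that is too large makes the steps overshoot and diverge.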
=================
Math - linear algebra:
vector - array of scalars (individual numbers)
matrix - 2-dimensional table of scalars (its rows and columns are vectors)
tensor - multi-dimensional matrix
dot-product of vectors and matrices
vector spaces, base vectors
matrices of space transformations
determinant of a matrix (number showing transformation of volume)
rank of a matrix - number of independent dimensions
eigenvectors, eigenvalues - solutions of
Ax = lambda * x
(x=vector, lambda=value).
Idea - turn the matrix into diagonal form, which
simplifies matrix multiplication.
Also, in many cases you can simplify calculations
by concentrating only on the dimensions corresponding
to the largest eigenvalues (this is the
foundation of PCA = Principal Component Analysis).
inverse matrix - the 'reciprocal' of a matrix
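A hand-checkable illustration of Ax = lambda * x, assuming a small symmetric 2x2 matrix picked for the example. The helper `mat_vec` is hypothetical.

```python
def mat_vec(a, x):
    """Multiply matrix a (list of rows) by vector x using row-by-vector dot products."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

A = [[2.0, 1.0],
     [1.0, 2.0]]

# Known eigenpairs of A: (lambda, x)
eigen_pairs = [(3.0, [1.0, 1.0]), (1.0, [1.0, -1.0])]

for lam, x in eigen_pairs:
    # Both printed vectors are equal: A x == lambda * x
    print(mat_vec(A, x), [lam * xi for xi in x])

# For a 2x2 matrix the determinant equals the product of the eigenvalues.
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]
print(det_A)  # 3.0
```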
=================
Probability & Statistics:
probability - the chance of something happening
combinatorics, subsets
the number of ways to choose m objects from n
without replacement:
C(n,m) = n!/(m!*(n-m)!)
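The formula for C(n,m) can be checked directly. This sketch compares a literal translation (the hypothetical `n_choose_m`) with `math.comb` from the Python standard library (Python 3.8+).

```python
import math

def n_choose_m(n: int, m: int) -> int:
    """C(n,m) = n! / (m! * (n-m)!), computed with exact integers."""
    return math.factorial(n) // (math.factorial(m) * math.factorial(n - m))

print(n_choose_m(5, 2))   # 10
print(math.comb(52, 5))   # 2598960 five-card hands from a 52-card deck
```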
Venn Diagrams - graphical illustration of
two (or more) event types happening.
They are presented as circles (possibly intersecting).
Conditional Probability - the chance of something
happening given that something else has happened:
P(A|B)
Bayes's Theorem:
P(B|A) = P(A|B)*P(B)/P(A).
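A worked example of Bayes's Theorem in the same notation. All the numbers are invented for illustration: B = "has the condition", A = "test is positive".

```python
def bayes(p_a_given_b: float, p_b: float, p_a: float) -> float:
    """P(B|A) = P(A|B) * P(B) / P(A)."""
    return p_a_given_b * p_b / p_a

p_b = 0.01              # assumed prior: 1% have the condition
p_a_given_b = 0.95      # assumed chance of a positive test if the condition is present
p_a_given_not_b = 0.05  # assumed false-positive rate

# Total probability of a positive test.
p_a = p_a_given_b * p_b + p_a_given_not_b * (1.0 - p_b)

p_b_given_a = bayes(p_a_given_b, p_b, p_a)
print(round(p_b_given_a, 3))  # 0.161
```

Even with a fairly accurate test, a positive result here implies only about a 16% chance of the condition, because the prior P(B) is small.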
random variable - a quantity which can randomly
take different values, each with a probability
which may depend on the value.
A random variable may have discrete values (dice),
or continuous values in some range.
probability distribution - describes how
the probability of a value depends on the value.
For example, uniform distribution,
normal (Gaussian) distribution, etc.
expected (average, mean) value - the sum (or integral)
of the values multiplied by their probabilities.
variance - the 'spread' of a probability
distribution, calculated as the expected value
(probability-weighted sum or integral) of the squares
of differences between values and expected value
standard deviation - square root of variance
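A short sketch of expected value, variance and standard deviation for a fair six-sided die. Each term is weighted by the probability of the value.

```python
import math

values = [1, 2, 3, 4, 5, 6]
probs = [1.0 / 6.0] * 6  # fair die: every face equally likely

# Expected value: sum of value * probability.
mean = sum(v * p for v, p in zip(values, probs))

# Variance: probability-weighted sum of squared differences from the mean.
variance = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))

std_dev = math.sqrt(variance)
print(mean, variance, std_dev)  # about 3.5, 2.9167 (= 35/12), 1.7078
```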
PDF(x) = Probability Density Function - the
probability per unit of "x"; its integral over a
range is the chance of a value occurring in that range
CDF(x) = Cumulative Distribution Function (also
written CPF) - an integral of PDF - the chance
that value is lower than "x"
Binomial distribution - tossing a coin.
P(x) is the probability of x successes
out of N trials
Poisson distribution - arises as a limit
of the binomial distribution when we are trying
to solve the problem of how many events
would occur during a given time
P(x; l) = exp(-l) * l^x / x!
where l = lambda = expected number of
events per unit of time.
For Poisson distribution mean = variance = l
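The Poisson formula above can be checked numerically. This sketch sums the probabilities over enough values of x that the remaining tail is negligible; the choice l = 3 is arbitrary.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """P(x; l) = exp(-l) * l^x / x!"""
    return math.exp(-lam) * lam ** x / math.factorial(x)

lam = 3.0
xs = range(60)  # tail beyond x = 60 is negligible for l = 3

total = sum(poisson_pmf(x, lam) for x in xs)
mean = sum(x * poisson_pmf(x, lam) for x in xs)
variance = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in xs)

print(total, mean, variance)  # about 1.0, 3.0, 3.0
```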
Exponential Distribution - distribution of
time gaps between events for a Poisson process
Uniform Distribution - a probability density
function in which every value is equally likely
Central Limit Theorem (CLT) - foundational theorem
in statistics leading to Normal (Gaussian) Distribution.
Here is an illustration of how it works.
Suppose that we have a big number N1 of observations
of a random variable. We randomly split all
this data into N2 groups - and then in each
group we calculate the mean value.
Now we make a histogram of those N2 mean values.
If the groups are big enough (N1/N2 is large) and
N2 is big enough, the histogram
will have a familiar bell shape.
As the groups and N2 become larger, the shape will
approach the gaussian formula (with x shifted and
scaled to zero mean and unit variance):
exp(-x^2/2)
In other words, regardless of the PDF of the
underlying random variable (as long as its variance
is finite), the distribution of
mean values of subgroups will approach
bell shape.
In nature we meet CLT all the time, because
we have groups of groups of groups
(atoms form molecules which form bigger structures,
etc.)
In finance, events are often neither independent nor
identically distributed - so the CLT assumptions do not hold,
and financial returns are not normally distributed.
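The splitting experiment described for the CLT can be sketched as a simulation. The underlying variable here is assumed to be exponential (strongly skewed, mean 1, standard deviation 1); N1, N2 and the seed are arbitrary choices.

```python
import random
import statistics

random.seed(0)

N1 = 200_000            # total number of observations
N2 = 2_000              # number of groups
group_size = N1 // N2   # 100 observations per group

# Skewed, clearly non-gaussian source data.
data = [random.expovariate(1.0) for _ in range(N1)]
random.shuffle(data)

# Mean value of each group.
group_means = [
    statistics.fmean(data[i * group_size:(i + 1) * group_size])
    for i in range(N2)
]

center = statistics.fmean(group_means)
spread = statistics.stdev(group_means)

# For a bell-shaped histogram roughly 68% of the means lie within one sigma.
within_one_sigma = sum(abs(m - center) <= spread for m in group_means) / N2

print(center, spread, within_one_sigma)
```

The spread of the group means should be close to 1/sqrt(group_size) = 0.1, even though the source data is far from bell-shaped.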
=================
More Statistics
Sample mean (average) Xm = sum(Xi)/N
Sample Variance = sum( (Xi-Xm)^2 )/(N-1)
Sample Standard Deviation = sqrt( Sample Variance )
Note - divides by (N-1) to correct for the fact
that we have already used one degree of freedom when
we calculated the mean Xm
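A small check of the sample formulas with the (N-1) divisor, compared against `statistics.stdev` from the Python standard library. The data values are made up.

```python
import math
import statistics

xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(xs)

xm = sum(xs) / n                                       # sample mean
sample_var = sum((x - xm) ** 2 for x in xs) / (n - 1)  # divide by N-1
sample_std = math.sqrt(sample_var)                     # square root of the variance

print(xm, sample_var, sample_std)
print(statistics.stdev(xs))  # the library also divides by N-1
```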
Median - the 50th percentile of ranked observations
from least to greatest.
Linear Regression - draw a line through points:
an algorithm that fits a straight line
that minimizes the sum of squared errors
between this line and data points.
OLS = Ordinary Least Squares - a specific simple
implementation of Linear Regression.
R2 = Coefficient of Determination - the proportion
of variance explained by a model fit. We want
R2 to be close to 1 for good fit.
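A minimal sketch of OLS for one explanatory variable, together with R2. The function name `ols_fit` and the data points are assumptions made for the example.

```python
def ols_fit(xs, ys):
    """Fit y = intercept + slope * x by least squares; return (slope, intercept, r2)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # R2 = 1 - (unexplained variation) / (total variation)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    r2 = 1.0 - ss_res / ss_tot
    return slope, intercept, r2

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]   # roughly y = 1 + 2x with small noise
slope, intercept, r2 = ols_fit(xs, ys)
print(slope, intercept, r2)       # about 1.96, 1.10, 0.998
```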
Z-score - distance from the mean measured in
standard deviations (sigmas). For example,
Z = 2 means that the value is above mean by 2 sigmas.
Z = -1.5 means that the value is below mean by 1.5 sigmas.
Confidence Interval - a range around mean value
that contains certain percentage of the distribution.
68-95-99.7 rule: simple rule to remember
the percentage of values that lie within a band
around the mean in a normal distribution
with a width of 2, 4, or 6 standard deviations.
For normal distribution:
Z-value   Range    Confidence
1         -1, 1    68.27%
2         -2, 2    95.45%
3         -3, 3    99.73%
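The percentages for the normal distribution can be reproduced with the error function from the Python standard library. The helper name `prob_within` is hypothetical.

```python
import math

def prob_within(z: float) -> float:
    """Probability that a normal variable lies within z sigmas of its mean."""
    return math.erf(z / math.sqrt(2.0))

for z in (1, 2, 3):
    print(z, f"{100.0 * prob_within(z):.2f}%")
# 1 68.27%
# 2 95.45%
# 3 99.73%
```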
not needed: Student's t-distribution
Chi-squared distribution
F-distribution
Gamma-distribution
Stochastic Processes - a time series of observations.
They may be empirical observations or values
generated at random from a distribution.
Time Series Analysis - methods of uncovering
patterns in time series to make predictions.
Random Walk (RW) - a simple stochastic process consisting
of steps (up or down) with known constant probabilities.
Brownian motion - extension of Random Walk
when the number of steps becomes very large (and the
steps very small). Displacements are
effectively generated from a gaussian distribution
Diffusion - long term spread of observations
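A sketch of a simple random walk with equal up/down probabilities, showing the diffusion-like spread: after n steps the typical distance from the start grows like sqrt(n). Step count, number of walks and seed are arbitrary.

```python
import random
import statistics

random.seed(1)

def random_walk_endpoint(steps: int, p_up: float = 0.5) -> int:
    """Final position after `steps` moves of +1 (probability p_up) or -1."""
    position = 0
    for _ in range(steps):
        position += 1 if random.random() < p_up else -1
    return position

steps = 400
endpoints = [random_walk_endpoint(steps) for _ in range(2_000)]

# Spread of the endpoints across many walks; expected near sqrt(400) = 20.
spread = statistics.pstdev(endpoints)
print(statistics.fmean(endpoints), spread)
```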
Poisson Process - a stochastic process with known
expected number of events per unit of time (lambda).
The actual observed number of events will follow
Poisson distribution.
Both mean and variance are equal to lambda.
The gaps between events follow exponential distribution.
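A Poisson process can be simulated by drawing exponential gaps between events and counting events per unit of time. This is a sketch with an arbitrary lambda = 4 and an arbitrary time horizon.

```python
import random
import statistics

random.seed(2)

lam = 4.0        # expected number of events per unit of time
horizon = 5_000  # number of unit-length time intervals to simulate

counts = [0] * horizon
t = random.expovariate(lam)         # time of the first event
while t < horizon:
    counts[int(t)] += 1             # event falls into interval [int(t), int(t)+1)
    t += random.expovariate(lam)    # exponential gap to the next event

mean_count = statistics.fmean(counts)
var_count = statistics.pvariance(counts)
print(mean_count, var_count)  # both should be close to lam
```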
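White noise - time series of uncorrelated random values
with zero mean and constant variance (flat power spectrum).
The values may come from any distribution, e.g. uniform.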
Gaussian noise - time series generated using a
gaussian distribution
Markov Process - a stochastic process without memory
Next value depends only on current value (memoryless).
Monte Carlo method - using random trials to model
events for measuring or predictive purposes.
MCMC (Markov Chain Monte Carlo) - running Monte
Carlo simulations using Markov Chain model
Correlation function - function calculating how
two variables relate to each other. If both
variables go up and down synchronously, we
say that they are correlated.
Autocorrelation - how a random variable relates
to itself in the past.
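A minimal sketch of autocorrelation at a given lag, using one common estimator (lagged covariance divided by the lag-0 variance). The function name and test series are assumptions.

```python
def autocorrelation(series, lag: int) -> float:
    """Correlation of a series with a copy of itself shifted by `lag` steps."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return num / denom

# Alternating series: each value is the opposite of the previous one.
alternating = [1.0 if i % 2 == 0 else -1.0 for i in range(100)]

print(autocorrelation(alternating, 0))  # 1.0
print(autocorrelation(alternating, 1))  # -0.99 (strongly anti-correlated)
print(autocorrelation(alternating, 2))  # 0.98 (strongly correlated)
```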
Fourier Analysis, filtering to reduce noise - e.g.
taking some form of average of past time series
values to create a smoother time series, albeit
one that suffers from lag or some other property
related to using past data.
Extracting Signal from Noise by synchronization -
averaging N synchronized repetitions of the signal
(Signal-to-Noise ratio improves ~sqrt(N))
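A sketch of the ~sqrt(N) improvement, assuming the same sine-wave signal is recorded N times with independent gaussian noise and the aligned recordings are averaged. The signal shape, noise level and seed are made up.

```python
import math
import random
import statistics

random.seed(3)

length = 1_000
signal = [math.sin(2.0 * math.pi * i / 100.0) for i in range(length)]

def noisy_copy():
    """One recording: the signal plus gaussian noise with sigma = 1."""
    return [s + random.gauss(0.0, 1.0) for s in signal]

def averaged(n: int):
    """Point-by-point average of n aligned (synchronized) recordings."""
    total = [0.0] * length
    for _ in range(n):
        for i, v in enumerate(noisy_copy()):
            total[i] += v
    return [v / n for v in total]

def residual_noise(estimate) -> float:
    """Standard deviation of what is left after subtracting the true signal."""
    return statistics.pstdev([e - s for e, s in zip(estimate, signal)])

noise_1 = residual_noise(averaged(1))      # about 1.0
noise_100 = residual_noise(averaged(100))  # about 0.1
print(noise_1, noise_100, noise_1 / noise_100)  # ratio near sqrt(100) = 10
```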