@@ -27,7 +27,7 @@ premature optimization is the root of all evil." -- Donald Knuth
2727
2828## Overview
2929
30- It's probably safe to say that Python is the most popular language for scientific computing.
30+ Python is the most popular language for many aspects of scientific computing.
3131
3232This is due to
3333
@@ -74,29 +74,30 @@ Let's briefly review Python's scientific libraries.
7474
7575### Why do we need them?
7676
77- One reason we use scientific libraries is because they implement routines we want to use.
77+ We need Python's scientific libraries for two reasons:
7878
79- * numerical integration, interpolation, linear algebra, root finding, etc.
79+ 1. Python is small
80+ 2. Python is slow
81+
82+ **Python is small**
8083
81- For example, it's usually better to use an existing routine for root finding than to write a new one from scratch.
84+ Core Python is small by design -- this helps with optimization, accessibility, and maintenance.
8285
83- (For standard algorithms, efficiency is maximized if the community can
84- coordinate on a common set of implementations, written by experts and tuned by
85- users to be as fast and robust as possible!)
86+ Scientific libraries provide the routines we don't want to -- and probably shouldn't -- write ourselves:
8687
87- But this is not the only reason that we use Python's scientific libraries .
88+ * numerical integration, interpolation, linear algebra, root finding, etc.
8889
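For instance, a routine like SciPy's `brentq` packages an expert-written root-finding algorithm in a single call (a sketch on an arbitrary test function):

```python
from scipy.optimize import brentq

# brentq finds a root of f on [a, b] where f changes sign --
# an expert-written, well-tested routine we need not reinvent
f = lambda x: x**2 - 2
root = brentq(f, 0.0, 2.0)
assert abs(root - 2**0.5) < 1e-10
```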
89- Another is that pure Python is not fast.
90+ **Python is slow**
9091
91- So we need libraries that are designed to accelerate execution of Python code .
92+ Another reason we need the scientific libraries is that pure Python is relatively slow.
9293
93- They do this using two strategies:
94+ Scientific libraries accelerate execution using three main strategies:
9495
95- 1 . using compilers that convert Python-like statements into fast machine code for individual threads of logic and
96- 2 . parallelizing tasks across multiple "workers" (e.g., CPUs, individual threads inside GPUs).
96+ 1. Vectorization: providing pre-compiled machine code, plus interfaces that make this code accessible
97+ 2. JIT compilation: compilers that convert Python-like statements into fast machine code at runtime
98+ 3. Parallelization: spreading tasks across multiple threads, CPUs, GPUs, or TPUs
9799
98- We will discuss these ideas extensively in this and the remaining lectures from
99- this series.
100+ We will discuss these ideas in depth below.
100101
101102
102103### Python's Scientific Ecosystem
@@ -123,7 +124,7 @@ Here's how they fit together:
123124* Pandas provides types and functions for manipulating data.
124125* Numba provides a just-in-time compiler that plays well with NumPy and helps accelerate Python code.
125126
126- We will discuss all of these libraries extensively in this lecture series.
127+ We will discuss all of these libraries at length in this lecture series.
127128
128129
129130## Pure Python is slow
@@ -189,15 +190,13 @@ a, b = ['foo'], ['bar']
189190a + b
190191```
191192
192- (We say that the operator ` + ` is * overloaded* --- its action depends on the
193- type of the objects on which it acts)
194193
195- As a result, when executing ` a + b ` , Python must first check the type of the objects and then call the correct operation.
194+ As a result, when executing `a + b`, Python must first check the type of the
195+ objects and then call the correct operation.
196196
197- This involves a nontrivial overhead.
197+ This involves overhead.
198198
199- If we repeatedly execute this expression in a tight loop, the nontrivial
200- overhead becomes a large overhead.
199+ If we repeatedly execute this expression in a tight loop, the overhead becomes large.
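To see the dispatch at work, the same `+` resolves to a different method depending on the operand types (a minimal illustration):

```python
# The + operator is resolved at runtime via the operands' types:
# each call below dispatches to a different __add__ implementation
assert (1).__add__(2) == 3                 # integer addition
assert [1].__add__([2]) == [1, 2]          # list concatenation
assert "foo".__add__("bar") == "foobar"    # string concatenation
```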
201200
202201
203202#### Static types
@@ -243,38 +242,29 @@ To illustrate, let's consider the problem of summing some data --- say, a collec
243242
244243#### Summing with Compiled Code
245244
246- In C or Fortran, these integers would typically be stored in an array, which
247- is a simple data structure for storing homogeneous data.
245+ In C or Fortran, an array of integers is stored in a single contiguous block of memory.
248246
249- Such an array is stored in a single contiguous block of memory
250-
251- * In modern computers, memory addresses are allocated to each byte (one byte = 8 bits).
252247* For example, a 64 bit integer is stored in 8 bytes of memory.
253248* An array of $n$ such integers occupies $8n$ * consecutive* memory slots.
254249
255- Moreover, the compiler is made aware of the data type by the programmer.
256-
257- * In this case 64 bit integers
250+ Moreover, the data type is known at compile time.
258251
259252Hence, each successive data point can be accessed by shifting forward in memory
260253space by a known and fixed amount.
261254
262- * In this case 8 bytes
263255
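This layout can be inspected from Python via NumPy, whose arrays mimic C/Fortran arrays (a small sketch; the array contents are arbitrary):

```python
import numpy as np

# A NumPy int64 array mirrors a C/Fortran array: one contiguous
# block, 8 bytes per element, next element a fixed 8 bytes ahead
a = np.arange(10, dtype=np.int64)
assert a.itemsize == 8              # one 64-bit integer = 8 bytes
assert a.nbytes == 8 * a.size       # n integers occupy 8n bytes
assert a.strides == (8,)            # fixed shift between elements
assert a.flags['C_CONTIGUOUS']      # single contiguous block
```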
264256#### Summing in Pure Python
265257
266258Python tries to replicate these ideas to some degree.
267259
268- For example, in the standard Python implementation (CPython), list elements are placed in memory locations that are in a sense contiguous.
260+ For example, in the standard Python implementation (CPython), list elements are
261+ placed in memory locations that are in a sense contiguous.
269262
270263However, these list elements are more like pointers to data rather than actual data.
271264
272265Hence, there is still overhead involved in accessing the data values themselves.
273266
274- This is a considerable drag on speed.
275-
276- In fact, it's generally true that memory traffic is a major culprit when it comes to slow execution.
277-
267+ Such overhead is a major culprit when it comes to slow execution.
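A rough illustration of that overhead (sizes are approximate and implementation-dependent): a list of $n$ integers stores a pointer table plus $n$ boxed objects, while a NumPy array stores just the raw 8-byte values:

```python
import sys
import numpy as np

n = 100_000
list_data = list(range(n))                  # boxed Python ints
array_data = np.arange(n, dtype=np.int64)   # raw 64-bit ints

# The list needs a pointer table plus one object header per element
list_bytes = sys.getsizeof(list_data) + sum(sys.getsizeof(x) for x in list_data)
assert array_data.nbytes == 8 * n           # just the data
assert list_bytes > 3 * array_data.nbytes   # pointers + boxing overhead
```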
278268
279269
280270### Summary
@@ -295,15 +285,11 @@ synonymous with parallelization.
295285
296286This task is best left to specialized compilers!
297287
298- Certain Python libraries have outstanding capabilities for parallelizing scientific code -- we'll discuss this more as we go along.
299-
300-
301288
302289
303290## Accelerating Python
304291
305- In this section we look at three related techniques for accelerating Python
306- code.
292+ In this section we look at three related techniques for accelerating Python code.
307293
308294Here we'll focus on the fundamental ideas.
309295
@@ -325,10 +311,11 @@ Many economists usually refer to array programming as "vectorization."
325311In computer science, this term has [a slightly different meaning](https://en.wikipedia.org/wiki/Automatic_vectorization).
326312```
327313
328- The key idea is to send array processing operations in batch to pre-compiled
329- and efficient native machine code.
314+ The key idea is to send array processing operations in batch to pre-compiled and
315+ efficient native machine code.
330316
331- The machine code itself is typically compiled from carefully optimized C or Fortran.
317+ The machine code itself is typically compiled from carefully optimized C or
318+ Fortran.
332319
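For instance (a minimal sketch with arbitrary values), one NumPy expression applies arithmetic to every element in compiled code, with no Python-level loop:

```python
import numpy as np

# One whole-array expression; the loop over elements happens in
# pre-compiled machine code, not in the Python interpreter
x = np.linspace(0, 1, 5)
y = x**2 + 2*x
assert np.allclose(y, [xi**2 + 2*xi for xi in x])
```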
333320For example, when working in a high level language, the operation of inverting a
334321large matrix can be subcontracted to efficient machine code that is pre-compiled
@@ -346,6 +333,7 @@ The idea of vectorization dates back to MATLAB, which uses vectorization extensi
347334```{figure} /_static/lecture_specific/need_for_speed/matlab.png
347334```
348335
336+ NumPy uses a similar model, inspired by MATLAB.
349337
350338
351339### Vectorization vs pure Python loops
@@ -423,19 +411,17 @@ can be run) has slowed dramatically in recent years.
423411Chip designers and computer programmers have responded to the slowdown by
424412seeking a different path to fast execution: parallelization.
425413
426- Hardware makers have increased the number of cores (physical CPUs) embedded in each machine.
414+ This involves
427415
428- For programmers, the challenge has been to exploit these multiple CPUs by
429- running many processes in parallel (i.e., simultaneously).
416+ 1. increasing the number of CPUs embedded in each machine
417+ 2. connecting hardware accelerators such as GPUs and TPUs
430418
431- This is particularly important in scientific programming, which requires handling
432-
433- * large amounts of data and
434- * CPU intensive simulations and other calculations.
419+ For programmers, the challenge has been to exploit this hardware by
420+ running many processes in parallel.
435421
436422Below we discuss parallelization for scientific computing, with a focus on
437423
438- 1 . the best tools for parallelization in Python and
424+ 1. tools for parallelization in Python and
4394252. how these tools can be applied to quantitative economic problems.
440426
441427
@@ -447,22 +433,18 @@ scientific computing and discuss their pros and cons.
447433
448434#### Multiprocessing
449435
450- Multiprocessing means concurrent execution of multiple processes using more than one processor.
451-
452- In this context, a ** process** is a chain of instructions (i.e., a program).
436+ Multiprocessing means concurrent execution of multiple processes using more than one processor.
453437
454438Multiprocessing can be carried out on one machine with multiple CPUs or on a
455- collection of machines connected by a network.
439+ cluster of machines connected by a network.
456440
457- In the latter case, the collection of machines is usually called a
458- ** cluster** .
441+ With multiprocessing, *each process has its own memory space*, although the physical memory chip might be shared.
459442
460- With multiprocessing, each process has its own memory space, although the
461- physical memory chip might be shared.
462443
463444#### Multithreading
464445
465- Multithreading is similar to multiprocessing, except that, during execution, the threads all share the same memory space.
446+ Multithreading is similar to multiprocessing, except that, during execution, the
447+ threads all *share the same memory space*.
466448
467449Native Python struggles to implement multithreading due to some [legacy design
468450features](https://wiki.python.org/moin/GlobalInterpreterLock).
@@ -472,6 +454,7 @@ But this is not a restriction for scientific libraries like NumPy and Numba.
472454Functions imported from these libraries and JIT-compiled code run in low level
473455execution environments where Python's legacy restrictions don't apply.
474456
457+
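As an illustration (the function and matrix sizes are arbitrary), NumPy's compiled linear algebra routines can release the lock, so a standard thread pool can run them concurrently:

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def spectral_radius(seed):
    # np.linalg work runs in compiled code where Python's legacy
    # restrictions don't apply, so these calls can overlap in time
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((100, 100))
    return np.abs(np.linalg.eigvals(A)).max()

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(spectral_radius, range(4)))

assert len(results) == 4 and all(r > 0 for r in results)
```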
475458#### Advantages and Disadvantages
476459
477460Multithreading is more lightweight because most system and memory resources