diff --git a/docs/newChapters/the_law/5a_HowTo_memoize.md b/docs/newChapters/the_law/5a_HowTo_memoize.md index cdb3add..2ef7aad 100644 --- a/docs/newChapters/the_law/5a_HowTo_memoize.md +++ b/docs/newChapters/the_law/5a_HowTo_memoize.md @@ -1,20 +1,8 @@ # HowTo: memoize? -Memoize is the act of caching the output for a particular function given the same set of inputs. It is a good first regulator to learn. Below is a basic memoization function: +Memoize is the act of caching the output for a particular set of inputs for a function. It is one of the primary targets for higher-order functions, and the first higher-order function you should learn. -```javascript -function memoize(original) { - const cache = {}; - return function regulator(...args) { - const key = JSON.stringify(args); - if (cache[key]) - return cache[key]; - return cache[key] = original(...args); - } -} -``` - -## Why: memoize? +## Memoize use cases 1. `fetch`. Use memoize to create your own network cache. This might be better for certain types of apps, but at the same time, investigate how your server/browser already caches network requests, as you will now have many many caches which can give you great pain. @@ -22,89 +10,55 @@ function memoize(original) { 3. Heavy `sync` functions, such as heavy mathematics, text processing, DOM node creation. -## Which: functions can we memoize? +You should only memoize pure functions, functions that: -You should only memoize pure functions: +1. only get pure values as input, +2. only produce pure or cloneable values, and +3. have no side effects. -1. Pure functions get pure values as input. If you memoize a function whose input values are impure, ie. they contain state information that cannot be serialized/signatured consistently, then your memoized regulated functions might think that: - * two functions that are given different input arguments are - * two functions that are given identical input arguments. +If your input values are not pure, ie. they contain state information that cannot be serialized/signatured consistently, then your memoized regulated functions might think that: -2. Pure functions produce pure (or at least cloneable) values. If your original function produces an impure output such as a `Stream` object, then these impure objects cannot be reused (without first being cloned or similar). +* two functions that are given different input arguments are +* two functions that are given identical input arguments. -3. Pure functions have no side effects. If your original function has side-effects, and then the memoize regulator causes some invocations to this function to be skipped, then you have a problem. If the original function has side-effects, you must evaluate if you can skip those side-effects before you memoize it. +If your function has side-effects, such as making a network request or updating a system counter, then every invocation of the original function can trigger those side effects. Thus, when your memoized regulated function call makes a hit in its cache, then it will skip those side-effects. This is often good, but an active decision must be made whether this is desired in each instance. -## Why: `JSON.stringify(..)`? +If your original function produces a impure output such as a `Response` or `Stream` object, then these impure objects cannot be reused. Such original functions can never be memoized. -Note: The regulator in the `memoize` function above creates a argument signature key using `JSON.stringify(args)`. `JSON.stringify(args)` is the recommended approach because: +## Step1: Basic memoization -1. `JSON.stringify(args)` is generic in terms of argument type and count, -2. `JSON.stringify(args)` is very simple, and -3. `JSON.stringify(args)` is relatively fast. - -However, be aware that JSON doesn't distinguish between `undefined` and `null` and unset properties: -`JSON.stringify({a: [, undefined], b: undefined}) === JSON.stringify({a: [null, null]})`. - -## HowTo: memoizeLRU? - -LRU is short for Least Recently Used (or Last Recently Used, whichever you prefer). LRU is a simple mechanism to avoid saving everything forever, and using too much memory. In fact, if you intend for your application to run for a little while, and expect a function to be called with different arguments, then you probably should make your memoize function with an LRU strategy. +The basic memoization function looks like this: ```javascript -function memoizeLRU(original, size = 100) { +function memoize(original) { const cache = {}; - return function regulator(...args) { + return function memoized(...args) { const key = JSON.stringify(args); - if (key in cache) { - const value = cache[key]; //we delete the key (entry) - delete cache[key]; //and then we put the key (entry) back in - return cache[key] = value; //this changes the order of the key in the object. - } - const keys = Object.keys(cache); //the last used keys are at the end of Object.keys() - keys.length >= size && delete cache[keys[0]]; //thus, keys[0] is the least recently used - const res = original(...args); - return cache[key] = res; + if (cache[key]) + return cache[key]; + return cache[key] = original(...args); } } - -const memoAbs = memoizeLRU(Math.abs); // -const two = memoAbs(-2); -const twotwo = memoAbs(-2); ``` -//todo move async lru without error handling. +### Why: `JSON.stringify(..)`? -```javascript -function memoizeLRUasync(original, size = 100) { - const cache = {}; - const regulator = function (...args) { - const key = JSON.stringify(args); - if (key in cache) { - const value = cache[key]; - delete cache[key];//by deleting the key, we change the key order when we put it back in. LRU. Presto! - return cache[key] = value; - } - const keys = Object.keys(cache); - keys.length >= size && delete cache[keys[0]]; - const res = original(...args); - if (!(res instanceof Promise)) - return cache[key] = res; - res.catch(err => delete cache[key]); - return cache[key] = res.then(res2 => cache[key] = res2); - } - Object.defineProperty(original, 'name', {value: '_mem_' + original.name}); - return regulator; -} +Note that the function used to create a signature key for the arguments is `JSON.stringify(args)`. `JSON.stringify(..)` is the recommended approach because it is a) generic in terms of argument type and count, b) very simple, and c) fast. But, be aware that JSON doesn't distinguish between `undefined` and `null` and unset properties: -const memoAbs = memoizeLRUasync(Math.abs); // -const two = memoAbs(-2); -const twotwo = memoAbs(-2); +```javascript +console.log(JSON.stringify({a: [, undefined], b: undefined})); +console.log(JSON.stringify({a: [null, null]})); ``` -## WhatAbout: `Error`s? +## Step2: to remember ones mistakes, or to move on? That is the question. + +If the original function fails and throws an `Error`, then the basic memoization function above will not work. -> Should we `memoize` our original's mistakes? Or should we just move on? +So, the question is: do we want to remember previous `Error`s and cache them? Or do we hope that the `Error` is temporary and that the function *might* succeed next time we ask it? -When we memoize `Error`s we must create two separate cache tables in order to simply distinguish between cached outputs that should be `return`ed and cached errors that should be `throw`n: +Often, `Error`s might right themselves. For example, the `Error` might be caused by a faulty network connection, and as soon as the internet connection is re-established, the function will work. A function that gets data from the network, disc, system or whatever, might very well be considered pure enough for memoization. If that is the case, you do not wish to cache `Error`s, and the basic implementation is good enough (for sync functions). + +But, if you know that the memoized function have no such context, then you might wish to cache any `Error`s. If so, here is how you do that: ```javascript function memoize(original) { @@ -126,127 +80,6 @@ function memoize(original) { } ``` -> If you don't want to memoize `Error`s, then simply use the basic approach. In the basic example at the beginning of the article, then if the original function fails and throws an `Error`, then the basic memoization function above will not `cache` anything: the error `throw`n will cancel the post-processing in the regulator. But. Is that a bad thing? Or a good thing? Do we want to cache previous `Error`s so to avoid calling functions to reproduce them? Or do we want to forget previous `Error`s and hope that it will just work next time? -> -> The answer is: it depends. Most of all it depends on the original function and your use of that original function. Secondly, it depends on the `Error`, its type and instance and context. For some original functions and some `Error`s you might like to try several times, for other original functions, you might like to avoid making the same mistake twice, needlessly. - -## Demo: Cloneable 1: `DOMNodes` - -To make a set of `DOMNode`s is a good example of a heavy sync process that is a) most often possible to memoize if you b) always `cloneNode(true)` the result. Here is a demo of such a `memoize` regulator that assumes a sync original function that will not fail/`throw` anything. - -```javascript -function memoizeDOMNodes(original) { - const cache = {}; - return function regulator(...args) { - const key = JSON.stringify(args); - if (key in cache) { - const value = cache[key]; - delete cache[key]; - cache[key] = value; - return value.cloneNode(true); - } - const keys = Object.keys(cache); - keys.length >= size && delete cache[keys[0]]; - return cache[key] = original(...args); - } -} - -function makeH1(text) { - const h1 = document.createElement('h1'); - h1.innerText = text; - return h1; -} - -makeH1 = memoizeDOMNodes(makeH1); - -const a = makeH1('hello sunshine'); -const b = makeH1('hello world'); -const c = makeH1('hello sunshine'); -console.log(a !== c); -``` - -## Demo: Cloneable 2: `async` `Response`, with no `Error` - -The `fetch()` is another prime candidate for memoization. - -```javascript -function memoizeResponse(original) { - const cache = {}; - return function regulator(...args) { - const key = JSON.stringify(args); - if (key in cache) { - const value = cache[key]; - delete cache[key]; - cache[key] = value; - if (value instanceof Promise) - return value.then(response => response.clone()); - return value.clone(); - } - const keys = Object.keys(cache); - keys.length >= size && delete cache[keys[0]]; - const promise = original(...args); - cache[key] = promise; - promise.then(response => cache[key] = response); - promise.then(response => delete cache[key]); - return promise.then(response => response.clone()); - } -} - -const memoFetch = memoizeDOMNodes(fetch); - -(async function () { - - let a = memoFetch('https://example.com/a'); - if (a instanceof Promise){ - console.log("a is a ", a.constructor.name); - a = await a; - console.log("a is now a ", a.constructor.name); - } - const b = memoFetch('https://example.com/a'); - console.log("b is a ", a.constructor.name); - console.log(a !== b); -})(); -``` - -How you can memoize two impure, but cloneable outputs: `DOMNode` and `Response` objects. Making `DOMNode`s is a good example of a heavy sync ; `fetch()` is a good example of a heavy async function that returns a `Response` object that must be `clone()`d before (re)use. - -The demo below holds three different memoize functions: - -1. `memoizeDOMNodes`. This assumes a sync function, and will memoize/cache `Error`s. -2. `asyncMemoizeFetch`. This is the simple `async` variant, with no `Error` management. -3. `memoizeFetch`. This variant updates the cache to contain the `Response` object once the initial `Response` has settled. - -```javascript -function memoizeLRU(fun, size = 100) { - - function toCloneOrNot(output) { - return output instanceof Object && output.clone instanceof Function ? output.clone() : - output instanceof Object && output.cloneNode instanceof Function ? output.cloneNode(true) : output; - } - - const cache = {}; - const fun2 = function (...args) { - const key = JSON.stringify(args); - if (key in cache) { - const value = cache[key]; - delete cache[key]; - cache[key] = value; - return value instanceof Promise ? value.then(res2 => toCloneOrNot(res2)) : toCloneOrNot(value); - } - const keys = Object.keys(cache); - keys.length >= size && delete cache[keys[0]]; - const res = original(...args); - if (!(res instanceof Promise)) - return toCloneOrNot(cache[key] = res); - cache[key] = res; //adding the promise for now. - res.catch(err => delete cache[key]); - return res.then(res2 => toCloneOrNot(cache[key] = res2)); - } - Object.defineProperty(fun, 'name', {value: '_mem_' + fun.name}); - return fun2; -} -``` - ### Test: neverForget! ```javascript @@ -311,6 +144,37 @@ function asyncMemoize(original) { } ``` +## HowTo: memoizeLRU? + +LRU is short for Least Recently Used. It is a simple mechanism to avoid saving everything forever, and using too much memory. In fact, if you intend for your application to run for a little while, and expect a function to be called with different arguments, then you must use an LRU strategy. + +```javascript +function memoizeLRU(original, size = 100) { + const cache = {}; + const regulator = function (...args) { + const key = JSON.stringify(args); + if (key in cache) { + const value = cache[key]; + delete cache[key];//by deleting the key, we change the key order when we put it back in. LRU. Presto! + return cache[key] = value; + } + const keys = Object.keys(cache); + keys.length >= size && delete cache[keys[0]]; + const res = original(...args); + if (!(res instanceof Promise)) + return cache[key] = res; + res.catch(err => delete cache[key]); + return cache[key] = res.then(res2 => cache[key] = res2); + } + Object.defineProperty(original, 'name', {value: '_mem_' + original.name}); + return regulator; +} + +const memoAbs = memoizeLRU(Math.abs); // +const two = memoAbs(-2); +const twotwo = memoAbs(-2); +``` + ## HowTo: memoizeLRU and support cloneable? Often, the function you wish to memoize returns an impure, but cloneable value: `fetch` returns a `Response` object, or you may have a heavy custom function that returns a DOM node. You cannot simply reuse the previous object in this instance, but you can reuse a `clone()` of that object. @@ -346,13 +210,12 @@ function memoizeLRU(fun, size = 100) { Here is the memoize function that works for both pure outputs, `Reponse` objects, and DOM `Node`s. ```javascript -function memoizeLRU(fun, size = 100) { +function memoizeLRU(original, size = 100) { function toCloneOrNot(output) { return output instanceof Object && output.clone instanceof Function ? output.clone() : output instanceof Object && output.cloneNode instanceof Function ? output.cloneNode(true) : output; } - const cache = {}; const fun2 = function (...args) { const key = JSON.stringify(args); @@ -371,7 +234,7 @@ function memoizeLRU(fun, size = 100) { res.catch(err => delete cache[key]); return res.then(res2 => toCloneOrNot(cache[key] = res2)); } - Object.defineProperty(fun, 'name', {value: '_mem_' + fun.name}); + Object.defineProperty(original, 'name', {value: '_mem_' + original.name}); return fun2; } ``` @@ -422,6 +285,15 @@ const asyncSum2 = memoizeLRU(asyncSum); })(); ``` +Prints this: + +``` +true +true +true {hello: "sunshine"} {hello: "sunshine"} true +true