Advanced memoization for JavaScript functions with lodash.memoize

"In computing, memoization or memoisation is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again."

Wikipedia on Memoization

Lodash is a modern JavaScript utility library delivering modularity, performance & extras. It’s a ubiquitous library; your project probably uses it, even if you’re not aware of it. Lodash contains an implementation of memoization in the form of the memoize function.

By default, memoize uses only the first argument provided to the memoized function as the cache key. Here is the simplest possible example of memoization:

const { memoize } = require('lodash');

const func = (arg1) => `My name is ${arg1}`
const funcM = memoize(func);

console.dir(funcM('John')); // => 'My name is John'
console.dir(funcM('John')); // => 'My name is John' (cache hit)
console.dir(funcM.cache.size); // => 1

It’s important to realize that the cache key of the memoized function is compared using SameValueZero, so two structurally equal but distinct objects produce two separate cache entries:

const { memoize } = require('lodash');

const func = (arg1) => JSON.stringify(arg1)
const funcM = memoize(func);

console.dir(funcM({a: 1})); // => '{"a":1}'
console.dir(funcM({a: 1})); // => '{"a":1}' (cache miss; a different object reference)
console.dir(funcM.cache.size); // => 2

Now, what if we introduce more arguments to our func?

const { memoize } = require('lodash');

const func = (arg1, arg2) => `${arg1} ${arg2}`;
const funcM = memoize(func);

console.dir(funcM('John', 'Doe')); // => "John Doe"
console.dir(funcM('John', 'FooBar')); // => "John Doe" (cache hit; only 'John' is used as the key)
console.dir(funcM.cache.size); // => 1

We can see that there is an obvious problem: as stated above, lodash uses only the first argument as the cache key. To fix this, we have to provide a resolver function that computes the cache key from all arguments:

const { memoize } = require('lodash');

const func = (arg1, arg2) => `${arg1} ${arg2}`;
const resolver = (arg1, arg2) => JSON.stringify([arg1, arg2]);
const funcM = memoize(func, resolver);

console.dir(funcM('John', 'Doe')); // => "John Doe"
console.dir(funcM('John', 'FooBar')); // => "John FooBar"
console.dir(funcM.cache.size); // => 2

This resolver solution obviously wouldn’t scale. What if our func needs to consume complex objects as arguments? That’s still fine, unless the object arguments are huge or contain a cyclic reference.

When the arguments are huge objects, we’ll spend too much CPU time serializing them into JSON strings just to determine the cache key, and instead of a SameValueZero comparison on object references we’ll be comparing huge strings. When an object argument contains a cyclic reference, JSON.stringify throws a TypeError, so we cannot determine the cache key at all.
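
To make the second problem concrete, here is a minimal sketch (func, resolver, and the sample objects are made up for this demonstration) of a JSON-based resolver blowing up on a cyclic argument:

const { memoize } = require('lodash');

const func = (arg1, arg2) => `${arg1.name} ${arg2.name}`;
const resolver = (...args) => JSON.stringify(args);
const funcM = memoize(func, resolver);

const john = { name: 'John' };
john.self = john; // cyclic reference

funcM(john, { name: 'Doe' });
// => throws TypeError: Converting circular structure to JSON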

Lodash has one remaining trick up its sleeve to overcome these two problems: the ability to influence how cache keys are stored and compared. To achieve our goal, we’ll create a specialization of lodash.memoize, a function called memoizeN:

const { memoize } = require('lodash');

const sameValueZero = (x, y) => x === y || (Number.isNaN(x) && Number.isNaN(y));
const shallowArrayEquals = (a) => (b) => {
  return Array.isArray(a) && Array.isArray(b)
    && a.length === b.length
    && a.every((val, index) => sameValueZero(val, b[index]));
};
const list = (...args) => args;

// A Map whose key lookups compare keys (argument lists) element-wise using SameValueZero
class Cache extends Map {
  delete(key) {
    const keys = Array.from(this.keys());
    const foundKey = keys.find(shallowArrayEquals(key));
    return super.delete(foundKey);
  }
  get(key) {
    const keys = Array.from(this.keys());
    const foundKey = keys.find(shallowArrayEquals(key));
    return super.get(foundKey);
  }
  has(key) {
    const keys = Array.from(this.keys());
    return keys.findIndex(shallowArrayEquals(key)) !== -1;
  }
}

// Temporarily swap in our Cache implementation while lodash creates the memoized function
const memoizeN = (fn, { resolver = list } = {}) => {
  const { Cache: OriginalCache } = memoize;
  memoize.Cache = Cache;
  const memoized = memoize(fn, resolver);
  memoize.Cache = OriginalCache;
  return memoized;
};

The real trick here is that the memoized function’s arguments are transformed into a list (the default resolver), and this list forms the cache key. We temporarily override memoize.Cache with our own implementation, capable of looking up these list cache keys by comparing them element-wise, and then restore the original.
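
With just the default list resolver, memoizeN already fixes the earlier multi-argument example; a quick sketch, assuming the memoizeN definition above:

const func = (arg1, arg2) => `${arg1} ${arg2}`;
const funcM = memoizeN(func);

console.dir(funcM('John', 'Doe')); // => "John Doe"
console.dir(funcM('John', 'Doe')); // => "John Doe" (cache hit; both arguments match)
console.dir(funcM('John', 'FooBar')); // => "John FooBar"
console.dir(funcM.cache.size); // => 2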

Now observe the following example, which combines memoizeN with a custom resolver:

const complexObj = {a: 1};
const simpleObj = {b: 2};

const func = (arg1, arg2) => arg2;
const resolver = (arg1, arg2) => [arg1, JSON.stringify(arg2)];
const funcM = memoizeN(func, { resolver });

console.dir(funcM(complexObj, simpleObj)); // => {b: 2}
console.dir(funcM(complexObj, {b: 2})); // => {b: 2} (cache hit)
console.dir(funcM({a: 1}, {b: 2})); // => {b: 2} (cache miss; {a: 1} is a new object reference)
console.dir(funcM.cache.size); // => 2

We intend to memoize func using both arguments to form the cache key. For the first argument, the SameValueZero comparison is applied to the original value; for the second, to its JSON-stringified form.

We can go even further with the specialization and make memoizeN consume a list of comparator functions, applied to the corresponding arguments, instead of abusing the resolver to serialize some of them.

The implementation will look as follows:

const memoize = require('lodash/memoize');
const isEqual = require('lodash/isEqual');

const sameValueZero = (x, y) => x === y || (Number.isNaN(x) && Number.isNaN(y));
const makeShallowArrayEquals = (comparators) => (a) => (b) => {
  return Array.isArray(a) && Array.isArray(b)
    && a.length === b.length
    && a.every((val, index) => {
      const comparator = typeof comparators === 'function' ? comparators : comparators[index] || sameValueZero;
      return comparator(val, b[index]);
    });
};
const list = (...args) => args;

class Cache extends Map {
  // element-wise key comparison; defaults to SameValueZero, overridden per memoized function in memoizeN
  shallowArrayEquals = makeShallowArrayEquals(sameValueZero);

  delete(key) {
    const keys = Array.from(this.keys());
    const foundKey = keys.find(this.shallowArrayEquals(key));
    return super.delete(foundKey);
  }
  get(key) {
    const keys = Array.from(this.keys());
    const foundKey = keys.find(this.shallowArrayEquals(key));
    return super.get(foundKey);
  }
  has(key) {
    const keys = Array.from(this.keys());
    return keys.findIndex(this.shallowArrayEquals(key)) !== -1;
  }
}

const memoizeN = (fn, { resolver = list, comparators = sameValueZero } = {}) => {
  const { Cache: OriginalCache } = memoize;
  memoize.Cache = Cache;
  const memoized = memoize(fn, resolver || list);
  memoized.cache.shallowArrayEquals = makeShallowArrayEquals(comparators);
  memoize.Cache = OriginalCache;
  return memoized;
};

module.exports = memoizeN;

This version of memoizeN applies a specific comparator to each argument:

const obj1 = {a: 1};
const obj2 = {b: 2};
const obj3 = {c: 3};

const func = (arg1, arg2, arg3) => arg3;
const strictEquality = (x, y) => x === y;
const sameValueZero = (x, y) => x === y || (Number.isNaN(x) && Number.isNaN(y));
const deepEquality = require('lodash/isEqual');
const comparators = [strictEquality, sameValueZero, deepEquality];
const funcM = memoizeN(func, { comparators });

console.dir(funcM(obj1, obj2, obj3)); // => {c: 3}
console.dir(funcM(obj1, obj2, {c: 3})); // => {c: 3} (cache hit; deepEquality matches the third argument)
console.dir(funcM(obj1, {b: 2}, obj3)); // => {c: 3} (cache miss; sameValueZero fails on the second argument)
console.dir(funcM.cache.size); // => 2

If we define the comparators option as a single function, the memoized function uses it to compare every argument. When comparators is not provided, the SameValueZero comparison is used for all arguments.

const obj1 = {a: 1};
const obj2 = {b: 2};
const obj3 = {c: 3};

const func = (arg1, arg2, arg3) => arg3;
const sameValueZero = (x, y) => x === y || (Number.isNaN(x) && Number.isNaN(y));
const funcM = memoizeN(func, { comparators: sameValueZero }); // equivalent to memoizeN(func)

console.dir(funcM(obj1, obj2, obj3)); // => {c: 3}
console.dir(funcM(obj1, obj2, obj3)); // => {c: 3} (cache hit)
console.dir(funcM.cache.size); // => 1

I’ve extracted memoizeN into a GitHub Gist installable via npm. If you prefer to use memoizeN as an npm package, here is how to install it:

 $ npm install gist:2e6c77d38cf00faacceaf37e46b76a32 --save

This command will create the following entry inside your dependencies:

"dependencies": {
  "@char0n/memoizeN": "gist:2e6c77d38cf00faacceaf37e46b76a32"
}

Then use it via CommonJS or ESM:

const memoizeN = require('@char0n/memoizeN');
// or
import memoizeN from '@char0n/memoizeN';

If you need even more advanced functionality like an LRU cache, TTL, reference counting, and more, I suggest you look at memoizee. memoizee is probably the most feature-rich implementation of memoization for JavaScript.
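
As a rough illustration, here is a minimal sketch using two of memoizee’s options, maxAge (TTL) and max (LRU limit); the expensive function is made up, and you should consult the memoizee documentation for the full option list:

const memoizee = require('memoizee');

const expensive = (a, b) => a + b;

const memoized = memoizee(expensive, {
  maxAge: 60 * 1000, // evict entries one minute after they are cached
  max: 100           // keep at most 100 entries (LRU eviction)
});

memoized(1, 2); // computed
memoized(1, 2); // served from cache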

