Skip to content

kamilmielnik/trie

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 

History

272 Commits
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 

Repository files navigation

@kamilmielnik/trie

Version License Bun Node version Test Coverage Format

Trie data structure implemented in TypeScript:

Table of contents

Installation

# Bun
bun add @kamilmielnik/trie

# npm
npm install @kamilmielnik/trie

# Yarn
yarn add @kamilmielnik/trie

See full API Docs - generated by typedoc.

Good to know:

  • all objects are mutable
  • every class, interface, type, constant and function is exported
  • all exports are named (there is no default export)

There are 2 ways to use the API.

Create Trie instance and use its methods.

Example

import { Trie } from '@kamilmielnik/trie';

const trie = new Trie();
trie.add('master');
trie.add('mask');
trie.hasPrefix('man'); // false
trie.hasPrefix('mas'); // true
trie.has('mas');       // false
trie.remove('mas');    // false
trie.has('master');    // true
trie.serialize();      // "(m(a(s(t(e(r))k))))"
trie.remove('master'); // true
trie.serialize();      // "(m(a(s(k))))"

Manipulate existing instances of Node with functions.

Example

The following example works identically to the object-oriented example above.

import { add, has, hasPrefix, Node, remove, serialize } from '@kamilmielnik/trie';

const root: Node = {};

add(root, 'master');
add(root, 'mask');
hasPrefix(root, 'man'); // false
hasPrefix(root, 'mas'); // true
has(root, 'mas');       // false
remove(root, 'mas');    // false
has(root, 'master');    // true
serialize(root);        // "(m(a(s(t(e(r))k))))"
remove(root, 'master'); // true
serialize(root);        // "(m(a(s(k))))"

Examples

Load dictionary from file

import { fromArray, Node } from '@kamilmielnik/trie';
import fs from 'fs';

const fromFile = (filepath: string): Node => {
  const file = fs.readFileSync(filepath, 'utf-8');
  // Assuming file contains 1 word per line
  const words = file.split('\n').filter(Boolean);
  const node = fromArray(words);
  return node;
};

Serialize Node to a file

import { Trie } from '@kamilmielnik/trie';
import fs from 'fs';

const toFile = (filepath: string, trie: Trie): void => {
  const serialized = trie.serialize();
  fs.writeFileSync(filepath, serialized);
};

Load serialized Node from a file

import { deserialize, Node } from '@kamilmielnik/trie';
import fs from 'fs';

const fromFile = (filepath: string): Node => {
  const serialized = fs.readFileSync(filepath, 'utf-8');
  const node = deserialize(serialized);
  return node;
};

Find all words with given prefix

import { find, Node, toArray } from '@kamilmielnik/trie';

const findWordsWithPrefix = (node: Node, prefix: string): string[] => {
  const prefixNode = find(node, prefix) || {};
  const descendants = toArray(prefixNode, { prefix, sort: true, wordsOnly: true });
  const words = descendants.map(({ prefix: word }) => word);
  return words;
};

Serialization and compression

This package can be used to efficiently serialize and compress dictionaries.

It reaches 66.40 compression ratio (98.49% space saving) for SJP.PL (pl-PL) when combined with 7-Zip at ultra compression level.

Language ๐Ÿ‡บ๐Ÿ‡ธ en-US ๐Ÿ‡ฌ๐Ÿ‡ง en-GB ๐Ÿ‡ต๐Ÿ‡ฑ pl-PL
Name TWL06 SOWPODS SJP.PL
Source Download Download Download
Words count 178,690 267,751 3,229,855
File size [B] 1,763,166 2,707,020 42,172,320
File size (serialized) [B] (-42.38%) 1,016,004 (-43.94%) 1,517,454 (-69.40%) 12,903,507
File size (7z) [B] (-78.54%) 378,418 (-79.23%) 562,247 (-86.84%) 5,550,546
File size (serialized + 7z) [B] (-89.25%) 189,484 (-89.78%) 276,737 (-98.49%) 635,085

Performance

Benchmarks are produced by bench/index.ts using tinybench. Run bun run bench to regenerate the charts below.

add, find, has, hasPrefix, remove are very fast - $O(\log_2 n)$ (millions of operations per second).

Fast operations chart


deserialize, fromArray, serialize, toArray are slow - $O(n)$ (few operations per second).

Slow operations chart