package huffman

import "go.abhg.dev/algorithm/huffman"

Package huffman implements an n-ary Huffman coding algorithm.

It can be used to generate prefix-free labels for any number of items. Prefix-free labels are labels where for any two labels X and Y, there's a guarantee that X is not a prefix of Y. This is a useful property because it allows for unambiguous matching. When receiving incremental input (e.g. keystrokes), as soon as the input matches a prefix-free label X, you can stop and process the item corresponding to X.

Index

Examples

Functions

func Label

func Label(base int, freqs []int) (labels [][]int)

Label generates unique prefix-free labels for items given their frequencies.

base is the number of symbols in the alphabet. For example, for a binary alphabet, base should be 2. base=26 is a common choice for alphabetic labels. The mapping from base to alphabet is the caller's responsibility. base must be at least 2.

For len(freqs) items, freqs[i] specifies the frequency of item i. Items with higher frequencies will be assigned shorter labels.

The returned value is a list of labels for each item. labels[i] is the label for item i, specified as indexes into the alphabet. For example, given a binary alphabet {a b}, the label {0 1 0} means "aba".

Example
// alphabet is the set of symbols to use for the labels.
// We'll use a short alphabet for this example.
const alphabet = "asdf"
// We have some number of items, each with a frequency.
// Frequency can be thought of as the priority of the item.
// Higher priority items get shorter labels.
items := []struct {
	value string
	freq  int
}{
	{"parameter", 52},
	{"values", 52},
	{"variable", 55},
	{"argument", 56},
	{"slice", 59},
	{"types", 70},
	{"expression", 88},
	{"function", 158},
	{"value", 169},
	{"type", 530},
}

// Convert the items to a list of frequencies.
freqs := make([]int, len(items))
for i, item := range items {
	freqs[i] = item.freq
}

// Generate the labels.
// Highest frequency items like "value" and "type"
// will get the shortest labels.
labels := Label(len(alphabet), freqs)

// Print the labels.
for i, labelIndexes := range labels {
	item := items[i]

	// labels[i] refers to indexes in alphabet.
	// Join them to form the full label.
	label := make([]byte, len(labelIndexes))
	for j, idx := range labelIndexes {
		label[j] = alphabet[idx]
	}

	fmt.Printf("%s: %s\n", item.value, label)
}

Output:

parameter: sa
values: ss
variable: sd
argument: sf
slice: da
types: ds
expression: dd
function: df
value: a
type: f