diff --git a/docs/DESIGN.adoc b/docs/DESIGN.adoc
index 0ef2861..8efc15b 100644
--- a/docs/DESIGN.adoc
+++ b/docs/DESIGN.adoc
@@ -539,19 +539,20 @@ image::./annotated_wordlist_example.png[Screenshot of docs/annotated_words.ods a
 
 ==== Wordlist generation
 
-The final wordlist can be generated by using `./docs/wordlist-new.ipynb`, which brings together the full list of words, nltk lemmatized words, and manual annotated words into one list.
+The final wordlist can be generated by running the scripts in `./wordlist/`, which bring together the full list of words, nltk-lemmatized words, and manually annotated words into one list.
+See link:WORDLIST.html[WORDLIST] for more information.
 
 The output is of the format:
 
 [source]
 ----
-word,number
-the,1
-of,2
-him,3
-hymn,3
-apple,4
-apples,4
+WORD,NUMBER
+THE,1
+OF,2
+HIM,3
+HYMN,3
+APPLE,4
+APPLES,4
 ----
 
 === Implementation [[implementation]]
diff --git a/docs/WORDLIST.adoc b/docs/WORDLIST.adoc
new file mode 100644
index 0000000..7d51c0f
--- /dev/null
+++ b/docs/WORDLIST.adoc
@@ -0,0 +1,119 @@
+// echo WORDLIST.adoc | entr sh -c "podman run --rm -it --network none -v "${PWD}:/documents/" asciidoctor/docker-asciidoctor asciidoctor -r asciidoctor-mathematical -a mathematical-format=svg WORDLIST.adoc; printf 'Done ($(date -Isecond))\n'"
+
+:toc:
+:nofooter:
+:!webfonts:
+:source-highlighter: rouge
+:rouge-style: molokai
+:sectlinks:
+
+= Wordlist
+
+The wordlist for this_algorithm begins with the wordlist from link:https://github.com/ps-kostikov/english-word-frequency/[ps-kostikov/english-word-frequency^] (link:https://github.com/ps-kostikov/english-word-frequency/blob/master/data/frequency_list.txt[data/frequency_list.txt^]).
+
+But this list alone is not sufficient.
+It contains words that are profane, negative, or otherwise unfit for this algorithm.
+Because the wordlist required for this_algorithm is relatively small (8194 words), we can reduce this 53,000-word list substantially.
+
+Processing steps (source code is available in the `wordlist/` directory):
+
+* `00-frequency-list` - A base list of as many candidate words as possible (not necessarily including the words from step 02), sorted by how desirable they are to include, which in this case means frequency
++
+[source]
+----
+WORD,FREQUENCY
+THE,18399669358
+OF,12042045526
+BE,9032373066
+AND,8588851162
+----
+* `01-lemmatized-words` - List of words that lemmatize to another word and therefore represent the same underlying value, in any order
++
+[source]
+----
+WORD,LEMMATIZED_WORD,LEMMATIZER
+ARE,BE,SPACY
+ITS,IT,SPACY
+NEED,NEE,SPACY
+THOUGHT,THINK,SPACY
+SOMETIMES,SOMETIME,SPACY
+----
+* `02-custom-lemmatizations` - List of custom lemmatizations, used for any words or homonyms that the automatic lemmatization failed to capture
++
+[source]
+----
+WORD1,WORD2
+ADD,ADDS
+ADS,ADDS
+AFFECTED,EFFECT
+AFFECT,EFFECT
+AFFECTIONS,AFFECTION
+----
+* `03-exclude` - Words to exclude. If any word in a lemmatization group is present, the entire group is excluded from the result.
+Words can be excluded for any reason.
++
+[source]
+----
+WORD
+A
+AARON
+ABA
+ABANDON
+ABANDONING
+----
+* `04-deduplicated-words` - The final list of words and their associated numeric values (a sketch of this numbering step follows the list)
++
+[source]
+----
+WORD,NUMBER
+THE,1
+OF,2
+BEE,3
+ARE,3
+BE,3
+----
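+
+Below is a minimal, illustrative sketch of how the numbering in `04-deduplicated-words` could work: words in the same lemmatization group share one number, groups containing an excluded word are dropped, and numbers follow the frequency order from step 00.
+The inline data and names below are examples only, not the actual scripts in `wordlist/`.
+
+[source,python]
+----
+# A minimal sketch of the numbering step, assuming the intermediate files
+# follow the column layouts shown above. The inline data and helper names
+# are illustrative only, not the actual contents of the scripts in wordlist/.
+
+# Toy rows mirroring the excerpts above.
+frequency_list = ["THE", "OF", "BE", "ARE", "BEE"]  # 00: sorted by frequency
+lemmatizations = [("ARE", "BE"), ("BEE", "BE")]     # 01 + 02: equivalent word pairs
+excluded = {"A", "AARON"}                           # 03: words whose groups are dropped
+
+# Union-find, so that chains of equivalences collapse into a single group.
+parent = {}
+
+def find(word):
+    parent.setdefault(word, word)
+    while parent[word] != word:
+        parent[word] = parent[parent[word]]  # path compression
+        word = parent[word]
+    return word
+
+def union(a, b):
+    parent[find(a)] = find(b)
+
+for a, b in lemmatizations:
+    union(a, b)
+
+# If any member of a group is excluded, the whole group is dropped (step 03).
+excluded_roots = {find(word) for word in excluded}
+
+# Walk the frequency-sorted list and assign one number per surviving group.
+numbers = {}
+for word in frequency_list:
+    root = find(word)
+    if root in excluded_roots:
+        continue
+    numbers.setdefault(root, len(numbers) + 1)
+    print(f"{word},{numbers[root]}")
+
+# Prints: THE,1  OF,2  BE,3  ARE,3  BEE,3
+----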