
Word Shapes

Comparing Words by Their Letter-Adjacency Graphs.

Posted: Jun 16th, 2023

This video by John Turner has a fun idea: looking at which words are ‘shaped’ the same. One of the ways he defines a word’s shape is via its graph[1] of letter adjacencies. For example, “baboon” and “refers” have the same graph shape because the network of connections between adjacent letters has the same structure.

Edited screenshot of John Turner's video, showing how 'baboon' and 'refers' have the same graph.

Unfortunately, despite using a graphing library called Scott to compute canonical representations of each word’s graph, what Turner has calculated doesn’t seem to actually be (just) about graph isomorphism. It also takes into account the position of letters around the “letter wheel”. I found this unsatisfyingly restrictive.

Looking at just the networks of letter adjacency, “baboon” should be similar not just to words like “refers”, but also to words like “cats” and “wooly”.

The video made me curious what the results would look like when looking at just the graph of adjacencies between letters in a word. (Henceforth “the word’s graph” for short.)
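To make “the word’s graph” concrete, here is a minimal Python sketch of one way to build it and compare shapes. It is not the code behind this post or the video. The function names `word_graph` and `shape` are made up for illustration, and the canonical form is a brute-force one (try every relabelling of the letters), which is only practical for words with a handful of distinct letters.

```python
from itertools import permutations


def word_graph(word):
    """Return (letters, edges) for a word's letter-adjacency graph."""
    letters = sorted(set(word))
    edges = set()
    for a, b in zip(word, word[1:]):
        if a != b:  # simple graph: doubled letters add no self-loop
            edges.add(frozenset((a, b)))  # a set, so repeated pairs collapse
    return letters, edges


def shape(word):
    """Canonical form of the word's graph; equal for isomorphic graphs.

    Brute force: relabel the letters as 0..n-1 in every possible order and
    keep the smallest sorted edge list (n! relabellings, so small n only).
    """
    letters, edges = word_graph(word)
    best = None
    for perm in permutations(range(len(letters))):
        relabel = dict(zip(letters, perm))
        candidate = sorted(
            tuple(sorted((relabel[a], relabel[b]))) for a, b in map(tuple, edges)
        )
        if best is None or candidate < best:
            best = candidate
    return len(letters), tuple(best)


print(shape("baboon") == shape("refers"))  # True
print(shape("baboon") == shape("cats"))    # True
print(shape("baboon") == shape("aqua"))    # False
```

Because the comparison ignores which letters are involved, “baboon”, “refers”, “cats” and “wooly” all reduce to the same four-node path.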

This post partially answers that question using the enable1 dictionary of words from this GitHub repo.

Which Small Graphs are Missing?

4 or fewer nodes

If we’re looking at words with four or fewer distinct letters, there are only a few ‘shapes’ such a word graph could have. (10 connected shapes, to be precise.) And every one of them has at least one corresponding word.
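The set of possible shapes can be enumerated by brute force. The sketch below is an illustration rather than the method used for this post: it counts connected simple graphs on n nodes up to isomorphism by trying every subset of edges. The helper names (`is_connected`, `canonical`, `count_shapes`) are hypothetical. For five nodes it should report 21, the figure quoted in the next section.

```python
from itertools import combinations, permutations


def is_connected(n, edges):
    """Depth-first search from vertex 0 over an undirected edge list."""
    neighbours = {v: set() for v in range(n)}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, stack = {0}, [0]
    while stack:
        for w in neighbours[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n


def canonical(n, edges):
    """Smallest sorted edge list over all relabellings of the n vertices."""
    return min(
        tuple(sorted(tuple(sorted((p[a], p[b]))) for a, b in edges))
        for p in permutations(range(n))
    )


def count_shapes(n):
    """Number of connected simple graphs on n vertices, up to isomorphism."""
    possible_edges = list(combinations(range(n), 2))
    shapes = set()
    for k in range(len(possible_edges) + 1):
        for edges in combinations(possible_edges, k):
            if is_connected(n, edges):
                shapes.add(canonical(n, edges))
    return len(shapes)


for n in range(1, 6):
    print(n, count_shapes(n))
# prints, one pair per line: 1 1, 2 1, 3 2, 4 6, 5 21
```

This approach stops being practical quickly, since both the number of edge subsets and the number of relabellings blow up with n.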

In fact, all of them except for the K4 complete graph have multiple words. K4 only has one, and it’s a bit iffy. It’s “gensengs”, which is the plural of an alternate spelling of “ginseng”.

Should that count? Well, it’s in the dictionary I’m using, so 🤷

5 nodes

Of the 21 simple connected graphs with 5 nodes, all are represented except for one. The missing graph is K5, aka the pentatope graph.

No word has this graph. I also checked the larger words and none of them seem to contain this shape as a subgraph either. Spooky!
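A check like the one described above can be sketched as follows. This is an assumed approach, not necessarily the one used here: a word contains K5 as a subgraph exactly when some five of its letters are all pairwise adjacent somewhere in the word. The function names are hypothetical, and the test string “abcdeacebda” is a made-up non-word built to trace every edge of K5.

```python
from itertools import combinations


def adjacent_pairs(word):
    """Set of unordered pairs of distinct letters that appear side by side."""
    return {frozenset(p) for p in zip(word, word[1:]) if p[0] != p[1]}


def contains_clique(word, size=5):
    """True if some `size` letters of the word are all mutually adjacent."""
    pairs = adjacent_pairs(word)
    letters = sorted(set(word))
    for group in combinations(letters, size):
        if all(frozenset(pair) in pairs for pair in combinations(group, 2)):
            return True
    return False


print(contains_clique("gensengs", 4))   # True: the K4 word
print(contains_clique("milliosmols"))   # False: a wheel, two edges short of K5
print(contains_clique("abcdeacebda"))   # True: artificial string, not a word
```

Running `contains_clique` over every line of a word list (for example a local copy of the enable1 file, wherever it happens to be saved) would reproduce the search for K5.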

Here’s a table showing all the graphs with 5 or fewer nodes.

| Graph | Example Word | Visualization |
| --- | --- | --- |
| singleton graph | i | word graph for i |
| 2-path | to | word graph for to |
| 3-path | air | word graph for air |
| K3 (triangle) | aqua | word graph for aqua |
| paw graph | catch | word graph for catch |
| 4-path | fire | word graph for fire |
| diamond graph | miasma | word graph for miasma |
| square graph | anima | word graph for anima |
| K4 (tetrahedron) | gensengs | word graph for gensengs |
| banner graph | absorb | word graph for absorb |
| fork graph | elixir | word graph for elixir |
| (3,2)-tadpole graph | propel | word graph for propel |
| bull graph | alcohol | word graph for alcohol |
| kite graph | calculus | word graph for calculus |
| butterfly graph | tempest | word graph for tempest |
| (4,1)-lollipop graph | torturous | word graph for torturous |
| cricket graph | aether | word graph for aether |
| 5-path | earth | word graph for earth |
| dart graph | instant | word graph for instant |
| 5-star | kabbalah | word graph for kabbalah |
| gem graph | seascape | word graph for seascape |
| (2,3)-complete bipartite | loyalty | word graph for loyalty |
| house graph | automata | word graph for automata |
| (1,1,3)-complete tripartite | attractant | word graph for attractant |
| house X graph | lanolin | word graph for lanolin |
| 5-cycle (pentagon) | exhume | word graph for exhume |
| 3-dipyramidal | intensities | word graph for intensities |
| 5-graph 31 | nurturant | word graph for nurturant |
| 5-wheel | milliosmols | word graph for milliosmols |
| K5 (pentatope) | | |

Names are taken from this page on Biconnected Graphs from Wolfram MathWorld.

6 nodes

TODO

  1. “Graph” as in “graph theory”, the study of networks and connections. To get a word’s letter-adjacency graph: Each letter is a vertex. There is an edge connecting two letters if they show up next to each other in the word. The graphs are simple graphs, meaning we don’t connect a letter to itself (in words like “moon”), nor do we add extra edges when the same adjacency happens multiple times (in words like “donor”). 


Comments or questions about this page? Please send a message to RobertMartinWinslow at gmail dot com. Feedback is welcomed.