Graph Layout Algorithm

2024-09-05 9 minute read
Repository: https://git.sr.ht/~jaxter184/tui-nodes

Blender UI showing many interconnected nodes — Blender's node editor UI

Node editors are really neat! They're a very intuitive way to visualize the dependent relationships of the various parts of a process, from graphics rendering pipelines to audio patches. I've been doing a lot of keyboard-driven TUI stuff for the last few years, and while node editors are a very GUI-centric interface, I think with a few tweaks, a TUI node editor could be a very reasonable thing to do.

The main challenge that makes node editors so GUI-centric is that there is a lot of mouse interaction. While you can use a mouse in a TUI, keyboards are by far the preferred input method, and the spatial nature of a node editor makes it tough to navigate with just a keyboard. Some node editors solve this by tying navigation to the topology of the graph: up arrow key selects the parent, down arrow key selects the first child, left and right navigate siblings. If all the nodes are connected (they usually are, and even if they aren't, there are workarounds), this allows you to navigate to any other node with ease.

However, even if you solve navigation, a lot of the clicking in a node editor is just moving nodes around to tidy the connections, and having to use arrow keys for that would greatly slow down workflows, which basically nullifies 80% of the benefit of a keyboard-only input method.

Which brings me to my solution: node placement should be determined only by topology. No more clicking around reorganizing nodes, they'll be placed mostly in the right place automatically, no more manually routing connections, plus you won't need to store extra data in files to keep track of positions.

prior art

mermaid and dot have some way of placing nodes by topology that i didnt bother looking into

vocabulary

graph - a set of nodes with connections between them
directed graph - a graph where the connections are directional
directed acyclic graph - a directed graph where following a series of connections from node to node will never result in a loop
node - an object in a graph that connects to other nodes
connection - a relationship between two nodes, generally represented with a line connecting the two (footnote: choosing this over the classical graph theory term "edge" because it's a little confusing since we're talking about the geometry of how things get drawn)
line segment - when drawing a connection, these are the straight parts between bends
joining - the place where two connections with the same destination represented by separate lines come together into one line
splitting - the place where a connection between a source and two destinations splits apart into separate lines

background

why are we using a DAG?
how is it related to a tree?

goals

For simplicity, we will assume rectangular nodes and orthogonal connections that only travel horizontally and vertically, with right angled turns.

For clarity, we will assume all data flows in one direction. In some cases, this may be left to right, in our case, we will pick bottom to top.

requirements

Connections cannot pass behind nodes, as they would be much harder to follow.

Connections can only overlap when perpendicular. Put another way, only two connections can overlap at any point, and one must be travelling horizontally while the other travels vertically.

it must be clear whether segments are intersecting at a junction or passing over each other

nice-to-haves

Positioning should be predictable. If I know how two nodes are connected, and I know where one of the nodes is, I should be able to know where the other is as accurately as possible.

should be as compact as possible

minimize connection intersections, both when passing over each other and when splitting/joining at a junction

if it doesn't interfere with any of the above heuristics, the segments of a line between bends should try to be similar to each other in length.

Try to draw the least amount of lines possible. This primarily means that connected nodes should be placed close to each other. Also, if two connections join, they should do so at the earlist point possible, so that the majority of their travel is as a combined line.

willing to allow minor space inefficiencies for a simpler or faster algorithm

our use case

The actual data that the node is representing is arbitrary, but I've chosen my favorite kouign amann recipe.

Assuming we have a loose list of which components have which ingrendients (these are the directed connections):

dough requires salt
butter sheet requires salt
butter sheet requires sugar
dough requires water
syrup requires sugar
syrup requires spices
dough requires butter
kouign amann requires syrup
dough requires flour
laminated dough requires dough
butter sheet requires butter
laminated dough requires butter sheet
dough requires yeast
kouign amann requires laminated dough

We can sort that list and build a heirarchical representation of the ingredients is as follows (in order of amount):

* kouign amann
	* laminated dough
		* dough
			* flour
			* water
			* butter
			* yeast
			* salt
		* butter sheet
			* butter
			* sugar
			* salt
	* syrup
		* sugar
		* spices

We're basically 80% of the way there already! Though now comes the tough part, where we figure out how to display these with connections between them so that items (like butter and salt) aren't repeated.

You could imagine that if this were a graphics pipeline, it would be much more efficient to just calculate the butter and salt processes once, and reuse that output for the inputs to both the dough and the butter sheet processes.

the algorithm is as follows

Place nodes and their children

Before we even draw any connections, let's roughly arrange all of the nodes first, based on their relationships to each other, starting with the root:

node labelled 'kouign amann' with two inputs

Note the two inputs at the bottom of the node, which are for laminated dough and syrup. We'll follow those two inputs and place their corresponding nodes:

same as previous image, but with two more nodes under labelled 'laminated dough' and 'syrup'

So far so good. laminated dough was placed in a way that automatically connects it to the corresponding input on the kouign amann node, but syrup wasn't. Later when we draw connections, we'll bump the lower row down one unit so that we can make the connection, but for now, we'll leave them where they are for simplicity.

Horizontal nudging

Quick pause: it's not immediately obvious now, but if we try to connect syrup, butter sheet is going to make the connections a bit more roundabout than necessary, even if we do the vertical bumping mentioned earlier:

It's not too bad, but there are similar cases that we could run into in the future that are much worse:

Recall one of our goals: "Try to draw the least amount of lines possible".

To address these cases, let's adjust our algorithm to nudge parent nodes to their leftmost child. This will pull it further from its parent, but since there is one connection to the parent and two to its children, putting it closer to its children draws fewer lines:

We'll have to do the same horizontal nudging to butter sheet in the next step (which will move syrup again, poor fella).

However, it happens to be the case that butter and salt, the last two ingredients of dough, are also ingredients in butter sheet, so for more compactness, we can modify our nudging algorithm to only move parent nodes when all of their children are to the right.

Vertical nudging

Now all of the nodes have been placed, but there's one more adjustment that we should make before drawing connections. It seems that sugar is on the same row as butter sheet even though one is an ingredient of the other. A quick fix is to nudge it down one row. In this case, we don't have to move it anymore, but if it collided with something on its row, we'd nudge it (and its parents) to the right.

Also, once we bump sugar down one row, we can tuck in spices. This is our first time nudging something to the left, and the mechanics of that are a little weird, so I'll leave that as a bonus in the appendix (TODO).

Easy connections

Alright! How exciting! All the nodes are placed and we're ready to draw connections. Let's start with the easy ones: anywhere where we can just draw an unobstructed horizontal line between inputs and outputs, let's do that:

This is another place where the exact mechanics of the algorithm are up for grabs. Should the ingredients on the bottom row stay aligned? To do so would make connections longer, but would maybe be more readable. Since our original goals emphasize compactness, I'll leave it as-is for now, but maybe I'll return to it later.

We could also mode syrup and its children to the left by one unit, which would make one line one unit shorter, but tracking when that optimization could be made would greatly increase the complexity of the algorithm, which is no fun for anyone, so we can ignore that.

Harder connections

The only connections left are the inputs of butter sheet. There is no way to lay out the connections without at least one intersection, so now is when we need to start worrying about all the rules regarding connections and create a measurable heuristic for how good a connection is. To review (and simplify), our rules for connections are:

All ingredients flow from bottom to top
Connections cannot pass behind nodes
Unrelated connections can only overlap when perpendicular
Minimize connection intersections, both junctions and overlaps
Try to draw the least amount of lines possible.

First, let's take a closer look at what we're working with:

The inputs of butter sheet, in order, are butter, sugar, salt.

The way I went about doing this is creating a pathing algorithm (similar to A*) that tries to make its way to the other end of the connection.

TODO: the rest of the article

appendix

box drawing characters

what are they

what are the caveats (output pipe can't match both block wall and connection type/style)