How to sort nodes in D3 so that connecting paths will be clear?


I'm working on a diagram to show the relationship between two groups.
To illustrate the problem, let's say those two groups are Cities and Superheroes, and for each City we show which Superheroes visited it, and vice versa.

Here is a codepen of my current code.

Hover over each node to see its name and to highlight the nodes it is connected to.

As you can see I have a row at the top with nodes for each City, a row at the bottom with nodes for each Superhero, and paths between the two rows to show the relationships:

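[image: the current diagram, with a row of City nodes at the top, a row of Superhero nodes at the bottom, and paths between them]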

As you can see, in most cases the paths are very long and cross each other a lot. This makes the diagram very visually cluttered.

I'm sure that if I sort the cities and/or superheroes in a smarter way, the diagram will be much clearer.
(There is no significance to the current order of the cities or of the superheroes; to illustrate this, I shuffle the two arrays before the diagram loads.)

cities = _.shuffle(cities);
superheros = _.shuffle(superheros);

My question:
How can I sort the cities and/or heroes so that the links between the two are much clearer (shorter paths, fewer crossings, etc.)?
Is there a well-known algorithm for this? A helper function in D3 maybe? Or maybe I'm using the wrong type of diagram altogether.

devdevdev asked Feb 01 '26 23:02


1 Answer

Alternative visualization types are possible, which will likely display this information more clearly. A grid comes to mind, but it is probably less visually exciting than the sort of visualization you have currently:

Les Misérables [character] Co-occurrence (Mike Bostock):

[image: Les Misérables character co-occurrence grid]

This type of visualization could easily be co-opted to show character and location overlap. The mirroring of the grid won't be present with location/character axes. But, if nodes and links are desirable we can certainly work with that.


Detangling the Visualization

Traditionally I would have obfuscated this type of diagram by tangling it further. This would involve overly complicating those spaghetti style matching questions in school quizzes in the hope of convincing the teacher that it was easier to give me the benefit of the doubt than trace the paths. That it is so easy to make these networks visually untraceable suggests that if your visualization gets any more detailed, this might not be an ideal style. But, to make up for my previous sins in making unreadable networks let's detangle what we have.

There surely is a nice algorithm that calculates the least length of paths needed to draw the diagram, separates isolated networks from each other and aligns sources and targets as close as possible on the x axis. Though this might place all the highly connected nodes close together, resulting in the center of the visualization looking like a plate of spaghetti.
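
One commonly used heuristic for two-row (bipartite) layouts like this is the barycenter method from layered graph drawing: keep one row fixed and sort the other row by the average position of each node's neighbours. It is not the approach taken in the rest of this answer, and it does not guarantee a minimum number of crossings, but it is cheap to try. A minimal sketch, assuming the links are plain `{city, hero}` name pairs (a hypothetical shape; the function and variable names are also made up for illustration):

```javascript
// Barycenter heuristic: the city row stays fixed, heroes are reordered.
// `links` is assumed to look like [{ city: "A", hero: "X" }, ...].
function sortHeroesByBarycenter(cities, heroes, links) {
  const cityIndex = new Map(cities.map((name, i) => [name, i]));
  const score = new Map(heroes.map((hero, i) => {
    // positions of all cities this hero is linked to
    const xs = links.filter(l => l.hero === hero).map(l => cityIndex.get(l.city));
    // unconnected heroes keep their current index as their score
    const mean = xs.length ? xs.reduce((a, b) => a + b, 0) / xs.length : i;
    return [hero, mean];
  }));
  return heroes.slice().sort((a, b) => score.get(a) - score.get(b));
}

// Small made-up example
const exampleCities = ["A", "B", "C"];
const exampleHeroes = ["X", "Y", "Z"];
const exampleLinks = [
  { city: "A", hero: "Z" },
  { city: "B", hero: "Y" },
  { city: "C", hero: "Y" },
  { city: "C", hero: "X" }
];
const sortedHeroes = sortHeroesByBarycenter(exampleCities, exampleHeroes, exampleLinks);
console.log(sortedHeroes); // [ 'Z', 'Y', 'X' ]
```

Repeating the pass a few times while alternating which row is held fixed may improve the result further.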

But, as I'm lazy, I'd rather have the chart do some sort of self organization. Luckily, in d3 we have a force layout.

Let's keep your general layout to start: a row of heroes and a row of cities. Instead of randomizing the position of each node and drawing a connecting line, we can bring a number of forces and parameters into play:

  • Fix the y coordinate so that the rows are preserved
  • Highly value link distance so that linked nodes are pulled closely together
  • Start with a mild attraction between nodes to allow mixing and then slowly start to force them apart
  • As the layout develops, increase the collide radius so that nodes are spaced equally-ish

If we do this correctly, we'll find a nice visualization that will separate isolated networks (essential for this type of visualization) and for the most part keep links relatively short.

Here's an example of a random output using this approach (cut in half):

[image: one random output of the force layout approach, cut in half]

The relationships are clearer than if nodes are randomly sorted (excepting fluke-ish lucky outcomes). A few links appear to stretch pretty far, but they are relatively limited in number. Isolated networks are visually separated too, which is important from a topological perspective.

For comparison, this is a random load of the original layout (I did change the node style and row order for some reason, but the glass of beer in my hand says to not correct it):

[image: one random load of the original layout]

This is a particularly unlucky random draw: Black Widow is connected to the first and last locations on the x axis.

The average link in the original layout is much more horizontal than the average link in the force layout. More vertical-ish links increase clarity and decrease clutter by reducing the number of intersections while also shortening the links.
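
To check the effect on intersections for a given ordering, crossings can be counted directly: with straight links between two rows, two links cross exactly when their endpoints appear in opposite order on the two rows. A rough sketch, assuming links are `{city, hero}` name pairs and each row is an array of names in left-to-right order (all names here are hypothetical); it compares every pair of links, so it is quadratic in the number of links:

```javascript
// Count pairwise crossings between straight links joining two rows.
function countCrossings(topOrder, bottomOrder, links) {
  const top = new Map(topOrder.map((name, i) => [name, i]));
  const bottom = new Map(bottomOrder.map((name, i) => [name, i]));
  let crossings = 0;
  for (let i = 0; i < links.length; i++) {
    for (let j = i + 1; j < links.length; j++) {
      const dTop = top.get(links[i].city) - top.get(links[j].city);
      const dBottom = bottom.get(links[i].hero) - bottom.get(links[j].hero);
      // opposite signs: the two links swap order between rows, so they cross;
      // links sharing an endpoint give a product of 0 and are not counted
      if (dTop * dBottom < 0) crossings++;
    }
  }
  return crossings;
}

const crossingLinks = [
  { city: "A", hero: "Z" },
  { city: "B", hero: "Y" },
  { city: "C", hero: "X" }
];
const tangled = countCrossings(["A", "B", "C"], ["X", "Y", "Z"], crossingLinks);
const untangled = countCrossings(["A", "B", "C"], ["Z", "Y", "X"], crossingLinks);
console.log(tangled, untangled); // 3 0
```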

We can try to quantify the improvement a force layout brings. For example, the average link length (measured as the distance between source and target along the x axis) in the random sorting (sample size = 10) was 521 px, while the force layout sorting had an average link length of 208 px (sample size also 10). Here are the histograms of each layout's link lengths:

[images: histograms of link lengths for the random sorting and for the force layout sorting]

My sample size is small, but the pattern should be clear.
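
The link length measure used above (distance along the x axis only) can be computed with a few lines. This is a sketch, assuming each link's `source` and `target` have already been resolved to node objects carrying an `x` position, as `d3.forceLink` does once the simulation is initialized; the function name and sample data are made up:

```javascript
// Mean horizontal distance between the two ends of each link.
function meanHorizontalLinkLength(links) {
  if (links.length === 0) return 0;
  const total = links.reduce(
    (sum, link) => sum + Math.abs(link.source.x - link.target.x),
    0
  );
  return total / links.length;
}

const sampleLinks = [
  { source: { x: 0 }, target: { x: 100 } },
  { source: { x: 50 }, target: { x: 20 } }
];
const sampleMean = meanHorizontalLinkLength(sampleLinks);
console.log(sampleMean); // 65
```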

Ok, I've shown the results and stated the general requirements, but how about a demonstration? Without going into how a force diagram works or how to code it, here's a quick demo bl.ock, and below is a quick explanation of the key parameters I've used:

  1. Set the simulation's start properties:

These will be modified once the simulation starts, but these make good starting points:

var simulation = d3.forceSimulation()
    // set optimal distance to be 1 pixel between source and target:
    .force("link", d3.forceLink().id(function(d) { return d.name; }).distance(1))
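    // set the distance at which forces apply to be limited to some distance,
    // make nodes attracted to each other to start (positive strength attracts).
    // note: by operator precedence the argument below evaluates as
    // cities.length + (1 / width); add parentheses if another distance is intended
    .force("charge", d3.forceManyBody().distanceMax(cities.length+1/width).strength(10))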
    // try to keep nodes centered:
    .force("center", d3.forceCenter(width / 2, height / 2))
    // slow decay time, increases time to simulation end
    .alphaDecay(0.01);
  2. Modify the forces as the simulation proceeds:

We want to change a few forces as the simulation winds down, for example turning the initial attraction between nodes into a repulsion.

function ticked() {
    var force = this; // inside a "tick" listener, `this` is the simulation
    var alpha = force.alpha(); // current alpha
    var padding = 20; // minimum distance from sides of visualization
    // ideal separation between nodes on the x axis
    var targetSeparation = (width - padding * 2) / (cities.length * 2);

    // if we are late in the simulation, grow the collide radius towards the ideal separation
    // and change the charge between nodes to repulsion (a negative strength repels)
    if (alpha < 0.5) {
        force.force("collide", d3.forceCollide((0.5 - alpha) * 2 * targetSeparation))
             .force("charge", d3.forceManyBody().distanceMax(targetSeparation).strength(-(0.5 - alpha) * 50));
    }

    // (the code that updates node and link positions is not shown here)
}
  3. Don't forget to fix the y coordinate of the force nodes with d.fy
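
Pinning the rows could look something like the sketch below. It relies on the d3-force convention that a node with `fy` set keeps that y position on every tick; the `type` field, the `pinRows` helper and the pixel values are assumptions for illustration, not taken from the demo:

```javascript
// Fix each node's y coordinate according to the row it belongs to.
// `rowY` maps a node type to the y position of its row.
function pinRows(nodes, rowY) {
  nodes.forEach(function (d) {
    d.fy = rowY[d.type]; // d3-force leaves y untouched while fy is set
  });
  return nodes;
}

const pinnedNodes = pinRows(
  [
    { name: "Gotham", type: "city" },
    { name: "Black Widow", type: "hero" }
  ],
  { city: 50, hero: 350 } // hypothetical row positions in pixels
);
console.log(pinnedNodes.map(d => d.fy)); // [ 50, 350 ]
```

The x coordinate is left free, so the forces can still slide nodes along their row.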

Further Improvements

So now we are pretty detangled, but we can do better. Why restrict ourselves to 2 rows? 3 would probably be ideal:

[image: the force layout with three rows]

This might be ideal if, say, one had two types of characters to display. Luckily, for every hero there is a villain. Separation is best along a natural division, e.g. women / men, former scientists & heroes with advanced degrees / everyone else, from earth / not from earth, raccoon / not raccoon ...

But even if separation along some theme isn't possible, it will still detangle the visualization further. Based on this, the only change between my first example and this one (shown above) is that I've separated the heroes out (and added a bit of height to the visualization).


If the cool down time is unbearable (you could do some sort of scaling to separate nodes when sorting), then you could run the simulation without drawing it, loading only the final version (or even store the layout for future use if the nodes and links don't change on each load). But this is a different sort of issue for a different question.
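
As a rough sketch of running the simulation without drawing it: d3-force simulations expose `stop()` and `tick()`, so the layout can be advanced manually and rendered once at the end. The helper name `settleLayout` and its `onTick` parameter are hypothetical, and the iteration count assumes alpha starts at its default of 1. Because `simulation.tick()` does not dispatch "tick" events, a listener that adjusts forces (like `ticked` above) has to be called by hand:

```javascript
// Advance a d3-force simulation to completion without rendering.
// `onTick` is an optional function invoked after each step with
// `this` set to the simulation, mimicking a "tick" listener.
function settleLayout(simulation, onTick) {
  simulation.stop(); // halt the internal timer so no frames are drawn
  // number of steps needed for alpha to decay from 1 to alphaMin
  const iterations = Math.ceil(
    Math.log(simulation.alphaMin()) / Math.log(1 - simulation.alphaDecay())
  );
  for (let i = 0; i < iterations; i++) {
    simulation.tick();
    if (onTick) onTick.call(simulation);
  }
  return simulation.nodes(); // nodes now carry their final x / y
}
```

A call might look like `settleLayout(simulation, ticked)`, followed by a single draw of the nodes and links. The `onTick` function should then only adjust forces, not touch the DOM.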

Andrew Reid answered Feb 03 '26 12:02