I have a data set in which each sample has a size (0-1000) and a value (grade 1-5). I want to visualise the data as circles of different sizes along a line (the domain axis), much like:
(Note that even circles with the same effective tax rate do not overlap.)
- sample 1: size 300 value 3.2
- sample 2: size 45 value 3.8
- sample 3: size 4400 value 4.0
- sample 5: size 233 value 0.2
- sample 6: size 4000 value 4.2
How can the data above be visualised using circles on a line (size decides diameter, value decides approximate position on the line) so that circles do not overlap?
I've been looking at D3's packing layout, but from what I can tell it doesn't support this out of the box. Anyone got any ideas on how to approach this?
Oooh, this one was a puzzle...
If you look at the code for the NYTimes graphic, it uses pre-computed coordinates in the data file, so that's not much use.
However, there's an unused variable declaration at the top of the script that hints that the original version used d3.geom.quadtree to lay out the circles. The quadtree isn't actually a layout method; it is used to create a search tree of adjacent nodes, so that when you need to find a node in a given area you don't have to search through the whole set. Example here.
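To make the "search tree" idea concrete, here is a minimal sketch of a point quadtree in plain JavaScript. It is not d3.geom.quadtree's implementation or API; the class name, CAPACITY, MAX_DEPTH and the grid of test points are all made up for illustration. The point is only that a region query skips whole quadrants that cannot intersect the search box.

```javascript
// Illustrative point quadtree; NOT d3.geom.quadtree's implementation or API.
class QuadTree {
  constructor(x0, y0, x1, y1, depth = 0) {
    this.x0 = x0; this.y0 = y0; this.x1 = x1; this.y1 = y1;
    this.depth = depth;
    this.points = [];     // points held by this node while it is a leaf
    this.children = null; // four sub-quadrants once the node has split
  }

  insert(p) {
    if (p.x < this.x0 || p.x > this.x1 || p.y < this.y0 || p.y > this.y1) return false;
    if (this.children === null) {
      // The depth cap (my own guard) stops endless splitting when many points coincide.
      if (this.points.length < QuadTree.CAPACITY || this.depth >= QuadTree.MAX_DEPTH) {
        this.points.push(p);
        return true;
      }
      this.split();
    }
    // Hand the point to the first quadrant whose bounds contain it.
    return this.children.some(child => child.insert(p));
  }

  split() {
    const mx = (this.x0 + this.x1) / 2;
    const my = (this.y0 + this.y1) / 2;
    const d = this.depth + 1;
    this.children = [
      new QuadTree(this.x0, this.y0, mx, my, d),
      new QuadTree(mx, this.y0, this.x1, my, d),
      new QuadTree(this.x0, my, mx, this.y1, d),
      new QuadTree(mx, my, this.x1, this.y1, d),
    ];
    // Push the points this leaf was holding down into the new quadrants.
    const old = this.points;
    this.points = [];
    old.forEach(p => this.insert(p));
  }

  // Collect every point inside the query box, pruning quadrants that miss it.
  search(qx0, qy0, qx1, qy1, found = []) {
    if (qx1 < this.x0 || qx0 > this.x1 || qy1 < this.y0 || qy0 > this.y1) return found;
    for (const p of this.points) {
      if (p.x >= qx0 && p.x <= qx1 && p.y >= qy0 && p.y <= qy1) found.push(p);
    }
    if (this.children) this.children.forEach(c => c.search(qx0, qy0, qx1, qy1, found));
    return found;
  }
}
QuadTree.CAPACITY = 4;   // hypothetical leaf size
QuadTree.MAX_DEPTH = 16; // hypothetical depth cap

// Made-up data: a 10 x 10 grid of points.
const tree = new QuadTree(0, 0, 9, 9);
for (let x = 0; x < 10; x++) for (let y = 0; y < 10; y++) tree.insert({ x, y });
const hits = tree.search(2, 2, 4, 4);
console.log(hits.length); // 9
```

For circles rather than points you would widen the query box by the largest radius you expect, since a circle whose centre lies outside the box can still reach into it.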
The quadtree can therefore be used to identify which of your datapoints might be overlapping each other on the x-axis. Then you have to figure out how much you need to offset them in order to avoid that overlap. The variable radii complicate both functions...
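The offset calculation itself is just Pythagoras: two circles touch when their centres are exactly r1 + r2 apart, so for a given horizontal separation you can solve for the vertical separation. A minimal sketch, assuming circles are plain objects with x and r properties (the function name is mine, not from the fiddle):

```javascript
// Smallest vertical distance between the centres of circles a and b that keeps
// them from overlapping, given their x positions and radii.
// Returns 0 when they are already far enough apart horizontally.
function minVerticalOffset(a, b) {
  const dx = a.x - b.x;
  const reach = a.r + b.r; // centre-to-centre distance at which the circles touch
  if (Math.abs(dx) >= reach) return 0;
  return Math.sqrt(reach * reach - dx * dx);
}

// Radii 3 and 2, centres 3 apart in x: a 3-4-5 triangle.
console.log(minVerticalOffset({ x: 0, r: 3 }, { x: 3, r: 2 })); // 4
// Centres 5 apart in x but the radii only add up to 2: no offset needed.
console.log(minVerticalOffset({ x: 0, r: 1 }, { x: 5, r: 1 })); // 0
```

Because the reach depends on both radii, the search window for possible neighbours also has to grow with the radius of the circle being placed, which is where the variable radii get awkward.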
I've got a test case implemented here: http://fiddle.jshell.net/6cW9u/5/
The packing algorithm isn't perfect: I always add new circles to the outside of existing circles, without testing whether they could fit closer in, so sometimes you get significant extra whitespace when it is just the far edges of circles bumping into each other. (Run it a few times to get an idea of the possibilities; note that the x-values are drawn from a random normal distribution and the r-values from a random uniform distribution.) I also got a stack overflow in the recursive methods during one run with N=100; the random points clearly weren't spread out well enough for the quadtree optimization.
But it's got the basic functionality. Leave a comment here if you can't follow the logic of my code comments.
New fiddle here: http://fiddle.jshell.net/6cW9u/8/
After a lot of re-arranging, I got the packing algorithm to search for gaps between existing bubbles. I've got the sort order switched (so that biggest circles get added first) to show off how little circles can get added in the gaps -- although as I mention in the code comments, this reduces the efficiency of the quadtree search.
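For anyone who wants the gist without reading the fiddle, here is a stripped-down sketch of the gap-searching idea. It is not the fiddle's code: it uses a plain linear scan where the fiddle uses the quadtree, and the function names and the {x, r} object shape are my own assumptions. For each new circle it tries the axis itself plus every position that just touches a neighbour, and takes the one closest to the axis that collides with nothing.

```javascript
// Vertical centre distance at which circles a and b just touch (0 if far apart in x).
function minVerticalOffset(a, b) {
  const dx = a.x - b.x;
  const reach = a.r + b.r;
  return Math.abs(dx) >= reach ? 0 : Math.sqrt(reach * reach - dx * dx);
}

// True if the circles overlap; the small tolerance lets touching circles pass.
function overlaps(a, b) {
  const dx = a.x - b.x, dy = a.y - b.y, reach = a.r + b.r;
  return dx * dx + dy * dy < reach * reach - 1e-6;
}

// Hypothetical layout function: returns copies of the {x, r} circles with a y added.
function packAlongLine(circles) {
  const placed = [];
  // Biggest first, so that small circles can later drop into the gaps.
  const order = [...circles].sort((a, b) => b.r - a.r);
  for (const c of order) {
    // Linear scan for circles close enough in x to collide; a quadtree would narrow this.
    const near = placed.filter(p => Math.abs(p.x - c.x) < p.r + c.r);
    // Candidates: on the axis, or just touching a neighbour from above or below.
    const candidates = [0];
    for (const p of near) {
      const off = minVerticalOffset(c, p);
      candidates.push(p.y + off, p.y - off);
    }
    // Closest to the axis first; on a tie prefer the positive side.
    candidates.sort((a, b) => Math.abs(a) - Math.abs(b) || b - a);
    // The outermost candidate always fits, so find() cannot come back empty.
    const y = candidates.find(cy =>
      near.every(p => !overlaps({ x: c.x, y: cy, r: c.r }, p)));
    placed.push({ ...c, y });
  }
  return placed;
}

// Three circles competing for the same x position.
const packed = packAlongLine([{ x: 0, r: 1 }, { x: 0, r: 3 }, { x: 0, r: 2 }]);
console.log(packed.map(c => c.y)); // [ 0, 5, -4 ]
```

In the example the smallest circle ends up at -4, tucked under the big one, rather than stacked on the far side of the middle one at 8, which is the whole point of testing the inner positions first.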
I also added various decorations and transitions so you can clearly see how the circles are being positioned, and set the r-scale to be square root, so the area (not radius) is proportional to the value in the data (which is more realistic, and what the O.P. asked for).
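In case the square-root scale isn't obvious: area grows with the square of the radius, so to make area proportional to a data value the radius has to be proportional to its square root. A minimal plain-JavaScript sketch (the helper name and the maximum radius of 40 are made up; in D3 v3 the built-in d3.scale.sqrt() does this job):

```javascript
// Hypothetical helper: maps a data value to a radius so that the circle's AREA,
// not its radius, is proportional to the value.
function makeRadiusScale(maxValue, maxRadius) {
  return function (value) {
    return maxRadius * Math.sqrt(value / maxValue);
  };
}

// 4400 is the largest size in the question's sample data; 40 is an arbitrary pixel radius.
const radius = makeRadiusScale(4400, 40);
const rFull = radius(4400);    // 40
const rQuarter = radius(1100); // 20: a quarter of the value gives half the radius
console.log(rFull, rQuarter);
```

With a linear scale the second circle would get a radius of 10 and so only a sixteenth of the area, which makes small values look far smaller than they are.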