Data clustering for a multiuser website heatmap


Or how I went from 480k to 4.8k possible datapoints with two lines of code

I haven’t written here for ages. I’m really busy finishing the last month of my compulsory community service (mandatory in Austria) and working on awesome stuff. But I’ll try to write down my thoughts more often (hell, I have loads of article drafts open but not enough time to finish them :/ ).
Anyway. While I was working on a new multiuser heatmap.js demonstration, I figured out that tracking every mouse position (defined by x and y) on an 800×600 tracking area results in lots of possible datapoints. To be exact: 800×600 = 480k. The fact that heatmap.js might have to load 480k different datapoints was freaking me out. With 480k possible datapoints, the chance that a new mouse position lands on one particular datapoint is roughly 1/480k (assuming positions are spread evenly), so repeated hits on the same datapoint are rare. That makes it pretty hard to generate good heatmap data, since a heatmap depends on some areas being hit more often than others (red vs. blue). This got me thinking and I came up with a pretty simple solution: two lines of code.

x = (x/10 >> 0)*10;
y = (y/10 >> 0)*10;

So whenever I get a new mouse position (x/y), I divide each coordinate by 10, let the bitshift by 0 kill the decimal places (it truncates, it’s not rounding!) and multiply by 10 again. Here is an example: position (152 / 144), so x=152 and y=144. x/10 = 15.2. 15.2 without decimal places equals 15, and multiplying by 10 results in 150. The same procedure for 144 results in 140.
This gives me coordinate steps of 10 pixels. I reduced my tracking area from 800×600 possible positions to 80×60 possible positions, but it’s still 800×600 pixels big. There is one little drawback: since the coordinates come in steps of 10 pixels, the tracked mouse position can deviate from the real one by up to 9 pixels on each axis. But I think that’s not a big deal when it buys a reduction of possible datapoints by a factor of 100. :-)
Unfortunately, 4.8k possible datapoints are still too many for heatmap.js, which makes the realtime heatmap demo really memory consuming. If you have an idea how to solve the memory consumption issues, please let me know ;) or file a pull request for heatmap.js on GitHub.
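To make the effect a bit more tangible, here is a minimal sketch of the same trick wrapped in two helpers. The names `snap` and `clusterPoints` and the sample positions are made up for illustration; they are not part of heatmap.js. The sketch assumes non-negative coordinates that fit into a 32-bit integer, because `>> 0` converts its operand to a 32-bit integer before truncating.

```javascript
// Hypothetical helper: snap a coordinate down to the nearest lower multiple of `step`.
// `>> 0` drops the decimal places (truncation, not rounding).
function snap(value, step) {
  return ((value / step) >> 0) * step;
}

// Hypothetical helper: count how many tracked positions fall into each snapped cell.
function clusterPoints(points, step) {
  const counts = new Map();
  for (const [x, y] of points) {
    const key = snap(x, step) + "," + snap(y, step);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

// Made-up sample positions: the first three lie within the same 10x10 pixel cell.
const samples = [[152, 144], [159, 149], [150, 140], [160, 140]];
const clustered = clusterPoints(samples, 10);

console.log(clustered.get("150,140")); // 3
console.log(clustered.get("160,140")); // 1
console.log((800 / 10) * (600 / 10));  // 4800 possible cells instead of 480000
```

Without snapping, those four positions would be four separate datapoints with one hit each. With a step of 10, three of them pile up on the same datapoint, which is exactly the kind of frequency difference a heatmap needs.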
Patrick
