At last year's Buzzwords, it was reported that the Apache Mahout project would soon offer a new kind of clustering algorithm that promised extraordinary speed. Since then, that promise has been fulfilled. The new algorithm is extraordinarily fast, possibly the fastest production clustering algorithm available. It also has many unusual characteristics that can make clustering applicable in new ways.
This talk reports on the progress of this new kind of clustering. I will describe the theory behind the algorithm and how it provides high-quality clustering with only a single pass through the data. Mostly, however, I will focus on the algorithm's practical results.
Read more:
https://2013.berlinbuzzwords.de/sessi...
About Dan Filimon:
https://2013.berlinbuzzwords.de/users...
Website: https://berlinbuzzwords.de/