My Latest Soon-to-be-Forgotten-Uncompleted Project
I remember reading about it after it was announced (probably on /.) and then promptly forgetting about it. I stumbled across it again while listening to one of the Google lectures on MapReduce, and started poking around, thinking “maybe this would make a good test project for my new (I-just-want-to-play-sized) Hadoop cluster”. But I needed an algorithm before I could start parallelizing it, and, hey, what do you know, I’m getting really good numbers.
Of course, I’m getting my good numbers on a very small, randomly selected subset of the entire project, so it could be just a fluke. And it’s slower than molasses in an Adirondack winter (current estimate: 10+ years to process the entire qualifying data set). But hey, it has my attention.







