The Tale of Creating a Distributed Web Crawler
Around 6 million records with about 15 fields each. This was the dataset that I wanted to analyze for a data analysis project of mine. But »
"Come on, I worked so hard on this project! And this is publicly accessible data! There's certainly a way around this, right? Or else, I did »
It was almost 11PM. My distributed web crawler had been running for a few hours when I discovered a very weird thing. One of its log »
Here I was again, looking at my screen in complete disbelief. This time, it was different though. My distributed web crawler seemed to be slowing down »
If you read part 1 of this series, you know that my crawler was plagued by several memory leaks. Using umdh, I was able to determine »