Benoit Bernard

My thoughts about programming, debugging and technology

Web Scraping and Crawling Are Perfectly Legal, Right?

"Come on, I worked so hard on this project! And this is publicly accessible data! There's certainly a way around this, right? Or else, I did »

The Case of the Mysterious Python Crash

It was almost 11 PM. My distributed web crawler had been running for a few hours when I discovered a very weird thing. One of its log »

Using Uber's Pyflame and Logs to Tackle Scaling Issues

Here I was again, looking at my screen in complete disbelief. This time, it was different though. My distributed web crawler seemed to be slowing down »

Tracking Down a Freaky Python Memory Leak (Part 2)

If you read part 1 of this series, you know that my crawler was plagued by several memory leaks. Using umdh, I was able to determine »

How to Build lxml and Get Its Debug Symbols on Windows

If you're one of those lost souls looking around to debug lxml on Windows, then you're most likely having some trouble. One of my Python applications »