Skip to content

Digitizing graphs

I wanted to add Down Syndrome growth charts to the webservices, but as far as I can tell, the charts are available only as images in the AAP's guidelines (and the original paper; subscription only). The often-cited has charts, and Greg Richards was generous enough to share his data with me. However, some of the data are from a different study, and he got his data from the original charts the old-fashioned way: with pencil, ruler, and a blown-up copy of the paper. Nothing wrong with that; that's how I got my numbers for the bilirubin chart, but I wanted all my charts to match the AAP's.

So how to get the numbers off the graph? I emailed the lead author of the original paper, but haven't gotten any answer. I can pull the graphs as gif's from the PDF of the paper (thanks to and Sun's PDF importer; Adobe's reader seems to get more limited with each upgrade). I was afraid I would have to digitize the graph by hand; I read the cool article on Sudoku recognition and figured I could learn about Hough transforms to get the graph, and 2-D Fourier transforms to remove the gridlines, then blob detection to find the lines. Turning pixels into measurements would be the trivial last step. Sounds like fun, if I had an infinite amount of free time.

Luckily, I found Engauge Digitizer. With almost no time reading the manual, I had it removing gridlines, digitizing the curves on the graph, and exporting values at x-values that I selected into CSV files. It was close to easy. Not quite automated, but with only 4 graphs to digitize, I was done in half an hour. Highly recommended. With my remaining free time, I'll write a quick tutorial so I don't forget what I did.

Post a Comment

Your email is never published nor shared. Required fields are marked *