{"id":1138,"date":"2010-01-10T20:55:04","date_gmt":"2010-01-11T02:55:04","guid":{"rendered":"http:\/\/bililite.nfshost.com\/blog\/?p=1138"},"modified":"2010-01-10T20:55:04","modified_gmt":"2010-01-11T02:55:04","slug":"digitizing-graphs","status":"publish","type":"post","link":"https:\/\/bililite.com\/blog\/2010\/01\/10\/digitizing-graphs\/","title":{"rendered":"Digitizing graphs"},"content":{"rendered":"<p>I wanted to add <a href=\"http:\/\/en.wikipedia.org\/wiki\/Down_syndrome\">Down Syndrome<\/a> growth charts to the <a href=\"\/webservices\/\">bililite.com webservices<\/a>, but as far as I can tell, the charts are available only as images in the <a href=\"http:\/\/aappolicy.aappublications.org\/cgi\/content\/full\/pediatrics;107\/2\/442\"><abbr title=\"American Academy of Pediatrics\">AAP<\/abbr>'s guidelines<\/a> (and the <a href=\"http:\/\/pediatrics.aappublications.org\/cgi\/reprint\/81\/1\/102\">original paper<\/a>; subscription only). The often-cited <a href=\"http:\/\/growthcharts.com\/charts\/DS\/charts.htm\">growthcharts.com<\/a> has charts, and Greg Richards was generous enough to share his data with me. However, some of the data are from a different study, and he got his data from the original charts the old-fashioned way: with pencil, ruler, and a blown-up copy of the paper. Nothing wrong with that; that's how I got my numbers for the <a href=\"\/webservices\/bili\">bilirubin chart<\/a>, but I wanted all my charts to match the AAP's.<\/p>\r\n<p>So how to get the numbers off the graph? I emailed the lead author of the original paper, but haven't gotten an answer. I can pull the graphs as GIFs from the PDF of the paper (thanks to <a href=\"http:\/\/www.openoffice.org\/\">OpenOffice.org<\/a> and <a href=\"http:\/\/extensions.services.openoffice.org\/project\/pdfimport\">Sun's PDF importer<\/a>; Adobe's reader seems to get more limited with each upgrade). 
I was afraid I would have to digitize the graph by hand; I read the cool article on <a href=\"http:\/\/sudokugrab.blogspot.com\/2009\/07\/how-does-it-all-work.html\">Sudoku recognition<\/a> and figured I could learn about <a href=\"http:\/\/en.wikipedia.org\/wiki\/Hough_transform\">Hough transforms<\/a> to get the graph, and <a href=\"http:\/\/en.wikipedia.org\/wiki\/Discrete_Fourier_transform#Multidimensional_DFT\">2-D Fourier transforms<\/a> to remove the gridlines, then <a href=\"http:\/\/en.wikipedia.org\/wiki\/Blob_detection\">blob detection<\/a> to find the lines. Turning pixels into measurements would be the trivial last step. Sounds like fun, if I had an infinite amount of free time.<\/p>\r\n<p>Luckily, I found <a href=\"http:\/\/digitizer.sourceforge.net\/\">Engauge Digitizer<\/a>. With almost no time reading the manual, I had it removing gridlines, digitizing the curves on the graph, and exporting values at x-values that I selected into CSV files. It was close to easy. Not quite automated, but with only 4 graphs to digitize, I was done in half an hour. Highly recommended. With my remaining free time, I'll write a quick tutorial so I don't forget what I did.<\/p>","protected":false},"excerpt":{"rendered":"I wanted to add Down Syndrome growth charts to the bililite.com webservices, but as far as I can tell, the charts are available only as images in the AAP's guidelines (and the original paper; subscription only). The often-cited growthcharts.com has charts, and Greg Richards was generous enough to share his data with me. 
However, some [&hellip;]","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[12],"tags":[],"_links":{"self":[{"href":"https:\/\/bililite.com\/blog\/wp-json\/wp\/v2\/posts\/1138"}],"collection":[{"href":"https:\/\/bililite.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/bililite.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/bililite.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/bililite.com\/blog\/wp-json\/wp\/v2\/comments?post=1138"}],"version-history":[{"count":6,"href":"https:\/\/bililite.com\/blog\/wp-json\/wp\/v2\/posts\/1138\/revisions"}],"predecessor-version":[{"id":1144,"href":"https:\/\/bililite.com\/blog\/wp-json\/wp\/v2\/posts\/1138\/revisions\/1144"}],"wp:attachment":[{"href":"https:\/\/bililite.com\/blog\/wp-json\/wp\/v2\/media?parent=1138"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/bililite.com\/blog\/wp-json\/wp\/v2\/categories?post=1138"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/bililite.com\/blog\/wp-json\/wp\/v2\/tags?post=1138"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}