Macintosh data extraction software for getting data from scanned graphs and images

Data extraction software

Software that estimates data from image files, such as scanned charts and graphs, or from movies. Carl Witthoft suggested this category and provided us with several starting programs.

DataThief III

Current version: 3.0 (Nov. 2000). 680x0, PPC, Classic, OS X, Linux, UNIX, Windows; runs under Java
$25 (version 1 was postcard/beerware)
Listing date: 9/2005. Updated with no change on 12/1/2010.

"DataThief is a program to reverse engineer a set of data from a given plot in a magazine or journal. This program gives you the opportunity to incorporate somebody else's data points in your plots. This comes in very handy when, for instance, you would like to compare your data with the data in a published article for which you don't have the data in table format."

The original (Classic only) was postcard/beerware and could only read PICT files (though other formats would work with QuickTime translation). The current, thoroughly cross-platform version, completely rewritten in Java, has the same basic function: estimating data based on points on a graph. It can trace most continuous lines, even when they cross themselves, and can convert numbers to other formats (e.g. dates). Version III works on Mac OS 8 and 9 as well as OS X and the other platforms listed above. DataThief II (version 1.21), for older Macs, is still available on the DataThief web site.
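The core idea of estimating data from points on a graph can be illustrated with a short Python sketch. This is not DataThief's code; it is a minimal example that assumes each axis is calibrated from two reference points whose pixel positions and data values are known. The function name `make_axis_mapper` and all the pixel and data values below are hypothetical.

```python
import math


def make_axis_mapper(pixel_a, value_a, pixel_b, value_b, log_scale=False):
    """Return a function that maps a pixel coordinate to a data value.

    (pixel_a, value_a) and (pixel_b, value_b) are two calibration points
    on one axis. Hypothetical helper, not part of any program listed here.
    """
    if pixel_a == pixel_b:
        raise ValueError("calibration pixels must differ")
    if log_scale:
        # On a logarithmic axis, interpolate in log10 space.
        value_a, value_b = math.log10(value_a), math.log10(value_b)
    scale = (value_b - value_a) / (pixel_b - pixel_a)

    def to_value(pixel):
        value = value_a + (pixel - pixel_a) * scale
        return 10 ** value if log_scale else value

    return to_value


# Assumed calibration: x runs from 0 to 10 between pixel columns 100 and 500;
# y runs from 0 to 70 between pixel rows 400 and 50 (rows grow downward).
x_map = make_axis_mapper(100, 0.0, 500, 10.0)
y_map = make_axis_mapper(400, 0.0, 50, 70.0)

# Pixel positions a user might have clicked on the scanned plot.
clicked = [(100, 400), (300, 225), (500, 50)]
points = [(x_map(px), y_map(py)) for px, py in clicked]
```

Because image rows count downward while plotted values usually grow upward, the y calibration simply uses a larger pixel value for the smaller data value; the same interpolation handles both directions.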

The original version of DataThief was written by Kees Huyser and Jan van der Laan. Available from Bas Tummers.

Engauge Digitizer

Current version: 4.1

Listing updated: 7/2007 (no change as of 6/2012)

"...converts an image file showing a graph or map, into numbers. The numbers can be read on the screen, and written or copied to a spreadsheet." Can remove gridlines, match points, trace curves, match axes, handle a wide variety of graph types, and has other nifty features. Available from SourceForge. Linux, OS X, and Windows versions. (For the Mac version, download the file ending in tar.gz; you will then need to build it yourself.)


GraphClick

Current version: 3.0
Free trial available.
Listing update: 12/2010

Similar to DataThief, but runs under OS X. GraphClick can retrieve the original data from a scanned graph or chart. It includes automatic curve, symbol, and bar-chart detection, can handle deformed images, and supports multiple data sets in one graph. GraphClick is surprisingly flexible, handling error bars, QuickTime movies (frame by frame), deformed axis systems, and numerous other quirks and specialties. Version 3 brought map projections, data sets that can be copied as columns, preservation of data point coordinates during copy/paste, and bug fixes.

Plot Digitizer

Current version: 2.50. Supports Mac, Linux, Windows (Java program).
Listing updated: 12/2010 (program updated June 2010)

"Plot Digitizer is a Java program used to digitize scanned plots of functional data. ... This program will allow you to take a scanned image of a plot (in GIF, JPEG, or PNG format) and quickly digitize values off the plot just by clicking the mouse on each data point. ... Besides digitizing points off of data plots, this program can be used to digitize other types of scanned data (such as scaled drawings or orthographic photos).

"Plot Digitizer includes a special 'semi-auto' digitizing feature. For plotted data that trends from left to right, you can simply indicate what data you want digitized with a thick paint brush and the program will automatically sort out grid lines, noise, etc, and will attempt to digitize the line for you. This feat is accomplished with the help of the open source autotrace image vectorization program."
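To give a rough feel for what tracing a left-to-right curve involves, here is a small Python sketch. It is not Plot Digitizer's method, which relies on autotrace; it swaps in a much simpler column-wise technique and assumes the brushed region has already been reduced to a binary mask in which 1 marks a curve pixel. The function name `trace_curve` and the sample mask are hypothetical.

```python
def trace_curve(mask):
    """For each pixel column, return (column, mean row of the set pixels).

    mask is a list of rows; each row is a list of 0/1 values, where 1 marks
    a pixel that belongs to the curve. Columns with no set pixels are skipped.
    Only suitable for curves with a single y value per x position.
    """
    if not mask:
        return []
    width = len(mask[0])
    traced = []
    for col in range(width):
        rows = [r for r, line in enumerate(mask) if line[col]]
        if rows:
            # Averaging the rows finds the center of a thick line.
            traced.append((col, sum(rows) / len(rows)))
    return traced


# Tiny assumed mask: a thick line rising from lower left to upper right.
mask = [
    [0, 0, 0, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
curve = trace_curve(mask)
```

The resulting pixel coordinates would still need to be converted to data values using the plot's axis calibration.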

