Data extraction software estimates data values from image files such as scanned charts and graphs. Carl Witthoft suggested this category and provided us with several starting programs. Also see graphing / data visualization software and general statistics packages. Page fully updated January 2021.
Current version: 12.1
Free. Signed. Catalina-ready.
Listing updated: 1/21/2021; software updated in 2019
"...converts an image file showing a graph or map, into numbers. The numbers can be read on the screen, and written or copied to a spreadsheet." Can remove gridlines, match points, trace curves, match axes, handle a wide variety of graph types, and has other nifty features. Linux, OS X, and Windows versions; it’s on the Mac App Store.
Engauge Digitizer takes images in PNG, JPG, and TIF format and recovers the data points from graphs, possibly to be used in new graphs. Work can be saved in DIG format for later editing. Engauge Digitizer has numerous special features to make data more accurate and easier to obtain; it’s pretty impressive.
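To make the "match axes" step concrete, here is a minimal sketch of how a digitizer could turn clicked pixel positions into data values once two reference points per axis are known. It is not Engauge Digitizer's actual code; the function name `make_axis_mapper` and all pixel and axis values are hypothetical, and it assumes the axes are straight and aligned with the image edges.

```python
import math


def make_axis_mapper(pixel_a, value_a, pixel_b, value_b, log_scale=False):
    """Build a pixel-to-value function for one axis from two calibration points.

    Hypothetical helper for illustration: (pixel_a, value_a) and
    (pixel_b, value_b) are two known positions on the axis.
    """
    if pixel_a == pixel_b:
        raise ValueError("calibration pixels must differ")
    if log_scale:
        # On a log axis, pixels are linear in log10(value).
        value_a, value_b = math.log10(value_a), math.log10(value_b)
    slope = (value_b - value_a) / (pixel_b - pixel_a)

    def to_value(pixel):
        value = value_a + slope * (pixel - pixel_a)
        return 10 ** value if log_scale else value

    return to_value


# Assumed calibration: x = 0 at pixel column 100, x = 10 at column 500;
# y = 0 at pixel row 400, y = 100 at row 50 (image rows grow downward).
x_map = make_axis_mapper(100, 0.0, 500, 10.0)
y_map = make_axis_mapper(400, 0.0, 50, 100.0)

# Pixel positions a user might have clicked on the scanned graph.
clicked = [(100, 400), (300, 225), (500, 50)]
points = [(x_map(px), y_map(py)) for px, py in clicked]

for x, y in points:
    print(f"{x:.2f}\t{y:.2f}")  # tab-separated, ready to paste into a spreadsheet
```

Handling each axis separately keeps the sketch short; a rotated or skewed scan would need a full two-dimensional transform instead.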
Current version: DataThief III 1.7; Java program
$25 (version 1 was postcard/beerware)
Listing updated 1/21/2021. Software updated 2015.
Unsigned. Runs under Catalina.
DataThief reverse engineers data from a scanned plot, so you can incorporate published data in your plots—very handy if you need to compare your data with that of an article that doesn’t provide it in a table.
The current cross-platform version, written in Java, can trace most continuous lines, even when they cross themselves, and can convert numbers to other formats (e.g. dates). Version III runs on Mac OS 8 and 9 as well as Mac OS X, and on other platforms. DataThief II (version 1.21), for older Macs, is still available on the DataThief web site.
The original version of DataThief was written by Kees Huyser and Jan van der Laan. Available from Bas Tummers.
Current version: 2.6.9. Supports Mac, Linux, Windows (Java program).
Listing updated: 1/21/2021; program updated 10/10/2020.
Plot Digitizer is a Java program for digitizing scanned plots of data from GIF, JPEG, or PNG images; you manually select values by clicking on each data point, which may help with complex graphs that automatic software handles poorly. It can also be used to digitize scaled drawings, orthographic photos, etc.
There is a semi-automatic feature, for data that trends from left to right; you can paint the data you want with a brush, and “the program will automatically sort out grid lines, noise, etc, and will attempt to digitize the line for you. This feat is accomplished with the help of the open source autotrace image vectorization program.”
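The restriction to data that trends from left to right suggests one simple way such tracing could work: reduce the painted pixels to a single point per image column. The sketch below illustrates that idea only. It is not Plot Digitizer's algorithm and does not use autotrace; the function name `trace_left_to_right` and the tiny mask are hypothetical.

```python
def trace_left_to_right(mask):
    """Reduce a painted pixel mask to one (column, row) point per column.

    mask is a list of rows; a truthy cell marks a pixel the user painted
    as part of the curve. For each column with marked pixels, the mean
    row index is taken as the curve position. Illustrative assumption:
    the curve has one value per column (it trends left to right).
    """
    if not mask:
        return []
    width = len(mask[0])
    traced = []
    for col in range(width):
        rows = [r for r, line in enumerate(mask) if line[col]]
        if rows:
            traced.append((col, sum(rows) / len(rows)))
    return traced


# A 4x3 mask of a thick line rising from lower left to upper right.
mask = [
    [0, 0, 1],
    [0, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
]
print(trace_left_to_right(mask))
```

The resulting pixel positions would still need to be converted to data values using the calibrated axes.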