Scraper cleanly grabs information from a single, well-formatted Web page and produces a spreadsheet ready for export, but the free Chrome extension isn't much help to reporters who need to scrape more complicated databases that require any sort of automation.
For its intended purpose -- single-page scraping -- it's simple enough to use: highlight the text you want to scrape, right-click and select "scrape similar" from the menu. Provided the page contains actual tables, the tool functions beautifully from there, funneling the highlighted text and the rest of the page's tabular data into a clear chart in a new tab. The "Export to Google Docs" button then lets reporters take the data elsewhere.
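For readers curious about what single-page table scraping involves, here is a minimal sketch in Python. It is not Scraper's own code, which this review did not examine; it simply assumes the page marks up its data with real `<table>`, `<tr>` and `<td>` tags. The names `TableRows`, `table_to_csv` and `SAMPLE` are made up for illustration.

```python
import csv
import io
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row being built, or None outside a <tr>
        self._cell = None   # text chunks of the cell being built

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_to_csv(html):
    """Return the page's table cells as CSV text (empty if no tables)."""
    parser = TableRows()
    parser.feed(html)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Hypothetical sample page, standing in for a real results table.
SAMPLE = """
<table>
  <tr><th>Name</th><th>Client</th></tr>
  <tr><td>Jane Doe</td><td>Example Corp.</td></tr>
</table>
"""

if __name__ == "__main__":
    print(table_to_csv(SAMPLE))
```

A page laid out without table markup produces an empty result in this sketch, which mirrors the kind of limitation described in this review.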
But if the database isn't formatted just right, don't expect much. We had trouble with a site laid out without HTML tables, and we got an error message on a database that displayed its results in an iframe.
The browser extension is also a little finicky: when you highlight the data you want to scrape, do not highlight the column names or it won't work. If you grab plain text, you will end up with either a blank results page or a tab containing only the text you highlighted.
Scraper is also not designed for sites with more complex structures, such as those with detail pages or simply with more pages of results than a journalist would want to click through by hand. The developer, however, was quick to point out these limitations in an email response.
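To show the kind of automation a multi-page job calls for, here is a rough Python sketch of a paging loop. It is an illustration only, not a feature of Scraper: it assumes the site exposes its result pages through a URL containing a page number, and the names `fetch`, `scrape_all_pages`, `url_template` and `parse_rows` are hypothetical.

```python
from urllib.request import urlopen


def fetch(url):
    """Download one page and return its HTML as text."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")


def scrape_all_pages(url_template, parse_rows, fetch_page=fetch, max_pages=50):
    """Walk numbered result pages until one comes back with no rows.

    url_template -- assumed URL pattern with a {page} placeholder
    parse_rows   -- function that turns one page's HTML into a list of rows
    fetch_page   -- function that downloads a URL (swappable for testing)
    max_pages    -- safety cap so a misbehaving site cannot loop forever
    """
    all_rows = []
    for page in range(1, max_pages + 1):
        rows = parse_rows(fetch_page(url_template.format(page=page)))
        if not rows:
            break  # an empty page is treated as the end of the results
        all_rows.extend(rows)
    return all_rows
```

A call might look like `scrape_all_pages("https://example.org/results?page={page}", my_parser)`, where both the URL and `my_parser` are placeholders. Sites that page through form submissions or scripts rather than numbered URLs would need a different approach.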
Because it's free and easy to use, Scraper can be a great way to grab one-off data, performing a copy-paste-export job without losing the table formatting. However, it's best for reporters working with more complicated data sets to use another tool with more functionality.
1.6
//March 18, 2011
//Free
//Yes
//No
//No
While it quickly gathered data from the first page of the South Dakota lobbyists website, Scraper's single-page limitation meant it could not automatically scrape the other pages of lobbyist information.
Scraper required a lot of hand-holding to capture even a single page of data from this directory of physicians in British Columbia. But even with manual help, its navigation limitations make it a bad choice for the task.
Scraper was able to capture some data from result pages after we manually entered a name search in this teacher database, but it doesn't have the ability to navigate through multiple pages, making it unsuitable for this task.
Because there are technically no tables on the Obama-Biden transition Web site, Scraper was useless for capturing the linked memos we wanted to save.
The Reporters' Lab welcomes relevant discussion from readers, but reserves the right to remove comments flagged as inappropriate or spam. The lab is not responsible for the content of user comments and cannot guarantee their accuracy.