Google Drive

Google Drive text recognition promising, but file size limits a problem

Although it chokes at 2 MB or 10 pages, Drive works well for small files.

Overall:

OCR limits present significant constraints; useful for document sharing, management

Documentation:

Google maintains help files, online guides, a blog and brief online documentation; staff responded to emailed questions within a day

Usability:

Extremely simple to use, especially for anyone accustomed to Google interfaces

Community:

After only a month, has active moderated forum with hundreds of threads

Performance:

Great on small files, useless on large ones

Product:

Google Drive

Company:

Google

Cost:

Free-$2.49/month

Google Drive is fast and simple to use, with precise searches and abundant storage for projects. But for all that capacity, the Web-based service has severe file size restrictions that pose significant problems for reporters who often work with large documents.

Drive, working in sync with Google Docs, imposes a 2 MB limit for documents requiring optical character recognition (although users can store larger documents in their original formats). It also stops recognizing text at 10 pages, regardless of file size. As for whether this limit may expand in the future, a Google Drive representative said the company had no pending improvements to announce.
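The two limits described above can be checked before uploading. The following is a minimal sketch, not anything Google provides: the function name and constants are hypothetical, and it assumes "2 MB" means 2 × 1024 × 1024 bytes, which the service does not spell out. The page count has to come from whatever PDF tool you already use.

```python
# Limits as reported in this review. Treating "2 MB" as 2 * 1024 * 1024 bytes
# is an assumption; the exact cutoff is not documented here.
OCR_MAX_BYTES = 2 * 1024 * 1024
OCR_MAX_PAGES = 10


def check_ocr_limits(size_bytes, page_count):
    """Return a list of reasons a file would not be fully recognized.

    An empty list means the file fits within both reported limits.
    """
    problems = []
    if size_bytes > OCR_MAX_BYTES:
        problems.append("file exceeds 2 MB and would be rejected for OCR")
    if page_count > OCR_MAX_PAGES:
        problems.append("text recognition would stop after page 10")
    return problems
```

For a file on disk, the size can be read with os.path.getsize(path) from Python's standard library. A 3 MB, 25-page transcript would come back with both problems listed.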

And that's a shame, because when Google Drive does accept files, it handles them almost flawlessly. When we tested it on scanned-in memos, it reproduced the text verbatim in two files. But it rejected two other files outright because they exceeded the service's 2 MB limit.

Those limits were more vexing in our test of PDF files of typed transcripts. Drive failed to import three of five files, and in the two files it converted, it stopped at 10 pages, foiling our test parameters. File-size limits also caused it to fail our test of a tricky partial-text PDF. It did much better in a test of converting form-based PDFs, quickly accepting and converting the 174 small files and rendering the text well enough that a search found nearly all of the required terms.

Google Drive shows promise for reporters with certain tasks, such as managing a slew of letters, emails or government forms. Reporters can expect quick uploads, easy file-sharing and generally accurate searching, given the usual caveats. However, the file-size issue is a major constraint for reporters dealing with large files and tight deadlines; other programs and services we tested have no such limits.

It might be possible to use those programs to split the documents into manageable pieces, and then upload to Drive. But if you have to use another program for document parsing, it would be simpler to use it for the conversion and analysis as well. 
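To give a sense of what that splitting would involve, here is a rough sketch that only plans the page ranges for pieces of at most 10 pages each. The function name is hypothetical, and the actual cutting of the PDF would still have to be done in a separate tool. Each piece would also need to stay under the 2 MB size limit, which this sketch does not check.

```python
def plan_page_ranges(total_pages, max_pages=10):
    """Split a document into consecutive page ranges of at most max_pages.

    Returns a list of (first_page, last_page) tuples, 1-based and inclusive.
    """
    if total_pages < 0 or max_pages < 1:
        raise ValueError("total_pages must be >= 0 and max_pages >= 1")
    ranges = []
    start = 1
    while start <= total_pages:
        # The last piece may be shorter than max_pages.
        end = min(start + max_pages - 1, total_pages)
        ranges.append((start, end))
        start = end + 1
    return ranges


print(plan_page_ranges(23))  # [(1, 10), (11, 20), (21, 23)]
```

A 23-page transcript becomes three uploads, and a 200-page one becomes 20, which illustrates why doing the conversion in the other program is likely the simpler route.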

 
OS Tested:

Web Based

Open Sourced:

No

Demo Available:

No

Obsolete:

No

 

How Google Drive performed on our tests

Verdict:

Converted text in most search targets accurately

Drive quickly recognizes text in scanned forms

Google Drive handled this test with relative ease, uploading and recognizing text in 174 political candidate disclosure forms in about 40 minutes.


Verdict:

Flawlessly converted small documents; didn't accept larger files

Google Drive flawless with scanned-in memos, but only if small files

Due to its file size limit, Google Drive was only able to recognize text in half the memos we tested from the Obama administration's Your Seat at the Table site.


Verdict:

Couldn't upload large file

Google Drive fails to convert large partial-text PDF

Google Drive failed completely in our attempt to recognize text in this list of executive branch reports required by Congress.


Verdict:

Can't handle large files, long documents; completely failed test

Drive chokes on large transcript files, fails to convert text

Google Drive couldn't get out of the starting gate in this test of combatant tribunal transcripts because of its file size restrictions.


The Reporters' Lab welcomes relevant discussion from readers, but reserves the right to remove comments flagged as inappropriate or spam. The lab is not responsible for the content of user comments and cannot guarantee their accuracy.
