DocumentCloud

DocumentCloud performs erratically when searching memos

Sometimes its entity analysis exceeds the standard; other times it falls far short.

Product: DocumentCloud
Company: Investigative Reporters and Editors
Cost: Free
Verdict: Good for a first pass, not enough for rock-solid analysis


DocumentCloud spotted only about half of the references to organizations mentioned in these memos from the Obama-Biden transition team, although it outperformed our annotator on those it did catch.

For three of the six government agencies and organizations, it spotted all of the references in the test criteria and then some, picking up others that our own test creators had missed entirely. For example, it accurately found 64 references to "Ducks Unlimited"; the test required it to find only 43.

However, it missed the other three groups completely. Some, like the Global Privacy and Information Quality Working Group and its abbreviation, GPIQWG, are hidden in a muddle of acronyms. But it also completely missed "U.S. Department of Justice."

According to developer Jeremy Ashkenas, OpenCalais, the entity analysis service DocumentCloud relies on, uses a complex set of lists and rules to identify names and organizations, and those rules may simply miss some entities.

A second pass through the system produced identical results.

 
Product: DocumentCloud
Company: Investigative Reporters and Editors
Version Tested: N/A
Release Date: 2009
OS Tested: Web Based
Cost: Free
Open Sourced: Yes
Demo Available: No
Obsolete: No

Testing
