Thursday, October 30, 2008

Google’s PDF Search Throws Some Light on the Dark Web

Google said on its official blog that it has developed optical character recognition (OCR) technology to the point that its search engine can read any scanned document in Adobe’s PDF format, effectively turning scanned images into searchable, indexable text.

It’s no secret that Google has been looking into OCR; the Mountain View, Calif.-based company’s projects to make books and newspapers digitally searchable are part of its broader effort to expand the parameters of search.
