Foxit Corporation Forums  

Foxit PDF IFilter: Foxit IFilter helps users index large numbers of PDF documents and then quickly find text within them.

  #1  
December 9th, 2008, 10:45 AM
cadyellow
Junior Member
 
Join Date: Dec 2008
Posts: 2
PDFs with poor OCR'd text layer

Hi,
Does IFilter's ability to find words within PDFs depend on the quality of the OCR performed when the PDF's text layer was created, i.e., on the quality of the existing text layer?

I ask because I got a Canon scanner last year and planned to scan a huge number of documents, but the scan project has not paid off so far, since it depends on getting good search results. That's been a complete failure: searching generally does NOT find words that are in the PDF document. I found out why when I looked at the hidden text layer created by Canon's OCR utility and discovered that the majority of words either don't appear at all or are so misspelled that search engines could not be expected to find them. So if IFilter depends on such a poor text layer, it's not going to do any better than the search tools I've already got.
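
To illustrate why a misspelled text layer defeats ordinary search, here is a minimal Python sketch. It makes no claim about how Foxit PDF IFilter or any particular search tool works; the sample OCR string, the function names, and the 0.8 similarity threshold are all made up for illustration. It compares an exact token lookup, which is roughly what a plain full-text index does, with an approximate lookup built on the standard library's `difflib`.

```python
import re
from difflib import SequenceMatcher


def tokenize(text):
    """Return the lowercase alphabetic tokens of an extracted text layer."""
    return re.findall(r"[a-z]+", text.lower())


def exact_hit(query, text_layer):
    """True only if the query appears as an exact token in the text layer."""
    return query.lower() in tokenize(text_layer)


def fuzzy_hits(query, text_layer, threshold=0.8):
    """Return tokens whose similarity ratio to the query meets the threshold.

    The threshold is an arbitrary illustrative value, not a recommendation.
    """
    q = query.lower()
    return [
        token
        for token in tokenize(text_layer)
        if SequenceMatcher(None, q, token).ratio() >= threshold
    ]


# Hypothetical OCR output for a page that actually reads
# "invoice payment received".
ocr_layer = "invoce payrnent recieved"

print(exact_hit("invoice", ocr_layer))   # False: the word was misspelled by OCR
print(fuzzy_hits("invoice", ocr_layer))  # ['invoce']
```

Approximate matching like this can only help when the OCR output is close to the real word. Words that are missing from the text layer altogether cannot be recovered by any search tool without re-running OCR.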
  #2  
December 9th, 2008, 07:30 PM
emily
Moderator
 
Join Date: Aug 2008
Posts: 38

Hello,
Thank you for your interest in Foxit PDF IFilter.
Foxit PDF IFilter does not currently extract text from the OCR layer, but we will treat this as a high priority. Thanks.
All times are GMT -8.