Index Large PDF files


gorak
August 4th, 2009, 11:31 PM
I recently installed Foxit PDF IFilter and integrated it with WDS 4.0. I'm very happy with the performance, which is much better than Adobe's, but it has the same issue as Adobe. I have a lot of PDF documents that are more than 1200 pages long. The problem is that Foxit indexes only the first 700-800 pages of text and simply ignores the rest. This is really annoying, since I miss most of the material in the last few hundred pages of each document. How do I overcome this issue? I'm a research student and need to index a lot of documents on a regular basis. Please help me.

emily
August 5th, 2009, 02:35 AM
Hello Gorak,
Thanks for your feedback.
Could you send us a sample file for testing? We cannot reproduce this problem. Our email address is support@foxitsoftware.com. Please include the keyword "Foxit IFilter" in your email subject.
Thanks for your help.
Emily

gorak
August 5th, 2009, 08:46 AM
I cannot send any of my PDFs since they are confidential. But the problem is very easy to reproduce. Here are the steps:

1. Open a PDF file that is at least 1000-1200 pages long and full of text.

2. Go to, say, page 1100.

3. Copy some text from a paragraph (the text should be unique and should not appear elsewhere in the PDF) and paste it, within quotes, into the Windows Search 4.0 deskbar. Windows Search 4.0 will not find it. But if you do the same with text from page 300 or 400, Windows Search will find it.
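
Step 3 depends on picking text that really is unique to the chosen page. A minimal sketch of how that check could be automated is below. It assumes the text of each page has already been extracted into a list of strings by some other tool; the function name `find_unique_phrase` and its parameters are hypothetical and are not part of Foxit PDF IFilter or WDS.

```python
from typing import List, Optional


def find_unique_phrase(pages: List[str], page_index: int, words: int = 8) -> Optional[str]:
    """Return `words` consecutive words from pages[page_index] that do not
    occur on any other page, or None if no such phrase exists.

    `pages` is assumed to hold one already-extracted text string per page.
    """
    def normalize(text: str) -> str:
        # Lowercase and collapse whitespace so line breaks do not hide matches.
        # Pad with spaces so phrases only match on whole-word boundaries.
        return " " + " ".join(text.lower().split()) + " "

    tokens = pages[page_index].split()
    others = [normalize(p) for i, p in enumerate(pages) if i != page_index]

    for start in range(len(tokens) - words + 1):
        phrase = " ".join(tokens[start:start + words])
        needle = " " + phrase.lower() + " "
        if not any(needle in other for other in others):
            return phrase  # original casing, ready to paste within quotes
    return None
```

For example, with `pages = ["the quick brown fox", "the quick red fox jumps", "a lazy dog sleeps"]`, calling `find_unique_phrase(pages, 1, words=2)` skips "the quick" (it also appears on the first page) and returns "quick red". The returned phrase can then be pasted, within quotes, into the search box as in step 3.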

gorak
August 5th, 2009, 07:41 PM
Is there no one who can help me out on this?

emily
August 6th, 2009, 12:07 AM
Hello Gorak,
About the issue you reported: I tested with a 1310-page file, and the issue may be caused by a WDS 4.0 setting. When I use ifilttst.exe to call Foxit PDF IFilter (1.0.0.2411) and Adobe IFilter 6.0, both can extract the content of the last few hundred pages of the PDF document. ifilttst.exe is part of the Windows Server 2003 Resource Kit Tools; you can download it from the following link:
http://www.microsoft.com/downloads/details.aspx?familyid=9d467a69-57ff-4ae7-96ee-b18c4790cffd&displaylang=en.
The tool is described here in the MSDN library:
http://msdn.microsoft.com/en-us/library/ms692580(VS.85).aspx.

Emily

gorak
August 6th, 2009, 09:17 AM
Okay, so which options do you think I need to set in WDS 4.0 to make this work?

gorak
August 6th, 2009, 02:09 PM
I understand this could be related to WDS 4.0. Any clues about what might cause this issue with WDS 4.0?

emily
August 7th, 2009, 01:55 AM
Hello Gorak,
For questions about WDS 4.0, please contact Microsoft at support@microsoft.com. They may know WDS 4.0 better than I do.
Emily