Foxit Corporation Forums  

Foxit PDF IFilter: Foxit IFilter helps users index large numbers of PDF documents and then quickly find text within them.

#1
September 4th, 2008, 01:59 AM
robix73 (Junior Member)
Limits on the amount of extracted text?

Hello,
Does the Foxit PDF IFilter have a limit on the amount of text that can be extracted from a single PDF file? It seems to me that the limit is around 150 MB of text per PDF file.
Am I wrong?

#2
September 8th, 2008, 07:40 PM
AmyLin (Moderator)

Hello,

Thank you for your feedback.
Foxit does not limit the file size. I think the limit is controlled by the Indexing Service.
Most of the registry entries for Indexing Service are found under the key
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\ContentIndex.

The DefaultColumnFile entry is found under the key
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\ContentIndexCommon.

I think "MaxTextFilterBytes" is related to your issue.
You can find more information about it at the following link:
http://msdn.microsoft.com/en-us/libr...19(VS.85).aspx
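
To check the current setting before changing anything, the lookup can be sketched in Python with the standard-library winreg module (Windows only). This is a minimal sketch: the helper name read_hklm_value is made up for illustration, and it assumes that MaxTextFilterBytes sits directly under the ContentIndex key named above, which the post suggests but does not state outright.

```python
import sys

# Key path taken from the post above (relative to HKEY_LOCAL_MACHINE).
CONTENT_INDEX_KEY = r"SYSTEM\CurrentControlSet\Control\ContentIndex"


def read_hklm_value(key_path, value_name):
    """Return the data of an HKLM registry value, or None if unavailable."""
    if sys.platform != "win32":
        return None  # the registry only exists on Windows
    import winreg  # standard library, Windows only

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
            data, _value_type = winreg.QueryValueEx(key, value_name)
            return data
    except OSError:
        return None  # key or value is missing, or access was denied


if __name__ == "__main__":
    # Assumption: the value lives under the ContentIndex key.
    limit = read_hklm_value(CONTENT_INDEX_KEY, "MaxTextFilterBytes")
    if limit is None:
        print("MaxTextFilterBytes is not set or not readable on this machine")
    else:
        print(f"MaxTextFilterBytes = {limit} bytes ({limit / 2**20:.1f} MiB)")
```

The script only reads the value, so it should be safe to run on a production indexing server.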

#3
November 25th, 2008, 08:26 AM
robix73 (Junior Member)

Hello,
As explained in another post, the limits do not come from the Foxit PDF IFilter; they are due to SharePoint registry entries. The Foxit PDF IFilter is not at fault.
Regards

#4
November 25th, 2008, 05:02 PM
emily (Moderator)

Hello,
This is caused by the default maximum document size in MOSS, which is 16 MB.

Could you try the following steps to change the maximum document size?

You need to add the MaxDownloadSize value for the MOSS search.

1. Run Regedit.exe: Start -> Run, type "regedit", and click "OK".
2. Locate the following registry subkey:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager
3. Click "Edit" -> New -> DWORD Value, and name it "MaxDownloadSize".
4. Double-click the new value, choose "Decimal", and type the value data you want.
5. Restart the server: Start -> Run, type "cmd", then type "iisreset".
6. Start a full crawl before searching.
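
For anyone who prefers a file to clicking through Regedit, steps 1 to 4 can be sketched as a small Python helper that produces the text of a .reg file. This is only an illustration: the function name build_reg_file and the example value 64 are made up here, and the value name is written as MaxDownloadSize (registry value names are not case-sensitive).

```python
# Subkey quoted in step 2 above.
GATHERING_MANAGER_KEY = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0"
    r"\Search\Global\Gathering Manager"
)


def build_reg_file(max_download_size):
    """Return .reg file text that sets the MaxDownloadSize DWORD value."""
    if not 0 <= max_download_size <= 0xFFFFFFFF:
        raise ValueError("a DWORD must fit in 32 unsigned bits")
    return (
        "Windows Registry Editor Version 5.00\r\n"
        "\r\n"
        f"[{GATHERING_MANAGER_KEY}]\r\n"
        # .reg files store DWORD data as eight hexadecimal digits.
        f'"MaxDownloadSize"=dword:{max_download_size:08x}\r\n'
    )


if __name__ == "__main__":
    # 64 is an arbitrary example value, not a recommendation.
    print(build_reg_file(64))
```

Saving the printed text as a .reg file and importing it on the server should have the same effect as steps 1 to 4; steps 5 and 6 are still needed afterwards.
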
Best Regards,
Emily

#5
November 26th, 2008, 12:18 AM
robix73 (Junior Member)

Emily, you're right, but that's not enough.
With your suggestion alone, at best only 10 or maybe 11 MB of text will be indexed, no matter whether maxdownloadsize is set to 4096 (4 GB file size) or maxgrowfactor is set to 16 (in which case the theoretical maximum of extracted text allowed is 4096 x 16 MB).
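
The arithmetic behind that theoretical maximum can be written out as a short Python sketch. The function name is made up for illustration, and the formula is simply the multiplication described in this post, not something taken from Microsoft documentation.

```python
def theoretical_max_text_mb(max_download_size_mb, max_grow_factor):
    """Upper bound on extracted text, following the reasoning in this post."""
    return max_download_size_mb * max_grow_factor


limit_mb = theoretical_max_text_mb(4096, 16)
print(f"{limit_mb} MB, i.e. {limit_mb / 1024:.0f} GB")  # 65536 MB, i.e. 64 GB
```

The 10 or 11 MB actually indexed is far below this bound, which is why other settings had to be involved.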

I explained in another forum all the steps I took, and now my box can index over 90 MB of data from a single file.

Here is the link:

http://forums.microsoft.com/MSDN/Sho...94310&SiteID=1

Other registry keys are involved.
They are:

DedicatedFilterProcessMemoryQuota
FilterProcessMemoryQuota
FolderHighPriority
CB_ChunkBufferSizeInMegaBytes
CB_MinBytesReservedForDoc
RobotThreadsNumber


All of these modifications affect MOSS behaviour not only for PDF files but for all file types (PDF, DOC, DOCX, TXT, RTF, XLS and so on).

I hope that the Foxit PDF IFilter team will test these keys and that my suggestions are valuable for all users.
Sorry for my poor English.
Goodbye

All times are GMT -8.

