Rich Docs Indexing

Eric Pugh-4
Hi all,

I've been working with the RichDocumentRequestHandler
(http://issues.apache.org/jira/browse/SOLR-284) for the past few weeks, and it
seems to be working quite well.  We discovered that when we throw a
27 MB PDF document at it we needed to increase the Java heap size, and
we haven't come up with a good solution for handling password-protected
PDF documents, beyond not indexing them.

I wanted to see if I could get some momentum going on whether this
is something that the committers want in Solr 1.3.  I'd like to
write up a wiki page, similar to the http://wiki.apache.org/solr/UpdateCSV
page, that would give folks a chance to see what this code can do, while
making clear that it documents just a patch file.  Would this
be okay, or misleading to folks?

I've updated the patch to revision 555996.

Thanks for your consideration!   PS, is anyone going to be at OSCON  
in two weeks?  I'd love to meet up with some other Solr folks.

Eric

-------------------------------------------------------
Principal
OpenSource Connections
Site: http://www.opensourceconnections.com
Blog: http://blog.opensourceconnections.com
Cell: 1-434-466-1467




Re: Rich Docs Indexing

Erik Hatcher

On Jul 13, 2007, at 10:31 AM, Eric Pugh wrote:
> I wanted to see if I could get some momentum going on seeing if
> this is something that the committers want in Solr 1.3...   I'd
> like to write up a wiki page similar to
> http://wiki.apache.org/solr/UpdateCSV page that would give folks a
> chance to see what this code can do, but highlight that it is a wiki
> page about just a patch file?  Would this be okay, or misleading to folks?

Eric - kudos!  Thanks for this contribution and effort to document  
it.  There is already precedent here - the Field Collapsing  
contribution has worked thus far too:

        <http://wiki.apache.org/solr/FieldCollapsing>

So go for it!

        Erik, who will one day look at this contribution, but not for a few  
weeks, sorry

Re: Rich Docs Indexing

Yonik Seeley-2
On 7/13/07, Eric Pugh wrote:
>  I'd like to write up a wiki page that would give folks a chance to see what this code
> can do, but highlight that it is a wiki page about just a patch file?

That's fine.  And if you don't link it to the main page, there
shouldn't be any confusion.

-Yonik