Solr/Nutch Integration Patch Error

Solr/Nutch Integration Patch Error

Tkach
Has anyone tried to apply/use the patches to the Nutch trunk from
NUTCH-442? Between that code and the example from Sami's FooFactory
weblog I've been able to at least get things running, but I still hit a
snag.  When I try to run SolrIndexer.java I get an error from the Hadoop
MapTask (via Indexer.java:157) complaining about a type mismatch in the
map: "expected org.apache.hadoop.io.ObjectWritable, recieved
org.apache.nutch.crawl.NutchWritable".

Looking at Indexer.java, I can see that OutputFormatter.map() seems to
be trying to "send" a new NutchWritable, while SolrIndexer.index() sets
up its JobConf so that the map uses an ObjectWritable.  I suspect that's
where the problem is, but I'm not yet familiar enough with the code to
tell how to fix it.
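
To make the suspected mismatch concrete, here is a minimal, self-contained
sketch.  It does not use Hadoop or Nutch at all: ObjectWritableStub,
NutchWritableStub, CheckedCollector and TypeMismatchDemo are all hypothetical
stand-ins, and the sketch assumes the framework compares the class of each
value emitted by the map against the class declared in the job configuration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the two value types named in the error message.
class ObjectWritableStub {}
class NutchWritableStub {}

// Simplified collector (an assumption, not Hadoop code): it rejects any value
// whose class differs from the class declared when the job was configured.
class CheckedCollector {
    private final Class<?> declaredValueClass;
    private final List<Object> collected = new ArrayList<>();

    CheckedCollector(Class<?> declaredValueClass) {
        this.declaredValueClass = declaredValueClass;
    }

    void collect(Object value) {
        if (value.getClass() != declaredValueClass) {
            throw new IllegalStateException(
                "Type mismatch in value from map: expected "
                + declaredValueClass.getName()
                + ", received " + value.getClass().getName());
        }
        collected.add(value);
    }

    int size() {
        return collected.size();
    }
}

class TypeMismatchDemo {
    // Emits one NutchWritableStub, as the map step is described as doing,
    // and reports whether the collector accepted it.
    static boolean mapSucceeds(Class<?> declaredValueClass) {
        CheckedCollector out = new CheckedCollector(declaredValueClass);
        try {
            out.collect(new NutchWritableStub());
            return out.size() == 1;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Declared type differs from the emitted type: rejected.
        System.out.println(mapSucceeds(ObjectWritableStub.class));
        // Declared type matches the emitted type: accepted.
        System.out.println(mapSucceeds(NutchWritableStub.class));
    }
}
```

If this reading is right, the two sides would need to agree: either the map
emits the type the job declares, or the job declares the type the map emits
(in Hadoop's old mapred API that declaration is made on the JobConf, for
example via setMapOutputValueClass).  Which side should change in the
NUTCH-442 patch is not something this sketch can answer.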

I can post the errors/logs from it.  I just wasn't sure which parts were
relevant or what the best way to post them was (I didn't want to just
dump loads of unformatted stack traces here).

--
This email message and any attachments are for the sole use of the intended
recipient(s) and may contain information that is proprietary to Ahold and/or
its subsidiaries ("Ahold") or otherwise confidential or legally privileged.
If you have received this message in error, please notify the sender by
reply, and delete all copies of this message and any attachments.  If you
are the intended recipient you may use the information contained in this
message and any files attached to this message only as authorized by Ahold.
Files attached to this message may only be transmitted using secure systems
and appropriate means of encryption, and must be secured using the same
level of password and security protection with which the file was provided
to you.  Any unauthorized use, dissemination or disclosure of this message
or its attachments is strictly prohibited.

Re: Solr/Nutch Integration Patch Error

Brian Whitman


On Feb 12, 2008, at 11:57 AM, Nick Tkach wrote:

> Has anyone tried to apply/use the patches to the Nutch trunk from  
> NUTCH-442? Between that code and the example from Sami's FooFactory  
> weblog I've been able to at least get things running, but still hit  
> a snag.  When I try to run SolrIndexer.java I get an error from the  
> Hadoop MapTask (via Indexer.java:157) complaining about a Type  
> mismatch in the map "expected org.apache.hadoop.io.ObjectWritable,  
> recieved org.apache.nutch.crawl.NutchWritable".


For Sami's version, look here: http://variogram.com/latest/?p=26



Re: Solr/Nutch Integration Patch Error

Tkach
Ah, thank you, that's much closer.  I just have one other question.
When I try to compile the SolrClientAdapter.java from the zip file on
the page you mentioned, I get "cannot find symbol" errors on the calls
to solrDoc.setBoost(), solrDoc.addField(), and solrDoc.add().

I've looked around and the only SimpleSolrDoc I can find is the one in
the solr-client.zip from SOLR-20, and that class doesn't seem to inherit
or define any of these methods.  Did SimpleSolrDoc.java move, or
something like that?

I did find, though, that the SolrClientAdapter.java posted on FooFactory
(bottom of the page) seems to compile and run just fine in this case, so
it's not a big deal.
