Help with StopFilterFactory

Re: Help with StopFilterFactory

heaven
Hi, I just tried your suggestion but got this error:
26.8.2014 12:53:19 ERROR SolrCore org.apache.solr.common.SolrException: Exception writing document id Site 4078370 to the index; possible analysis error.
	at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:164)
	at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
	at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
	at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:704)
	at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:858)
	at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:557)
	at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
	at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:247)
	at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:174)
	at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
	at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1952)
	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:774)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
	at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
	at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
	at org.eclipse.jetty.server.Server.handle(Server.java:364)
	at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
	at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
	at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
	at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
	at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
	at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
	at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
	at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: enablePositionIncrements=false is not supported anymore as of Lucene 4.4 as it can create broken token streams
	at org.apache.lucene.analysis.util.FilteringTokenFilter.checkPositionIncrement(FilteringTokenFilter.java:40)
	at org.apache.lucene.analysis.util.FilteringTokenFilter.setEnablePositionIncrements(FilteringTokenFilter.java:142)
	at org.apache.lucene.analysis.core.StopFilterFactory.create(StopFilterFactory.java:128)
	at org.apache.solr.analysis.TokenizerChain.createComponents(TokenizerChain.java:67)
	at org.apache.lucene.analysis.AnalyzerWrapper.createComponents(AnalyzerWrapper.java:102)
	at org.apache.lucene.analysis.Analyzer.tokenStream(Analyzer.java:180)
	at org.apache.lucene.document.Field.tokenStream(Field.java:552)
	at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:103)
	at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:248)
	at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:253)
	at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:465)
	at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1537)
	at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:236)
	at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:160)
	... 40 more

And then I found this: http://stackoverflow.com/questions/18668376/solr-4-4-stopfilterfactory-and-enablepositionincrements.

I don't really know why they did this; the stated reason, that "it can create broken token streams", doesn't make sense to me. Perhaps those who made this decision don't use Solr and so simply don't care. That's the only explanation I can find.

Re: Help with StopFilterFactory

heaven
So it sounds like a bug/regression to me, doesn't it? The Internet is full of complaints about this issue. Why should we all suffer because of someone who didn't know when and how to use this feature and, as a result, got wrong data indexed? Who cares about that? And why remove an option that is so useful to the many people who do know how to use it?

Re: Help with StopFilterFactory

Jack Krupansky-2
In reply to this post by heaven
Sigh. I vaguely recall some discussion of this.

Okay, so you can get the "old" behavior, either by globally setting the
"lucene match version" in solrconfig.xml:

<luceneMatchVersion>4.3</luceneMatchVersion>

Or, probably best, just set the lucene match version for that specific token
filter by adding this attribute:

luceneMatchVersion="4.3"

But... the old behavior is now "deprecated", so it most likely will not be
in Solr 5.0.
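
As a rough sketch of the per-filter option, the attribute would sit on the stop filter element in schema.xml. The field type name, tokenizer, and stopwords file below are assumptions for illustration only, not taken from the original poster's schema:

```xml
<!-- Hypothetical field type; the name "text_nopos" is illustrative only. -->
<fieldType name="text_nopos" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Tokenizer choice is an assumption; use whatever the field already has. -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- luceneMatchVersion="4.3" applies to this filter only, so the
         Lucene 4.4+ check that rejects enablePositionIncrements="false"
         is not triggered for it. -->
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="false"
            luceneMatchVersion="4.3"/>
  </analyzer>
</fieldType>
```

Scoping the version to a single filter leaves the rest of the analysis chain on the current match version, which is presumably why it is the safer of the two options. A reindex is likely needed after changing the analysis chain.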

I'll think about this some more as to whether there might be some workaround
or alternative.

-- Jack Krupansky

-----Original Message-----
From: heaven
Sent: Tuesday, August 26, 2014 6:02 AM
To: [hidden email]
Subject: Re: Help with StopFilterFactory

Hi, just tried your suggestion but get this error:


And then I found this:
http://stackoverflow.com/questions/18668376/solr-4-4-stopfilterfactory-and-enablepositionincrements.

I don't really know why they did this; the stated reason, that "it can create
broken token streams", doesn't make sense to me. Perhaps those who made this
decision don't use Solr and so simply don't care. That's the only explanation
I can find.



--
View this message in context:
http://lucene.472066.n3.nabble.com/Help-with-StopFilterFactory-tp4153839p4155157.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Help with StopFilterFactory

heaven
There is: admit that removing enablePositionIncrements was a bad idea and restore it. Why remove an option that has no alternative just because some people get wrong results with it? I really don't understand this approach. And what should we do now, after spending lots of money on integrating with Solr? Now we have to look for alternatives because of such weird decisions.

Re: Help with StopFilterFactory

Jack Krupansky-2
In reply to this post by heaven
I agree that it's a bad situation, and wasn't handled well by the Lucene
guys. They may have had good reasons, but they didn't execute a decent plan
for how to migrate existing behavior.

-- Jack Krupansky

-----Original Message-----
From: heaven
Sent: Tuesday, August 26, 2014 6:51 AM
To: [hidden email]
Subject: Re: Help with StopFilterFactory

So it sounds like a bug to me, doesn't it? The Internet is full of complaints
about this issue. Why should we all suffer because of someone who didn't
know when and how to use this feature and, as a result, got wrong data indexed?
Who cares about that? And why remove an option that is so useful to the
many people who do know how to use it?





Re: Help with StopFilterFactory

heaven
They did:
>> If this behavior does not fit the application needs, the query parser needs to be configured to not take position increments into account when generating phrase queries.

Another one would be to write your own search engine maybe :)

Re: Help with StopFilterFactory

heaven
In reply to this post by Jack Krupansky-2
Hello,

Any thoughts on this? Should I open a Jira ticket? Or how can we get at least one of the Solr devs to look at this issue?

Best,
Alex

Re: Help with StopFilterFactory

heaven