How on EARTH do I remove 's in schema file?


donato
I have been racking my brain for days... I need to remove 's from, say, "patrick's". If I search for "patrick" or "patricks" I get the same number of results; however, if I search for "patrick's" I get a different number. I just want Solr to ignore the 's. Can someone PLEASE help me!!!! It is driving me nuts!!!! Here is my schema file...

Re: How on EARTH do I remove 's in schema file?

Erick Erickson
Your schema file didn't come through. Have you tried looking at the
admin UI/Analysis page for the three values? That often tells you what
is going on.

The other thing to do is attach &debug=query to the URL. That'll show
you how the query parsed, which is separate from the analysis bits.

Best,
Erick


Re: How on EARTH do I remove 's in schema file?

Erick Erickson
What stemmers are you using? I got the results I expected by using
EnglishPossessiveFilterFactory followed by PorterStemFilterFactory.

Or you could use Porter and remove the leftover trailing apostrophe.
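
For reference, the second option (Porter first, then strip the leftover apostrophe) might be sketched roughly as below. The field type name `text_porter_strip` is hypothetical, and using PatternReplaceFilterFactory for the clean-up is an assumption; the thread does not say which filter to use for that step.

```xml
<!-- Hypothetical field type; goes inside <schema> in schema.xml -->
<fieldType name="text_porter_strip" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Porter removes the plural/possessive "s" but can leave the apostrophe behind -->
    <filter class="solr.PorterStemFilterFactory"/>
    <!-- Assumed clean-up step: drop an apostrophe left at the end of a token -->
    <filter class="solr.PatternReplaceFilterFactory" pattern="'$" replacement=""/>
  </analyzer>
</fieldType>
```

Checking the result on the admin UI Analysis page is the safest way to confirm what each step actually emits.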

Best,
Erick


Re: How on EARTH do I remove 's in schema file?

donato
Thanks for the response, Erick!

Can you download my schema file here? CLICK HERE <https://we.tl/PHEknbNa5Z>.

I'm not too familiar with this technology yet. I tried adding &debug=query at the end of my URL, but nothing happened.

Thanks again for the response! All along, I just wanted queries for cat, cats, kitten and kitties to return the same number of results, and they do, partly because of the synonyms.txt file.

But this apostrophe thing is killing me!

Re: How on EARTH do I remove 's in schema file?

Erick Erickson
bq: I'm not too familiar with this technology yet. I tried adding that
&debug=query at the end of my URL, but nothing happened.

You need to look at the raw response. There should be a section at the
end of the response where debug information is appended.

Please just paste the relevant bits of your xml file inline for the
field you're considering. The admin UI>>select core>>analysis page is
_really_ your friend here.

Best,
Erick


Re: How on EARTH do I remove 's in schema file?

donato
Erick,

Here is the analysis: https://www.screencast.com/t/DKKklTXk

Do you need everything on that page? I'm not sure what I am looking for here...

Also, this is my current schema.xml file DOWNLOAD HERE. Not sure if I have something in the wrong place/order?

Thanks again! I really need this done by Monday...

Cheers.

Re: How on EARTH do I remove 's in schema file?

vishal jain
In reply to this post by donato
Try "stemEnglishPossessive" to remove it.
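
This presumably refers to the `stemEnglishPossessive` option of WordDelimiterFilterFactory. A rough sketch of where it would go, assuming that filter is part of the analyzer chain; the tokenizer and the other attribute values here are illustrative, not taken from the poster's schema:

```xml
<!-- Fragment: this <analyzer> belongs inside a <fieldType> definition -->
<analyzer>
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <!-- stemEnglishPossessive="1" removes a trailing 's from each subword -->
  <filter class="solr.WordDelimiterFilterFactory"
          generateWordParts="1"
          catenateWords="0"
          stemEnglishPossessive="1"/>
  <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
```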


Re: How on EARTH do I remove 's in schema file?

Erick Erickson
First, uncheck the "verbose" checkbox. The nitty-gritty information
isn't relevant at this point.

Second, hover over each of the light-gray labels like "MCF", "PRCF" and such.
You'll see the element of the analysis chain that each one stands for, and the
difference between the line before and this line is the effect of that
element. For instance, on the query side you see that "patrick" is
turned into "patrick", "patricks" and "patrick's" by "SF" which I'd
guess is your SynonymFilter. But hovering over that will tell you
exactly what element is producing those changes.

Then it looks like you're using HTMLStripCharFilter, MappingCharFilter
and PatternReplaceCharFilter (Factories all). Why do you think all
those are necessary?

So stop. Take a deep breath. My guess is that you've been trying a
bunch of different approaches and the interactions of all the
different parts are throwing you off. Start simple, with say
StandardTokenizerFactory
LowerCaseFilterFactory
EnglishPossessiveFilterFactory
PorterStemFilterFactory
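
As a sketch, that starting chain could be written as a field type like this. The name `text_simple` is hypothetical; the four factories are the ones listed above, in the same order:

```xml
<!-- Hypothetical field type; goes inside <schema> in schema.xml -->
<fieldType name="text_simple" class="solr.TextField" positionIncrementGap="100">
  <!-- A single <analyzer> is used for both indexing and querying -->
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Strips a trailing 's: patrick's -> patrick -->
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <!-- Stems plurals and other suffixes: patricks -> patrick -->
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
```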

Use the analysis page and work your way toward complexity. Concentrate
on the indexing side first. Enter all three of your variants (jack
jacks jack's) in the box and press the button. Do not pass go. Do not
collect $200 until you see the effects of your changes on the
analysis page.

Your stated goal here is that all of your variants reduce to "jack" in
the example above. Don't bother querying until you see that result in
your index.

Tip: It is a bit clumsy to have to restart Solr every time you make
changes in your schema (although if you're running stand-alone you can
reload the core). So I often define several different field types with
different possibilities and compare them after a single reload.
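
A sketch of that tip, with two hypothetical field types defined side by side so both can be compared on the Analysis page after one reload. The names `text_try_a` and `text_try_b` and the choice of variants are assumptions for illustration:

```xml
<!-- Variant A: possessive filter before the stemmer -->
<fieldType name="text_try_a" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Variant B: stemmer only, to see what it does with jack's on its own -->
<fieldType name="text_try_b" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
```

Entering "jack jacks jack's" against each type in turn shows which variant reduces all three to the same token.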

Best,
Erick


Re: How on EARTH do I remove 's in schema file?

donato
Thank you so much, Erick! I will try that!

I do have one other question though... what sections do I do all of this in? I see like four or five sections with different things in them. Do I use all of those in each section or just in some? What is each section? What do they do?

Thanks again for your time. Truly. Thank you!



Re: How on EARTH do I remove 's in schema file?

Erick Erickson
OK, you're defining a <fieldType>. It has one or two sections,
<analyzer type="index"> blah blah blah
<analyzer type="query">

For the time being, these should pretty much be very, very similar if
not identical.

If you only have
<analyzer> in the fieldType, then the same analysis chain is used both
for indexing and querying.
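
To illustrate the two layouts, here is a rough sketch with hypothetical type names (`text_split`, `text_shared`). The filter chains are placeholders, kept identical on the index and query sides as suggested above:

```xml
<!-- Layout 1: separate chains for indexing and querying -->
<fieldType name="text_split" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Layout 2: one chain shared by indexing and querying -->
<fieldType name="text_shared" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
```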

The admin UI analysis page can either show you specific fields or
there's a "types" (or "fieldTypes", I forget which) later on in the
drop-down. So to see the analysis results, all you have to do is
define a fieldType and reload/restart.

Once you're satisfied with the fieldType, you assign it to a specific
field with the
<field..../> tag
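
For example, assigning a type to a field might look like the line below. `Name` is the field mentioned elsewhere in this thread; the type name `text_simple` and the other attribute values are assumptions:

```xml
<!-- type must match the name attribute of a <fieldType> defined in the same schema -->
<field name="Name" type="text_simple" indexed="true" stored="true"/>
```

Documents indexed before the change keep their old tokens, so reindexing is normally needed after changing a field's index-time analysis.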

Have fun!
Erick


Re: How on EARTH do I remove 's in schema file?

donato
Thanks, Erick!

Okay. Clearly the StandardTokenizerFactory is keeping the 's. I have commented almost everything out, including the stopwords, which doesn't even HAVE patrick's in there.

This is what the analyzer is showing now for the Name field (Fieldname / FieldType):

ST | patrick's - StandardTokenizerFactory
SF | patrick - SynonymFilter
EPF | patrick - English possessive filter
PSF | patrick - Porter stem filter
LCF | patrick - lowercase filter

However, this is what it is saying for Tag

DA | patrick's - FieldType$DefaultAnalyzer$1 - whatever that means

It's still showing different results between patrick and patrick's. I don't understand why those other filters are not having ANY effect. This is killing me!

 

Re: How on EARTH do I remove 's in schema file?

donato
In reply to this post by Erick Erickson
And here is my most recent schema.xml file after your suggestions... DOWNLOAD HERE <https://we.tl/OskBwQwWAo>

Re: How on EARTH do I remove 's in schema file?

John Blythe
StandardTokenizer IS removing it. The token you see in each line is what is
passed _in_ to the tokenizer. The next line shows what came out.


--
--
*John Blythe*
Product Manager & Lead Developer

251.605.3071 | [hidden email]
www.curvolabs.com

58 Adams Ave
Evansville, IN 47713

Re: How on EARTH do I remove 's in schema file?

donato
Then why is it not working? It doesn't make sense at all? And in the Tag field, it appears NOTHING is happening... It states something about a DefaultAnalyzer?

I am seriously at a loss... this seems like a simple solution that shouldn't be this hard! And it should be somewhat expected. 

Why is the Tag field not using the analyzer and filters I have in place?




Re: How on EARTH do I remove 's in schema file?

John Blythe
it is working. the 's is stripped in the next line. the token doesn't
change as the analysis chain progresses bc there is nothing to change. you
seem to have no synonyms for "patrick" in your synonym file.
EnglishPossessive has no possessive to strip (ST already did that).
porterstem stems "patrick" to "patrick". and LowerCase filter has no work
to do because it's already in lower case.

as to your "tag" field, <field name="Tag" type="string" indexed="true"
stored="true" multiValued="true" required="false"/>

that's a string type. what goes in comes out, nothing more and nothing
less. your definitions above aren't for the tag field, they're for other
fields. you can't update your "text" field's definitions and expect "tag"
to inherit them, they'll need to be set as well
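
To make the contrast concrete, a sketch of what is probably in the schema. The `string` definition shown is the one from the stock Solr example schemas and is an assumption about this particular file; the `Tag` line is the one quoted above:

```xml
<!-- Assumed definition: StrField stores and indexes the value verbatim, with no analysis chain -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true"/>

<!-- Because Tag uses "string", none of the tokenizer/filter settings on "text" apply to it -->
<field name="Tag" type="string" indexed="true" stored="true" multiValued="true" required="false"/>
```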


Re: How on EARTH do I remove 's in schema file?

donato
Thank you so much, John! How do I set the Tag field then too? I think that is the problem, correct?

Sorry, but I just inherited this recently...




Re: How on EARTH do I remove 's in schema file?

Ahmet Arslan
In reply to this post by donato
Hi Donato,

How about using ApostropheFilterFactory?


http://lucene.apache.org/core/6_4_2/analyzers-common/org/apache/lucene/analysis/tr/ApostropheFilter.html
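
A minimal sketch of a chain using that filter; the field type name `text_apostrophe` and the surrounding tokenizer and lowercase filter are assumptions:

```xml
<!-- Hypothetical field type; goes inside <schema> in schema.xml -->
<fieldType name="text_apostrophe" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Removes the apostrophe and every character after it -->
    <filter class="solr.ApostropheFilterFactory"/>
  </analyzer>
</fieldType>
```

Note that this filter was written for Turkish and cuts everything after the apostrophe, not just a possessive 's, so a token like o'neil would be reduced to o.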

Ahmet


Re: How on EARTH do I remove 's in schema file?

John Blythe
In reply to this post by donato
easiest means for PoC: you could copy your "text" field definition (find
this: <fieldType name="text" class="solr.TextField"
positionIncrementGap="100">...</fieldType>), paste it, and then change the
"name" property to "tag"

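Roughly, the proof of concept above could end up looking like this. The analyzer contents are a stand-in, since the real "text" definition is not shown in the thread; pointing the `Tag` field at the new type is an assumed follow-up step that the copy-and-rename implies:

```xml
<!-- Copy of the "text" type under a new name; paste the real <analyzer> sections from "text" here -->
<fieldType name="tag" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

<!-- The Tag field now references the analyzed type instead of "string" -->
<field name="Tag" type="tag" indexed="true" stored="true" multiValued="true" required="false"/>
```

Existing documents would need to be reindexed before searches on Tag reflect the new analysis.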

On Sun, Mar 19, 2017 at 9:43 AM, donato <[hidden email]> wrote:

> Thank you so much, John! How do I set the Tag field then too? I think that
> is the problem, correct?
>
> Sorry, but I just inherited this recently...
>
>
> ________________________________
> From: John Blythe [via Lucene] <[hidden email]>
> Sent: Sunday, March 19, 2017 9:17:52 AM
> To: donato
> Subject: Re: How on EARTH do I remove 's in schema file?
>
> it is working. the 's is stripped in the next line. the token doesn't
> change as the analysis chain progresses bc there is nothing to change. you
> seem to have no synonyms for "patrick" in your synonym file.
> EnglishPossessive has no possessive to strip (ST already did that).
> porterstem stems "patrick" with "patrick". and LowerCase filter has no work
> to do because it's already in lower case.
>
> as to your "tag" field, <field name="Tag" type="string" indexed="true"
> stored="true" multiValued="true" required="false"/>
>
> that's a string type. what goes in comes out, nothing more and nothing
> less. your definitions above aren't for the tag field, they're for other
> fields. you can't update your "text" field's definitions and expect "tag"
> to inherit them, they'll need to be set as well
>
> --
> *John Blythe*
> Product Manager & Lead Developer
>
> 251.605.3071 | [hidden email]
> www.curvolabs.com
>
> 58 Adams Ave
> Evansville, IN 47713
>
> On Sun, Mar 19, 2017 at 9:08 AM, donato <[hidden email]> wrote:
>
> > Then why is it not working? It doesn't make sense at all? And in the Tag
> > field, it appears NOTHING is happening... It states something about a
> > DefaultAnalyzer?
> >
> > I am seriously at a loss... this seems like a simple solution that
> > shouldn't be this hard! And it should be somewhat expected.
> >
> > Why is the Tag field not using the analyzer and filters I have in place?
> >
> >
> > ________________________________
> > From: John Blythe [via Lucene] <[hidden email]>
> > Sent: Sunday, March 19, 2017 9:04:29 AM
> > To: donato
> > Subject: Re: How on EARTH do I remove 's in schema file?
> >
> > StandardTokenizer IS removing it. The token you see in each line is what is
> > passed _in_ to the tokenizer. The next line shows what came out.
> >
> > On Sun, Mar 19, 2017 at 9:00 AM donato <[hidden email]> wrote:
> >
> > > And here is my most recent schema.xml file after your suggestions... *
> > > <https://we.tl/OskBwQwWAo> DOWNLOAD HERE*
> > >
> > >
> > >
> >
> > --
> > --
> > *John Blythe*
> > Product Manager & Lead Developer
> >
> > 251.605.3071 | [hidden email]
> > www.curvolabs.com
> >
> > 58 Adams Ave
> > Evansville, IN 47713
> >
> >
>
>
>
Reply | Threaded
Open this post in threaded view
|

Re: How on EARTH do I remove 's in schema file?

donato
In reply to this post by donato
Hi John,

I actually do have <field name="Tag" type="string" indexed="true"
stored="true" multiValued="true" required="false"/> in there already...

Thanks.
Reply | Threaded
Open this post in threaded view
|

Re: How on EARTH do I remove 's in schema file?

John Blythe
You do. That's the problem. It's a string type, so it doesn't have the
analysis chain that "text" does.
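For context, the stock Solr example schemas usually declare the "string" type roughly as below. That is an assumption about your file, so check your own definition. A StrField is not tokenized or filtered, so a value like "patrick's" is indexed as that one exact value and only matches a query for exactly that value.

```xml
<!-- Assumed stock definition: StrField values are indexed verbatim; no tokenizer or filters run -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true"/>

<!-- The Tag field as quoted in this thread: it uses the unanalyzed type above -->
<field name="Tag" type="string" indexed="true" stored="true" multiValued="true" required="false"/>
```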
On Sun, Mar 19, 2017 at 12:47 PM donato <[hidden email]> wrote:

> Hi John,
>
> I actually do have <field name="Tag" type="string" indexed="true"
> stored="true" multiValued="true" required="false"/> in there already...
>
> Thanks.
>
>
>
>
--
*John Blythe*
Product Manager & Lead Developer

251.605.3071 | [hidden email]
www.curvolabs.com

58 Adams Ave
Evansville, IN 47713