Reposting unABLE to match

Reposting unABLE to match

shridharv
Solr <http://localhost:8084/Genie/>


  Solr Admin (GENIE)

ShridharVAIO:8084
cwd=C:\Program Files\netbeans-5.5\enterprise3\apache-tomcat-5.5.17\bin
SolrHome=c:\Documents and Settings\Shridhar\Desktop\Public\Sana\KN\Genie\GenieConf/


    Field Analysis

Field name:
Field value (Index): "unABLE TO CONNECT" (options: verbose output, highlight matches)
Field value (Query): "unABLE TO CONNECT" (options: verbose output)



      Index Analyzer


        org.apache.solr.analysis.HTMLStripWhitespaceTokenizerFactory {}

term position      1         2         3
term text          "unABLE   TO        CONNECT"
term type          word      word      word
source start,end   0,7       8,10      11,19


        org.apache.solr.analysis.SynonymFilterFactory
        {synonyms=synonyms.txt, expand=true, ignoreCase=true}

term position      1         2         3
term text          "unABLE   TO        CONNECT"
term type          word      word      word
source start,end   0,7       8,10      11,19


        org.apache.solr.analysis.StandardFilterFactory {}

term position      1         2         3
term text          "unABLE   TO        CONNECT"
term type          word      word      word
source start,end   0,7       8,10      11,19


        org.apache.solr.analysis.StopFilterFactory
        {words=stopwords.txt, ignoreCase=true}

term position      1         2
term text          "unABLE   CONNECT"
term type          word      word
source start,end   0,7       11,19


        org.apache.solr.analysis.WordDelimiterFilterFactory
        {generateNumberParts=1, catenateWords=1, generateWordParts=1,
        catenateAll=1, catenateNumbers=1}

term position      1         2         3
term text          un        ABLE      CONNECT
                             unABLE
term type          word      word      word
                             word
source start,end   1,3       3,7       11,18
                             1,7


        org.apache.solr.analysis.LowerCaseFilterFactory {}

term position      1         2         3
term text          un        able      connect
                             unable
term type          word      word      word
                             word
source start,end   1,3       3,7       11,18
                             1,7


        org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory {}

term position      1         2         3
term text          un        able      connect
                             unable
term type          word      word      word
                             word
source start,end   1,3       3,7       11,18
                             1,7


      Query Analyzer


        org.apache.solr.analysis.HTMLStripStandardTokenizerFactory {}

term position      1           2           3
term text          unABLE      TO          CONNECT
term type          <ALPHANUM>  <ALPHANUM>  <ALPHANUM>
source start,end   1,7         8,10        11,18


        org.apache.solr.analysis.SynonymFilterFactory
        {synonyms=synonyms.txt, expand=true, ignoreCase=true}

term position      1           2           3
term text          unABLE      TO          CONNECT
term type          <ALPHANUM>  <ALPHANUM>  <ALPHANUM>
source start,end   1,7         8,10        11,18


        org.apache.solr.analysis.StandardFilterFactory {}

term position      1           2           3
term text          unABLE      TO          CONNECT
term type          <ALPHANUM>  <ALPHANUM>  <ALPHANUM>
source start,end   1,7         8,10        11,18


        org.apache.solr.analysis.StopFilterFactory
        {words=stopwords.txt, ignoreCase=true}

term position      1           2
term text          unABLE      CONNECT
term type          <ALPHANUM>  <ALPHANUM>
source start,end   1,7         11,18


        org.apache.solr.analysis.WordDelimiterFilterFactory
        {generateNumberParts=1, catenateWords=1, generateWordParts=1,
        catenateAll=1, catenateNumbers=1}

term position      1           2           3
term text          un          ABLE        CONNECT
                               unABLE
term type          <ALPHANUM>  <ALPHANUM>  <ALPHANUM>
                               <ALPHANUM>
source start,end   1,3         3,7         11,18
                               1,7


        org.apache.solr.analysis.LowerCaseFilterFactory {}

term position      1           2           3
term text          un          able        connect
                               unable
term type          <ALPHANUM>  <ALPHANUM>  <ALPHANUM>
                               <ALPHANUM>
source start,end   1,3         3,7         11,18
                               1,7


        org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory {}

term position      1           2           3
term text          un          able        connect
                               unable
term type          <ALPHANUM>  <ALPHANUM>  <ALPHANUM>
                               <ALPHANUM>
source start,end   1,3         3,7         11,18
                               1,7




Re: Reposting unABLE to match

Bertrand Delacretaz
On 3/27/07, Shridhar Venkatraman <[hidden email]> wrote:

...Reposting unABLE to match

No need to repost if your message made it to the list.

If it hasn't been answered yet, it either means that no one knows the
answer or that no one has had the time to answer yet. We're all
volunteers here.

-Bertrand

Re: Reposting unABLE to match

Maarten.De.Vilder
In reply to this post by shridharv
What exactly is the problem?

It seems you end up with the same term text in both the query and the index
analyzer, so you should have found a match.





From: Shridhar Venkatraman <[hidden email]>
Date: 27/03/2007 14:08
Reply-To: [hidden email]
To: [hidden email]
Subject: Reposting unABLE to match










Re: Reposting unABLE to match

shridharv
In reply to this post by Bertrand Delacretaz
Sorry for the repeated postings.

I was reposting only because the email text that explained the problem disappeared.
This is what was at the head of the email and did not get posted previously:

    Hi,
    Sorry for the multiple postings.
    My email text did not get posted along with the attachment; I don't know why.
    Here it is again.

    The phrase "unABLE TO CONNECT" does not match in my system. However, any combination of case is OK as long as the first letter, "U", is in uppercase.

    Bad  -> uNABLE, unABLE, unaBLE...
    Good -> Unable, UNable, UNAble...

    I have attached the output of Solr Admin's Analysis page.

    Could someone point the way?

    Thanks
    Shridhar

Regards
Shridhar






Re: Reposting unABLE to match

Maarten.De.Vilder
In reply to this post by shridharv
The only thing I can think of is that in the index analysis the term type
is "word", while in the query analysis it is "<ALPHANUM>".

If that doesn't matter, you should be getting a match, since you get exactly
the same term texts.










Re: Reposting unABLE to match

shridharv
The phrase "unABLE TO CONNECT" does not match in my system. However, any combination of case is OK as long as the first letter, "U", is in uppercase.

    Bad  -> uNABLE, unABLE, unaBLE...
    Good -> Unable, UNable, UNAble...

Any ideas?

Re: Reposting unABLE to match

Yonik Seeley-2
On 3/27/07, Shridhar Venkatraman <[hidden email]> wrote:
>  The phrase "unABLE TO CONNECT" does not match in my system. However, any
>      combination of case is ok as long as the first letter 'U" is in
> uppercase.
>
>      Bad-> uNABLE, unABLE, unaBLE....
>      Gud-> Unable, UNable, UNAble...
>
>  Any ideas ?

The cause is WordDelimiterFilter:

lowercase to uppercase transition => split
uppercase to lowercase transition => no split (so capitalized words, and
words like IBMs, won't cause a split).
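
To make the rule concrete, here is a minimal Python sketch of just the case-transition part. It is an illustration, not Solr's implementation: the function name `split_on_case_change` is made up for this example, and the real WordDelimiterFilter also handles digits, punctuation, and its generate/catenate options.

```python
def split_on_case_change(token):
    """Split a token wherever a lowercase letter is followed by an uppercase one.

    Hypothetical helper for illustration only; it models just the
    lowercase-to-uppercase rule described above.
    """
    parts = []
    start = 0
    for i in range(1, len(token)):
        # lower -> upper is a split point; upper -> lower is not
        if token[i - 1].islower() and token[i].isupper():
            parts.append(token[start:i])
            start = i
    parts.append(token[start:])
    return parts


if __name__ == "__main__":
    for word in ["uNABLE", "unABLE", "unaBLE", "Unable", "UNable", "UNAble", "IBMs"]:
        print(word, "->", split_on_case_change(word))
```

In this sketch the three "Bad" spellings each contain a lowercase-to-uppercase transition and come back as two parts, while the three "Good" spellings and IBMs come back whole.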

Either configure WordDelimiterFilter differently (use catenation but
not generation), or remove it altogether.
Don't forget to re-index after you have made changes.

-Yonik
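
A rough Python model of why "catenation but not generation" would help. It rests on two assumptions that the thread does not state: that the indexed documents contain the text "Unable to connect", and that the query terms have to line up position by position with the indexed terms, as in a phrase query. All function names are hypothetical, the tokenizer is a plain whitespace split, and the logic is a simplification rather than Solr's code.

```python
STOPWORDS = frozenset({"to"})  # assumed: "to" is listed in stopwords.txt


def split_on_case_change(token):
    """Simplified stand-in for the filter's lowercase-to-uppercase split rule."""
    parts, start = [], 0
    for i in range(1, len(token)):
        if token[i - 1].islower() and token[i].isupper():
            parts.append(token[start:i])
            start = i
    parts.append(token[start:])
    return parts


def analyze(text, generate_word_parts, catenate_words):
    """Return one set of lowercased terms per token position."""
    positions = []
    for raw in text.split():
        if raw.lower() in STOPWORDS:
            continue  # stop words are dropped without leaving a gap
        parts = split_on_case_change(raw)
        if len(parts) == 1:
            positions.append({raw.lower()})  # nothing to split
            continue
        if generate_word_parts:
            positions.extend({p.lower()} for p in parts)
            if catenate_words:
                # the joined form shares the position of the last part
                positions[-1].add(raw.lower())
        elif catenate_words:
            positions.append({raw.lower()})  # only the joined form is kept
    return positions


def phrase_matches(indexed, query):
    """True if the query positions line up with consecutive indexed positions."""
    span = len(query)
    return any(
        all(indexed[i + j] & query[j] for j in range(span))
        for i in range(len(indexed) - span + 1)
    )


def matches(doc_text, query_text, generate_word_parts, catenate_words):
    # The same settings are applied on both sides, which is why a
    # configuration change requires re-indexing.
    indexed = analyze(doc_text, generate_word_parts, catenate_words)
    query = analyze(query_text, generate_word_parts, catenate_words)
    return phrase_matches(indexed, query)


if __name__ == "__main__":
    doc = "Unable to connect"  # assumed indexed text
    for gen, cat in [(True, True), (False, True)]:
        for q in ["unABLE TO CONNECT", "Unable to connect"]:
            print(f"generate={gen} catenate={cat} query={q!r}:",
                  matches(doc, q, gen, cat))
```

In this model, with generation and catenation both on, the query unABLE TO CONNECT yields three positions (un, able/unable, connect) against two indexed positions (unable, connect), so it fails, while Unable to connect matches. With catenation only, both spellings reduce to unable, connect and match.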

Re: Reposting unABLE to match

Chris Hostetter-3
In reply to this post by shridharv

:     Sorry for this multiple postings...
:     My email text did not get posted along with the attachment, don't know why ?
:     Here it is again.

In general: don't use attachments; paste text directly into the body of
your email. The attachment may have had something to do with your problem.


-Hoss