get the position of matched word in the response

7 messages


Hi, I'm new to Solr, so please be patient.

How can I get the position of a matched word in the results? And no, I'm not talking about highlighting the words. I'm talking about getting the position of the word within the content.

I have a field named content, which I query with q=content:"some_word".

The content field is not stored, but it is:
Indexed + Tokenized + Multivalued + TermVector Stored + Store Offset With TermVector + Store Position With TermVector

Thanks for the help.
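To make the distinction in the question concrete, here is a minimal sketch of the difference between a word position (counted in tokens) and a character offset. It assumes naive whitespace tokenization and zero-based positions, which is not necessarily what the analyzer on the content field produces; the function name and sample text are made up for illustration.

```python
import re


def word_positions(text, word):
    """Return (position, start_offset, end_offset) for each occurrence of word.

    Assumption: tokens are whitespace-separated runs, compared case-insensitively.
    A real Solr analyzer may split and normalize text differently.
    """
    matches = []
    for position, m in enumerate(re.finditer(r"\S+", text)):
        if m.group().lower() == word.lower():
            # position counts tokens; start/end count characters
            matches.append((position, m.start(), m.end()))
    return matches


content = "hello jungle, said the story; hello again"
print(word_positions(content, "hello"))
```

The first number in each tuple is what the question calls the position; the other two are the character offsets that highlighting normally works with.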

Re: get the position of matched word in the response

Eli:

What problem are you trying to solve? There’s no really convenient way to do this that I know of, although it could be done, probably with some Lucene-level code.

This may be an XY problem, where you're asking how to do X (find the position of the matched word) because you think it’ll help solve some problem Y. What’s “Y”? Perhaps there’s an easier way to solve that problem if we knew what it was.

Best,
Erick

> On Aug 4, 2019, at 6:55 AM, eli chen <[hidden email]> wrote:
>
> Hi, I'm new to Solr, so please be patient.
> How can I get the position of a matched word in the results?
>
> And no, I'm not talking about highlighting the words. I'm talking about
> getting the position of the word within the content.
>
> I have a field named content, which I query with q=content:"some_word".
>
> The content field is not stored, but it is:
> Indexed + Tokenized + Multivalued + TermVector Stored + Store Offset With
> TermVector + Store Position With TermVector
>
> Thanks for the help.

Re: get the position of matched word in the response

Every content field is actually the content of a book. So let's say someone searches for the word "hello" and I find this word in the book "the story jungle" at position 199 (counted by word, not by character).

Now I can look in my database and check the OCR data for this word in this book (and show a highlight on the page image, etc.).

My DB is roughly like this (simplified):

book     word     ocr
------   ------   ---------
th....   199      1,1,1,1

That's the reason I need the offset of the word.

By the way, the content field is just one big text_general field.

Thanks again.

On Sun, Aug 4, 2019 at 14:30, Erick Erickson <[hidden email]> wrote:

> Eli:
>
> What problem are you trying to solve? There’s no really convenient way to
> do this that I know of, although it could be done, probably with some
> Lucene-level code.
>
> This may be an XY problem, where you're asking how to do X (find the
> position of the matched word) because you think it’ll help solve some
> problem Y. What’s “Y”? Perhaps there’s an easier way to solve that problem
> if we knew what it was.
>
> Best,
> Erick
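The database lookup described in this message could be sketched roughly as follows. This is only an illustration under stated assumptions: the table name word_ocr, its column names, and the helper ocr_for are hypothetical, the OCR value is assumed to be four comma-separated integers as in the simplified table, and an in-memory SQLite database stands in for whatever database is actually used.

```python
import sqlite3

# Hypothetical schema mirroring the simplified table: one row per word position.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE word_ocr ("
    " book TEXT, word_pos INTEGER, ocr TEXT,"
    " PRIMARY KEY (book, word_pos))"
)
conn.execute(
    "INSERT INTO word_ocr VALUES (?, ?, ?)",
    ("the story jungle", 199, "1,1,1,1"),
)


def ocr_for(conn, book, positions):
    """Map each matched word position to its OCR values, skipping unknown positions."""
    results = {}
    for pos in positions:
        row = conn.execute(
            "SELECT ocr FROM word_ocr WHERE book = ? AND word_pos = ?",
            (book, pos),
        ).fetchone()
        if row is not None:
            results[pos] = [int(v) for v in row[0].split(",")]
    return results


# Positions would come from the search side; 199 is the example from the message.
print(ocr_for(conn, "the story jungle", [199]))
```

The point of the sketch is that the search side only has to return (book, word position) pairs; everything needed to draw the highlight then comes from the database.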

Re: get the position of matched word in the response

One approach: payloads. You can store, with each word, an arbitrary amount of data. Of course the index is bigger. Most of the examples use a single float, which could be all you need. You can also store an arbitrary binary blob and encode/decode it however you want. Conceivably you could store the coordinates of the word along with the position and not need to consult the DB at all.

That said, be prepared to spend some time on this; it’s not necessarily an easy problem to solve. How many positions are you going to return? All of them in the document? How are you going to handle phrase queries? Highlight any individual word matches, or only highlight the occurrences of all the words in the phrase together?

For that matter, you’ll have to write some code to actually return the payloads with the results.

HTH,
Erick

> On Aug 4, 2019, at 7:45 AM, eli chen <[hidden email]> wrote:
>
> Every content field is actually the content of a book. So let's say someone
> searches for the word "hello" and I find this word in the book "the story
> jungle" at position 199 (counted by word, not by character).
>
> Now I can look in my database and check the OCR data for this word in this
> book (and show a highlight on the page image, etc.).
>
> My DB is roughly like this (simplified):
>
> book     word     ocr
> ------   ------   ---------
> th....   199      1,1,1,1
>
> That's the reason I need the offset of the word.
>
> By the way, the content field is just one big text_general field.
>
> Thanks again.
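The "arbitrary binary blob" idea from this reply might look something like the sketch below. It only shows one possible encode/decode scheme and is not Solr code: the byte layout, the function names, and the word|payload text form are assumptions chosen for illustration. Feeding such text through a delimited payload filter with an identity encoder is one way it might be indexed, but that wiring, and the code needed to return payloads with results, is not shown here.

```python
import base64
import struct

# Hypothetical layout: big-endian int32 word position + four uint16 OCR values.
_LAYOUT = ">iHHHH"


def encode_payload(position, box):
    """Pack a word position and a 4-value OCR box into a 12-byte blob."""
    return struct.pack(_LAYOUT, position, *box)


def decode_payload(blob):
    """Inverse of encode_payload: return (position, (v1, v2, v3, v4))."""
    position, *box = struct.unpack(_LAYOUT, blob)
    return position, tuple(box)


def to_delimited_token(word, position, box, delimiter="|"):
    """Render 'word|<base64 payload>' text; base64 never contains '|'."""
    text = base64.b64encode(encode_payload(position, box)).decode("ascii")
    return f"{word}{delimiter}{text}"


token = to_delimited_token("hello", 199, (1, 1, 1, 1))
word, encoded = token.split("|", 1)
print(word, decode_payload(base64.b64decode(encoded)))
```

A fixed-width layout keeps every payload at 12 bytes here, which makes the index growth mentioned above easy to estimate: 12 bytes (16 characters as base64 text) per indexed word.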