[lucy-user] Does multivalued field support exist in Lucy?


serkanmulayim@gmail.com
Hi guys,

I would like to understand whether multivalued fields can be defined in
Lucy. To be more specific, can I put multiple StringType objects into a
single field?

I could not find any solution for this, and I do not want to use
RegexTokenizer for a few reasons:
1- I do not want any complexity or limitations in the way I index
tokens.
2- I would like to have a static library that does not depend on PCRE. (I
know this is a second question, but do you know which version of PCRE is
supported? I have 8.39, and I am receiving errors.)

Thanks,
Serkan

Re: [lucy-user] Does multivalued field support exist in Lucy?

Peter Karman
Serkan Mulayim wrote on 11/10/16, 4:38 PM:
> Hi guys,
>
> I would like to understand whether multivalued fields can be defined in
> Lucy. To be more specific, can I put multiple StringType objects into a
> single field?
>

There is no native support in Lucy for multi-valued fields. You would need to
concatenate multiple strings together to store them under a single field.

See http://markmail.org/message/r2dyzgj6pcewaaq4

Dezi supports multi-valued fields through concatenation, using the ASCII byte
\003 (ETX, end of text) as the separator.

See https://metacpan.org/source/KARMAN/Dezi-App-0.014/lib/Dezi/Lucy/Indexer.pm#L110
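
For the C side, the concatenation idea can be sketched roughly as follows. This is
only a minimal illustration in plain C, not Dezi's or Lucy's code: the names
MULTI_VALUE_SEP and join_values are hypothetical, and it assumes the values
themselves never contain the \003 byte.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Separator byte: ASCII ETX (0x03). Hypothetical macro name. */
#define MULTI_VALUE_SEP '\003'

/* Join `count` NUL-terminated strings into one newly allocated string,
 * with MULTI_VALUE_SEP between consecutive values.
 * Returns NULL on allocation failure; the caller must free() the result.
 * Assumption: no input value contains MULTI_VALUE_SEP. */
char *join_values(const char *const *values, size_t count) {
    assert(values != NULL || count == 0);

    /* One byte for the trailing NUL, plus one separator between values. */
    size_t total = 1;
    for (size_t i = 0; i < count; i++) {
        total += strlen(values[i]);
    }
    if (count > 1) {
        total += count - 1;
    }

    char *out = malloc(total);
    if (out == NULL) {
        return NULL;
    }

    char *p = out;
    for (size_t i = 0; i < count; i++) {
        if (i > 0) {
            *p++ = MULTI_VALUE_SEP;
        }
        size_t len = strlen(values[i]);
        memcpy(p, values[i], len);
        p += len;
    }
    *p = '\0';
    return out;
}
```

The joined string would then be stored as the value of a single field, and an
analyzer that splits on the same byte would recover the individual values at
index and search time.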

--
Peter Karman  .  http://peknet.com/  .  [hidden email]

Re: [lucy-user] Does multivalued field support exist in Lucy?

Nick Wellnhofer
On 10/11/2016 23:38, Serkan Mulayim wrote:
> 2- I would like to have a static library that does not depend on PCRE. (I
> know this is a second question, but do you know which version of PCRE is
> supported? I have 8.39, and I am receiving errors.)

This version of PCRE should be supported. What errors are you receiving?

Nick


Re: [lucy-user] Does multivalued field support exist in Lucy?

serkanmulayim@gmail.com
Thanks Peter and Nick for your quick responses.

I was referring to the C library, not Perl; sorry for not mentioning that in
my question.

Peter, regarding the multivalued fields, it seems that I will definitely need
to create a whitespace tokenizer based on RegexTokenizer. Can you or someone
else please confirm? This would create a dependency on PCRE. To make that
work, I would then need to build PCRE as a static library and link it with
Lucy and my code.

Nick, I probably messed up something in my environment. I ran the tests with
the default Makefile of Lucy, which only creates a dynamic library, and
RegexTokenizer passed the test.

Thanks again guys,
Serkan


Re: [lucy-user] Does multivalued field support exist in Lucy?

Nick Wellnhofer
On 11/11/2016 19:46, Serkan Mulayim wrote:
> I was referring to the C library, not Perl; sorry for not mentioning that in
> my question.
>
> Peter, regarding the multivalued fields, it seems that I will definitely need
> to create a whitespace tokenizer based on RegexTokenizer. Can you or someone
> else please confirm? This would create a dependency on PCRE. To make that
> work, I would then need to build PCRE as a static library and link it with
> Lucy and my code.

You can write a custom Analyzer that simply splits on a predefined character.
Have a look at this thread for how to do this in C:

https://lists.apache.org/thread.html/ea5b19eb7a8f688c85c8268b0119282936eb1d097b3b3306d4b909de@1427747314@%3Cdev.lucy.apache.org%3E

Or here with proper indentation:

http://mail-archives.apache.org/mod_mbox/lucy-dev/201503.mbox/%3cCAAS6=7hPSMNA=RrT63q1YPvTS=2Jphzfxu5ArXXS4fEgUGLLDA@...%3e
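
The splitting step itself needs no regex engine. As a rough sketch in plain C
(this is not the code from the linked thread and uses no Lucy API; the names
token_cb and split_on_char are hypothetical), the core loop of such an analyzer
might look like the following. Wiring it into an Analyzer subclass is what the
thread above covers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: receives each token as a (pointer, length)
 * slice into the original buffer, plus a caller-supplied context. */
typedef void (*token_cb)(const char *start, size_t len, void *ctx);

/* Split `text` (of `len` bytes) on the single byte `sep`.
 * Calls `cb` (if non-NULL) once per non-empty token and returns the number
 * of non-empty tokens found. Empty tokens between adjacent separators are
 * skipped. No memory is allocated and no regex library is needed. */
size_t split_on_char(const char *text, size_t len, char sep,
                     token_cb cb, void *ctx) {
    assert(text != NULL || len == 0);

    size_t count = 0;
    size_t start = 0;
    for (size_t i = 0; i <= len; i++) {
        /* A token ends at a separator or at the end of the buffer. */
        if (i == len || text[i] == sep) {
            if (i > start) {
                if (cb != NULL) {
                    cb(text + start, i - start, ctx);
                }
                count++;
            }
            start = i + 1;
        }
    }
    return count;
}
```

Because each token is reported as an offset and length into the input, the
same loop can supply the start and end offsets that a token-producing analyzer
typically needs.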

Nick


Re: [lucy-user] Does multivalued field support exist in Lucy?

serkanmulayim@gmail.com
Thank you Nick, I will give it a try. It sounds really promising.


