How to rename fields in an index

How to rename fields in an index

Antoine Baudoux-2
Is it possible to rename fields in an existing index without having  
to re-index all documents?

thx


--
Antoine Baudoux
Development Manager
[hidden email]
Tél.: +32 2 333 58 44
GSM: +32 499 534 538
Fax.: +32 2 648 16 53



RE: How to rename fields in an index

Jun.Chen-2

Document.removeField()
Then add()?
Is this what you mean?


This e-mail and any files transmitted with it are for the sole use of the intended recipient(s) and may contain confidential and privileged information.
If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message.
Any unauthorized review, use, disclosure, dissemination, forwarding, printing or copying of this email or any action taken in reliance on this e-mail is strictly
prohibited and may be unlawful.

  Visit us at http://www.cognizant.com

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: How to rename fields in an index

Antoine Baudoux-2
No, I just want to change the field labels.

For example, I have a "Keyword" field that I want to rename to "kw".

Re: How to rename fields in an index

Andrzej Białecki-2
Antoine Baudoux wrote:
> No, i just want to change the field labels.
>
> For example, i have a "Keyword" field that i want to rename into "kw".

(Note: this is a low-level hack; you can damage your index beyond repair.)

Take a look at the FieldInfos class and how it creates the *.fnm file for
each segment. You can rewrite these *.fnm files using the new field names. In
the case of compound indexes, you will first need to "explode" them into the
non-compound format.

Make sure you write out these files using exactly the same order of
fields, otherwise you will end up in big trouble ;)
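
As a rough illustration of the rewrite described above, here is a minimal sketch in plain Java that works on an in-memory copy of a single .fnm file. It assumes the Lucene 2.x-era layout (a VInt field count, then one length-prefixed name and one flag byte per field) and ASCII-only field names; both are assumptions that should be checked against the FieldInfos source of the Lucene version in use. FnmRenameSketch and its methods are hypothetical names, not part of Lucene.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Map;

/** Sketch only: renames fields in an in-memory copy of one .fnm file. */
final class FnmRenameSketch {

    // Variable-length int: 7 data bits per byte, high bit set means "more bytes follow".
    static int readVInt(ByteBuffer in) {
        int value = 0;
        for (int shift = 0; ; shift += 7) {
            byte b = in.get();
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) return value;
        }
    }

    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Assumption: ASCII names only, so the character count equals the byte count.
    static byte[] asciiBytes(String name) {
        for (int i = 0; i < name.length(); i++) {
            if (name.charAt(i) > 0x7F) {
                throw new IllegalArgumentException("non-ASCII field name: " + name);
            }
        }
        return name.getBytes(StandardCharsets.US_ASCII);
    }

    /** Copies the field table in its original order, swapping in new names where a mapping exists. */
    static byte[] rename(byte[] fnm, Map<String, String> renames) {
        ByteBuffer in = ByteBuffer.wrap(fnm);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int fieldCount = readVInt(in);
        writeVInt(out, fieldCount);
        for (int i = 0; i < fieldCount; i++) {
            byte[] raw = new byte[readVInt(in)];
            in.get(raw);
            for (byte b : raw) {
                if (b < 0) throw new IllegalArgumentException("non-ASCII field name in file");
            }
            String oldName = new String(raw, StandardCharsets.US_ASCII);
            byte[] newName = asciiBytes(renames.getOrDefault(oldName, oldName));
            writeVInt(out, newName.length);
            out.write(newName, 0, newName.length);
            out.write(in.get()); // the per-field flag byte is copied verbatim
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Hand-built sample: one field named "Keyword" with flag byte 1.
        byte[] sample = {1, 7, 'K', 'e', 'y', 'w', 'o', 'r', 'd', 1};
        byte[] renamed = rename(sample, Collections.singletonMap("Keyword", "kw"));
        System.out.println(renamed.length + " bytes after rename");
    }
}
```

Fields are copied in their original order and the flag bytes pass through untouched, which is the property the warning above is about. The same mapping would have to be applied to the .fnm file of every segment.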


--
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: How to rename fields in an index

Antoine Baudoux-2
Thanks!

Re: How to rename fields in an index

Erick Erickson
Unless it's really, really, really prohibitive or impossible,
I'd recommend regenerating your index. Messing around in
the low-level file formats is just asking for trouble. Not to
mention that you'll probably have to remanufacture your
index sometime, somewhere, and either hack it all over again or
*hope* that your code changes match your new
index. Whereas if you remanufacture with "kw", you'll
be sure things are consistent.

Or worst of all, regenerate the index and have some other
poor soul try to figure out what the heck is going on with the
application. "It doesn't work whenever I search on the 'kw' field".

I guess, if I were looking at it, I'd have to say that either
making a new index so I could use "kw" rather than
"keyword" was valuable enough to remanufacture the
index or not valuable enough to do <G>...

Best
Erick


Re: How to rename fields in an index

Andrzej Białecki-2
Alternatively, we could just take this code and add it to
IndexReader.renameField(String old, String new) ... ;)



Re: How to rename fields in an index

Antoine Baudoux-2
In reply to this post by Erick Erickson
Re-indexing would take a lot of time.


In fact, I need this change only for the QueryParser class, so that I can
write queries such as kw:blah. Right now my field is called
"org.mycompany.mediafield.keyword", which is not very convenient to type
in a query!

It would be cool if QueryParser had some sort of field name
aliasing functionality, so that kw:blah is automatically translated
into org.mycompany.mediafield.keyword:blah


Antoine

Re: How to rename fields in an index

Erik Hatcher

On Aug 22, 2007, at 11:02 AM, Antoine Baudoux wrote:
> In fact I need this change just for the Query parser class, to be  
> able to make queries such as kw:blah. Now my field is called  
> "org.mycompany.mediafield.keyword". Not very easy to make queries  
> with this field name!
>
>
>  It would be cool if QueryParser had some sort of field name  
> aliasing functionality, so that kw:blah is automatically translated  
> into  org.mycompany.mediafield.keyword:blah

You could, instead, do a string substitution on the string the user  
enters and replace "kw:" with something else.  That'd be quick and  
dirty and probably pretty robust too.
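
A minimal sketch of that quick-and-dirty substitution, in plain Java and independent of Lucene. FieldAliasRewriter is a hypothetical helper, and the alias and field name are the ones from the question. It rewrites the raw query string before it is handed to QueryParser, and it assumes aliases never occur inside quoted phrases, where they would be rewritten as well.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch only: expands short field aliases in a raw query string. */
final class FieldAliasRewriter {
    private final Map<String, String> aliases = new LinkedHashMap<>();

    FieldAliasRewriter alias(String shortName, String realName) {
        aliases.put(shortName, realName);
        return this;
    }

    String rewrite(String query) {
        String result = query;
        for (Map.Entry<String, String> e : aliases.entrySet()) {
            // Match "alias:" only when it is not the tail of a longer field name:
            // the character before it must not be a word character or a dot.
            Pattern p = Pattern.compile("(?<![\\w.])" + Pattern.quote(e.getKey()) + ":");
            result = p.matcher(result)
                      .replaceAll(Matcher.quoteReplacement(e.getValue() + ":"));
        }
        return result;
    }

    public static void main(String[] args) {
        FieldAliasRewriter rewriter = new FieldAliasRewriter()
                .alias("kw", "org.mycompany.mediafield.keyword");
        // The rewritten string is what would then be passed to the query parser.
        System.out.println(rewriter.rewrite("kw:blah AND title:foo"));
    }
}
```

The lookbehind keeps a field such as xkw from being rewritten by accident; text inside quoted phrases is not protected, which is one reason this counts as the "dirty" option.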

However, it is a good point about allowing QueryParser a way to  
translate field names.  A patch along these lines would be well  
received, I think.

        Erik



Re: How to rename fields in an index

Antoine Baudoux-2
I really don't like the dirty solution ;-)

I could come up with a patch if many are interested in this.

Which would you prefer: patching QueryParser, or a new
QueryParser subclass?


Re: How to rename fields in an index

mark harwood
In reply to this post by Antoine Baudoux-2
Might another option be to wrap IndexReader?

Something similar to this in principle:  https://issues.apache.org/jira/browse/LUCENE-835




RE: How to rename fields in an index

Jun.Chen-2
In reply to this post by Andrzej Białecki-2

Dear Andrzej Bialecki

Can we change the field name in *.fnm directly by hand?


Re: How to rename fields in an index

Andrzej Białecki-2
[hidden email] wrote:
> Dear Andrzej Bialecki
>
> Can we change the field name in *.fnm directly by hand?

Yes, but you need to be consistent about it, i.e. change it the same way
for every segment that the index consists of. Also, fnm files are binary
files, so you need to know the format (unless you preserve the length of
the name, e.g. "secret" -> "public", then you can do it with a binary
editor).
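
What a binary editor does in the same-length case can be sketched in plain Java as follows. This is only an illustration: it assumes ASCII names shorter than 128 characters, each stored right after a single length byte, and SameLengthFnmPatch is a hypothetical class name. The patchAll method applies the identical change to every *.fnm file in the index directory and keeps a .bak copy of each file it modifies, as a precaution.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Sketch only: swaps a field name for another of the same length, in place. */
final class SameLengthFnmPatch {

    /** Overwrites each length-prefixed occurrence of oldName; returns how many were patched. */
    static int patch(byte[] fnm, String oldName, String newName) {
        // Assumption: ASCII-only names, so one byte per character.
        byte[] from = oldName.getBytes(StandardCharsets.US_ASCII);
        byte[] to = newName.getBytes(StandardCharsets.US_ASCII);
        if (from.length == 0 || from.length > 127 || from.length != to.length) {
            throw new IllegalArgumentException("names must be non-empty, short, and of equal length");
        }
        int patched = 0;
        for (int i = 0; i + from.length < fnm.length; i++) {
            if (fnm[i] != from.length) continue; // a stored name is preceded by its length byte
            boolean match = true;
            for (int j = 0; j < from.length && match; j++) {
                match = fnm[i + 1 + j] == from[j];
            }
            if (match) {
                System.arraycopy(to, 0, fnm, i + 1, to.length);
                patched++;
                i += from.length; // skip past the name just written
            }
        }
        return patched;
    }

    /** Applies the same patch to every *.fnm file in indexDir, backing up each changed file first. */
    static void patchAll(Path indexDir, String oldName, String newName) throws IOException {
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir, "*.fnm")) {
            for (Path file : files) {
                byte[] data = Files.readAllBytes(file);
                if (patch(data, oldName, newName) > 0) {
                    Files.copy(file, file.resolveSibling(file.getFileName() + ".bak"));
                    Files.write(file, data);
                }
            }
        }
    }

    public static void main(String[] args) {
        // Hand-built sample: one field named "secret" with flag byte 1.
        byte[] sample = {1, 6, 's', 'e', 'c', 'r', 'e', 't', 1};
        System.out.println(patch(sample, "secret", "public") + " name(s) patched");
    }
}
```

Because the file length and every other byte stay the same, nothing about the format needs to be re-encoded; a rename to a name of a different length needs a real rewrite of the file instead.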


RE: How to rename fields in an index

Jun.Chen-2

Got it.
Thank you very much :)


Re: How to rename fields in an index

jjlarrea
In reply to this post by Andrzej Białecki-2
Did anyone ever post a packaged solution for simple field renaming?  Since I didn't see one, I offer (link below) a BeanShell script, 'fieldrename', which uses the Lucene API to run through the segments, gather the field names, pass them through a user-supplied regular-expression transformation, and rewrite the .fnm files.  For example,

    fieldrename /path/to/index  ^Abstract$ Summary
        (renaming of a single field via exact match: Abstract -> Summary)
    fieldrename /path/to/index  Number Code
        (change substring within every fieldname: ProductNumber -> ProductCode, etc.)

It does save the original .fnm files as .fnm.bak just in case, but caveat emptor: used incorrectly, it can destroy a good index in a flash, so always run it on a backup copy of the index.

It works under Lucene 2.2 but has not been tested with 2.3, and it needs a few more lines of code to work with compound-file indexes (which I'd be happy to add if someone is actually interested in using it).
FYI, I used BeanShell rather than Java not only for rapid prototyping, but also because it allows access to a non-public method in the Lucene API.
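
The regular-expression step on its own can be sketched in plain Java as below. This is not the linked script: RegexFieldRename is a hypothetical class, it assumes find-and-replace semantics (as the two examples above suggest), and the check for colliding names is an added safeguard rather than something the script is known to do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.regex.Pattern;

/** Sketch only: applies a regex transformation to a list of field names. */
final class RegexFieldRename {

    /** Returns the transformed names in the same order as the input. */
    static List<String> renameAll(List<String> fieldNames, String regex, String replacement) {
        Pattern pattern = Pattern.compile(regex);
        List<String> renamed = new ArrayList<>(fieldNames.size());
        for (String name : fieldNames) {
            // Names that do not match the pattern come back unchanged.
            renamed.add(pattern.matcher(name).replaceAll(replacement));
        }
        // Added safeguard: refuse a transformation that maps two fields to one name.
        if (new HashSet<>(renamed).size() != renamed.size()) {
            throw new IllegalStateException("rename would merge two fields into one name");
        }
        return renamed;
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("Abstract", "ProductNumber", "OrderNumber");
        System.out.println(renameAll(fields, "^Abstract$", "Summary"));
        System.out.println(renameAll(fields, "Number", "Code"));
    }
}
```

Keeping the output list in input order matters because the position of a field in the .fnm file is what the rest of the index refers to.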

Go wild,
J.J. Larrea  

fieldrename
