dynamic fields revisited


gearond
Well, getting close to the time when the 'rubber meets the road'.

A couple of questions about dynamic fields.

  A/ How much room in the index do unused dynamic fields add per record, if
any?
  B/ Is the search done on the dynamic field name in the schema, or on the name
that was matched?
  C/ Anyone done something like:

    //schema file// (representative, not actual)
    *_int1
    *_int2
    *_int3
    *_int4

    *_datetime1
    *_datetime2
      .
      .

Then have fields in the imported data (especially using a DIH importing from a
VIEW) that have custom names like:
    //import source//(representative, not actual)
    custom_labelA_int1
    custom_labelB_int2

    custom_labelC_datetime1
    custom_labelD_datetime2

Is this how dynamic fields are used? I was thinking of having approximately 1-20
dynamic fields per datatype of interest.

  D/ If I wanted all text-based dynamic fields added to some common field in the
index (sorry, bad terminology), how is that done?


   

 Dennis Gearon


Signature Warning
----------------
It is always a good idea to learn from your own mistakes. It is usually a better
idea to learn from others’ mistakes, so you do not have to make them yourself.
from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'


EARTH has a Right To Life,
otherwise we all die.


Re: dynamic fields revisited

Lance Norskog-2
>A/ How much room in the index do 'non used' dynamic fields add per record, any?
If you use field norms or document boosts in that field, the cost is one byte
per document (a byte array of length [# of documents]). Otherwise no space is used.

>  B/ Is the search done on the dynamic field name in the schema, or on the name
> that was matched?
The dynamic wildcard field name convention is only implemented by the
code that checks the schema.
It is not in the query syntax. Only the real field names are in the
query syntax or returned facets.
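
To illustrate the point above, here is a minimal Python sketch of the idea that the wildcard only selects a field type while the concrete name is what ends up in the index. It is not Solr's actual matching code: `DYNAMIC_FIELDS`, `resolve`, and the type names are made up for illustration, and `fnmatchcase` is only a stand-in for the schema's pattern check.

```python
from fnmatch import fnmatchcase

# Hypothetical stand-in for the dynamic field patterns declared in a schema.
# The type names on the right are illustrative only.
DYNAMIC_FIELDS = {
    "*_int1": "int",
    "*_int2": "int",
    "*_datetime1": "date",
}


def resolve(field_name):
    """Return (indexed_name, field_type) for an incoming field, or None if nothing matches."""
    for pattern, field_type in DYNAMIC_FIELDS.items():
        if fnmatchcase(field_name, pattern):
            # The pattern only decides the type; the name is kept unchanged.
            return field_name, field_type
    return None


print(resolve("custom_labelA_int1"))
print(resolve("custom_labelC_datetime1"))
print(resolve("unmatched_field"))
```

Under these assumptions, a query would have to name `custom_labelA_int1` itself; the pattern `*_int1` never appears in the query.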

On Tue, Dec 28, 2010 at 8:51 PM, Dennis Gearon <[hidden email]> wrote:

> Well, getting close to the time when the 'rubber meets the road'.
>
> A couple of questions about dynamic fields.



--
Lance Norskog
[hidden email]

Re: dynamic fields revisited

gearond

----- Original Message ----
From: Lance Norskog <[hidden email]>
To: [hidden email]
Sent: Wed, December 29, 2010 6:11:32 PM
Subject: Re: dynamic fields revisited

>> B/ Is the search done on the dynamic field name in the schema, or on the
>> name that was matched?
> The dynamic wildcard field name convention is only implemented by the
> code that checks the schema.
> It is not in the query syntax. Only the real field names are in the
> query syntax or returned facets.

If I understand you correctly: for an INT dynamic field declared as *_int2
and filled during data import from a field called my_number_int2,
a query will search the index on the field called:
  "my_number_int2"

Correct?

Re: dynamic fields revisited

iorixxx
> If I understand you correctly, for an INT dynamic field called *_int2
> filled with field called my_number_int2 during data import
> in a query, I will search in the index on the field called:
>   "my_number_int2"
>
> correct?

Exactly.

Using the LukeRequestHandler (http://wiki.apache.org/solr/LukeRequestHandler) you can retrieve the real field names that matched *_int2, if that helps.
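
A rough Python sketch of how that lookup might go. The base URL, the `/admin/luke` path, and the response layout (a `fields` mapping keyed by field name) are assumptions here, so check them against your own installation. The sample response is invented, and no network call is made.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlencode


def luke_url(base_url):
    """Build a Luke handler URL (path and parameters assumed, not verified)."""
    query = urlencode({"numTerms": 0, "wt": "json"})
    return f"{base_url}/admin/luke?{query}"


def fields_matching(luke_response, pattern):
    """List the concrete field names in a parsed Luke response that fit a pattern."""
    # Assumption: the response has a "fields" mapping keyed by field name.
    fields = luke_response.get("fields", {})
    return sorted(name for name in fields if fnmatchcase(name, pattern))


# Invented sample standing in for a parsed JSON response.
sample = {"fields": {"my_number_int2": {}, "other_int2": {}, "title": {}}}

print(luke_url("http://localhost:8983/solr"))
print(fields_matching(sample, "*_int2"))
```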




Re: dynamic fields revisited

Lance Norskog-2
solr/admin/schema.jsp (the Schema Browser) uses the Luke handler. You can browse facets and fields.

On Wed, Dec 29, 2010 at 7:46 PM, Ahmet Arslan <[hidden email]> wrote:

> Exactly.
>
> Using http://wiki.apache.org/solr/LukeRequestHandler you can retrieve real field names under *_int2, if that helps.



--
Lance Norskog
[hidden email]

Re: dynamic fields revisited

gearond
When my Solr guru gets back, we'll redo the schema and see what happens, thanks!

 Dennis Gearon





----- Original Message ----
From: Lance Norskog <[hidden email]>
To: [hidden email]
Sent: Thu, December 30, 2010 4:26:58 PM
Subject: Re: dynamic fields revisited

solr/admin/schema.jsp (the Schema Browser) uses the Luke handler. You can browse facets and fields.



Re: dynamic fields revisited

gearond
Just so anyone else can find this and save themselves half an hour by spending 4 minutes searching.

When a document with a dynamic field is put into an index, the name of the field RETAINS the 'constant' part of the dynamic field name.

Example
-------------
If a dynamic integer field is named '*_i' in the schema.xml file,
  __and__
you insert a field named 'my_integer_i', which matches the globbed field name '*_i',
  __then__
the name of the field will be 'my_integer_i' in the index
and in your GETs/(updating) POSTs to the index on that document, and
  __NOT__
'my_integer' like I was kind of hoping that it would be :-(

I.e., the suffix (or prefix, if you set it up that way) will NOT be dropped. I was hoping that everything except the globbing character, '*', would just be a flag to the query processor and disappear after being 'noticed'.

Not so :-)
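
For anyone who wants to see it spelled out, a small Python sketch that builds an XML update message and a query string for the example above. The `id` field and the values are invented for illustration, and the `add_command` helper is hypothetical. The point is only that `my_integer_i` is written out in full in both places.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def add_command(fields):
    """Build an <add><doc>...</doc></add> update message from a dict of fields."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


# "id" is an assumed unique key; "my_integer_i" matches the '*_i' pattern.
xml_body = add_command({"id": "doc1", "my_integer_i": 42})

# The query uses the full field name too, suffix included.
query_string = urlencode({"q": "my_integer_i:42"})

print(xml_body)
print(query_string)
```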

Re: dynamic fields revisited

Markus Jelsma-2
It would be quite annoying if it behaved the way you were hoping. As it stands, it
is possible to use different field types (and analyzers) for the same field
value. In faceting, for example, this can be important because you should use
analyzed fields for q and fq but unanalyzed fields for facet.field.

The same goes for sorting and range queries: the same field value can end up
in different field types, one for sorting and one for range queries.

Without the prefix or suffix of the dynamic field, one must statically declare the
fields beforehand and lose the dynamic advantage.
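
A minimal Python sketch of the faceting case described above, assuming a `*_t` pattern mapped to an analyzed text type and a `*_s` pattern mapped to an unanalyzed string type. Those suffixes, the `fan_out` helper, and the `category` field are assumptions for illustration. In a real schema the duplication is often done with copyField rather than in client code.

```python
from urllib.parse import urlencode


def fan_out(base_name, value, suffixes=("_t", "_s")):
    """Send one value to several dynamic fields, one per assumed suffix."""
    return {base_name + suffix: value for suffix in suffixes}


# Same value, two field names, so two different field types can handle it.
doc_fields = fan_out("category", "Digital Cameras")

# Search on the analyzed field, facet on the unanalyzed one.
params = urlencode([
    ("q", "category_t:cameras"),
    ("facet", "true"),
    ("facet.field", "category_s"),
])

print(doc_fields)
print(params)
```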


Re: dynamic fields revisited

gearond
I have a long way to go to understand all those implications. Mind you, I never
-was- whining :-). Just ignorantly surprised.

 Dennis Gearon






________________________________
From: Markus Jelsma <[hidden email]>
To: [hidden email]
Cc: gearond <[hidden email]>
Sent: Mon, February 7, 2011 3:28:18 PM
Subject: Re: dynamic fields revisited

It would be quite annoying if it behaves as you were hoping for. This way it
is possible to use different field types (and analyzers) for the same field
value. In faceting, for example, this can be important because you should use
analyzed fields for q and fq but unanalyzed fields for facet.field.


Re: dynamic fields revisited

Billnbell
You can change the match to be my* and then insert the name you want.

Bill Bell
Sent from mobile
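
A short Python sketch of the prefix idea, assuming a dynamic field pattern of `my*`. The pattern, the helper, and the field names are illustrative, and `fnmatchcase` again only approximates the schema's matching. With a prefix pattern, a name like `my_integer` is matched and stored without any type suffix.

```python
from fnmatch import fnmatchcase

# Hypothetical prefix-style dynamic field pattern.
PREFIX_PATTERN = "my*"


def matches_prefix_pattern(field_name, pattern=PREFIX_PATTERN):
    """True if the field name would be picked up by the prefix pattern."""
    return fnmatchcase(field_name, pattern)


for name in ("my_integer", "my_integer_i", "other_integer"):
    print(name, matches_prefix_pattern(name))
```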


On Feb 7, 2011, at 4:15 PM, gearond <[hidden email]> wrote:

>
> Just so anyone else can know and save themselves 1/2 hour if they spend 4
> minutes searching.
>
> When putting a dynamic field into a document into an index, the name of the
> field RETAINS the 'constant' part of the dynamic field name.
> --
> View this message in context: http://lucene.472066.n3.nabble.com/dynamic-fields-revisited-tp2161080p2447814.html
> Sent from the Solr - User mailing list archive at Nabble.com.