What is creating certain fields?

What is creating certain fields?

Keith Dopson
My default query produces this:

{
         "id":"44419",
         "date":["11/13/17 13:18"],
         "url":["http://www.someurl.com"],
         "title":["some title"],
         "content":["some indexed content..........."],
         "date_str":["11/13/17 13:18"],
         "url_str":["http://www.someurl.com"],
         "title_str":["some title"],
         "_version_":1594211356390719488,
         "content_str":["some indexed content.........."]
},


In my managed-schema file, only five fields are populated:

    <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />

    <field name="date"    type="text_general" indexed="false" stored="true"/>
    <field name="url"     type="text_general" indexed="false" stored="true"/>
    <field name="title"   type="text_general" indexed="true"  stored="true"/>
    <field name="content" type="text_general" indexed="true"  stored="true"/>

While other fields are declared, none of them are populated by my "post" command.

My question is: where are the xxxxx_str fields coming from?
I.e., what is producing the

"date_str":["...
"url_str":["...
"title_str":["...
"content_str":["...

entries?

Thanks in advance.

Re: What is creating certain fields?

David Hastings
Those are dynamic fields, created by this rule in the schema:

  <dynamicField name="*_str" type="strings" docValues="true" indexed="false" stored="false"/>


Re: What is creating certain fields?

Erick Erickson
Maybe a copyField is realizing the dynamic fields?
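
For illustration, if a copyField is responsible, the schema would contain rules roughly like the ones below. This is a sketch rather than a copy of the poster's file: the field names come from the original post, and the maxChars value assumes the Solr 7 schemaless default of 256 characters.

```xml
<!-- Hypothetical managed-schema entries; look for lines like these in your own schema. -->
<!-- Each rule copies a text field into a string field matched by the *_str dynamic field. -->
<copyField source="date"    dest="date_str"    maxChars="256"/>
<copyField source="url"     dest="url_str"     maxChars="256"/>
<copyField source="title"   dest="title_str"   maxChars="256"/>
<copyField source="content" dest="content_str" maxChars="256"/>
```

If rules like these are present, each destination name matches the "*_str" dynamic field pattern, which would account for the extra entries in the query output.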


Re: What is creating certain fields?

Cassandra Targett
I'll guess you're using Solr 7.x and those fields in your schema were
created automatically?

As of Solr 7.0, field guessing in schemaless mode adds a copyField rule
for any field that is guessed to be text, copying the first 256 characters
to a multivalued string field. The way it works: the field is created with
the type "text_general", and a copyField is then created automatically that
uses the dynamic field rule "*_str" to create the multivalued string field.

This came from https://issues.apache.org/jira/browse/SOLR-9526.

You can disable this behavior by removing the copyField rule section. See
the docs for where in solrconfig.xml you will want to edit:
https://lucene.apache.org/solr/guide/schemaless-mode.html#enable-field-class-guessing

Cassandra
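
As a rough guide to what that copyField rule section looks like, here is a sketch of the relevant part of solrconfig.xml. It is recalled from the Solr 7 default configset and is not taken from the poster's configuration, so element order and surrounding entries may differ in your file.

```xml
<!-- Sketch based on the Solr 7 default configset; verify against your own solrconfig.xml. -->
<updateProcessor class="solr.AddSchemaFieldsUpdateProcessorFactory" name="add-schema-fields">
  <lst name="typeMapping">
    <!-- String values are guessed to be text and mapped to text_general -->
    <str name="valueClass">java.lang.String</str>
    <str name="fieldType">text_general</str>
    <!-- Removing this copyField section stops new *_str copies from being set up -->
    <lst name="copyField">
      <str name="dest">*_str</str>
      <int name="maxChars">256</int>
    </lst>
    <bool name="default">true</bool>
  </lst>
  <!-- other typeMapping entries (numbers, dates, booleans) omitted from this sketch -->
</updateProcessor>
```

Note that editing this section should only affect fields guessed from then on. copyField rules already written to the schema for existing fields would need to be removed separately, and documents indexed earlier keep their "*_str" values until they are reindexed.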
