Limit on number of schema fields?

Mike Baranczak-2
Is there any significant penalty for having a large number of fields  
in a Solr schema (like between 50 and 100)?

We have a site with several different types of searchable content,  
and each of those types will require several different fields (most  
of which are not shared). I figured that it'd be easier to have  
everything in one index, but I just want to make sure this won't  
cause any problems.

-MB

Re: Limit on number of schema fields?

Yonik Seeley-2
On 10/18/06, Mike Baranczak <[hidden email]> wrote:
> Is there any significant penalty for having a large number of fields
> in a Solr schema (like between 50 and 100)?

We have had far larger numbers than that.

The only thing to watch out for is norms, which take up a byte per
document regardless of the number of documents containing the field.
You can omit norms if you don't need length normalization or
index-time boosting.

For fields that need norms (like full-text fields), you may be able to
share some of them between documents of different types.  Whether you
should worry about it really depends on the total number of documents
in the index.

-Yonik

Re: Limit on number of schema fields?

Otis Gospodnetic-2
----- Original Message ----
From: Yonik Seeley <[hidden email]>
To: [hidden email]
Sent: Wednesday, October 18, 2006 2:31:44 PM
Subject: Re: Limit on number of schema fields?

On 10/18/06, Mike Baranczak <[hidden email]> wrote:
> Is there any significant penalty for having a large number of fields
> in a Solr schema (like between 50 and 100)?

We have had far larger numbers than that.

The only thing to watch out for is norms, which take up a byte per
document regardless of the number of documents containing the field.
You can omit norms if you don't need length normalization or
index-time boosting.

OG: Yonik, wasn't the saving 1 byte * # of indexed fields * # of docs?

Otis

Re: Limit on number of schema fields?

Yonik Seeley-2
On 10/19/06, Otis Gospodnetic <[hidden email]> wrote:

> On 10/18/06, Mike Baranczak <[hidden email]> wrote:
> > Is there any significant penalty for having a large number of fields
> > in a Solr schema (like between 50 and 100)?
>
> We have had far larger numbers than that.
>
> The only thing to watch out for is norms, which take up a byte per
> document regardless of the number of documents containing the field.
> You can omit norms if you don't need length normalization or
> index-time boosting.
>
> OG: Yonik, wasn't the saving 1 byte * # of indexed fields * # of docs?

Right, I was talking per indexed field, since omitting norms can be
controlled per-field.

-Yonik
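
As a rough sketch of the arithmetic discussed above (1 byte * number of indexed fields that keep norms * number of documents), the following estimates the norms overhead. The function name and the document and field counts are hypothetical, chosen only for illustration. In schema.xml, norms can be switched off per field with the omitNorms attribute.

```python
def norms_bytes(num_docs: int, fields_with_norms: int) -> int:
    """Estimate norms size: one byte per document per indexed field that keeps norms."""
    return num_docs * fields_with_norms


# Hypothetical index: 10 million documents, 100 indexed fields, norms on all of them.
all_fields = norms_bytes(10_000_000, 100)

# Same hypothetical index with norms kept only on 10 full-text fields.
text_only = norms_bytes(10_000_000, 10)

print(f"norms on 100 fields: {all_fields / 2**20:.0f} MiB")
print(f"norms on 10 fields:  {text_only / 2**20:.0f} MiB")
```

Whether the difference matters depends on the total document count, as noted earlier in the thread.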

Query and UTF8

PanosJee
Hi everyone,
I have the following problem. I use UTF-8, and my Solr is deployed with
the aid of Jetty.
So far I have had no problem with UTF-8, in the sense that I can store
anything and it is then displayed properly. The problem arises when I want to
retrieve documents by a text field whose content is not in Latin
characters.

For example, q=name:ILSP returns the results I want,
but q=name:ΙΕΛ does not.

Re: Query and UTF8

Yonik Seeley-2
On 10/23/06, Panayiotis Papadopoulos <[hidden email]> wrote:

> Hi everyone,
> I have the following problem. I use UTF-8, and my Solr is deployed with
> the aid of Jetty.
> So far I have had no problem with UTF-8, in the sense that I can store
> anything and it is then displayed properly. The problem arises when I want to
> retrieve documents by a text field whose content is not in Latin
> characters.
>
> For example, q=name:ILSP returns the results I want,
> but q=name:ΙΕΛ does not.

What are you using as the client?
The problem is most likely related to client encoding...
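
To illustrate what a client encoding mismatch could look like, here is a minimal Python sketch that percent-encodes the same query value under two charsets. It assumes the three letters in the example query are Greek capitals (iota, epsilon, lambda); the helper name encode_query and the choice of ISO-8859-7 as the "wrong" charset are assumptions for illustration, not something taken from the original report.

```python
from urllib.parse import quote, unquote


def encode_query(value: str, charset: str = "utf-8") -> str:
    """Percent-encode a query value using the given charset (hypothetical helper)."""
    return "q=" + quote(value, safe=":", encoding=charset)


# Assumed to be the Greek letters from the example query, written as escapes.
term = "name:\u0399\u0395\u039b"

utf8_form = encode_query(term)                 # two bytes per Greek letter
greek_form = encode_query(term, "iso-8859-7")  # one byte per Greek letter

print(utf8_form)
print(greek_form)

# A server that decodes the query as UTF-8 cannot recover the single-byte form.
seen_by_server = unquote(greek_form, encoding="utf-8", errors="replace")
print(seen_by_server == "q=" + term)
```

If the client sends the single-byte form while the server expects UTF-8, the term the server searches for no longer matches what was indexed, which would explain why only the Latin query returns results.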

One way to nail this down is to use netcat (nc) and have it act as the
Solr server, then see what the client sends.

    nc -l -p 8983

will accept incoming connections and print what is sent to the terminal.

    nc -l -p 8983 -o out.txt

will put a hex dump of what the client sent in out.txt.
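
Once the request has been captured, one possible way to check it is to percent-decode the query string and test whether the resulting bytes are valid UTF-8. The sketch below is only an illustration: the function name query_is_utf8 and the /solr/select path in the sample request lines are assumptions, and the sample lines are made up rather than real captured output.

```python
from urllib.parse import unquote_to_bytes


def query_is_utf8(request_line: str) -> bool:
    """Return True if the percent-decoded query string is valid UTF-8."""
    # A request line looks like: GET /path?query HTTP/1.1
    target = request_line.split(" ")[1]
    query = target.partition("?")[2]
    raw = unquote_to_bytes(query)
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical request lines, as they might appear in the nc output.
good = "GET /solr/select?q=name:%CE%99%CE%95%CE%9B HTTP/1.1"
bad = "GET /solr/select?q=name:%C9%C5%CB HTTP/1.1"

print(query_is_utf8(good))
print(query_is_utf8(bad))
```

A request that fails this check points at the client (or something between client and server) sending the query in a non-UTF-8 charset.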

-Yonik