stylesheet issue

stylesheet issue

Tim Archambault-2
I've got solr installed and running, with only one failure left to date.
Whenever I try to select a stylesheet for my search, I get an error message
such as this:

Error loading stylesheet: A network error occured loading an XSLT
stylesheet:http://localhost:8983/admin/tabular.xsl

Something tells me something isn't mapped correctly here, either in Jetty or
in a Solr config. My hunch is the path should be
"http://localhost:8983/solr/admin/tabular.xsl".

I must say the product is great and the synonym tool is unbelievable. Can't
say enough.

Any help with this stylesheet issue is greatly appreciated.

Tim

Re: stylesheet issue

Yonik Seeley
On 6/2/06, Tim Archambault <[hidden email]> wrote:
> I've got solr installed and running, with only one failure left to date.
> Whenever I try to select a stylesheet for my search, I get an error message
> such as this:

Hi Tim,

There is no stylesheet :-)

It's a hold-over from an old XML format that Solr used to support
before it was open-sourced.  That old XML format was for compatibility
with another internal product.  It turned out that it wasn't flexible
enough to add extra info like multiple result sets, or faceted
browsing info, so we came up with v2 of the XML (but no new stylesheet
to go with it).

The XML is fairly readable though, so it hasn't been much of a problem
in practice.

-Yonik

Re: stylesheet issue

Tim Archambault-2
That'll be fine. As you can probably tell, I'm not a programmer. I am just a
dangerous end-user with expertise in marketing & online operations trying to
save a buck. I am going to try to learn XSL or, if that doesn't work, I'll
bastardize the results into a ColdFusion recordset.

I know I shouldn't ask you questions directly, but I have to ask you.

How many queries per minute can Solr handle in a high use situation? Our
website gets about 4 million page views a month and about 40,000 daily
visitors, which is about an hour for CNET probably. I am envisioning Solr
being the search engine for our jobs, autos, classifieds, and as a "global"
search experience that includes them all. I really want to greatly limit the
use of database connections on our site. Do you think Solr can be a "global"
solution for search on our site? It's one thing to test, yet another to run in a
production environment.

Which Java-based web server component do you recommend for a Windows
platform? Tomcat? Another? I know nothing about these tools. I am using
Jetty for testing.

Thank you for all your help.

Tim




Re: stylesheet issue

Yonik Seeley
On 6/2/06, Tim Archambault <[hidden email]> wrote:
> That'll be fine. As you can probably tell, I'm not a programmer. I am just a
> dangerous end-user with expertise in marketing & online operations trying to
> save a buck. I am going to try to learn XSL or if that doesn't work, I'll
> bastardize the results into a coldfusion recordset.
>
> I know I shouldn't ask you questions directly, but I have to ask you.
>
> How many queries per minute can Solr handle in a high use situation?

It depends on how many documents are in the collection and the nature of
the documents (unique terms, size of fields, etc.), and it depends heavily
on the nature of the queries and on the CPU and memory of your hardware.

I've seen up to 1000 queries/sec for very simple queries on a 1M doc index.

> Our
> website gets about 4 million page views a month and about 40,000 daily
> visitors,

That shouldn't be a problem unless the collection is just too big.
It's pretty easy to scale Solr to higher query traffic by putting more
query servers behind a load balancer, *provided* that the latency of a
single query is acceptable.  If the collection is too big (too many
documents, or documents that are too big), then you need to split up the
collection and use federated search (Solr doesn't have it yet, but it
will in the future).
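
As a rough sanity check on those traffic numbers, here is a minimal sketch of the arithmetic. The 4 million page views per month comes from the post above; the 30-day month, the one-search-per-page-view ratio, and the 10x peak factor are assumptions chosen only for illustration.

```python
# Back-of-the-envelope load estimate from the traffic figure quoted above.
PAGE_VIEWS_PER_MONTH = 4_000_000
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month

# Assumptions (not from the thread): every page view triggers one search,
# and peak traffic runs at 10x the monthly average.
SEARCHES_PER_PAGE_VIEW = 1.0
PEAK_FACTOR = 10.0


def average_qps(page_views, seconds, searches_per_view):
    """Average queries per second over the whole period."""
    return page_views * searches_per_view / seconds


avg = average_qps(PAGE_VIEWS_PER_MONTH, SECONDS_PER_MONTH, SEARCHES_PER_PAGE_VIEW)
peak = avg * PEAK_FACTOR
print(f"average: {avg:.2f} queries/sec, assumed peak: {peak:.1f} queries/sec")
```

Even with these pessimistic assumptions the estimate stays far below the 1000 queries/sec figure mentioned above, although that figure was for very simple queries on a 1M doc index.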

> I am envisioning Solr
> being the search engine for our jobs, autos, classifieds, and as a "global"
> search experience that includes them all. I really want to greatly limit the
> use of database connections on our site. Do you think Solr can be a "global"
> solution for search on our site?

By "global" do you mean Solr as the search solution for all those
collections, or do you mean having all those different types of
documents (jobs, autos, classifieds) in a single Solr index?

Unless there is a good reason to put multiple document types in the
same index, you will get better performance by putting them in their
own index.

> Which Java-based web server component do you recommend for a Windows
> platform? Tomcat? Another? I know nothing about these tools. I am using
> Jetty for testing.

Tomcat is the most widely used, I think, so it's easier to find
docs and help/support for it.  I started a little Tomcat
installation guide on the Wiki last night.

-Yonik

Re: stylesheet issue

Tim Archambault-2
> By "global" do you mean Solr as the search solution for all those
> collections, or do you mean having all those different types of
> documents (jobs, autos, classifieds) in a single Solr index?

Yes, the latter: a single index. I envisioned separating the document types
with a custom field named "vertical" and, within each vertical, a "category"
field.

> Unless there is a good reason to put multiple document types in the
> same index, you will get better performance by putting them in their
> own index.

So my educated guess would be that I would create additional "schema" xml
elements in my schema.xml, one each for jobs, homes, cars, news, obits, etc.
(in the tutorial, I note the schema name "example"). My search query
strings would then have to specify which schema to use, but I don't
see a variable for "schema".

NumDocs: It looks like I am going to have an index of about 300,000
documents initially, and it should grow by about 150 per day.



Re: stylesheet issue

Yonik Seeley
On 6/2/06, Tim Archambault <[hidden email]> wrote:
> So my educated guess would be that I would create additional "schema" xml

Solr doesn't support multiple schemas.  The current way to do this is
to run multiple instances of Solr.  Another way is to run multiple
Solr webapps in the same servlet container... slightly harder for
config, but easier on memory.

> NumDocs: It looks like I am going to have an index of about 300,000
> documents initially and should grow by about 150 per day..

300,000 isn't too bad at all... you should be able to get away with
adding all the different document types to the same index.  If you
want to be able to search across multiple verticals in a single
request, this is the way to go.

You could always split it out later if performance becomes an issue.
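
A minimal sketch of what querying such a combined index could look like from the client side. The "vertical" field is the custom field proposed earlier in the thread; the base URL (the Jetty port from this thread plus a /solr/select path) and the helper name build_search_url are assumptions for illustration, not a description of any particular client library.

```python
from urllib.parse import urlencode

# Assumed base URL: the Jetty port used in this thread plus a /solr/select path.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_search_url(terms, vertical=None):
    """Build a select URL; restrict the search to one vertical when given.

    `vertical` refers to the custom field proposed in the thread. With
    vertical=None the query runs across every document type in the index.
    """
    query = terms
    if vertical is not None:
        # Lucene query syntax: both clauses are required.
        query = f"+vertical:{vertical} +({terms})"
    return SOLR_SELECT_URL + "?" + urlencode({"q": query})


# "Global" search across all verticals:
print(build_search_url("nurse"))
# Search restricted to the jobs vertical:
print(build_search_url("nurse", vertical="jobs"))
```

Keeping the restriction in the query string means the same index can serve both the per-vertical searches and the site-wide one.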

-Yonik

Re: stylesheet issue

Tim Archambault-2
Thanks again for all your help. You've been great. Someday I may want to
convert our xml archives into the search, but not yet. Sounds like Solr will
be more scalable in the future and that may be feasible. Have a great
weekend.


Re: stylesheet issue

Chris Hostetter-3
In reply to this post by Yonik Seeley
: There is no stylesheet :-)
:
: It's a hold-over from an old XML format that Solr used to support
: before it was open-sourced.  That old XML format was for compatibility
: with another internal product.  It turned out that it wasn't flexible
: enough to add extra info like multiple result sets, or faceted
: browsing info, so we came up with v2 of the XML (but no new stylesheet
: to go with it).
:
: The XML is fairly readable though, so it hasn't been much of a problem
: in practice.

Yeah ... the whole way the stylesheet param is handled has always kind of
bugged me ... in the back of my mind, I've been thinking that the right
thing to do would be to change it so that if it's specified, the string is used
verbatim as the stylesheet URL instead of the current practice of
assuming it's in the admin directory -- that way people could either
specify fully qualified URLs on another host, or quasi-relative paths
rooted with / on another webapp of the current host/port, or it could even
be a reference to get-files.jsp so they could store the XSLTs in their
./solr directory.
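
To make the difference concrete, here is a minimal sketch of the two behaviors. It is not Solr's code: the function names are hypothetical, and the "/admin/" prefix is inferred from the error URL at the top of this thread.

```python
from xml.sax.saxutils import quoteattr


def current_stylesheet_href(stylesheet):
    """Current practice as described in this thread: the name is assumed
    to live in the admin directory (prefix inferred from the error URL)."""
    return "/admin/" + stylesheet


def proposed_stylesheet_href(stylesheet):
    """Proposed behavior: use the parameter value verbatim as the URL."""
    return stylesheet


def stylesheet_pi(href):
    """Standard xml-stylesheet processing instruction pointing at an XSLT."""
    return f'<?xml-stylesheet type="text/xsl" href={quoteattr(href)}?>'


# Current practice: the browser is pointed at /admin/tabular.xsl.
print(stylesheet_pi(current_stylesheet_href("tabular.xsl")))
# Proposed: the caller decides where the stylesheet lives.
print(stylesheet_pi(proposed_stylesheet_href("/solr/admin/tabular.xsl")))
```

With the verbatim variant, the caller could pass a fully qualified URL or a path rooted with / without any server-side change.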

another way to go if we add init() params to QueryResponseWriter would be
to make the XmlResponseWriter take in a NamedList of alias=>URL mappings
of all the stylesheets it wanted to support (which could still be served
via get-files.jsp)
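
A minimal sketch of the alias idea, written in Python rather than against any real Solr interface. The table contents, the URLs, and the name resolve_stylesheet are all hypothetical placeholders; in the proposal the mappings would come from init() params.

```python
# Hypothetical alias => URL table; the aliases and URLs are placeholders.
STYLESHEET_ALIASES = {
    "tabular": "/xslt/tabular.xsl",
    "rss": "http://example.com/xslt/rss.xsl",
}


def resolve_stylesheet(alias, aliases=STYLESHEET_ALIASES):
    """Return the configured URL for an alias, or None if it is not registered."""
    return aliases.get(alias)


print(resolve_stylesheet("tabular"))
print(resolve_stylesheet("unknown"))
```

Compared with using the parameter verbatim, a fixed table limits requests to the stylesheets the administrator has chosen to support.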


-Hoss