Re: If you could have one feature in Solr...


Re: If you could have one feature in Solr...

Lance Norskog-2
Error messages that make sense. I have to read the source far too
often when a simple change to error handling would make some feature
easy to use. If I want to read Java I'll use Lucene!

Passive-aggressive error handling is a related problem: when I do
something nonsensical I too often get "0 results found" instead of
"what does that mean?".

On Thu, Feb 25, 2010 at 12:52 PM, Smiley, David W. <[hidden email]> wrote:

> 1. Spatial search
> 2. Ease of managing a sharded index, multi-server Solr instance.
>
> I am aware these are in-progress, slated for Solr 1.5.
>
> I may find myself getting involved on these shortly because I'm working on a very large scale search project requiring both.
>
> ~ David
>
> On Feb 24, 2010, at 8:42 AM, Grant Ingersoll wrote:
>
>> What would it be?
>
>



--
Lance Norskog
[hidden email]

Re: If you could have one feature in Solr...

Don Werve-3
Realtime search, hands down.

Re: If you could have one feature in Solr...

Stephen Weiss-2
+1

I have several projects backburnered in the hope that realtime search will
come to Solr soon...

[m]

On Feb 26, 2010, at 8:37 PM, Don Werve <[hidden email]> wrote:

> Realtime search, hands down.

RE: If you could have one feature in Solr...

stuart yeates-3
The indexer looking for an xml:lang attribute on text fields and using the value to pick the tokeniser, dictionaries, etc. automatically (and knowing to look for them in the standard places).

cheers
stuart

Re: If you could have one feature in Solr...

Dave Searle
To have a coffee waiting for me every morning when I wake up. Marriage  
material indeed.

Re: If you could have one feature in Solr...

Grant Ingersoll-2

On Feb 26, 2010, at 11:28 PM, Dave Searle wrote:

> To have a coffee waiting for me every morning when I wake up. Marriage  
> material indeed.


Dave,

Didn't you know that one already exists?

http://localhost:8983/solr/admin/coffeehandler?type=ethiopian&cream=false&sugar=true&togo=true

:-)

-Grant

Re: If you could have one feature in Solr...

Dave Searle
Haha, superb! Have never noticed that before! That's made my day  
Grant :-)


Re: If you could have one feature in Solr...

Stephen Weiss-2
In reply to this post by Lance Norskog-2
I think an examples page would be a good idea.  We've already  
implemented search in Chinese, Japanese, and Spanish back with 1.3,  
but it was not really very well laid out how it was supposed to work -  
I had to dig through bits and pieces of people's configs left in the  
mailing list archives - and to be honest, I've never been 100%  
positive that we did it the "right" way.  On the other hand, that it  
was possible was pretty obvious to me from reading the documentation  
(it was all in the API docs), it was just *how* to implement it that  
wasn't very clear for a non-java/lucene programmer like myself.

--
Steve

On Feb 25, 2010, at 1:06 PM, Robert Muir wrote:

> Yeah, Thai and Arabic have the stuff in Solr 1.4
> For Chinese, if you want to do CJK bigram indexing, this is there too.
> If you want to do word-based "smart" indexing, you need to add an  
> additional
> jar file to your classpath.
>
> we can add a wiki page with examples of how to use these maybe to  
> make it
> easier?
>
> we could also add notes to new ones in lucene (hindi, czech,  
> bulgarian,
> etc), as it might be easier to copy some code around and get them  
> working
> with solr 1.4 than to write your own!
>
> separately, would you be interested in helping with Bengali and
> Marathi?


Re: If you could have one feature in Solr...

Adrien Specq
 - Built-in hierarchical faceting
and
 - language attribute for each field


Re: If you could have one feature in Solr...

Ian Holsman (Lists)
In reply to this post by Lance Norskog-2
On 2/24/10 8:42 AM, Grant Ingersoll wrote:
> What would it be?
>
>    
most of this will be coming in 1.5,
but for me it's

- sharding.. it still seems a bit clunky

secondly.. this one isn't in 1.5.
I'd like to be able to find "interesting" terms that appear in my result
set that don't appear in the global corpus.

it's kind of like doing a facet count on *:* and then on the search term
and discount the terms that appear heavily on the global one.
(sorry.. there is a textbook definition of this.. XX distance.. but I
haven't got the books in front of me).
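
A rough sketch of that comparison, assuming the two facet counts (one for *:* and one for the search term) have already been fetched as plain term-to-count dictionaries. The function name, the add-one smoothing, and the sample numbers are made up for illustration; none of this is an existing Solr feature.

```python
def interesting_terms(result_counts, global_counts, result_docs, global_docs, top=5):
    """Rank terms by how much more often they occur in the result set
    than in the whole index (a simple frequency ratio, not a Solr API)."""
    scored = []
    for term, count in result_counts.items():
        fg = count / result_docs  # share of result docs containing the term
        # Add-one smoothing so unseen global terms do not divide by zero.
        bg = (global_counts.get(term, 0) + 1) / (global_docs + 1)
        scored.append((fg / bg, term))
    scored.sort(reverse=True)
    return [term for _, term in scored[:top]]


# Hypothetical facet counts: *:* over 1000 docs, a query matching 50 docs.
global_counts = {"the": 990, "solr": 200, "shard": 30, "zookeeper": 12}
result_counts = {"the": 50, "solr": 40, "shard": 25, "zookeeper": 10}
ranked = interesting_terms(result_counts, global_counts, 50, 1000)
print(ranked)
```

Terms that are common everywhere (like "the") end up at the bottom because their global share is almost as large as their share in the result set.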






Re: If you could have one feature in Solr...

Andrzej Białecki-2
On 2010-02-28 17:26, Ian Holsman wrote:

> On 2/24/10 8:42 AM, Grant Ingersoll wrote:
>> What would it be?
>>
> most of this will be coming in 1.5,
> but for me it's
>
> - sharding.. it still seems a bit clunky
>
> secondly.. this one isn't in 1.5.
> I'd like to be able to find "interesting" terms that appear in my result
> set that don't appear in the global corpus.
>
> it's kind of like doing a facet count on *:* and then on the search term
> and discount the terms that appear heavily on the global one.
> (sorry.. there is a textbook definition of this.. XX distance.. but I
> haven't got the books in front of me).

Kullback-Leibler divergence?
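
For reference, a minimal sketch of how Kullback-Leibler divergence could be applied here, assuming term counts for the result set and for the whole index are available as dictionaries. The function name and the smoothing value are illustrative assumptions. Each term contributes p * log(p / q), where p is its probability in the result set and q its probability in the whole index; the terms with the largest positive contribution are the "interesting" ones.

```python
import math


def kl_contributions(result_counts, global_counts, smoothing=1.0):
    """Per-term contributions p * log(p / q) to KL(result set || whole index)."""
    vocab = set(result_counts) | set(global_counts)
    # Additive smoothing keeps every probability strictly positive.
    p_total = sum(result_counts.values()) + smoothing * len(vocab)
    q_total = sum(global_counts.values()) + smoothing * len(vocab)
    contrib = {}
    for term in vocab:
        p = (result_counts.get(term, 0) + smoothing) / p_total
        q = (global_counts.get(term, 0) + smoothing) / q_total
        contrib[term] = p * math.log(p / q)
    return contrib


# Hypothetical counts for two terms.
contrib = kl_contributions({"the": 50, "shard": 25}, {"the": 990, "shard": 30})
divergence = sum(contrib.values())  # the KL divergence itself
```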


--
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


Re: If you could have one feature in Solr...

Erik Hatcher-4
In reply to this post by Adrien Specq

On Feb 28, 2010, at 8:47 AM, Adrien Specq wrote:
> - Built-in hierarchical faceting

Adrien - I'm curious what you mean by this exactly.  Could you
describe your hierarchical faceting needs by example?  Often
hierarchical faceting can be accomplished by simply indexing
"/level1/level2/..." type string fields, but there certainly are other
ways to go about it as well.
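
One possible way to prepare such path strings at index time is sketched below. The depth prefix ("0/", "1/", ...) is an extra convention assumed here, not something from the message above; it is meant to let a prefix-restricted facet (for example Solr's facet.prefix parameter) list only the direct children of a node. The function name and sample path are hypothetical.

```python
def path_facet_values(path):
    """Expand '/a/b/c' into depth-prefixed ancestors: '0/a', '1/a/b', '2/a/b/c'."""
    parts = [p for p in path.split("/") if p]
    return [f"{depth}/{'/'.join(parts[:depth + 1])}" for depth in range(len(parts))]


# Values to put into a multi-valued string field for one document.
values = path_facet_values("/electronics/cameras/dslr")
print(values)
```

With values like these, faceting with a prefix of "1/electronics/" would return only the second-level categories under "electronics".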

Thanks,
        Erik


Re: If you could have one feature in Solr...

Noble Paul നോബിള്‍  नोब्ळ्
In reply to this post by Lance Norskog-2
On Wed, Feb 24, 2010 at 7:18 PM, Patrick Sauts <[hidden email]> wrote:
> Synchronisation between the slaves to switch the new index at the same time
> after replication.

I shall open an issue for this. And let us figure out how best it should be done
https://issues.apache.org/jira/browse/SOLR-1800

Re: If you could have one feature in Solr...

Jorg Heymans-4
The ability to read solr configuration files from the classpath instead of
solr.solr.home directory.

Jorg


Re: If you could have one feature in Solr...

hossman
: The ability to read solr configuration files from the classpath instead of
: solr.solr.home directory.

Solr has always supported this.  

When SolrResourceLoader.openResource is asked to open a resource it
first checks if it's an absolute path -- if it's not then it checks
relative to the "conf" dir (under whatever the instanceDir is, i.e. Solr Home
in a single core setup), then it checks relative to the current working dir
and if it still can't find it it checks via the current ClassLoader.
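
The lookup order described above can be illustrated roughly as follows. This is only a sketch of the order, not Solr's actual implementation: the function name is made up, and a list of directories (classpath_dirs) stands in for the Java ClassLoader search.

```python
import os


def open_resource(name, conf_dir, classpath_dirs, cwd=None):
    """Open a resource using the order: absolute path, conf dir,
    working dir, then the (simulated) classpath."""
    cwd = cwd or os.getcwd()
    if os.path.isabs(name):
        candidates = [name]  # 1. absolute path, used as-is
    else:
        candidates = [
            os.path.join(conf_dir, name),  # 2. relative to the conf dir
            os.path.join(cwd, name),       # 3. relative to the working dir
        ]
        # 4. stand-in for the ClassLoader lookup
        candidates += [os.path.join(d, name) for d in classpath_dirs]
    for path in candidates:
        if os.path.isfile(path):
            return open(path, "rb")
    raise FileNotFoundError(f"Can't find resource '{name}'")
```

Code that bypasses a function like this and opens files under the conf dir directly never reaches steps 3 and 4, which is the kind of bug suspected below.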

that said: it's not something that a lot of people have ever taken
advantage of, so it wouldn't surprise me if some features in Solr are
buggy because they try to open files directly w/o utilizing
openResource -- in particular a quick test of the trunk example
using...
java -Djetty.class.path="./solr/conf" -Dsolr.solr.home=/tmp/new-solr-home -jar start.jar

...seems to suggest that QueryElevationComponent isn't using openResource
to look for elevate.xml  (i set solr.solr.home in that line so solr would
*NOT* attempt to look at "./solr" ... it does need some sort of Solr Home,
but in this case it was a completely empty directory)


-Hoss


Re: If you could have one feature in Solr...

Mark Miller-3
On 03/04/2010 05:56 PM, Chris Hostetter wrote:

> ...seems to suggest that QueryElevationComponent isn't using openResource
> to look for elevate.xml  (i set solr.solr.home in that line so solr would
> *NOT* attempt to look at "./solr" ... it does need some sort of Solr Home,
> but in this case it was a completely empty directory)

I've been trying to think of ways to tackle this. I hate getConfigDir -
it lets anyone just get around the ResourceLoader basically.

It would be awesome to get rid of it somehow - it would make
ZooKeeperSolrResourceLoader so much easier to get working correctly
across the board.

The main thing I'm hung up on is how to update a file - some code I've
seen uses getConfigDir to update files, e.g. you get the content of
solrconfig, then you want to update it and reload the core. Most other
things, I think, are doable without getConfigDir.

QueryElevationComponent is actually sort of simple to get around - we
just need to add an exists method that returns true/false if the resource
exists. QEC just uses getConfigDir to do an exists check on the
elevate.xml - if it's not there, it looks in the data dir.
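
The exists idea could look something like the sketch below. All class and function names are hypothetical, and the in-memory loader is only a stand-in for a non-filesystem backend such as a ZooKeeper-based loader. The point is that the caller asks the loader whether a resource exists instead of poking at a config directory itself.

```python
import os


class DirResourceLoader:
    """Filesystem-backed loader (hypothetical name)."""

    def __init__(self, conf_dir):
        self.conf_dir = conf_dir

    def exists(self, name):
        return os.path.isfile(os.path.join(self.conf_dir, name))


class InMemoryResourceLoader:
    """Stand-in for a backend that has no config directory at all."""

    def __init__(self, resources):
        self.resources = dict(resources)

    def exists(self, name):
        return name in self.resources


def locate_elevate_file(loader, data_dir, name="elevate.xml"):
    """Mirror the check described above: conf first, otherwise the data dir."""
    if loader.exists(name):
        return ("conf", name)
    return ("data", os.path.join(data_dir, name))
```

Because locate_elevate_file only calls exists, it works the same way with either loader.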

--
- Mark

http://www.lucidimagination.com




Re: If you could have one feature in Solr...

Noble Paul നോബിള്‍  नोब्ळ्
On Fri, Mar 5, 2010 at 4:34 AM, Mark Miller <[hidden email]> wrote:

> I've been trying to think of ways to tackle this. I hate getConfigDir - it
> lets anyone just get around the ResourceLoader basically.
>
> It would be awesome to get rid of it somehow - it would make
> ZooKeeperSolrResourceLoader so much easier to get working correctly across
> the board.
Why not just get rid of it? Components that depend on the filesystem are a
big headache.




--
-----------------------------------------------------
Noble Paul | Systems Architect| AOL | http://aol.com

Re: If you could have one feature in Solr...

T. Kuro Kurosaka
In reply to this post by Adrien Specq
(Sorry for very late response on this topic.)

On Feb 28, 2010, at 5:47 AM, Adrien Specq wrote:

> - language attribute for each field

I was thinking about it and it was one of my wishes.
Currently, Solr practically requires that we have
a field for each natural language that an application
supports.  If the app needs to support English, French and
German, we would have to have title_en, title_fr, and title_de
(suffixes are ISO 2-letter lang codes) instead of just
a title field.  This isn't pretty.  

What if we want to support 15 languages?  It would be much
better if we could have just one title field and language
information associated with the value.

But after I thought about it a bit deeper, I think the
current ugly solution is actually practical.  This is because
most users want to find documents in the languages they
understand.  So if a user indicates they understand English and
German only, we just need to search title_en and title_de.
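
As a small illustration of the per-language field approach, the sketch below builds a query that searches only the fields for the languages a user understands. The helper names are hypothetical, and the field naming simply follows the title_en / title_de convention described above.

```python
def language_fields(base_field, languages):
    """Map a base field and 2-letter language codes to per-language field names."""
    return [f"{base_field}_{lang}" for lang in languages]


def build_query(base_field, languages, text):
    """OR the same quoted phrase across every per-language field."""
    # Escape backslashes and double quotes so the phrase stays well-formed.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    clauses = [f'{field}:"{escaped}"' for field in language_fields(base_field, languages)]
    return " OR ".join(clauses)


# A user who understands English and German only.
query = build_query("title", ["en", "de"], "search engine")
print(query)
```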

Maybe I'm missing something...

----
Teruhiko "Kuro" Kurosaka, 415-227-9600 x122
RLP + Lucene & Solr = powerful search for global contents


Re: If you could have one feature in Solr...

gearond
Most databases have only RECENTLY set up languages per column. Languages per ENTRY in a column? I don't think any support that yet. How would you get that information from a database with the corresponding language attribute?


Dennis Gearon

Signature Warning
----------------
EARTH has a Right To Life,
  otherwise we all die.

Read 'Hot, Flat, and Crowded'
Laugh at http://www.yert.com/film.php



Re: If you could have one feature in Solr...

T. Kuro Kurosaka
First of all, I am not really concerned with the "per field"
(or per-column, in DB terms) portion of the original request.
Most documents are monolingual.

How languages are identified depends on your application,
and database support of language tagging is not necessary.

The database schema designer may have created a field that
stores the language information, for example.

If you are indexing documents that live in a file system,
the directory hierarchy or the name of the documents might
tell the language, assuming you have set up some standard
naming convention.

HTML documents may have the META tag for Content-Language.
If it is from an HTTP feed, there may be a Content-Language header.

And if all else fails, or the information is not reliable, the language
can be determined by analyzing the document statistically with software
such as Nutch's Language Identifier, or a commercial language identifier
like the one my employer, Basis Technology, sells.
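
The fallback chain above could be sketched like this. Everything here is an assumption made for illustration: the document is a plain dict, the key names (lang_column, path, content_language, text) are invented, the directory convention is "a directory named after the language code", and statistical_guess is a placeholder for whatever language identifier is plugged in.

```python
KNOWN_LANGUAGES = {"en", "fr", "de", "ja"}  # hypothetical set of supported codes


def detect_language(doc, statistical_guess=None):
    """Return a language code from the first source that yields one, else None."""
    # 1. A database column that stores the language.
    if doc.get("lang_column"):
        return doc["lang_column"].lower()
    # 2. A naming convention: some directory in the path is a language code.
    for part in doc.get("path", "").split("/")[:-1]:
        if part.lower() in KNOWN_LANGUAGES:
            return part.lower()
    # 3. Content-Language from a META tag or HTTP header, e.g. "de-CH, en" -> "de".
    header = doc.get("content_language")
    if header:
        return header.split(",")[0].strip().split("-")[0].lower()
    # 4. Last resort: statistical identification of the text itself.
    if statistical_guess is not None:
        return statistical_guess(doc.get("text", ""))
    return None
```

If the declared metadata is known to be unreliable for a given source, the order can be changed so that the statistical step runs first.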


----
Teruhiko "Kuro" Kurosaka, 415-227-9600 x122
RLP + Lucene & Solr = powerful search for global contents