SIP-9: Library of Congress Donation: Advanced Query Parser


Nibeck, Mike

Solr has a wide array of query parsers, but it lacks a comprehensive parser aimed at professional users who are non-technical yet use search daily. Such users are willing to learn significant syntax to gain access to complex features such as span queries or literal search. However, they are not search engineers and would not normally be savvy enough to wrangle the full set of options available through parser switching with local parameters and similar mechanisms.

Similarly, the providers of such search often want to recognize and synonymize patterns of text that are important to their users. Often these patterns include punctuation that the current query parsers discard. Finally, a parser that minimizes the risk of disastrous queries and prevents users from arbitrarily invoking any parser they like via local parameters is desirable for security and system safety.

The Library of Congress has many such users on Congress.gov and has developed a query parser and associated analysis filters to meet these needs. The Library of Congress wishes to donate this parser to the ASF and the Lucene-Solr project, and to this end we have published a SIP here:

https://cwiki.apache.org/confluence/display/SOLR/SIP-9+Advanced+Query+Parser

 

Please read the SIP description and then come back here for discussion.

 

Mike Nibeck

Software Engineering Manager

OCIO – IT Design and Development

Library of Congress