Range query syntax on a polygon field is returning all documents

Mitchell Bösecke
Hi everyone,

I'm trying to index geodetic polygons and then query them using an
arbitrary rectangle. With the Geo3D spatial context factory, the data
indexes just fine, but a range query (as per the Solr documentation)
does not seem to filter the results appropriately (I get all documents
back).

When I switch it to JTS, everything works as expected. However, it
significantly slowed down the initial indexing time. A sample size of 3000
documents took 3 seconds with Geo3D and 50 seconds with JTS.

I've documented my journey in detail on Stack Overflow:
https://stackoverflow.com/q/55212622/1017571

   1. Can I not use the range query syntax with Geo3D? I.e. am I
   misreading the documentation?
   2. Is it expected that using JTS will *significantly* slow down the
   indexing time?

Thanks for any insight.

--
Mitchell Bosecke, B.Sc.
Senior Application Developer

FORCORP
Suite 200, 15015 - 123 Ave NW,
Edmonton, AB, T5V 1J7
www.forcorp.com
(d) 780.733.0494
(o) 780.452.5878 ext. 263
(f) 780.453.3986

Re: Range query syntax on a polygon field is returning all documents

david.w.smiley@gmail.com
Hi Mitchell,

Seems like there's a bug based on what you've shown.
* Can you please try RptWithGeometrySpatialField instead
of SpatialRecursivePrefixTreeFieldType to see if the problem goes away?
That could point to a precision issue, though what you've seen is still
suspicious.
* Can you try one other query syntax, e.g. the bbox query parser, to see if
the problem goes away?  I doubt this is it, but you seem to point to the
syntax being related.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley



Re: Range query syntax on a polygon field is returning all documents

david.w.smiley@gmail.com
I answered on Stack Overflow but will paste it here:

Geo3D requires that polygons adhere to the "right hand rule": the
exterior ring must be in counter-clockwise order and holes must be
clockwise.  If you get this wrong, the meaning of the shape is
inverted, so that little rectangle in Alberta, Canada represents the
inverse of that place.  Consequently most shapes will cover nearly the
entire globe, and so match almost any query rectangle!  Solr's
documentation certainly needs to mention this.  Even I didn't know until I
debugged this today!  It appears some of the GIS industry is migrating to
this rule as well:
http://mapster.me/right-hand-rule-geojson-fixer/
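
For anyone who wants to fix ring order before indexing, here is a minimal
sketch in Python. It is not part of Solr or Geo3D. The function names
(signed_area, enforce_right_hand_rule, to_wkt) are made up for illustration,
and it assumes rings are lists of (lon, lat) pairs. It uses a planar
shoelace test, which should be fine for small polygons but is only an
approximation on a sphere and will not handle rings that cross the
antimeridian or enclose a pole.

```python
from typing import Iterable, List, Sequence, Tuple

Point = Tuple[float, float]  # (lon, lat) in degrees; an assumed convention


def signed_area(ring: Sequence[Point]) -> float:
    """Planar shoelace area: positive if the ring is counter-clockwise."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wraps around; harmless for closed rings
        total += x1 * y2 - x2 * y1
    return total / 2.0


def enforce_right_hand_rule(
    exterior: Sequence[Point], holes: Iterable[Sequence[Point]] = ()
) -> Tuple[List[Point], List[List[Point]]]:
    """Return (exterior, holes) with a CCW exterior and CW holes."""
    ext = list(exterior)
    if signed_area(ext) < 0:  # clockwise exterior: flip it
        ext.reverse()
    fixed_holes = []
    for hole in holes:
        h = list(hole)
        if signed_area(h) > 0:  # counter-clockwise hole: flip it
            h.reverse()
        fixed_holes.append(h)
    return ext, fixed_holes


def to_wkt(exterior: Sequence[Point], holes: Iterable[Sequence[Point]] = ()) -> str:
    """Format the rings as a WKT POLYGON (x y order, i.e. lon lat)."""
    rings = [exterior, *holes]
    body = ", ".join(
        "(" + ", ".join(f"{x} {y}" for x, y in ring) + ")" for ring in rings
    )
    return f"POLYGON ({body})"


if __name__ == "__main__":
    # A clockwise unit square, i.e. the mistake described above.
    clockwise = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
    fixed, _ = enforce_right_hand_rule(clockwise)
    print(to_wkt(fixed))
```

Running the rings through something like this before building the WKT keeps
the check in one place, whichever spatial context factory ends up being used.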

Separately: I would be very curious to see how Geo3D compares to JTS after
you get it working.  Additionally, you likely ought to use
solr.RptWithGeometrySpatialField instead of
solr.SpatialRecursivePrefixTreeFieldType to get the full accuracy of the
vector geometry instead of settling for a grid representation of shapes;
otherwise your queries might get false positives just for being close to an
indexed shape.  Another thing to try is prefixTree="s2", a
not-yet-documented prefixTree that supposedly is much more efficient for
Geo3D specifically.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley

