[DISCUSS] Geo/spatial organization in Lucene

david.w.smiley@gmail.com
I think everyone agrees the current state of spatial code organization in Lucene is not desirable.  We have a spatial module that has almost nothing in it, we have mature spatial code in the sandbox that needs to "graduate" somewhere, and we've got a handful of geo utilities in Lucene core (mostly because I didn't notice).  No agreement has been reached on what the desired state should be.

I'd like to hear opinions on this from members of the community.  I am especially interested in hearing from people who normally don't speak up about spatial matters, perhaps Uwe Schindler and Alan Woodward -- I respect both of you guys a ton for your tenure with Lucene and for not being too pushy with your opinions. I can be convinced to change my mind, especially if the argument comes from you two.  Of course anyone can respond -- this is an open discussion!

As I understand it, there are two proposals loosely defined as follows:

(A) Common spatial needs will be met in the "spatial" module.  The Lucene "spatial" module, currently in a weird gutted state, should have basically all spatial code currently in sandbox plus all geo stuff in Lucene core. Thus there will be no geo stuff in Lucene core.

(B) Common spatial needs will be met by Lucene core.  Lucene core should expand its current "geo" utilities to include the spatial stuff currently in the sandbox module.  It'd also take on what little remains in the Lucene spatial module and thus we can remove the spatial module.

With either plan if a user has certain advanced/specialized needs they may need to go to spatial3d or spatial-extras modules.  These would be untouched in both proposals.

I'm in favor of (A) on the grounds that we have modules for special feature areas, and spatial should be no different.  My gut estimation is that 75-90% of apps do not have spatial requirements and need not depend on any spatial module.  Other modules are probably used more (e.g. queries, suggest, etc.)

Respectfully,
  ~ David

p.s. if I mischaracterized any proposal or overlooked another then I'm sorry, please correct me.
--
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker

Re: [DISCUSS] Geo/spatial organization in Lucene

Adrien Grand
I have a slight preference for B, similar to how StandardAnalyzer is in core while other analyzers are in analysis, but no strong feelings. In any case I agree that both A and B would be much better than the current situation.

On Wed, 20 June 2018 at 18:09, David Smiley <[hidden email]> wrote:
I think everyone agrees the current state of spatial code organization in Lucene is not desirable.  We have a spatial module that has almost nothing in it, we have mature spatial code in the sandbox that needs to "graduate" somewhere, and we've got a handful of geo utilities in Lucene core (mostly because I didn't notice).  No agreement has been reached on what the desired state should be.

I'd like to hear opinions on this from members of the community.  I am especially interested in hearing from people who normally don't speak up about spatial matters, perhaps Uwe Schindler and Alan Woodward -- I respect both of you guys a ton for your tenure with Lucene and for not being too pushy with your opinions. I can be convinced to change my mind, especially if the argument comes from you two.  Of course anyone can respond -- this is an open discussion!

As I understand it, there are two proposals loosely defined as follows:

(A) Common spatial needs will be met in the "spatial" module.  The Lucene "spatial" module, currently in a weird gutted state, should have basically all spatial code currently in sandbox plus all geo stuff in Lucene core. Thus there will be no geo stuff in Lucene core.

(B) Common spatial needs will be met by Lucene core.  Lucene core should expand its current "geo" utilities to include the spatial stuff currently in the sandbox module.  It'd also take on what little remains in the Lucene spatial module and thus we can remove the spatial module.

With either plan if a user has certain advanced/specialized needs they may need to go to spatial3d or spatial-extras modules.  These would be untouched in both proposals.

I'm in favor of (A) on the grounds that we have modules for special feature areas, and spatial should be no different.  My gut estimation is that 75-90% of apps do not have spatial requirements and need not depend on any spatial module.  Other modules are probably used more (e.g. queries, suggest, etc.)

Respectfully,
  ~ David

p.s. if I mischaracterized any proposal or overlooked another then I'm sorry, please correct me.
--
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker

Re: [DISCUSS] Geo/spatial organization in Lucene

Nicholas Knize
If I were to pick between the two, I also have a preference for B.  I've also tried to keep this whole spatial organization rather simple:

core - simple spatial capabilities needed by the 99% spatial use case (e.g., web mapping). Includes LatLonPoint, polygon & distance search (everything currently in sandbox). Lightweight, and no dependencies or complexities. If one wants simple and fast point search, all you need is the core module.

spatial - dependency free. Expands on core spatial to include simple shape searching. Uses internal relations. Everything confined to core and spatial modules.

spatial-extras - expanded spatial capabilities. Welcomes third-party dependencies (e.g., S3, SIS, Proj4J). Targets more advanced/expert GIS use-cases.

geo3d - trades speed for accuracy. I've always struggled with the name, since it implies 3D shapes/point cloud support. But history has shown considering a name change to be a bike-shedding endeavor. 

At the end of the day I'm up for whatever makes most sense for everyone here. Lord knows we could use more people helping out on geo.

- Nick



On Wed, Jun 20, 2018 at 11:40 AM Adrien Grand <[hidden email]> wrote:
I have a slight preference for B, similar to how StandardAnalyzer is in core while other analyzers are in analysis, but no strong feelings. In any case I agree that both A and B would be much better than the current situation.


--
Nicholas Knize  |  Geospatial Software Guy  |  Elasticsearch & Apache Lucene  |  [hidden email]  

Re: [DISCUSS] Geo/spatial organization in Lucene

Alan Woodward-3
I don’t normally speak up on spatial issues because I don’t know anything about spatial stuff, but I suppose a point of view from somebody outside the code may be helpful, so…

I think I’d lean towards B. Having the 99% case in core makes most sense to me, and it means that we can add some pointers to the search package-info to make it easier for people starting out.  Common interfaces in core make it easier to put specialist classes into separate modules without having cross-dependencies.

I’m not sure that having separate ‘spatial’ and ‘spatial3d’ modules is particularly useful, though.  I’d combine these into a single module, with clear package docs explaining what each part is useful for - fast shape searching vs high-precision, etc.

I spent a bit of time in the spatial-extras code last year when I was working on replacing ValueSource.  One question I have, again as an outsider to all this, is this: are there still circumstances where indexing spatial data into the terms index, as spatial-extras does, is better in terms of accuracy or performance than using the Points API?  Or should we think about spatial-extras as we do about the legacy numeric encodings, and direct users to LatLonPoint or the geo3d classes instead?  It’s not at all clear to me what the trade-offs are here.

- Alan

On 20 Jun 2018, at 18:00, Nicholas Knize <[hidden email]> wrote:

If I were to pick between the two, I also have a preference for B.  I've also tried to keep this whole spatial organization rather simple:

core - simple spatial capabilities needed by the 99% spatial use case (e.g., web mapping). Includes LatLonPoint, polygon & distance search (everything currently in sandbox). Lightweight, and no dependencies or complexities. If one wants simple and fast point search, all you need is the core module.

spatial - dependency free. Expands on core spatial to include simple shape searching. Uses internal relations. Everything confined to core and spatial modules.

spatial-extras - expanded spatial capabilities. Welcomes third-party dependencies (e.g., S3, SIS, Proj4J). Targets more advanced/expert GIS use-cases.

geo3d - trades speed for accuracy. I've always struggled with the name, since it implies 3D shapes/point cloud support. But history has shown considering a name change to be a bike-shedding endeavor. 

At the end of the day I'm up for whatever makes most sense for everyone here. Lord knows we could use more people helping out on geo.

- Nick



--
Nicholas Knize  |  Geospatial Software Guy  |  Elasticsearch & Apache Lucene  |  [hidden email]  


Re: [DISCUSS] Geo/spatial organization in Lucene

Karl Wright-2
The abstractions in spatial3d are not the same abstractions as you find in spatial, and yet they sound similar.  So I'd worry that if we threw them together we'd be setting ourselves up for a shotgun marriage at some point.  I would strongly disagree that that was in any way a good idea.

The fundamental difference between the two is one treats the world as a 2D surface, and the other treats the world as an actual ellipsoid.  So it's not just about numeric accuracy, since lines in 2D are not the proper great circles, and there are singularities at the poles.
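
To make the great-circle point concrete, here is a minimal sketch added for illustration. It is not code from Lucene or spatial3d, and the class and method names are made up. It also assumes a perfect unit sphere, which is simpler than the ellipsoid described above. It compares the latitude of the midpoint between two points when the connecting line is drawn on a flat lat/lon plane with the latitude of the midpoint along the great circle through the same two points.

```java
// Illustrative only: hypothetical names, unit sphere instead of an ellipsoid.
class GreatCircleSketch {

    /** Converts latitude/longitude in degrees to a 3D unit vector (x, y, z). */
    public static double[] toUnitVector(double latDeg, double lonDeg) {
        double lat = Math.toRadians(latDeg);
        double lon = Math.toRadians(lonDeg);
        return new double[] {
            Math.cos(lat) * Math.cos(lon),
            Math.cos(lat) * Math.sin(lon),
            Math.sin(lat)
        };
    }

    /** Midpoint latitude when the edge is a straight line in the flat lat/lon plane. */
    public static double planarMidpointLat(double lat1, double lat2) {
        return (lat1 + lat2) / 2.0;
    }

    /**
     * Midpoint latitude along the shorter great-circle arc between two points.
     * The sum of the two unit vectors points at that midpoint; normalizing it
     * and taking asin(z) recovers the latitude. Undefined for antipodal points,
     * where the sum is the zero vector.
     */
    public static double greatCircleMidpointLat(double lat1, double lon1,
                                                double lat2, double lon2) {
        double[] a = toUnitVector(lat1, lon1);
        double[] b = toUnitVector(lat2, lon2);
        double x = a[0] + b[0];
        double y = a[1] + b[1];
        double z = a[2] + b[2];
        double norm = Math.sqrt(x * x + y * y + z * z);
        return Math.toDegrees(Math.asin(z / norm));
    }

    public static void main(String[] args) {
        // Two points on the 60th parallel, 90 degrees of longitude apart.
        double planar = planarMidpointLat(60.0, 60.0);
        double sphere = greatCircleMidpointLat(60.0, 0.0, 60.0, 90.0);
        System.out.println("planar midpoint latitude:       " + planar);
        System.out.println("great-circle midpoint latitude: " + sphere);
    }
}
```

With both endpoints on the 60th parallel, the planar edge stays on that parallel, while the great-circle arc bows toward the pole, so the two models disagree about which points lie on or inside the same shape edge. That is a difference in geometry, not in floating-point precision.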

Karl


On Fri, Jun 22, 2018 at 4:44 AM Alan Woodward <[hidden email]> wrote:
I don’t normally speak up on spatial issues because I don’t know anything about spatial stuff, but I suppose a point of view from somebody outside the code may be helpful, so…

I think I’d lean towards B. Having the 99% case in core makes most sense to me, and it means that we can add some pointers to the search package-info to make it easier for people starting out.  Common interfaces in core make it easier to put specialist classes into separate modules without having cross-dependencies.

I’m not sure that having separate ‘spatial’ and ‘spatial3d’ modules is particularly useful, though.  I’d combine these into a single module, with clear package docs explaining what each part is useful for - fast shape searching vs high-precision, etc.

I spent a bit of time in the spatial-extras code last year when I was working on replacing ValueSource.  One question I have, again as an outsider to all this, is this: are there still circumstances where indexing spatial data into the terms index, as spatial-extras does, is better in terms of accuracy or performance than using the Points API?  Or should we think about spatial-extras as we do about the legacy numeric encodings, and direct users to LatLonPoint or the geo3d classes instead?  It’s not at all clear to me what the trade-offs are here.

- Alan


Re: [DISCUSS] Geo/spatial organization in Lucene

Alexandre Rafalovitch
Maybe it should be spatial2D vs. spatial3D then? That would avoid
implying that spatial3D is a child of spatial.

Regards,
   Alex.

On 22 June 2018 at 05:41, Karl Wright <[hidden email]> wrote:

> The abstractions in spatial3d are not the same abstractions as you find in
> spatial, and yet they sound similar.  So I'd worry that if we threw them
> together we'd be setting ourselves up for a shotgun marriage at some point.
> I would strongly disagree that that was in any way a good idea.
>
> The fundamental difference between the two is one treats the world as a 2D
> surface, and the other treats the world as an actual ellipsoid.  So it's not
> just about numeric accuracy, since lines in 2D are not the proper great
> circles, and there are singularities at the poles.
>
> Karl

Re: [DISCUSS] Geo/spatial organization in Lucene

david.w.smiley@gmail.com
In reply to this post by Nicholas Knize
Nick, are you arguing not only for spatial code to be in Lucene core, but also for the "spatial" module to continue to exist?  And I believe Adrien still wants some spatial stuff in sandbox, so that means spatial code in 5 modules.  Five modules... let that sink in... wow.  Gosh, that's kinda overwhelming IMO.

Karl, do you have any opinions about this stuff?  I don't know what your opinions are, come to think of it.

~ David

On Wed, Jun 20, 2018 at 1:01 PM Nicholas Knize <[hidden email]> wrote:
If I were to pick between the two, I also have a preference for B.  I've also tried to keep this whole spatial organization rather simple:

core - simple spatial capabilities needed by the 99% spatial use case (e.g., web mapping). Includes LatLonPoint, polygon & distance search (everything currently in sandbox). Lightweight, and no dependencies or complexities. If one wants simple and fast point search, all you need is the core module.

spatial - dependency free. Expands on core spatial to include simple shape searching. Uses internal relations. Everything confined to core and spatial modules.

spatial-extras - expanded spatial capabilities. Welcomes third-party dependencies (e.g., S3, SIS, Proj4J). Targets more advanced/expert GIS use-cases.

geo3d - trades speed for accuracy. I've always struggled with the name, since it implies 3D shapes/point cloud support. But history has shown considering a name change to be a bike-shedding endeavor. 

At the end of the day I'm up for whatever makes most sense for everyone here. Lord knows we could use more people helping out on geo.

- Nick



--
Nicholas Knize  |  Geospatial Software Guy  |  Elasticsearch & Apache Lucene  |  [hidden email]  
--
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker

Re: [DISCUSS] Geo/spatial organization in Lucene

Karl Wright-2
Sorry, I just returned from an overseas trip.  I'll try to put some thought into a cogent response when I get a little less scrambled.

Karl


On Fri, Jun 22, 2018 at 4:16 PM David Smiley <[hidden email]> wrote:
Nick, are you arguing not only for spatial code to be in Lucene core, but also for the "spatial" module to continue to exist?  And I believe Adrien still wants some spatial stuff in sandbox, so that means spatial code in 5 modules.  Five modules... let that sink in... wow.  Gosh, that's kinda overwhelming IMO.

Karl, do you have any opinions about this stuff?  I don't know what your opinions are, come to think of it.

~ David

On Wed, Jun 20, 2018 at 1:01 PM Nicholas Knize <[hidden email]> wrote:
If I were to pick between the two, I also have a preference for B.  I've also tried to keep this whole spatial organization rather simple:

core - simple spatial capabilities needed by the 99% spatial use case (e.g., web mapping). Includes LatLonPoint, polygon & distance search (everything currently in sandbox). Lightweight, and no dependencies or complexities. If one wants simple and fast point search, all you need is the core module.

spatial - dependency free. Expands on core spatial to include simple shape searching. Uses internal relations. Everything confined to core and spatial modules.

spatial-extras - expanded spatial capabilities. Welcomes third-party dependencies (e.g., S3, SIS, Proj4J). Targets more advanced/expert GIS use-cases.

geo3d - trades speed for accuracy. I've always struggled with the name, since it implies 3D shapes/point cloud support. But history has shown considering a name change to be a bike-shedding endeavor. 

At the end of the day I'm up for whatever makes most sense for everyone here. Lord knows we could use more people helping out on geo.

- Nick



On Wed, Jun 20, 2018 at 11:40 AM Adrien Grand <[hidden email]> wrote:
I have a slight preference for B, similar to how StandardAnalyzer is in core and other analyzers are in analysis, but no strong feelings. In any case, I agree that both A and B would be much better than the current situation.



Re: [DISCUSS] Geo/spatial organization in Lucene

Nicholas Knize
In reply to this post by david.w.smiley@gmail.com
Hi David. 

I'm not arguing for or against anything in particular. I was simply communicating the state of things as I see it today. And yes, we have spatial code in five modules; and yes, that's pretty crazy. I was originally of the "keep it simple" opinion that all spatial should live in either the spatial module (dependency free) or spatial-extras (dependencies welcome), and that core should have absolutely no spatial code whatsoever. I still feel pretty strongly about this, but there are some compelling reasons to have a simple LatLonPoint in core at minimum. Namely the one Mike raised - "because it's the best default geo implementation we have to offer for basic usage now."  I can't argue with that because, at the end of the day, I think it's good for the Lucene project to have a default spatial capability.

That being said, I have been working on a simple default shape implementation that also uses BKD. While sandbox is certainly a good place for this to start, I do struggle with where it will ultimately land. Does it become a good default shape implementation that should go in core like LatLonPoint? Or is it considered "expert" and goes in the spatial module (which further fragments the spatial codebase)? At the end of the day it doesn't matter as long as 1. the learning curve to contribute is not too high for the rest of the community, 2. javadocs make it clear where everything lives, and 3. people actually read javadocs.
 


Re: [DISCUSS] Geo/spatial organization in Lucene

Karl Wright-2
Data points:

(1) For both geo3d and Robert's implementation (at least!) there exists a public API already.  For geo3d, this consists of:

drwxrwxrwx 0 root root   512 Jun 19 02:47 geom
-rwxrwxrwx 1 root root  5586 Jun 19 02:47 PointInGeo3DShapeQuery.java
-rwxrwxrwx 1 root root  4940 Jun 19 02:47 Geo3DPointOutsideDistanceComparator.java
-rwxrwxrwx 1 root root  6225 Jun 19 02:47 Geo3DPointDistanceComparator.java
-rwxrwxrwx 1 root root 11872 Apr 10 11:59 Geo3DUtil.java
-rwxrwxrwx 1 root root   966 Mar  2 17:39 package-info.java
-rwxrwxrwx 1 root root  5486 Mar  2 17:39 PointInShapeIntersectVisitor.java
-rwxrwxrwx 1 root root  3175 Mar  2 17:39 Geo3DPointSortField.java
-rwxrwxrwx 1 root root  3217 Mar  2 17:39 Geo3DPointOutsideSortField.java
-rwxrwxrwx 1 root root  8681 Mar  2 17:39 Geo3DPoint.java
-rwxrwxrwx 1 root root 22099 Mar  2 17:39 Geo3DDocValuesField.java

... plus shape factories found in geom.  There are similar public API classes in Robert's implementation, but they are implemented very differently and work only with 2d points.  A fair bit of effort went into ensuring that the public API was well thought out, as lightweight as possible, and defensible.

(2) Neither Robert's planar implementation nor my 3d implementation has any external dependencies.

(3) There exists a spatial-4j implementation for geo3d as well, in spatial-extras.  spatial-extras does have external dependencies.

So, as you can see, merging all the packages is possible, but only if you sacrifice backwards compatibility, and only if you accept external dependencies.  Merging everything except spatial-extras is also possible but you still need to give up backwards compatibility, and you'd be putting classes together that have to individually signal what spatial universe they belong to.  That argues against a solution where all geometric implementations are merged into a single "spatial" package at this point.  So my thought is that we maintain multiple spatial-X modules, one for each universe, plus spatial-4j.  It may be possible to combine Nicholas's and Robert's 2D universes together, but I'd recommend doing that with great care since Robert spent quite a bit of time performance tuning his stuff.  Merging "into core" would seem like a good idea only if there was ONE implementation and ONE universe.

Thanks,
Karl



RE: [DISCUSS] Geo/spatial organization in Lucene

Ignacio Vera Sequeiros

The planar implementation is not a full-blown topology library like spatial3d; it only aims to provide very fast implementations for some common use cases. Therefore we might argue that such an implementation should live in core.  In addition, if we are to add a new spatial tree that supports indexing shapes, where should it go?

  1. If we remove all spatial code from core, then the tree might need to be added wherever the spatial code has been moved to.
  2. If the tree is added to core, it should contain at least a basic implementation.

One final note: the geo3d universe contains references to the planar universe. That only makes sense if those shapes are generic and are in core.

Ignacio


Re: [DISCUSS] Geo/spatial organization in Lucene

Karl Wright-2
"One final note, the geo3d universe contains references to the planar universe"

Can you clarify?  Where are these references?

Karl

On Mon, Jun 25, 2018 at 4:25 AM Ignacio Vera Sequeiros <[hidden email]> wrote:

The  planar implementation is not a full blown topology library like spatial3d and it only aims to provide some very fast implementation for some common use cases. Therefore we might argue that such implementation lives on core.  In addition, if we are to add a new spatial tree that supports indexing shapes, where should it go?

 

  1. If we remove all spatial code from core, then the tree might need to be added wherever the spatial code has been moved to.
  2. If the tree is added to core, it should contain at least a basic implementation.

 

 

One final note, the geo3d universe contains references to the planar universe. That only make sense if those shapes are generic and are in core.

 

Ignacio

 

 

 

 

From: Karl Wright [mailto:[hidden email]]
Sent: Sunday, June 24, 2018 7:10 AM
To: Lucene/Solr dev <[hidden email]>
Subject: Re: [DISCUSS] Geo/spatial organization in Lucene

 

Data points:

(1) For both geo3d and Robert's implementation (at least!) there exists a public API already.  For geo3d, this consists of:

 

drwxrwxrwx 0 root root   512 Jun 19 02:47 geom

-rwxrwxrwx 1 root root  5586 Jun 19 02:47 PointInGeo3DShapeQuery.java

-rwxrwxrwx 1 root root  4940 Jun 19 02:47 Geo3DPointOutsideDistanceComparator.java

-rwxrwxrwx 1 root root  6225 Jun 19 02:47 Geo3DPointDistanceComparator.java

-rwxrwxrwx 1 root root 11872 Apr 10 11:59 Geo3DUtil.java

-rwxrwxrwx 1 root root   966 Mar  2 17:39 package-info.java

-rwxrwxrwx 1 root root  5486 Mar  2 17:39 PointInShapeIntersectVisitor.java

-rwxrwxrwx 1 root root  3175 Mar  2 17:39 Geo3DPointSortField.java

-rwxrwxrwx 1 root root  3217 Mar  2 17:39 Geo3DPointOutsideSortField.java

-rwxrwxrwx 1 root root  8681 Mar  2 17:39 Geo3DPoint.java

-rwxrwxrwx 1 root root 22099 Mar  2 17:39 Geo3DDocValuesField.java

 

... plus shape factories found in geom.  There are similar public API classes in Robert's implementation, but they are implemented very differently and work only with 2d points.  A fair bit of effort went into insuring that the public api was well thought out, as lightweight as possible, and defensible.

(2) Neither Robert's planar, or my 3d implementation, has any external dependencies.

(3) There exists a spatial-4j implementation for geo3d as well, in spatial-extras.  spatial-extras does have external dependencies.

So, as you can see, merging all the packages is possible, but only if you sacrifice backwards compatibility, and only if you accept external dependencies.  Merging everything except spatial-extras is also possible but you still need to give up backwards compatibility, and you'd be putting classes together that have to individually signal what spatial universe they belong to.  That argues against a solution where all geometric implementations are merged into a single "spatial" package at this point.  So my thought is that we maintain multiple spatial-X modules, one for each universe, plus spatial-4j.  It may be possible to combine Nicholas's and Robert's 2D universes together, but I'd recommend doing that with great care since Robert spent quite a bit of time performance tuning his stuff.  Merging "into core" would seem like a good idea only if there was ONE implementation and ONE universe.

 

Thanks,

Karl

 

 

On Sat, Jun 23, 2018 at 6:11 PM Nicholas Knize <[hidden email]> wrote:

Hi David. 

 

I'm not arguing for or against anything in particular. I was simply communicating the state of things as I saw today. And yes, we have spatial code in five modules; and yes, that's pretty crazy. I was originally of the "keep it simple" opinion that all spatial should live in either the spatial module (dependency free), or spatial-extras (dependencies welcome) and that core should have absolutely no spatial code whatsoever. I still feel pretty strongly about this, but there are some compelling reasons to have a simple LatLonPoint in core at minimum. Namely the one Mike raised - "because it's the best default geo implementation we have to offer for basic usage now."  I can't argue with that because at the end of the day I think its good for the Lucene project to have a default spatial capability. 

 

That being said, I have been working on a simple default shape implementation that also uses BKD. While sandbox is certainly a good place for this to start, I do struggle with where it will ultimately land. Does it become a good default shape implementation that should go in core like LatLonPoint? Or is it considered "expert" and go in the spatial module (which further fragments the spatial codebase)? At the end of the day it doesn't matter as long as 1. the learning curve to contribute is not too high for the rest of the community, 2. javadocs make it clear where everything lives, and 3. people actually read javadocs.

 

 

On Fri, Jun 22, 2018 at 3:16 PM David Smiley <[hidden email]> wrote:

Nick, are you not only arguing for spatial code to be in Lucene core, but also for the "spatial" module to continue to exist?  And I believe Adrien still wants some spatial stuff in sandbox so that means spatial code in 5 modules.  Five modules... let that sink in... wow.  Gosh that's kinda overwhelming IMO.

 

Karl do you have any opinions about this stuff?  I don't know what your opinions are, come to think of it.

 

~ David

 

On Wed, Jun 20, 2018 at 1:01 PM Nicholas Knize <[hidden email]> wrote:

If I were to pick between the two, I also have a preference for B.  I've also tried to keep this whole spatial organization rather simple:

 

core - simple spatial capabilities needed by the 99% spatial use case (e.g., web mapping). Includes LatLonPoint, polygon & distance search (everything currently in sandbox). Lightweight, and no dependencies or complexities. If one wants simple and fast point search, all you need is the core module.

 

spatial - dependency free. Expands on core spatial to include simple shape searching. Uses internal relations. Everything confined to core and spatial modules.

 

spatial-extras - expanded spatial capabilities. Welcomes third-party dependencies (e.g., S3, SIS, Proj4J). Targets more advanced/expert GIS use-cases.

 

geo3d - trades speed for accuracy. I've always struggled with the name, since it implies 3D shapes/point cloud support. But history has shown considering a name change to be a bike-shedding endeavor. 

 

At the end of the day I'm up for whatever makes most sense for everyone here. Lord knows we could use more people helping out on geo.

 

- Nick

 

 

 

On Wed, Jun 20, 2018 at 11:40 AM Adrien Grand <[hidden email]> wrote:

I have a slight preference for B similarly to how StandardAnalyzer is in core and other analyzers are in analysis, but no strong feelings. In any case I agree that both A and B would be much better than the current situation.

 

Le mer. 20 juin 2018 à 18:09, David Smiley <[hidden email]> a écrit :

I think everyone agrees the current state of spatial code organization in Lucene is not desirable.  We have a spatial module that has almost nothing in it, we have mature spatial code in the sandbox that needs to "graduate" somewhere, and we've got a handful of geo utilities in Lucene core (mostly because I didn't notice).  No agreement has been reached on what the desired state should be.

I'd like to hear opinions on this from members of the community.  I am especially interested in listening to people that normally don't seem to speak up about spatial matters. Perhaps Uwe Schindler and Alan Woodward – I respect both of you guys a ton for your tenure with Lucene, and you aren't too pushy with your opinions. I can be convinced to change my mind, especially if coming from you two.  Of course anyone can respond -- this is an open discussion!

As I understand it, there are two proposals loosely defined as follows:

(A) Common spatial needs will be met in the "spatial" module.  The Lucene "spatial" module, currently in a weird gutted state, should have basically all spatial code currently in sandbox plus all geo stuff in Lucene core. Thus there will be no geo stuff in Lucene core.

(B) Common spatial needs will be met by Lucene core.  Lucene core should expand its current "geo" utilities to include the spatial stuff currently in the sandbox module.  It'd also take on what little remains in the Lucene spatial module and thus we can remove the spatial module. 

With either plan if a user has certain advanced/specialized needs they may need to go to spatial3d or spatial-extras modules.  These would be untouched in both proposals.

I'm in favor of (A) on the grounds that we have modules for special feature areas, and spatial should be no different.  My gut estimation is that 75-90% of apps do not have spatial requirements and need not depend on any spatial module.  Other modules are probably used more (e.g. queries, suggest, etc.)

 

Respectfully,

  ~ David

 

p.s. if I mischaracterized any proposal or overlooked another then I'm sorry, please correct me.

--

Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker

--

Nicholas Knize  |  Geospatial Software Guy  |  Elasticsearch & Apache Lucene  |  [hidden email]  



RE: [DISCUSS] Geo/spatial organization in Lucene

Ignacio Vera Sequeiros

In Geo3DPoint:

 

public static Query newPolygonQuery(final String field, final Polygon... polygons)

 

public static Query newLargePolygonQuery(final String field, final Polygon... polygons)

 

In Geo3DDocValuesField:

 

  public static SortField newOutsidePolygonSort(final String field, final Polygon... polygons)

 

  public static SortField newOutsideLargePolygonSort(final String field, final Polygon... polygons)

 

Here the Polygon class appears to be a generic abstraction for a polygon on the Earth's surface, yet it still lives in core together with the planar implementation.
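
As a rough illustration of that point (a sketch only, not Lucene code): a dependency-free polygon description can be consumed by both a planar routine and a spherical one. Every name here (`SharedPolygonSketch`, `PolygonDescription`, `planarContains`, `toUnitSphere`) is hypothetical and chosen for this example; the real `Polygon` class and the geo3d shape factories are considerably more involved.

```java
/** Hypothetical sketch: one polygon description, two consumers ("universes"). Not Lucene code. */
final class SharedPolygonSketch {

  /** Plain lat/lon vertices in degrees, stored as a closed ring (first vertex == last vertex). */
  static final class PolygonDescription {
    final double[] lats;
    final double[] lons;

    PolygonDescription(double[] lats, double[] lons) {
      if (lats.length != lons.length || lats.length < 4) {
        throw new IllegalArgumentException("need a closed ring of at least 4 vertices");
      }
      int last = lats.length - 1;
      if (lats[0] != lats[last] || lons[0] != lons[last]) {
        throw new IllegalArgumentException("ring must be closed");
      }
      this.lats = lats.clone();
      this.lons = lons.clone();
    }
  }

  /** Planar consumer: even-odd ray casting, treating latitude as y and longitude as x. */
  static boolean planarContains(PolygonDescription p, double lat, double lon) {
    boolean inside = false;
    for (int i = 0, j = p.lats.length - 1; i < p.lats.length; j = i++) {
      // Does the edge (j -> i) straddle the horizontal line through the query latitude?
      boolean crosses = (p.lats[i] > lat) != (p.lats[j] > lat);
      if (crosses) {
        double lonAtLat =
            p.lons[i] + (lat - p.lats[i]) * (p.lons[j] - p.lons[i]) / (p.lats[j] - p.lats[i]);
        if (lon < lonAtLat) {
          inside = !inside;
        }
      }
    }
    return inside;
  }

  /** Spherical consumer: map each vertex to an (x, y, z) unit vector on a unit sphere. */
  static double[][] toUnitSphere(PolygonDescription p) {
    double[][] xyz = new double[p.lats.length][];
    for (int i = 0; i < p.lats.length; i++) {
      double lat = Math.toRadians(p.lats[i]);
      double lon = Math.toRadians(p.lons[i]);
      xyz[i] = new double[] {
        Math.cos(lat) * Math.cos(lon), Math.cos(lat) * Math.sin(lon), Math.sin(lat)
      };
    }
    return xyz;
  }

  public static void main(String[] args) {
    PolygonDescription square =
        new PolygonDescription(new double[] {0, 0, 10, 10, 0}, new double[] {0, 10, 10, 0, 0});
    System.out.println("planar contains (5, 5): " + planarContains(square, 5, 5));
    System.out.println("vertices on sphere: " + toUnitSphere(square).length);
  }
}
```

The design point is only that the description type carries no geometry engine of its own, so it can sit in a shared location while each implementation interprets it in its own model.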

 

 

 

 

 

 

From: Karl Wright [mailto:[hidden email]]
Sent: Monday, June 25, 2018 12:38 PM
To: Lucene/Solr dev <[hidden email]>
Subject: Re: [DISCUSS] Geo/spatial organization in Lucene

 

' One final note, the geo3d universe contains references to the planar universe'

Can you clarify?  Where are these references?

 

Karl

 

On Mon, Jun 25, 2018 at 4:25 AM Ignacio Vera Sequeiros <[hidden email]> wrote:

The planar implementation is not a full-blown topology library like spatial3d; it only aims to provide very fast implementations for some common use cases. Therefore we might argue that such an implementation should live in core.  In addition, if we are to add a new spatial tree that supports indexing shapes, where should it go?

 

  1. If we remove all spatial code from core, then the tree might need to be added wherever the spatial code has been moved to.
  2. If the tree is added to core, it should contain at least a basic implementation.

 

 

One final note, the geo3d universe contains references to the planar universe. That only makes sense if those shapes are generic and are in core.

 

Ignacio

 

 

 

 

From: Karl Wright [mailto:[hidden email]]
Sent: Sunday, June 24, 2018 7:10 AM
To: Lucene/Solr dev <[hidden email]>
Subject: Re: [DISCUSS] Geo/spatial organization in Lucene

 

Data points:

(1) For both geo3d and Robert's implementation (at least!) there exists a public API already.  For geo3d, this consists of:

 

drwxrwxrwx 0 root root   512 Jun 19 02:47 geom
-rwxrwxrwx 1 root root  5586 Jun 19 02:47 PointInGeo3DShapeQuery.java
-rwxrwxrwx 1 root root  4940 Jun 19 02:47 Geo3DPointOutsideDistanceComparator.java
-rwxrwxrwx 1 root root  6225 Jun 19 02:47 Geo3DPointDistanceComparator.java
-rwxrwxrwx 1 root root 11872 Apr 10 11:59 Geo3DUtil.java
-rwxrwxrwx 1 root root   966 Mar  2 17:39 package-info.java
-rwxrwxrwx 1 root root  5486 Mar  2 17:39 PointInShapeIntersectVisitor.java
-rwxrwxrwx 1 root root  3175 Mar  2 17:39 Geo3DPointSortField.java
-rwxrwxrwx 1 root root  3217 Mar  2 17:39 Geo3DPointOutsideSortField.java
-rwxrwxrwx 1 root root  8681 Mar  2 17:39 Geo3DPoint.java
-rwxrwxrwx 1 root root 22099 Mar  2 17:39 Geo3DDocValuesField.java

 

... plus shape factories found in geom.  There are similar public API classes in Robert's implementation, but they are implemented very differently and work only with 2d points.  A fair bit of effort went into ensuring that the public API was well thought out, as lightweight as possible, and defensible.

(2) Neither Robert's planar nor my 3d implementation has any external dependencies.

(3) There exists a spatial-4j implementation for geo3d as well, in spatial-extras.  spatial-extras does have external dependencies.

So, as you can see, merging all the packages is possible, but only if you sacrifice backwards compatibility, and only if you accept external dependencies.  Merging everything except spatial-extras is also possible but you still need to give up backwards compatibility, and you'd be putting classes together that have to individually signal what spatial universe they belong to.  That argues against a solution where all geometric implementations are merged into a single "spatial" package at this point.  So my thought is that we maintain multiple spatial-X modules, one for each universe, plus spatial-4j.  It may be possible to combine Nicholas's and Robert's 2D universes together, but I'd recommend doing that with great care since Robert spent quite a bit of time performance tuning his stuff.  Merging "into core" would seem like a good idea only if there was ONE implementation and ONE universe.

 

Thanks,

Karl

 

 



Re: [DISCUSS] Geo/spatial organization in Lucene

Karl Wright-2
Yes, thank you for reminding me.

The Polygon description abstraction is shared at least between Robert's implementation and geo3d.  It must indeed remain available in core.

Karl




Re: [DISCUSS] Geo/spatial organization in Lucene

Michael McCandless-2
In reply to this post by Nicholas Knize
I also favor B: move the common case, good performing spatial implementations to core, but still bake new things in sandbox.  LatLonPoint has baked way too long already!  The addition of first class (codec support) KD trees in Lucene (dimensional points) was/is really a game changer for Lucene supporting common geo spatial applications.

It would be nice to find a better name than geo3d / spatial3d: it confuses 100% of the people I explain it to, on first impression :)  But we should tackle that separately/later.

Merging the 2D/3D abstractions sounds a little too ambitious at this point, so I think it's fine to leave them separate for now.
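
For readers wondering why KD trees matter for the common point-search case mentioned above, here is a minimal in-memory sketch of a 2D k-d tree with a bounding-box search. It is not Lucene's BKD implementation (which is block-based and lives behind the codec); the class and method names (`KdBoxSearchSketch`, `build`, `boxQuery`) are assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of a 2D k-d tree over {lat, lon} points. Not Lucene's BKD code. */
final class KdBoxSearchSketch {

  static final class Node {
    final double[] point; // {lat, lon}
    final int axis;       // 0 = split on lat, 1 = split on lon
    final Node left;
    final Node right;

    Node(double[] point, int axis, Node left, Node right) {
      this.point = point;
      this.axis = axis;
      this.left = left;
      this.right = right;
    }
  }

  /** Builds a tree over pts[from, to); note that this sorts the input array in place. */
  static Node build(double[][] pts, int from, int to, int depth) {
    if (from >= to) {
      return null;
    }
    final int axis = depth % 2;
    Arrays.sort(pts, from, to, Comparator.comparingDouble((double[] p) -> p[axis]));
    int mid = (from + to) >>> 1;
    return new Node(
        pts[mid], axis, build(pts, from, mid, depth + 1), build(pts, mid + 1, to, depth + 1));
  }

  /** Collects every point inside the inclusive box [min, max], pruning subtrees that cannot match. */
  static void boxQuery(Node n, double[] min, double[] max, List<double[]> out) {
    if (n == null) {
      return;
    }
    double[] p = n.point;
    if (p[0] >= min[0] && p[0] <= max[0] && p[1] >= min[1] && p[1] <= max[1]) {
      out.add(p);
    }
    double split = p[n.axis];
    if (min[n.axis] <= split) {
      boxQuery(n.left, min, max, out); // left subtree holds values <= split on this axis
    }
    if (max[n.axis] >= split) {
      boxQuery(n.right, min, max, out); // right subtree holds values >= split on this axis
    }
  }

  public static void main(String[] args) {
    double[][] pts = {
      {40.7, -74.0}, {51.5, -0.1}, {48.9, 2.4}, {35.7, 139.7}, {-33.9, 151.2}
    };
    Node root = build(pts, 0, pts.length, 0);
    List<double[]> hits = new ArrayList<>();
    boxQuery(root, new double[] {45, -5}, new double[] {55, 5}, hits);
    System.out.println("points in box: " + hits.size());
  }
}
```

The pruning in `boxQuery` is the part that pays off: whole subtrees are skipped when the query box lies entirely on one side of a split.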




Re: [DISCUSS] Geo/spatial organization in Lucene

david.w.smiley@gmail.com
Okay fine, I'm not going to block spatial stuff going into core.  (celebration).  I foresee the spatial stuff there growing beyond the one default impl though.

Perhaps most of us are still not happy with seeing spatial code across so many modules?  Nick and I have voiced this concern so far.  Given the pittance of utility of what's in the spatial module today, can we agree to simply remove it?

I pity users trying to figure out what is where to make sense of it.  I wonder how new users discover/browse to look around -- I'm too used to the codebase to have any idea what newbies do.  The entry point seems to be this: http://lucene.apache.org/core/7_3_1/index.html  Each module only gets one terse sentence fragment.  It'd be nice to have potentially a paragraph of information?  Even without more verbiage, the spatial ones could have better descriptions.  I propose these changes:

* spatial:  remove it :-)   -- see above
* spatial3d: Computational geometry on the surface of a sphere or ellipsoid, including Lucene index & search solutions
* spatial-extras: Spatial code that has external dependencies like Spatial4j and JTS, including Lucene index & search solutions

Perhaps "spatial-sphere" might be a more meaningful name than spatial3d?  Yes, it's ellipsoidal but sphere is close enough ;-)

~ David

On Mon, Jun 25, 2018 at 10:42 AM Michael McCandless <[hidden email]> wrote:
I also favor B: move the common-case, well-performing spatial implementations to core, but still bake new things in sandbox.  LatLonPoint has baked way too long already!  The addition of first-class (codec-supported) KD trees in Lucene (dimensional points) was/is really a game changer for Lucene supporting common geospatial applications.

It would be nice to find a better name than geo3d / spatial3d: it confuses 100% of the people I explain it to, on first impression :)  But we should tackle that separately/later.

Merging the 2D/3D abstractions sounds a little too ambitious at this point, so I think it's fine to leave them separate for now.

On Wed, Jun 20, 2018 at 1:00 PM, Nicholas Knize <[hidden email]> wrote:
If I were to pick between the two, I also have a preference for B.  I've also tried to keep this whole spatial organization rather simple:

core - simple spatial capabilities needed by the 99% spatial use case (e.g., web mapping). Includes LatLonPoint, polygon & distance search (everything currently in sandbox). Lightweight, and no dependencies or complexities. If one wants simple and fast point search, all you need is the core module.

spatial - dependency free. Expands on core spatial to include simple shape searching. Uses internal relations. Everything confined to core and spatial modules.

spatial-extras - expanded spatial capabilities. Welcomes third-party dependencies (e.g., S3, SIS, Proj4J). Targets more advanced/expert GIS use-cases.

geo3d - trades speed for accuracy. I've always struggled with the name, since it implies 3D shapes/point cloud support. But history has shown considering a name change to be a bike-shedding endeavor. 

At the end of the day I'm up for whatever makes most sense for everyone here. Lord knows we could use more people helping out on geo.

- Nick



On Wed, Jun 20, 2018 at 11:40 AM Adrien Grand <[hidden email]> wrote:
I have a slight preference for B, similar to how StandardAnalyzer is in core and other analyzers are in analysis, but no strong feelings. In any case I agree that both A and B would be much better than the current situation.

--
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker

Re: [DISCUSS] Geo/spatial organization in Lucene

Alan Woodward-3
+1 to move LatLonPoint and friends to core, and nuke the spatial module


Re: [DISCUSS] Geo/spatial organization in Lucene

Ignacio Vera Sequeiros

+1



Re: [DISCUSS] Geo/spatial organization in Lucene

Nicholas Knize
+1

--
Nicholas Knize  |  Geospatial Software Guy  |  Elasticsearch & Apache Lucene  |  [hidden email]  

Re: [DISCUSS] Geo/spatial organization in Lucene

Karl Wright-2
+1

