Sorting on distance from a long/lat

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Sorting on distance from a long/lat

rhodebump
I am successfully able to search for "nearbys" given a longitude and a
latitude.  The basic summary of how I do this is that I add 1000 to the
long/lat values and use a RangeFilter in my query.

In my display results, I display the results ordered by distance from the
original long/lat.  What I do is calulate the distance for every document in
my result from the original long/lat and perform a sort of the distance.

Doing the sort this way (calculating the distances for all results
documents) feels like I am being inefficient and wasteful with my CPU
cycles.  In most cases, I am only displaying the closest 10 documents, but I
need to calculate the distance for all documents (potentially 1000) in order
to come up with the 10 closest.

Has anyone wrestled with these questions before?  Is there another approache
that I can take?

Here is my current working implementation, so you can see what I am
describing.  The long/lat is stored in a database that I use to build up my
lucene query/filters
 http://www.visitpa.com/visitpa/visitNearbyActivities.pa?type=dining&name=Color+Me+MineThanks,Phillip


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

RE: Sorting on distance from a long/lat

Bryzek.Michael
Our general approach is:

  * Rewrite the query to compute the min and max latitude and longitude, essentially drawing a box around the coordinate to find all documents within the range. This provides for an efficient constant range query to find the total, unordered document set. This is much faster than trying to do a distance calculation between all documents.

  * Run the search once internally, and compute the distance between all matching results. Generally, this requires about 1000 distance calculations for our document set

  * Run the search again using our now populated search comparator to return ordered results back to the user

Our approach is also rather inefficient for queries that return a large number of documents, but we could not find a feasible way to precompute the distance between lat/long pairs in advance. I'm also interested in hearing other ideas/approaches.

-Mike


-----Original Message-----
From: spamsucks [mailto:[hidden email]]
Sent: Mon 11/20/06 2:05 PM
To: [hidden email]
Subject: Sorting on distance from a long/lat
 
I am successfully able to search for "nearbys" given a longitude and a
latitude.  The basic summary of how I do this is that I add 1000 to the
long/lat values and use a RangeFilter in my query.

In my display results, I display the results ordered by distance from the
original long/lat.  What I do is calulate the distance for every document in
my result from the original long/lat and perform a sort of the distance.

Doing the sort this way (calculating the distances for all results
documents) feels like I am being inefficient and wasteful with my CPU
cycles.  In most cases, I am only displaying the closest 10 documents, but I
need to calculate the distance for all documents (potentially 1000) in order
to come up with the 10 closest.

Has anyone wrestled with these questions before?  Is there another approache
that I can take?

Here is my current working implementation, so you can see what I am
describing.  The long/lat is stored in a database that I use to build up my
lucene query/filters
 http://www.visitpa.com/visitpa/visitNearbyActivities.pa?type=dining&name=Color+Me+MineThanks,Phillip


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]



---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Sorting on distance from a long/lat

Dennis Watson
In reply to this post by rhodebump
Hi,

I apologize if this is slightly off topic.  I have not implemented this, but
the idea came to me after reading another post about measuring distance in
lucene.  It may be completely impractical, however it seems it COULD work at
least if the area to be indexed could be constrained.

What if you divided the earth up into lets say 100 square blocks.  Each block
would have sides of length x1 with an area x1^2.  You could number those
blocks 0 - 99.  Then you could divide each of those blocks into 100 blocks
with sides of length x2 = x1 / 10 and number these 0 - 99 as well.  You could
do this until you had blocks with sides of length xn and area xn^2.

Each one of these blocks would have four corners with a lat and lon.  It would
be fairly easy to tell if one block were in another block.  It should be
fairly easy to tell which set of boxes any lat, lon pair belonged to.

It seems then you could specify locations almost like IP addresses:

   X1.X2.X3...Xn

where X1 is the 0-99 number of the first largest set of blocks and X2 is the
0-99 number of the next smaller set of blocks inside the first X1 box and so
on.

You could search in lucene for objects within a certain proximity with a
prefix search:

   X1.X2.*
   X1.X2.X3.*

Moreover you would know that the longer the path that matches, the closer
those objects are (it is fractal in nature).  If you needed more granularity
you could do a prefix search like this down to the smallest granularity you
have and then perform a greatcircle distance calculation on the the results
to see if they are close enough.

Of course there might be too many squares to cover the whole Earth.  You will
want to pick a number of boxes which is a power of two so there are a
"square" number of squares.  I think this enables the fractal nature
described above. You would want the corners of the squares to land on
predicable lat, lon points.  This may be done more easily with some
measurement systems than another.

Just my $.02...


Dennis Watson
Sr SW Engineer
GUBA.com


On Monday 20 November 2006 11:05, spamsucks wrote:

> I am successfully able to search for "nearbys" given a longitude and a
> latitude.  The basic summary of how I do this is that I add 1000 to the
> long/lat values and use a RangeFilter in my query.
>
> In my display results, I display the results ordered by distance from the
> original long/lat.  What I do is calulate the distance for every document
> in my result from the original long/lat and perform a sort of the distance.
>
> Doing the sort this way (calculating the distances for all results
> documents) feels like I am being inefficient and wasteful with my CPU
> cycles.  In most cases, I am only displaying the closest 10 documents, but
> I need to calculate the distance for all documents (potentially 1000) in
> order to come up with the 10 closest.
>
> Has anyone wrestled with these questions before?  Is there another
> approache that I can take?
>
> Here is my current working implementation, so you can see what I am
> describing.  The long/lat is stored in a database that I use to build up my
> lucene query/filters
>
> http://www.visitpa.com/visitpa/visitNearbyActivities.pa?type=dining&name=Co
>lor+Me+MineThanks,Phillip
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Sorting on distance from a long/lat

Chris Hostetter-3

I'm not really sure what an approach like this gaines you ... it provides
a mechanism for ensuring that the lat/lon of all results are within a
bounding box arround your start location -- but those bounding boxes
are fixed when building your index.

couldn't you achieve the same thing using a "lat" field, a "lon" field
and two RangeFilters? ... except now you get variable sized/centered
bounding boxes?


: Date: Tue, 21 Nov 2006 07:42:29 -0800
: From: Dennis Watson <[hidden email]>
: Reply-To: [hidden email]
: To: [hidden email]
: Cc: spamsucks <[hidden email]>
: Subject: Re: Sorting on distance from a long/lat
:
: Hi,
:
: I apologize if this is slightly off topic.  I have not implemented this, but
: the idea came to me after reading another post about measuring distance in
: lucene.  It may be completely impractical, however it seems it COULD work at
: least if the area to be indexed could be constrained.
:
: What if you divided the earth up into lets say 100 square blocks.  Each block
: would have sides of length x1 with an area x1^2.  You could number those
: blocks 0 - 99.  Then you could divide each of those blocks into 100 blocks
: with sides of length x2 = x1 / 10 and number these 0 - 99 as well.  You could
: do this until you had blocks with sides of length xn and area xn^2.
:
: Each one of these blocks would have four corners with a lat and lon.  It would
: be fairly easy to tell if one block were in another block.  It should be
: fairly easy to tell which set of boxes any lat, lon pair belonged to.
:
: It seems then you could specify locations almost like IP addresses:
:
:    X1.X2.X3...Xn
:
: where X1 is the 0-99 number of the first largest set of blocks and X2 is the
: 0-99 number of the next smaller set of blocks inside the first X1 box and so
: on.
:
: You could search in lucene for objects within a certain proximity with a
: prefix search:
:
:    X1.X2.*
:    X1.X2.X3.*
:
: Moreover you would know that the longer the path that matches, the closer
: those objects are (it is fractal in nature).  If you needed more granularity
: you could do a prefix search like this down to the smallest granularity you
: have and then perform a greatcircle distance calculation on the the results
: to see if they are close enough.
:
: Of course there might be too many squares to cover the whole Earth.  You will
: want to pick a number of boxes which is a power of two so there are a
: "square" number of squares.  I think this enables the fractal nature
: described above. You would want the corners of the squares to land on
: predicable lat, lon points.  This may be done more easily with some
: measurement systems than another.
:
: Just my $.02...
:
:
: Dennis Watson
: Sr SW Engineer
: GUBA.com
:
:
: On Monday 20 November 2006 11:05, spamsucks wrote:
: > I am successfully able to search for "nearbys" given a longitude and a
: > latitude.  The basic summary of how I do this is that I add 1000 to the
: > long/lat values and use a RangeFilter in my query.
: >
: > In my display results, I display the results ordered by distance from the
: > original long/lat.  What I do is calulate the distance for every document
: > in my result from the original long/lat and perform a sort of the distance.
: >
: > Doing the sort this way (calculating the distances for all results
: > documents) feels like I am being inefficient and wasteful with my CPU
: > cycles.  In most cases, I am only displaying the closest 10 documents, but
: > I need to calculate the distance for all documents (potentially 1000) in
: > order to come up with the 10 closest.
: >
: > Has anyone wrestled with these questions before?  Is there another
: > approache that I can take?
: >
: > Here is my current working implementation, so you can see what I am
: > describing.  The long/lat is stored in a database that I use to build up my
: > lucene query/filters
: >
: > http://www.visitpa.com/visitpa/visitNearbyActivities.pa?type=dining&name=Co
: >lor+Me+MineThanks,Phillip
: >
: >
: > ---------------------------------------------------------------------
: > To unsubscribe, e-mail: [hidden email]
: > For additional commands, e-mail: [hidden email]
:
: ---------------------------------------------------------------------
: To unsubscribe, e-mail: [hidden email]
: For additional commands, e-mail: [hidden email]
:



-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Sorting on distance from a long/lat

Dennis Watson
It is similar to the two Range Filter approach except my way is precomputed
and probably faster than filtering through a potentially large result set.  
Also I can quickly compute a rough max distance between two any lat, lon
pairs by compairing thier X1.X2.X3... path.


Dennis Watson
Sr SW Engineer
GUBA.com


On Tuesday 21 November 2006 11:05, Chris Hostetter wrote:

> I'm not really sure what an approach like this gaines you ... it provides
> a mechanism for ensuring that the lat/lon of all results are within a
> bounding box arround your start location -- but those bounding boxes
> are fixed when building your index.
>
> couldn't you achieve the same thing using a "lat" field, a "lon" field
> and two RangeFilters? ... except now you get variable sized/centered
> bounding boxes?
>
> : Date: Tue, 21 Nov 2006 07:42:29 -0800
> : From: Dennis Watson <[hidden email]>
> : Reply-To: [hidden email]
> : To: [hidden email]
> : Cc: spamsucks <[hidden email]>
> : Subject: Re: Sorting on distance from a long/lat
> :
> : Hi,
> :
> : I apologize if this is slightly off topic.  I have not implemented this,
> : but the idea came to me after reading another post about measuring
> : distance in lucene.  It may be completely impractical, however it seems
> : it COULD work at least if the area to be indexed could be constrained.
> :
> : What if you divided the earth up into lets say 100 square blocks.  Each
> : block would have sides of length x1 with an area x1^2.  You could number
> : those blocks 0 - 99.  Then you could divide each of those blocks into 100
> : blocks with sides of length x2 = x1 / 10 and number these 0 - 99 as well.
> :  You could do this until you had blocks with sides of length xn and area
> : xn^2.
> :
> : Each one of these blocks would have four corners with a lat and lon.  It
> : would be fairly easy to tell if one block were in another block.  It
> : should be fairly easy to tell which set of boxes any lat, lon pair
> : belonged to.
> :
> : It seems then you could specify locations almost like IP addresses:
> :
> :    X1.X2.X3...Xn
> :
> : where X1 is the 0-99 number of the first largest set of blocks and X2 is
> : the 0-99 number of the next smaller set of blocks inside the first X1 box
> : and so on.
> :
> : You could search in lucene for objects within a certain proximity with a
> : prefix search:
> :
> :    X1.X2.*
> :    X1.X2.X3.*
> :
> : Moreover you would know that the longer the path that matches, the closer
> : those objects are (it is fractal in nature).  If you needed more
> : granularity you could do a prefix search like this down to the smallest
> : granularity you have and then perform a greatcircle distance calculation
> : on the the results to see if they are close enough.
> :
> : Of course there might be too many squares to cover the whole Earth.  You
> : will want to pick a number of boxes which is a power of two so there are
> : a "square" number of squares.  I think this enables the fractal nature
> : described above. You would want the corners of the squares to land on
> : predicable lat, lon points.  This may be done more easily with some
> : measurement systems than another.
> :
> : Just my $.02...
> :
> :
> : Dennis Watson
> : Sr SW Engineer
> : GUBA.com
> :
> : On Monday 20 November 2006 11:05, spamsucks wrote:
> : > I am successfully able to search for "nearbys" given a longitude and a
> : > latitude.  The basic summary of how I do this is that I add 1000 to the
> : > long/lat values and use a RangeFilter in my query.
> : >
> : > In my display results, I display the results ordered by distance from
> : > the original long/lat.  What I do is calulate the distance for every
> : > document in my result from the original long/lat and perform a sort of
> : > the distance.
> : >
> : > Doing the sort this way (calculating the distances for all results
> : > documents) feels like I am being inefficient and wasteful with my CPU
> : > cycles.  In most cases, I am only displaying the closest 10 documents,
> : > but I need to calculate the distance for all documents (potentially
> : > 1000) in order to come up with the 10 closest.
> : >
> : > Has anyone wrestled with these questions before?  Is there another
> : > approache that I can take?
> : >
> : > Here is my current working implementation, so you can see what I am
> : > describing.  The long/lat is stored in a database that I use to build
> : > up my lucene query/filters
> : >
> : > http://www.visitpa.com/visitpa/visitNearbyActivities.pa?type=dining&nam
> : >e=Co lor+Me+MineThanks,Phillip
> : >
> : >
> : > ---------------------------------------------------------------------
> : > To unsubscribe, e-mail: [hidden email]
> : > For additional commands, e-mail: [hidden email]
> :
> : ---------------------------------------------------------------------
> : To unsubscribe, e-mail: [hidden email]
> : For additional commands, e-mail: [hidden email]
>
> -Hoss
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]