[jira] Created: (LUCENE-1815) Geohash encode/decode floating point problems

Michael Gibney (Jira)
Geohash encode/decode floating point problems
---------------------------------------------

                 Key: LUCENE-1815
                 URL: https://issues.apache.org/jira/browse/LUCENE-1815
             Project: Lucene - Java
          Issue Type: Bug
          Components: contrib/spatial
    Affects Versions: 2.9
            Reporter: Wouter Heijke


I'm finding the Geohash support in the spatial package to be rather unreliable.
Here is the outcome of a test that repeatedly encodes and decodes the same lat/lon and geohash.
The format:
action geohash=(latitude, longitude)

The result:
encode u173zq37x014=(52.3738007,4.8909347)
decode u173zq37x014=(52.373799999999996,4.890934)
encode u173zq37rpbw=(52.373799999999996,4.890934)
decode u173zq37rpbw=(52.373799999999996,4.8909329999999995)
encode u173zq37qzzy=(52.373799999999996,4.8909329999999995)

If I now switch to the Google Code implementation:

encode u173zq37x014=(52.3738007,4.8909347)
decode u173zq37x014=(52.37380061298609,4.890934377908707)
encode u173zq37x014=(52.37380061298609,4.890934377908707)
decode u173zq37x014=(52.37380061298609,4.890934377908707)
encode u173zq37x014=(52.37380061298609,4.890934377908707)

Note how both the geohashes and the lat/lon values differ between the two implementations!
Things get worse when you work with low-precision geohashes:

decode u173=(52.0,4.0)
encode u14zg429yy84=(52.0,4.0)
decode u14zg429yy84=(52.0,3.999999)
encode u14zg429ywx6=(52.0,3.999999)

And with the Google Code implementation:

decode u173=(52.20703125,4.5703125)
encode u17300000000=(52.20703125,4.5703125)
decode u17300000000=(52.20703125,4.5703125)
encode u17300000000=(52.20703125,4.5703125)

We use geohashes extensively and will, unfortunately, use the Google Code version from now on.
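
For readers who want to reproduce this kind of round-trip test, here is a minimal sketch in Java. It is neither the Lucene GeoHashUtils code nor the Google Code implementation: the class name GeohashRoundTrip and its two methods are made up for illustration, and decode here returns the unrounded center of the cell, which is only one possible convention (the Google Code output above evidently uses a different reference point inside the cell). The point is only that an encoder and decoder that do no decimal rounding keep returning the same geohash.

```java
/** Minimal geohash sketch; not the Lucene or Google Code implementation. */
final class GeohashRoundTrip {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    /** Encodes (lat, lon) as a geohash with the given number of characters. */
    public static String encode(double lat, double lon, int length) {
        double[] latRange = {-90.0, 90.0};
        double[] lonRange = {-180.0, 180.0};
        StringBuilder hash = new StringBuilder(length);
        boolean lonTurn = true; // bits alternate, starting with longitude
        int bits = 0;
        int index = 0;
        while (hash.length() < length) {
            double[] range = lonTurn ? lonRange : latRange;
            double value = lonTurn ? lon : lat;
            double mid = (range[0] + range[1]) / 2;
            index <<= 1;
            if (value >= mid) {
                index |= 1;     // value lies in the upper half
                range[0] = mid;
            } else {
                range[1] = mid; // value lies in the lower half
            }
            lonTurn = !lonTurn;
            if (++bits == 5) {  // five bits per base-32 character
                hash.append(BASE32.charAt(index));
                bits = 0;
                index = 0;
            }
        }
        return hash.toString();
    }

    /** Decodes to the unrounded cell center, returned as {lat, lon}. */
    public static double[] decode(String hash) {
        double[] latRange = {-90.0, 90.0};
        double[] lonRange = {-180.0, 180.0};
        boolean lonTurn = true;
        for (char c : hash.toCharArray()) {
            int index = BASE32.indexOf(c);
            if (index < 0) {
                throw new IllegalArgumentException("invalid geohash character: " + c);
            }
            for (int mask = 16; mask > 0; mask >>= 1) {
                double[] range = lonTurn ? lonRange : latRange;
                double mid = (range[0] + range[1]) / 2;
                if ((index & mask) != 0) {
                    range[0] = mid;
                } else {
                    range[1] = mid;
                }
                lonTurn = !lonTurn;
            }
        }
        return new double[] {
            (latRange[0] + latRange[1]) / 2,
            (lonRange[0] + lonRange[1]) / 2
        };
    }
}
```

With this sketch, calling encode on the result of decode with the same length should give back the geohash you started from, because the cell center always lies strictly inside the cell it was decoded from.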


--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


[jira] Commented: (LUCENE-1815) Geohash encode/decode floating point problems

Michael Gibney (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746808#action_12746808 ]

Michael McCandless commented on LUCENE-1815:
--------------------------------------------

Wouter, or anyone, do you have an idea on where the problem is, or how to fix it?


[jira] Commented: (LUCENE-1815) Geohash encode/decode floating point problems

Michael Gibney (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746836#action_12746836 ]

Simon Willnauer commented on LUCENE-1815:
-----------------------------------------

> Wouter, or anyone, do you have an idea on where the problem is, or how to fix it?

I'm not sure there is anything to fix. Spatial applies error correction when you use GeoHashUtils#decode: it calculates a precision value and rounds the result accordingly. If you need a very precise result, GeoHashUtils#decode_exactly looks much better.

I don't know whether this is a huge issue. I could change the implementation to ignore the precision in decode and encode; maybe that would bring our implementation closer to the one on Google Code. Again, I don't know whether that is really an issue.
The lat values 52.3738007 and 52.373799999999996 are very close, so I guess you wouldn't even notice the difference on a map.
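
To put numbers on how rounding interacts with the hash, here is a rough sketch of the cell size at a given geohash length. It is not taken from GeoHashUtils; the class and method names are hypothetical, and it assumes the standard geohash layout of five bits per character, alternating between longitude and latitude and starting with longitude.

```java
/** Cell dimensions of a standard geohash; an illustrative helper, not Lucene API. */
final class GeohashCellSize {

    /** Width of one cell in degrees of longitude for a geohash of the given length. */
    public static double lonCellWidth(int length) {
        int lonBits = (5 * length + 1) / 2; // longitude takes the extra bit when 5*length is odd
        return 360.0 / Math.pow(2, lonBits);
    }

    /** Height of one cell in degrees of latitude for a geohash of the given length. */
    public static double latCellHeight(int length) {
        int latBits = (5 * length) / 2;
        return 180.0 / Math.pow(2, latBits);
    }
}
```

Under those assumptions a 12-character cell is about 3.4e-7 degrees wide, while the reported decode moved the longitude from 4.8909347 to 4.890934, a shift of about 7e-7 degrees. A shift larger than the cell width has to land in a different cell, which is consistent with the re-encoded geohash changing. The same applies at four characters: a cell is 0.3515625 degrees wide, so a longitude rounded to 4.0 no longer falls in the u173 cell.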

simon


[jira] Commented: (LUCENE-1815) Geohash encode/decode floating point problems

Michael Gibney (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746856#action_12746856 ]

Wouter Heijke commented on LUCENE-1815:
---------------------------------------

No, I don't have a solution, but I've noticed that 'decode_exactly' is less 'lossy' than 'decode'; still, only the Google Code version is 'lossless'.


[jira] Updated: (LUCENE-1815) Geohash encode/decode floating point problems

Michael Gibney (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-1815:
------------------------------------

    Priority: Minor  (was: Major)

I don't think this should be major!


Re: [jira] Updated: (LUCENE-1815) Geohash encode/decode floating point problems

Mark Miller-3
I wish JIRA wouldn't default to major - it would make those tags much
more useful.

--
- Mark

http://www.lucidimagination.com





[jira] Commented: (LUCENE-1815) Geohash encode/decode floating point problems

Michael Gibney (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748548#action_12748548 ]

Wouter Heijke commented on LUCENE-1815:
---------------------------------------

To me it was major, since geohashes are THE way for us to search for a location among the millions of records in our index, and small numbers do count!

I see it this way: if a JPEG picture did not decode to what was encoded, would you accept it, even if the difference were only slight?

Right now I don't want to spend time finding the cause of the issue, since I have working (Google) code and would rather do cooler stuff, like implementing a solution for the 'Greenwich' geohash problem.


[jira] Commented: (LUCENE-1815) Geohash encode/decode floating point problems

Michael Gibney (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787934#action_12787934 ]

patrick o'leary commented on LUCENE-1815:
-----------------------------------------

Which Google Code implementation are you working with?



[jira] Commented: (LUCENE-1815) Geohash encode/decode floating point problems

Michael Gibney (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787973#action_12787973 ]

Wouter Heijke commented on LUCENE-1815:
---------------------------------------

I have been happily using this one for some time now:

http://code.google.com/p/geospatialweb/source/browse/trunk/geohash/src/Geohash.java

