Two-dimensional spatial hashing with space-filling curves

The Hilbert curve is a fractal space-filling curve that is rather pretty to look at. hilbert.png A Hilbert curve of order n traces a single path over a square of side 2^n units, as you can see in the images from MathWorld above (with curves of order 2 through 6). If you pick a point p from within the square covered by a curve, and trace out the curve until you reach that point, the number of intermediate points that you passed through along your way to p is called the Hilbert distance. The notion of distance along a space-filling curve is useful for a few reasons.
  • You can use it as a hash from a 2D coordinate to an integer.
  • Some curves preserve locality; for example, points that are nearby in 2D space are likely to have similar Hilbert distances.
  • It’s an isomorphism: you can turn a distance back into a 2D coordinate, in some cases quite easily.
  • Locality-preserving hashes can be useful for clustering nodes of an R-tree, which is the generalisation of a B-tree to spatial data.
There’s a nice algorithm for computing the Hilbert distance of a 2D point that’s linear in the order of the curve (or logarithmic in the size of the side of the square you’re trying to cover). To put this into useful terms, a “10 megapixel” camera typically has a resolution of 3766×2800 pixels. To cover every pixel with a Hilbert curve, you’d need a curve with 4096 points on a side, which is an order-12 curve. You can compute the Hilbert distance of any point in your 10-megapixel photo with 12 steps of the distance algorithm. Here’s an implementation of this Hilbert distance algorithm in Haskell.
import Data.Bits
hilbertDistance :: (Bits a, Ord a) => Int -> (a,a) -> a
hilbertDistance d (x,y)
    | x < 0 || x >= 1 `shiftL` d = error "x bounds"
    | y < 0 || y >= 1 `shiftL` d = error "y bounds"
    | otherwise = dist (1 `shiftL` (d - 1)) (1 `shiftL` ((d - 1) * 2)) 0 x y
    where dist 0 _ result _ _ = result
          dist side area result x y =
              case (compare x side, compare y side) of
              (LT, LT) -> step result y x
              (LT, _)  -> step (result + area) x (y - side)
              (_, LT)  -> step (result + area * 3) (side - y - 1)
                          (side * 2 - x - 1)
              (_, _)   -> step (result + area * 2) (x - side) (y - side)
              where step = dist (side `shiftR` 1) (area `shiftR` 2)
As you can see, each step requires a few comparisons, a few shifts, and a handful of arithmetic operations, so it’s fairly cheap, and the loop unrolls and constant-folds well. There’s some controversy in the literature over whether a Hilbert curve gives better spatial locality than a Lebesgue (also known as a Z-order, or Morton) curve. It’s certainly the case that computing the Lebesgue distance is much cheaper than the Hilbert distance, as it’s possible to finagle an answer using a fixed, unconditional sequence of bit-twiddling operations to interleave the bits of the x coordinate with those of the y.
lebesgueDistance :: (Int, Int) -> Int
lebesgueDistance (x,y) =
    let ms = [(0x00FF00FF, 8), (0x0F0F0F0F, 4),
              (0x33333333, 2), (0x55555555, 1)]
        aaargh x (mask,shift) = (x .|. (x `shiftL` shift)) .&. mask
        x' = foldl aaargh x ms
        y' = foldl aaargh y ms
    in x .|. (y `shiftL` 1)
I haven’t myself checked to see which curve better preserves locality. It’s certainly easier to see how to extend the Lebesgue bit-fiddling hack to a higher dimension, while the given Hilbert code is a bit of a head wrecker.
Posted in haskell, software
10 comments on “Two-dimensional spatial hashing with space-filling curves
  1. Hey Bryan, why the interest in fractal curves? Are you using them in some application?

  2. Patrick Mézard says:

    For the record, I did performance comparison of Hilbert vs Z-order indexing for a cartographic tiling or Western Europe, mostly containing the road network and water/forest/building features. The indexed database was something like 15/20Gb of vector data, test machine looks like a bi-PIII-700Mhz with 1Gb of memory and SCSI disks (was 7 years ago).

    I did not find relevant differences despite what several papers/thesis claim (Moon,Faloutsos&Jagadish 1996, Lawder 2001). Maybe the code/tests I wrote were incorrect or biased in a manner which invalidated the expected locality gains, anyway nothing spectacular came from it.

    I was sorry about it, I thought it was a cool idea. Too bad it just did not work for me.

  3. Ben – a number of years ago, I wrote a spatial database in Scheme, using Hilbert R-trees. The fast Hilbert distance algorithm was a useful little snippet that fit nicely into the constraints of a blog posting 🙂

  4. Malcolm Rowe says:

    Nice, I was aware of Hilbert curves, but I hadn’t heard about Lebesgue curves before!

    See also: http://xkcd.com/c195.html

  5. Ezra says:

    Hello–

    Is there a typo in the lebesgue algorithm? It appears you’re defining x’ and y’ and then ignoring them. Is the last line meant to be

    in x’ .|. (y’ `shiftL` 1)

    instead? It seems to work better that way.

  6. jberryman says:

    Cool, thanks! I wonder if space-filling curves can be used in functional languages to represent a grid of mutable objects, like a checker board.

  7. taotree says:

    Thanks for the info and especially the code implementations.

    Ezra is correct, there is a typo. I used the code given and it results in incorrect values. I took Ezra’s fix and it looks much better now. It would be nice if the author could correct the code in the post.

  8. Mats Rauhala says:

    Could you point on to some material about hilbert curves, especially about mapping 2d->1d, 1d->2d?

  9. Carter T Schonwald says:

    I”ll actually have some benchmarks on morton vs hilbert layouts for linear algebra soon. It also looks like a straightline loopless hilbert algorithm *may* exist, trying to sus out the details shortly. (will certainly be bigger than the morton algorithm, but thats a small price for better locality!)

  10. Josephpence says:

    dtvj here

3 Pings/Trackbacks for "Two-dimensional spatial hashing with space-filling curves"
  1. […] A nice way to hash a 2-D point, while preserving some locality.  Link. […]

  2. […] key for sorting the info within the structure. Actually, I’ve found an implementation of the Hilbert distance algorithm in Haskell (which happens to be the language that I’ll be using on my homework), and I’m allowed […]

Leave a Reply

Your email address will not be published. Required fields are marked *

*