Two-dimensional spatial hashing with space-filling curves

The Hilbert curve is a fractal space-filling curve that is rather pretty to look at. hilbert.png A Hilbert curve of order n traces a single path over a square of side 2^n units, as you can see in the images from MathWorld above (with curves of order 2 through 6). If you pick a point p from within the square covered by a curve, and trace out the curve until you reach that point, the number of intermediate points that you passed through along your way to p is called the Hilbert distance. The notion of distance along a space-filling curve is useful for a few reasons.
  • You can use it as a hash from a 2D coordinate to an integer.
  • Some curves preserve locality; for example, points that are nearby in 2D space are likely to have similar Hilbert distances.
  • It’s an isomorphism: you can turn a distance back into a 2D coordinate, in some cases quite easily.
  • Locality-preserving hashes can be useful for clustering nodes of an R-tree, which is the generalisation of a B-tree to spatial data.
There’s a nice algorithm for computing the Hilbert distance of a 2D point that’s linear in the order of the curve (or logarithmic in the size of the side of the square you’re trying to cover). To put this into useful terms, a “10 megapixel” camera typically has a resolution of 3766×2800 pixels. To cover every pixel with a Hilbert curve, you’d need a curve with 4096 points on a side, which is an order-12 curve. You can compute the Hilbert distance of any point in your 10-megapixel photo with 12 steps of the distance algorithm. Here’s an implementation of this Hilbert distance algorithm in Haskell.
import Data.Bits
hilbertDistance :: (Bits a, Ord a) => Int -> (a,a) -> a
hilbertDistance d (x,y)
    | x < 0 || x >= 1 `shiftL` d = error "x bounds"
    | y < 0 || y >= 1 `shiftL` d = error "y bounds"
    | otherwise = dist (1 `shiftL` (d - 1)) (1 `shiftL` ((d - 1) * 2)) 0 x y
    where dist 0 _ result _ _ = result
          dist side area result x y =
              case (compare x side, compare y side) of
              (LT, LT) -> step result y x
              (LT, _)  -> step (result + area) x (y - side)
              (_, LT)  -> step (result + area * 3) (side - y - 1)
                          (side * 2 - x - 1)
              (_, _)   -> step (result + area * 2) (x - side) (y - side)
              where step = dist (side `shiftR` 1) (area `shiftR` 2)
As you can see, each step requires a few comparisons, a few shifts, and a handful of arithmetic operations, so it’s fairly cheap, and the loop unrolls and constant-folds well. There’s some controversy in the literature over whether a Hilbert curve gives better spatial locality than a Lebesgue (also known as a Z-order, or Morton) curve. It’s certainly the case that computing the Lebesgue distance is much cheaper than the Hilbert distance, as it’s possible to finagle an answer using a fixed, unconditional sequence of bit-twiddling operations to interleave the bits of the x coordinate with those of the y.
lebesgueDistance :: (Int, Int) -> Int
lebesgueDistance (x,y) =
    let ms = [(0x00FF00FF, 8), (0x0F0F0F0F, 4),
              (0x33333333, 2), (0x55555555, 1)]
        aaargh x (mask,shift) = (x .|. (x `shiftL` shift)) .&. mask
        x' = foldl aaargh x ms
        y' = foldl aaargh y ms
    in x .|. (y `shiftL` 1)
I haven’t myself checked to see which curve better preserves locality. It’s certainly easier to see how to extend the Lebesgue bit-fiddling hack to a higher dimension, while the given Hilbert code is a bit of a head wrecker.
Posted in haskell, software
9 comments on “Two-dimensional spatial hashing with space-filling curves
  1. Hey Bryan, why the interest in fractal curves? Are you using them in some application?

  2. Patrick Mézard says:

    For the record, I did performance comparison of Hilbert vs Z-order indexing for a cartographic tiling or Western Europe, mostly containing the road network and water/forest/building features. The indexed database was something like 15/20Gb of vector data, test machine looks like a bi-PIII-700Mhz with 1Gb of memory and SCSI disks (was 7 years ago).

    I did not find relevant differences despite what several papers/thesis claim (Moon,Faloutsos&Jagadish 1996, Lawder 2001). Maybe the code/tests I wrote were incorrect or biased in a manner which invalidated the expected locality gains, anyway nothing spectacular came from it.

    I was sorry about it, I thought it was a cool idea. Too bad it just did not work for me.

  3. Ben – a number of years ago, I wrote a spatial database in Scheme, using Hilbert R-trees. The fast Hilbert distance algorithm was a useful little snippet that fit nicely into the constraints of a blog posting :-)

  4. Malcolm Rowe says:

    Nice, I was aware of Hilbert curves, but I hadn’t heard about Lebesgue curves before!

    See also: http://xkcd.com/c195.html

  5. Ezra says:

    Hello–

    Is there a typo in the lebesgue algorithm? It appears you’re defining x’ and y’ and then ignoring them. Is the last line meant to be

    in x’ .|. (y’ `shiftL` 1)

    instead? It seems to work better that way.

  6. jberryman says:

    Cool, thanks! I wonder if space-filling curves can be used in functional languages to represent a grid of mutable objects, like a checker board.

  7. taotree says:

    Thanks for the info and especially the code implementations.

    Ezra is correct, there is a typo. I used the code given and it results in incorrect values. I took Ezra’s fix and it looks much better now. It would be nice if the author could correct the code in the post.

  8. Mats Rauhala says:

    Could you point on to some material about hilbert curves, especially about mapping 2d->1d, 1d->2d?

  9. Carter T Schonwald says:

    I”ll actually have some benchmarks on morton vs hilbert layouts for linear algebra soon. It also looks like a straightline loopless hilbert algorithm *may* exist, trying to sus out the details shortly. (will certainly be bigger than the morton algorithm, but thats a small price for better locality!)

2 Pings/Trackbacks for "Two-dimensional spatial hashing with space-filling curves"
  1. […] A nice way to hash a 2-D point, while preserving some locality.  Link. […]

Leave a Reply

Your email address will not be published. Required fields are marked *

*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>