8 Responses to “Case conversion and text 0.3”

  1. on 07 Jun 2009 at 18:18 bla

    Why do you convert ß to SS rather than to capital ß (ẞ)?
    http://en.wikipedia.org/wiki/Capital_%C3%9F

  2. on 08 Jun 2009 at 00:48 Bryan O'Sullivan

    Because ẞ is not widely used in German, and using it as the uppercase form of ß would violate the Unicode SpecialCasing table, which maps uppercase ß to SS.
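
The ß to SS mapping mentioned in this reply can be illustrated with a minimal sketch in plain Haskell. It assumes only `Data.Char` from base; `upperFull` is a hypothetical helper made up for illustration, not the text library's API or its actual implementation, and it covers only this one special case rather than the whole SpecialCasing table.

```haskell
import Data.Char (toUpper)

-- Hypothetical helper for illustration only (not part of any library).
-- Upper-cases a String, expanding ß (U+00DF) to the two characters "SS".
-- Every other character falls back to Data.Char.toUpper, which always
-- maps one character to exactly one character.
upperFull :: String -> String
upperFull = concatMap expand
  where
    expand 'ß' = "SS"          -- one-to-many special case
    expand c   = [toUpper c]   -- ordinary one-to-one mapping
```

With this definition, `upperFull "straße"` evaluates to `"STRASSE"`. The point of the sketch is that a correct upper-casing function has to be able to return more characters than it was given, so it cannot be written as a per-character `Char -> Char` map.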

  3. on 08 Jun 2009 at 14:31 Simon Michael

    I really appreciate what you are doing here, more power to your hacking elbow.

  4. on 09 Jun 2009 at 02:21 Porges

    Are you implementing this all yourself in pure Haskell? Why not hook into ICU or something similar, which would provide a proven-correct implementation?

  5. on 10 Jun 2009 at 01:07 Bryan O'Sullivan

    Porges, where I can, I implement the code in pure Haskell. I’ve written a separate text-icu library that provides bindings to ICU for functionality that is currently just too much trouble to implement in pure Haskell.

  6. on 10 Jun 2009 at 07:10 Porges

    Perhaps we should really have a way to generate the code from the CLDR XML data, so that when the data is updated we can update the Haskell library without too much hassle.

  7. on 10 Jun 2009 at 22:08 Pseudonym

    As interesting as this is, I wonder whether or not this is something that the Haskell community should be maintaining. Wouldn’t a binding for, say, ICU be more appropriate?

  8. on 11 Jun 2009 at 12:45 Bryan O'Sullivan

    Pseudonym, there’s already a text-icu package that contains ICU bindings. My goal is to write as much Unicode handling code in pure Haskell as possible, and to leave the complex and fugly stuff to the ICU bindings. That way, if you have fairly simple needs, your number of dependencies is kept low. Also, crossing back and forth between Haskell and C++ is very expensive (due to the different representations used for text), so calling into ICU shouldn’t be done often.
