17 Responses to “First steps with Haskell text API improvement”

  1. on 06 Jul 2009 at 03:09 Stephen Blackheath

    Bryan – This is great stuff! Thank you for your good works!

    I’m an extreme disliker of helper functions, and yet (contrary to what I said on #haskell) even I am starting to think that ‘strip’ is a good idea. Some observations:

    - I don’t like ‘dropAfter’ because it sounds like it drops everything after the first occurrence of the delimiter (from the left). I thought about ‘dropFinal’ but I’m starting to think ‘dropLast’ might be better because it’s consistent with ‘Prelude.last’.

    - I can’t think of anything better than ‘dropAround’. It seems good in that it is clear and memorable.

    - ‘stripLeft’ and ‘stripRight’ assume that left == start. I realize there is a precedent in Haskell: foldl, etc. However, this is a Unicode library and there are 600 million speakers of Arabic, Farsi and Hebrew world-wide. If Haskell does take over the world in spite of itself it would be nice not to annoy/confuse people.

    Making the names consistent with drop* doesn’t work. ‘head’ and ‘last’ (inspired by Prelude) don’t work either because it sounds like they work on a single character only. stripHeads is just plain clunky. So here’s an idea – how about some new terms Start and End? (They’re nouns – ‘Begin’ is a verb). These are conceptually consistent with ‘head’ and ‘last’:

    heads == start
    lasts == end

    - Instead of ‘strip’ you could consider ‘trim’, which is what Java uses – shorter and possibly clearer.

    Here they all are together:

    dropWhile, trimStart
    dropEnd, trimEnd
    dropAround, trim (or trimAround?)

    Less than perfect, I’m afraid, but hopefully there’s something useful in it. — Steve
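
    The names listed above can be sketched on plain Strings to see how they would read in practice. This is only an illustration: the definitions below are hypothetical list versions, not the text package's implementation, and they assume Data.List.dropWhileEnd from current versions of base.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- Hypothetical list version of 'dropAround': drop matching
-- elements from both ends, leaving the middle untouched.
dropAround :: (a -> Bool) -> [a] -> [a]
dropAround p = dropWhileEnd p . dropWhile p

-- The trim family is then the whitespace-specific case.
trimStart, trimEnd, trim :: String -> String
trimStart = dropWhile isSpace
trimEnd   = dropWhileEnd isSpace
trim      = dropAround isSpace
```

    With these definitions, trim "  hi  " gives "hi", while trimStart and trimEnd each leave the opposite side's whitespace in place.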

  2. on 06 Jul 2009 at 04:35 Nicolas Pouillard

    Great progress, thanks!

    About dropAfter, if I understand correctly, the following equation holds:
    dropAfter p = reverse . dropWhile p . reverse

    If so I propose using the words reverse or backward in the name:
    revDropWhile
    dropWhileBackward

    I’m also in favor of trim{Start,End,}.
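
    The equation above can be checked directly on ordinary lists. The sketch below is an assumption about the intended semantics; dropAfterList and dropTrailingSpace are hypothetical names, and the real function in the text package works on Text and may be implemented quite differently.

```haskell
import Data.Char (isSpace)

-- Hypothetical list version of the equation:
-- reverse, drop the matching prefix, reverse back.
dropAfterList :: (a -> Bool) -> [a] -> [a]
dropAfterList p = reverse . dropWhile p . reverse

-- Example specialisation: remove trailing whitespace only.
dropTrailingSpace :: String -> String
dropTrailingSpace = dropAfterList isSpace
```

    Only the trailing run of matching elements is removed, so dropAfterList (== '/') "a/b//" gives "a/b" and the inner slash survives.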

  3. on 06 Jul 2009 at 05:06 Dougal Stanton

    This is all looking pretty good. I totally agree on the ubiquity of “chunksOf”. I always end up recreating it by some name, “groupsOf”, “breaklist”, etc etc. I wonder if just “chunk” would be a good name?

    I don’t really follow the logic of Stephen’s argument about Unicode. Surely the left and right in stripLeft and stripRight are referring to the underlying Haskell lists, which are always written x:y:z:[]. Stripping the left means stripping the head elements of the list. If writing Hebrew with Haskell allows one to write []:z:y:x I’d be very surprised!

  4. on 06 Jul 2009 at 05:31 Mark Wotton

    I’d prefer “chomp” to either “strip” or “trim”, if a perlism isn’t considered too filthy…

  5. on 06 Jul 2009 at 06:27 Arthur van Leeuwen

    I know for a fact dropAfter is useful. However, why not name it in accordance with spanEnd and breakEnd in Data.ByteString.Strict, i.e. dropWhileEnd?

  6. on 06 Jul 2009 at 08:48 Duncan Coutts

    The ‘split’ function was only in the Data.ByteString[.Lazy].Char8 modules, not in Data.List. So there’s no great history or existing standard practice that needs preserving. I’m not sure I’d bother with the Compat module.

  7. on 06 Jul 2009 at 11:01 Programmer

    Chomp? What a horrid name. My vote is for strip or trim, in that order.

  8. on 06 Jul 2009 at 12:14 brian

    stripLeft and stripRight seem fine to me. If they were named stripStart and stripEnd, I’d agree that they were badly named because of the language issue.

  9. on 06 Jul 2009 at 12:42 Ian Taylor

    ‘trimLeft’, ‘trimRight’, ‘trim’ sound good to me.

    I always liked the sound of ‘join’ when dealing with text rather than ‘intercalate’. It goes well with split.
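
    For what it's worth, a text-flavoured 'join' is a one-line alias over Data.List.intercalate. The sketch below is hypothetical (neither 'join' nor 'flatten' here is part of any library's API) and also shows that the name can coexist with Control.Monad.join when the latter is imported qualified.

```haskell
import Data.List (intercalate)
import qualified Control.Monad as Monad

-- Hypothetical alias: 'join' in the string sense.
join :: [a] -> [[a]] -> [a]
join = intercalate

-- The monadic join stays reachable through its qualified name.
flatten :: [[a]] -> [a]
flatten = Monad.join
```

    Here join ", " ["a","b","c"] gives "a, b, c".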

  10. on 06 Jul 2009 at 18:35 Simon Michael

    I would like to see strip* included. I write these helpers in every Haskell project (as strip, lstrip, rstrip).

    You didn’t mention the split library on Hackage. Did you see it? Are there any more good ideas to be harvested from there?

    Great stuff.

  11. on 06 Jul 2009 at 21:30 solrize

    strip/stripLeft/stripRight are in the spirit of the Python names for those functions, which in turn probably has more in common with Haskell than Perl or Java do. So I’d stay with them.

  12. on 07 Jul 2009 at 01:13 Greg

    chunksOf

    in Ruby this is Enumerable#each_slice.

    # Ruby
    (1..10).each_slice(3) {|a| p a} # [1,2,3] …

    -- Haskell
    eachSlice 4 "haskell.org" == ["hask","ell.","org"]

    I like chunksOf, groupsOf or slicesOf.

  13. on 07 Jul 2009 at 01:24 nbloomf

    IIRC the chunksOf function was implemented as groupBy in “On Lisp”.

  14. on 08 Jul 2009 at 20:15 Keith

    I think chunksOf is generally useful enough to be in Data.List. Is it possible (Haskell’?) to add functions like this that turn out to be general enough?

  15. on 12 Jul 2009 at 20:48 Stephen Blackheath

    I second Arthur van Leeuwen’s “dropWhileEnd” suggestion.

  16. on 14 Jul 2009 at 04:43 Johan Tibell

    I agree with Duncan that a Compat module is unnecessary. The number of modules listed at http://hackage.haskell.org/package/text is already rather intimidating. I also wouldn’t bother with splitChar unless it has serious performance benefits.

    I also prefer join to intercalate but I guess that boat already sailed. I don’t remember what the original argument was but if intercalate was chosen over join because of name clashes with monads I think that’s a poor reason since we have namespaces.

    A parting comment: Beware of the potential combinatorial explosion that comes from creating a helper function for every common case rather than relying on composition. Haskell’s lack of keyword arguments makes libraries prone to export lots of fooByBar functions for lots of different “Bars”. If lots of possible “configuration” parameters are absolutely needed for a function consider passing them in a record instead of creating separate functions.
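
    As a rough illustration of the record approach, one function can take its options in a record with a default value instead of growing a family of fooByBar variants. Everything below (SplitOptions, keepEmpty, trimParts, defaultSplitOptions, splitWith) is a hypothetical sketch on Strings, not an API of the text package.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- Hypothetical configuration record for a splitting function.
data SplitOptions = SplitOptions
  { keepEmpty :: Bool  -- keep empty pieces between adjacent delimiters
  , trimParts :: Bool  -- strip whitespace around each piece
  }

-- Defaults that callers override with record-update syntax.
defaultSplitOptions :: SplitOptions
defaultSplitOptions = SplitOptions { keepEmpty = True, trimParts = False }

-- One function covers every combination of options.
splitWith :: SplitOptions -> Char -> String -> [String]
splitWith opts delim = filterEmpty . map tidy . pieces
  where
    -- Cut the input at every occurrence of the delimiter.
    pieces s = case break (== delim) s of
      (piece, [])       -> [piece]
      (piece, _ : rest) -> piece : pieces rest
    tidy
      | trimParts opts = dropWhileEnd isSpace . dropWhile isSpace
      | otherwise      = id
    filterEmpty
      | keepEmpty opts = id
      | otherwise      = filter (not . null)
```

    A caller then writes splitWith defaultSplitOptions { trimParts = True } ',' input, and a new option becomes a new field rather than a new exported function.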

  17. on 04 Oct 2009 at 16:50 Andrew

    Good stuff.

    chunksOf :: Int -> [a] -> [[a]]
    please.
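
    A list version with exactly that type is short to write. This is a minimal sketch, not the text package's implementation, and returning an empty list for a non-positive chunk size is an assumption made here.

```haskell
-- Split a list into consecutive chunks of at most n elements.
-- Assumption: a non-positive n yields no chunks at all.
chunksOf :: Int -> [a] -> [[a]]
chunksOf n xs
  | n <= 0    = []
  | null xs   = []
  | otherwise = chunk : chunksOf n rest
  where
    (chunk, rest) = splitAt n xs
```

    With this definition, chunksOf 4 "haskell.org" gives ["hask","ell.","org"], matching the eachSlice example earlier in the thread; the last chunk is simply shorter when the length is not a multiple of n.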
