Arc Forumnew | comments | leaders | submitlogin
1 point by rkts 5874 days ago | link | parent

Also, here's a version which uses iterators to reduce consing. It's ~2x faster on long words (again, assuming no blunders on my part).

  (= wordfile "big.txt")

  (= *nwords* (table))

  (w/infile f wordfile (whilet l readline.f (counts (tokens downcase.l whitec) *nwords*)))

  (= str [coerce _ 'string])

  (= alphabet (accum a (each c "abcdefghijklmnopqrstuvwxyz" (a str.c))))

  (def edits1 (w)
    (fn (f)
      (let n len.w
        (for i 0 (- n 1) (f:+ (sub w 0 i) (sub w (+ i 1))))
        (for i 0 (- n 2) (f:+ (sub w 0 i) (str:w (+ i 1)) (str:w i) (sub w (+ i 2))))
        (for i 0 (- n 1) (each c alphabet (f:+ (sub w 0 i) c (sub w (+ i 1)))))
        (for i 0 n       (each c alphabet (f:+ (sub w 0 i) c (sub w i)))))))

  (def editsn (n w)
    (let ws [_ w]
      (for i 1 n (= ws imappend.edits1.ws))
      (iornil:ikeep [*nwords* _] ws)))

  (def correct (w)
    (aif (or editsn.0.w editsn.1.w editsn.2.w)
      (ibest (compare > [*nwords* _]) it)))

  ; iterator utils

  (def iempty (i) (ccc (fn (c) (i [c nil]) t)))

  (def iornil (i) (and (~iempty i) i))

  (def imappend (r i) (fn (f) (i [r._ f])))

  (def ikeep (p i) (fn (f) (i [if p._ f._])))

  (def ifoldl (r q i) (i [= q r.q._]) q)

  (def ibest (gt i) (ifoldl (fn (y x) (if (or no.y gt.x.y) x y)) nil i))


4 points by rkts 5848 days ago | link

Well, not that anyone cares, but it turns out that the changes I made here do close most of the performance gap between Arc and Python. Python is still faster, but it's nothing to lose sleep over. The most important change was replacing Arc's cut with MzScheme's substring, which works the same way but is implemented in C. The other changes were important too, though, especially replacing lists with iterators. So I guess the lessons are

1. Any Arc library function that has an MzScheme equivalent should be rewritten to use the MzScheme version.

2. Using iterators to reduce consing is a win. So, maybe we should make a library of common iterator utilities (map, filter, etc.)?

-----