That said, the formatter in wiki-arc is plenty darned slow on & codes (I used a really long 'alt form). Lacking a profiler, I can't really tell whether it's treeparse or scanner-string which is slow. Any suggestions on optimization? I think one problem is that each parser returns a list of stuff, without any chance to reuse that list (especially for 'seq forms). Another problem might be that scanner-string is just too slow for easy usage.
I haven't fully digested the wiki parser yet. As a first thought, the use of enclose-sem confuses me a bit -- seems like a reinvention of filt. I doubt that would be a performance issue of course.
Maybe the grammar should be factored, or maybe this is an opportunity to optimize treeparse.
Try this: convert the string to a normal list of characters in advance, and see if the performance improves:
(map idfn (scanner-string "hello world"))
If that doesn't help, then scanner-string is not the problem.
Hmm, sorry, I didn't fully digest 'filt either. I thought 'filt was for transforming the 'parsed field in the return value? 'enclose-sem is intended to act on the 'actions field in the return value (although I did end up passing the entire return value).
The main slowdown occurred on adding & codes. Here's my most degenerate case:
Takes about 5-6 seconds to render; also, if I do something at the REPL (such as searching with (help "searchstring"), or loading some random library), the parsing takes up to 12 seconds. Anyway, I've added a rendering time at the lower right of each rendered page. Note also that I've added caching; to disable caching you'll need to search through wiki-arc.arc for *wiki-def and change the (cached-table) call to (cached-table 'cachetime 0).
(def many (parser)
  "Parser is repeated zero or more times."
  (fn (remaining) (many-r parser remaining nil nil nil nil)))

(let lastcdr (afn (p) (aif (cdr p) (self it) p))
  (def many-r (parser li acc act-acc acctl act-acctl)
    (iflet (parsed remaining actions) (parse parser li)
      (do
        (when parsed
          ; edit: necessary, it seems that some of the other
          ; parsers reuse the return value
          (zap copy parsed)
          ; end of edit
          (if acc
              (= (cdr acctl) parsed)
              (= acc parsed))
          (= acctl (lastcdr parsed)))
        (when actions
          ; edit: necessary, it seems that some of the other
          ; parsers reuse the return value
          (zap copy actions)
          ; end of edit
          (if act-acc
              (= (cdr act-acctl) actions)
              (= act-acc actions))
          (= act-acctl (lastcdr actions)))
        (many-r parser remaining
                acc act-acc acctl act-acctl))
      (return acc li act-acc))))
Basically instead of using join, I used a head+tail form of concatenating lists. It seems to work, and the optimization above seems to drop the test:
((many anything) (range 1 1000))
down to 27 msec (edited: 58 msec) on my machine; it was about 7350 msec with the older version.
What are your thoughts? The code now looks pretty unreadable, though. Also, I'm not 100% sure of its correctness.
UPDATE: yes, it's not correct, however the edited version above seems to work now. Rendering of my "difficult" page has dropped to 1100msec.
I've since added a 'tconc facility to Anarki. Basically tconc encapsulates away the head+tail form of list catenation; a single cons cell is used with car==head and cdr==tail.
The head of the list is the start of the list, while the tail of the list is the last cons cell:
a cons cell:

  O->cdr
  |
  v
  car

the list (1 2 3 4 5):

  O->O->O->O->O->nil
  |  |  |  |  |
  v  v  v  v  v
  1  2  3  4  5

the tconc cell for the above list:

  tconc cell
  O-----------+
  | head      | tail
  v           v
  O->O->O->O->O->nil
  |  |  |  |  |
  v  v  v  v  v
  1  2  3  4  5
'tconc creates a new cell and modifies the tconc cell to repoint the tail to the new tail. You can extract the currently concatenated list by using 'car on the tconc cell.
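For reference, here is a minimal sketch of what such a facility could look like (hedged: the actual lib/tconc.arc on Anarki may use different names and details):

```lisp
; sketch of a tconc facility; the real lib/tconc.arc in
; Anarki may differ in names and details.
(def tconc-new ()
  "Create an empty tconc cell: car is the head, cdr the tail."
  (cons nil nil))

(def tconc (cell val)
  "Append val in O(1) by extending the tail."
  (let new (cons val nil)
    (if (car cell)
        (scdr (cdr cell) new)  ; link the old tail cons to the new one
        (scar cell new))       ; first element: head points to it
    (scdr cell new)            ; tail always points to the new cons
    cell))
```

With this sketch, (car (tconc (tconc (tconc-new) 1) 2)) yields (1 2).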
The diff between my version of treeparse and yours is now:
--- treeparse.arc	2008-03-21 11:59:13.000000000 +0800
+++ m_treeparse.arc	2008-03-22 23:00:51.000000000 +0800
@@ -23,4 +23,6 @@
 ; Examples in "lib/treeparse-examples.arc"
 
+(require "lib/tconc.arc")
+
 (mac delay-parser (p)
   "Delay evaluation of a parser, in case it is not yet defined."
@@ -112,12 +114,12 @@
 (def many (parser)
   "Parser is repeated zero or more times."
-  (fn (remaining) (many-r parser remaining nil nil)))
+  (fn (remaining) (many-r parser remaining (tconc-new) (tconc-new))))
 
 (def many-r (parser li acc act-acc)
   (iflet (parsed remaining actions) (parse parser li)
     (many-r parser remaining
-            (join acc parsed)
-            (join act-acc actions))
-    (return acc li act-acc)))
+            (nconc acc (copy parsed))
+            (nconc act-acc (copy actions)))
+    (return (car acc) li (car act-acc))))
 
 (def many1 (parser)
edit: note that use of 'tconc/'nconc is slightly slower than explicitly passing around the tails: the test runs at 79 msec on my machine (explicit passing ran at 58 msec). This is expected, since we must destructure the cons cell into the head and tail of the list under construction. Would it perhaps be better to use a macro to hide the dirty parts of the explicit head/tail-passing code?
Nice optimization. I'm not so sure about the naming of nconc, though. Although it is used for a similar purpose as the traditional CL nconc, I would expect anything called nconc to behave like this:
(def last-list (li)
  (if (or (no li) (no (cdr li)))
      li
      (last-list (cdr li))))

(def nconc (li . others)
  "Same behavior as Common Lisp nconc."
  (if (no others) li
      (no li)     (apply nconc others)
      (do (= (cdr (last-list li)) (apply nconc others))
          li)))
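A quick sanity check of that CL-style behavior:

```lisp
; expected CL-nconc behavior: destructive concatenation
(= a (list 1 2)
   b (list 3 4))
(nconc a b)  ; -> (1 2 3 4); a itself now ends in b's cells
```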
>> do you think this optimization is worth putting in treeparse?
Certainly. At the moment you are probably 50% of the treeparse user base, so it needs to be fast enough for your use case :)
I admit that efficiency wasn't a big thought when I first wrote treeparse (besides avoiding infinite loops -- hopefully those are gone now...). I fondly remember my CL optimization days... we've gotta make ourselves one of those nifty profilers for Arc.
It seems that 'many is the low-hanging fruit of optimization. I've since gotten an 8-paragraph lorem ipsum piece, totalling about 5k, which renders in 3-4 seconds (around 3800 msec).
Hmm. Profiler.
I'm not 100% sure, but maybe the fact that nearly all the composing parsers decompose the return value of sub-parsers and then recompose a new return value is slowing things down? Having parsers accept an optional return-value argument, which 'return fills in (instead of creating its own), might significantly reduce memory consumption (assuming it's the GC that's slowing things down).
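A hedged sketch of that idea, assuming treeparse's 'return simply builds the 3-element (parsed remaining actions) list (the optional-argument name here is made up):

```lisp
; hypothetical: let callers pass a reusable 3-element list
; that 'return fills in, avoiding a fresh allocation per call
(def return (parsed remaining actions (o reuse))
  (if reuse
      (do (scar reuse parsed)
          (scar (cdr reuse) remaining)
          (scar (cddr reuse) actions)
          reuse)
      (list parsed remaining actions)))
```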
Note that the timing will not be very accurate or particularly useful IMO, since it doesn't count recursion but does count calls to other functions. Sigh. We need a real profiler ^^
Hmm. It seems we can't squeeze much performance out of 'alt; I can't really see a way of optimizing 'alt itself, so possibly we should optimize the grammars that use 'alt.
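One grammar-side option is left-factoring shared prefixes out of 'alt branches, so a failing branch doesn't re-parse tokens the next branch will parse again. A sketch only; it assumes a lit combinator matching one literal token, which may not be treeparse's actual name:

```lisp
; unfactored: 'alt retries each branch from the start, so the
; shared (lit #\a) prefix may be parsed three times
(= p1 (alt (seq (lit #\a) (lit #\b))
           (seq (lit #\a) (lit #\c))
           (seq (lit #\a) (lit #\d))))

; left-factored: the shared prefix is parsed exactly once
(= p2 (seq (lit #\a) (alt (lit #\b) (lit #\c) (lit #\d))))
```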
True, filt doesn't touch the actions field, while sem does. However, I am usually able to replace the use of actions with filters that operate on the parsed field. I prefer this, because the filter style is a more clean and functional style -- rather than relying on side-effects. Hopefully that made sense.
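To illustrate that style (a sketch; it assumes filt applies a function to the parsed list, which is my reading of treeparse, and the wrapped parser is a stand-in):

```lisp
; functional style: rewrite the parsed field with 'filt
; rather than queueing a side effect with 'sem
(= wrap-bold
   (filt (fn (parsed) (join '("<b>") parsed '("</b>")))
         (many anything)))
```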
I don't yet know for certain whether filters could or should be used in this particular case. enclose-sem might be the right way to go after all.
I'll have to defer to you on this one - I've only written a parser combinator type parser once before, and that was before I learned what it was. I did end up using something nearer to filters (i.e. acts on the returned value instead of having a separate 'actions field).
Edit: Perhaps part of the optimization of treeparse could be to eliminate the 'actions field. None of your samples used 'sem, and I might prefer 'filt for such enclosing stuff as ''bold'' and '''italics'''. The [[link]]s might be more difficult (because trailing alphadigs after the closing bracket of a [[link]] must be appended to the link text but not to the link target), but I think it's doable.
I've thought several times of removing sem in favor of filt. As features, they are very similar but filt is usually cleaner. If I don't see a compelling use case for actions soon I may indeed remove them. This would simplify the interface considerably.
Just between us, filt is actually an analogue to Haskell's monadic lift operator. They're even anagrams! (this happened by accident.)
Okay, I've since ported the Arki wikiformat parser to use filt instead of sem. Removing 'actions would reduce slightly the memory consumption of treeparse.
Doesn't help. I inserted (map idfn ...) on the value given to enformat, which is really the currying of enformat-base (the very short (fn ) at the end).
Hmm. I tried refactoring the (apply alt (map str-seq ...)) thing into an alt-str-seq: basically I built a trie, did 'alt over its keys, turned each sub-trie into another 'alt, converted stored sequences to 'seq, and represented end-of-string (nil keys) as 'nothing. But it was buggy and didn't improve performance T.T
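For what it's worth, the trie-building half of that idea might look something like this (a fresh sketch, not the buggy code itself):

```lisp
(def build-trie (strs)
  "Group alternative strings by first character; a nil key
   marks that some alternative ends at this node."
  (let trie (table)
    (each s strs
      (if (empty s)
          (= (trie nil) t)
          (push (cut s 1) (trie (s 0)))))
    (each k (keys trie)
      (when k (zap build-trie (trie k))))
    trie))
```

Each non-nil entry would then become a branch of a nested 'alt, and each nil entry a 'nothing, as described above.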
I've made quite a few attempts at improving performance, but often I just ended up with more bugs, waa!
Hmm. I just tried replacing the scanner. Just as you also found, it didn't help performance.
(def scanner-string2 (s (o start 0) (o end (len s)))
  (map idfn (scanner-string s start end)))
I'll see if I can find time soon to digest the wiki-arc grammar. Possibly it could be optimized, but I suspect treeparse could be handling it better. There are probably lessons to be learned from how Parsec gets good performance.
It seems that part of Parsec's optimization is that it is limited in practice to LL(1) grammars (whatever that means), and its <|> ('alt) fails immediately if the first parser consumed anything at all. Not sure how that translates to treeparse.
LL(1) grammars only require one token of look-ahead to parse.
Parsec does not strictly require this, it can handle infinite look-ahead grammars. However, for good performance, it is best to use LL(1) grammars -- so there will be no backtracking required.
When using Parsec, I have often been surprised by the quick-failing behavior of <|> that you mentioned. Thus, I did not duplicate it in treeparse.