Doesn't help. I inserted (map idfn ...) on the value given to enformat, which is really the currying of enformat-base (the very short (fn ) at the end):
Hmm. I tried refactoring the (apply alt (map str-seq ...)) thing to an alt-str-seq, basically I created a trie and then did alt on each key, with each sub-trie another 'alt and stored sequences converted to 'seq, and stored end-of-strings (nil keys) as 'nothing, but it was buggy and didn't improve performance T.T.
I've been doing quite a few attempts at improving performance but often I ended up getting more bugs, waa!
Hmm. I just tried replacing the scanner. Just as you also found, it didn't help performance.
(def scanner-string2 (s (o start 0) (o end (len s)))
(map idfn (scanner-string s start end)))
I'll see if I can find time soon to digest the wiki-arc grammar. Possibly it could be optimized, but I suspect treeparse could be handling it better. There are probably lessons to be learned from how Parsec gets good performance.
It seems that part of Parsec's optimization is that it is actually limited to an LL(1) grammar (whatever that means) and it's <|> ('alt) will fail immediately if the first parser ever consumed anything at all. Not sure how that translates to treeparse.
LL(1) grammars only require one token of look-ahead to parse.
Parsec does not strictly require this, it can handle infinite look-ahead grammars. However, for good performance, it is best to use LL(1) grammars -- so there will be no backtracking required.
When using Parsec, I have often been surprised by the quick-failing behavior of <|> that you mentioned. Thus, I did not duplicate it in treeparse.