It seems that 'many is the low-hanging fruit of optimization. I've since gotten an 8-paragraph lorem ipsum piece, totalling about 5k, which renders in 3-4 seconds (about around 3800msec).
Hmm. Profiler.
I'm not 100% sure but maybe the fact that nearly all the composing parsers decompose the return value of sub-parsers, then recompose the return value, might be slowing it down? Maybe have parsers accept an optional return value argument, which 'return will fill in (instead of creating its own) might reduce significantly the memory consumption (assuming it's GC which is slowing it down)?
Note that the timing will not be very accurate or particularly useful IMO, since it doesn't count recursion but does count calls to other functions. Sigh. We need a real profiler ^^
Hmm. It seems we can't squeeze much performance out of 'alt, I can't really see a way of optimizing 'alt itself, so possibly we should optimize the grammar that uses 'alt.