Aruu, okay okay I finally actually looked at the presentation docs, which I probably should have looked at first. One of the things that threw me off was 'self - I was thinking of Arc's sense of 'self as in 'afn!
The output C code looks suspiciously like assembly language to me. Perhaps we can also target a temporary assembly syntax so that we can do minimal peephole opts, such as converting stuff like PUSH(x); y = TOS(); to MOVE(x,y);. Can't wait to actually see this code on the git ^^.
P.S. Given that the transformations are (gasp!) syntactic, it might actually be possible to implement the entire compiler as (gasp gasp!) a treeparse parser (or at the very least a piped chain of treeparse parsers) ^^.
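To make the peephole idea a bit more concrete, here's a rough sketch in C of what a pass over a temporary instruction list might look like. Everything in it is made up for illustration (the Insn struct, the OP_* names, peephole() itself); none of it comes from the actual compiler. I also match PUSH followed by POP rather than PUSH followed by TOS, on the assumption that TOS only peeks, so that the rewrite can't change the stack depth.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mini-IR: these names are illustrative only. */
typedef enum { OP_PUSH, OP_POP, OP_MOVE } Op;

typedef struct {
    Op  op;
    int src;  /* location read by PUSH and MOVE   */
    int dst;  /* location written by POP and MOVE */
} Insn;

/* Rewrite every adjacent "PUSH src; POP dst" pair into "MOVE src, dst".
   Works in place and returns the new instruction count. */
size_t peephole(Insn *code, size_t n)
{
    size_t out = 0;

    assert(code != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (i + 1 < n && code[i].op == OP_PUSH && code[i + 1].op == OP_POP) {
            Insn mv = { OP_MOVE, code[i].src, code[i + 1].dst };
            code[out++] = mv;
            i++;  /* skip the POP that was just folded into the MOVE */
        } else {
            code[out++] = code[i];
        }
    }
    return out;
}
```

Running it over PUSH 1; POP 2; PUSH 3 should leave MOVE 1,2; PUSH 3. The point is only that the pass is a simple window over adjacent instructions, which is much easier on a list of instructions than on C text that has already been emitted.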
Maybe treeparse would be the right thing to use... The code is getting uglier every time I try to add a new primitive / special form... I dunno...
As for the generated code, yes, it's a lot like portable assembly code. There are certainly easy optimizations to perform on it, but for now, it's working and that's a lot :)
And yes, let is the traditional one -- with tons of parens everywhere.
I'll go through the code later to see what can be done. Certainly the AST looks representable as plain lists to me, although I haven't fully analyzed it yet.
As an aside, compile-file could be restructured like the following:
  (def compile-file (filename)
    (compile-ast (parse-file filename) (+ (strip-ext filename) ".c")))

  ; to allow programmatic access
  (def compile-ast (ast dest)
    ; chain of conversions
    (let chain
         (list
           (list cps-convert "CPS-CONVERSION")
           (list closure-convert "CLOSURE-CONVERSION"))
      ; do reduction
      (let final-ast
           (reduce
             (fn (ast (f desc))
               (let new-ast (f ast)
                 (prn "----------------- AST after " desc)
                 (prn (source new-ast))
                 new-ast))
             chain ast)
        (prn "-------------------- C Code:")
        (w/outfile f dest
          (w/stdout f
            (prn:liststr:code-generate final-ast))))))
This should allow easy insertion of any steps (e.g. optimization steps) along the way.
In fact the chain list should probably be built with `(,) instead, so that we can support flags or suchlike for optimizations.
For that matter my concern is with the expansion of PUSH and POP:
PUSH(x); y = POP();
=>
*sp++ = x; y = *--sp;
Can gcc peephole the above, completely eliminating the assignment to the stack (which is unnecessary in our case after all)?
y = x; //desired target
Without somehow informing gcc to the contrary, gcc will assume that writing to *sp is significant, even though in our "higher-level" view the assignment to *sp is secondary to transferring the data between two locations.
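Here's a small C sketch of what I mean, under some loud assumptions: the obj type, the 16-slot stack array and the two function names are all hypothetical, and the PUSH/POP macros are just the expansion shown above rather than whatever the compiler's runtime really defines.

```c
#include <assert.h>

typedef long obj;          /* stand-in for the compiler's value type */

obj  stack[16];            /* hypothetical stack; the real size is unknown */
obj *sp = stack;

#define PUSH(x) (*sp++ = (x))
#define POP()   (*--sp)

/* The pair as it comes out of the code generator. */
obj via_stack(obj x)
{
    obj y;

    assert(sp < stack + 16);  /* room for one more slot */
    PUSH(x);
    y = POP();
    return y;
}

/* The desired target: no stack traffic at all. */
obj direct(obj x)
{
    obj y = x;
    return y;
}
```

Both functions return x and both leave sp where it was, so from our point of view they're the same. The difference is that via_stack also leaves x sitting in the slot just above sp. Since the stack is ordinary memory reachable through a global pointer, I'd expect the compiler to keep that store unless it can prove nobody reads the slot afterwards.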
Well, with full optimizations on in gcc (-O3), it doesn't change anything (at least in execution time; I didn't compare the generated machine code). Wow, gcc seems really clever. Now that I know how hard it is to implement a compiler, I can only applaud :)