| I'm still working on my "speed-things-up" compiler, you know, the Psyco-like one, but not quite. Hmm... let's call it Arco. I was thinking about adding a bit of type inference to it. After all, in an expression like (+ s "foo" "bar"), you know that s is a string, since the + operator is designed to concatenate strings, append lists or add numbers. That way, compiling functions would be easier and more efficient (we could deal with lambdas inside lambdas, for example).

Well, I found quite a big problem. Operators can be redefined to deal with other types (for example, + on characters could build a string). That alone is not a problem as long as the redefinition is conservative: the operator still works the old way too. But an operator can also be redefined in a destructive way, as in (= + -). After that, adding two numbers no longer behaves as before, and you can't inline the use of the + operator. That's why mzscheme code is faster inside modules: predefined operators cannot be redefined there, so (+ a b) can be safely inlined. Outside modules, well, the standard says you can destroy the basic operators, so the interpreter is screwed: it cannot inline things.

My solution, as hinted by almkglor, was to say: "hey, you only compile the functions of your choice, so just make sure you don't redefine the existing operators destructively". That way, + applied to numbers can be translated to mzscheme's +, itself inlined as the Arc code is in a module.

As long as you declare types, that's easy: you know the type of everything, so you know whether you can inline or not. Inferring types looks quite easy too: in +, all arguments must have the same type, so (+ x 1) means x is a number, (+ x "foo") means it's a string, etc. But wait, I could redefine + to work on a new type of mine, an amount of money for example. Then (+ x '(1 euro)) would add 1 euro (OK, that still works), but you could also provide the shortcut (+ x 1), meaning "add one unit of the current currency". There, we're screwed.
And forbidding such things is out of the question. There are a few solutions:

- inserting type declarations inside the code you want to optimize, à la CL, where you can get into the details to get the most optimized code possible,
- using a Stalin-like model: once compilation is invoked, we're in a closed world; the whole system is compiled, and we know which operators cannot possibly change; if something is changed in the REPL (a new definition, a modification, ...), you fall back to the uncompiled version and have to recompile the whole thing if you want to (not to be used when debugging or designing),
- not dealing with optimized compilation anymore: the compile facility already present is enough to get acceptable performance (we're faster than Python, for example),
- insert your idea here.

My favorite would be the second solution. It's very hard to implement efficiently, but we can go step by step, so that's not really a problem. Anyway, what's your advice? |
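To make the second option a bit more concrete, here is a rough sketch in Python (not Arc, and not how Arco works) of the naive "all arguments of + share one type" inference combined with a closed-world guard. Every name in it (`World`, `redefine`, `infer_plus_operand`, `compile_add_literal`) is hypothetical and invented for illustration. It also checks a version counter on each call, which is a simplification: a real Stalin-like system would presumably throw away the compiled code on redefinition instead.

```python
# Hypothetical sketch: none of these names come from Arc, Arco or mzscheme.


class World:
    """Global operator table with a version counter bumped on redefinition."""

    def __init__(self):
        self.ops = {"+": lambda a, b: a + b}
        self.version = 0

    def redefine(self, name, fn):
        # Any change made "at the REPL" invalidates closed-world assumptions.
        self.ops[name] = fn
        self.version += 1


def infer_plus_operand(literal):
    """Naive rule from the post: all arguments of + share the literal's type."""
    if isinstance(literal, (int, float)):
        return "number"
    if isinstance(literal, str):
        return "string"
    return None  # unknown literal: nothing can be inferred


def compile_add_literal(world, literal):
    """Compile (fn (x) (+ x literal)) under the closed-world assumption."""
    compiled_at = world.version
    inferred = infer_plus_operand(literal)

    def interpreted(x):
        # Slow path: look up + dynamically, so redefinitions are honored.
        return world.ops["+"](x, literal)

    def compiled(x):
        if world.version != compiled_at or inferred != "number":
            return interpreted(x)  # the world changed: fall back
        return x + literal  # fast path: + inlined as native addition

    return compiled


world = World()
add_one = compile_add_literal(world, 1)
print(add_one(41))  # fast path, prints 42

# Destructive redefinition, like (= + -) in the post.
world.redefine("+", lambda a, b: a - b)
print(add_one(41))  # fallback path, prints 40
```

The same guard would also cover the "conservative" money-style overload: extending + for a new type goes through `redefine`, bumps the version, and sends already compiled functions back to the slow path until they are recompiled.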