Arc Forumnew | comments | leaders | submitlogin
1 point by stefano 6102 days ago | link | parent

For the global vars problem, a solution could be to associate top level values directly with the symbol, this way a symbol would consist of three values: its string representation, its global value (initially a special 'unbound value') and a property list.


1 point by almkglor 6102 days ago | link

The current style has an optimization where all globals are simply referenced directly from an array in O(1). I'd rather that symbols point to entries in this array, because symbol-as-global-variable lookups are expected to be completely nonexistent if 'eval isn't involved in the program anyway (who uses 'eval in a language with 'read?). Only newly created symbols must have allocated variable values, and only for the benefit of 'eval'ed code - we can already know the global variables in the compiled code, because the compiler need that info anyway.

Basically:

  struct {
    long type; /*T_SYM*/
    char* stringform;
  #ifdef EVAL_USED
    obj* binding;
  #endif
  } symbol;

   int main(){
     /*compiler generated only if eval is used*/
     obj sym; symbol* sympt;
     sym = SYM2OBJ("globalvar0");
     sympt = (symbol*) sym;
     sympt->binding = &GLOBAL(0);
     sym = SYM2OBJ("globalvar1");
     sympt = (symbol*) sym;
     sympt->binding = &GLOBAL(1);
     ...
   }
This way the current performance is retained (global variable lookups are O(1)).

-----

2 points by stefano 6101 days ago | link

I don't know how much this solution will be once support for a dynamic load (e.g. from the REPL) will have to be implemented, because you'll have to keep an index of the last global variables created across different compilation sessions. With threads it gets even more complicated (mutex on the index?). With symbols it would be simpler to implement a dynamic load or definition of a global var from the REPL. The price paid is a slightly slower access to global variables, because 2 references to memory are necessary for every refrence to a global var. Global variables lookups are still O(1) though, e.g: sym->binding for read access and sym->binding = value for write access.

-----

2 points by almkglor 6101 days ago | link

> the last global variables created across different compilation sessions

I don't understand this part. I was proposing that 'eval would be an interpreter, not a compiler. My intentions was that compiled code would be statically generated (the way it's done now), so 'eval cannot possibly compile code. It would be a compiled interpreter of Arc. arc2c is a static compiler, so 'eval won't add ever add compiled code; the best it can do is create a 'interpreted-fn object that contains an interpreted function's code (as a list) and the enclosing interpreted environment

So a dynamic load would just interpret the expressions in the file being loaded:

  (set load
    (fn (f)
      (w/infile s f
        (whilet e (read s)
          (eval e)))))
'eval would be able to access the global variable table indirectly via the symbols and %symeval/%symset.

Basically, 'eval would be compiled to something like this:

  (set eval
    (fn (e (o env nil))
      (if (isa e 'symbol)
          (if env (lookup-environment env e)
                  (%symeval e))
          (...))))
Also: if the compiled code doesn't reference it, it won't be in the GLOBAL() array. The reason is simple: the compiled code won't reference it, ever. If 'globalvar isn't in GLOBAL(), then it does not exist in the compiled code. So it doesn't matter that it's not in the GLOBAL() array - the compiled code never referenced that global, so it won't ever use an index into the GLOBAL() array to refer to it. The interpreted code might, but that's why we have an indirect reference connected to the symeval.

Also, when I say O(1), I mean O(1) with the number one, as in only one layer of indirection (an index within a table). If global bindings are kept with the symbol only, then all global accesses - even precompiled ones - need (1) to find the symbol and (2) get the binding, for a total of O(2).

In other words: 'compile-file compiles, but it creates a new executable which is never connected to the process that ran 'compile-file. 'eval just interprets, and if the interpreted code mutates a global of the program, then that global gets mutated for real, in the program (what are you doing using 'eval on untrusted coe anyway). But if the interpreted code mutates a global that is never used in the program, it just creates a new global variable, one which is never referenced by the program (by definition, because the program never used it).

-----

1 point by stefano 6101 days ago | link

I thought eval compiled code, loaded it and then executed it. I've been mistaken. With the compiled code completely static then your strategy is better than assigning values to symbols.

-----