Arc Forum | I've been thinking a bit more on the "package" problem, which is a bit dangerous...

3 points by shader 6288 days ago | link | parent

I've been thinking a bit more on the "package" problem, which is a bit dangerous, I know.

I don't really know how you're doing scoping in SNAP, or even much about how it's done in arc, but if lexical scoping works the way SICP describes it (which I'm just now reading), then each "environment" has a pointer to its parent frame, and if the interpreter can't find a definition in the current frame, it moves up one level.
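A minimal Python sketch of the lookup model described above (not how arc or SNAP actually do it; `Env` and `lookup` are names invented for illustration). Each frame holds its own bindings plus a pointer to its parent, and a failed lookup moves up one level:

```python
class Env:
    """One frame: local bindings plus a pointer to the parent frame."""

    def __init__(self, bindings, parent=None):
        self.bindings = dict(bindings)
        self.parent = parent

    def lookup(self, name):
        """Return (value, hops), where hops is how many parents were visited."""
        env, hops = self, 0
        while env is not None:
            if name in env.bindings:
                return env.bindings[name], hops
            env, hops = env.parent, hops + 1
        raise NameError(name)


global_frame = Env({"x": 1})
middle_frame = Env({"y": 2}, parent=global_frame)
inner_frame = Env({"z": 3}, parent=middle_frame)

value, hops = inner_frame.lookup("x")  # x lives two frames above inner_frame
```

The `hops` count is only there to make the cost of each lookup visible.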

What if you just merged the concept of "packages" "environments" and "scope" into the same thing? I don't really know what the syntax should be, but basically when you load a package at a certain level of the program, it adds a pointer to the current frame (maybe between this frame and its parent? I don't know how well multiple parents would work for lexical scoping) and it's treated by the interpreter as if it were just another scope.

Then, to make things fun and interesting for more than just packages, you add a syntax for naming scopes and for referring to them either by name or by relative height. This way you can access not only package-specific variables, but also get around some shadowing problems by telling it "I only want 'a if it's in this scope" or "I want 'a from the frame two levels up".
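To make the idea concrete, here is a rough Python sketch of scopes that can be addressed by name or by relative height. Everything in it (`Scope`, `up`, `named`, `only_here`) is hypothetical and says nothing about what the eventual syntax would be:

```python
class Scope:
    """A frame that may carry a name and can be reached by relative height."""

    def __init__(self, bindings, parent=None, name=None):
        self.bindings = dict(bindings)
        self.parent = parent
        self.name = name

    def up(self, levels):
        """Return the scope `levels` frames above this one."""
        scope = self
        for _ in range(levels):
            if scope.parent is None:
                raise LookupError("no frame that far up")
            scope = scope.parent
        return scope

    def named(self, name):
        """Return the nearest enclosing scope with the given name."""
        scope = self
        while scope is not None:
            if scope.name == name:
                return scope
            scope = scope.parent
        raise LookupError(name)

    def only_here(self, sym):
        """Look in this frame only; never fall through to the parent."""
        return self.bindings[sym]


package = Scope({"a": "package a"}, name="mypkg")
outer = Scope({"a": "outer a"}, parent=package)
inner = Scope({"a": "inner a"}, parent=outer)

from_here = inner.only_here("a")                # the shadowing binding
from_two_up = inner.up(2).only_here("a")        # "'a from the frame two levels up"
from_package = inner.named("mypkg").only_here("a")
```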

I don't really care too much what the syntax looks like, but this would be an interesting way to reference packages that should be (if scoping actually works in a way similar to what I've presumed) a) fast and b) capable of managing more than just packages.

One major problem is that it would probably require lots of low-level munging in scheme to get this to work. It would probably be easier to do in SNAP, since I think you're implementing most of this from scratch there.

What do you think?

Edit: I've been thinking again ^^, and I just realized that though this might be an interesting way to implement packages, I don't think it would fix your typing issue. Maybe variables should all carry around info of which scope they're from (kind of what you were suggesting)? That way the system can look in the right place for variables more quickly, too. If 'a says it's from scope this+2, we wouldn't have to wait for the interpreter to miss twice before it found the value.
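One way to read the "scope this+2" idea, sketched in Python under the assumption that the names in each scope are known before the code runs: resolve each variable once to a depth, then use that depth at run time instead of probing every frame on the way. SICP describes a related technique under the name lexical addressing. The function names here are hypothetical.

```python
def resolve(name, scopes):
    """Done once, ahead of time. scopes: list of name sets, innermost first."""
    for depth, names in enumerate(scopes):
        if name in names:
            return depth
    raise NameError(name)


def fetch(name, depth, frames):
    """Done at run time. frames: list of dicts, innermost first. No misses."""
    return frames[depth][name]


static_scopes = [{"c"}, {"b"}, {"a"}]
runtime_frames = [{"c": 3}, {"b": 2}, {"a": 1}]

depth_of_a = resolve("a", static_scopes)        # 'a is from scope this+2
value_of_a = fetch("a", depth_of_a, runtime_frames)
```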

2 points by almkglor 6288 days ago | link

> then each "environment" has a pointer to its parent frame

Your understanding of this is quite correct.

But that is what the behavior is equivalent to, not necessarily how it's implemented. AST-traversal?

The problem with the nested-environment implementation is that mere variable lookup can be O(N) where N is the number of nested environments:

  (def foo (x)
    (fn (y)
      (fn (z)
        ; accessing x takes 2 indirections!
        (+ x y z))))

Also, consider an alternative:

  (def foo (x)
    (fn (y)
      (let tmp (+ x y)
        (fn (z)
          (+ tmp z)))))

In a true nested-environment implementation, the innermost function will still carry around 'x and 'y even though it doesn't need them. This means that GC will not be able to collect the contents of those variables.
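The retention problem can be observed with a small Python model. This is only a sketch: `Frame` and `Big` are invented names, the "closure" is just a tuple of code and environment, and the result relies on the garbage collector having run (hence the explicit `gc.collect()`).

```python
import gc
import weakref


class Big:
    """Stand-in for a large value we would like the GC to reclaim."""


class Frame:
    def __init__(self, bindings, parent=None):
        self.bindings = bindings
        self.parent = parent


def make_nested():
    x = Big()
    probe = weakref.ref(x)                      # lets us see whether x is still alive
    frame_x = Frame({"x": x})
    frame_y = Frame({"y": 1}, parent=frame_x)
    frame_tmp = Frame({"tmp": 42}, parent=frame_y)
    closure = ("code for (+ tmp z)", frame_tmp)  # holds the whole chain, x included
    return closure, probe


def make_copying():
    x = Big()
    probe = weakref.ref(x)
    closure = ("code for (+ tmp z)", (42,))      # copies only tmp, the one value it uses
    return closure, probe


nested_closure, nested_probe = make_nested()
copying_closure, copying_probe = make_copying()
gc.collect()

x_kept_by_nested = nested_probe() is not None
x_kept_by_copying = copying_probe() is not None
```

With the nested frames, `x` stays reachable through `frame_tmp -> frame_y -> frame_x` for as long as the closure lives, even though the code never touches it.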

A faster way would be to "flatten" the environment. In effect, what's really done is:

  (def foo (x)
    (fn (y)
      (fn (z)
        (+ x y z))))
  =>
  (def foo (x)
    (closure (list x)
      (fn (y)
        (closure (list (my-environment 0) y)
          (fn (z)
            (+ (my-environment 0) (my-environment 1) z))))))

For mutated and shared variables, most implementations have some sort of "boxed" variable; these are what are called "shared variables" in arc2c/SNAP.
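A rough Python rendering of the transformation above, as a sketch rather than what arc2c or SNAP actually emit. `Closure` and `Box` are made-up names; `env[0]` plays the role of `(my-environment 0)`.

```python
class Closure:
    """A function value: a flat tuple of captured values plus the code."""

    def __init__(self, env, code):
        self.env = tuple(env)
        self.code = code

    def __call__(self, *args):
        return self.code(self.env, *args)


def foo(x):
    def outer_code(env, y):
        def inner_code(env2, z):
            # every captured variable is exactly one index away
            return env2[0] + env2[1] + z
        # (closure (list (my-environment 0) y) (fn (z) ...))
        return Closure([env[0], y], inner_code)
    # (closure (list x) (fn (y) ...))
    return Closure([x], outer_code)


class Box:
    """A shared, mutable cell for a variable that is both captured and assigned."""

    def __init__(self, value):
        self.value = value


def make_counter():
    count = Box(0)                      # both closures copy the box, not the number

    def incr_code(env):
        env[0].value += 1
        return env[0].value

    def read_code(env):
        return env[0].value

    return Closure([count], incr_code), Closure([count], read_code)


total = foo(1)(2)(3)
incr, read = make_counter()
incr()
incr()
seen_by_reader = read()
```

Copying values into each closure only works when nobody assigns to them afterwards, which is why the mutated variable goes through a box that both closures share.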

------- BUT!

Just because environment flattening is how it's usually implemented doesn't mean it's not possible to capture and name an environment.

For example, consider that what we call a "function" is really an object constructed by a very well hidden 'closure function: it's a pair of the environment and the function code. Similarly, if Arc had ways of destructuring a function, we could implement a call-with-current-environment function. Suppose we had an 'environment-of function which extracts just the environment data of a given function:

  (def call-with-current-environment (f)
    (f:environment-of f))

Then a theoretical 'w/environment:

  (mac w/environment (var . body)
    `(call-with-current-environment
       (fn (,var) ,@body)))
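For comparison, Python happens to expose this kind of destructuring, so the same trick can be sketched there. `environment_of` and `call_with_current_environment` are hypothetical helpers built on the real attributes `__code__.co_freevars` and `__closure__`; this is an analogy, not a claim about how Arc would do it.

```python
def environment_of(f):
    """Map each variable captured by f to its current value."""
    names = f.__code__.co_freevars
    cells = f.__closure__ or ()
    return {name: cell.cell_contents for name, cell in zip(names, cells)}


def call_with_current_environment(f):
    # same shape as (f:environment-of f)
    return f(environment_of(f))


def demo():
    a, b, unused = 1, 2, 3

    def body(env):
        # Python captures only the variables the body mentions
        return (a + b, env)

    return call_with_current_environment(body)


total, captured = demo()
```

Note that `unused` does not show up in `captured`: the environment you can extract this way is the flattened one, not the full chain of enclosing scopes.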

> What if you just merged the concept of "packages" "environments" and "scope" into the same thing?

This has been implemented. Take a look at lib/module/module1.arc . But aside from having problems with macros, there's also....

> I've been thinking again ^^, and I just realized that though this might be an interesting way to implement packages, I don't think it would fix your typing issue.

Quite right ^^

> Maybe variables should all carry around info of which scope they're from (kind of what you were suggesting)?

Err. I don't quite understand this. Also, I don't quite understand how this relates to symbols-as-types...

-----

2 points by shader 6287 days ago | link

I seem to be slowly arriving at conclusions similar to yours, the only differences being things that I didn't know about the way that lisp is optimized. Maybe I'll end up learning something through all of this.

> Err. I don't quite understand this. Also, I don't quite understand how this relates to symbols-as-types...

Well, what I had originally thought when I said this was rather silly, which was that in the value of the variable there should be information about which scope it's located in. Doh! Maybe that would work if it looked in the current env to figure out where the "real" var is.

What I now think is something very similar to what you've been saying, except I still think it would be cool if this could be a method of managing general environments, instead of just modules. I don't know if that was what you were thinking, or not; it's more of an implementation detail.

I do like the idea of being able to extract and name the environment at any point in the code. If a package were nothing more than an environment with a name, and we had the ability to access items from each environment directly, it could (possibly?) be useful for a lot more than just modules.

I really like the syntax (env sym), where env is the environment object and 'sym is the symbol we're looking up. This way env.sym works. That's probably a syntax that many people are used to. I just don't know if it would work or not.

For your typing problem, I would do what you recommended. If it made any sense, I would use the syntax env.sym, so that it can be evaluated to figure out what type it is. You would still have to evaluate everything in head position to see if it evaluated to a macro. On the other hand, you could just check the car of the first item, and see if it was an environment. Or you could treat the whole thing as a single symbol and have it bypass the reader? How does arc's typing system work?

Also, how much would changing the type symbol to <env>sym help? How would it help the code decide how to interpret it?

I just don't know enough about scheme's pointer and environment system, or arc's typing system to know if any of my thoughts are even going in the right direction. So I just make more of them to cover the bases, and have you explain why none of them work. ^^

Basically at this point, I'll agree with anything you say, because I don't know enough to disagree with you. I would be very interested in learning/having you teach me all of the details about arc's implementation. As always, I'd love to help, but I don't know nearly enough to be useful.

Maybe we should start a new thread for the module system that you're proposing?

As per one of your previous comments: If symbols weren't global, how would you do them?

Edit: PG: It would be really nice if there were "reply" and "View Thread" buttons for each comment on the comments page. How hard would these be to add?

-----

2 points by almkglor 6287 days ago | link

> What I now think is something very similar to what you've been saying, except I still think it would be cool if this could be a method of managing general environments, instead of just modules. I don't know if that was what you were thinking, or not; it's more of an implementation detail.

Yes, it would definitely be cool to access the environment. For one, it would make serializing functions possible: aside from giving accessors to the environment and to the function code, you can give constructors for functions that accept a serializable environment and a serializable function code.
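As an illustration of the accessor-plus-constructor idea, here is a sketch in Python, which happens to provide both halves: `__code__` and `__closure__` as accessors, and `types.FunctionType` as a constructor. `dump_function` and `load_function` are hypothetical helpers; the sketch assumes Python 3.8+ (for `types.CellType`), captured values that `marshal` can encode, and the same Python version on both ends.

```python
import marshal
import types


def dump_function(f):
    """Serialize f as a (code, captured values) pair."""
    env = [cell.cell_contents for cell in (f.__closure__ or ())]
    return marshal.dumps((f.__code__, env))


def load_function(blob, globals_dict):
    """Rebuild a function from a serialized code object and environment."""
    code, env = marshal.loads(blob)
    closure = tuple(types.CellType(value) for value in env) or None
    return types.FunctionType(code, globals_dict, code.co_name, None, closure)


def make_adder(n):
    def add(x):
        return x + n
    return add


blob = dump_function(make_adder(10))       # plain bytes; could be written to disk
add10 = load_function(blob, globals())
restored_result = add10(5)
```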

> If it made any sense, I would use the syntax env.sym, so that it can be evaluated to figure out what type it is.

Typical 'isa pattern:

  (isa foo 'cons)

Translating to module.sym form:

  (isa foo 'arc.cons)

.... except that arc.cons == (arc cons) and you'd be comparing the type to the unevaluated (arc cons), because of the ' mark.

With <arc>cons syntax, the "<" and ">" characters are simply passed directly, i.e. <arc>cons is just a single symbol.
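The difference can be shown with a toy reader in Python. This is a sketch under the assumption that a type tag is stored as a single symbol (modelled here as a string); `read_symbol` and `isa` are stand-ins, not Arc's real reader or 'isa.

```python
def read_symbol(token):
    """Toy reader: a.b expands to the form (a b); anything else is one symbol."""
    if "." in token:
        return token.split(".")
    return token


def isa(type_tag, quoted):
    """Compare an object's type tag against a quoted (unevaluated) datum."""
    return type_tag == quoted


cons_tag = "<arc>cons"                       # assumed tag for a cons in package arc

dotted = read_symbol("arc.cons")             # the unevaluated list (arc cons)
bracketed = read_symbol("<arc>cons")         # still a single symbol

dotted_matches = isa(cons_tag, dotted)
bracketed_matches = isa(cons_tag, bracketed)
```

Because of the quote, the dotted form never gets evaluated, so 'isa ends up comparing a symbol to a two-element list.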

-----

1 point by shader 6286 days ago | link

Is there any reasonable way to prevent the dot from being interpreted there? Or alternatively, evaluate the whole statement?

And how is the isa statement going to be updated to work independent of which module it's running in? Does each module type all of its objects that way from the beginning, or were you going to have the interpreter do something fancy?

-----

1 point by almkglor 6285 days ago | link

> Is there any reasonable way to prevent the dot from being interpreted there? Or alternatively, evaluate the whole statement?

First things first. We want to make the type arc.cons readable, don't we? So if say 'arc here is a macro, it will expand to a symbol, which is basically "the symbol for cons in the arc package".

Now that symbol can't be a 'uniq symbol, since those are perfectly unreadable.

What we could do is....... oh, just make up a symbol from the package name and the given symbol. Like, say...... <arc>cons. The choice of <> is completely arbitrary.

Now.... if (arc cons) is a macro that expands to <arc>cons, why go through the macro?

> And how is the isa statement going to be updated to work independent of which module it's running in? Does each module type all of its objects that way from the beginning, or were you going to have the interpreter do something fancy?

Something fancy, of course. Specifically it hinges on the bit about the "contexter". Remember that in my proposal I proposed adding an additional step after the reader, viz. the contexter.

Basically the contexter holds the current package and it puts any unpackaged symbols it finds into the mapping for the current package.

So for example we have:

  (in-package foo)
  (using <arc>v3)
  (interface <foo>v1
    make-a-foo foo-type is-a-foo)
  (def make-a-foo (x)
    (annotate 'foo-type x))
  (def is-a-foo (x)
    (isa x 'foo-type))

Now the contexter goes through it and maintains hidden state. This state is not shared and is not assured of being shared across threads (it might, it might not, implementer's call - for safety just assume it doesn't)

Initially the contexter has the package "User".

It encounters:

  (in-package foo)

This changes its internal package to "foo". This (presumably) newly-created package is given a set of default mappings, corresponding to the arc axioms: fn => <axiom>fn, quote => <axiom>quote, if => <axiom>if, etc. The contexter then returns:

  t

The t is evaluated and returns.... t.

Then it accepts:

  (using <arc>v3)

This causes the contexter to look for a "v3" interface in the "arc" package. On finding it, it creates default mappings; among them are:

  def => <arc>def
  isa => <arc>isa
  annotate => <arc>annotate
  ...etc.

Upon accepting this and setting up the "foo" package to use the <arc>v3 interface, it again returns:

  t

Then it accepts:

  (interface <foo>v1
    make-a-foo foo-type is-a-foo)

This causes the contexter to create a new interface for the package "foo", named "v1". This interface is composed of <foo>make-a-foo, <foo>foo-type, and <foo>is-a-foo.

After creating the interface, it then returns:

  t

Then it accepts:

  (def make-a-foo (x)
    (annotate 'foo-type x))

Since it isn't one of the special contexter forms, it simply uses the mapping of the current package - specifically the package foo - and returns the form:

  (<arc>def <foo>make-a-foo (<foo>x)
    (<arc>annotate '<foo>foo-type <foo>x))

Notice how x is implicitly mapped into the package foo; basically any unpackaged symbol which doesn't have a mapping in a package is automatically given to that package.

Then the contexter accepts:

  (def is-a-foo (x)
    (isa x 'foo-type))

Which is returned as:

  (<arc>def <foo>is-a-foo (<foo>x)
    (<arc>isa <foo>x '<foo>foo-type))

So there: the contexter automagically inserts packages.
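The walkthrough above can be condensed into a short Python sketch. It is only a model of the proposal, not the SNAP or arc2c implementation: forms are nested lists of strings, 'x is written as `["quote", "x"]`, the contents of the `<arc>v3` interface are an assumed three-symbol subset, and the `<pkg>name` pattern for recognising packaged symbols is an assumption as well.

```python
import re

AXIOMS = ("fn", "quote", "if")           # a few of the axioms, for illustration
PACKAGED = re.compile(r"<[^<>]+>.+")     # assumed shape of a packaged symbol


class Contexter:
    def __init__(self):
        self.current = "User"            # the initial package
        self.mappings = {"User": {}}     # package -> {plain symbol: packaged symbol}
        self.interfaces = {}             # "<pkg>name" -> list of packaged symbols

    def read(self, form):
        """Process one top-level form: a contexter special form or plain code."""
        head = form[0] if isinstance(form, list) and form else None
        if head == "in-package":
            self.current = form[1]
            table = self.mappings.setdefault(self.current, {})
            for axiom in AXIOMS:
                table.setdefault(axiom, f"<axiom>{axiom}")
            return "t"
        if head == "using":
            table = self.mappings[self.current]
            for packaged in self.interfaces[form[1]]:
                table[packaged.split(">", 1)[1]] = packaged
            return "t"
        if head == "interface":
            self.interfaces[form[1]] = [self.add_context(s) for s in form[2:]]
            return "t"
        return self.add_context(form)

    def add_context(self, form):
        """Replace every unpackaged symbol with its mapping in the current package."""
        if isinstance(form, list):
            return [self.add_context(item) for item in form]
        if not isinstance(form, str) or PACKAGED.fullmatch(form):
            return form                  # numbers and already-packaged symbols pass through
        table = self.mappings[self.current]
        # an unmapped symbol is implicitly given to the current package
        return table.setdefault(form, f"<{self.current}>{form}")


ctx = Contexter()
ctx.interfaces["<arc>v3"] = ["<arc>def", "<arc>isa", "<arc>annotate"]  # assumed subset

replies = [
    ctx.read(["in-package", "foo"]),
    ctx.read(["using", "<arc>v3"]),
    ctx.read(["interface", "<foo>v1", "make-a-foo", "foo-type", "is-a-foo"]),
]
make_form = ctx.read(["def", "make-a-foo", ["x"],
                      ["annotate", ["quote", "foo-type"], "x"]])
isa_form = ctx.read(["def", "is-a-foo", ["x"],
                     ["isa", "x", ["quote", "foo-type"]]])
```

The per-package table is the "hidden state" mentioned above; keeping it inside one `Contexter` object makes it clear that two contexters (say, one per thread) would not see each other's mappings.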

-----

1 point by shader 6283 days ago | link

See, I told you you'd convince me to come around to your way of thinking, once I'd asked enough questions to find out the reasons for the choices you've made.

So, how hard is it to take what you've made so far, and generalize it to work with any kind of environment, and not just packages? So, things like destructuring functions, naming environments, passing them around, etc. Would this allow us to avoid shadowing variables by naming which level they came from explicitly?

Also, if I type in the symbol '<arc>bar, will the contexter read that and presume I'm looking for the bar in arc, or will it rename it <foo><arc>bar? And which is better?

If it's the latter, how would you propose we directly reference a particular environment? Would that be pack.sym, like I had thought earlier?

-----

1 point by almkglor 6283 days ago | link

> So, how hard is it to take what you've made so far, and generalize it to work with any kind of environment, and not just packages?

Hmm. Well, if you want to be technical about things, one Neat Thing (TM) that most compilers of lexical-scope languages (like Scheme and much of CL, as well as Arc) do is "local renaming". Basically, suppose you have the following code:

  (fn (x)
    (let x (+ x 1)
      x))

This is transformed (say by arc2c) into:

  (fn (x@1)
    (let x@2 (+ x@1 1)
      x@2))

Note that the renaming could actually be made more decent, i.e. something readable. For example, in theory the renaming could be done more like:

  (fn (<fn>x)
    (let <fn/let>x (+ <fn>x 1)
      <fn/let>x))

In fact I am reasonably sure that Scheme's hygienic macros work after local renaming, unlike CL's which work before renaming. You'll probably have to refer to some rather turgid papers though IMO, and decompressing turgid papers is always hard ^^.
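Here is a small Python sketch of that kind of local renaming, covering only `fn` and `let` (in Arc's `(let var val body...)` shape). It is not arc2c's actual pass: the `rename` function is hypothetical, it ignores `quote` and destructuring, and it assumes nobody rebinds `fn` or `let` themselves.

```python
import itertools


def rename(form, scope=None, counter=None):
    """Give every variable bound by fn or let a unique name like x@1."""
    scope = {} if scope is None else scope
    counter = itertools.count(1) if counter is None else counter
    if isinstance(form, str):
        return scope.get(form, form)            # free symbols are left alone
    if not isinstance(form, list) or not form:
        return form
    head = form[0]
    if head == "fn":
        inner = dict(scope)
        params = []
        for param in form[1]:
            inner[param] = f"{param}@{next(counter)}"
            params.append(inner[param])
        return ["fn", params] + [rename(b, inner, counter) for b in form[2:]]
    if head == "let":
        var, val = form[1], form[2]
        new_val = rename(val, scope, counter)   # the value still sees the outer binding
        inner = dict(scope)
        inner[var] = f"{var}@{next(counter)}"
        return ["let", inner[var], new_val] + [rename(b, inner, counter) for b in form[3:]]
    return [rename(item, scope, counter) for item in form]


source = ["fn", ["x"],
          ["let", "x", ["+", "x", 1],
           "x"]]
renamed = rename(source)
```

After this pass no two bindings share a name, so shadowing has been resolved once and for all before any later stage runs.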

> Also, if I type in the symbol '<arc>bar, will the contexter read that and presume I'm looking for the bar in arc, or will it rename it <foo><arc>bar? And which is better?

Well, since packages are constrained to be nonhierarchical, <arc>bar is considered already packaged and the contexter will ignore it. The contexter will add context only to unpackaged symbols.

-----

1 point by rincewind 6284 days ago | link

If I write this into the foo-package-file:

   (let list '(this is a test case)
       (all is-a-foo (map make-a-foo list)))

would list be read as <foo>list or <arc>list? Should the expression

   (sym "make-a-foo")

evaluate to <foo>make-a-foo?

-----

2 points by almkglor 6283 days ago | link

> If I write this into the foo-package-file: ....

Assuming you mean in the same file with the example foo package:

  (<arc>let <arc>list (<axiom>quote (<arc>this <arc>is <foo>a <foo>test <arc>case))
    (<arc>all <foo>is-a-foo (<arc>map <foo>make-a-foo <arc>list)))

> Should the expression ....

No, it evaluates to the unpackaged symbol 'make-a-foo.

-----