Arc Forum | Malleable functions and information hiding in py-arc

2 points by Pauan 5199 days ago | 20 comments

I've been mulling over this crazy idea for a little while, and thought I would share it.

If you haven't been following my earlier post[1] (I don't blame you, it's long and jumps all over the place), then my basic idea is that all complex data types would be represented as tables, so `(table)` would create something like this:

  (obj set  (fn (k v) ...)
       call (fn (k d) ...)
       keys (fn ()    ...))

But... what should functions be represented as? They could of course be represented as a table that only has a `call` attribute:

  (obj call ...)

But we can go further. What if functions had an `arguments` attribute, that contained the argument list...? So this...

  (fn (a b c) ...)

...would create this:

  (obj call      ...
       arguments '(a b c))

And then, what if the body of the function was stored in the `body` attribute? So this...

  (fn (a b c)
    (+ a b)
    (+ b c))

...would create this:

  (obj call      ...
       arguments '(a b c)
       body      '((+ a b) (+ b c)))

And then, what if the function's environment was also stored in the `environment` attribute...?

  (obj call        ...
       arguments   '(a b c)
       body        '((+ a b) (+ b c))
       environment ...)

This would not only allow you to inspect the argument list, body, and environment of every function[2], but also allow you to change them! And since every environment would have an `outer` attribute that points to the outer scope, this would allow you to inspect/change the function's closure as well.

Everything would be completely open. There would be no information hiding[3]. You could change how many arguments a function expects, you could change the function's body itself, you could change the function's environment, or change the outer environment...
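As a rough illustration of the table representation described above, here is a minimal Python sketch. It is not py-arc's actual code: `make_fn`, `make_env`, `lookup`, `evaluate`, and the tuple encoding of expressions are all hypothetical stand-ins, and only the attribute names (`call`, `arguments`, `body`, `environment`, `outer`) come from the post.

```python
import operator

# Hypothetical stand-in for an evaluator: symbols are strings,
# calls are tuples such as ("+", "a", "b"). Not py-arc's real code.
OPS = {"+": operator.add, "*": operator.mul}

def make_env(variables, outer=None):
    """An environment is a table holding bindings and an `outer` scope."""
    return {"vars": dict(variables), "outer": outer}

def lookup(name, env):
    # Walk outward through the `outer` links until the name is found.
    while env is not None:
        if name in env["vars"]:
            return env["vars"][name]
        env = env["outer"]
    raise NameError(name)

def evaluate(expr, env):
    if isinstance(expr, str):
        return lookup(expr, env)
    if isinstance(expr, tuple):
        op, *args = expr
        values = [evaluate(arg, env) for arg in args]
        result = values[0]
        for value in values[1:]:
            result = OPS[op](result, value)
        return result
    return expr  # numbers evaluate to themselves

def make_fn(arguments, body, closure):
    """Build the table that (fn arguments . body) would create."""
    fn = {"arguments": list(arguments),
          "body": list(body),
          "environment": make_env({}, outer=closure)}

    def call(*values):
        # Read the attributes at call time so later mutations take effect.
        scope = make_env(zip(fn["arguments"], values), outer=fn["environment"])
        result = None
        for expr in fn["body"]:
            result = evaluate(expr, scope)
        return result  # value of the last body form

    fn["call"] = call
    return fn

foo = make_fn(["a", "b", "c"], [("+", "a", "b"), ("+", "b", "c")], make_env({}))
print(foo["call"](1, 2, 3))           # 5
foo["body"] = [("+", "a", "b", "c")]  # mutate the body in place
print(foo["call"](1, 2, 3))           # 6
```

Because `call` reads `arguments` and `body` from the table on every invocation, changing either attribute changes the function's behavior, which is the malleability the post is after.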

So. This is kinda scary. Because there's no information hiding, how would you implement security? Would security even be possible in such a language? But... Arc's goal is not security or safety, it's conciseness and malleability, and this idea would definitely increase malleability.

In case information hiding is needed... we could provide a different mechanism for that. So rather than functions being opaque blobs by default, they would be open and free, with information hiding being used only when needed.

What do you guys think?

---

* [1]: http://arclanguage.org/item?id=14219

* [2]: This won't work on built-ins, because those are implemented in Python. But it would work on every user-created function. I may eventually get it to work on built-ins too, but that's more of a long-term goal.

* [3]: I find it amusing that one reason I dislike Python is because it makes modules and classes so open and makes information hiding sorta difficult, yet here I am, seriously considering making py-arc even more open than Python.

2 points by Pauan 5195 days ago | link

By the way, I just found this CL function:

http://www.lispworks.com/documentation/HyperSpec/Body/f_fn_l...

Which does something similar, but in a non-mutable way. It also requires some additional parsing, because it conflates the argument list and the body together. It's also using multi-value return. Assuming `function-lambda-expression` was available in Arc (but renamed to 'get-fn-attributes), the following two would be equivalent:

  ; function-lambda-expression
  (let '((nil arguments . body) environment) (get-fn-attributes foo) ...)

  ; message passing
  (let '(arguments body environment) foo ...)

Hurray, attribute destructuring! And, there are two more benefits to my approach:

1) Mutability. If we're destroying information hiding by making the information available in the first place, might as well make it mutable too, right? Better than adding in a `set-fn-attributes` function...

2) It makes it a lot easier if you only want to access one of the bits of information. Observe:

  (let '((nil nil . body)) (get-fn-attributes foo) ...)

  (let body foo<-body ...)

As far as I can see, the only benefit to `get-fn-attributes` is if you want to use really short variable names:

  (let '((nil a . b) e) (get-fn-attributes foo) ...)

  (with (a foo<-arguments b foo<-body e foo<-environment) ...)

I generally prefer accessing data by key rather than by index. Seems more concise and robust to me.

On the other hand, lists are nice because they preserve their order, they're shorter and easier to create than tables, and they're less complex.

But, we're talking about the core here, so I'm gonna stick to tables for the core datatypes, until I see a better idea.

</dead horse beating>

-----

3 points by akkartik 5198 days ago | link

What does this level of openness buy programmers? Mostly yet another assembly layer to hack kludgily at, it seems to me.

I'm not sure slots for arg lists and body are the right separation. I can't imagine changing the arglist without the body, for example. Except maybe to add or remove defaults for args. In any other situation I'd want to change both at once. Might as well just run def all over again.

The biggest benefit of this approach may be in letting the program change itself. That's AI territory; you may want to read about Doug Lenat's work: http://en.wikipedia.org/wiki/Eurisko. I've also linked to Push before, a concatenative representation designed to automatically modify programs (http://arclanguage.org/item?id=12742, http://github.com/akkartik/wart/commit/aac83ceafc65c255a37e9...).

-----

1 point by Pauan 5197 days ago | link

"What does this level of openness buy programmers? Mostly yet another assembly layer to hack kludgily at, it seems to me."

I don't know. But at the same time, I tend to follow pg's mentality that if I can't find a logical reason to disallow it, then I should allow it.

So a better question would be, should we not allow this functionality, and if so, why? You can argue that my particular solution is kludgy, hacky, or any other pejorative term, but that doesn't answer the question, "is this a good idea or not?"

Obviously I wouldn't expect this function mutating to be commonplace. In case it wasn't obvious, I'm mucking around with low-level special-case stuff here. Stuff that just isn't possible in pgArc. A kludgy solution that works is better than no solution, in the situations where you need it.

One of my goals with py-arc is to get as much as (reasonably) possible written in Arc itself. Once you go down far enough, you eventually get to low-level "assembly kludge". Do you have a better idea for a way to get function inspecting/mutating (or something equivalent) in a way that's better than my proposal?

py-arc isn't even at 1.0, and implementations can change. I'm much more interested in knowing whether the idea itself is good or not. Though, if you do have a suggestion for a better implementation, I'd like to hear that too!

---

"I'm not sure slots for arg lists and body are the right separation. I can't imagine changing the arglist without the body, for example. Except maybe to add or remove defaults for args. In any other situation I'd want to change both at once. Might as well just run def all over again."

We're talking about the situation where you're dealing with somebody else's code. Obviously if it's your module, you can just change it yourself, but imagine somebody else writes a library, and you want to add a feature, or fix a bug.

If the library isn't designed well, the only way to do this might be to mutate a function, so calling `def` isn't an option. Or perhaps the library is designed well, but the particular situation the library is trying to solve requires mutating functions.

Ordinary Arc code wouldn't use function mutating, but perhaps somebody has a special use-case for it. Imagine a debugger, or a profiler, or a logger. Something that needs to inject itself into other people's code in a way that works perfectly every time. Mutable functions might be useful in that situation.

A code walker could also make use of the `environment` attribute to gather information about the function's closure. rocketnia mentioned a potential use-case for this[1].

I don't know exactly where this might be useful, but I think there's at least one situation out there where it would be useful. And since this would be really easy to do given py-arc's architecture[2], it just makes sense to expose it to Arc, unless there's a good reason not to[3].

---

* [1]: http://arclanguage.org/item?id=14411

* [2]: py-arc internally uses this idea. Every Arc function has an opts, rest, and body attribute. Calling a function evals the body attribute, and returns the last value (it's a teensy bit more complicated than that, but that's the basic idea). So it's simply a matter of exposing the internal details to Arc, and message passing can do that easily.

* [3]: Like information hiding with closures.

-----

2 points by akkartik 5197 days ago | link

Rereading my original comment, my question "What does this level of openness buy programmers?" was too baldly stated. As I was writing it I was thinking something more like "can you give me an example where this would be useful?" (going back to aw's advice at http://arclanguage.org/item?id=14264) And you and others have since given examples.

When I said 'hack kludgily at', I was thinking, "can we find a better way?" So combining the two I'd rephrase my original statement as:

"Can you give some examples of where this is a problem? Assuming it's a problem, I'm sure we can come up with more elegant solutions."

That's closer to what I was thinking, and I don't think I'm revising history either. My apologies for the tone.

-----

1 point by Pauan 5197 days ago | link

"Can you give some examples of where this is a problem? Assuming it's a problem, I'm thinking about more elegant solutions to the problem."

The reason I didn't give examples is because I didn't have any. It was just a, "oh hey we can make Arc more hackable if we do this" sorta thing. I figured we'd find use-cases for it later, and, well, we did.

As for a more elegant solution... this solution is using message passing. So to get the environment of the function `foo`, you would use this:

  foo<-environment

And to get the closure for foo, you'd use this:

  foo<-environment<-outer

How would you make that more elegant? Does it need to be more elegant, considering that accessing a function's closure is a pretty rare and special-case thing to do? Would you prefer to use `(get-closure foo)` instead? Oh, wait, you can do that:

  (def get-closure (x)
    x<-environment<-outer)

It's easy to write function wrappers that make message passing more palatable, but it's a lot harder to use functions to simulate message passing. So it makes sense for the low level things to be done with message passing, and then write wrappers for them.

---

"My apologies for the tone."

Mine wasn't exactly happy rainbows either, you know. I don't mind people saying that an idea of mine is kludgy, but I'd at least like some justification for it. You know, an example where my idea is noticeably kludgy, or perhaps a different solution that is demonstrably less kludgy.

-----

1 point by akkartik 5197 days ago | link

Yeah like I said at http://arclanguage.org/item?id=14418 I even withdraw my sense that there must be a better way. Next time, think before hitting submit :)

-----

1 point by Pauan 5197 days ago | link

It's perfectly understandable that you want a nice, sleek, shiny high-level thing that walks your dog and washes your dishes, but at some point you need the low-level stuff.

And because I'm sorta kinda writing an interpreter, not only am I thinking about low-level stuff, but I'm actually implementing it, so my posts and "great ideas" will probably be about low-level stuff for a while.

So, don't worry about it. You're up there in Arc-land thinking about high-level stuff, whereas I'm down here in low-level Python, thinking about low-level stuff. We both need to try to remember that better.

I'd still be interested in hearing if you come up with a better idea.

Oh! I know! We can write an Arc program that reads our minds, it would be the most elegant and concise language ever! :P

-----

1 point by akkartik 5197 days ago | link

Yeah the toolchain-enhancement use cases make sense.

"If the library isn't designed well, the only way to do this might be to mutate a function, so calling `def` isn't an option. Or perhaps the library is designed well, but the particular situation the library is trying to solve requires mutating functions."

I don't follow this.

Adding features to a function is the whole point of extend. Can you think of an example where it wouldn't suffice?

"if I can't find a logical reason to disallow it, then I should allow it."

That isn't a defense to the question, "is this the right representation?" Even PG isn't allowing non-sexp syntax, as an extreme example. I think "is there a better way to do this" is a fine logical reason.

To summarize: I now agree there's value in making a language's constructs introspectable. I'm less convinced that this specific approach is the best one. But you've gotten me thinking about alternatives :)

-----

1 point by Pauan 5197 days ago | link

"Adding features to a function is the whole point of extend. Can you think of an example where it wouldn't suffice?"

Hm... yes, actually. Imagine a function that is wrapped in a closure, to hide data:

  (let hidden-data ...
    (def foo ()
      (do-something hidden-data)))

How would you use `extend` on `foo`? Extend can't access the hidden data in the closure, but by making functions transparent, you could.

Technically speaking, you wouldn't need mutable functions, just the ability to see a function's closure. For instance, you could use this:

  (let closure foo<-environment<-outer
    (extend foo ...))

Aaand now your extension can access the hidden data. But that's only possible if closures are transparent. And it might be easier to just mutate the function directly:

  (= foo<-body ...)
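For comparison, CPython already exposes a function's closed-over variables through the `__closure__` and `__code__.co_freevars` attributes, which gives a feel for what transparent closures allow. The sketch below is plain Python, not py-arc or Arubic, and `make_foo` and `hidden_data` are made-up names mirroring the example above.

```python
def make_foo():
    hidden_data = [1, 2, 3]      # meant to be private to foo

    def foo():
        return sum(hidden_data)

    return foo

foo = make_foo()

# co_freevars names the closed-over variables; __closure__ holds one
# cell per name, in the same order.
names = foo.__code__.co_freevars
closure = {name: cell.cell_contents
           for name, cell in zip(names, foo.__closure__)}

print(closure)                   # {'hidden_data': [1, 2, 3]}
closure["hidden_data"].append(4) # same list object that foo sees
print(foo())                     # 10
```

So an extension that gets hold of the closure can read, and here even change, data the original author tried to hide.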

Incidentally, <- would be ssyntax that lets you easily access an attribute. Thus, the following two would be equivalent:

  foo<-bar 
  (get-attribute foo 'bar)
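One way a reader could expand that ssyntax is sketched below in Python. This is only a guess at the mechanics: `expand_ssyntax` is a hypothetical helper, and left-to-right nesting is assumed because it matches how `foo<-environment<-outer` is used earlier in the thread.

```python
def expand_ssyntax(symbol):
    """Expand 'a<-b<-c' into nested (get-attribute ...) forms.

    Lists stand in for s-expressions; ["quote", "b"] stands in for 'b.
    """
    head, *attrs = symbol.split("<-")
    expr = head
    for attr in attrs:
        # Each further <- wraps the expression built so far.
        expr = ["get-attribute", expr, ["quote", attr]]
    return expr

print(expand_ssyntax("foo<-bar"))
# ['get-attribute', 'foo', ['quote', 'bar']]
print(expand_ssyntax("foo<-environment<-outer"))
# ['get-attribute', ['get-attribute', 'foo', ['quote', 'environment']], ['quote', 'outer']]
```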

---

"I think "is there a better way to do this" is a fine logical reason."

Sure, but I don't know of any better ways, nor has anybody presented them, so for now I'm focusing on whether the idea has merit. We can fix the implementation later.

P.S. If you consider this idea to be kludgy, then you probably consider message passing in general to be kludgy. :P How would you like to be able to access the data, then?

---

"But you've gotten me thinking about alternatives :)"

If you find a good one, I'd like to hear it. :P

-----

2 points by akkartik 5197 days ago | link

"Extend can't access the hidden data in the closure, but by making functions transparent, you could."

That's a great example, thanks.

"Sure, but I don't know of any better ways, nor has anybody presented them, so for now I'm focusing on whether the idea has merit."

Yeah that makes sense.

In fact, this is a fine implementation. Once I think of it as exposing the compiler's data structures to the language, separating arg lists and body makes a lot of sense.

-----

1 point by Pauan 5197 days ago | link

"In fact, this is a fine implementation. Once I think of it as exposing the compiler's data structures to the language, separating arg lists and body makes a lot of sense."

Well, yeah, how else would you do it? This is low-level stuff. :P

-----

2 points by rocketnia 5197 days ago | link

This should make it a bit more possible to crawl through a whole Arc program and source-to-source translate it into JavaScript at runtime. I don't think it'll be an airtight transformation by any means, but that's a use case that's been in the back of my mind when thinking about closure reflection in Penknife, ever since evanrmurphy's attempts to translate Arc.

Mutating functions has been on my mind too, but only for a JIT, and I wouldn't know where to start. If PyArc did its own JIT, that'd be pretty fantastic.

-----

2 points by aw 5197 days ago | link

Having an open implementation has many advantages:

- if you wanted to serialize closures, you could easily implement a serialization strategy simply by inspecting the closure's environment yourself.

- writing a debugging or tracing facility becomes so simple that you could just throw together an ad-hoc solution to a particular need whenever you wanted, instead of having to rely on the debugging facilities offered (or not) by the runtime.

- for a programmer who is new to closures, having everything be completely transparent makes it easy to see what's going on.
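The serialization point can be sketched in a few lines of Python, assuming functions and environments are plain tables as proposed at the top of the thread. The `vars`/`outer` layout, the JSON format, and the names `serialize_fn` and `deserialize_fn` are assumptions made for this illustration only.

```python
import json

def serialize_fn(fn):
    """Flatten arguments, body, and the chain of environments to JSON."""
    envs = []
    env = fn["environment"]
    while env is not None:        # innermost scope first
        envs.append(env["vars"])
        env = env["outer"]
    return json.dumps({"arguments": fn["arguments"],
                       "body": fn["body"],
                       "environments": envs})

def deserialize_fn(text):
    """Rebuild the table, relinking each scope to its `outer` scope."""
    data = json.loads(text)
    env = None
    for variables in reversed(data["environments"]):  # outermost first
        env = {"vars": variables, "outer": env}
    return {"arguments": data["arguments"],
            "body": data["body"],
            "environment": env}

counter = {"arguments": ["x"],
           "body": [["+", "x", "count"]],
           "environment": {"vars": {},
                           "outer": {"vars": {"count": 3}, "outer": None}}}

restored = deserialize_fn(serialize_fn(counter))
print(restored == counter)  # True
```

This only handles values JSON can represent, and it copies shared environments rather than preserving sharing, so a real strategy would need more care.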

One thought I've had is that it could be useful to write an Arc interpreter in Arc. It would be slow of course... but the highest level functions in a program (like the ones that we typically want to serialize to persist closures) often don't need to be fast... as long as the operations that they call (such as in particular anything being done in a loop) are themselves fast.

This would give us four levels of implementation:

- Arc interpreter

- Arc compiler

- Racket

- C

Each higher level is more expressive but slower than the level below it. Thus you can write something at a higher level first because that will be the fastest for you (it will take you less time to write the program), but then if it turns out to be too slow, you can rewrite it at a lower level at the cost of doing more work.

Your note [3] deserves an entire comment by itself, but I'm out of time now so I'll reply later :)

-----

1 point by Pauan 5197 days ago | link

Excellent notes. Those are definitely great reasons to make functions open, rather than opaque blobs. In fact, I see only two downsides to it:

1) Slowness. Then again, Arc's goal isn't speed either. As you mentioned, it should be possible to rewrite the slow bits in a lower layer if needed.

2) Lack of information hiding. This is more serious than point 1, but, I can provide a different mechanism for information hiding. The downside of having a different mechanism is that it requires learning a new mechanism... and I love that closures can be used for information hiding. It's just so simple and axiomatic, you know?

---

"One thought I've had is that it could be useful to write an Arc interpreter in Arc."

I've been mulling that over too, ever since your earlier comment[1], but... I don't really like the Racket runtime (the implementation, not the language). For instance, trying to make Unix shell scripting work with Racket is a big pain, but it's incredibly easy in Python.

Of course, I don't really like Racket the language either. :P Too big and bloated for my tastes. Why do you think I like Arc to begin with? On the other hand, programming in Racket would probably be preferable to programming in Python, if I had to choose between the two.

Oh yeah, and Python seems to be a lot more popular than Racket... so chances are, your users will have Python installed, but not Racket. Proooobably not a big deal, since we're a pretty small group, but still an advantage for Python.

---

"Each higher level is more expressive but slower than the level below it. Thus you can write something at a higher level first because that will be the fastest for you (it will take you less time to write the program), but then if it turns out to be too slow, you can rewrite it at a lower level at the cost of doing more work."

Yes! And don't forget: profile, profile, profile first. I think that's the best way to make good clean code that is also fast. Honestly, though, I rarely need to actually optimize my JavaScript code, because it's usually plenty fast as-is. Chrome's V8 engine is amazingly fast, especially given how dynamic JavaScript is.

---

* [1]: http://arclanguage.org/item?id=14201

-----

1 point by aw 5197 days ago | link

"trying to make Unix shell scripting work with Racket is a big pain"

In what way? Running a Racket program as a shell script, or calling shell scripts from Racket? (I haven't found either particularly difficult... though I can easily imagine that the process might be smoother and better documented in Python).

With respect to Racket vs. Python, an advantage of Arc's axiomatic approach is that it makes it really easy to implement Arc on top of different platforms. So I can use a Python runtime for things that it is good at, and a Racket runtime for things that it is good at.

-----

1 point by Pauan 5197 days ago | link

Writing an Arc program that works as a shell script. In other words, I want to slap a #!/path/to/arc shebang into a text file, and then call it with ./foo.arc and have it work like an executable.

This works in Anarki (at least, I think so, I haven't tried it), because they made an arc.sh file, and used some kludgy stuff to make it work[1], but in Python it Just Works(tm), no fiddling needed. That's actually true for Arubic[2] as well, because it's written in Python and I designed it that way. You can just use #!/path/to/arubic and it'll Just Work(tm).

I'm probably criticizing Racket too harshly for that because I ended up partially implementing arc.sh by myself, then found a forum post that gave a better implementation, swiped that, then modified it. I doubt I'd have cared as much if arc.sh had been included with Arc 3.1 (and worked well).

It's not as big of a deal now that Anarki has arc.sh, but it's just one of those rough-around-the-edges areas. Python also makes it easy to parse command line switches, with the optparse module, though that could be implemented as a library in Arc, so it's not really fair to criticize Racket for that. :P
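For readers who haven't used it, a minimal `optparse` sketch looks roughly like this. The `-v` and `-e` switches and the `parse_args` helper are invented for the example and are not Arubic's actual command line; `optparse` remains in the standard library, though `argparse` is the more common choice in newer Python.

```python
#!/usr/bin/env python3
from optparse import OptionParser

def parse_args(argv):
    """Parse hypothetical interpreter switches; argv excludes the program name."""
    parser = OptionParser(usage="usage: %prog [options] [file.arc]")
    parser.add_option("-v", "--verbose", action="store_true",
                      dest="verbose", default=False,
                      help="print extra information while running")
    parser.add_option("-e", "--eval", dest="expr", metavar="EXPR",
                      help="evaluate EXPR instead of reading a file")
    return parser.parse_args(argv)   # (options, positional arguments)

options, args = parse_args(["-v", "foo.arc"])
print(options.verbose, options.expr, args)   # True None ['foo.arc']
```

With the shebang line in place and the file marked executable, the script can be run directly as ./script-name, which is the behavior described above.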

I'm curious, though, how would you write a Racket program that worked as a shell script?

---

"with respect to Racket vs. Python, an advantage of Arc's axiomatic approach is that it makes it really easy to implement Arc on top of different platforms."

Not in my experience. Arc actually has a lot of primitives[3], not to mention the whole thread/continuations/TCO thing. Implementing a simple toy Arc interpreter? Piece of cake. Making it actually good? A lot harder. That's another reason I'm trying to shove as much into Arc as I can... the more stuff in Arc, the less stuff you need to write in the interpreter layer.

Actually, a lot of my time has been spent debugging small bugs... I've had to wade through a lot of tiny bugs to get Arubic working properly in all sorts of weird edge-cases. Unit tests help a lot with that.

---

* [1]: Actually, I just took a look at Anarki's arc.sh and it looks pretty clean. I'm not sure if it behaves correctly, but I don't feel like testing it to see.

* [2]: I'm renaming py-arc to Arubic, because I like that name a lot. This also gives me some more flexibility... I'm no longer tied to Python. I could implement Arubic in Arc, or Ruby, or another language.

* [3]: Compared to most other popular languages, Arc has very few primitives, but it still has a lot more than I'd like. One thing I did like, though, is that it was incredibly easy to write a tokenizer/parser for Arc, because it has such a regular syntax, due to being a Lisp.

When I implemented a top-down operator precedence parser (in JavaScript) to parse a custom language, it ended up being a lot harder, because the custom language uses syntax, and not particularly regular syntax either.

-----

2 points by aw 5197 days ago | link

"Writing an Arc program that works as a shell script."

Ah, I don't think this is a Racket problem. Racket has good information on how to run a Racket program as a shell script: http://docs.racket-lang.org/guide/scripts.html

By the way, Racket also has a decent command line argument parser: http://docs.racket-lang.org/reference/Command-Line_Parsing.h...

Of course, to say that Arc doesn't make this easy enough is certainly a valid criticism. (But I don't think we can blame Racket for that).

"not to mention the whole thread/continuations/TCO thing"

Your point is well taken. The advantage of an axiomatic approach is once you have them implemented, the rest of the system will run on top of them... but the powerful axioms may be hard to implement.

-----

1 point by Pauan 5196 days ago | link

"Ah, I don't think this is a Racket problem."

Fair enough. I had somewhat assumed that the problem was Racket, because Arubic handles it just fine. My mistake.

That actually looks like a pretty good command line parser. Thanks for the links!

P.S. Once ar gets into a somewhat more complete state, you may want to consider making shell scripts work well with it. For instance, if I'm in the directory /foo and I run a script that's in /usr/bin and the script uses (load "lib/bar.arc"), I would expect it to load /usr/bin/lib/bar.arc, not /foo/lib/bar.arc.
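The path rule described here is easy to sketch in Python. `resolve_load_path` is a hypothetical helper, not part of ar or Arubic, and the expected result assumes a POSIX-style filesystem.

```python
import os

def resolve_load_path(script_path, relative_path):
    """Resolve relative_path against the directory containing script_path,
    rather than against the current working directory."""
    script_dir = os.path.dirname(os.path.abspath(script_path))
    return os.path.normpath(os.path.join(script_dir, relative_path))

# The script lives in /usr/bin; the current directory does not matter.
print(resolve_load_path("/usr/bin/myscript.arc", "lib/bar.arc"))
# /usr/bin/lib/bar.arc
```

This sketch does not follow symlinks; swapping `os.path.abspath` for `os.path.realpath` would resolve them first, which may or may not be what a script author expects.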

I made this work in my copy of pgArc, but it'd be nice to have it be more standardized. Obviously it should be possible to load relative to the current directory as well, if your script wants to.

Mind if I go in and start hacking on ar? :P If I decide to make an Arc compiler in Arc, I'd probably base it on ar.

---

"The advantage of an axiomatic approach is once you have them implemented, the rest of the system will run on top of them... but the powerful axioms may be hard to implement."

That is true. I do like the axiomatic approach, but I think Arc can improve and become even more axiomatic, which is one of my goals with Arubic.

-----

1 point by aw 5196 days ago | link

"Mind if I go in and start hacking on ar?"

Please do! (that's why I put it on github :)

By the way, the hackinator (https://github.com/awwx/hack) does generate an Arc program that can be run from the shell from any directory (for example https://github.com/awwx/hackbin/blob/master/hack works this way). I just haven't gotten around to getting it into ar yet.

Oh, and just so you know, the above example produced by the hackinator incorrectly uses aload to load Arc files after arc.arc has been loaded instead of using Arc's load; this is a known issue I've listed in the to-do list in the README.

-----

1 point by Pauan 5198 days ago | link

I'll note that if environments didn't have an `outer` attribute, and I made the `body` attribute read-only[1], then you could still get information hiding (with closures), but at the cost of (slightly) decreased malleability[2].

I could still allow for mutating the arguments list and environment, since I don't think that would leak hidden information (if you know of a way, I'd like to hear it!). Of course, if I went that route, then I should give a way for Arc code to "freeze" an attribute, making it read-only.
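A per-attribute "freeze" could look something like the following Python sketch. The `Table` class and its `freeze` method are hypothetical, meant only to show the idea of making one attribute read-only while leaving the others mutable.

```python
class Table:
    """Hypothetical attribute table whose keys can be frozen one at a time."""

    def __init__(self, **attrs):
        self._attrs = dict(attrs)
        self._frozen = set()

    def __getitem__(self, key):
        return self._attrs[key]

    def __setitem__(self, key, value):
        if key in self._frozen:
            raise TypeError("attribute %r is read-only" % (key,))
        self._attrs[key] = value

    def freeze(self, key):
        """Make further assignments to key fail."""
        self._frozen.add(key)

# Tuples are used so the frozen body cannot be mutated in place either.
foo = Table(arguments=("a",), body=(("+", "a", 1),))
foo.freeze("body")

foo["arguments"] = ("a", "b")     # still allowed
try:
    foo["body"] = ()              # rejected
except TypeError as error:
    print(error)                  # attribute 'body' is read-only
```

Note that freezing only blocks reassignment. If the stored value were a mutable list, code could still change it in place, so a real design would also need immutable values or a deep freeze.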

Side note: if every function had an `arguments` attribute, then the `sig` table in Arc would be redundant, and could be removed.

---

* [1]: Why would the `body` attribute need to be read-only? Consider this simple example, demonstrating information hiding:

  (let a 0
    (def foo ()
      (if (< a 3)
        (++ a))))
        
  (foo) -> 1
  (foo) -> 2
  (foo) -> 3
  (foo) -> nil

If I allowed for mutating the function's body, then you could remove the `if` check. In a less contrived example, you could mutate the function so it gives you direct access to the information hidden in the closure.

* [2]: I say it's only slightly decreased, because it's still possible to "mutate" a function by wrapping it in another function. This is what `extend` does.
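The wrapping approach from note [2] can be sketched in Python as follows. This is a loose approximation of the idea behind `extend`, not Arc's actual macro: the signature `extend(fn, test, replacement)` and the `size` example are invented for illustration.

```python
def extend(fn, test, replacement):
    """Return a wrapper that calls replacement when test accepts the
    arguments, and falls back to the original fn otherwise."""
    def wrapper(*args):
        if test(*args):
            return replacement(*args)
        return fn(*args)
    return wrapper

def size(x):
    return len(x)

# "Mutate" size by rebinding the name to a wrapper that also handles ints.
size = extend(size,
              lambda x: isinstance(x, int),
              lambda x: len(str(abs(x))))

print(size("abc"))    # 3
print(size(12345))    # 5
```

The original function is untouched and anything in its closure stays hidden from the wrapper, which is why this style keeps information hiding intact.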

-----