Arc Forum

1 point by rocketnia 5238 days ago | link | parent

"Thus, the name is intentionally ugly. You shouldn't be mucking around with __built-ins* unless you need to, but it's available just in case you do need it. Though... do you think the two underscores are too Pythony?"

"Too Pythony" was my first impression, lol, but it makes sense according to your naming scheme. An underscore means it's something people shouldn't rely on accessing, and two underscores means it's something people, uh, really shouldn't rely on accessing?

---

"There should be a blacklist (or a whitelist?) of "safe" and "unsafe" functions."

I'm not a security expert either, but I don't know if that should be a global list. Suppose I make an online REPL which executes PyArc code people send to it, and suppose my REPL program also depends on someone else's Web server utilities, which I automatically download from their site as the program loads (if I don't already have them). I might distrust the Web server library in general but trust it enough to open ports, but I also might not want REPL users to have that power over my ports.

I think it would make more sense to control security by manually constructing limited namespaces and loading code inside of them. There's likely to be a common denominator namespace that's as secure as you'd ever care to make it, but it doesn't have to be the only one.

Is there a way to execute resource-limited code in Python, like http://docs.racket-lang.org/reference/Sandboxed_Evaluation.h...? ...Hm, I suppose (http://wiki.python.org/moin/How%20can%20I%20run%20an%20untru...) is a starting point to answer that.

---

"As you can see, bar.arc wants to expand the macro `message` in bar's namespace, not in foo's namespace."

Well, it's fine if that's what you expect as the writer of bar.arc, but I'd expect things to actually succeed at being hygienic. My approach to bar.arc would be more like this:

  (= foo!something (fn () "goodbye"))

This doesn't need to pollute all uses of foo.arc in the application; bar.arc can have its own separate instance of foo.arc.

There may still be a namespace issue though. If foo.arc defines a macro with an anaphoric variable, like 'aif, and then bar.arc uses foo.arc's version of 'aif, then the anaphoric variable will still be in foo.arc's namespace, right? My own solution would look something like this:

  ; in foo.arc
  (mac aif ...
    `(...
       ,(eval ''it caller-namespace)
       ...))

1 point by Pauan 5238 days ago | link

"An underscore means it's something people shouldn't rely on accessing, and two underscores means it's something people, uh, really shouldn't rely on accessing?"

Yeah, I figured two underscores served as more emphasis than one. :P Also, two underscores seemed uglier to me, and also distinguished it from internal ("private") variables.

---

"I think it would make more sense to control security by manually constructing limited namespaces and loading code inside of them. There's likely to be a common denominator namespace that's as secure as you'd ever care to make it, but it doesn't have to be the only one."

When I said "global list" what I meant was just defining which are safe and which aren't. Then having a default safe namespace that would contain the items deemed safe.

Yeah, you can create custom namespaces. For instance, you could create a safe namespace that allows access to safe functions and (system), but nothing else:

  (= env (new-namespace))
  (= env.'system system)
  (load "foo.arc" env)

Voila. In fact, here's how you could handle the scenario you described:

  (= unsafe-env (new-namespace))
  (= unsafe-env.'open-socket open-socket)
  (= web (load "web-server.arc" unsafe-env))

  (= safe-env (new-namespace))
  (eval (read-input-from-user) safe-env)

Thus, web-server.arc has access to the safe functions, and open-socket. Meanwhile, the input that you get from the user is eval'd in a safe environment. It's a very flexible system. The above is verbose, I admit, but that can be fixed with a macro or two.
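
(Editor's sketch, not PyArc code: the same idea modeled in Python, with each namespace as a `ChainMap` layered over a shared table of safe built-ins. The names `SAFE_BUILTINS`, `new_namespace`, and the stand-in `open_socket` are assumptions made for illustration. The sketch only models which names are visible; it is not a real sandbox.)

```python
from collections import ChainMap

# Hypothetical stand-in for an "unsafe" built-in; illustrative only.
def open_socket(port):
    return f"socket on port {port}"

# Hypothetical table of functions deemed safe.
SAFE_BUILTINS = {"car": lambda xs: xs[0], "cdr": lambda xs: xs[1:]}

def new_namespace():
    """Return a fresh namespace whose lookups fall back to the safe built-ins."""
    return ChainMap({}, SAFE_BUILTINS)

# The web server's namespace is explicitly granted open-socket.
unsafe_env = new_namespace()
unsafe_env["open-socket"] = open_socket

# User input would be evaluated here; nothing extra is granted.
safe_env = new_namespace()

print("open-socket" in unsafe_env)  # True
print("open-socket" in safe_env)    # False
print("car" in safe_env)            # True
```

Writes go to the top layer of each `ChainMap`, so granting `open-socket` to one namespace does not leak into the shared safe table or into other namespaces.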

---

"My approach to bar.arc would be more like this:"

Hm... I admit that would probably be a clean solution most of the time, but what if you want both `something`s at the same time? You end up needing to store a reference to the old one and juggling them back and forth. Maybe that wouldn't be so bad.

Also, there can't be a `caller-namespace` variable (at least not implicitly), because then untrusted code could access trusted code, and so why have a distinction at all? Your example would work, but only if importers explicitly decide to give access:

  (= env (new-namespace))
  (= env.'caller-namespace (current-namespace))
  (= foo (load "foo.arc" env))

Now foo.arc can access caller-namespace, because you're allowing them to.

---

Side note: I'm going to start using .' rather than ! because I think the former looks nicer.

-----

1 point by rocketnia 5238 days ago | link

"Side note: I'm going to start using .' rather than ! because I think the former looks nicer."

I agree. ^_^

---

"Hm... I admit that would probably be a clean solution most of the time, but what if you want both `something`s at the same time? You end up needing to store a reference to the old one and juggling them back and forth. Maybe that wouldn't be so bad."

I don't know what else you could do if you wanted both somethings at once. ^^; That said, I think I'd just explicitly qualify foo!something or import it under a new name.

There's some more potential trouble, though. If a module defines something that's supposed to be unique to it, then two instances of that module will have separate versions of the value, and they may not be compatible. If a module establishes a framework, for instance, then two instances of the module may define two frameworks, each with its own extensions, and some data might make its way over to the wrong framework at some point. On the other side of the issue, if a module extends a framework, then two instances of the module might extend it twice, and one of them might get in the way of the other.

There are several possible ways to deal with this. Code that loads a library (host code?) could load it in an environment that had dummy variable bindings which didn't actually change when they were assigned to, thereby causing the library to use an existing structure even if it created a new one. Framework structures could all be put in a single central namespace, as you say, and any code to make a new one could check to see if it already existed. A library could require some global variables to have already been defined in its load namespace, intentionally giving the host code a lot of leeway in how to specify those variables.

I've been considering all those approaches for Penknife, and I'm not sure what'll be nicest in practice. They all seem at least a little hackish, and none of them seems to really solve the duplicated-extension side of the issue, just the duplicated-framework side. At this point, I can only hope the scenarios come up rarely enough that whatever hackish solutions I settle on are good enough, and at least standardized so that not everyone has to reinvent the wheel. Please, if you have ideas, I'm all ears. ^_^

-----

1 point by Pauan 5238 days ago | link

When you say "unique to it" do you mean "only one value, even if the module is loaded multiple times"? Do you have any examples of where a module would want that?

In any case, all those approaches should work in PyArc, in addition to using __built-ins* (provided you really really want the library's unique something to be unique and available everywhere...)

Hm... come to think of it... an environment/namespace/module can be anything that supports get/set, right? It may be possible to create a custom data-type that would magically handle that. Somehow. With magic.
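
(Editor's sketch in Python, not PyArc: one way a get/set-only namespace type could do this, following the "dummy bindings" idea mentioned earlier in the thread. Selected names are redirected to a shared store, and assignments to them after the first are ignored. The class `SharedNamesNamespace`, the function `load_framework`, and the variable name `hooks*` are all hypothetical.)

```python
class SharedNamesNamespace:
    """A namespace supporting only get/set, where chosen names live in a shared store."""

    def __init__(self, shared, shared_names):
        self._local = {}
        self._shared = shared
        self._shared_names = frozenset(shared_names)

    def __getitem__(self, name):
        if name in self._shared_names:
            return self._shared[name]
        return self._local[name]

    def __setitem__(self, name, value):
        if name in self._shared_names:
            # Keep the first definition; later assignments are silently dropped.
            self._shared.setdefault(name, value)
        else:
            self._local[name] = value

def load_framework(env):
    """Models loading a framework module that creates its own extension table."""
    env["hooks*"] = {}
    env["loaded"] = True

shared = {}
a = SharedNamesNamespace(shared, ["hooks*"])
b = SharedNamesNamespace(shared, ["hooks*"])
load_framework(a)
load_framework(b)

a["hooks*"]["on-load"] = "registered via a"
print(b["hooks*"])  # {'on-load': 'registered via a'}
```

Both loads end up using the same `hooks*` table, while ordinary names such as `loaded` stay local to each namespace.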

-----

1 point by rocketnia 5238 days ago | link

"When you say "unique to it" do you mean "only one value, even if the module is loaded multiple times"? Do you have any examples of where a module would want that?"

I thought I gave an example. If a module defines something extensible, then having two extensible things is troublesome, 'cause you have to extend both of them or be careful not to assume that values supported by one extensible thing are supported by the other.

---

"Somehow. With magic."

I propose also reserving the name "w/magic" for use in examples. :-p

-----

1 point by Pauan 5238 days ago | link

"something extensible?" Got any more specific/concrete examples?

---

"I propose also reserving the name "w/magic" for use in examples. :-p"

Okay, but if I find a magic function I'm going to put it in PyArc so you can use it in real code too. :P

-----

1 point by rocketnia 5238 days ago | link

""something extensible?" Got any more specific/concrete examples?"

I mean something extensible like the 'setter, 'templates, 'hooks, and 'savers* tables, as well as Anarki's 'defined-variables*, 'vtables*, and 'pickles* tables, all defined in arc.arc. These might sound familiar. ^_^

Lathe (my blob of Arc libraries) is host to a few examples of non-core Arc frameworks. There's the Lathe module system itself, and then there's the rule precedence system and the type-inheritance-aware dispatch system on top of that. There's also a small pattern-matching framework.

If you load the Lathe rule precedence system twice (which I think means invasively removing it from the Lathe module system's cache after the first time, but there may be other ways), you'll have two instances of 'order-contribs, the rulebook where rule precedence rules are kept. Then you can sort some rulebooks according to one 'order-contribs and some according to the other, depending on which instances of the definition utilities you use.

---

"Okay, but if I find a magic function I'm going to put it in PyArc so you can use it in real code too. :P"

I think I saw one implemented toward the end of Rainbow.... >.>

-----

1 point by Pauan 5238 days ago | link

Hm... I'm not sure why that's an issue, though. If a module imports your module, they'll get a nice clean copy. Then if a different module imports your module, they get a clean copy too. Everything's kept nice and isolated.

If you want your stuff to be available everywhere, stick it in __built-ins*. Unless you have a better suggestion?

-----

1 point by rocketnia 5237 days ago | link

"Hm... I'm not sure why that's an issue, though. If a module imports your module, they'll get a nice clean copy. Then if a different module imports your module, they get a clean copy too. Everything's kept nice and isolated."

That's exactly why I'm not sure it'll come up much in practice. But as an example...

Suppose someone makes a bare-bones library to represent monads, for instance, and someone else makes a monadic parser library, and then someone else finally makes a Haskell-style "do" syntax, which they put in their own library. Now I want to make a monadic parser, but I really want the convenience of the "do" syntax--but I can't use it, because the parser library has extended the monad operations for its own custom monad type and the "do" library only sees its own set of extensions.

You mentioned having the person loading the libraries be in charge of loading their dependencies, and that would yield an obvious solution: I can just make sure I only load the monad library once, giving it to both libraries by way of namespace inheritance or something.
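
(Editor's sketch in Python of the scenario above. Each "load" is modeled as a function call that returns a fresh namespace; the three loader functions and the `binds` table are hypothetical stand-ins, not real libraries.)

```python
def load_monad_lib():
    """Each call models a fresh load: a new, empty table of bind implementations."""
    return {"binds": {}}

def load_parser_lib(monad):
    """Models the parser library extending the monad operations for its own type."""
    monad["binds"]["parser"] = lambda m, f: f(m)  # placeholder implementation
    return {"monad": monad}

def load_do_lib(monad):
    """Models the "do" library, which only sees the monad instance it was given."""
    def supports(type_name):
        return type_name in monad["binds"]
    return {"supports": supports}

# Each library loads its own copy of the monad library: the extension is invisible.
parser = load_parser_lib(load_monad_lib())
do = load_do_lib(load_monad_lib())
print(do["supports"]("parser"))  # False

# The host loads the monad library once and hands the same instance to both.
monad = load_monad_lib()
parser = load_parser_lib(monad)
do = load_do_lib(monad)
print(do["supports"]("parser"))  # True
```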

But is that approach sustainable in practice? When writing a small snippet for a script or example, it can't be convenient to enumerate all the script's dependencies and configure them to work together. Over multiple projects, people are going to fall back to in-library (load ...) commands just for DRY's sake. What I'd like to see is a good way to let libraries specify their dependencies while still letting their dependents decide how to resolve them.

---

"Unless you have a better suggestion?"

I've told ya my ideas: Dummy global variable bindings and/or a central namespace and/or configuration by way of global variables. (http://arclanguage.org/item?id=14036) They're all too imperfect for my taste, so I'm looking for better suggestions too.

-----

2 points by Pauan 5237 days ago | link

Hm... like I said, it should be possible to build a more complicated system on top of the simple core, though I'm not sure exactly how it would work.

But... here's an idea: a bootloader module that would load itself into __built-ins* so it could persist across all modules, including modules loaded later.

It could then define (namespace ...) and (require ...) functions or something. Modules could be written using said constructs, and the bootloader would then handle the dependencies, creating namespaces as needed. And it could keep a cache around, so re-importing a module that has already been imported will just grab it from the cache.
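
(Editor's sketch in Python of just the caching part of that idea. The `require` function, its `loader` argument, and `load_foo` are hypothetical; dependency resolution is left out entirely.)

```python
# Cache kept by the hypothetical bootloader: module name -> loaded namespace.
_module_cache = {}

def require(name, loader):
    """Load `name` once; later calls return the cached namespace."""
    if name not in _module_cache:
        _module_cache[name] = loader()
    return _module_cache[name]

load_log = []  # records each real load, to show the loader runs only once

def load_foo():
    load_log.append("foo")
    return {"something": lambda: "hello"}

first = require("foo", load_foo)
second = require("foo", load_foo)

print(first is second)  # True
print(len(load_log))    # 1
```

Calling `load_foo()` directly still gives a fresh, uncached instance, which corresponds to bypassing the bootloader with plain (load).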

The bootloader could then define (use ...) or something, which would do all the automatic dependency and caching junk, but you could still use plain old (load) and (import) to bypass the bootloader and get more refined control. Something like that may work.

Haha, I just had a crazy idea. What if a module imported itself into __built-ins*? Something like this:

  ; foo.arc

  (if no:__built-ins*.'foo-check
    (do
      (= __built-ins*.'foo-check t)
      (load "foo.arc" __built-ins*))
      
    (do
      ; define rest of foo.arc here
      ...))

I suspect any solution will have some wart or other. Tradeoffs and all that. Also, the solution to the specific problem you mentioned is to load them all in a single namespace, right? Or at least namespaces that inherit from some common one.

So perhaps we could define a macro that makes that easier, since the current way of doing it is pretty verbose. Assuming it was almost-as-simple as (import ...) that would help ease the pain somewhat, though it wouldn't help with dependency management (that's a whole different ballpark).

I also thought of a macro that would make it easier to import/export stuff to/from a module. Right now you need to do stuff like this:

  (= env (new-namespace))
  (= env.'foo foo)
  (= env.'bar bar)
  ; etc.

Which is clunky. But I haven't figured out a good name for it. Okay, wait, I could use plain-ol `namespace`:

  (namespace foo bar)

I'm undecided though. It's like (table) vs (obj), only with namespaces.

-----

1 point by Pauan 5238 days ago | link

Oh, and by the way. In addition to creating a safe namespace and selectively giving it unsafe functions, you can also remove functions from a safe namespace.

For instance, suppose you wanted to run code in a safe environment, but you didn't want it to be able to print (using pr, prn, prt, etc.). You could use this:

  (= env (new-namespace))
  (= env.'disp nil)

  ; do something with env

Like I said, it's very flexible. You have complete control over what is/isn't in a namespace. You can execute each module in its own namespace, or combine them however you wish, etc. It has a very simple core, but has many many potential uses.

-----