Arc Forumnew | comments | leaders | submitlogin
Polymorphs: My proposed solution to the 'has-a' typing problem
6 points by cchooper 6127 days ago | 13 comments
This post is a contribution to the discussion of type systems that almkglor kicked off a few days ago (http://arclanguage.org/item?id=4855). It's been a really interesting discussion and has raised all kinds of genuine criticisms about the way Arc handles things, and a number of proposed solutions. This is my proposed solution to the problem, and it's quite different (and yet similar) to everyone else's. My post is rather long, because I want to give all the reasons why I did things the way I did, so apologies for that. Please let me know what you think.

Before I go any further, let me say that my feelings on typing are very simple.

  (> fewer-types more-types)
  => t
I like the functions-per-type ratio to be high, so I avoid creating new types when possible. In fact, I haven't found the need to create a single new type in my Arc programming so far.

But the problem will arise eventually, and when it does we'll want our old functions to work with the new types. Arc is already pretty overloaded (e.g. functions that work on strings and lists of strings) so we can assume it'll get even more overloaded as new types turn up.

pg has already given a way to make an old function work with new types. Just redefine the function to accept the new type. We even have a macro in Anarki to make this easier. This problem is solved.

The real problem arises further upstream, when you have functions which call these redefined functions, but switch on the value of isa or islist and things like that. These functions will fail because they look at the intrinsic type of your object, and are hard-coded to only accept a few types. Other than redefining isa to work in mysterious ways, there's no easy way to solve this, so functions that act like this need to be rewritten/redefined. What we need is a standard way of rewriting them that ensures functions always call the right function on the right type, without knowing ahead of time exactly what types may be passed to it. This is the essence of polymorphism.

My solution has the following benefits. Firstly, it allows you to tell a function what capabilities your type has (e.g. is it a sequence? Does it have a car?). This is what I think everyone means by 'has-a' semantics. You can do this two ways: you can either say what functions your type supports, or you can say that your type implements an abstract type (interface) and therefore supports all the operations on that abstract type. I think being able to do both is best, and you can do both with my solution. From now on though, I'll assume we're doing the first one because it makes the code simpler.

Secondly, it allows you give your type different capabilities in different situations. This is useful because you may have a type that can be treated, for example, as an array or a sequence, but you want to force a function to treat is as a sequence because you don't like what it does to arrays.

Thirdly, it allows you to implement all kinds of crazy inheritance systems, object-oriented DSLs and stuff like that, if that's the kind of thing that floats your boat. In fact, the whole solution is designed to be super-flexible, so you can implement all kinds of type systems on top of it. It's not so much a type system, as a way of dynamically extending your type system to fit your requirements. It's like the Meta-Object Protocol, only simpler.

Fourthly, it doesn't redefine functions like type and isa, nor does it interfere with any other type system you may want to implement. Caveat: you still need to redefine other core functions if you want to make them polymorphic.

Fifthly, it does all this with just one type! That's the bit that keeps me happy :)

So how does it work? You simply create a new type called a polymorph (the Red Dwarf reference should give you a clue as to what it does):

  (def polymorph (value methods)
    (annotate 'polymorph (cons value methods)))
Then you write your polymorphic function like this:

  (def polymorphic-car (x)
     (call car x)))
The call macro has the following behaviour: if x is not a polymorph, then simply call the supplied function on the object.

  (car x)
But if x is a polymorph, look up the required function in the methods part of the polymorph object, and call it on the value part. The methods could be stored as a hash table or similar.

  (((cdr x) 'car) (car x))  ;; [Note: need to fix multiple evaluation of x.]
So the methods part stores all the methods for that object. If you want to use the second kind of polymorphism, where you specify an interface instead of a particular function, then you just store the interface in the hash, and this interface object itself contains all the functions.

  ((((cdr x) 'sequence) 'car) (car x))
So two polymorphs are of the same 'type' if they use the same methods object.

There's lots of flexibility here already. For example, every polymorph can be given different methods. You're not restricted to having every value of a given type use the same methods, but can set things up however you want. Also, you can always fish the value out of your polymorph and stick it in a new polymorph with different methods. That's how you make the behaviour of your type relative to a given context.

But it can be made even more flexible. So far, I've assumed that the methods are stored in a hash list. But suppose that we used a polymorph instead, and rewrite call so that it is also polymorphic. If the methods are a hash, then do as above, but if they're a polymorph...

  ((call (getmethod (cdr x)) 'car) (car x))
What this means is that we can now use more interesting objects (anything that implements getmethod) to store the methods. Examples:

1. If we want our types to have single inheritance, then we could create a polymorph that contains a list of hashes and has a gethmethod method that traverses them to find the right method.

2. If we want multiple inheritance, then we could use a tree of hashes instead of a list.

3. We could have a different hash for each interface. Then we could add/drop interfaces as required.

4. We could even use an object that generates methods on the fly by returning closures. Ruby on Rails abuses that kind of feature a lot. I'm sure we can find a use for it in Arc too.

You can even use all of these side-by-side. They don't interfere with each other, because each system is encapsulated in a different kind of polymorph. You can even compose them together. For example, instead of implementing single inheritance with a list of hashes, you could use a list of polymorphs, and so on.

We could go even further: I stored the method and value of the polymorph as a dotted pair, but there's no reason why this can't be a polymorph too. This would allow you to store extra information in your polymorph, such as a list of interfaces it supports. You can even make call even more polymorphic by having call methods in your polymorphs, so they can decide how to call themselves! This could be used for implementing multiple dispatch and who knows what else.

A fully polymorphic type system like this gives you a lot of the power of the Meta-Object Protocol, but in just a few lines of code.



5 points by almkglor 6127 days ago | link

Hmm. Don't have time currently to think deeply about things, but how would you implement, say, an Arc-specified complex number (with the proviso that if mzscheme supports a complex number, you don't cheat and use that - you have to make the complex number in Arc).

And of course, you have to be capable of adding a non-complex number with a complex number, as well as multiply etc.

Then of course implement 3-d vectors, with real-imaginary-z components, with the implication that a complex number is simply a 3-d vector with z=0.

(Edit) As a final proviso, make the 3-d vectors and complex stuff into independent, separately loadable libraries. ^^ No loading one before the other. ^^ That way we can also check if it's possible for different pieces of code to interoperate using the proposed type system properly.

This, ultimately, is the problem when methods are stored with the object: what if a method acts on different objects? Which object's method should we use? It's okay for methods which act on only one object (such as 'car and 'cdr) but what about those that don't (such as '+ and '-)?

-----

3 points by cchooper 6127 days ago | link

It doesn't do multiple dispatch, so you can't really do all of this yet. Multiple dispatch could either be implemented in call or by putting the call implementation in the polymorph.

If you put it in the polymorph then you have to reimplement it each time, which is bad. On the other hand, you can customise how multiple dispatch works for different types. You'd probably need that for arithmetic, because you always have to coerce values to the most general type (e.g. integers to rationals). I don't know any other types that do that. It's a special case.

Vector types could perhaps be implemented by supporting the interface '(vector n) where n is the dimension. Then your complex numbers could implement '(vector 3) or something. Dependent typing is not something I've thought about. Maybe it can be done.

Another way of doing this could be to have alternatives to call. You could have a simple single-dispatch call, then another (multi-call) that works like multiple-dispatch in CLOS (using inheritance hierarchies to choose a method), then another one that coerces everything to the 'most general type' (whatever that means in the context).

-----

2 points by almkglor 6126 days ago | link

"I don't know any other types that do that. It's a special case."

Suppose I'm modelling a starfield for a computer game, and I have ships and missiles flying around. I then, quite reasonably enough, have a (collide a b) function that I call whenever I notice that one of the ships collides with a something else. We could have ships colliding with ships, or colliding with missiles, or missiles colliding wiht other missiles. Worse, we've relegated our collision detecting code to a loop that goes through the list of game elements, so we can't exactly say that a will be a ship or a missile (or even an asteroid). Different things will happen based on the type of object - missiles just cancel each other out, but ships have to explode convincingly, and of course there's a special handling for the player's own ship (because it's game over then).

It's not a web app, but hey ^^

So multiple dispatch is in fact a pretty good strength in CLOS, and it's generally done over the is-a-ness of the object ^^.

-----

1 point by cchooper 6125 days ago | link

That was in reference to coercion, not multiple dispatch. I actually think the whole thing would be pretty useless without multiple dispatch.

-----

2 points by almkglor 6125 days ago | link

Ah, ok. So: how do we implement multiple dispatch when functions are attached to objects? ^^

-----

1 point by cchooper 6124 days ago | link

I'm working on it :)

-----

1 point by almkglor 6124 days ago | link

Hmm. Any ideas yet? I can't think of any so far.

Somehow I feel that having two layers - an is-a type attached to an object, and has-a interface semantics attached to functions on an object, might work. Basically a type declares a set of has-a interfaces, and provides (using multiple dispatch) the functions used to implement the interface. So an object is-a ship and is-a missile and is-a asteroid and is-a player-ship, and each of those types has-a collideable interface, defining how different objects react to being bashed against one another.

Or not. Dunno.

-----

4 points by absz 6127 days ago | link

That seems like a very interesting idea, and I haven't had time to fully digest it yet, but here are some of my first thoughts.

(1) I don't really like having to write (call car x); that adds a layer of complexity, and I'm just going to want to write (car x), thus losing benefits down the line. It might be nicer to by default do things polymorphically, and then only specify "I want this car" when we need it.

(2) If you just define a way to say "I implement this function" in the core, and leave out interfaces, then they can be added on later. If you just define interfaces in the core, going the other way requires lots of extra one-function interfaces. Don't do it! Provide interface functionality, but put that on top of the core.

(3) Using a polymorph to store methods (and everything else) is a wonderfully clever bit of metahackery, and really does seem to allow pretty much anything you want: it's all in the getmethod. Uniformity is very powerful: look at Smalltalk and Lisp. Now look at Java. Quick, look at Lisp again! :)

(4) With the way you make things explicit, it feels like you're pretty much implementing an ugly OOP system on top of Arc. Again, make more things implicit--that way it won't feel like classic OOP, and thus (hopefully) won't be used like classic OOP. call feels like a long-winded way to do a message send (à la Smalltalk); if that's explicit, it feels like we're Greenspunning Smalltalk. Instead, let's integrate the feature. (I don't know if that was clear at all... it's related to point 1.)

I can't wait to see what people do with this--the more I think about it, the more I like this idea.

-----

3 points by almkglor 6127 days ago | link

(1) Agree

(2) Could be done, and have interfaces simply act as a shortcut way of saying "I have (for objects) or need (for functions) the following: a, b, c, and d"

(4) LOL. We're making an informally-specified, ad-hoc, bug-ridden implementation of half of CLOS. ^^

-----

2 points by sacado 6127 days ago | link

"We're making an informally-specified, ad-hoc, bug-ridden implementation of half of CLOS"

That's what I thought when I was thinking about solutions for these problems :)

I think (1) could be solved simply by using redef on each polymorphic function : if car becomes polymorphic, just encapsulate the desired behavior in a new definition of car that will do the dynamic dispatch for you. If you want a specific version of car (e.g. the array implementation), just call it explicitly : (array-car x) .

-----

2 points by cchooper 6127 days ago | link

(1) Yep, I agree. You'd want to put the polymorphisn all the way down, so you nearly never have to actually use call in your code.

(2) Again, agreed. Implement features on top of the system, rather than within it, just like Arc!

(4) Might be tough to make everything completely implicit. Ideally, the whole thing should be invisible, like it didn't even exist, but works anyway. All suggestions for doing that are welcome.

-----

3 points by absz 6127 days ago | link

(4) Actually, I think we're in agreement there, I just didn't say what I meant too well :) In fact, one of the reasons your post may look so explicit is that you're discussing how to implement the system, and thus have to deal with a lot of "low-level" code, the way the first few lines of arc.arc look nothing like most Arc you'll actually use (safeset, etc.). For instance, call certainly appears to be such a feature—used to get the system working (and possibly sometimes used the way Ruby uses its #send method), but not in every case. Similarly, since you're discussing implementing the system, of course implementation details will show through. At any rate, if the goal for the "end-user" is invisibility/integration/implicit behaviour (which I think all mean roughly the same thing in this context), then we're in agreement.

-----

2 points by cchooper 6127 days ago | link

Yes, I wasn't disagreeing with you, just pointing out that it might not be easy. As you say, some of this will just disappear because it's low level and will be buried right at the bottom where you implement the basic type operations. Hopefully it'll all turn out like that, but I'm not counting my chickens yet.

-----