"Of course this problem is completely solved by making the global namespace lexical rather than dynamic."
What do you mean by "lexical"? Module-local?
In Arc, I'd recommend you avoid using global scope for anything local (natch ^_^ ). You may prefer the ability to declare private variables or something, but a big (let ...) isn't that bad. You're even using the equivalent of a big (let ...) in JavaScript.
If what you want is first-class namespace support or selective importing, that's different.
I don't especially like the idea of infix operators breaking up juxtaposed identifiers like "a b". Infix operators are visible boundaries, and juxtaposition has no boundary except whitespace.
However, this is lisp syntax. Juxtaposition is an element separator, not an operator of its own. Ultimately, I don't know where I stand on this. :)
---
"And then "is" would implement pattern match behavior."
Because of my preference for patterns to be inverted expressions, I would rather (is b 5) match only t or nil. If the input is t, then b will be bound to 5. If the input is nil, b will be bound to anything but 5.
Not that I would have any use for such an operator.
---
The inverse of the optional arg pattern (o b 5) would be a splicing(!) expression which usually returned a length-1 subsequence containing b, but might(!) instead return a length-0 subsequence if b were 5. I wouldn't have a use for this operator either.
Actually, it might not be as simple as that. Should the expression 5 become the pattern 5 when we invert (o b 5)? What would that mean?
"The way I like to look at it, the correspondence between patterns and expressions is based on one being the inverse of the other. If a value is matched and then reconstructed using the same code, it should become the same value.
For example, since (cons a b) as an expression takes two inputs and gives one cons cell as output, its pattern version takes one cons cell as input and gives two outputs."
I think you're thinking in terms of Haskell pattern matching. I view pattern matching as being more general and dynamic, not exclusively about destructuring the components of data. I see "data destructuring" as being one kind of pattern matching. Nulan supports this kind, of course, but also supports other forms too. For instance, regexps are a form of pattern matching that slices and dices strings, but you can't recreate the string from the regexp.
---
"The pattern version of (and ...) would take one input and give at least one output. (I'm ignoring zero-argument (and) here.) If the input is nil, the outputs can be picked arbitrarily from the space of all possible values, as long as at least one of them is nil. If the input is not nil, then the last output will be equal to that value, and the other outputs can be any arbitrary non-nil values. Thanks to all this nondeterminism, an 'and pattern isn't exactly useful or even meaningful."
That's not how Nulan's pattern matching works. At least, for now. The point of Nulan's pattern matching is simply to take an environment, a pattern, and some blob of data and return an environment that possibly binds new variables to the data. This means a pattern could insert arbitrary stuff into the environment, or return an empty environment, etc.
I'm not big on the whole staticy guarantee thing, where you try to prevent bad stuff from happening. So my stance is, "well, if you wanna do that stuff, go ahead, even though I personally don't see much use for it." And if somebody happens to write a pattern matcher that does funky stuff, I can simply choose to not use it. That's what modules are for: to decide which things you want to use and not use.
"One refinement of this interface is to pick the outputs such that they're all equal to the input. That seems to be what you're going for."
That is kind of the point of "and", yes.
---
"The same implementation fits the requirements for a pattern version of 'or, so why choose one of these names over the other?"
No it doesn't. "or" matches only a single pattern, the first one that works. "and" always matches all the patterns, or fails. This is analogous to how "or" and "and" work in pretty much every programming language.
---
"If pattern-and's outputs are always equal, then expression-and's inputs should always be equal too, right?"
I have no clue what you mean by expression-and. I guess you're talking about using "and" in a non-pattern context. Right now, "and" behaves like it does in Arc (and many other languages). Pattern matching and calling are two separate things and have two separate implementations.
---
"We can dodge both of these conundrums by defining 'as as its own operator and just giving up on finding inverted counterparts of 'as, 'and, and 'or. It's okay for some computations to be non-invertible--unless, of course, the language design demands otherwise."
I see no reason to define "as" as its own operator since I can simply use "and" to get the same effect. I guess "as" might be slightly more self-documenting.
---
"Besides inverting expressions, you and Racket (not to mention my past self) do some extra things with pattern matching beyond inversion: You allow the same variable to appear more than once in a pattern, and you allow backtracking when a match fails."
Yes.
---
"If the variables are seen as logic variables, then multiple variable occurrences make sense as multiple constraints to be unified. Racket and Nulan operations can only operate on values that have been determined up to equality, so the logic variable semantics degenerates into an equality test.
In another language, (and ...) as a pattern might constrain its arguments without actually making an arbitrary choice. (At this point it's even tempting to remove notions of "input" and "output" and just reason about a constraint graph, but I don't know how to work that way. In my vague thought experiments, the extra polarity comes in handy for choosing orders of evaluation and such.)"
No clue what you're talking about here.
---
"Personally, I would expect this unconstrained foo to shadow nonlocal variables named "foo" just as if it were constrained to a particular value. However, if I tried to access it, I would expect some kind of uninitialized variable error. Is your version of "The Right Thing(tm)" the same as mine?"
If one of the branches does not bind "foo" and you try to use "foo", it will use the lexically scoped "foo":
  (let foo 5
    (let (or {foo} bar) 2
      # foo is 5
      # bar is 2
      ))
That's a good point, though: I should support the ability to write a pattern matcher that binds non-matching branches to variables that throw an error when accessed.
That way you can have a version of "or" that uses the lexical scope for non-matching branches, or one that uses special variables or whatever.
---
"In other news, this kind of (o a 5) syntax means you're putting expressions inside your patterns inside your expressions."
No, it doesn't. These are all patterns:

  (o a 5)
  {a (o b 5)}
  (list a (o b 5))

"{a (o b 5)}" is just syntax sugar for "(list a (o b 5))".
A pattern is defined as:
* A list, in which case the first element has a %pattern-match property that is a vau that describes how to do the pattern matching.
* A symbol, in which case the value is bound to the symbol.
* Everything else is compared by equality.
So in the above case, "list" has a %pattern-match property, and "o" also has a %pattern-match property. It's just simple recursion with first-class environments. Very simple to understand and implement.
---
"You may need to ask yourself just what variables should be in scope in that inner expression; do they include some variables "previously" bound in the pattern? There was no need for this notion of "previously" until now."
This is a non-issue for the most part because if a variable occurs twice, it will test for equality. But yes, it currently evaluates the pattern in the environment of the previous patterns.
Because Nulan uses first-class environments, it's also easy to evaluate the patterns in the dynamic environment, or the lexical environment. I think I'll change it to use the lexical environment.
---
"I don't especially like the idea of infix operators breaking up juxtaposed identifiers like "a b". Infix operators are visible boundaries, and juxtaposition has no boundary except whitespace."
Personally I think it looks nice and short. I might choose a different operator other than "=" but I don't see a problem with the idea itself. You just have to keep in mind that "=" is an infix operator, that's all.
---
"Because of my preference for patterns to be inverted expressions, I would rather (is b 5) match only t or nil. If the input is t, then b will be bound to 5. If the input is nil, b will be bound to anything but 5."
I don't have much interest in constraining pattern matching to only be inversions of expressions. User code can supply any kind of pattern matching they want. And the reason for using "is" is because "a = b" expands to "(is a b)". It isn't actually tied to the is-expression, as you might call it.
Also, I would probably use "if" for such an expression. As in:
  (var (if a) %f)         -> error
  (var (if a 1) %f)       -> error
  (var (if a 1 2) %f)     -> 2
  (var (if a) "foo")      -> "foo"
  (var (if a 1) "foo")    -> 1
  (var (if a 1 2) "foo")  -> 1
That actually seems pretty useful. Thanks for the idea.
"I have no clue what you mean by expression-and. I guess you're talking about using "and" in a non-pattern context."
Yep, specifically an expression context. :-p
---
"I think you're thinking in terms of Haskell pattern matching. I view pattern matching as being more general and dynamic, not exclusively about destructuring the components of data."
I see that, of course. I'm not even saying I disagree with adding that functionality, just saying I'd choose operator names so that whenever a pattern and an expression are identical, they correspond in semantics too.
Haskell's pattern matching supports backtracking if the pattern fails, but its expressions don't support the same feature, so I hesitate to cite Haskell as a good example of what I'm going for. It is closer, though.
---
Me: "The same implementation fits the requirements for a pattern version of 'or, so why choose one of these names over the other?"
You: "No it doesn't. "or" matches only a single pattern, the first one that works. "and" always matches all the patterns, or fails. This is analogous to how "or" and "and" work in pretty much every programming language."
I was still talking about pattern-expression symmetry, trying to explain it with enough examples to convey how I reason about it and why I might make different choices than you do.
The definitions you're choosing for pattern-and and pattern-or are (in most ways) not inverses of your definitions of expression-and and expression-or.
The point of pattern-as is that it takes one input and gives some outputs equal to it. If on the other hand you give several equal inputs to expression-and or expression-or, they'll all be equal to the output. Thus, both of these expression operators are special cases of the inverted interface of pattern-as.
---
"No clue what you're talking about here."
Logic variables are variables whose value is defined by solving a system of partial descriptions. Here's a sketch of an example:
  (introduce-logic-vars (a b c d)
    ; We don't know anything about a, b, c, or d yet.
    (= a (cons b c))
    (= (cons d a) c)
    ; Now we know a is the infinite list (b d b d b d ...).
    ; The order of = doesn't matter.
    (= a (cons b a))
    ; Now we know b and d are identical.
    (= c "not a cons")
    ; Now we know there's an error, since the descriptions are
    ; inconsistent.
    )
The terms "unification" and "constraint" may come in handy too, but these are strongly associated with logic programming and constraint programming respectively, and I'm not sure which kind of programming I'm talking about here.
---
Me: "In other news, this kind of (o a 5) syntax means you're putting expressions inside your patterns inside your expressions."
You: "No, it doesn't."
The "5" part is an expression, right?
Suppose the pattern were {a b c (o d (cons b e)) e}. Should the expression (cons b e) see the value of b as bound by the pattern, or the version in the nearest non-pattern surrounding scope? What version of e should it see?
---
"That actually seems pretty useful. Thanks for the idea."
Well, I'm glad you found something that might be useful, but I think that idea is all yours, lol. ^_^
"I see that, of course. I'm not even saying I disagree with adding that functionality, just saying I'd choose operator names so that whenever a pattern and an expression are identical, they correspond in semantics too."
It's up to the object to decide whether to ensure that correspondence or not. What operator name would you suggest for my "and" pattern matching?
---
"I was still talking about pattern-expression symmetry, trying to explain it with enough examples to convey how I reason about it and why I might make different choices than you do."
Yes, I know, and I'm saying I don't see any real benefit to that. So you're going to have to make a harder case to convince me.
---
"You're choosing definitions of pattern-and and pattern-or are (in most ways) not inverses of your definitions of expression-and and expression-or."
Yes, but I think it fits with the intuitive understanding of what "and" and "or" do. "and" makes sure all its arguments are "truthy" and "or" selects the first "truthy" argument. In this case, "truthiness" means whether the pattern matches or not. In an expression context, "truthiness" would mean "not %f".
---
"The point of pattern-as is that it takes one input and gives some outputs equal to it. If on the other hand you give several equal inputs to expression-and or expression-or, they'll all be equal to the output. Thus, both of these expression operators are special cases of the inverted interface of pattern-as."
Yes, and "and" is a superset of "as" because it behaves exactly like "as" when the first argument is a symbol and the second argument is a pattern:
  (var (and a {b c}) {1 2})
  (var (as a {b c}) {1 2})

But as you rightfully pointed out, "and" can do things besides that, because it can have more than 2 arguments, and it doesn't require a symbol as the first one. Thus "as" is effectively a restricted form of "and", which makes "and" a superset of "as".
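For reference, here's what Haskell's as-pattern looks like; "whole@" names the entire argument while the inner pattern destructures it, which is exactly the symbol-plus-pattern case above. The function name is mine:

  -- whole is bound to the full list, x to its head.
  -- (Partial: it fails on the empty list.)
  firstAndAll :: [Int] -> (Int, [Int])
  firstAndAll whole@(x:_) = (x, whole)

  -- firstAndAll [1, 2]  ==  (1, [1, 2])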
---
"Logic variables are variables whose value is defined by solving a system of partial descriptions."
Okay. I have no clue how to implement such a system, so I'll stick with my nice simple system.
---
"The "5" part is an expression, right?"
In that particular case, I suppose so, yes.
If you call "pattern-match", it's a pattern, if you call "eval", it's an expression. "o" happens to call "eval" on its second argument, so it's an expression.
---
"Suppose the pattern were {a b c (o d (cons b e)) e}. Should the expression (cons b e) see the value of b as bound by the pattern, or the version in the nearest non-pattern surrounding scope? What version of e should it see?"
The "o" pattern matcher can actually choose whether its second argument is evaluated in the pattern-match environment or the enclosing environment. I suppose the "obvious" thing to do would be to evaluate it in the pattern-match environment.
As for "e"... the "list" pattern matcher does its pattern matching from left-to-right by recursing on the cdr of the list. Which means at the time (cons b e) is evaluated, "e" is not in the pattern-match environment.
As for what version it should see... I like the simplicity of doing the pattern matching from left-to-right, so I suppose it would see whatever "e" is bound to (or not bound to) in the pattern-match environment at the time it's evaluated. In that case, it'll probably use the "e" from the enclosing scope.
---
"Well, I'm glad you found something that might be useful, but I think that idea is all yours, lol. ^_^"
"I have no clue how to implement [logic variables]..."
I'll pretend you're curious though~! :D
In proof theory terms, constraint logic programming is closely related to proof search: Given a theorem, it finds a proof. Functional programming is closely related to proof normalization: Given a proof, it finds a simpler way to express the same proof (by, for example, reducing the proof all the way to a literal value).
Going by this correspondence, one way to combine these kinds of programming is by putting the theorem-heavy constraint logic code into a type system for the proof-heavy functional code. Agda is probably a very expressive language for this approach, since the full computational power of its "static" type system can potentially live until run time. Other typed functional languages with implicit arguments or type classes fit this bill too, but the phase separation might stifle it.
If you're not so interested in type systems, other languages have tackled this combination of programming paradigms in a more procedural way. It looks like http://en.wikipedia.org/wiki/Oz_(programming_language) and its "see also" links are a good starting point.
A search for [racket constraint logic] brings up Racklog (http://docs.racket-lang.org/racklog/index.html). It doesn't look like Racklog has numeric constraints, and it's based on Prolog besides, so it's clearly in the logic programming camp.
I don't really have a clue how this stuff is implemented, but I fear it might just be a lot of backtracking search most of the time. :-p I'm guessing (ISO standard) Prolog is particularly doomed to backtracking search, since some of its predicates have side effects.
There's probably some kind of relationship between depth-first vs. breadth-first search and strict vs. lazy evaluation....
---
"[...]so I'll stick with my nice simple system."
"I like the simplicity of doing the pattern matching from left-to-right[...]"
As Rich Hickey says, "simple" and "easy" are distinct concepts. An easy implementation doesn't necessarily lead to a simple result.
In his "Simple Made Easy" talk, he specifically considered 'fold more complex than 'reduce due to the way it imposed a left-to-right or right-to-left ordering that wasn't necessarily part of the programmer's intent.
---
  fun {Fact N}
     if N =< 0 then 1 else N*{Fact N-1} end
  end

  fun {Comb N K}
     {Fact N} div ({Fact K} * {Fact N-K})  % integers can't overflow in Oz (unless no memory is left)
  end

  fun {SumList List}
     case List of nil then 0
     [] H|T then H+{SumList T}  % pattern matching on lists
     end
  end
I found the above to be cool: notice that the function arguments match the function call. This is like what akkartik was talking about: http://arclanguage.org/item?id=16924
I admit I am vaguely curious about lots of things. That Racklog in particular seems interesting.
---
"As Rich Hickey says, "simple" and "easy" are distinct concepts. An easy implementation doesn't necessarily lead to a simple result."
I am aware. But until I either A) find a better system that I can actually understand and implement or B) find some severe flaw with my system, I'll stick with my system.
In addition, because of immutable environments, variable bindings are strictly from top-to-bottom. So enforcing that same kind of strict left-to-right ordering is more consistent and I think intuitive for the programmer. So I would say it is both easy and simple.
---
"In his "Simple Made Easy" talk, he specifically considered 'fold more complex than 'reduce due to the way it imposed a left-to-right or right-to-left ordering that wasn't necessarily part of the programmer's intent."
Well, okay, but sometimes you really do want that different ordering.
---
"Ironically, he also dissed type systems. :)"
I guess it depends highly on what he meant by "type systems". I haven't seen that video so I don't know.
"What operator name would you suggest for my "and" pattern matching?"
I've been using "as" in this conversation, but I suppose I'd try to name it based on how its corresponding expression would behave (even if it isn't really usable as an expression). Since the point of this pattern is to act as a T-junction whose inputs and outputs are all equal, How about "idfn"?
If all else fails, I might name it something like "pand" just so it didn't conflict with 'and. But I don't seriously expect you to reach the same conclusion. :)
---
"So you're going to have to make a harder case to convince me."
Last time I brought this up, I gave an example implementation of a pattern-matching macro (with no backtracking) that took its operators from the same environment as the functions. In fact, the operators were implemented as Racket procedures with multiple return values.
At the time, I think I might have cited the advantage that a single macro could be used for patterns and expressions without any difference in the code it emits.
Personally, I find that a weak argument; there aren't likely to be very many macros that take advantage of this. Since you're embracing fexprs rather than macros, I don't expect it to convince you either, but I'm bringing it up just in case.
---
That's not a bad name choice at all, except that idfn takes a single argument, so it feels weird to have it take two with the pattern match version.
---
"If all else fails, I might name it something like "pand" just so it didn't conflict with 'and. But I don't seriously expect you to reach the same conclusion. :)"
That would be a good idea if there were multiple competing implementations for the "and" pattern. In this case, I think my implementation is reasonably intuitive and simple. So there's not much reason to rename it to avoid that sorta thing.
By the way, I know it's a bit of a pain, but it is possible to just do this:
(def pand and)
And now you can use "pand" instead of "and" in your own code.
---
"Since you're embracing fexprs rather than macros"
I'm actually only embracing fexprs because the implementation is super easy and makes dealing with environments easier.
I could instead have gone with the macro route and had a "call-with-current-environment" (analogous to call-with-current-continuation) instead.
That might actually be a better idea. It kinda bugs me having to specify the environment in vaus even if you aren't going to use it.
---
"As you indicate, this has to do with the pattern choosing to call 'eval on an argument. I think the scope question will come up pretty much any time a pattern does that. If different patterns do different things here, that's okay, but I'd hesitate to introduce that inconsistency without a good reason."
My philosophy in this case is basically "don't worry about it, just let the patterns do what they want to do. If I make it easy to do The Right Thing(tm), it'll be okay"
In other words, if I make it easy (and idiomatic) to evaluate in the pattern-match environment, then pattern matchers will naturally go that route unless they have a specific reason to do something else, in which case they're probably right.
"But as you rightfully pointed out, "and" can do things besides that, because it can have more than 2 arguments, and it doesn't require a symbol as the first one."
I didn't suppose 'as would have those restrictions. The reason 'as is a restricted version of 'and is because the arguments of 'as are fully determined by its value, while 'and is nondeterministic in that direction.
(I'm not talking about your pattern-and. That's just pattern-as to me.)
---
"In that particular case, I suppose so, yes."
That's the particular case I was talking about. :)
As you indicate, this has to do with the pattern choosing to call 'eval on an argument. I think the scope question will come up pretty much any time a pattern does that. If different patterns do different things here, that's okay, but I'd hesitate to introduce that inconsistency without a good reason.
Still, getting something that works for now is a good reason. :-p
My plan with patmac.arc was to support (o a 5) as one of many sequence parsing utilities inspired by regular expressions. The point is, this particular pattern doesn't describe a value; it describes a subsequence of length 0 or 1.
If you want this to work as part of {a b c} list patterns, I recommend somehow making sure all value patterns can be coerced to (or otherwise interpreted as) length-one subsequence patterns.
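For what it's worth, here's one shape that coercion could take, sketched in Haskell with hypothetical names, reusing the Value, Env, and Data.Map pieces from my earlier match sketch: a subsequence pattern consumes a prefix of the input and returns the remainder, and any value pattern lifts into one that consumes exactly one element. A full version of "o" would also need to backtrack between its two alternatives:

  -- Consume some prefix of the input; return new bindings plus the rest.
  type SeqPat = Env -> [Value] -> Maybe (Env, [Value])

  -- Coerce a value pattern into a length-one subsequence pattern.
  valuePat :: (Env -> Value -> Maybe Env) -> SeqPat
  valuePat p env (v : rest) = fmap (\e -> (e, rest)) (p env v)
  valuePat _ _ [] = Nothing

  -- (o name default): greedily consume one element if present;
  -- otherwise bind the default and consume nothing.
  optionalPat :: String -> Value -> SeqPat
  optionalPat name _ env (v : rest) = Just (Map.insert name v env, rest)
  optionalPat name dflt env [] = Just (Map.insert name dflt env, [])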
---
In other news, this kind of (o a 5) syntax means you're putting expressions inside your patterns inside your expressions. You may need to ask yourself just what variables should be in scope in that inner expression; do they include some variables "previously" bound in the pattern? There was no need for this notion of "previously" until now.
"I just made an interesting discovery: "and" is a superset of Haskell's "as" pattern"
How so?
The way I like to look at it, the correspondence between patterns and expressions is based on one being the inverse of the other. If a value is matched and then reconstructed using the same code, it should become the same value.
For example, since (cons a b) as an expression takes two inputs and gives one cons cell as output, its pattern version takes one cons cell as input and gives two outputs.
The pattern version of (and ...) would take one input and give at least one output. (I'm ignoring zero-argument (and) here.) If the input is nil, the outputs can be picked arbitrarily from the space of all possible values, as long as at least one of them is nil. If the input is not nil, then the last output will be equal to that value, and the other outputs can be any arbitrary non-nil values. Thanks to all this nondeterminism, an 'and pattern isn't exactly useful or even meaningful.
One refinement of this interface is to pick the outputs such that they're all equal to the input. That seems to be what you're going for. However, there are two caveats:
- The same implementation fits the requirements for a pattern version of 'or, so why choose one of these names over the other?
- If pattern-and's outputs are always equal, then expression-and's inputs should always be equal too, right?
We can dodge both of these conundrums by defining 'as as its own operator and just giving up on finding inverted counterparts of 'as, 'and, and 'or. It's okay for some computations to be non-invertible--unless, of course, the language design demands otherwise.
---
Besides inverting expressions, you and Racket (not to mention my past self) do some extra things with pattern matching beyond inversion: You allow the same variable to appear more than once in a pattern, and you allow backtracking when a match fails.
If the variables are seen as logic variables, then multiple variable occurrences make sense as multiple constraints to be unified. Racket and Nulan operations can only operate on values that have been determined up to equality, so the logic variable semantics degenerates into an equality test.
In another language, (and ...) as a pattern might constrain its arguments without actually making an arbitrary choice. (At this point it's even tempting to remove notions of "input" and "output" and just reason about a constraint graph, but I don't know how to work that way. In my vague thought experiments, the extra polarity comes in handy for choosing orders of evaluation and such.)
---
"Oh, and, here's "or""
I'm suspicious of the name "or," but if you understand what I'm saying about "and" above, I don't have anything more to say on that here.
With this alternation operator, you (and Racket and my past self) introduce backtracking at the level of subpatterns, rather than just at the top level of the pattern.
This is a bit curious for syntax purposes, since different branches impose different constraints on the logic variables. In particular, suppose one branch binds foo and another branch doesn't. If we end up in the branch that doesn't, then foo will be fully unconstrained.
Personally, I would expect this unconstrained foo to shadow nonlocal variables named "foo" just as if it were constrained to a particular value. However, if I tried to access it, I would expect some kind of uninitialized variable error. Is your version of "The Right Thing(tm)" the same as mine?
---
I've been closely following David Barbour's progress since early 2011, occasionally making my own attempts at implementing the basic evaluation model. As far as I'm concerned, RDP is far and away the best upcoming approach to making secure and physically viable programs.
I'm open-minded to improvements upon RDP, but RDP hasn't matured enough to say for sure where it ends and the improvements begin. :-p So I've been focusing my language design efforts in places that would mesh well with however RDP turns out, due to orthogonality.
Namely, I don't strive to model world state, multi-user real-time interaction, or even general computation. Instead, I've been looking at mathematical treatments of program/proof syntax, and I've been especially focused on modularity (multi-user eventual interaction, you could say).
Do you have a link for that? I had the impression Haskell already did a pretty good job of supporting untyped code within a module, but if it has something to do with explicit passing of type class instances or untyped module exports, that's pretty interesting. Gradual types would be even more interesting than that.
Robert Harper: "Since every value in a dynamic language is classified in this manner, what we are doing is agglomerating all of the values of the language into a single, gigantic (perhaps even extensible) type."
Laurence Tratt: "Bob points out that, conceptually, a dynamically typed program has a single static sum type ranging over all the other types in the program. We could therefore take a dynamically typed program and perform a conversion to a statically typed equivalent. Although Bob doesn't make it explicit how to do this, we can work out a simple manual method (though it's not clear to me that this would be fully automatable for most real programs)."
This part of the debate goes off on an unnecessary tangent. There's not really any need for a "gigantic (perhaps even extensible)" sum type.
  -- Haskell code
  {-# LANGUAGE FlexibleInstances #-}

  -- A hack to show functions as opaque values at the REPL.
  data ShowableFunc a b = ShowableFunc (a -> b)
  instance Show (ShowableFunc a b) where
    show _ = "[function]"

  data Value =
      FunctionValue (ShowableFunc Value Value)
    | SymbolValue String
    | TaggedValue Value Value
    -- etc.
    deriving Show
Having abandoned gigantic and extensible notions, the translation is actually very straightforward. The TaggedValue case is all we need in order to define a so-called dynamic type system in the Arc style.
Admittedly, we may need a basic DSL to help our embedded untyped programs go against the grain of the language.
Here's how we might start to support this kind of programming in Haskell. The only callable things in Haskell are of type (A -> B), so our programs will need to use a "call" function all over the place, but Haskell's type classes (with FlexibleInstances) make it possible to shove aside a lot of other boilerplate.
  -- Some type classes for convenient expression of Value values using
  -- Haskell syntaxes. We could go without these, but we'd need
  -- explicit conversions all over. Consider this weak typing.
  class ToValue a where
    tv :: a -> Value
  instance ToValue Value where
    tv x = x
  instance (ToValue a) => ToValue (Value -> a) where
    tv x = FunctionValue (ShowableFunc (\v -> tv (x v)))
  instance ToValue String where
    tv x = SymbolValue x

  call :: (ToValue func, ToValue arg) => func -> arg -> Value
  call func arg =
    let func' = tv func in
    let arg' = tv arg in
    case func' of
      FunctionValue (ShowableFunc haskellFunc) -> haskellFunc arg'
      _ -> error ("Call to non-function " ++ show func')
We can go on to define some built-ins for the dynamic type system:
  annotate :: Value
  annotate = tv $ \tag rep ->
    TaggedValue tag rep

  -- I'd name this "type" as in Arc, but that's a reserved word.
  getType :: Value
  getType = tv $ \x -> case x of
    FunctionValue _ -> SymbolValue "fn"
    SymbolValue _ -> SymbolValue "sym"
    TaggedValue tag _ -> tag
    -- etc.
The ":: Value" annotations are necessary to make the type class instance we're using unambiguous. (As far as Haskell is concerned, we're still allowed to define another instance which works for arguments of type other than Value.) Another few utilities could save us from typing these annotations, but this is an example.
Notice that this technique can faithfully translate dynamic type errors too, without the static type checker getting in the way.
Above I quoted both articles, but the rest of the quotes I'm responding to are just from Laurence Tratt.
---
"[If sum types are] Used indiscriminately – particularly where a sum type ranges over an unduly large number of other types – they create an unwieldy mess because of the requirement to check all the types that have been ranged over."
See my use of the catch-all pattern "_ ->" in the definition of "call" above. If the majority of the branches use a default behavior, usually this kind of pattern-match is available to save the day.
---
"The encoding of dynamically typed languages we used above would lead to a huge increase in the size of programs, since a static type system forces us to explicitly express all the checks that a dynamically typed run-time implicitly performs for us."
The kind of encoding I used would lead to only a moderate increase, and then only due to the lack of dedicated language-level support for that style of programming.
In statically typed programming, the "checks" tend to be expressed in the types rather than the expressions. Usually, when code uses a library utility, it doesn't have to reiterate the types of that utility, so the checks are still just as implicit.
(In my code above, the ":: Value" annotations are a counterexample. In a sense, this "check" is used for branching among type class instances.)
---
"It would also remove the possibility of making use of a dynamically typed system's ability to execute and run programs which are deliberately not statically type safe (a useful property when comprehending and modifying large systems)."
Some escape hatches have been investigated for this purpose. See gradual typing or Haskell's "Dynamic" type.
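As a minimal sketch of the latter, using the standard library's Data.Dynamic module (the box name is mine): the type check is deferred to run time and surfaces as a Maybe.

  import Data.Dynamic (Dynamic, toDyn, fromDynamic)

  box :: Dynamic
  box = toDyn (42 :: Int)

  main :: IO ()
  main = do
    print (fromDynamic box :: Maybe Int)     -- Just 42
    print (fromDynamic box :: Maybe String)  -- Nothing: a run time type check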
The way I think of it, sometimes types are a meaningful part of the program, and sometimes they're an effective way for a language or library to expose a well-behaving realm of possible uses in the first place.
It's easy to interpret the "well-behaving realm" case as artificial restrictions on an innately wild program space. But restrictive interfaces can frustrate us in untyped languages too.
---
"We could design a sound type system that will deal with this specific case but, because a sound type-system won't be complete, we can always find another similar run-time correct example which the type system will reject."
"It also provides a mirror-image encoding from above: one can always translate statically typed programs to a dynamically typed equivalent."
This section relies on the implicit assumption that for every program in a typed language, the type annotations can be deleted to yield a program in a corresponding untyped language.
That observation does not apply in general. Certain static type system features, such as Haskell's type classes, allow the programmer to specify run time behavior in the type annotations rather than in the expressions. This technique is used extensively in Haskell and Agda as a way to reduce expression boilerplate. I use it above.
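A tiny illustration of what I mean, with a hypothetical Default class: the type annotation alone selects which code runs, so erasing the annotations doesn't leave behind a working untyped program.

  class Default a where
    def :: a

  instance Default Int where
    def = 0

  instance Default [a] where
    def = []

  -- (def :: Int) evaluates to 0, while (def :: [Int]) evaluates to [].
  -- Strip the annotations and "def" by itself is ambiguous.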
---
"In this article, I've tried to show that statically and dynamically typed languages overlap in many ways, but differ in others. Neither is exactly a subset of the other nor, as the underlying theory shows, could they be."
I flatly disagree; untyped languages are exactly a special case of typed languages.
The thing is, integers are exactly a special case of real numbers too, and that's no reason to dismiss integers. :)
---
"One must consider the pros and cons for the tasks one has at hand, and choose accordingly. Personally, I use both styles, and I have no wish to unduly restrict my language toolbox."
Actually, there's a terminology issue going on here. Laurence Tratt and Pauan find it important to talk about errors, so in the above post I went point by point to address that. But as far as I'm concerned, "{static,dynamic} type error" is nothing more than a way to say "{compile,run} time error that a statically {untyped,typed} style could have avoided," and I take that suggestion with a grain of salt.
Robert Harper and I don't see "dynamic typing" as some kind of alternative strategy with extra features that somehow prevent the use of static typing. For what particular reasons and contexts should static typing be considered harmful?
---
In my opinion, static typing is mostly useful for three reasons: It has nice symmetry with mathematical reasoning[1], it can support static analysis of programs[2], and it can control a particular side effect. I'll focus on that last part, since side effects are what distinguish one computing platform from another.
Static typing controls the side effect of indeterminism when calling a black box library. By default, code that calls a black box can't expect very much about what will happen, due to the fact that it's a black box. By the time the program has run, the box could have been filled with any version of the library. Static typing establishes at least some guarantees about what kind of library code will fit in the box. (Since the intricacy of those guarantees can span a rich mathematical theory, it's nothing to scoff at.)
The black box side effect is ultimately "implemented" as a result of the programmer's manual process of selecting and upgrading libraries. So we can't just write an interpreter that simulates a new variant of this side effect; it needs to fit seamlessly into the programmer's existing habits, or we're veering into new language territory.
Within a statically typed module system, untyped programming seamlessly fits in as a trivial special case--as "unityped" programming. But it isn't as clear how to do typed programming within a module system that doesn't already support that particular static type system (or another that's at least as logically strong). I'm very interested in options here.
---
[1] Symmetry with mathematics is just an aesthetic. Aesthetics has power too, but it doesn't do much to resolve debates. :)
[2] Static analysis is just the design pattern of doing a preliminary computation in order to reason about an upcoming "real" computation. When this pattern comes up, it's useful to design a static type system that meets the needs of the analysis, but static typing in the implementation language doesn't always help.
"Our brains are trained to treat the semi-colon as punctuation, never part of a word. Even programming languages have only reinforced this pattern. It's going to be a hard habit to break."
"The semicolon at the end isn't terminating the "qux" statement, it's creating a new empty statement."
I don't understand that interpretation. If the last semicolon is its own statement, where's the semicolon that separates it from the previous statement?
You might be thinking of Pascal, where semicolons separate statements but there's a zero-character empty statement. It seems like Pascal vs. C always comes up when semicolons are involved, and the semicolon business is one of the first differences highlighted at http://en.wikipedia.org/wiki/Comparison_of_Pascal_and_C.
---
The last semicolon is the separator. The "thing" to the right that it's separating is an implicit empty statement. So yes, I am exactly describing Pascal. And that's the difference between a language where ";" is a separator and where it's a terminator.
"You could have the user specify a previously-declared chunk of code to execute first"
I think this is similar to Inform 7's approach, where rules can be referred to by name. Generally, every rule in a library has a name, but individual stories omit them unless necessary.
---
"- I'm really scraping the bottom of the barrel now...Have the user supply their own predicate that determines which generic to apply first? Which means the user basically has to implement their own customized partial-order. I really don't see that happening."
That's the approach I took in Lathe a long time ago.[1] Under my approach, the partial order is itself extensible, and it even determines the ordering of its own extensions. I quite like the expressiveness of this approach, but this expressiveness is barely meaningful for in-the-large programming: In order for the precedence rule programmer to have enough information to make judgments by, they need to require all other programmers to annotate their extensions with appropriate metadata. That or they need to maintain their own up-to-date metadata describing common libraries! Instead of programming language battles, we get framework battles full of boilerplate.
Since finishing that system, I've been pondering the "framework" conventions I'd like to use myself, and I've been trying to fold those decisions into the core of a language design.
Whereas Magpie and many other multimethod systems make a big deal about subclasses, I try to avoid any kind of programming where almost all X's do Xfoo, but some X's are also Z's so they do Zfoo instead. By the same principle, I'd avoid the [odd? _] vs. [= 1 _] case altogether. In fact, as long as I succeed in formulating the non-overlapping designs I like, I avoid the need for precedence rules altogether... but it's still an option I keep in mind just in case.
Currently, I favor supporting extensibility by way of sealer/unsealer pairs and first-class (multi)sets.
Sealer/unsealer pairs work when each extension is an all new range of possibilities. In Arc, I'd just represent these as tagged values, and I wouldn't bother to pursue the secure encapsulation associated with sealer/unsealer pairs. In linear logic, the additive operators describe this kind of combination.
First-class (multi)sets work for when each extension is a new participant in a single computation. In Arc, a list is a good representation for this. In linear logic, the multiplicative operators describe this kind of combination.
When precedence is necessary, it can be modeled explicitly as a transformation of a (multi)set into a more structured model. I think any programmer who makes an extensible tool should carefully choose a transformation that makes sense for their own purposes--whether it's really "precedence" or something else.
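As a rough sketch of that last point (Haskell, with names of my own invention): the extensions live in a plain list, and "precedence" is nothing but an explicit transformation the tool's author chooses to apply before consulting them.

  import Data.List (sortOn)
  import Data.Maybe (listToMaybe, mapMaybe)

  -- An extension: a handler that may or may not apply, plus whatever
  -- metadata the transformation cares about (here, a rank).
  data Ext a b = Ext { extRank :: Int, extTry :: a -> Maybe b }

  -- The "precedence rule" is just this transformation of the multiset.
  byRank :: [Ext a b] -> [Ext a b]
  byRank = sortOn extRank

  -- Consult the transformed extensions in order; first success wins.
  dispatch :: [Ext a b] -> a -> Maybe b
  dispatch exts x = listToMaybe (mapMaybe (\e -> extTry e x) (byRank exts))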
---
"For general predicates, instead of hard-coding an order I'd rather use a big if/case/pattern-match that I at least know is ordered the way I wrote it."
That's my take on it, too. My examples of multi-rule functions have included factorial and exponentiation-by-squaring, but those are unrealistic. There's no reason for an extension to interfere in that process, so it might as well be the body of a single declaration.
When I was using predicate dispatch more often, I often discovered that I could satisfy most of my use cases with just two extension annotations:
- This is the real meaning of the function, and all the other cases are just for auto-coercion of parameters into the expected format. Use this extension first. (This came up when implementing 'iso in terms of 'is, and it also makes sense for coercion functions themselves, such as 'testify.)
- This extension is the last resort. Use it last. (Maybe it throws an informative error, or maybe it returns false when everything else returns true. This also works for all-accepting coercion functions like 'testify, but I'm suspicious of this kind of design. A call to 'testify always does Xfoo, except when it does Zfoo.)
---
[1] Unfortunately, right now arc-Lathe's version of the precedence system isn't something I maintain, and Lathe.js's version has no easy-looking examples because I no longer store extensions in mutable global state. Lathe.js's "hackable" utilities are now higher-order utilities that take all their once-global dependencies as explicit parameters.
"they need to require all other programmers to annotate their extensions with appropriate metadata."
If Nulan had multimethods, I'd probably just require programmers to add additional metadata to the object in addition to the %pattern-match gensym. But Nulan doesn't have linearization or multimethods, so I avoid that whole mess!
---
"Whereas Magpie and many other multimethod systems make a big deal about subclasses"
And I have to agree: objects are amazing, but classes and subclasses are terrible. In fact, I'm of the opinion that probably all hierarchical relationships are too restrictive. Hence Nulan's object system, which is completely flat but has pretty much all the benefits of classes/prototypes.
Basically, the behavior that is common to all objects (of a particular kind) is put into a function, and behavior that is specific to a particular object is put into the object itself. And immutability gives you wonderfully clean semantics for prototype/class behavior, without all the complication and baggage.