"Over the past few months I've been gradually grokking why scheme says"
I always thought that was just because Scheme tries to be minimal, and using the single special-form "define" for two different purposes helps with that. Scheme also does two different things with "let":
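(let ((x 1) (y 2))     ; ordinary let: local bindings
  (+ x y))

(let loop ((i 0))      ; named let: "loop" also names the procedure itself,
  (when (< i 3)        ; giving a local recursion/iteration construct
    (display i)
    (loop (+ i 1))))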
Hmm... I tried to think of how to do this in a clean way... for instance, using macro-like pattern matching:
{foo x y z} -> ...
{bar x y z} -> ...
{~ x y z} -> ...
But I'll have to spend a while mulling on that to see if it pans out. In the meantime, my plan for functions is to use -> syntax to define an anonymous function, meaning that this:
-> x y z (+ x y z)
Is equivalent to this Arc code:
(fn (x y z) (+ x y z))
This looks great when the functions are last, which is frequent with JavaScript callbacks. And then "def" is kinda like Arc's "=":
# Nulan
(def foo -> a b c ...)
(def foo 5)
; Arc
(= foo (fn (a b c) ...))
(= foo 5)
Okay. I've spent a bit of time mulling this over and trying out some stuff. Here's what I came up with.
How about a language that's like Haskell/Shen: currying everywhere. This language would be based heavily on functions, of course, which are expressed with "->" as follows:
foo %x -> %x + 2
The above is equivalent to this in Arc:
(def foo (x) (+ x 2))
Now, how this works is... everything to the left of the "->" is a pattern. Everything to the right is the function's body. If a variable starts with % it's local to the function, if not, it's global.
Consider this classic duck-typing example in Python:

class Duck:
    def quack(self):
        print("Quack")

    def fly(self):
        print("Flap, Flap")

class Person:
    def __init__(self, name):
        self.name = name

    def quack(self):
        print("{} walks in the forest and imitates ducks to draw them".format(self.name))

    def fly(self):
        print("{} takes an airplane".format(self.name))

def quack_and_fly(duck):
    duck.quack()
    duck.fly()

quack_and_fly(Duck())
quack_and_fly(Person("Jules Verne"))
And here it is in this hypothetical currying-pattern-matching language:
duck ->
  [ quack -> prn "Quack"
    fly -> prn "Flap, Flap" ]

person %n ->
  [ quack -> prn "@%n walks in the forest and imitates ducks to draw them"
    fly -> prn "@%n takes an airplane" ]
quack-and-fly [ quack %q fly %f ] -> %q; %f
quack-and-fly duck
quack-and-fly person "Jules Verne"
Wooow that is short! It also means that (except for % variables which are local) the pattern matching to the left of -> matches the function call.
If you wanted to, you could use parens like Shen instead of no-parens like Haskell.
Some downsides? Since everything is curried, variable-argument functions wouldn't be possible. There are also potentially some scoping issues and such.
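To see why currying and varargs conflict, here's a rough sketch of auto-currying in Python (the curry helper is made up for illustration): with a fixed arity, partial application is unambiguous, but for a *args function there would be no way to know when to stop collecting arguments and actually call it.

import inspect

def curry(f):
    # Auto-curry: collect arguments until f's fixed arity is met.
    arity = len(inspect.signature(f).parameters)
    def curried(*args):
        if len(args) >= arity:
            return f(*args)
        return lambda *more: curried(*args, *more)
    return curried

@curry
def add3(x, y, z):
    return x + y + z

print(add3(1)(2)(3))  # 6
print(add3(1, 2)(3))  # 6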
Overall though, I think this idea is cool enough to actually try and implement it in a simple interpreter, to figure out all the kinks.
Okay, I was able to solve a couple problems with my object pattern-matching...
[ foo = 5 | bar = 10 ]
The above is a collection of patterns. Specifically, it has a "foo" pattern that maps to 5, and a "bar" pattern that maps to 10. Now, let's put this object into a variable:
pattern = [ foo = 5 | bar = 10 ]
Now how do we extract the subpatterns? Like so:
pattern [ foo ]
pattern [ bar ]
The above returns 5, and then 10. And we can extract multiple patterns at once:
pattern [ foo | bar ]
The above returns 10. This largely removes the need for "as" patterns, which is something I found cumbersome to use. You can think of | as being kinda like "do". It will first call the foo pattern, then the bar pattern.
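As a rough analogy (in Python, with made-up names), you can think of the collection as a dict of thunks, where extracting several patterns runs each one in order and returns the last result:

# A "pattern collection" as a dict of thunks (hypothetical analogy).
pattern = {"foo": lambda: 5, "bar": lambda: 10}

def extract(p, names):
    # Run each named pattern in order; return the last result, like "do".
    result = None
    for name in names:
        result = p[name]()
    return result

print(extract(pattern, ["foo"]))         # 5
print(extract(pattern, ["foo", "bar"]))  # 10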
Thanks. So when the function name is grouped with its parameters in definition (as it is in scheme), and infix is permitted, no longer does the function name have to come before its parameters.
I would have to get used to not always seeing the function name first, but I like the symmetry this produces between definition and call.
"<>" had to be in parens so that wart's infixey reader wouldn't try to swap it with "def". Now thanks to scheme-style grouping of the function name with its params, this definition can be written:
Exactly. Sorry, I think you're missing http://arclanguage.org/item?id=16826. In brief, = is now equality and <- is now assignment. And since both are composed of operator chars, (def (a <- b) ..) is transparently read as (def (<- a b) ..), which now maps directly to the scheme-like declaration syntax, etc., etc.
This reminds me of Shen macros[1], which are strictly more powerful than Arc-style macros:
; Arc
(mac and args
  (if args
      (if (cdr args)
          `(if ,(car args) (and ,@(cdr args)))
          (car args))
      't))
; Shen
(defmacro andmacro
  [and] -> true
  [and X] -> X
  [and X | R] -> [if X [and | R]])
Because Shen macros operate on the entire code structure rather than the arguments of the macro, you can do silly stuff like replace the number 2 with the number 2.5:
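For instance, something along these lines (a sketch; the macro name is made up):

(defmacro silly-macro
  2 -> 2.5)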
In Shen, macros are stored in a global list. I'm guessing you name them so that you can update/remove them later or something. I guess it can also serve as documentation? I dunno. I also thought it was weird that they were named.
"All that I've done in the above is to restate a simple principle: statically and dynamically typed languages enforce their types at different points in a program's life-cycle. Depending on whether you view things through a static or dynamic type prism, either class of language can be considered more expressive. Dynamically typed languages will try and execute type-unsafe programs; statically typed languages will reject some type-safe programs. Neither choice is ideal, but all our current theories suggest this choice is fundamental. Therefore, stating that one class of language is more expressive than the other is pointless unless you clearly state which prism you're viewing life through."
"And this is precisely what is wrong with dynamically typed languages: rather than affording the freedom to ignore types, they instead impose the bondage of restricting attention to a single type! Every single value has to be a value of that type, you have no choice!"
So, does that mean Lisps are "less powerful" because they tend to use a single data structure (cons) for everything? I think there is a lot of power in simplicity and using a single thing to represent everything else.
Certainly static type systems can let you write different programs than dynamic languages: staticness gives you certain powers that you don't have in dynamic languages. But I would argue that dynamic languages, precisely because they lump everything into a single static type, can have certain benefits as well.
---
"For another, you are imposing a serious bit of run-time overhead to represent the class itself (a tag of some sort) and to check and remove and apply the class tag on the value each time it is used."
Indeed. Static languages are well known to be faster than dynamic languages because they have more information at compile-time. Old news. Oh wait, we've developed JIT compilers that can make dynamic languages go astonishingly fast, so the old "performance" argument is a lot weaker nowadays.
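To make the "class tag" concrete: a dynamic runtime effectively represents every value as an element of one big tagged type, roughly like this Python sketch (names hypothetical):

from dataclasses import dataclass
from typing import Any

@dataclass
class Value:
    tag: str      # "int", "string", "cons", ...
    payload: Any

def add(a, b):
    # The check a static language would move to compile time.
    if a.tag != "int" or b.tag != "int":
        raise TypeError("+ expects ints")
    return Value("int", a.payload + b.payload)

print(add(Value("int", 1), Value("int", 2)))  # Value(tag='int', payload=3)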
---
I don't think I'd have a problem with a language that was dynamic by default but lets you specify optional static types. I think that would be the best of both worlds. Unfortunately, the languages that have good static type systems (Haskell, ML, etc.) tend to also emphasize staticness everywhere, requiring you to jump through some hoops to enable dynamism.
So yeah, I don't think static type systems are evil. I think they're a useful tool, but like all tools, it depends on how you use them. Social norms, idioms, and what the language encourages/discourages matter just as much as, if not more than, the language features themselves.
Haskell's been trying to make types optional, which will be interesting to watch. Because even if they loosen their type system their social norms are pretty baked at this point.
Do you have a link for that? I had the impression Haskell already did a pretty good job of supporting untyped code within a module, but if it has something to do with explicit passing of type class instances or untyped module exports, that's pretty interesting. Gradual types would be even more interesting than that.
This actually makes a tremendous amount of sense if you consider ";" to be a statement separator rather than a statement terminator. For instance, in C, JavaScript, etc. you can have an "empty statement" which is just a semicolon:
;
This means that in the following block of code...
foo;
bar;
qux;
The semicolon at the end isn't terminating the "qux" statement, it's creating a new empty statement.
---
Slightly relevant: I much much much prefer Ruby over Python, quirks and all.
"The semicolon at the end isn't terminating the "qux" statement, it's creating a new empty statement."
I don't understand that interpretation. If the last semicolon is its own statement, where's the semicolon that separates it from the previous statement?
You might be thinking of Pascal, where semicolons separate statements but there's a zero-character empty statement. It seems like Pascal vs. C always comes up when semicolons are involved, and the semicolon business is one of the first differences highlighted at http://en.wikipedia.org/wiki/Comparison_of_Pascal_and_C.
The last semicolon is the separator. The "thing" to the right that it's separating is an implicit empty statement. So yes, I am exactly describing Pascal. And that's the difference between a language where ";" is a separator and where it's a terminator.
Excluding parts of "03 utils.arc", the above is roughly the minimum needed to define Arc, the "import" macro, and the REPL.
They are loaded in numeric order, but that's controlled by the "arc" executable, so there's nothing stopping you from changing the order, or adding new non-numeric files[1], etc.
Everything else is put into either the "app" or "lib" folder, and none of the files in either folder are numbered.
This means the numbers are purely for documentation purposes, to help people see which parts go where. In particular, changing the numbers doesn't change the order in which things are loaded: you need to make the changes in the "arc" executable.
I think this is a good blend between not being too cumbersome, improving the visibility of order (which matters), and allowing fine-grained control of order.
---
* [1]: As an example of that, the "arc" executable also loads "lib/strings.arc" and "lib/re.arc" in addition to the above.
Interesting. It doesn't address fallintothis's pain point though -- he still needs to remember the numeric prefix to navigate between the files.
As an experiment I've added a script that creates more mnemonic symlinks to all the files in the repo:
$ git clone http://github.com/akkartik/wart.git
$ cd wart
$ git checkout 8332909aec # Unnecessary except for visitors in the far future
$ ./denum # New helper
$ vim -O assign.wart mutate.wart # etc.
I'm going to play with keeping these symlinks around and see how much I use them.
"Looks pretty good though I'm not so sure about the double enter requirement."
It's not just wart: readable has the same issue. The problem is that you don't know if the user is finished or if they want to enter a multi-line expression. With parentheses, you always know whether the expression is completed or not, but with whitespace, you don't.
The same is true even in Python. If you type in this:
>>> def foo(x):
...     return x
You have to hit enter twice. In some cases, Python is smart enough to know that the expression ended, so you only have to hit enter once:
>>> 1 + 2
But that's because Python's grammar is more restricted in certain areas. That doesn't work with S-expressions because they're so general.
I realized after I'd read wart's readme that parens were optional, though I don't think that would have been enough to clue me in on the double enter requirement. :)
Yeah, I didn't expect everyone to remember that :) I just wanted a sense of whether the no-prompt prompt was intuitive. So far it seems to be intuitive!
"As long as we know what sort of clauses we're working with, we can define a partial order on them."
You could require that the user provide an order when defining a new pattern.
---
My idea with Nulan is that functions shouldn't change: they should be static algorithms. Thus, Nulan doesn't allow you to overload functions like you can in Arc or Wart. Instead, if you wish to overload a function, you define a new gensym which can be stuck into any object to overload it:
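$def %foo (uniq)

$def foo
  { %foo F } -> ...
  ... -> ...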
Now, if you call the "foo" function with an object that has a %foo key, it'll call the first clause. Otherwise it'll call the second clause.
This gives complete control to the function to decide whether it can be overloaded or not, and exactly how the overloading happens. I think this is the most flexible approach I've ever seen.
As a more concrete example, objects can have a %len key to define a custom (len ...) behavior, an %eval key for custom evaluation behavior, %call for custom call behavior, %pattern-match for custom pattern match behavior, etc.
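This is similar in spirit to Python's dunder protocols, where behavior is keyed by specially named attributes:

class Queue:
    def __init__(self, items=()):
        self.items = list(items)

    def __len__(self):    # custom len(...) behavior, like a %len key
        return len(self.items)

    def __call__(self):   # custom call behavior, like a %call key
        return self.items.pop(0)

q = Queue([1, 2, 3])
print(len(q))  # 3
print(q())     # 1

The difference is that Nulan's keys are gensyms, so only code with access to the gensym can hook the behavior, whereas Python's protocol names are global strings.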
And because of the way pattern matching works, you can even match against multiple gensyms at once:
$def %foo (uniq)
$def %bar (uniq)
$def foo
  { %foo F
    %bar G } -> ...
  { %foo F } -> ...
  { %bar G } -> ...
  ... -> ...
The above defines different behavior when an object has both the %foo and %bar keys, and also when it has just %foo or just %bar.
---
"So, how do I manipulate quotes in macroexpansions, as needed in qq.wart?"
I think hardcoding symbols (like in quasiquote) is a terrible idea, so I would just not define quasiquote. Instead, I'd use Arc/Nu quasisyntax, which doesn't use quote at all:
`(foo bar qux) -> (list foo bar qux)
`(foo (bar) qux) -> (list foo (list bar) qux)
If you want quote, just use unquote:
`(foo ,'bar qux) -> (list foo (quote bar) qux)
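(More generally, unquote presumably just drops the expression into the generated call: `(foo ,(+ 1 2) qux) -> (list foo (+ 1 2) qux).)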
Also, I wouldn't define quasiquote as a macro, I'd have it be a part of the reader.
You could require that the user provide an order when defining a new pattern.
Interesting idea, but the proper mechanism for it eludes me.
- You could have the user give a precedence level, since integers are totally ordered, but that's a terrible idea---magic constants in every generic declaration. Any more restricted domain (e.g., specifying high, medium, or low precedence) gets a little vague.
- You could have a simpler mechanism where each new rule is added to a known, fixed location in the linearization. So basically the order is a double-ended queue and generic declarations have keywords for "push me to the front" or "push me to the back". But that's probably a bit too basic, and still relies on declaration order (in fact, complicating it).
- You could have the user specify a previously-declared chunk of code to execute first, like
But that's too tightly-coupled and requires code duplication. Plus, if you're already duplicating the code that's close by a definition, why not instead reorder the definitions so a simpler order-they're-read mechanism works?
- I'm really scraping the bottom of the barrel now...Have the user supply their own predicate that determines which generic to apply first? Which means the user basically has to implement their own customized partial-order. I really don't see that happening.
When using generics myself, I don't want to think about these sorts of things too hard. That's what I like about class-based generic dispatch: I just have to think about the function one class at a time, and the right function will be applied to the right instances using their simple, implicit order. For general predicates, instead of hard-coding an order I'd rather use a big if/case/pattern-match that I at least know is ordered the way I wrote it.
"You could have the user specify a previously-declared chunk of code to execute first"
I think this is similar to Inform 7's approach, where rules can be referred to by name. Generally, every rule in a library has a name, but individual stories omit them unless necessary.
---
"- I'm really scraping the bottom of the barrel now...Have the user supply their own predicate that determines which generic to apply first? Which means the user basically has to implement their own customized partial-order. I really don't see that happening."
That's the approach I took in Lathe a long time ago.[1] Under my approach, the partial order is itself extensible, and it even determines the ordering of its own extensions. I quite like the expressiveness of this approach, but this expressiveness is barely meaningful for in-the-large programming: In order for the precedence rule programmer to have enough information to make judgments by, they need to require all other programmers to annotate their extensions with appropriate metadata. That or they need to maintain their own up-to-date metadata describing common libraries! Instead of programming language battles, we get framework battles full of boilerplate.
Since finishing that system, I've been pondering the "framework" conventions I'd like to use myself, and I've been trying to fold those decisions into the core of a language design.
Whereas Magpie and many other multimethod systems make a big deal about subclasses, I try to avoid any kind of programming where almost all X's do Xfoo, but some X's are also Z's so they do Zfoo instead. By the same principle, I'd avoid the [odd? _] vs. [= 1 _] case altogether. In fact, as long as I succeed in formulating the non-overlapping designs I like, I avoid the need for precedence rules altogether... but it's still an option I keep in mind just in case.
Currently, I favor supporting extensibility by way of sealer/unsealer pairs and first-class (multi)sets.
Sealer/unsealer pairs work when each extension is an all new range of possibilities. In Arc, I'd just represent these as tagged values, and I wouldn't bother to pursue the secure encapsulation associated with sealer/unsealer pairs. In linear logic, the additive operators describe this kind of combination.
First-class (multi)sets work for when each extension is a new participant in a single computation. In Arc, a list is a good representation for this. In linear logic, the multiplicative operators describe this kind of combination.
When precedence is necessary, it can be modeled explicitly as a transformation of a (multi)set into a more structured model. I think any programmer who makes an extensible tool should carefully choose a transformation that makes sense for their own purposes--whether it's really "precedence" or something else.
---
"For general predicates, instead of hard-coding an order I'd rather use a big if/case/pattern-match that I at least know is ordered the way I wrote it."
That's my take on it, too. My examples of multi-rule functions have included factorial and exponentiation-by-squaring, but those are unrealistic. There's no reason for an extension to interfere in that process, so it might as well be the body of a single declaration.
When I was using predicate dispatch more often, I often discovered that I could satisfy most of my use cases with just two extension annotations (sketched in code after this list):
- This is the real meaning of the function, and all the other cases are just for auto-coercion of parameters into the expected format. Use this extension first. (This came up when implementing 'iso in terms of 'is, and it also makes sense for coercion functions themselves, such as 'testify.)
- This extension is the last resort. Use it last. (Maybe it throws an informative error, or maybe it returns false when everything else returns true. This also works for all-accepting coercion functions like 'testify, but I'm suspicious of this kind of design. A call to 'testify always does Xfoo, except when it does Zfoo.)
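Here's a minimal Python sketch of that two-annotation scheme (all names hypothetical, not Lathe's API): rules marked "first" run before ordinary rules, and rules marked "last" run after everything else.

class Generic:
    def __init__(self, name):
        self.name = name
        self.first, self.normal, self.last = [], [], []

    def extend(self, test, body, position="normal"):
        # position is "first", "normal", or "last"
        getattr(self, position).append((test, body))

    def __call__(self, *args):
        for test, body in self.first + self.normal + self.last:
            if test(*args):
                return body(*args)
        raise TypeError(self.name + ": no applicable rule")

# 'testify'-style usage: callables pass through (the "real meaning"),
# and anything else becomes an equality predicate (the last resort).
testify = Generic("testify")
testify.extend(callable, lambda f: f, position="first")
testify.extend(lambda x: True, lambda x: (lambda y: x == y), position="last")

print(testify(3)(3))  # True
print(testify(3)(4))  # False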
---
[1] Unfortunately, right now arc-Lathe's version of the precedence system isn't something I maintain, and Lathe.js's version has no easy-looking examples because I no longer store extensions in mutable global state. Lathe.js's "hackable" utilities are now higher-order utilities that take all their once-global dependencies as explicit parameters.
"they need to require all other programmers to annotate their extensions with appropriate metadata."
If Nulan had multimethods, I'd probably just require programmers to add additional metadata to the object in addition to the %pattern-match gensym. But Nulan doesn't have linearization or multimethods, so I avoid that whole mess!
---
"Whereas Magpie and many other multimethod systems make a big deal about subclasses"
And I have to agree: objects are amazing, but classes and subclasses are terrible. In fact, I'm of the opinion that probably all hierarchical relationships are too restrictive. Hence Nulan's object system, which is completely flat but has pretty much all the benefits of classes/prototypes.
Basically, the behavior that is common to all objects (of a particular kind) is put into a function, and behavior that is specific to a particular object is put into the object itself. And immutability gives you wonderfully clean semantics for prototype/class behavior, without all the complication and baggage.
Well, I was thinking in terms of Nulan, where you define new patterns with a custom "%pattern-match" property. It wouldn't be hard to add in a "%pattern-match-order" property or such, though I admit I haven't really thought through how such a property would work...
Obviously that kind of system wouldn't work in wart where predicate dispatch is based on arbitrary function calls. Hence my idea in Nulan of not allowing function mutation. I would write the above code like this:
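(def queue ()
  [[] [] 0
    @{ %len ... }])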
The above is basically exactly the same as Arc, except it uses an array rather than a cons, and it has a custom %len property. If you don't want the different parts of the queue to be public, you could use gensyms like this:
(w/uniq (len left right)
  (def queue ()
    (dict %len (fn (x) (x len))
          len 0
          left nil
          right nil)))
Rather than returning an array of 3 elements, it returns an object, which has a "len", "left", and "right" property.
In either case, the object that is returned has a %len property, which is a function that computes the length. The "len" function would then be defined like this:
(def len (x)
  ((x %len) x))
That is, it first looks up the %len property in the object, and then calls it with itself as the first argument.
---
* [1]: You might be wondering what's going on here... well, in Nulan, a "list" is just like a JavaScript array: it's an ordinary object which uses numeric keys and has a custom %len property. In particular, that means that all the tools that work on objects work on arrays too.
The @ syntax is used for splicing. So, for instance, to merge 3 objects together, you'd say { @foo @bar @qux }. And because arrays are objects too, you can splice them.
So what we're doing is, we first create an array of 3 elements, and we then splice in a new object. This new object has a custom %len property, which overrides the %len property of the array.
Alternatively, we could have done it like this:
(set (array nil nil 0) %len ...)
But I think it's better to use the splicing notation, especially when you use [] for arrays and {} for objects. By the way, these two are actually equivalent:
{ @[[] [] 0]
  %len ... }

[[] [] 0
  @{ %len ... }]
In the first case we take an empty object, splice in an array, and then assign a %len property. In the second case, we take an array and splice in an object that has a %len property.
In either case, the object that is returned has the same keys and values.
Sorry to hijack your wart thread, but... I just realized something. I wanted to allow for iterating over the keys of an object, but that caused issues, as I discussed with rocketnia earlier (http://arclanguage.org/item?id=16823)
Anyways, JavaScript gets around this problem by allowing you to define a property on an object that is "non-enumerable". But any kind of system that I add in that lets you define "non-enumerable" properties is going to be big and complicated.
Instead, I found a very simple way to create enumerable objects in a way that is completely customizable, and doesn't even need to be built-in:
(= %keys (uniq))
(def iterable-object ()
  (let keys []
    { %set (fn (x k v)
             (pushnew k keys))
      %rem (fn (x k)
             (pull k keys))
      %keys (fn (x) keys) }))
Here's what's going on. First, we create the %keys gensym; the value stored under that key is supposed to be a function that returns a list of the keys in the object.
The function "iterable-object" returns a new object that has custom %set, %rem, and %keys properties:
%set is called when assigning a property to the object. It takes the key and pushes it into the "keys" array.
%rem is called when deleting a property from the object. It takes the key and removes it from the "keys" array.
%keys just returns the "keys" array.
Now, these objects are capable of being enumerated, which means they could be passed to "map", for instance. But here's the fun part: you can completely control which properties are enumerated and which aren't.
In this case, the object will only enumerate properties that are added or removed after the object is created. So any properties defined previously are still hidden. At least, depending on how I implement splicing and %set...
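Roughly the same trick in Python, using dunder hooks in place of the %set/%rem/%keys properties (a loose analogy, since Python's version mutates):

class IterableObject:
    def __init__(self):
        object.__setattr__(self, "_keys", [])

    def __setattr__(self, k, v):   # like %set: record the key
        if k not in self._keys:
            self._keys.append(k)
        object.__setattr__(self, k, v)

    def __delattr__(self, k):      # like %rem: forget the key
        self._keys.remove(k)
        object.__delattr__(self, k)

    def keys(self):                # like %keys
        return list(self._keys)

o = IterableObject()
o.x = 1
o.y = 2
print(o.keys())  # ['x', 'y']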
---
What the above basically means is... "every computer problem can be solved by the addition of more properties on an object" :P
After fidgeting with the syntax, here's what I got:
$def iterable-object ->
  [ %set -> x k o n
      [ @x %keys -> ~ (pushnew (keys x) k) ]
    %rem -> x k o
      [ @x %keys -> ~ (pull (keys x) k) ]
    %keys -> ~ {} ]
I actually realized that swapping [] and {} is way better, meaning that [ foo bar ] is (dict foo bar) and { foo bar } is (array foo bar). There's two reasons for this:
1) {} is closer to () than [] is, which is really nice in macros:
$mac $accessor -> n v
  $uniq %a
  {$def n -> %a
   {{%a v} %a}}
2) I found that {} didn't stand out enough, but [] does.
---
By the way, in case you're curious about the Nulan syntax... $ is prefixed to vau/macros, which is why it's "$def" rather than "def"
-> is the function macro, which means (-> x y z ...) is equivalent to (fn (x y z) ...) in Arc
~ is the "wildcard syntax" which matches anything, just like _ in Haskell
[ foo bar ] translates to (dict foo bar), and { 1 2 3 } translates to (array 1 2 3)
@ is for splicing. Which means that [ @foo @bar @qux ] merges three objects into one. If you want to update an object with new properties, it's idiomatic to say [ @foo ... ]
Which, if translated into JavaScript, would look something like this...
var iterableObject = function () {
  var a = {};
  // pushnew, keys, and pull are the same assumed helpers as in the Nulan code
  a.set = function (x, k, o, n) {
    var updated = Object.create(x);
    updated.keys = function () {
      return pushnew(keys(x), k);
    };
    return updated;
  };
  a.rem = function (x, k, o) {
    var updated = Object.create(x);
    updated.keys = function () {
      return pull(keys(x), k);
    };
    return updated;
  };
  a.keys = function () {
    return [];
  };
  return a;
};