If you're looking at long-term prospects, my understanding is that the idea that computers have a "speed" that "matters" is actually temporary (it should stop being true around 2040 if current trends continue). At that point you run into the thermodynamic limits of computation, where the question becomes how much energy you're willing to put into a computation to get it done.
However, I'm not sure that affects language design a great deal. It seems to me the goal of a language is to let you express algorithms in a concise and efficient manner, and I don't see that becoming unimportant any time soon.
Well, I think we're sort of agreeing with each other. You said that speed should be irrelevant because language design is about describing algorithms concisely and efficiently; I said that we should try to think outside the box by ignoring our sense of efficiency. If it didn't matter how a feature was implemented, what features could we come up with that would radically change how easy it is to code? Maybe there aren't any, but it is an interesting idea. That is not to say that I'm only interested in inefficient improvements ;)
Another possibility:
Lisp is an awesome family of languages partly because its members let you build higher and higher levels of abstraction via macros. "Complete abstractions", they were called in Practical Common Lisp. Are there any other mechanisms that could create "complete abstractions"? Or easily allow higher-level abstraction?
fatal: protocol error: expected sha/ref, got '
*********'
You can't push to git://github.com/user/repo.git
Use git@github.com:user/repo.git
*********'
I don't know why. The config files for my clone of that repository and my clone of the nex3 arc repo appear to be equivalent, although I've never tried pushing to that one (someone who has pushed, please see if your results are different).
I looked in .git/refs/remotes/origin, and the files seemed pretty similar too.
After seeing the message, I tried
git push git@github.com:cchooper/publictest
but that failed with
ERROR: Permission to cchooper/publictest denied to noahl.
fatal: The remote end hung up unexpectedly
I don't know what's going on now. Anyone else? (If you have time, someone please check the github docs too.)
The first issue happens with Anarki too: the default git:// address is read-only, so you have to push to the git@github.com: address the error message suggests.
The second one could be a setup issue: to push to github you need an account with an SSH key set up, SSH configured to use that key for github.com, and push access to the repository in question.
Okay, I think I get what's going on with macros. The idea of current macros is that they can only insert references to variables in the scope of wherever they're expanded. Therefore they do operate purely on syntax. If they want to insert references to functions, you need to make sure those functions are bound in the expansion scope; on the other hand, the macros are conceptually clean and really, really easy to implement.
For what you're saying, I suggest the old MIT Scheme macro system as an example (before they implemented Scheme macros correctly): essentially, a macro is a function of three arguments - the form it's applied to, the environment of that form, and the environment that the macro was defined in. There is a procedure to return a "symbol" that refers specifically to a name bound in a given environment, so you can make specific references to it. The macro procedure returns some sort of code (just like normal lisp macros), but it can optionally include some of these special "symbols" that refer to bindings in a specific other environment.
That is much more complicated to implement, though, and requires environments as first-class objects.
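To make that concrete, here is a purely hypothetical sketch in Arc-flavored syntax - 'env-mac and 'env-ref are invented names standing in for the three-argument protocol and the "symbol bound in a specific environment" procedure described above; nothing like them exists in Arc, Anarki, or MIT Scheme under these names:

  ; hypothetical only: 'env-mac receives the whole call form plus both environments,
  ; and 'env-ref builds a reference to a name as bound in a given environment
  (env-mac firstm (form use-env def-env)
    ; form is the call itself, e.g. (firstm xs)
    `(,(env-ref def-env 'car)   ; 'car as bound where the macro was defined
      ,(cadr form)))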
Well, I wouldn't say "on syntax", I'd say "on code" - and code only contains symbols, not "references to variables in a specific scope"; scope is determined by context - the surrounding code, which the macro can't alter: you can't substitute code into other code while retaining the same surrounding code, that's a contradiction in terms! But this is just terminology.
The old MIT Scheme macro system seems interesting from what you say - is there any place I could go to find an implementation which has this behavior? Or at least examples of code which use it? It seems like it lets code do precisely what I said it couldn't above: contain "references to variables in a specific scope", which is pretty cool. I don't think you'd need to implement environments as first-class run-time objects, merely second-class compile-time objects, with this technique, unless you also allow macros themselves to be first-class.
Okay, I think I'm starting to see. There is quite a big difference between the way Lisp people think of macros and the way Scheme people think of them.
From the documentation, I think that the current version of MIT scheme has this behavior, so look at http://groups.csail.mit.edu/mac/projects/scheme/. (By the way, in case you run Ubuntu, the version of MIT Scheme in the repositories is broken for me.) Look in the documentation for macros (under "Special Forms"), and it's their non-standard low-level macro system. If you're interested in stuff like that, you should also check out syntax-case, which I don't know much about, but I understand is the new, cool way to write Scheme macros. It includes hygienic and unhygienic functionality. Google search for "syntax case" and you'll get some documentation.
The more I look at it, though, the more I think that Scheme macros solve a different problem than Lisp macros. I don't know what it is yet, but it would be interesting to know.
I think you've hit the nail on the head. Hygienic macros and unhygienic macros are very different things (unlike dynamic vs. lexical scoping, which are just different ways to create a function). Lisp macros are 'true' macros (Wikipedia: "Macro: a set of instructions that is represented in an abbreviated format"). Hygienic macros are more like a new abstraction that was inspired by Lisp macros.
Well, I'd rather not argue about what 'true' macros are, but I would point out that your definition is basically data compression for programs (which, by the way, I think is an interesting approach to take to programming language design). I'm pretty sure both types of macros and normal functions would all fall under it.
As for the hygienic vs. unhygienic difference, unhygienic macros are certainly easier to define: they rearrange source code into other source code.
The one thing I can think of that hygienic macros can do that unhygienic ones can't is that while they are rearranging source code, hygienic macros can insert references to things that aren't bound to any variable in the expansion scope. The common example I've seen for this is that it lets you protect against people redefining your variables weirdly. For instance, if you insert a reference to 'car', it means whatever 'car' meant where you defined your hygienic macro, even if 'car' has been redefined to be something crazy in the place where your macro is used. The Scheme hygienic macro system also has a way to break hygiene if you want to, so it can do everything other Lisp macros can do.
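To make the 'car example concrete in Arc (a sketch; 'firstm is just a made-up macro name):

  ; unhygienic: the 'car in the expansion means whatever 'car means at the use site
  (mac firstm (xs)
    `(car ,xs))

  (let car (fn (x) 'gotcha)  ; locally shadow 'car (Arc is a Lisp-1)
    (firstm '(1 2 3)))       ; => gotcha, not 1

A hygienic macro's inserted 'car would keep referring to the 'car from the macro's definition environment, no matter what the caller has done to the name.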
I guess the question then is, is it useful to be able to do that?
And if you decide you want to be able to do that, are Scheme-style hygienic macros the right way to go about it?
(One option would be to just let you insert objects straight into forms, instead of symbols that you think should reference those objects. This would be fine unless you wanted to be able to set those things later, in which case you'd need some way to get at the actual underlying variable boxes.)
> That is much more complicated to implement, though, and requires environments as first-class objects.
Given the interpreted nature of Arc, first-class environments shouldn't be too hard to implement, but in a compiled implementation they would be a lot more difficult.
They could not be implemented on top of the existing arc-to-scheme translator, because that's what it is: an arc-to-scheme translator, not an interpreter. Scheme doesn't have first-class environments, so we can't shove them into arc without writing a full-fledged arc interpreter.
If the environments are temporaries created only while processing macros, and are discarded afterwards, they don't necessarily have to be difficult for compilers.
Oh, I agree, dynamic scoping and lexical scoping are formally equivalent. For one thing, basic computer architectures are all dynamically scoped, with 2^32 (or whatever) variables, one at each memory address. Also, as you say, you can just pick one dynamic variable and use it to hold the lexical environment as you walk through a program. (Unfortunately, I don't know enough to see what the really cool implications of this are, but I'm sure it has some.)
My question is, why is lexical scoping considered better for functions, but dynamic scoping for macros? And more specifically, are these actually cases where different solutions are better, or is it an artifact of the fact that we don't know how to implement lexically scoped macros very well?
Yes, exactly. I see why that is true. But think about a parallel function:
(def n (f a)
  (f a))
In the function, f is passed explicitly as an argument and then called. In a macro, f is passed implicitly in the environment the macro is expanded in. The method of passing f to the macro m seems very much like using dynamic scoping to pass arguments to functions. My question is, what about a macro system where you could pass real arguments to macros? I.e., other macros (or functions, etc.)? What makes these things different?
> what about a macro system where you could pass real arguments to macros?
Like this:

(mac n (f a)
  `(,f ,a))

(n [+ _ 1] 9)
==> 10
where you pass a form that the expansion then evaluates, or did you mean something else? Macros' arguments aren't evaluated, so you can pass only forms. To pass the result of a computation in CL (not in Arc) you can use read macros, which are evaluated before macro-expansion time:

(n #.(computation) 9)

This is quite unusual, though: macros are intended mainly to modify syntax, so it's natural to make them work on the syntax itself (i.e. the forms).
Ah, I see. I think I have been thinking of macros differently than you have (and probably wrongly). I suppose lexical scoping for macros would make the most difference in the case where a macro expands to a function call, like this:
(let x 0 ; call (count) to keep a count of things
  (def count () (= x (+ x 1))))

; count-ops: count how many lines of code you run.
(mac count-ops body
  (if body
      `(do ,(car body)
           (count)
           (count-ops ,@(cdr body)))
      nil))

(def foo ()
  (count-ops ; this count-ops call fails
    (with (count 1 step 2)
      ; do some loopy stuff here
      )))
If that's not a compelling example, pretend that count-ops is inserting an interrupt check between every two lines of code. Why is dynamic scoping better for cases like these?
As for real arguments to macros, yes, I meant something like the CL stuff. You're right, though, that macros modify syntax, and I wasn't thinking about them that way. Two posts up you said that macros are "expanded in place". I think that you were thinking about the effect macros have on the code they're called on, whereas I was thinking about the call to the actual macro procedure, and passing arguments to it.
It seems to me that Arc provides a solution to this problem in the fact that you can call any data object as a function, and define what it means to do so. What we're really getting at, though, is a theoretical issue with what a function is.
The easiest solution to your problem might be a callable table whose keys are the types of arguments and whose values are the different function implementations. A slightly cooler implementation could allow arbitrary predicates by keeping a list of (predicate, implementation) pairs: when called, it would walk down the list until it found a predicate that matched the arguments, and then call that implementation. You could make either of these pretty trivially; a sketch of the second follows.
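Here is a rough sketch of the (predicate, implementation) version in plain Arc; 'make-dispatcher and the toy 'show are made-up names, not existing operators:

  ; cases is a list of (predicate implementation) pairs, tried in order
  (def make-dispatcher (cases)
    (fn args
      ((afn (cs)
         (if (no cs)
               (err "no matching implementation for" args)
             (apply (caar cs) args)
               (apply (cadr (car cs)) args)
             (self (cdr cs))))
       cases)))

  ; toy usage: a 'show that treats strings and other lists differently
  (= show (make-dispatcher
            (list (list [isa _ 'string] [prn "string: " _])
                  (list acons           [prn "list of length " (len _)]))))

So (show "hi") picks the first case and (show '(1 2 3)) the second; the type-keyed table version would just replace the linear search with a lookup.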
However, this gets at the idea of what a function really is. A theoretical "pure lambda" is something you can apply to other objects and maybe get a result from. I think we need these in the language, to support primitive types and a foreign function interface.
When you talk about chaining, however, you're thinking of something like a linked list of these things, and you want to manipulate the structure of the list. That's already getting away from the pure lambda idea. My ideas get farther away, and at the risk of looking foolish, I'm going to claim that anything that solves your problem also includes a way to get inside a function object and mess with its internals. The question then is, what should a function be? What should you be able to do to one? Here's a short list of what we've talked about:
; f is an extended function object.
; the following should work.
(apply f ...) ; call f on some args
(redef f (argtyps) ...) ; redefine the behavior of f on some argument types, or add a new implementation if one isn't already there.
(extended-function
  argtyps imp
  ...) ; takes pairs of argtypes and implementations, and makes an extended function object.
You also need the primitives still, so you'll want this:
; f is some sort of object.
(callable? f) ; see if f is any sort of callable
(primitive-procedure? f) ; see if f is a pure lambda. this might be a bad name for this function.
There would be primitive constructors for assembly code and other languages.
(asm ...) ; takes asm, returns a primitive-procedure
(from-shared-object file name) ; loads a primitive from a library. this should probably be renamed too
(from-c-header header name impfile) ; loads a primitive, but deals with types nicely by reading a C header file that defines this.
> The easiest solution to your problem might be a callable table whose keys are the types of arguments and whose values are different function implementation.
Yes. But there's a problem: how do you define the table for, say, the variadic function '+ ?
Your solutions are approximately what I had in mind, although I think I found some problems with them last night; I just woke up and can't remember quite yet (brain is still booting or something).
(def + rest
  (reduce <base>+ rest))

(defm <base>+ ((t x string) (t y string))
  (join x y))
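Assuming the proposed 'defm machinery above works as described, a string call would then just reduce pairwise through <base>+ :

  (+ "a" "b" "c")
  ; => (<base>+ (<base>+ "a" "b") "c")
  ; => "abc"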
Further, optional parameters are simply ignored and considered part of the rest parameter (i.e. they can't be typed). Basically, the type system matches on the first N parameters, where N is however many type parameters you care to define.
Why the first N parameters? So that we can protect against optional parameters defaulting to values that are inappropriate for other types, such as the 'scanner example I gave above.
First, I have always taken symbols to be things that are conceptually different from strings, but just happen to be represented as strings. After all, symbols really exist because they are part of the interpreter/compiler data structures: they are a way of referencing an identifier, which you might use as, for instance, a variable name. Yes, they can be displayed, and they print as the string associated with their identifier, but I've always considered that a rather uninteresting side effect. Showing a symbol to a user would seem to break layers of abstraction.
Strings, on the other hand, always seemed like the right way to store user-readable text. They have these nice escape sequences, and you can isolate them and change them without affecting your entire program. They can also include spaces and nice stuff like that.
However, it is completely possible that my gut reaction is wrong. Having interned strings can be very useful. I would suggest, however, that we do it with a real interned string facility, and then make symbols a special case of that. This facility should include strings that would not make valid identifiers, including things with spaces and special characters in them.
To your second point, though, you're right - strings are pointless. After all, they're just a special case of a list (or an array, depending on how you like to think). If you have the general case, then having the specific too is pointless and feels bloated. You could argue that characters, too, are just a special case of numbers, since that's what they really are underneath. In fact, you could say that the current string type is just a vector of numbers, with cool special syntax.
To which I would say, yes, let's go for the more general stuff. Practically speaking, we should add that in before we take out the specific string case, which after all is used a lot. But in general, yeah, let's make it possible to do this with anything.
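For what it's worth, Arc's 'coerce already exposes the list-of-characters view of strings, and the numeric view of characters:

arc> (coerce "abc" 'cons)
(#\a #\b #\c)
arc> (coerce '(#\a #\b #\c) 'string)
"abc"
arc> (coerce #\a 'int)
97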
If you do allow this abstraction, you also get the interesting benefit that internationalization and unicode come for free. As PG says, they're just stupid bloat in the core of the language. In order not to have them there then, there should be some way for people to implement them, without sacrificing anything that a built-in implementation would have.
This means that there needs to be a way to make a new type which is a vector of uniform type (oddly enough, this might be easier in arc2c than in arcn or anarki). It also means that there should be a way to define nice readable syntax for things like strings in double quotes, and isolated single characters.
And it still doesn't address the issue of how you communicate with the development system, which after all does have one preferred string representation - the one that you're writing in.
This definitely wouldn't be an easy or simple thing to do, and it might prove to be a bad idea. But I think it's worth considering.
You make some really good points, but I have to disagree. Because of Unicode, characters aren't numbers. They have different semantic properties. However, I think arc.arc has a good idea sitting on line 2521 (of the Anarki version): "; could a string be (#\a #\b . "") ?" That is a novel idea, and probably (if more things work with improper lists) a good one. The downside would be that (isa "abc" 'string) wouldn't be true, unless we make strings (annotate "abc" 'string), but then we lose the ability to treat them as lists. Maybe we should have a retype operator, so that (retype "abc" 'string) would return an object str for which (isa str 'string) would be true, but for which all list operations (car, cdr, map, etc.) would still work (so it would probably have to return true for (isa str 'cons), too).
Well, Arc lists are (e e . nil). Why do we have a different terminator for strings?
This is where things get hairy. In Scheme lists are (e e . ()), so I would say that having strings be (#\a #\b . "") would make sense there, but Arc specifically tries to pretend that () is nil. Why should "" be different from nil, when () is nil?
As for [isa _ 'string], I propose instead the following function:
(def astring (e)
  ((afn (e e2)
     (if (no e)
           t
         ; check for circularity
         (is e e2)
           nil
         (and (acons e) (isa (car e) 'char))
           (self (cdr e) (cddr e2))
         ; else
           nil))
   e (cdr e)))
Edit:
Hmm. I suppose someone will complain that you might want to differentiate between the empty string and nil. To which I respond: How is nil different from the empty list? Arc attempts to unify them; why do we want to not unify the empty string with nil? If someone says http://example.com/foobar?x= , how different is that from http://example.com/foobar ? x is still empty/nil in both cases.
As another example: table values are deleted by simply assigning nil to the value field. And yet in practice this hasn't hurt any of my uses of tables; has anyone here found a real case where keeping a present-but-nil-valued key in a table would be useful?
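For reference, that deletion behavior at work ('h is just an example table):

  (= h (table))
  (= (h 'x) 1)
  (= (h 'x) nil)  ; assigning nil deletes the key...
  (keys h)        ; ...so this returns nil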
That works, but it only eliminates half as much nesting. Would it work for more with first-class macros? (I'm just going to expand it into a compose call because I don't want to make it all one symbol)
Exactly what do you mean? Macros seem to work in functional position, at least.
arc> (and:or nil t)
t
> But really, what improvement in readability do you get with the 'block style?
I agree with you on this. 'block doesn't really remove parens, it just moves them around and removes the nested structure. According to pg's ANSI Common Lisp, "Lisp programmers read and write code by indentation, not by parentheses." I find nested code to be more readable than flat code for exactly that reason, even if the parens stack up at the end of the block.
Which is exactly what I meant by "macros seem to work in functional position." But almkglor's original statement "compose does not work with macros" does not take into account the special treatment of 'compose in functional position, which is why I was confused.
And if I am not mistaken, first-class macros, once implemented, could be used in other situations like your example above.
Looking at almkglor's comment, it appears to indicate that a literal 'compose doesn't work; when it's optimized out, it works fine. In other words, everyone is vehemently agreeing with each other ;)
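Roughly, the difference is between these two; the second fails because 'and is a macro, so a literal 'compose only gets hold of it as a value at run time:

  (and:or nil t)            ; works: a:b in functional position is expanded by the compiler
  ((compose and or) nil t)  ; fails: 'compose can't apply the 'and macro at run time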
And of course, first class macros could do this. But we don't have those yet…
I find it more readable personally, and definitely easier to write, if there are fewer parentheses at the end of the block. Also, the lack of indentation keeps things lined up nicely, without spilling them off the right edge of the screen.
But really, the block macro defines context (in fact, I considered calling it "context"). That is the semantic meaning. It gives you a list of expressions that define the context for the rest of the list.
I personally find it easier to read, but I am curious about whether other people do, and why or why not. You said the details of 'repeat don't look special anymore. Do you think they should?
Stalin (the whole-program optimizing Scheme compiler), but there's no need to make the whole program freeze. If compiled code needs to inline operators, just inline them as they are in that particular function, and let the rest of the world keep on changing. If it turns out to matter later, you can always say (recompile function) or something like that. Maybe (update-internal-defs function) would be better.
It would make this a lot easier if you kept the code to all interpreted functions you define around, so you could inline them with whatever definition happened to be current. Keeping the code around is a good thing anyway, though, so that's all right.
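A minimal sketch of the keep-the-source idea; '*sources*, 'defsrc and 'recompile are made-up names, and only an implementation that actually inlines at compile time would gain anything from re-evaluating the saved source against the current definitions:

  (= *sources* (table))

  ; like 'def, but also remembers the source form
  (mac defsrc (name args . body)
    `(do (= (*sources* ',name) '(def ,name ,args ,@body))
         (def ,name ,args ,@body)))

  ; re-evaluate a remembered definition against the current globals
  (def recompile (name)
    (eval (*sources* name)))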
(Note: I do not mean keep the code to compiled functions. That would be bad, because compiled function objects are exactly how an ffi is going to get into arc later. We don't want to start assuming that everything has sexps. This does make update-internal-defs a little weird, though. Even though they don't necessarily have sexps, I think they do need to have a table listing everything from the environment that they use.)