Well, almkglor and I had talked once about having [ .. : .. ] syntax, which overloads the normal brackets if the reader detects the colon. In this version, the variable names come before the colon, and the function body after. Zero arguments should also work. I don't know if it was ever included into anarki, but it seems like it should be useful.
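For example (my reading of the proposal; I haven't tested an implementation):

(\[a b : (+ a b)]    would read as (fn (a b) (+ a b))
[: (prn "hi")]      the zero-argument case, i.e. (fn () (prn "hi"))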
Those proposals shorten fns by at most three characters. Are multi-arg fns used often enough to warrant this? news.arc contains 23 multi-arg fns in 1769 lines of code, so at 3 characters each that's roughly 69 characters saved, or about 1 char every 26 lines.
That would be ok if the proposals were simple and elegant, but personally I find them hackish and inconsistent with the rest of the language. They also don't fully replace fn because they lack an equivalent for (fn args ...).
Here's my idea: just replace 'fn with a special symbol, like \. This seems to work:
--- brackets0.scm 2008-11-11 17:06:01.000000000 -0600
+++ brackets.scm 2008-11-11 17:06:17.000000000 -0600
@@ -18,7 +18,8 @@
; a readtable that is just like the builtin except for []s
(define bracket-readtable
-  (make-readtable #f #\[ 'terminating-macro read-square-brackets))
+  (make-readtable #f #\[ 'terminating-macro read-square-brackets
+                  #\\ 'non-terminating-macro (lambda _ 'fn)))
; call this to set the global readtable
Personally, I think that (fn (a b) (+ a b)) is more readable than (\(a b) (+ a b)), and readability matters much more than number of characters.
Also, the [:] form could save more characters, if it automatically applied the outer set of parens to the body form.
However, I don't think it's really that much of an improvement; fn works well enough unless you really like extra syntax.
What was it the original poster wanted, anyway? It sounded like something that was more readable than _1 etc. for the var names; thus my dredging of the old thread. If not, then obviously, it wouldn't be a good choice. Maybe the [:] form should be capable of only naming some of the args, and leaving the rest to the other naming convention? Then the [] form can name the first n arguments by putting them before a :, and have the args after that referenced by $, $0, $1, $2, etc. or some better character set, if _ looks bad.
It's only a problem if fns of two or more args are common, and they don't seem to be. In news.arc, srv.arc and blog.arc they appear once every 123 lines. In my CL code they appear every 250 lines. Are they more common in your code?
The only things I don't like about arrow form are 1) it's two characters, and 2) it looks like other math symbols.
I like the colon form, but some text editors make it almost invisible. If the font makes it bold enough, it can be easier to recognize than many of the others.
I originally liked the pipe form, as it's also pretty obvious. However, since this is a lisp, you can always rewrite it to suit your individual tastes ;) How about writing a "config" file for arc, and various conversion tools, that allow us all to write in our own style, and easily convert between them? Then we wouldn't have to argue over which separator to use.
I implemented something like the first one in my m-expression reader: "a -> b;" is translated into "(fn a b)". It could be used with cchooper's customisable reader, so you can still use s-exprs most of the time, like this:
Any idea on how this could be implemented? I think it might help if functions carried around pointers to their source code in an easy to access manner.
; convert a cycle into transpositions
(def trans (cycle (o first nil))
  (if (no cycle)   nil
      (~cdr cycle) (list:list cycle.0 first)
      (cons (list cycle.0 cycle.1)
            (trans (cdr cycle)
                   (if first first cycle.0)))))

; permute a list using a list of disjoint cycles
(def permute (cycles l)
  (with (ts  (apply join (map trans cycles))
         ret (copy l))
    (map [= (ret (- _.1 1)) (l (- _.0 1))] ts)
    ret))
(permute '((1 2 3) (4 5)) '(a b c d e))
=> (c a b e d)
Thanks, that was what I was looking for. Unfortunately, I think I've configured something wrong in git or ssh, and cannot clone the arc repo for actually using it ;)
By temporarily, I originally meant without defining a shadow function like you did, just reorganizing the arguments for the single call. But that might not be easy to do in a concise, flexible way, so I suppose shadow functions make the most sense.
You shouldn't need ssh just to clone the repo. This should work fine:
git clone git://github.com/nex3/arc.git
However, you can't push using that URL, so you'll end up with an un-pushable copy. Have you tried looking through all the guides? http://github.com/guides/home
To reorder arguments in a single call... it would certainly be easy to do that for binary functions:
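For instance (just a sketch; 'flip is a made-up name here):

; return f with its two arguments swapped
(def flip (f)
  (fn (a b) (f b a)))

((flip cons) 1 2)   ; => (2 . 1), same as (cons 2 1)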
As for the silence, it appears almkglor and stefano are busy working on hl (http://github.com/AmkG/hl/tree/master). I don't know if they still read the forum. I suspect a lot of other people are getting disheartened.
Also, until there's another release of Arc, there isn't much to talk about.
Yes, I think I've followed all of the guides. However, I keep getting the "Permission denied (publickey)" error. I wonder if there are more configurations, like username or email address, that have to be the same as what github expects?
Speaking of optimizing for "100 years", what are some design decisions which would make the language more flexible/easy to use that, while possibly sacrificing speed or efficiency, would make the language more useful in the long run?
It seems to me that computers (currently, it may not last) are doubling in speed rather frequently, so any efficiency problems in the language will quickly be outweighed by its usability. That's partly why one of the most popular tools for making websites is Ruby. It may not be fast, but it's cheaper to double the server capacity than hire more programmers.
If you're looking at long-term prospects, my understanding is that the idea that computers have "speed", which "matters", is actually temporary (should stop around 2040 if current trends continue). At that point you run into the thermodynamic limits of computation, where the question is how much energy you're willing to put into your computation to get it done.
However, I'm not sure that affects language design a great deal. It seems to me the goal of a language is to let you express algorithms in a concise and efficient manner, and I don't see that becoming unimportant any time soon.
Well, I think we're sort of agreeing with each other. You said that speed should be irrelevant, because language design was about describing algorithms concisely and efficiently; I said that we should try and think outside the box, by ignoring our sense of efficiency. If it didn't matter how the feature was implemented, what features could we come up with that could radically change how easy it is to code? Maybe there aren't any, but it is an interesting idea. That is not to say that I'm only interested in inefficient improvements ;)
Another possibility:
Lisp is an awesome family of languages partly because they have the ability to create higher and higher levels of abstraction via macros. "Complete abstractions" they were called in Practical Common Lisp. Are there any other means that could create "complete abstractions"? Or easily allow higher level abstraction?
Moore's law has actually been failing as of late. I suspect that exponential increase in computer capability is a feature of the yet-nascent state of computer technology. Assuming that it will continue into the foreseeable future is a bad idea. Performance is important; it's just not as important as some think, and in particular, it's not so much all-around performance as the ability to avoid bottlenecks that's important. This is why a profiler for arc would be nice, as would some easy and standard way to hook into a lower level language for speed boosts.
If you generalize Moore's law to cost of computation per second, it isn't failing. In 100 years, network availability, speed, and capacity will lead to yet-unimagined changes.
How will programming languages take advantage of a thousand or a million CPUs?
Interesting. I was just wondering recently what capabilities arc or cl had for traversing tree structures (since that is, after all, what lisp is "all about") for doing something a la jquery: searching for arbitrary elements in a tree based on their name, or some attribute of their contents. I just haven't had the time to bother looking.
Seems like this would work for most cases, and it's pretty easy to make an arbitrary selector.
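For instance, a minimal selector along those lines (a sketch only; the name 'select and the plain nested-list representation are my own assumptions):

; collect every subtree of a nested list whose head satisfies 'test
(def select (test tree)
  (if (no (acons tree))
      nil
      (+ (if (test (car tree)) (list tree))
         (mappend [select test _] (cdr tree)))))

(select [is _ 'p] '(body (div (p "hi") (p "there"))))
; => ((p "hi") (p "there"))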
Did I miss anything? Are there any better options?
Also, how would you modify the data structure using this? I'm not too familiar with arc yet.
If you can modify it, I would presume it's done by editing-in-place via reference. How would you edit the tree via copying? i.e. return a new tree that's been modified?
The boring stuff, like building nice parameterized SQL queries and getting the data back from SQL, or launching a system process in parallel and keeping track of its status (and potentially aborting it if e.g. it takes too long).
If we do all of the boring stuff in a clean, concise way, that makes everything easy, with the option of adding macros on top to boot, the boring stuff might well become fun, or at the very least, painless.
Some time ago I started a GTK+ binding, now "paused". It's more boring than I thought initially. If you wish, look at it for a starting point (file gtk.arc in Anarki). I now think a binding towards tcl/tk would look nicer and be easier to use, though.
These would require a standard FFI system. Or else we would end up writing Anarki specific code. Such a fork would be a real Arc killer (in the bad sense of the term).
sacado built an FFI on Anarki.... well forks are generally bad, but with PG asleep until October or so .... (maybe he's getting ready for Oktoberfest or something ^^)
That's true. Programming languages take years to evolve, not months. Moreover, Arc is in the language design phase. As of today, the syntax and the semantics of the language are more important than libraries. For example, I'd like to see something like Kenny Tilton's cells (http://github.com/stefano/cells-doc/) nicely integrated within the core language.
Is it in the language design phase? Does anyone know if this is in fact true, and if so, what is currently being designed? The impression I have is that it just sort of stopped.
I would suppose that since a) this language is not used in anything "important" yet, and b) it's open source; yes, it can be in the design phase. I should think that the design phase persists until, for some reason, the language becomes "formalized", or until it is impossible to change anything major without ruining things for a lot of people. At that point you can still "design" the language, but since it has a generally understood "style" and so forth, it won't be able to change too much unless you make a whole new language.
What do you want to be designed? One of the major problems about "designing" a new lisp is that, since anyone can "do it" in their spare time, they don't see the point. Maybe they're right. ^^
Sorry for all of the quotes; it looks kind of funny, I'm sure.
We really need a module system, so that we don't have to always use the global namespace. I don't really want arc to turn into a lispy, slightly better designed php. The comparison is due to the fact that most functionality is added to php not via libraries, but "shotgun blasts of functions into the global namespace"
Maybe you already created a module system, and I am just unaware of it?
Why is the arc community so interested in adding lots of intrasymbol syntax? Adding all of these special-case reader macros sounds like it might be nice to use if it were easy to add, but they seem rather wasteful and likely to cause problems in the future. Maybe this particular instance isn't too much of a problem, but I wonder if all of these extra non-lispy symbols will be healthy in the future.
Why not use (import (package-name symbol-name) new-symbol-name)? or just (import package symbol new-symbol-name) where the last two variables are optional? With that syntax, packages don't seem too far off from hash-tables. That doesn't mean hash tables should be used to implement packages, I'm just saying they're similar. Calling (package symbol) would just return the value of the symbol from that package, and (def new-name (package symbol)) would be the same as import. There would still be some use for import, or using, to register certain members of the package into the present context without renaming them.
Using and import statements could be equivalent to a "global with," that is only escaped by the complementary "not-using" statement. If you don't want to use the package for the whole file, you could just use a normal with block, or an equivalent statement for packages, possibly built into "using".
I understand your idea of "interfaces," and it seems useful. However, wouldn't that mean that the library file would need to maintain all of the old versions, even if you weren't using them? Is there any way around that?
Thoughts? Have I completely misunderstood everything? I'm not really much of a programming language designer, and I'm not sure if I understood everything you've written on packages so far. How's snap going?
> Why is the arc community so interested in adding lots of intrasymbol syntax?
Because PG suggested adding them to Lisp and related languages, citing the case of 'quote and 'quasiquote and friends.
That said I don't understand your concern about this here.
Edit2: Personally I'm also somewhat concerned about exploding syntax. This makes things harder, sometimes a lot harder. http://russ.unwashedmeme.com/blog/?p=58
>Adding all of these special case reader macros sounds like it might be nice to use if it was easy to add
cf. my ssyntaxes.arc on Anarki
> Calling (package symbol) would just return the value of the symbol from that package
This will require special handling for macros; if I wanted to use a macro explicitly from a particular package, I'd need to do something like this:
; library.arc
(in-package foo)
(mac a-macro (x) (+ "the foo macro says: " x))
; my-program.arc
(in-package my-program)
; since a.b == (a b)
(pr (foo.a-macro "temp"))
However this means that the arc base system has to check that a list in head position might resolve to a macro or a symbol for a macro. Someone actually posted a patch for this but didn't push it on Anarki (I'll search for it later when I have a bit more time).
Your suggestion is certainly valid; one can implement packages as macros that perform lookups in a hidden global table and would probably use gensyms.
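A rough sketch of that approach (every name here is made up, and readable mangled symbols stand in for the gensyms, just so the example prints nicely):

(= pkg-table* (table))

; map a (package symbol) pair onto one hidden global symbol,
; creating the entry on first use
(def pkg-sym (pkg name)
  (let tb (or (pkg-table* pkg)
              (= (pkg-table* pkg) (table)))
    (or (tb name)
        (= (tb name) (sym (string "<" pkg ">" name))))))

; (w/pkg foo x) expands into the symbol <foo>x, i.e. the hidden global
(mac w/pkg (pkg name)
  (pkg-sym pkg name))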
However the problem lies not in functions or macros but in types. Currently the standard Arc Way is to use plain symbols for types. Now, suppose package A defines a type 'parser, while package B also defines a type 'parser. If symbols are kept global (unpackaged!!) then we have a type clash: if you have an object of type 'parser, is it 'parser as known by A or 'parser as known by B? My proposal, on the other hand, will transform them into '<A>parser and '<B>parser.
Edit: PG: "I can't imagine why users would want to have type labels other than symbols, but I also can't see any reason to prevent it." http://www.paulgraham.com/ilc03.html
Edit: Of course, currently Arc is semi-standardized on using symbols, and I doubt it'll change sometime soon. In principle it's possible to use non-symbol type tags (say some random object containing inheritance and fields information)
Digression:
I'm also implementing multimethods/generic functions, and they depend on type. So you can do something like, say:
(in-package foo)
(using <arc>v3)
(interface v1 my-type)
(def my-type (x)
  (annotate 'my-type x))

(defm <base>+ ((t a my-type) (t b my-type))
  (my-type (<base>+ (rep a) (rep b))))
... and you can do something like:
(using <foo>v1)
(= a (my-type 1))
(= b (my-type 2))
(= c (my-type 3))
(+ a b c)
=> #3(tagged <foo>my-type 6)
It involves some trickery on the Scheme side (mostly to keep '+ efficient even though it's mostly defined on the Arc side), but hey, overloading + is cute ^^
Of course, package <bar> can define <base>+ on its own my-type, and adding a <foo>my-type and <bar>my-type will correctly throw a typing error ^^, which is even cuter.
> However, wouldn't that mean that the library file would need to maintain all of the old versions, even if you weren't using them?
How good or bad is backwards compatibility? If you use a library today and someone fixes a bug in it, but in the process completely changes the interface from under you, would you be pleased or pissed?
Backwards compatibility accumulates cruft, true. BUT, it helps with a very human need to be lazy. If updating a library means I might have to look through all my other tools, checking that something the library provides in older versions is changed to conform to newer versions, I might prefer not to update anyway.
> How's snap going?
Badly. I'm stuck on I/O. The bad thing is with sockets. A socket is bidirectional (one FD number for input and output directions), but mzscheme splits it into two mzscheme ports, one for input and one for output. Arc-on-mzscheme thus expects monodirectional sockets. This means I might have to redesign a bit of the central I/O, since I was designing with listen/socket-accept returning a single port object.
The bits I'm doing are actually listed in the second-to-the-last post on the SNAP VM blog, i.e. the backlog.^^
I've been thinking a bit more on the "package" problem, which is a bit dangerous, I know.
I don't really know how you're doing scoping in SNAP, or even much of how it's done in arc, but if the lexical scoping is done the same way as it is according to SICP (which I'm just now reading), then each "environment" has a pointer to its parent frame, and if the interpreter can't find the definition in the current frame, it moves up one level.
What if you just merged the concept of "packages" "environments" and "scope" into the same thing? I don't really know what the syntax should be, but basically when you load a package at a certain level of the program, it adds a pointer to the current frame (maybe between this frame and its parent? I don't know how well multiple parents work for lexical scoping) and it's treated by the interpreter as if it were just another scope.
Then, to make things fun and interesting for more than just packages, you add a syntax for naming scopes, and referring to them either by name or by relative height. This way you can access not only package specific variables, but also get around some shadowing problems by telling it "I only want 'a if it's in this scope" or "I want 'a from the frame two levels up"
I don't really care too much what the syntax looks like, but this would be an interesting way to reference packages that should be (if scoping actually works in a way similar to what I've presumed) a) fast and b) capable of managing more than just packages.
One major problem is that it would probably require lots of low-level munging in scheme to get this to work. It would probably be easier to do in SNAP, since I think you're implementing most of this from scratch there.
What do you think?
Edit: I've been thinking again ^^, and I just realized that though this might be an interesting way to implement packages, I don't think it would fix your typing issue. Maybe variables should all carry around info of which scope they're from (kind of what you were suggesting)? That way the system can look in the right place for variables more quickly, too. If 'a says it's from scope this+2, we wouldn't have to wait for the interpreter to miss twice before it found the value.
In a true nested-environment implementation, the innermost function will still carry around 'x and 'y even though it doesn't need them. This means that GC will not be able to collect the contents of those variables.
A faster way would be to "flatten" the environment. In effect, what's really done is:
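Roughly, an illustration (not actual arc2c output):

(fn (x y)
  (fn (z) (+ x z)))   ; the inner fn only ever uses x

; nested environments: the inner closure keeps a pointer to the whole
;   parent frame, so both x and y stay alive
; flattened: the inner closure is built more like
;   (%closure <inner-code> x)
;   i.e. it copies just the value of x, and y can be collected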
For mutated and shared variables, most implementations have some sort of "boxed" variable; this is what are called "shared variables" in arc2c/SNAP.
------- BUT!
Just because environment flattening is how it's usually implemented doesn't mean it's not possible to capture and name an environment.
For example consider that what we call a "function" is really an object constructed by a 'closure very-hidden function. It's really a pair of the environment and the function code. Similarly if Arc had ways of destructuring a function, we could implement a call-with-current-environment function. Suppose we had a 'environment-of function which extracts just the data of a given function:
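For instance (everything here is hypothetical: 'environment-of, 'code-of, and 'make-closure don't exist; this just sketches how a call-with-current-environment could be built from them):

; run f's code against a different captured environment
(def call-w/env (f env)
  ((make-closure (code-of f) env)))

; hand f the environment at the point of use, via a throwaway closure
(mac w/current-env (f)
  `(,f (environment-of (fn () nil))))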
> What if you just merged the concept of "packages" "environments" and "scope" into the same thing?
This has been implemented. Take a look at lib/module/module1.arc . But aside from having problems with macros, there's also....
> I've been thinking again ^^, and I just realized that though this might be an interesting way to implement packages, I don't think it would fix your typing issue.
Quite right ^^
> Maybe variables should all carry around info of which scope they're from (kind of what you were suggesting)?
Err. I don't quite understand this. Also, I don't quite understand how this relates to symbols-as-types...
I seem to be slowly arriving to similar conclusions that you have, the only differences being things that I didn't know about the way that lisp is optimized. Maybe I'll end up learning something through all of this.
> Err. I don't quite understand this. Also, I don't quite understand how this relates to symbols-as-types...
Well, what I had originally thought when I said this was rather silly, which was that in the value of the variable there should be information about which scope it's located in. Doh! Maybe that would work if it looked in the current env to figure out where the "real" var is.
What I now think is something very similar to what you've been saying, except I still think it would be cool if this could be a method of managing general environments, instead of just modules. I don't know if that was what you were thinking, or not; it's more of an implementation detail.
I do like the idea of being able to extract and name the environment at any point in the code. If all that a package was was an environment with a name and we had the ability to access items from each environment directly, it could (possibly?) be useful for a lot more than just modules.
I really like the syntax (env sym), where env is the environment object, and sym is the symbol we're looking up. This way env.sym works. That's probably a syntax that many people are used to. I just don't know if it would work or not.
For your typing problem, I would do what you recommended. If it made any sense, I would use the syntax env.sym, so that it can be evaluated to figure out what type it is. You would still have to evaluate everything in head position to see if it evaluated to a macro. On the other hand, you could just check the car of the first item, and see if it was an environment. Or you could treat the whole thing as a single symbol and have it bypass the reader? How does arc's typing system work?
Also, how much would changing the type symbol to <env>sym help? How would it help the code decide how to interpret it?
I just don't know enough about scheme's pointer and environment system, or arc's typing system to know if any of my thoughts are even going in the right direction. So I just make more of them to cover the bases, and have you explain why none of them work. ^^
Basically at this point, I'll agree with anything you say, because I don't know enough to disagree with you. I would be very interested in learning/having you teach me all of the details about arc's implementation. As always, I'd love to help, but I don't know nearly enough to be useful.
Maybe we should start a new thread for the module system that you're proposing?
As per one of your previous comments: If symbols weren't global, how would you do them?
Edit: PG: It would be really nice if there were "reply" and "View Thread" buttons for each comment on the comments page. How hard would these be to add?
> What I now think is something very similar to what you've been saying, except I still think it would be cool if this could be a method of managing general environments, instead of just modules. I don't know if that was what you were thinking, or not; it's more of an implementation detail.
Yes, it would definitely be cool to access the environment. For one, it would make serializing functions possible: aside from giving accessors to the environment and to the function code, you can give constructors for functions that accept a serializable environment and a serializable function code.
> If it made any sense, I would use the syntax env.sym, so that it can be evaluated to figure out what type it is.
Typical 'isa pattern:
(isa foo 'cons)
Translating to module.sym form:
(isa foo 'arc.cons)
.... except that arc.cons == (arc cons) and you'd be comparing the type to the unevaluated (arc cons), because of the ' mark.
With <arc>cons syntax, the "<" and ">" characters are simply passed directly, i.e. <arc>cons is just a single symbol.
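So the check above would just be written:

(isa foo '<arc>cons)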
Is there any reasonable way to prevent the dot from being interpreted there? Or alternatively, evaluate the whole statement?
And how is the isa statement going to be updated to work independent of which module it's running in? Does each module type all of its objects that way from the beginning, or were you going to have the interpreter do something fancy?
> Is there any reasonable way to prevent the dot from being interpreted there? Or alternatively, evaluate the whole statement?
First things first. We want to make the type arc.cons readable, don't we? So if say 'arc here is a macro, it will expand to a symbol, which is basically "the symbol for cons in the arc package".
Now that symbol can't be a 'uniq symbol, since those are perfectly unreadable.
What we could do is....... oh, just make up a symbol from the package name and the given symbol. Like, say...... <arc>cons. The choice of <> is completely arbitrary.
Now.... if (arc cons) is a macro that expands to <arc>cons, why go through the macro?
> And how is the isa statement going to be updated to work independent of which module it's running in? Does each module type all of its objects that way from the beginning, or were you going to have the interpreter do something fancy?
Something fancy, of course. Specifically it hinges on the bit about the "contexter". Remember that in my proposal I proposed adding an additional step after the reader, viz. the contexter.
Basically the contexter holds the current package and it puts any unpackaged symbols it finds into the mapping for the current package.
Now the contexter goes through the code and maintains hidden state. This state is local to the contexter and is not assured of being shared across threads (it might be, it might not; implementer's call - for safety just assume it isn't).
Initially the contexter has the package "User".
It encounters:
(in-package foo)
This changes its internal package to "foo". This (presumably) newly-created package is given a set of default mappings, corresponding to the arc axioms: fn => <axiom>fn, quote => <axiom>quote, if => <axiom>if, etc. The contexter then returns:
t
The t is evaluated and returns.... t.
Then it accepts:
(using <arc>v3)
This causes the contexter to look for a "v3" interface in the "arc" package. On finding them, it creates default mappings; among them are:
def => <arc>def
isa => <arc>isa
annotate => <arc>annotate
...etc.
Upon accepting this and setting up the "foo" package to use <arc>v3 interface, it again returns:
t
Then it accepts:
(interface <foo>v1
  make-a-foo foo-type is-a-foo)
This causes the contexter to create a new interface for the package "foo", named "v1". This interface is composed of <foo>make-a-foo, <foo>foo-type, and <foo>is-a-foo.
After creating the interface, it then returns:
t
Then it accepts:
(def make-a-foo (x)
  (annotate 'foo-type x))
Since it isn't one of the special contexter forms, it simply uses the mapping of the current package - specifically the package foo - and returns the form:
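Presumably something like this, reconstructing from the mappings above:

(<arc>def <foo>make-a-foo (<foo>x)
  (<arc>annotate '<foo>foo-type <foo>x))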
Notice how x is implicitly mapped into the package foo; basically any unpackaged symbol which doesn't have a mapping in a package is automatically given to that package.
See, I told you you'd convince me to come around to your way of thinking, once I'd asked enough questions to find out the reasons for the choices you've made.
So, how hard is it to take what you've made so far, and generalize it to work with any kind of environment, and not just packages? So, things like destructuring functions, naming environments, passing them around, etc. Would this allow us to avoid shadowing variables by naming which level they came from explicitly?
Also, if I type in the symbol '<arc>bar, will the contexter read that and presume I'm looking for the bar in arc, or will it rename it <foo><arc>bar? And which is better?
If it's the latter, how would you propose we directly reference a particular environment? Would that be pack.sym, like I had thought earlier?
> So, how hard is it to take what you've made so far, and generalize it to work with any kind of environment, and not just packages?
Hmm. Well, if you want to be technical about things, one Neat Thing (TM) that most compilers of lexical-scope languages (like Scheme and much of CL, as well as Arc) do is "local renaming". Basically, suppose you have the following code:
(fn (x)
  (let x (+ x 1)
    x))
This is transformed (say by arc2c) into:
(fn (x@1)
  (let x@2 (+ x@1 1)
    x@2))
Note that the renaming could actually be made more decent, i.e. something readable. For example, in theory the renaming could be done more like:
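Perhaps something like this (a guess at what "more decent" could look like; again, not actual arc2c output):

(fn (x)
  (let x_2 (+ x 1)   ; only the shadowing binding gets a readable suffix
    x_2))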
In fact I am reasonably sure that Scheme's hygienic macros work after local renaming, unlike CL's which work before renaming. You'll probably have to refer to some rather turgid papers though IMO, and decompressing turgid papers is always hard ^^.
> Also, if I type in the symbol '<arc>bar, will the contexter read that and presume I'm looking for the bar in arc, or will it rename it <foo><arc>bar? And which is better?
Well, since packages are constrained to be nonhierarchical, <arc>bar is considered already packaged and the contexter will leave it alone. The contexter will add context only to unpackaged symbols.
Well, I'm sure that since you're actually having to use arc, and implement it in a multi-thread environment, you'd know much more about the problems that a module system is likely to have than I. How does erlang do theirs? I found a short whitepaper on the subject, but it didn't have very many details. I'm sure that whatever they use is perfectly safe for whatever you'd be doing in snap, though the implementation/syntax might not be the nicest.
About macros, hasn't there been some interest for a while in getting macros to be "first-class" and so forth? How are they implemented, exactly, that makes this so hard? Is it really that hard for the reader to take the output of a function and automatically interpret it as code?
Also, you mention some effort being made into allowing arc to see if the form in head position resolves to a macro. How hard is that to do? I really don't know how macros work, though I understand what they do.
Aren't they just functions tagged to let the interpreter know that they can a) be expanded at compile time, because the output is not dependent upon the value of the input, and b) return forms that need to be evaluated?
How hard would it be to let normal functions do this? Suppose we had a syntactical feature on function definitions, such that (fn x 'a) prequoted a, so that the form located at a is captured, instead of its value. Would that be enough to "simulate" macros? Or does the interpreter need to know that the forms that come out the other end probably need evaluation?
I should think that all a macro is is a tag that guarantees to the compiler that its output is completely independent of its input, and can thus be expanded at compile time.
I know we kind of went through something like this before, but what am I missing? If you clear this up, maybe I'll finally understand how macros really work ;)
And good luck on SNAP. I would love to help you, as lisp + erlang (feature wise) is something I am very interested in. Unfortunately, it's just way beyond my abilities at this point. I shall look forward to reading your blog!
Is the problem with the sockets due to the fact that you're trying to maintain compatibility with the present arc-on-mzscheme? Or is it something more fundamental than that?
The difference is that Arc evaluates forms from a loaded file one expression at a time. 'def in Arc is simply an assignment to a global variable; technically, 'load doesn't load a module, it executes a program (which in most cases just assigns functions to global names).
Erlang source files, on the other hand, are a set of function definitions - they aren't executed at the time you load. There are no globals in Erlang (although the function names are effectively equivalent to global variables that can't be mutated normally). Each Erlang source file is compiled as a single unit, meaning one source file == one Erlang module.
> About macros, hasn't there been some interest for a while in getting macros to be "first-class" and so forth?
I presume you mean something like this:
(let my-macro (annotate 'mac
                (fn (x y)
                  (+ "my-macro says " x " and " y)))
  (my-macro "hmm" "haw"))
> How are they implemented, exactly, that makes this so hard?
It isn't how macros, per se, are implemented that makes this hard, it's how efficient interpreters are implemented that makes this hard.
One of the slowest ways to implement an interpreter is what's called an "AST traverser". Basically, the interpreter simply goes through the list-like tree structure of the code and executes it. In a Lisp-like, the AST is the list structures input by the s-expression syntax. This is what macros fool around with.
The slowness of this is usually because it needs to enter each sub-AST (i.e. a sub-expression, e.g. in (foo bar (qux quux)), (qux quux) is a sub-AST) and then return to the parent AST (in the example, it has to return to (foo bar _)).
However a faster way to do it is to pre-traverse the syntax tree and create a sequence of simple instructions. This is usually called a "bytecode" implementation, but take note that it doesn't have to be a byte code.
For example (foo bar (qux quux)) would become:
(call qux quux) ; puts the return value in 'it
(call foo bar it)
The increase in speed per se is not big (you just lose the overhead of the AST-traversal stack while retaining the overhead of the function-call stack), but it gives an opportunity for optimization. For example, since the code is now a straight linear sequence of simple instructions, the interpreter loop can be very tight (and relatively dumb, so there's very little overhead). In addition, it's also possible to transform the linear sequence of simple instructions to even simpler instructions... such as assembly language.
However, consider the above sequence if 'foo turns out to be a macro. If it is, then it's too late: the program has already executed 'qux. If it were part of say a 'w/link macro, then it shouldn't have executed yet. Also, recreating the original form is at best difficult and in general highly intractable, and remember that the macro expects the original form.
So in general for efficient execution most Lisplike systems force macros to execute before pretraversing the AST into the bytecoded form. This also means that macros aren't true first class, because they must be executed during compilation.
In short: most lisplikes (mzscheme included) do not execute the AST form (i.e. the list structures). They preprocess it into a bytecode. But macros work on the AST form. So by the time the code is executed, macros should not exist anymore.
> Also, you mention some effort being made into allowing arc to see if the form in head position resolves to a macro. How hard is that to do?
Trivial, just add a few lines in ac.scm. However rntz didn't push it on Anarki, which suggests that the modification hasn't been very well tested yet. See http://arclanguage.com/item?id=7451 , but the patch itself has been lost T.T . I think it'll work, but I haven't written the patch myself either ^^.
> And good luck on SNAP. I would love to help you, as lisp + erlang (feature wise) is something I am very interested in.
Ah, I see now. How naive of me to presume that lisp actually worked with the AST like it says it does. Oh well.
Is there any way to optimize the interpreter without sacrificing AST interpretation? Or should I write my own language that says "interpreted languages are supposed to be slow; don't worry about it" for the sake of more powerful (in theory) macros? ^^
Or is there actually no difference between the qualities of the two macro systems? Would you care to enumerate the pros and cons of each system? You can do it on a new thread, if you like.
> Or should I write my own language that says "interpreted languages are supposed to be slow; don't worry about it" for the sake of more powerful (in theory) macros? ^^
You might be interested in Common Lisp's 'macrolet form. I've actually implemented this in Arc waaaaaaay back.
Considering that CL is targeted for direct compilation to native machine code (which is even simpler than mere bytecode), you might be interested in how CL makes such first-class macros unnecessary.
I'm very interested in both of those. Would you care to explain? If not, do you have any particularly good resources (besides a google search, which I can do myself)?
So, how does that work, exactly? Does macrolet tell lisp that since the macro is only defined in that scope, it should search more carefully for it, because it doesn't have to worry about slowing down the whole program?
Err, no. It simply means that the particular symbol for it is bound only within the scope of the 'macrolet form. In practice, most of the time, the desire for first-class macros is really just the desire to bind a particular symbol to a macro within just a particular scope, and 'macrolet does that.
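Something like this hypothetical spelling (not necessarily how the version on Anarki is written; it just shows the scoping):

(macrolet (square (x) `(* ,x ,x))
  (square 5))   ; => 25, and 'square is not bound to anything outside this form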
For other cases where a macro expansion should be used more often than just a particular scope, then usually the module or whatever is placed within a package and a package-level macro is used.
Nice. I only did the gcd function, so you've got an amazing head start on me. However, the version of gcd that I did uses Euclid's algorithm, which is probably faster than prime factorization; and I included a case for only one number, and for lists of more than two.
Since I couldn't commit to github for some reason, here's the code:
;; Int (predicate)
(def int (n)
  "Returns true if n is an integer; i.e. has no decimal part."
  (is (trunc n) n)) ; return true if the number is the same as itself without the decimal part

;; Greatest Common Denominator
(def gcd l
  "Returns the greatest common denominator, or divisor, of a list of numbers.
   Numbers should be integers, or the result will default to 0."
  (with (a (car l) c (cdr l))
    (if (len> c 1)
          (gcd a (apply gcd c)) ; handle lists of more than two numbers
        (no a) 0
        (no c) (abs a)
        (let b (car (flat c))
          (if (or (~int a) (~int b)) 0 ; skip non-integers
              (is a b) a               ; return common divisor
              (if (> a b)
                  (gcd (- a b) b)
                  (gcd (- b a) a)))))))
Update:
Figured out what I was doing wrong with github. Math library now started in lib/math.arc. Currently it only contains my gcd function and the 'int test, but feel free to add any other math functions.
In the future, we may need to make it a math folder, with individual files for trig functions, hyperbolic functions, calculus, matrix and vector function, quaternions, etc. But for now one file is more than enough to contain my lowly gcd :)
My original int? was identical to yours, but that's why I changed it. Although, now that I think about it, since prime-factorization discards the results of operations that cast n to a rational, that's unnecessary. Still, there will need to be separate functions for testing if a num's type is int (which yours does), and testing if a num lacks decimal places (which mine does).
(And yes, Euclid's is definitely faster than prime factorization -- I originally looked at better algorithms for prime finding and gcd but I decided to settle for the "pure" approach -- sieve instead of some Fermat's or another fast primality test, prime factorization based instead of Euclid's.)
Ah, I didn't notice that. Apparently I should test things more carefully next time :)
I don't know if you have github access, but if you do, you can feel free to add your own code to the math.arc lib. I probably won't have a chance to add it myself in the next week or so. If I can, though, I will definitely try to do so.
In doing so, I noticed one bug in Darmani's original implementation of 'floor and 'ceil. 'floor would return incorrect results on negative integers (e.g. (floor -1) => -2), and 'ceil on positive integers (e.g. (ceil 1) => 2). This has been corrected on Anarki.
I also used mzscheme's 'sin, 'cos, and 'tan instead of Darmani's, not because of speed issues, but because of decreased precision in those functions. In order to get maximum precision it would be necessary to calculate the Taylor series an extra couple of terms, which I didn't feel like doing at the time.
I didn't commit 'signum, 'mod, 'prime, or 'prime-factorization, because I wasn't sure if they were needed except for computing 'sin, 'cos, and 'gcd... but feel free to commit them if you want.
1) Isn't pushing those math functions straight from scheme sort of cheating? I mean, maybe I'm just wrong, but wouldn't the solution be more long term if we avoided scheme and implemented the math in arc?
2) Shouldn't fac be tail recursive? Or is it, and I just can't tell? Or are you just expecting that no one will try to compute that large of a factorial?
3) If some one did compute that large of a factorial, is there some way for arc to handle arbitrarily sized integers?
1) No, you should implement the math in the underlying machine instructions, which are guaranteed to be as precise and as fast as the manufacturer can make them. The underlying machine instructions are fortunately possible to access in standard C libraries, and the standard C library functions are wrapped by mzscheme, which we then import in arc.
2) It should be, and it isn't.
(defmemo fac (n)
  ((afn (n a)
     (if (> n 1)
         (self (- n 1) (* a n))
         a))
   n 1))
3) Yes, arc-on-mzscheme handles this automagically. arc2c does not (I think it'll overflow)
Implementing numerically stable and accurate transcendental functions is rather difficult. If you're going down that road, please don't just use Taylor series, but look up good algorithms that others have implemented. One source is http://developer.intel.com/technology/itj/q41999/pdf/transen...
That said, I don't see much value in re-implementing math libraries in Arc, given that Arc is almost certainly going to be running on a platform that already has good native math libraries.
I figured that being close to machine instructions was a good thing, but I thought that we should do that via some other method, not necessarily scheme, which may or may not remain the base of arc in the future.
That being said, if you think that pulling from scheme is a good idea, why don't we just pull all of the other math functions from there as well?
Actually I think it might be better if we had a spec which says "A Good Arc Implementation (TM) includes the following functions when you (require "lib/math.arc"): ...." Then the programmer doesn't even have to care about "scheme functions" or "java functions" or "c functions" or "machine language functions" or "SKI functions" - the implementation imports it by whatever means it wants.
Maybe also spec that the implementation can reserve the plain '$ for implementation-specific stuff.