Sorry for being the complainer who changed your default license. :-p For one thing, after that time we talked about this, as I tried to justify my complaints, I ended up feeling way more comfortable with the Artistic License 2.0. Mostly though, I'm just really hard to please when it comes to licenses. I'll demonstrate. :-p
(Note from my future self: Since I'm not going to get around to this later in the post, I just want to say that I like your metric of choosing a license based on how easy it is to understand. ^_^ )
I still have a pretty fundamental objection to the AL 2.0: It doesn't let a branching project become its own thing. Even if it ends up having little in common with the original, you can't distribute "Compiled forms" without providing instructions for obtaining a copy of the original.
Aside from that, for all I know as a non-lawyer, a project that's grown completely out of its roots might have to be fully documented! :-p
...provided that you clearly document how it differs from the Standard
Version, including, but not limited to, documenting any non-standard
features, executables, or modules...
I'm troubled by the idea of licenses that try to define compilation. The AL 2.0 defines the "Compiled" form as "the compiled bytecode, object code, binary, or any other form resulting from mechanical transformation or translation of the Source form." What happens if someone refactors the code in an IDE?
Here, the spirit of the license is what counts (I assume), and there's a clear line. But some projects horribly blur that line. GCC needs a special exception to the GPL so that the C runtime pieces it includes with executables don't pollute those binaries with GPLness. In a language where people program by writing languages, do all their languages need exceptions like the GCC's? What about exceptions nobody's even dreamed of yet, to deal with all new kinds of code modularity and transformation?
As far as I know, the only consistent source-ownership stance someone can take if they want to make great strides in program modularity and transformation is a permissive one. There's no option. All the nonpermissive free software licenses I've looked at assume too much about the languages they'll be applied to, but it's hardly their fault; somehow they have to distinguish maintainable programs from obscured programs, everyday engineering from reverse engineering.
This isn't something I expect to be solved on the technology side, say by making reverse engineering so incredibly easy to do that there's no reason for free software advocates to object to compilation or obfuscation. Software-as-a-service isn't amenable to inspection this way. And I hardly expect to solve the problem socially or economically by convincing everyone in the world that it's valuable to choose permissive licenses. The easiest solution is probably a particular kind of license after all.
Whatever licenses these are would have to maintain their source-ownership stance even as their projects evolved new and unexpected kinds of modularity and transformation. Permissive licenses basically do this (as I said), and a maximally nonpermissive license (comparable to the AGPL) would have to start with a seed like this, and get more specific and legalesey:
You are allowed to distribute and license this product only under the
following conditions:

- You expect that a large market of people will promptly have a
  reasonable way to obtain technology and knowhow comfortably
  substitutable for what you actually used to develop the product.
  This includes any source code.
- You expect that your lessees will promptly have permission
  comfortably substitutable for the permission you actually exercised
  to develop, distribute, and license the product.
- You give your part of that permission immediately and indefinitely.
- You don't refuse to distribute and license this product to any
  potential lessee on the basis of how possible it is for that lessee
  to obtain that technology, knowhow, and permission.
- You allow your lessees to develop modified versions of this product.
- You require your lessees to accept these terms as you did.

This license continues to apply to a product if it is modified,
including actualizing the product by performance or execution.
Note that if a license along these lines actually comes to exist, I probably won't even use it myself most of the time. I'm just frustrated by the lack of good options.
Hmm, I wonder if you're getting Pauan and me mixed up. >.> I'm against 'isa and 'type almost altogether, but I think Pauan considers 'type to be part of the high-level interface of a value:
I consider it to be a part of an interface. If we're using duck typing (which I recommend), what happens when two functions define different behavior, but with the same names? All sorts of subtle bugs ensue. But by leveraging the type mechanism, we can group these differing behaviors into interfaces, avoiding the problem completely.
"Let's suppose you wish to create a custom table type. How would you go about doing that in pgArc?"
I haven't gotten all the way through your post, but I'm pretty sure it's impossible to define callable types in pg-Arc without dropping to Racket, and even then it might be impossible to do it cleanly unless you've loaded ac.scm with Racket's 'compile-enforce-module-constants parameter set to #f. Because of this, I recommend Anarki and Rainbow when it's necessary to have callable types, since they have 'defcall.
But hey, here's an unclean way. ^_^
The only things callable in pg-Arc are functions, tables, strings, and lists, and the only one of those that can have significantly custom behavior is the function. (This is set in stone thanks to 'ar-apply being on the Racket side, where the Racket compiler determines it's a constant.) We'll need to represent custom tables directly as Racket functions that provide the getter behavior.
This doesn't give us an easy way to get the internal representation of the table, or even to distinguish our tables from other functions. Fortunately, we can tack extra metadata onto an encapsulated value like a function by stuffing it into a weak hash table. We just need to make sure our tables aren't 'eq? to any other functions, which in our case should be true thanks to the fact that they capture the local variable(s) holding the storage for this particular table.
(I think pg-Arc does allocate a new function every time a (fn ...) form executes, meaning we don't necessarily need to make sure the function captures something, but Rainbow will compile certain (fn ...) forms into constants, and for all I know Racket may change its behavior in a future version.)
(mac $ (x) `(cdr `(nil . ,,x)))  ; bug exploit to drop to Racket

(= isa-my-table ($.make-weak-hasheq))
(= rep-my-table ($.make-weak-hasheq))

(def my-table ()
  (let internal-representation (...)
    (let result (fn (key (o default))
                  ...behavior for getting...)
      (= isa-my-table.result t)
      (= rep-my-table.result internal-representation)
      result)))

(extend sref (self val (o default) . rest)
  (and isa-my-table.self no.rest)
  ...behavior for setting...)
At this point, type.x will be 'fn for custom tables. Let's say we don't want that.
We could overwrite 'type, but that doesn't change the type seen internally by 'ac-macro?, 'ar-tag, and 'ar-coerce in ac.scm. They all use 'ar-type directly. Fortunately, nothing uses 'ar-tag and 'ar-coerce but Arc's 'annotate, '+, and 'coerce, all of which we can overwrite on the Arc side. That leaves 'ac-macro? as the one holdout... and that one checks specifically for the type 'mac, so there's nothing lost if it sees 'fn rather than our table type.
It would be interesting to actually write the code to replace 'type, 'annotate, '+, and 'coerce, but I don't have time right now.
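For the curious, here's roughly what the first step might look like, sketched in terms of the isa-my-table registry above (untested; 'annotate, '+, and 'coerce would need similar wrapping):

(let orig type               ; capture the builtin before shadowing it
  (def type (x)
    (if isa-my-table.x 'table (orig x))))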
"I haven't gotten all the way through your post, but I'm pretty sure it's impossible to define callable types in pg-Arc without dropping to Racket, and even then it might be impossible to do it cleanly unless..."
Yes, that was my entire point all along. The whole point of this post was in fact to demonstrate that it's impossible or obscenely difficult with pgArc, but incredibly easy with methods. That is what I've been trying to do this whole time: demonstrate why methods are awesome, and more extensible than things like (readc) and (readb) functions.
A callable thing (like a function) that accepts a key parameter (usually a symbol), that then does something. Yes, it's quite vague, intentionally. I'm using the word "method" because the concept is similar to "methods" in OOP languages.
I'm calling this thing a "method" in the sense of "foo has a method add and a method sub". It's not formalized, just a convention. They don't even need to be symbols. To look at it another way... it's inverting a function call. Thus:
(add foo 5) -> 6
(foo 'add 5) -> 6
Sometimes the first form is appropriate, sometimes the second form is. I believe in the case of input streams, that the second form is better. In other words, (stdin 'char) rather than (char stdin), for the reasons already given.
This is analogous to the concept of classes/objects/methods in languages like JavaScript or Python:
var foo = {
    add: function (x) { return x + 1 },
    sub: function (x) { return x - 1 }
};

foo.add(5)  -> 6
foo.sub(5)  -> 4
Incidentally, in JavaScript, a "method" is simply a function that is attached to an object. The only difference between a JavaScript "method" and an ordinary function is the `this` binding.
That's a potential way to make methods easier to define/use, but doesn't replace the concept. Let me put it this way... where do you define a function?
The traditional answer is "in the global scope" or maybe in a let block or something. That's all well and good, in most circumstances. In the case of a data-type, though, you want the possibility to create custom data types, like a custom table, or a custom stream.
The problem is... in this case, because things like peekc and readc are global functions, you end up needing to do some hacky stuff, extending them in a very verbose way to add in your custom stream type.
So instead, let's think of a different place to define the functions. In this case, the functionality is local to the stream: each individual stream decides how it is read. Thus, rather than defining a global function that operates on streams... we define a 'peek and 'char function on the stream itself. That is the fundamental concept of methods.
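For instance, here's a rough sketch of a string-backed input stream that carries its own 'peek and 'char behavior (string-stream is a made-up name; calling it with no arguments does the default action, reading a char):

(def string-stream (s)
  (let i 0
    (fn ((o m 'char))
      (case m                ; pg-Arc's case quotes its keys implicitly
        peek (when (< i len.s) s.i)
        char (when (< i len.s) (do1 s.i (++ i)))))))

With that, (let in (string-stream "hi") (list (in 'peek) (in) (in))) gives (#\h #\h #\i), and further calls return nil.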
Of course, this is only necessary for the primitive functions like readc and peekc. Functions that build on top of them (like readline) can be written in the normal way. So, I'm proposing that Arc remain essentially the same, with the key difference that data-types (tables, input, output, etc.) are implemented as methods, which makes it way easier to define custom data types.
I'm totally with you, not trying to replace any concept, just suggesting alternative terminology, something that isn't so overloaded.
And pointing out that a language with these 'places to put functions' doesn't need functions anymore. Your 'methods' are just a form of dispatch, and you want programmable dispatch policies. In the simplest case, no dispatch policy. In a language with your methods you need no concept called functions.
Wart already does all this. You can already overload peek or readc. I'd love for you to take a look and see if it does what you're talking about. Don't panic :) it's super brief and readable by all accounts. (or if you've taken a look already just say so and I'll stop with all the PR.)
And yes, I'm aware that the term "method" is overloaded, but I still feel it's appropriate. Consider earlier where I showed functions that are placed into a table. I don't think that could reasonably be described as guards, yet they can be reasonably described as methods.
Also, you're right that methods (in OOP languages) can simulate functions, and vice versa, but I feel like functions are "lighter weight". For starters, a method must always be attached to something, it can't be global. I like functions, so I'd like to keep 'em, I just feel like methods are more appropriate in certain (probably rare?) circumstances.
Assuming that we use the function approach rather than the object approach, then "guard" would indeed be a very reasonable name for them. So thanks for that.
Yeah, we can agree to disagree on terminology :) I think case expressions with guards are more general than C++-style vtables because you can dispatch on arbitrary expressions rather than just lookup a symbol in a table. It seems a pity to come so close and give up power you can get for free.
Yeah, except that C style languages (including JavaScript) make `switch` so awful that you end up using if/else chains. A missed opportunity, to be sure. Arc does a better job, which is good.
By the way, I'm curious, how would you go about creating a custom table type in wart? In my hypothetical ideal Arc, it would be like this:
(annotate 'table (def my-table (x n v)
  (case x
    'get ...
    'set ...)))
Pretend that it's been made pretty with macros. The ... after 'get is called when doing (my-table 'foo) and the ... after 'set is called when doing (= (my-table 'foo) 5)
Incidentally, my hypothetical ideal Arc wouldn't even be particularly difficult to do, but I'm curious if there's a better (shorter or more extensible) approach than what I gave above.
"By the way, I'm curious, how _would_ you go about creating a custom table type in wart?"
Hmm, I didn't really follow your example. Arc already has a table type. Do you mean you want a literal syntax for tables, so that you can say {a 1 b 2} or something?
No... I'm talking about writing something in Arc that behaves like a table, and fits seamlessly into Arc. Consider my earlier prototype post (http://arclanguage.org/item?id=13838). A prototype behaves exactly like an ordinary table, with the only difference being that if it can't find a key, it checks its parent.
As I explained in that post, it's impossible to do this seamlessly in Arc:
The above is impossible to do in Arc. What I am describing is a simple interface that makes the above not only possible, but easy. It lets you create new data structures that Arc treats as if they were tables. I will write a post about this in a sec.
$ git clone git@github.com:akkartik/wart.git
$ cd wart
$ git checkout a5d25c805a97347d1207  # Current HEAD for those trying this in future
$ cat > 099heir.wart  # Just kidding; copy and paste the following lines
                      # until the '^D' into 099heir.wart

; A new type: heir (because it inherits from a parent)
(def clone-prototype(x)
  (annotate 'heir (obj val (table) parent x)))

; Extend ssyntax
(defcall heir h
  [or rep.h!val._
      rep.h!parent._])

; Extend assignment
(defset heir(h key value)
  (= rep.h!val.key value))

; Extend keys
(def keys(x) :case (isa x 'heir)
  (join (keys rep.x!val)
        (keys rep.x!parent)))
^D

# Now start your engines
$ wart
wart> ; And try the following expressions at the prompt

(= a (table))
(= b (clone-prototype a))

; inheriting from parent works
(= a!x 23)
(prn b!x)  ; => 23

; assignment works
(= b!y 34)
b!y  ; => 34
a!y  ; => nil

; keys works
keys.a  ; => (x)
keys.b  ; => (x y)

; vals is left as an exercise for the reader :)

wart> (quit)  ; or hit ctrl-d a few times
Yeah, that's more-or-less the way to do it in pgArc as well (but using extend rather than :case).
It's nice that wart makes it so easy to extend stuff (with :case and defcall), but that doesn't help with the situation of having to extend multiple built-in functions (like keys, vals... anything that relies on tables, even user functions!) just to define one new type.
A further benefit of my system is that your custom tables can have a type of 'table, so existing code that expects tables can use them too.
If user functions are untouched I think one approach is as verbose as the other, whether you say: (= x (fn args (case car.args ...))) or def :case over and over. The latter seems more declarative to me.
The verbosity isn't in the actual declaration. It's in the need to extend built-in functions. My system avoids that completely: no extending.
As for extending user functions... consider this: what about code that expects something of type 'table? Your prototype is of type 'heir... so that won't work. My system does work in that situation. That can be mitigated by having functions coerce to 'table, and then define a defcoerce for heir, but why jump through so many hoops when you can get all that for free?
To put it another way, my system uses duck typing: if it supports these arguments, use it. This means that all you need to do is define something that looks like a table, and Arc automatically treats it as if it were a built-in table. No need to extend Arc's built-ins to convince it that, "yes I really really really want this to be a table."
"what about code that expects something of type 'table?"
I blame that code for its lack of extensibility. It should just go ahead and use the value as a table, trusting the extensions to be there (and using failcall to recover, of course ^_^ ).
If it must do a boolean check to determine the value's features (perhaps because failcall isn't available, or because it needs to fail fast), it shouldn't do a direct type check. Instead, it should call another global function dedicated to the inquiry of tableness (which is probably too specific an inquiry anyway, most of the time). That way, people can extend the tableness function too.
Extensibility isn't the only benefit of avoiding [isa _ 'table]. It also becomes possible to have a single value that supports table operations and (say) input stream operations at the same time. It's probably a silly combination, but a custom type based on an association list could warrant it.
You're missing the point... the problem is not when the function expects something of type table... it's when the function needs to change its behavior based on the interface. Consider `each`. It needs to change its behavior depending on whether it's a table or a cons or a string. How is using (table? foo) better than (isa foo 'table)? Both are the same as far as extensibility goes. And in fact, you can do that right now, in Arc:
(def table? (x) (isa x 'table))
And you're also missing the whole point of message passing... message passing enables duck typing, which means it's possible to write a function that doesn't care what its type is, as long as it supports the required message.
You're trying to solve the symptom... I'm trying to solve the problem. What I'm saying is that re-extending all the relevant built-ins is ridiculously inefficient and verbose, and that there's a far better way. Message passing not only drastically reduces verbosity, but it also solves the problem you mentioned.
"Consider `each`. It needs to change it's behavior depending on whether it's a table or a cons or a string."
Yes. In a world with failcall, it can try treating the value as a table in one rule, as a cons in another rule, and as a string in a third rule. In a world without failcall, it can do the same thing, but with the help of extensible global functions 'atable, 'acons, and 'astring. That's merely a specific example of what I just said.
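To sketch that concretely (my-each and the predicate names are hypothetical; each predicate would be an extensible global):

(def my-each (f seq)
  (if (atable seq)  (maptable (fn (k v) (f (list k v))) seq)
      (acons seq)   (map1 f seq)
      (astring seq) (forlen i seq (f seq.i))
                    (err "Don't know how to iterate over" seq)))

No direct type check appears; a new type only has to extend the predicates (or, with failcall, nothing at all).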
---
"How is using (table? foo) better than (isa foo 'table)? Both are the same as far as extensibility goes."
Sure, there's not a whole lot of difference between extending 'table? and extending 'isa. However, in my mind 'isa has a clear role as a way to check the most specific possible feature of a value: the kind of internal data it has.
One could go all the way down, making a (deftype wrap-foo unwrap-foo isa-foo) form that eliminates the need for 'annotate, 'rep, 'isa, and 'type altogether. In this example, 'isa-foo would be a function that indicates support for 'unwrap-foo, and 'wrap-foo would be a simple data constructor that makes an encapsulated value with nothing but innate support for 'unwrap-foo. (In a system with failcall, 'isa-foo is pretty much redundant.)
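A quick sketch of what such a form might look like (a strong table for brevity; a weak table, as above, would avoid leaks):

(mac deftype (wrap unwrap pred)
  `(let reps (table)            ; maps each wrapped value to its contents
     (def ,wrap (x)
       (let tagged (fn () x)    ; fresh closure; capturing x keeps it unique
         (= reps.tagged list.x) ; box the contents so wrapping nil works
         tagged))
     (def ,unwrap (x) (car reps.x))
     (def ,pred (x) (if reps.x t))))

After (deftype wrap-foo unwrap-foo isa-foo), (isa-foo (wrap-foo 5)) is t, and no global type registry is involved.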
This is basically how the Kernel draft deals with custom types (except without an eye toward extensibility), but I haven't been inspired to go quite as far as this in practice. I think programmers should be free, if discouraged, to break encapsulation of values they don't know the nature of, like Arc programmers can when they use 'type and 'rep.
"And in fact, you can do that right now, in Arc[...]"
Cool, huh? :-p
Obviously pg-Arc hasn't been built from the ground up this way, but Arc Forumgoers have had an awful lot of extensibility ideas since Arc 3.1 was released, and pursuing any one of them in Anarki would change the core so much as to turn it into a separate language. That's why I don't call Penknife a version of Arc.
---
"And you're also missing the whole point of message passing... message passing enables duck typing, which means it's possible to write a function that doesn't care what it's type is, as long as it supports the required message."
That's the whole point of extending global functions too. I don't care what something's type is, as long as all the utilities I use have been extended to deal with it.
As I mentioned above with 'deftype, the 'extend approach can go all the way down to the level of making the 'type function itself merely a reflection tool. Obviously pg-Arc isn't built this way, but we've had an awful lot of extensibility ideas here since Arc 3.1 was released, and pursuing any one of them in Anarki would practically turn it into a separate language. (That's why I don't call Penknife a version of Arc.)
"Yes. In a world with failcall, it can try treating the value as a table in one rule, as a cons as another rule, and as a string in a third rule."
How? I've provided concrete examples where my solution is significantly shorter at creating new data types than the current solution, while still retaining `isa`, while avoiding problems. If you have a solution that can do all that, then please present it.
---
"In this example, 'isa-foo would be a function that indicates support for 'unwrap-foo, and 'wrap-foo would be a simple data constructor that makes an encapsulated value with nothing but innate support for 'unwrap-foo. (In a system with failcall, 'isa-foo is pretty much redundant.)"
Why? My system already does things similar to that, but in a very simple way, that's compatible with pgArc.
---
"However, in my mind 'isa has a clear role as a way to check the most specific possible feature of a value: the kind of internal data it has."
This is precisely what I meant when I said that I think the current view of types is fundamentally wrong. A table is not a hash. It's an interface that could potentially be implemented as a hash... or something else. Arc leaves it unspecified, which I consider to be a Good Thing.
---
"That's the whole point of extending global functions too. I don't care what something's type is, as long as all the utilities I use have been extended to deal with it."
...but as I've already mentioned repeatedly (and demonstrated in a post), that way is significantly more verbose than it has to be. My solution solves that, in a way that is more hackable/extendable/succinct. That is the goal of Arc, yes? Hackability and succinctness? I still have not seen a solution that matches mine for succinctness, let alone extensibility. Please provide one, if you're not convinced that my way is superior. Then I will back down.
Note that they're using the message-passing idea: after an iterator has been created, you can call it with 'status?, 'dead?, 'alive?, or 'kill! to get/set its private data.
What I'm suggesting is to treat all compound data types like that. You can then create function wrappers around them, which is in fact what they did too: note the definition of `iterator-empty?`
Every individual iterator has its own private state, so it doesn't make sense to require extending built-in functions every single time you create a new iterator. Instead, you define a function wrapper (like `iterator-empty?`) that accesses the internal data for you. No need to extend anything! That conciseness is what message passing gives you.
"define something that looks like a table, and Arc automatically treats it as if it were a built-in table"
Bringing http://arclanguage.org/item?id=14268 back here, how would you convert code like (isa x 'table) into a method framework? Wouldn't it require getting rid of types altogether?
"how would you convert code like (isa x 'table) into a method framework?"
Haha... that's what I've been explaining all along! That's the whole point of message passing: defining a low-level framework that makes it easy to both create new types, and also extend existing types.
Go seems to do this really well. Functions can ask for a specific interface, which is defined as a set of operations. But types don't have to declare that they implement the interface. http://golang.org/doc/go_tutorial.html
I still don't see what this has to do with methods, though. (I still can't follow that humongous comment[1].) Both C++ and Java have methods. No language but Go tests for an interface by looking at the functions a type supports.
And this isn't the same as duck typing at all. Arc does duck typing just fine. In arc or python there's no way to convey tableness or any other ness. You just use the operations you care about, and hopefully any new types remembered to implement them.
Duck typing has no contract, Java has too rigid and inextensible a contract. Go is just right. And you're hosed in any of them if you write code of the form if (isa x table) ..
[1] I think I've missed half your comments in recent weeks. Either because they're so long and hard to follow, or because there's just too many of them and I forgot to go to the second page of new comments. The latter is a weakness of this forum structure. For the former, well, I think you spend too much time describing methods. Most of us have seen them in other languages. What you suggest is subtly different, and the subtleties are lost in the verbiage.
Which is why runnable examples are useful. You're right that wart differs from arc in subtle ways. That's why I have unit tests, to enumerate all the subtleties. I think you're calling it out (fairly) for lots of gritty details, and then doing the same thing even more egregiously.
Or maybe I just suck at reading comprehension compared to the rest of the group here :)
"Go seems to do this really well. Functions can ask for a specific interface, which is defined as a set of operations. But types don't have to declare that they implement the interface. http://golang.org/doc/go_tutorial.html "
So... Go's interfaces actually look an awful lot like message passing. Here's a comparison:
type reader interface {    // Go
    Read(b []byte) (ret int, err os.Error)
    String() string
}

(annotate 'reader  ; py-arc
  (fn (m b)
    (case m
      'read ...
      'string ...)))

(object reader  ; py-arc, with sugar
  read (b) ...
  string () ...)
The idea is, every object is a collection of methods, and also a type. You can just call the methods directly, bypassing the type (this is duck typing), or you can check the type to make sure it implements the interface you want... or you can use coerce to change its interface. Here is an example of the three ways, assuming foo implements the reader interface:
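Roughly (the exact syntax is a guess, since py-arc is hypothetical):

(foo 'read b)             ; 1. duck typing: just send the message
(if (isa foo 'reader)     ; 2. check the interface first
    (foo 'read b))
(coerce foo 'line-reader) ; 3. convert to another interface
                          ;    ('line-reader is a made-up name)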
Message passing is simply the low-level way to create objects + methods... or prototypes. Or rulebooks. Or other things. To put it an other way, objects are just macros that expand to message passing.
Think of it like continuations and iterators. Continuations are lower-level than iterators, and it's possible to create iterators by using continuations. Message passing is the low-level idea, which can be used to create higher-level things, like prototypes, objects, or interfaces.
---
"Duck typing has no contract, Java has too rigid and inextensible a contract. Go is just right. And you're hosed in any of them if you write code of the form if (isa x table) .."
But what if `isa` tested for an interface? In other words, (isa x 'table) would be like saying, "does x support the table interface?"
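One way that could work, sketched with made-up names (interfaces* and implements):

(= interfaces* (table))   ; maps a type to the interfaces it supports

(def implements (typ iface)
  (push iface (interfaces* typ)))

(let orig isa
  (def isa (x typ)
    (or (orig x typ)
        (if (mem typ (interfaces* (type x))) t))))

Then after (implements 'heir 'table), code that asks (isa x 'table) would accept an heir too.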
---
"I think I've missed half your comments in recent weeks. [...] For the former, well, I think you spend too much time describing methods."
Yeah, I've probably been a bit too excitable about this... ^^;
---
"I think you're calling it out (fairly) for lots of gritty details, and then doing the same thing even more egregiously."
I'm not saying it's a bad thing that wart differs from pgArc, I was just explaining why I haven't immediately jumped in and started using wart. I think it's good to try out different approaches. Nor am I expecting people to immediately jump in and start using message passing... especially since message passing probably isn't useful for everybody.
I was presenting my ideas in the hope of getting useful feedback... so that my ideas can be improved, or rejected. I know that you guys are smart, and by presenting my ideas here, you all have already helped me improve them. I was just looking for some feedback (which I'm finally getting, which is good).
Also, I think it's perfectly possible to discuss and improve on the idea without needing a runnable example. You're right that I could have explained the idea better, though.
"I think it's perfectly possible to discuss and improve on the idea without needing a runnable example. You're right that I could have explained the idea better, though."
Agreed. I'm giving you feedback on what would help me understand the idea. Sometimes if an idea has too many subtle nooks and crannies it helps me to see running code. That doesn't compel you to provide it, of course.
"I was just looking for some feedback."
You're not just looking for feedback, you're aggressively/excitably demanding it :) But you'll be more likely to get it from me (as opposed to super boring meta-feedback, and agonized grunts about your writing quality) if you provide runnable code. Pseudocode is no different than English prose.
"Nor am I expecting people to immediately jump in and start using message passing.."
I'm not sure what you're expecting when you say things like "what more do I have to do?". What more do you have to do for what?
(Btw, the answer to the question 'what more do I have to do' is.. give me runnable code :) I usually start out talking about an idea; if it is neither shot down nor embraced I go build it. If it's still neither shot down nor embraced I try to build stuff atop it. Keep doing that until you run into limitations or (slim chance) you build something so impressive people have to pay attention. Platforms need killer apps.)
---
"Go's interfaces actually look an awful lot like message passing."
Argh, here we go again..
(annotate 'reader  ; py-arc
  (fn (m b)
    (case m
      'read ...
      'string ...)))
Since you can write this in arc today, it's utterly confusing and distracting to me that you keep harping on it.
"what if `isa` tested for an interface?"
This seems to be the absolute crux of your idea. This needs changes to ac.scm. But it has absolutely nothing to do with message passing. Messages are not types, and your suggestion is to augment the type system with a keyword called interface or object that isa can accept. If you made that change, arc would support 'message passing' according to you. Am I right?
I'll summarize my position again. I buy that interfaces are foundational, but lots of things are foundational. Unification like in prolog, pattern matching like in haskell, lazy evaluation like in haskell, messages like in smalltalk, function calls like in lambda calculus. That something is foundational doesn't necessarily make it a good fit for a specific language. To show that something foundational is a good fit in arc and also backwards-compatible, provide runnable examples that look good/short and behave intuitively. Show us that existing mechanisms in arc can coexist with the new feature, that it occupies a niche that they don't already provide. No hand-wavy ellipses.
---
"I'm not saying it's a bad thing that wart differs from pgArc, I was just explaining why I haven't immediately jumped in and started using wart."
No explanation necessary. I really don't think wart needs to be part of this conversation. Just compare your approach with extend and we don't have to keep on going on about "backwards compatible? at least it runs!" and "I don't care about running it, it's not backwards compatible." :)
"Since you can write this in arc today, it's utterly confusing and distracting to me that you keep harping on it."
Having built-in types like table/cons/etc. written in the same way, rather than as opaque blobs in Racket.
---
"I'm not sure what you're expecting when you say things like "what more do I have to do?". What more do you have to do for what?"
"What more do I have to do to demonstrate that message passing has advantages over extend?"
I asked that because it seemed that people weren't convinced that message passing was actually better than extend. This was unfortunate because:
1) It appeared to ignore the evidence that I was presenting
2) It seemed to basically dismiss message passing, but without explaining why. How am I supposed to know how message passing is flawed, unless people explain?
If the dismissal was caused by not understanding my idea, then it's my fault for not explaining well enough.
---
"[...] This needs changes to ac.scm."
Yup!
---
"But it has absolutely nothing to do with message passing. Messages are not types, and your suggestion is to augment the type system with a keyword called interface or object that isa can accept."
You're right: message passing is the low-level idea. Interfaces are built on top of message passing, but aren't necessary to support message passing.
---
"If you made that change, arc would support 'message passing' according to you. Am I right?"
No, in order for Arc to support message passing (at least in my mind), it would be necessary for the built-in types to also use message passing. As you pointed out, user-created Arc code can already use message passing, but that's not useful if you want to create something that behaves like a built-in type.
---
"[...] provide runnable examples that look good/short and behave intuitively."
Kay. I'll need to patch up py-arc first, though, because right now it doesn't support basic stuff (I'm looking at you, `apply`). Once I get py-arc into a decent enough shape, it should take less than a day to get message passing working.
"Having built-in types like table/cons/etc. written in the same way, rather than as opaque blobs in Racket."
"in order for Arc to support message passing, it would be necessary for the built-in types to also use message passing"
Are you planning to replace (f x) everywhere with (x f)? That hardly seems backwards-compatible. (Forgive me if this is a stupid question. I have zero 'expertise' since I haven't read 75% of what you have written in this thread.)
If you provide a way to handle (f x) as an 'f message to x, then you shouldn't need to implement primitives.
"It seemed to basically dismiss message passing, but without explaining why. How am I supposed to know how message passing is flawed, unless people explain?"
You have to try it :) It's pretty clear that nobody here has tried quite what you're proposing, so what you thought of as dismissal was just people thinking (like me for the past few weeks) that they had nothing to add since they haven't tried it, and wanting to reserve judgement until they had an opportunity to play with an implementation. But maybe that's just me :)
...but that's all hidden behind functions, so ordinary Arc code doesn't need to know that. In other words, `apply` would automagically convert (my-table 'something) into (my-table 'get 'something), so Arc code can't tell the difference. That's why it's backwards compatible.
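The dispatch rule might look roughly like this (all names hypothetical; the real thing would live inside py-arc's apply):

(def call-object (obj . args)
  ; a table-like object answers a bare key as a 'get message
  (if (and (isa obj 'table) (single args))
      ((rep obj) 'get (car args))
      (apply (rep obj) args)))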
---
"Or are you seeing dismissal in statements rather than silence?"
Mostly rocketnia, but that's fine since they now understand what I'm talking about, so we can actually discuss the idea. :P Their dismissal seemed to be because of a conflict in motivations.
"So... Go's interfaces actually look an awful lot like message passing."
That's a bit of a tautology. ^_^ If I'm not mistaken, you took the term "message-passing" from its object-oriented "foo.bar( baz )" context to begin with.
---
"Yeah, I've probably been a bit too excitable about this... ^^;"
Well, I (rocketnia) can't blame you for that, and I can't blame you for not having runnable examples either. I tend to be guilty of both those things a lot, and this discussion is a great example. >.>
my system uses duck typing: if it supports these arguments, use it. all you need to do is define something that looks like a table, and Arc automatically treats it as if it were a built-in table.
I see, yeah you're right. The alternative to duck typing would be subtypes. Ugh.
No need to extend Arc's built-ins to convince it that, "yes I really really really want this to be a table."
You're saying more than that, you're saying how. That logic is common to both approaches. Your approach is superior anytime user code has language like if (isa x 'table) ... Otherwise the two are equivalent in power.
I always treated branching on types as a code smell. wart basically grew out of trying to remove every case of that pattern in the arc codebase. You've given me something to think about, thanks.
By the way... I know using (case) is ugly. On the bright side, my system doesn't care how you write the actual function, just so long as you follow the interface. So you could use wart's :case for it as well:
(def my-table ()
  ...)

(def my-table (m) :case (is m 'keys)
  ...)

(def my-table (m k) :case (is m 'get)
  ...)

(def my-table (m k v) :case (is m 'set)
  ...)
...at least, I think that's how :case works? I actually like the wart style. That's supposed to be equivalent to this:
(= my-table (fn (m k v)
              (case m
                'keys ...
                'get ...
                'set ...)))
So yeah, it doesn't matter how you define your function (case or :case or `if` or extend or whatever), just so long as it follows the interface.
P.S. My goal here isn't necessarily to increase raw power, it's to increase extensibility and conciseness. Being able to do the same thing, but in a very hacky and verbose way doesn't sound appealing to me. So if wart has equivalent power, that's great! But I want to do better. I want to have equivalent power and make it shorter and make it more extensible.
Yeah that was the half of your point I understood when I was peddling wart :) But yes wart won't handle user code that checks the type of an object if that is important to you.
My http://awwx.ws/table-vivifier patch sounds like it may be close to what you want, though it goes too far: it extends tables so that if a key isn't found in the table, it calls a specified function to generate the value -- and then inserts the new value in the table.
If it would be useful, I could simplify it to leave off the last part, and then you could specify the action to take if the key isn't found (such as looking it up in a parent, for example).
Pretty cool. But with my proposal, you could write that in Arc itself.
My solution is general... it works on all compound data types: tables, input, lists, etc.
Imagine being able to write cons, car, cdr, instring, peekc, readc, and more, all in Arc! It drastically cuts down on the number of primitives Arc needs.
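For example, cons cells are just two slots behind a getter, so they could be sketched like this (mcons etc. to avoid clobbering the builtins; mutation omitted):

(def mcons (a b)
  (annotate 'cons
    (fn (m)
      (case m     ; pg-Arc's case quotes its keys implicitly
        car a
        cdr b))))

(def mcar (p) (rep.p 'car))
(def mcdr (p) (rep.p 'cdr))

(mcar (mcons 1 2)) then returns 1, with no cons primitive in sight.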
"C style languages (including JavaScript) make `switch` so awful that you end up using if/else chains. A missed opportunity, to be sure. Arc does a better job, which is good."
In languages derived from C, including JavaScript, they use `switch` rather than `case`, like so:
; Arc
(case x
  "foo" ...
  "bar" ...)

// JavaScript
switch (x) {
  case "foo":
    ...
    break;
  case "bar":
    ...
    break;
}
As you can see, switch blocks are atrocious, especially because they require a break; statement or they'll fall through. So instead of using switch, some developers prefer if/else chains:
if (x === "foo") {
...
} else if (x === "bar") {
...
}
Which, as you can see, are less verbose. Arc gets it right, JavaScript doesn't.
I'm going to argue that you're comparing apples to oranges to pears.
Arc and JavaScript case functions are not expected to do the same thing. JS does not allow expressions for input arguments, while Arc does. And that break; statement is a feature that some would say Arc lacks: what if you want your case statement to fall through? Now JavaScript is golden and one could say Arc is lacking.
I also don't believe developers, generally speaking, prefer if/else chains over switch/case statements as they are different tools intended for different purposes.
Neither is perfect; each is missing features that the other has.
I think Clojure got it right having it all: case + cond + condp.
On the contrary, switch should not have a break statement! It should have a fallthru; or continue; statement. Why make the common case difficult? This should be uncontroversial... it's been well established that switch's design is poor, and they should have made it explicit fallthru (like with a continue statement) rather than implicit.
As for Arc "lacking" fallthru... can't you implement that with continuations? That would be the equivalent of explicit fallthru.
---
My point was that switch statements are so atrocious that even in the situations that they were intended for, some developers still prefer if/else chains because they're shorter and more readable.
I think that's because the above semantics correspond more directly to assembly language, which I imagine was done because "switch" was defined back when that was either important or just not seen to be bad. (Perhaps I'm just making that up, though.) Here's how a switch statement would look in x64 assembly:
switch:
    ; put desired thing in rax, let's say
case_1:
    cmp rax, val_1
    jne case_2          ; jump if not equal
    <case 1 code>
case_2:
    cmp rax, val_2
    jne case_3
    <case 2 code>
case_3:
    cmp rax, val_3
    jne done
    <case 3 code>
done:
    <whatever>
By default, the machine will just plow through the remaining cases. If you want it to break out after one case, you have to tell it to do so:
switch:
    ; put desired thing in rax, let's say
case_1:
    cmp rax, val_1
    jne case_2          ; jump if not equal
    <case 1 code>
    jmp done            ; jump.
case_2:
    cmp rax, val_2
    jne case_3
    <case 2 code>
    jmp done
case_3:
    cmp rax, val_3
    jne done
    <case 3 code>
done:
    <whatever>
Assembly language is kind of awesome, by the way. And so is the Miller-Rabin primality test. Some disorganized code: http://pastebin.com/raw.php?i=wRyQ2NAx
Fine. That's great and all for C, but I dislike how Java copied C word-for-word, and then JavaScript copied Java. It would have been nice if they had said, "hm... using continue rather than break would make a lot more sense."
It's not a huge deal, and developers can simply avoid switch if they don't like it. My point was merely that "JavaScript's switch sucks, Arc's (case) is awesome."
> JavaScript allows arbitrary expressions in switch statements
That's right... looks like it was the test data I was thinking of, which isn't entirely relevant, given you can accomplish the same thing in the end.
> Why make the common case difficult?
A fall-through could be nicer than many breaks. Trying to think of all the scenarios required to make this usable, i.e., do you need a break to get out of a fall-through? Maybe that's why they just went with break. Either way, it's a good idea.
The current behavior for switch is simple: it will continue until it finds a break statement. Thus:
switch (true) {
  case true:
    console.log(1);
  case true:
    console.log(2);
  case true:
    console.log(3);
    break;
  case true:
    console.log(4);
}
...which outputs 1, 2, and 3. If you wanted to have the same behavior using the continue statement, you would use this:
switch (true) {
  case true:
    console.log(1);
    continue;
  case true:
    console.log(2);
    continue;
  case true:
    console.log(3);
  case true:
    console.log(4);
}
As far as I know, this would have exactly the same power as using break; but would be much better suited for the very common case of not wanting fallthru. I think it's much more readable, too: makes it very obvious where it falls through. And it's less error prone... no more strange bugs if you accidentally forget to add in a break; statement.
In fact, in my years and years of programming in JavaScript, I have used if/else blocks many many times (some of which could have been switch statements), and only wanted fallthru a handful of times. I think allowing fallthru is fine, but it should be made explicit, rather than implicit.
The only possible argument I can think of in favor of implicit fallthru is using a switch inside a `for` loop:
for (var i = 0; i < array.length; i += 1) {
  switch (array[i]) {
    case "foo":
      continue;
  }
}
...but that's actually ridiculous because I'm fairly sure that breaking out of a loop is far more common than continuing. In which case explicit fallthru would be better with loops.
Oh, that makes a bit more sense. Technically speaking, Arc's case expression is sorta like guards, though, in the sense of dispatching to different code depending on a condition. You're right that wart's :case is essentially a super nice version of `case`.
Also, what's wrong with `switch` whining? :P This is a JavaScript forum, right? Oh wait...
"because it's a function call, it implies that the value is created dynamically (could change at runtime), when in fact the three ports are constants."
"So here's my plan for the three ports in PyArc: [...] the three variables can be overwritten, allowing Arc programs to redirect the three ports."
Let's take a look at overwriting these "constants" in pg-Arc:
arc> (let s (outstring) (w/stdout s (is s (stdout))))
t
arc> (is (stderr) (stdout))
nil
arc> (do (stderr:stdout) (is (stderr) (stdout)))
t
"A stream is simply a function that returns successive values."
Then how do you peek at the first character of a stream without consuming it? How do you read bytes?
Trick questions, actually. You can still have old-fashioned 'peekc and 'readb even if input streams can also be called as functions. :)
---
"To make defining streams easier, I plan to support a (yield) expression, that works similarly to Python's yield, except it avoids the complicated distinction between functions, generators, and iterators"
I have nothing against that approach, but I think it might be easier to implement continuations, at which point a) people can implement their own generator syntaxes, and b) people sometimes won't want mutation-based iteration at all, since (Scheme) continuations traditionally preserve only the "pure" part of the control flow.
If you're not thinking about implementing full continuations, you might be thinking about just doing CPS (continuation-passing style) transformations of just a few things like 'if and 'each. That's how I expect Python gets away with it, and it seems to be how Common Lisp programmers approximate continuations too (http://www.cliki.net/CPS). The thing is, you probably don't like the very idea of special-casing things like 'if and 'each. :) If you're making PyArc an fexpr language, then all (if ...) and (each ...) forms are examples of just one case, and that case is too general to be useful for purposes other than evaluation.
---
"Should stdin return bytes or chars?"
I think returning characters is fine. There are lots of things someone who deals in binary might like to "read": Bits, words, large fixed-size buffers, and so forth. These are rather easy things for the programmer to make on their own, unless efficiency is an issue. On the other hand, someone dealing in text almost always wants to treat a stream as a sequence of characters (or a sequence of lines), and characters would be kind of a nightmare to decode manually.
---
I'd be interested to hear if (str) is actually any more convenient than readc.str. In fact, I like to say call.foo rather than (foo), if only to reduce parentheses, but I don't know if there's a practical advantage: Occasionally it helps 'cause I end up refactoring it into call.foo.bar, and occasionally it hurts 'cause I refactor it into foo.bar, (bar:foo), (foo:bar), or (foo bar baz).
Hmm, yeah, in this case readc.foo probably isn't as useful as (foo). Since readc.foo returns a character, we probably won't ever say readc.foo.bar.
"Let's take a look at overwriting these "constants" in pg-Arc:"
Yes, but what if you could define outstring, instring, call-w/stdin, and call-w/stdout in Arc itself? What if you want to redirect to something other than a string? What if you want to redirect stdin to stdout or stderr? You can temporarily redirect them with w/stdin etc. but as far as I know, you can't globally redirect them. Thus, py-arc not only removes unnecessary parentheses, but allows more flexibility when dealing with the three ports. Even if it were possible to get the same functionality with pgArc, py-arc makes it easier.
P.S. When I said "constants" I didn't mean "doesn't change." I meant "the value isn't created dynamically at run-time." If you have a better term than "constant", I'd like to hear it. Perhaps I should have used "static" instead.
---
"Then how do you peek at the first character of a stream without consuming it? How do you read bytes?"
That's a good question! I gave it some more thought after I posted. As you said, I could define peekc and readb specially. That would actually be a good suggestion for the stream proposal: allow for peeking rather than consuming. Something like this:
(stdin 'peek) -> peek rather than read
(stdin 'line) -> read a line rather than a char
(coerce (stdin) 'byte) -> read a byte rather than a char
Since they're functions, they can accept parameters, thus you can create any API you want with iterators. This is actually similar to how objects/methods behave, except that they're functions that accept symbol arguments. So:
calling with no arguments = default behavior
calling with arguments = do something different
---
"I have nothing against that approach, but I think it might be easier to implement continuations, at which point a) people can implement their own generator syntaxes, and b) people sometimes won't want mutation-based iteration at all, since (Scheme) continuations traditionally preserve only the "pure" part of the control flow."
I would like to support proper continuations in addition to yield, with yield being a convenience. Of course, I'm not sure how to implement continuations, and likely won't get around to it for a while. In any case, I see yield as being a particular subset, rather than the ultimate solution to laziness.
---
"If you're not thinking about implementing full continuations, you might be thinking about just doing CPS (continuation-passing style) transformations of just a few things like 'if and 'each. [...] The thing is, you probably don't like the very idea of special-casing things like 'if and 'each. :)"
Yeah, I actually ran into that problem. I got yield working for simple things, but it didn't work with `each`. Python gets away with it because `for` isn't a function... it's a language construct. But in Arc, `each` expands into a function... So I may need to scrap the yield idea and do something else. In any case, I want some relatively-easy way to create streams in Arc.
---
"I think returning characters is fine."
Yeah, I had thought the same thing. Having py-arc deal with Unicode issues sounds a lot nicer, since then Arc can just deal with chars. And I decided to make it return chars because it's possible to define `readline` with only chars, but more difficult to define `readc` if all you have are lines...
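For instance, readline falls out of readc in a few lines (my-readline so as not to clobber the builtin; this assumes readc returns nil at end-of-input):

(def my-readline ((o s (stdin)))
  (whenlet c readc.s
    (tostring
      (while (and c (isnt c #\newline))
        writec.c
        (= c readc.s)))))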
"but as far as I know, you can't globally redirect them."
That's what I did with (stderr:stdout). Calling 'stderr with an argument sets (stderr) to that value permanently (or somewhat permanently, depending on whether other code also mucks around with 'stderr).
---
"I meant "the value isn't created dynamically at run-time.""
The result of (stderr) or (stdout) isn't freshly allocated or anything, it's just kept in a dynamic box (a Racket parameter) which you call to unwrap. It's just as constant as if it were kept in a variable, except that when you rebind it, you have the option to temporarily rebind it in a way that's friendly with continuations and threads.
If you replace them with regular global variables, it's probably a good idea to implement those particular variables using threading.local() somehow, just so one thread's (w/stdout ...) doesn't mess up any other threads.
Wow. In the current source for Penknife (http://github.com/rocketnia/penknife), I have an abstraction layer over input streams, which deals in tagged values of type 'fn-input. Any function that allows x!read, x!peek, and x!ready can be wrapped up as a 'fn-input, and Penknife uses this to skip comments and normalize newlines. I find it amusing our designs are so similar. :)
Your (stdin 'line) example hits upon something else I've thought about for Penknife, the ability to read different kinds of units from streams by passing the unit as a parameter. That said, the approach lost favor with me before I did anything with it, 'cause I want people to be able to define new units rather than being limited to ones that are hardcoded into the stream's implementation. The current way I imagine tackling this domain in Penknife is to have extensible global functions [lines x], [chars x], and [bytes x] which coerce arbitrary (but sensible) things into things which can be iterated upon. An input stream would be one example of a sensible x.
By the way, (coerce (stdin) 'byte) doesn't seem like a good way to read bytes if Unicode's on the scene.
---
"Of course, I'm not sure how to implement continuations, and likely won't get around to it for a while."
Adding continuations to a language implementation seems tough. I see two options:

- Piggyback on a platform that already has them, or use low-level wizardry like stack copying to force the platform to appear to have supported them all along. If you can manage this, then the code of the language implementation can stay relatively straightforward; it can be mostly the same code as you had before adding continuation support, and it can interact intuitively with the platform's existing tools and libraries.

- Implement them explicitly in the language core, for instance by calling everything in CPS or by maintaining explicit call stacks. This lets you implement continuations that have exactly the features you want, rather than just the features that come naturally from the platform.
---
"In any case, I want some relatively-easy way to create streams in Arc."
I understand if that's not as easy as you're looking for, though.
...Hey, you could implement 'yield using threads. That's probably easier, huh? XD It might have odd interactions with thread-locals and locking, but implementing 'yield with continuations probably isn't everyone's idea of intuitive either.
"...Hey, you could implement 'yield using threads. That's probably easier, huh? XD It might have odd interactions with thread-locals and locking, but implementing 'yield with continuations probably isn't everyone's idea of intuitive either."
Threads seem kinda heavy-handed, though, don't you think? In any case, I'm mostly sold on this idea for streams/iterators:
;; adapted from http://c2.com/cgi-bin/wiki?ExternalIterationUsingContinuations
(def make-stream (stream)
  (let (yield next) nil
    (= next (fn (value)
              (ccc (fn (c)
                     (= next c)
                     (yield value)))))
    (fn args
      (ccc (fn (return)
             (= yield return)
             (apply stream next args))))))

(mac stream (parms . body)
  `(make-stream (fn ,(cons 'yield parms) ,@body)))
Note: the above is designed to work in pgArc, and would require some small changes to work in py-arc (when I get continuations working). Also, it mostly works, but there's still one bug I need to sort out.
---
Basically, `make-stream` accepts a function, whose first argument behaves like `yield`:
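The intended usage is something like this (a made-up example, modulo the bug I mentioned):

(= nums (stream (start)
          (let n start
            (while t
              (yield n)
              (++ n)))))

(nums 0)  ; -> 0
(nums)    ; -> 1
(nums)    ; -> 2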
I'm amazed at how short your definitions are, but I see two problems, not necessarily counting bugs: First, your generators can't yield nil without appearing to terminate. Second, it seems like the created iterators will take arguments the first time they're called and ignore them thereafter, which makes looping over them a bit more of a pain since the initial loop is treated differently from the others.
I have a suggestion we can both benefit from: Forget generator parameters altogether. While it's tempting to emulate Python and C# with a definition form like this,
(gen foo (a b c)
  (each elem b
    yield.a yield.elem yield.c))
it's probably simpler to implement, just as useful, and just as flexible to have a parameterless syntax like this one:
(def foo (a b c)
  (accum-later:each elem b
    yield.a yield.elem yield.c))
Incidentally, 'yield is a good default name for the 'accum variable. I almost always say (accum acc ...), and I've wanted to make an anaphoric version for a long time, but I've never been comfortable having two related names that look like they abbreviate the same word. :-p
"First, your generators can't yield nil without appearing to terminate"
Yeah, I know. I considered that an okay tradeoff, since they're designed to be... well, streams, you know? If necessary, you could wrap the returned values in a list, letting the caller know whether it's nil the value, or nil the terminator.
---
"Second, it seems like the created iterators will take arguments the first time they're called and ignore them thereafter, which makes looping over them a bit more of a pain since the initial loop is treated differently from the others."
"I have a suggestion we can both benefit from: Forget generator parameters altogether."
How is that clearer, shorter, or overall better? Looks clunky to me. To be more specific, what do we gain from it, aside from (as you claim) it being easier to implement?
Just simplicity. My question is, what do we gain from having these parameters?
What's an iterator good for, except to iterate over? There are only three everyday utilities I see using these things with: 'each, 'all, and 'some--and 'each and 'all can be implemented in terms of 'some.
None of those utilities has a spot to provide extra arguments to the Iterable when making its Iterator (to use the Java terms), or a spot to provide extra arguments to the Iterator when getting each of the elements. That doesn't mean we need to add these spots.
You could go backwards... or reset it to the beginning, or send it values, a la coroutines. Or then there's peek vs. consume, etc.
I see streams as a generalization of iterators. They can be used to create simple iterators, yes, but they can also provide additional features that iterators don't (usually) have.
It's okay that functions like `each` can handle iterators without needing to pass arguments: that's a good thing. In fact, I considered it to be pretty important that passing it no arguments would do a "default action" like retrieve the next element.
---
What do we gain from having the parameters? I believe more extensibility. In fact, here's an idea that I was juggling around: prototypes (surprised?). Basically, it could work something like this:
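(A rough sketch of the desugared idea, with the 'fail wiring written as a direct delegation and all names purely illustrative:)

; bar handles the 'hello "method" itself and delegates any other
; message to its prototype, foo
(= bar (fn (x . args)
         (if (is x 'hello)
             (+ "hello " (car args))
             (apply foo x args))))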
What's going on here? Basically, a prototype is a function that can selectively delegate to another function by calling the `fail` continuation. I note that this is similar to rocketnia's idea of :fail or somesuch.
All this could be wrapped up in a macro or two, making it quite simple and short:
(proto bar (foo y)
  'hello (+ "hello " y))
Anyways, when a proto calls `fail`, it then calls the prototype (in this case foo) with the same arguments. It can then do whatever it likes. Why is this good?
Well, for starters, it unifies the concepts of objects, classes, methods, functions, and prototypes... Secondly, it can give us more extensibility. Specifically, how do we easily extend all functions of one kind without affecting other kinds? And what if we want to extend a particular function without affecting other similar functions?
One solution would be to create prototypical functions, that act as "bases" that other functions can build off of. Consider the case of stdin, which would have the 'char and 'peek "methods":
(def stdin (x)
  (case x
    peek ...
    char ...))

(Arc's case quotes its keys implicitly, so peek and char are written unquoted.)
If you want your function to behave like stdin, you could define such methods in your function, but using stdin as the prototype:
(proto my-stdin (stdin)
  'peek ...
  'char ...)
Now, what happens is, if you call (my-stdin 'peek) or (my-stdin 'char) it will behave normally. But... what if you want to add an 'xml-node method to all the input functions? Simple, just extend stdin:
(extend stdin (x) (is x 'xml-node)
...)
Because my-stdin inherits from stdin, it'll see the new method automatically! Oh my, this is looking an awful lot like OOP! Indeed, except rather than using objects, methods, classes, etc... it's all just functions. Functions can inherit from any other function. Also, it's a somewhat strange form of OOP, since it uses prototypes rather than more traditional classes.
Note: this methodology would only be useful in the cases where you want to make a certain "type" of function behave differently. Obviously this won't apply to all circumstances, so ordinary functional programming is still good.
I no longer have any doubt that you're going somewhere interesting with this stuff. ^_^ Funny, I woke up this morning out of a dream in which someone online had started making something very similar to Penknife, and here you are, taking at least some degree of inspiration from failcall.
(Actually, the codebase I dreamed about was written in Java, was organized almost exactly the same way as my Arc code (probably thanks to dream magic), and was actually a language for distributed computing rather than just for extensible syntax. If you hit all those marks, then I'd be outright scared of you, lol.)
---
"it unifies the concepts of objects, classes, methods, functions, and prototypes"
A few years ago I was working with someone who used the phrase "add an else-if" when talking about almost any new feature. But between 'extend, failcall, prototype inheritance, scope shadowing, and operator precedence, it turns out we have an awful lot of sophisticated terms and frameworks for what amounts to adding an else-if. ^_^
"(Actually, the codebase I dreamed about was written in Java, was organized almost exactly the same way as my Arc code (probably thanks to dream magic), and was actually a language for distributed computing rather than just for extensible syntax. If you hit all those marks, then I'd be outright scared of you, lol.)"
Sorry, I'm not going anywhere close to Java code. You must be thinking of some other magic person. :P
Okay, let me put it like this. You want a function that behaves like stdin, so calling (peekc my-stdin) will work. But (readc my-stdin) needs to work too... and possibly (readb) as well. Which means that those functions need to extract internal details out of my-stdin, in order to return a useful value.
The problem is that we want my-stdin to behave differently in different situations, but on the surface it just looks like an ordinary function call, so how do the various functions above get at the internal stuff in my-stdin?
This provides a solution: you call the function and tell it what data you want, and then the function provides that data. Another problem with things like (readc) is that it's all-or-nothing: either you extend it so it changes behavior for all input functions, or you jump through some clunky hoops to make it only work on a subset... but even that's not very extensible.
But with prototypes, you get completely fine-grained control. You can extend an individual function, or extend that function's prototype, or extend that prototype's prototype, and so on. This not only gives a way to expose the function's internal details, but also allows for fine-grained extensibility.
Rather than extending something based on its `type` (cons, string, fn, etc.), we can extend a function based on its prototype. The functions still have a type of 'fn, but we get fine-grained control to say "these kinds of functions behave differently in these situations".
"The problem is that we want my-stdin to behave differently in different situations, but on the surface it just looks like an ordinary function call, so how do the various functions above get at the internal stuff in my-stdin?"
In Anarki, here's an easy way to go (if not an easy way to get all the way to what each of us might want):
(defcall input (self)
readc.self)
With this in place, we can say (my-stream) to read a character, but all the other things we can do with an input stream are still available.
---
One random thing I'm not comfortable with about the (my-stream 'peek) approach is the fact that it uses a symbol, 'peek, which might mean different things to different libraries. Obviously 'peek itself would be toward the core, but I could easily imagine two pull parsers trying to use 'xml-node.
---
"Another problem with things like (readc) is that it's all-or-nothing: either you extend it so it changes behavior for all input functions, or you jump through some clunky hoops to make it only work on a subset... but even that's not very extensible."
Could you elaborate on that somehow? If I "extend" 'readc, I'm probably only changing it to support a whole new range of stream types, not (intentionally) modifying the behavior for existing ones. It shouldn't be too clunky to determine whether something's in that new range, and I don't have a clue what part of this strategy you're saying is less extensible than your idea.
---
"You can extend an individual function, or extend that function's prototype, or extend that function's prototype, etc."
I know what you're talking about there, at least. ^_^ It's interesting to note that both of our approaches put it on the API designer's shoulders to make sure the system is well-factored into extension points. My approach, with direct use of failcall, and with rulebooks rather than deep dependency chains[1], would promote patterns like this:
rulebook.readb
; To simplify the example, we'll assume readc itself isn't a rulebook.
[fun readc [str]
  [let first-byte rely:readb.str
    ...]]
; The use of the "rely" macro above makes readc fail if readb.str
; fails. The above definition of readc is roughly equivalent to this:
;
; [= readc [failfn fail [str]
; [let first-byte [failcall fail readb.str]
; ...]]]
;
; The "fun" macro introduces the "fail" variable anaphorically, and
; the "rely" macro expands to refer to it.
;
; Then "failcall" itself is a macro. It parses its second argument
; as an expression, then tries to unwrap it into a function call
; (potentially causing an error), then wraps it back up as a function
; call with a particular fail parameter subexpression. I'm still
; thinking over this design.
;
; Note that most of this stuff is still just a sketch right now.
Certain rulebooks could have rules that acted as inheritance, in your sense:
rulebook.my-plus
[rule* my-plus [] args
  my-plus/+  ; This is the name of the rule.
  [rely:apply + args]]
I expect this to be a common design pattern, common enough that I might end up with macros that capture patterns similar to some of your prototype system's patterns. In fact, depending on how I settle on implementing rulebooks (which is up in the air thanks to module difficulties), that pattern might already be easy to accomplish by going to a lower level.
[1] By "deep depencency chains," I mean that I assume you're talking about having patterns whereby A is the prototype of B, B is the prototype of C, C is the prototype of D, and people only ever use D most of the time. (A, B, and C might have longer names.) The rulebook equivalent would be to have a rulebook E which keeps track of A, B, C, and D. I think rulebooks probably make it easier to forget about the precedence order of different cases (for better or worse), while also making it possible to add and remove cases without having to rewire their neighbors (say, possible to remove B without fiddling with C's prototype).
"With this in place, we can say (my-stream) to read a character, but all the other things we can do with an input stream are still available."
Yeah, but that doesn't help if you want to extend readc or peekc so they understand your new stream. You need to extend readc and peekc directly. And then how do those functions get at the data they need?
---
"One random thing I'm not comfortable with about the (my-stream 'peek) approach is the fact that it uses a symbol, 'peek, which might mean different things to different libraries. Obviously 'peek itself would be toward the core, but I could easily imagine two pull parsers trying to use 'xml-node."
That is true, but how is that any more arbitrary than saying that the global functions peekc, readc, etc. are reserved for input streams? The symbol 'peek simply means whatever the function wants it to mean. Thus, a function that behaves like an input stream can be called like an input stream. Duck typing.
---
"Could you elaborate on that somehow? If I "extend" 'readc, I'm probably only changing it to support a whole new range of stream types, not (intentionally) modifying the behavior for existing ones. It shouldn't be too clunky to determine whether something's in that new range, and I don't have a clue what part of this strategy you're saying is less extensible than your idea."
Why not modify the behavior of existing ones? In fact, readline does that right now, by relying on readc. Thus, if your input supports readc, it supports readline too.
And it's not just readc either; it could be other functions--it's just that this discussion happens to be about input streams. So, yeah, it's possible to do it with the current model, but I think it's clunkier and leads to more (and uglier) code.
Okay... let me try to explain... I want to allow Arc users to create new data types, like input streams. Right now, this is possible only by extending built-in functions like readc, peekc, etc. Something like this:
(extend readc (x) (isa x 'my-data-type)
...)
...but now you end up needing to define a new data type, and you need to extend all the built-in functions individually. You can avoid that by using coerce...
(def readc (x)
  (do-something (coerce x 'input)))
...the problem is the "do-something" part. Even after readc coerces your custom input type to 'input, how does it extract the data it needs? How does it read a character? How does readc access the internals of your data type? You could special-case it to be a function call, but then what about peek? What about reading bytes?
What I am proposing is a way to solve both problems in one fell swoop: functions are typed not based on their `type`, but based on the pseudo-methods they contain. Duck typing. And by calling these "methods", functions (like readc) can extract the data they need. Now, you can define readc like this:
(def readc (x)
(x 'char))
And now any function that supports 'char automatically gets access to readc without needing to extend it. What if a different function also uses 'char? No problem, it works too! Thus, the functions don't care about the type of their argument, they only care that it supports the methods they need to do their job. And by leveraging prototypes, it lets you extend the scope chain at any point, giving more fine-grained control than simply whether data is a string or a cons or whatever.
By the way, you can combine this approach with conventional types as well:
(def readc (x)
  ((coerce x 'input) 'char))
The above is stricter, because it not only expects the data to be coercible to 'input, but also expects it to have a 'char "method". This could potentially handle your use case about conflicting libraries using the same method names: you can use the function's actual `type` to differentiate between them.
---
Random side note: this of course means that I can define tables so that they are functions of type 'table that have a 'get and 'set method. Then you can create a custom table just by creating an annotated function that has 'get and 'set. And then `apply` could be designed so when you call a table with (foo 'bar), it calls (foo 'get 'bar) instead.
This makes it super-easy for Arc code to define custom table types. Or custom input types. Or custom any-type, really, since all composite data types can be represented as functions with methods.
This is what I'm trying to explain. If you have a table called foo, then calling (foo 'bar) is a "get" action, and calling (= (foo 'bar) 5) is a "set" action. foo is a single data-type, it is self-contained, but it has two different behaviors in two different areas.
If you want to emulate that, you end up needing to jump through some hoops, like the following:
(def my-table ()
  (obj get (fn (n) ...)
       set (fn (n v) ...)))
...and now you have the fun of extending sref and apply (and probably eval as well) so they behave properly with your custom table. Remember my prototype post a while back? I couldn't even get it to work in pgArc. What a pain. So much easier if all your function needs to do is implement the required methods, and everything just works without needing to extend anything.
And this "functions with methods" idea is applicable to other data types as well, like input streams. So for the same reasons that creating custom table types are a massive pain, creating custom input types are a massive pain. But this proposal makes it super easy.
---
"My approach, with direct use of failcall, and with rulebooks rather than deep dependency chains[1], would promote patterns like this:"
By the way, I would expect my proposal to be completely compatible with rulebooks as well... the only requirement is that the call (foo) return a char, and (foo 'peek) return a char, but not consume the stream. It's completely up to foo how it handles delegation and/or inheritance. I merely provided one potential way: prototypes.
But the proposal works even with ordinary functions that aren't prototypes. In fact, it doesn't even need to be functions per se; for instance, I would expect the following to work:
(= foo (obj char #\f))
(readc foo) -> #\f
Neat. Since foo is a table, and (foo 'char) returns #\f, readc was able to use it. This, of course, wouldn't work in the strict version of readc, which tries to coerce its argument to 'input. But this would work:
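(Sketching what that might look like, assuming Anarki's 'extend and a coerce signature of (x totype . args):)

; teach the strict readc that any table counts as an 'input
(extend coerce (x totype . rest) (and (is totype 'input) (isa x 'table))
  x)

; now (readc foo) -> #\f even for the strict version, since the
; "coerced" foo still answers (foo 'char)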
"By "deep depencency chains," I mean that I assume you're talking about having patterns whereby A is the prototype of B, B is the prototype of C, C is the prototype of D, and people only ever use D most of the time. (A, B, and C might have longer names.)"
It could be that way, but shallower trees are also quite possible, and in fact I expect those would be the norm. My suggestion to authors of functions would be to only increase the depth of the tree as needed.
"Yeah, but that doesn't help if you want to extend readc or peekc so they understand your new stream. You need to extend readc and peekc directly."
Sounds like you're solving your own problem. :)
"And then how do those functions get at the data they need?"
Why hide it?
Anyway, I'd put the extensions of 'readc and 'peekc in the same place as the rest of my code that dealt in the concrete details of my custom stream type. That way, I can pretend in all my other code that the data is strictly encapsulated, and when I do change the implementation, everything I need to refactor is in the same page or two of code.
---
"That is true, but how is that any more arbitrary than saying that the global functions peekc, readc, etc. are reserved for input streams?"
If you're saying the symbol 'peek is just as arbitrary as the global variable name 'peekc, I agree, but global variables are the more likely thing to be handled by a namespace system. :) If that's not what you're saying, whoops.
---
"Why not modify the behavior of existing ones? In fact, readline does that right now, by relying on readc. Thus, if your input supports readc, it supports readline too."
Huh? Using 'readline doesn't change what happens the next time you use 'readc. I think we're having some word failures here.
Maybe what you mean by "modify" is more of a pure functional thing, where you "change" a list by removing its first element when you call 'cdr. But then I still don't understand what you meant in your original statement that "Another problem with things like (readc) is that it's all-or-nothing."
"you need to extend all the built-in functions individually . You can avoid that by using coerce..."
Darn 'coerce, always enticing people to use it. ^_^ It's actually possible to use a coercion pattern here, but you'll need to check whether you can coerce at all, and go on to other extensions if not. (This is something failcall and rulebooks make convenient.) However, to create a value of type 'input, you still need to specify all the reading and peeking behavior somewhere, and I prefer to specify those behaviors in separate pieces of ravioli, in this case by extending each function individually.
"Even after readc coerces your custom input type to 'input, how does it extract the data it needs?"
Exactly, I wouldn't turn something into an 'input and then turn it back; by extending 'readc directly, I'd preempt the coercion step.
To digress a bit, coercion is a positively useful design pattern when there's more than one sensible set of "axioms" for a set of utilities. If I see utilities A1, A2, B1, B2, B3, and B4, and I notice that the B's can be implemented in terms of the A's, then I can do the following:
1. Write a function C of the form "Take any value, and return a similar value that supports all the A's." I call this a coercion function, but I don't expect everyone to agree with that. ^_^
2. Extend the B's so that they try using C. (If there's no standard way to "try" using a function, such as failcall, then C needs to indicate failure in its own way, or there needs to be another function D that people call to see if they can call C.)
3. Sometimes it's useful (if uninspired) to create a boilerplate concrete type for C to return when there's no better way to make the right A-savvy value. This type tends to take the form of a wrapper containing a function to call for the A1 behavior and a function to call for the A2 behavior. Obviously, the A's should be extended to support this type; that's the only point of it.
After this, if I ever want to extend the B's, there's a good chance I can do it by finding (or making) something that extends the A's instead, and then extending C to do the conversion. After a certain initial cost (making C and C's general-purpose return type), this eventually becomes a net decrease in the number of extensions needed.
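Here's roughly how those three steps might look in Arc. Every name is hypothetical: 'my-readc plays an A, 'my-readline a B, and 'as-input is the coercion function C.

; step 3: a boilerplate type for C to fall back on; it just wraps
; a "next char" thunk (which should return nil at end of input)
(def make-basic-input (char-thunk)
  (annotate 'basic-input char-thunk))

; an A, which knows only about the boilerplate type (and could be
; extended for other types)
(def my-readc (x)
  (if (isa x 'basic-input)
      ((rep x))
      (err "my-readc: unsupported value")))

; step 1: C itself, returning nil when it can't coerce
(def as-input (x)
  (when (isa x 'basic-input) x))

; step 2: a B written purely against C and the A's
(def my-readline (x)
  (iflet s (as-input x)
    (string:accum acc
      (whilet c (check (my-readc s) [isnt _ #\newline])
        (acc c)))))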
...And I've run out of time. I'll get to the rest of your post later! ...And your followups someday. XD
I would like to point out that prototypes and methods are completely separate subjects that serve a similar purpose (extensibility) in completely separate ways. Perhaps I shouldn't have discussed them in the same post.
Methods solve the problem of polymorphism, namely the ability to easily define new types (and new instances of existing types) that can intermesh with existing code (like the built-ins readc and peekc). They do this with duck typing: if the function supports the method, just use it.
This can be augmented by a technique that I've come to like: coercion. Rather than saying "my argument needs to be of type foo", the function just coerces to type 'foo. If it can't be coerced, it will automatically throw an error.
The reason I like the coerce approach is because it means you can easily create completely new types. So if you create a new 'byte type, you can extend coerce so it can be coerced into 'char, 'int, 'num, etc. and existing code will work with it.
The reason I like the method approach is that it makes it easy to create new instances of existing data types. Like I mentioned, it makes it easy to create custom 'table types. It also maximizes the benefit of prototypes, and in some cases allows completely new types to piggyback off of existing functions, which is made easy with prototypes.
The reason I call it "more extensible" is for the exact same reason I call the coerce approach more extensible. With the coerce approach, the function doesn't care what type it is, it only cares that it's possible to coerce it to the type it wants.
In the case of methods, the function doesn't care what type it is, it only cares that it's possible to extract the data it needs, using the function's methods.
---
Prototypes, however, serve a different purpose. Specifically, they try to solve the problem where you sometimes want to extend a particular function, and sometimes want to extend many functions at once. They're designed to give fine-grained control beyond merely what the type is. Also, by letting one function serve as a "base" for other functions, they try to reduce duplicate code.
All three concepts are designed to enhance extensibility, but they do so from different angles, and in different ways. You'll note that all three attempt to achieve extensibility by ignoring what the type of something is. Instead, they focus on either the desired type, the desired functionality, or the desired ancestor.
The three combine to form a coherent whole, which I think is as it should be.
Perhaps I should write up a post comparing the two approaches (prototypes/methods vs. pure functions). Depending on the results, that could either convince me that my idea is silly, or convince you guys that it has merit.
Coercion means you only need to define coercion rules in one spot, and existing code can Just Work.
Methods mean you only need to define a method that implements the functionality, and existing code can Just Work.
---
Prototypes get rid of the traditional concept of type entirely. Rather than saying "that's an input stream" or "that's an object of type input" we instead say "that's a function that inherits from stdin."
Of course, I don't plan to get rid of types, since I think they're useful (especially with coercion), but at the same time, I think having more fine-grained control can be useful, so I think there's some synergy between types and prototypes.
"That's what I did with (stderr:stdout). Calling 'stderr with an argument sets (stderr) to that value permanently (or somewhat permanently, depending on whether other code also mucks around with 'stderr)."
Huh, I didn't know that.
---
"If you replace them with regular global variables, it's probably a good idea to implement those particular variables using threading.local() somehow, just so one thread's (w/stdout ...) doesn't mess up any other threads."
Right, threads. I hadn't given any thought to them (they're not implemented yet, either). I'll have to revisit this issue later, when I actually implement threads.
---
"By the way, (coerce (stdin) 'byte) doesn't seem like a good way to read bytes if Unicode's on the scene."
Hm... I guess it depends on what you expect it to return. If it's a Unicode character that can be represented as a single byte (in UTF-8), then just returning that should be fine. It's trickier if the code point is represented as more than one byte, but wouldn't that still be possible, assuming the calling code expects multiple bytes? Of course, there's always the fallback of (readb) or (stdin 'byte).
---
"Implement them explicitly in the language core, for instance by calling everything in CPS or by maintaining explicit call stacks. This lets you implement continuations that have exactly the features you want, rather than just the features that come naturally from the platform."
I was leaning toward that approach (everything CPS), but I'm still mulling it over. Python doesn't have TCO, though, so wouldn't CPS exceed the recursion limit in Python, unless I implement a trampoline? In which case things will get even slower, hurrah.
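For what it's worth, the trampoline part is tiny. Here's the idea sketched in Arc (hypothetical names throughout): CPS'd code returns a tagged thunk instead of making a tail call, and a driver loop keeps forcing thunks, so the stack never grows.

(def bounce (thunk)
  (annotate 'bounce thunk))

; the driver: keep forcing thunks until a real value comes out
(def trampoline (x)
  (while (isa x 'bounce)
    (= x ((rep x))))
  x)

; a CPS factorial where both the recursive call and the
; continuation call bounce, keeping stack depth constant
(def fact-k (n k)
  (if (is n 0)
      (k 1)
      (bounce (fn ()
                (fact-k (- n 1)
                        (fn (v) (bounce (fn () (k (* n v))))))))))

; (trampoline (fact-k 10000 idfn)) runs without deep recursion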
"Adding continuations to a language implementation seems tough."
The more I think about it, the more I feel like implementing my interpreter in Ruby, which has call/cc, and proper lambdas (unlike Python's terrible ones). And then, I could name it Arubic. Awesome name, right?
Pretty awesome, yeah. XD It would actually make it less useful to me personally though, since my cheap web host doesn't support Ruby. >.>
Anyhow, I think Ruby's continuation support is a bit sketchy. There's no equivalent of Scheme's 'dynamic-wind built into the language as far as I know, so it may be necessary to use a patch like this one, which replaces callcc with a dynamic-wind-aware equivalent: https://github.com/mame/dynamicwind
Since there's no dynamic-wind, I haven't looked to see if there's a way to do continuation-friendly dynamic scope, like Racket's parameters. If that callcc replacement works though, it should be possible to implement parameters on top of Ruby, just maybe not that efficiently. :-p
Racket does a bunch of other interesting stuff with continuations, such as continuation marks, continuation guards, and... maybe that's it... and we'd lose those, but I think they're mainly useful for performance and well-specified 'letrec behavior, and who cares about that stuff? :-p
You might also find writing an Arc interpreter in Racket or even Arc3.1 a lot of fun:
- you'll get continuations, threads, and parameters that actually work
- you'll be able to easily implement things like fexprs, first-class macros, or serializable closures in your interpreter that Arc3.1 doesn't have (hard to do in a compiler)
- you can write the parts of your program that you want to run fast (and that don't need your extra interpreter features) in Arc3.1, and other parts using your Arc interpreter, and it can all work together
Yeah, I also considered that, or Chicken Scheme, or Common Lisp, but honestly... as much as I may like Lisp... Scheme and Common Lisp just feel... bloated. I love Arc's shortness and simplicity, and Python has it too, I just hate the way Python behaves sometimes (curse Python's lambdas and lack of proper closures).
I admit that would certainly be an easier choice, in some ways. I do like the idea of writing an Arc interpreter in Arc... but from my initial tests, Arc seems quite slow with string/list processing. Maybe it was just my program. Isn't that already being handled with arcc, though?
Ah, if by "arcc" you're referring to https://github.com/nex3/arc/tree/arcc, that's an Arc compiler (replacement for ac.scm) written in Arc. Thus while it produces code that runs faster than interpreting code would, it has the same limitations as ac.scm.
"Arc seems quite slow with string/list processing"
Do you have an example handy? ("Here's a program in Python, and here's the program in Arc, and see how Arc is slower"). ...There may be things we can do to speed Arc up, but having a runnable example is useful. I.e., I might do something that makes my Arc program run faster, but it might not necessarily make your Arc program run faster.
Oh, at first glance it seemed to behave somewhat like an interpreter, my mistake.
---
Yeah, I do, actually. I wrote a program that can take an (ad-hoc and simple) playlist format and convert it into .xspf. I then rewrote the program in Arc (which was much nicer than writing it in Python), but it ended up being drastically slower. Which means I'm not sure if it's a problem in Arc, or Racket, or my program! It could be any combination of those three.
The program itself should have O(n^2) complexity, due to error checking (which actually isn't even implemented in the Arc version...). If I got rid of error checking, I could get it down to O(n log n), but I don't want to do that.
In any case, the complexity should be the same for both Python and Arc. If I recall, the slowness was primarily caused by me searching for whether one string is a substring of another string. Python has string.find (which performs faster than regexps, when I tried it), but I'm guessing Arc's implementation is slow.
This is all wishy-washy guessing, though. I haven't done concrete tests or anything, so take it with a grain of salt. However, I'm vaguely interested in finding out what the performance bottleneck is, and possibly fixing it.
---
Edit: I just tried these:
# Python
for item in range(100000):
    "foobarquxcorge".find("foobar") != -1
; Arc
(repeat 100000 (posmatch "foobar" "foobarquxcorge"))
And got some very weird results... the Python version consistently takes about 2 seconds. When I first tried the Arc version, it took something like a minute or two. Aha! So that's the bottleneck? Not so fast. I then tried it a second time, and it took ~5 seconds... slower than Python, but not by too much.
It seems pgArc's performance isn't very reliable. I've noticed sometimes at the REPL that running the same (or similar) code will sometimes be instantaneous, and other times it'll chug along for a few seconds. I don't think it should be taking 1-2 minutes for something that should be taking 5 seconds, though.
However, my program pretty consistently took much longer than it does in Python, so I think I can safely rule out posmatch, which actually seems quite fast (almost as fast as Python, anyways).
There's room for improvement in posmatch, but it doesn't seem to be the smoking gun (at least not yet). For fun, I tried this:
import re

for item in range(100000):
    re.search("foobar", "foobarquxcorge")
It took ~10 seconds, so as I said, regexps are slower than find, in Python. Possibly because it has to create a match object, rather than just returning a number? I don't know.
Times will be variable because of garbage collection; that's normal. But having your posmatch example take a minute is very weird; I've never seen anything like that happen.
I'm actually surprised to hear that posmatch is almost as fast as Python's find. Python's find, after all, isn't written in Python (like how Arc's posmatch is written in Arc), it's written in C.
If you use a regex more than once you'll want to compile it. Both Python and Racket have this capability.
I've sometimes experienced extremely long delays with DrRacket (I use DrRacket as an editor; I use the command-line "racket" to interact with Arc, and haven't observed these delays with racket). Perhaps Pauan is using DrRacket? And as for why that happens... I think it was a) trying to complete a garbage collection with b) a large part of its memory put into swap space. I'm puzzled by how it could take so long to un-swap that much memory (max 360-ish MB)... perhaps it did it in an impressively inefficient way (like, load page 1 containing a certain cons cell, then seek the disk for page 25359 containing its car, then seek back to page 2 for the cdr, and so on). Also, perhaps displaying the "recycling" animation contributes to the problem. Hmm, perhaps I'll figure that out at some point.
If you're not using DrRacket, then I'm not sure what might cause it, other than some unlikely things (e.g. memory of "racket" was swapped to disk and you were moving massive files around).
Meanwhile, if you really care about string-searching, or find it to be a pain point, you may want to implement the awesome Boyer-Moore algorithm: http://en.wikipedia.org/wiki/Boyer-Moore
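If anyone wants to experiment, here's a sketch of the simplified Horspool variant of Boyer-Moore in Arc. It's lightly tested at best, so treat it as a starting point, not a drop-in posmatch replacement:

; Boyer-Moore-Horspool: returns the position of pat in s, or nil
(def horspool (pat s)
  (with (m len.pat n len.s)
    (let shift (table)
      ; bad-character table: how far to slide the pattern when the
      ; text character under its last position is a given char
      (for i 0 (- m 2)
        (= (shift (pat i)) (- m 1 i)))
      (catch
        (let pos 0
          (while (<= pos (- n m))
            (let j (- m 1)
              (while (and (>= j 0) (is (s (+ pos j)) (pat j)))
                (-- j))
              (when (< j 0) (throw pos)))
            (++ pos (or (shift (s (+ pos m -1))) m))))
        nil))))

; (horspool "foobar" "quxfoobarcorge")  ; -> 3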
No, I'm just using Arc on the command line. Garbage collection was my bet, and you're right that it could have been swap as well.
---
Yeah, might actually be a good idea to integrate that into the base posmatch, but as I said, posmatch isn't actually that slow compared to Python's find, so the bottleneck is probably elsewhere.
But yes, obviously in production-level code you should be compiling/caching regexps yourself, if only on a "just-in-case" basis.
---
Also, I would like to point out that in all my time using Python, it's been very consistent as far as speed goes (no 3-second pauses), so would that imply that Python's garbage collector is more incremental than Racket's?
One possibility: Python uses reference counting, which immediately frees non-cyclic garbage as you go, plus, iirc, an occasional garbage-collection cycle to free the remaining cyclic garbage. So I'm just guessing, but if you're not creating much cyclic garbage, maybe that explains why you're not seeing many garbage-collection pauses.
By the way, one reason I'm liking the (stdin 'peek) style rather than (peekc stdin) is that it makes it easy to extend both the built-in ports (stdin, stdout, and stderr) and also makes it easy to extend user-created functions:
(extend stdin (x) (is x 'line)
  (readline stdin))

(extend stdin (x) (is x 'byte)
  (coerce (stdin) 'byte))
Though extending readline and readb would be comparable, I think both approaches have merits, in different situations.
Interesting, you're favoring (stdin 'peek) specifically because you want it to be extensible, whereas I just finished noting that it wouldn't be as extensible as I'd like. XD
You might be right that the approaches have pros and cons. However, I'm not sure what the (stdin 'peek) approach has in its favor. Here's a scenario which is kinda in favor of (peekc stdin):
Suppose someone wanted to introduce a new kind of read unit, like 'xml-node. Technically, they could extend 'stdin and redefine 'instring, and that would probably be enough most of the time. However, I think it would be even nicer to be able to read XML nodes from any stream that supports character reading. If the XML pull parser could work like that, then anyone could integrate it with their own favorite miscellaneous stream libraries, with far less glue code--specifically, without redefining everything in the stream library to support 'xml-node. However, to read XML nodes they have to involve the XML library somehow, and a (my-stream 'xml-node) interface doesn't work.
Actually, there's at least one way to force (my-stream 'xml-node) to work in this case: Somehow establish that every time a stream is called, if it has nothing better to do, it refers to a global framework and looks for a behavior there. This could just be established by convention, and a macro could make it extra convenient for people to create streams that default to the framework.
(xml-read stdin) -> reads XML nodes from stdin
(xml-read some-stream) -> reads XML nodes from some-stream
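Going back to that fallback-framework idea for a moment: it could be as small as a shared table plus a wrapper (a sketch; every name here is made up):

(= stream-framework* (table))

; wrap a raw message handler so that messages it doesn't recognize
; (signaled here by returning nil) fall through to the framework
(def framework-stream (handler)
  (afn (msg . args)
    (or (apply handler msg args)
        (aif (stream-framework* msg)
             (apply it self args)))))

; an XML library could then plug in once, for every such stream:
; (= (stream-framework* 'xml-node)
;    (fn (s) ... parse a node using (s 'char) ...))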
The differences between (readc stdin) and (stdin 'char) are small. One can be easily defined in terms of the other, so it's mostly a matter of what you want to extend.
Let's suppose you want to add a shiny new input type. Since input/output are just functions, this is easy enough to do:
(def my-stdin ()
...)
But now, let's say you wanted to have this function behave differently. There are two ways to do this: call the function with different parameters, or call a function that wraps around it, like readc.
I prefer the former because it avoids having a ton of tiny specialized functions like readc, peekc, etc. In other words, it's a question of where the functionality is encapsulated. The functional approach is, "hey, let's make a function that can then operate on different kinds of data," a la peekc, map, each, etc.
That's fine, and it's very extensible and generic in most circumstances. But in this particular case, reading or peeking a character is more specific. It only really makes sense for streams, so it makes sense to encapsulate that functionality within streams, rather than making it a generic built-in function. Hence (stdin 'char) rather than (readc stdin).
This also makes it easier to special-case behavior for a particular stream. Consider the case where you want to change the behavior of stdin, but not change the behavior of other input streams. Piece of cake:
(extend stdin (x) (is x 'char)
...)
But if you try to extend `readc`, then now you end up with stuff like this:
(extend readc (x) (is x stdin)
...)
...but that won't work if somebody overwrites stdin. So now you need to store it in a let:
(let old stdin
  (extend readc (x) (is x old)
    ...))
Basically, by passing arguments to the streams, I make it easier to extend an individual stream, but harder to extend all streams. Of course, I could still keep `readc` around, and have it defined to simply call (foo 'char), in which case we could get the best of both worlds: extending individual streams and extending globally (assuming code uses readc, rather than calling streams directly).
Oh, it also lets streams dynamically alter their behavior, without needing to extend global functions, and it makes it easier to define custom behavior for a stream, once again without needing to create more global functions, etc. etc.
I don't think it's a huge enough deal to fight over, though. As I said, the differences between the two approaches are small, and I could settle for either one. My gut says that passing arguments to streams to change their behavior is better, though. I could be wrong.
"what if you could define outstring, instring, call-w/stdin, and call-w/stdout in Arc itself? What if you want to redirect to something other than a string?"
Hmm, perhaps I should replace these definitions with a single generic set of methods that can be extended for new types.
"What would you use besides conses to represent s-expressions?"
What are s-expressions? I thought their main aspect was the way they used lists to represent user-extensible syntax forms (macro forms, function applications, fexpr applications). I'm not a fan, but even if we assume s-expressions, they could always use some other implementation of lists.
---
"I wonder if there's a way to merge quote and lambda into a single operator."
Do you mean like in PicoLisp, where all user-defined functions are lists containing their code, and applying a list causes it to be interpreted as a function?
I don't like it, 'cause you don't get any lexical scoping this way. It probably makes more sense in Factor and Joy thanks to the lack of local variables for run-time-created functions to capture.
In Common Lisp and Scheme, the syntax is pretty darn regular: Pretty much everything is (or can be) a reader macro. It's just that the string syntax (the #\" reader macro) and the list syntax (the #\( reader macro) are rather arbitrary and inconsistent with each other. It's mostly a matter of readability:
; current
(def foo (a b)
(+ "foo: " a b))
; somewhat more consistency
(def (foo ((a (b nil
((+ ("foo: " (a (b nil nil
; a somewhat consistent version for a lisp without cons cells
(4 def foo (2 a b
(4 + "foo: " a b
; an extremely consistent version, without the need for escape
; sequences or separating whitespace
(^4 |^3def |^3foo (^2 |^1a |^1b
(^4 |^1+ "^5foo: |^1a |^1b
(I put in some whitespace anyway 'cause I'm nice like that.)
The real issue here is that a syntax that stops at providing reader macros isn't arbitrary enough to meet the arbitrary demands of readability, so the individual syntaxes end up having to make the choices the core didn't. That makes for a greater number of arbitrary choices overall, and so a greater number of apparent arbitrary choices--and someone who has a similar but less arbitrary alternative in mind (regardless of whether it's possible to achieve) will see inconsistencies.
The arbitrary aspects of a language might be distracting and reduce productivity, but on the other hand they could punctuate the programming experience in an enjoyably artistic way, or even keep a programmer's attention focused while they plan their next moves. Maybe we don't always know what's best for us....
But I think we'll better know the full benefit of background music when we have convenient blank slates to compose it on. Programming's enthralling enough for me without background music anyway. ^_^ So I might as well continue to apply Occam's Razor without regret.
I don't think cons cells are a hack, but I do think it's a hack to use them for things other than sequences. Since we almost always want len(x)=len(cdr(x))+1 for cons cells, rather than len(x)=2, they aren't really useful as their own two-element collection type.
Yeah, I'm tempted to agree with you. In the arc.arc source code, pg even mentions a solution to improper lists: allow any symbol to terminate a list, rather than just nil.
Of course, an easier fix would be to change `cons` so it throws an error if the second argument isn't a cons or nil. Honestly, are improper lists useful often enough to warrant overloading the meaning of cons? We could have a separate data type for binary trees.
That's one area where I can actually agree with the article, but it has nothing to do with conses in general (when used to create proper lists), only with improper lists. And contrary to what he says, it's not an "unfixable problem"; it would probably take only 2 lines of Python code to fix.
One thing though... function argument lists:
(fn (a b . c))
Of course I can special-case this in PyArc, so . has special meaning only in the argument list. This is, in fact, what I do right now. But that may make it harder for implementations in say... Common Lisp or Scheme (assuming you're piggybacking, rather than writing a full-on parser/reader/interpreter).
If so... then you may end up with the odd situation where it's possible to create improper lists using the (a . b) syntax, but not possible to create improper lists using `cons`.
---
By the way... how about this: proper lists would have a type of 'list and improper lists would have a type of 'cons. Yes, it would break backwards compatibility, but it might be a good idea in the long run. Or we could have lists have a type of 'cons and improper lists have a type of 'pair.
I don't know if improper lists are really a problem, just hackish. :) My "solution" would be to remove the need for them by changing the rest parameter syntax (both in parameter lists and in destructuring patterns).
---
"how about this: proper lists would have a type of 'list and improper lists would have a type of 'cons."
I don't think I like the idea of something's type changing when it's modified. But then, that's mostly because I don't think the 'type function is useful on a high level; it generally seems more flexible to do (if afoo.x ...) rather than (case type.x foo ...), 'cause that way something can have more than one "type." Because of this approach, I end up using the 'type function just to identify the kind of concrete implementation a value has, and the way I think about it, the concrete implementation is whatever invariants are preserved under mutation.
That's just my take on it. Your experience with 'type may vary. ^_^
Recently, I whined a bit about Racket's namespaces and modules being too "opaque" (http://arclanguage.org/item?id=14029). This got me looking around at just how opaque they were, and I ended up figuring out how to do a lot of the stuff I wanted to do. As part of the process of figuring this stuff out, I've put together some utilities and pushed them to Anarki as lib/ns.arc. These utilities should make Racket's modules and namespaces much less of a pain to use. For instance, modules aren't exactly first-class in Racket--you have to kind of name them, attach them to module registries (which can only be interacted with by way of namespaces that share them), and then require them--but ns.arc's module creation utilities use gensyms for the module names, put them all in a single module registry, and wrap up references to the modules using a 'module tagged type, so they're effectively first-class after all.
The ns.arc utilities aren't especially well-rounded yet; there's currently no utility for actually requiring a 'module value in the current namespace, and there's no utility for getting a Racket module given a Racket module path. However, those should be easy things to add. The code covers most of the hard parts, and it should act as a good cheat sheet for implementing whatever hard parts are left over.
One especially hard part, which this doesn't begin to cover, is making it easy to compose Arc programs. ns.arc may help you manipulate Racket modules and namespaces from Arc, but it's hardly an Arc module system. The leading comment in ns.arc says it best.
Meanwhile, this adventure revealed some necessary (IMO) changes to ac.scm. The comment for the new 'arc-exec function says that best, and it's short enough to quote here:
///
"To make namespace and module handling more seamless (see lib/ns.arc), we use Racket's 'set! even for undefined variables, rather than using 'namespace-set-variable-value! for all Arc globals. This makes it possible to parameterize the value of 'current-namespace without getting odd behavior, and it makes it possible to assign to imported module variables and use assignment-aware syntax transformers (particularly those made with Racket's 'make-set!-transformer and 'make-rename-transformer).
"However, by default 'set! is disallowed when the variable is undefined, and we have to use the 'compile-allow-set!-undefined parameter to go against that default. Rather than sprinkling (parameterize ...) forms all over the code and trying to keep them in sync, we put them all in this function ['arc-exec], and we use this function instead of 'eval when executing the output of 'ac.
"In the same spirit, several other uses of 'namespace-variable-value and 'namespace-set-variable-value! have been changed to more direct versions ((set! ...) forms and direct variable references) or less direct versions (uses of full 'arc-eval) depending on how their behavior should change when a module import or syntax obstructs the original meaning of the variable. Some have instead been kept around, but surrounded by (parameterize ...) forms so they're tied the main namespace. Another utility changed in this spirit is 'bound?, which should now be able to see variables which are bound as Racket syntax."
\\\
With any luck I haven't wrecked people's code too much, and hopefully someone can figure out what I've put together and take advantage of it for some bigger purpose. ^_^
You beat me to the punch by 25 minutes. ^_^ My reply below (at http://arclanguage.org/edit?id=14084) covers the same sort of stuff as yours but in a longer and more scatterbrained way.