By the way, I just realized something... symbols, when evaluated, are basically a variable reference... but strings are self-evaluating. So if strings == symbols... then what should (+ "foo" 1 2 3) do? Should it treat the "foo" as a variable reference, or a literal? Would we then need to use '"foo" instead?
So... because symbols evaluate to variables, and strings are self-evaluating, I don't really see a way to use the same syntax for both of them. If we decide that "foo" is equivalent to 'foo then that means we cannot have variable names with weird stuff in them.
On the other hand, if we decide "foo" is equivalent to foo, then that means we can no longer use string literals without quoting them, which is pretty funky. So whichever one we choose will only be right half the time.
Thus... we would need to have a separate syntax for them, like using #s"foo" for symbols, and "foo" for strings...
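To make the ambiguity concrete, here's a rough sketch of a toy evaluator. It's in Python rather than Arc, and the names Sym, read_string_literal, and evaluate are made up for illustration; it only assumes that strings and symbols are one type and that the reader has to pick one of the two meanings for "foo".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sym:
    """The single type standing in for both strings and symbols."""
    name: str


def read_string_literal(text, policy):
    """Hypothetical reader step for the source text "foo".

    policy "quoted": "foo" reads as (quote foo), i.e. a literal.
    policy "bare":   "foo" reads as foo, i.e. a variable reference.
    """
    sym = Sym(text)
    return [Sym("quote"), sym] if policy == "quoted" else sym


def evaluate(expr, env):
    """Evaluate a tiny Lisp-like expression built from ints, Syms and lists."""
    if isinstance(expr, Sym):
        return env[expr.name]          # bare symbol: variable lookup
    if isinstance(expr, list):
        head = expr[0]
        if head == Sym("quote"):
            return expr[1]             # (quote x) evaluates to x itself
        fn = evaluate(head, env)
        return fn(*[evaluate(arg, env) for arg in expr[1:]])
    return expr                        # numbers are self-evaluating


env = {"list": lambda *args: list(args), "foo": 42}

# (list "foo" 1) under each reader policy
as_literal = evaluate([Sym("list"), read_string_literal("foo", "quoted"), 1], env)
as_variable = evaluate([Sym("list"), read_string_literal("foo", "bare"), 1], env)
```

With the "quoted" policy the call sees the symbol foo itself; with the "bare" policy it sees 42, and it would raise a KeyError if foo were unbound. The same source text ends up meaning two different things, which is the whole problem.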
---
Hm... crazy brainstorming: global symbols that don't refer to anything are self-evaluating. That would solve the problem at the cost of being really crazy and confusing. :P
In any case, if strings == symbols, then "foo" would basically just be a shorthand for '#s"foo" (note the quote).
That's pretty much what I used for my original example at http://arclanguage.org/item?id=14883. If I thought (quote ...) around strings was a deal breaker, the topic would have never come up. :-p
---
"Hm... crazy brainstorming: global symbols that don't refer to anything are self-evaluating."
Isn't that the same idea as http://arclanguage.org/item?id=13823? Or are you having this be determined at compile time rather than run time? That could be an interesting spin on things.
"Isn't that the same idea as http://arclanguage.org/item?id=13823? Or are you having this be determined at compile time rather than run time? That could be an interesting spin on things."
Yes. To be more specific, it's the same idea as point #3 in that post. But it doesn't matter whether it's done at compile time or runtime. The end result is the same: if you have a string literal "foo" and somebody defines a global variable foo, then suddenly your program's behavior changes drastically. Which is why I called it "really crazy and confusing".
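Here's a minimal sketch of that failure mode, in Python rather than Arc. The function evaluate_symbol and the dict standing in for the global namespace are hypothetical; the only assumption is the rule "unbound global symbols evaluate to themselves".

```python
def evaluate_symbol(name, global_env):
    """Bound names are variable references; unbound names self-evaluate."""
    return global_env[name] if name in global_env else name


global_env = {}

# Nobody has defined foo yet, so "foo" acts like a string literal.
before = evaluate_symbol("foo", global_env)

# Somebody, somewhere, defines a global named foo...
global_env["foo"] = 42

# ...and the very same expression now means something else.
after = evaluate_symbol("foo", global_env)
```

Here before is the name "foo" and after is 42, without a single character of the calling code having changed.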
---
So, right now, my opinion is that "foo" should basically mean the same thing as (quote \s"foo") where \s"" is syntax for creating a symbol with odd characters in it. That approach isn't 100% ideal in every circumstance[1], but it should be overall the best in the general case.
There is one question, though: should symbols or strings be the primitive? In other words, should it be (isa "foo" 'sym) or (isa "foo" 'string) ?
Personally, though the term "string" is more familiar, I'm actually leaning toward sym. The word "string" is pretty confusing, when you think about it, but "symbol" is a very reasonable thing to call a sequence of characters.
---
As a side effect of making symbols eq to strings... we would also make this work:
(coerce 'foo 'cons) -> (\\f \\o \\o)
I'm using \\f to mean the same thing as #\f by the way.
Hm... what if we removed chars from the language? What's the purpose of them, really?
---
P.S. Somewhat related: PHP apparently sometimes treats strings as variable references:
"But it doesn't matter whether it's done at compile time or runtime. The end result is the same: if you have a string literal "foo" and somebody defines a global variable foo, then suddenly your program's behavior changes drastically."
If it's done at compile time, it's a bit better: The program's behavior doesn't change at all if someone defines a global variable 'foo after your code has been loaded.
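A rough sketch of the compile-time variant, in Python rather than Arc. The name compile_symbol and the dict used as the global namespace are invented for illustration; it assumes the literal-or-variable decision is made once, when the code is loaded.

```python
def compile_symbol(name, global_env):
    """Decide once, at compile time, what a bare symbol means.

    Returns a zero-argument function that produces the symbol's value.
    """
    if name in global_env:
        # Bound when compiled: stays a variable reference.
        return lambda: global_env[name]
    # Unbound when compiled: frozen as a self-evaluating literal.
    return lambda: name


global_env = {}

# Code loaded before foo exists treats it as a literal.
early_code = compile_symbol("foo", global_env)

# Someone defines the global afterwards.
global_env["foo"] = 42

# Code loaded after the definition treats it as a variable.
late_code = compile_symbol("foo", global_env)

early_result = early_code()
late_result = late_code()
```

The already-loaded code keeps returning "foo", while code compiled after the definition returns 42. So the behavior stops shifting under existing code, but it still depends on load order.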
---
"So, right now, my opinion is that "foo" should basically mean the same thing as (quote \s"foo")..."
My first impression seeing reader syntaxes like \s"foo" is that the reader will recursively read "foo" and then convert the result somehow.
I guess it could read "foo" as (quote foo) and convert that result using 'cadr, lol. :-p
---
"The word "string" is pretty confusing, when you think about it, but "symbol" is a very reasonable thing to call a sequence of characters."
Huh, nice observation. ^_^ I think I've called strings "texts" sometimes, so mayhaps that's another option.
---
"I'm using \\f to mean the same thing as #\f by the way."
I see that as using \ in a non-escaping way. I don't have a problem with using it that way, but I don't have a problem with using it as a delimiter either.
---
"Hm... what if we removed chars from the language? What's the purpose of them, really?"
Would you have a symbol be a sequence of length-one symbols, and have every length-one symbol be its own element? Anyway, I don't have any opinion about this. :-p
---
"P.S. Somewhat related: PHP apparently sometimes treats strings as variable references"
Jarc does the same thing if you call a symbol. I don't really have an opinion about this either.
"Huh, nice observation. ^_^ I think I've called strings "texts" sometimes, so mayhaps that's another option."
Sure, but traditionally Lisp has used the term "symbol", and even languages like Ruby have picked up on it. And there's another thing. Symbols in Lisp are often used as variable names. In that context, the word "text" doesn't make much sense, but the word "symbol" still makes perfect sense. So I still vote for "symbol", even though I think "text" is more reasonable than "string".
---
"Would you have a symbol be a sequence of length-one symbols, and have every length-one symbol be its own element? Anyway, I don't have any opinion about this. :-p"
Yes. :P At least, when coerced to a list. This would let us get rid of two similar-but-not-quite-the-same data types: strings and chars. It annoys me that I have to use (tokens foo #\newline) rather than (tokens foo "\n").
I don't really see much purpose or benefit of having a distinction between chars and strings... Python and JavaScript get by just fine with only strings, for instance.
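For reference, this is roughly what the no-chars model looks like in Python, where indexing a string just gives back another string of length one. It's only an illustration of the idea, not a proposal for how Arc would implement it.

```python
text = "foo"

# Coercing to a list yields length-one strings, not a separate char type.
pieces = list(text)

# Indexing gives a str too, and a length-one string is its own first element.
first = text[0]
first_is_own_element = first[0] == first

# The delimiter for splitting is an ordinary string, no char literal needed.
lines = "a\nb\nc".split("\n")
```

So pieces is ['f', 'o', 'o'], every element is itself a string, and splitting on "\n" takes the same kind of value as everything else.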
In addition, I find myself juggling between symbols and strings a lot... converting to a sym when I want Arc to see it as an identifier, and as a string when I want to do things like index into it, or coerce it to a list... or when Arc code requires a string rather than a sym, etc... The more I think about it, the better it seems to unify strings/symbols/characters.
I thought the point of the term "symbol" was to signify something that was used as a name. I do think it's a tad more evocative to call every string a symbol, but it feels a bit like calling every cons cell a form.
Inasmuch as I have any desire to be traditional, I'll call strings strings. :-p