Arc Forum
Observations on homoiconicity
4 points by Mitranim 235 days ago | 6 comments

1 point by akkartik 234 days ago | link

Thanks for sharing! It's been a while since I thought about this, but my conclusion a few years ago[1][2] was that homoiconicity was an ill-posed idea. A language only really needs two properties to make macros convenient:

* Everything is an expression, and

* Unambiguous parens.




That said:

* I like symbols. Having symbols and strings in a Lisp doesn't seem any more weird than having variables and string literals in Algol or Java. Even if I _could_ use one in place of the other, it seems useful to separate symbols as part of the program from strings as part of the environment/domain/data model. You could implement one with the other, sure, but I wouldn't question anyone who chose not to. After all, every attempt I've seen to support spaces in symbols has been quite ugly.

* Your other points seem to be in the context of building tools like code formatters where the output will be seen by humans. That's not really the context for Lisp macros. I haven't thought much about how useful such tools would be, but since I believe in macros, I can probably be persuaded. So that might be an interesting post to write. Why do you care about emitting exactly the code that was parsed? What new apps does it enable?

Are you aware of any languages that perfectly reproduce input layout? Even Go supports gofmt only by tightly constraining how programs "should" be indented, and only outputting that layout.


2 points by Mitranim 234 days ago | link

> Why do you care about emitting exactly the code that was parsed? What new apps does it enable?

Got a `gofmt` addiction, can't go back. Auto-formatting should ship with every language.

Briefly skimmed the Go implementation, and you seem to be right: it seems to lose whitespace and enforce its own formatting.

> Are you aware of any languages that perfectly reproduce input layout?

For now just my own. [1] The language isn't real yet, and might never be realized, but it has a base data notation (very Lisp-like), a parser, and I just started writing a formatter. Because the AST for the data notation preserves whitespace and comments, the formatter can print the code _exactly_ as is. This has interesting repercussions.

For a fully-implemented formatter for a fully-defined language, you wouldn't need to preserve whitespace; see Go. However, being able to print everything back means your formatter is usable from the start. It can support one or two simple rules, making only minor modifications, but you can use it on real code right away. Furthermore, this means we'll _always_ be able to choose which rules to enable or disable, which can be handy if the family of languages described in terms of this notation has different formatting preferences. I actually want the formatter shipped with the language, like `gofmt`, to be non-configurable, but this still seems like a useful quality.
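To make the idea concrete, here is a minimal Python sketch of a lossless tokenizer plus a formatter with a single rule. The token grammar, the names `tokenize`, `unparse`, and `trim_trailing_space`, and the `;` comment syntax are all assumptions for illustration; this is not the actual notation or implementation discussed above.

```python
import re

# Hypothetical token kinds for a Lisp-like notation (an assumption, not the
# real grammar). Whitespace and comments are tokens too, so nothing is lost.
TOKEN_RE = re.compile(r"""
    (?P<space>\s+)
  | (?P<comment>;[^\n]*)
  | (?P<open>\()
  | (?P<close>\))
  | (?P<string>"(?:\\.|[^"\\])*")
  | (?P<atom>[^\s()";]+)
""", re.VERBOSE)

def tokenize(src):
    """Split src into (kind, text) pairs, keeping whitespace and comments."""
    tokens = []
    pos = 0
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

def unparse(tokens):
    """Print tokens back; with no rules applied this reproduces the input."""
    return "".join(text for _, text in tokens)

def trim_trailing_space(tokens):
    """One simple formatting rule: drop spaces and tabs before a newline."""
    out = []
    for kind, text in tokens:
        if kind == "space":
            text = re.sub(r"[ \t]+(?=\n)", "", text)
        out.append((kind, text))
    return out

src = "(a  b)   \n; note\n(c)\n"
tokens = tokenize(src)
assert unparse(tokens) == src  # exact round trip, odd spacing included
print(repr(unparse(trim_trailing_space(tokens))))
```

Because the round trip is exact, each rule can be added or disabled independently, and everything a rule does not touch (here, the double space inside `(a  b)`) comes out unchanged.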



2 points by Mitranim 234 days ago | link

Should clarify something (follow-up to previous reply, see below or above).

Most macros don't want to deal with whitespace and comments. We also might want to _not let_ them, otherwise people will start using comments as code, like in Ruby. Macros just want expressions. So, we would define a second level of the AST and perform a second pass.

For the same base notation, there may be multiple languages defined in terms of it. If such a language has any form of prefix or infix, or uses the `outside_parens()` calling convention, the second pass would have to group nodes into expressions, in ways specific to that language. Furthermore, the result should be annotated with metadata about packages, types, and so on. The resulting AST is compiled, fed to macros, etc.
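A rough sketch of such a second pass in Python, assuming the first pass yields flat `(kind, text)` token pairs with the hypothetical kinds `space`, `comment`, `open`, `close`, and `atom`. It only drops whitespace and comments and nests parenthesized groups; language-specific grouping (prefix, infix, `outside_parens()`) and metadata are left out.

```python
TRIVIA = {"space", "comment"}  # token kinds that macros should never see

def to_expressions(tokens):
    """Second pass: drop trivia and nest tokens between parens into lists."""
    stack = [[]]  # stack[0] collects the top-level expressions
    for kind, text in tokens:
        if kind in TRIVIA:
            continue
        if kind == "open":
            stack.append([])
        elif kind == "close":
            if len(stack) == 1:
                raise SyntaxError("unbalanced ')'")
            done = stack.pop()
            stack[-1].append(done)
        else:
            stack[-1].append(text)
    if len(stack) != 1:
        raise SyntaxError("missing ')'")
    return stack[0]

# Hand-written first-pass output for: (def x 1) ; the answer
tokens = [
    ("open", "("), ("atom", "def"), ("space", " "), ("atom", "x"),
    ("space", " "), ("atom", "1"), ("close", ")"),
    ("space", " "), ("comment", "; the answer"), ("space", "\n"),
]
print(to_expressions(tokens))  # [['def', 'x', '1']]
```

The formatter keeps working on the flat, lossless token list, while macros and the compiler only ever receive the nested expressions.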


1 point by akkartik 233 days ago | link

You might enjoy this talk which mentions this equation:

    code = AST + formatting/comments


2 points by Mitranim 233 days ago | link

I might check it out later, but is there a gist in text form somewhere?


2 points by Mitranim 235 days ago | link

While Lisp-style homoiconicity can simplify the initial implementation of a dynamic language, there are tradeoffs; the post lists a few.