Arc Forum | aaco's comments
3 points by aaco 6175 days ago | link | parent | on: Clarification about Character Sets

I fail to see where Arc doesn't support Unicode, since it seems to me that Arc is just using MzScheme strings, which are just Unicode strings.

Can someone explain this to me?

Some examples:

  ;◠ is a multi-byte Unicode char (3 bytes in UTF-8, 2 in UTF-16), but I guess it's escaped in this forum, so replace it with the correct character when testing.
  
  arc> (len "a◠b") ; Unicode
  3
  
  arc> (len "axb") ; ascii
  3
  
  arc> (coerce #\◠ 'int) ; Unicode
  9696
  
  arc> (coerce #\x 'int)  ; ascii
  120
  
  arc> (subseq "a◠b" 1 2) ; Unicode
  "◠"
  
  arc> (subseq "axb" 1 2)  ; ascii
  "x"
  
Where doesn't Arc support Unicode?!

-----

3 points by olavk 6175 days ago | link

That just shows how agile PG is. He added Unicode support the minute he saw people request it! :)

Seriously, PG explicitly claims that Arc intentionally doesn't support anything but ASCII (http://www.arclanguage.org/), so that might be why people (including me) believed that to be the case.

-----

1 point by aaco 6175 days ago | link

Yes, I think Arc intentionally supports only ASCII so as not to bother with Unicode issues for now.

Anyway, I can't see how Unicode can break in Arc. I'm not a Lisper, but I think you can't extract 1 byte from an Arc string (since it's just a MzScheme string), only 1 char. That's a different concept, because in an encoding such as UTF-8, 1 char can take 1, 2 or more bytes.
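
To illustrate the char-versus-byte distinction, here is a rough sketch in Python rather than Arc. It is only an analogy: I'm assuming Python's Unicode strings behave like MzScheme's in this respect, and the variable names are made up for the example. Indexing and length work on characters, and bytes only appear once you pick an encoding.

```python
# Python analogy (not Arc): strings are sequences of characters, not bytes.
s = "a\u25e0b"  # "a◠b"; U+25E0 is UPPER HALF CIRCLE

char_count = len(s)                  # counts characters (code points)
middle = s[1:2]                      # slices by character, like subseq
code_point = ord(s[1])               # integer value of the middle character

# Bytes exist only after an explicit encoding step.
utf8_bytes = s.encode("utf-8")       # ◠ takes 3 bytes here
utf16_bytes = s.encode("utf-16-be")  # ◠ takes 2 bytes here (no byte-order mark)

print(char_count)        # 3
print(middle == "\u25e0")  # True
print(code_point)        # 9696
print(len(utf8_bytes))   # 5 (1 + 3 + 1)
print(len(utf16_bytes))  # 6 (2 + 2 + 2)
```

The character count stays at 3 whichever encoding is used; only the byte counts change.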

-----

2 points by bobbane 6175 days ago | link

Watch out - that's single-portable-implementation thinking. When Paul puts out another release of Arc based on, say, another Scheme implementation, or SBCL, those tricks won't work.

-----

2 points by aaco 6176 days ago | link | parent | on: Ask Arc: What's the Arc symbol going to be?

I'd use a Unicode arc.

◠ 9696 25E0 UPPER HALF CIRCLE

See http://www.alanwood.net/unicode/geometric_shapes.html

It would work as a harbinger of a forthcoming Unicode implementation.

-----

2 points by bayareaguy 6176 days ago | link

Character 2221 (measured angle) from the mathematical operators range would be good too.

-----