Arc Forum
Subseq is verbose compared to ranges in Ruby and Python
43 points by emmett 6119 days ago | 25 comments
From the arc code, slices don't appear to be nearly as powerful as in Ruby or Python.

In Ruby:

    >> "foobar"[2..-2]
    "oba"
To do this in arc, you currently need to write:

    > (subseq "foobar" 2 (- (len "foobar") 1))
    "oba"
This should be improved in two ways. First, subseq should allow negative indices:

    > (subseq "foobar" 2 -2)
    "oba"
Next, string and array indexing should allow two arguments:

    > ("foobar" 2 -2)
    "oba"

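A minimal Python sketch of the first proposal, assuming (as the "foobar" example implies) that a negative end counts from the back and is inclusive, Ruby-style. The name subseq mirrors the Arc function, but the normalization rule here is an assumption for illustration, not the Arc implementation.

```python
def subseq(seq, start, end=None):
    """Slice seq; a negative end is counted from the back and is inclusive."""
    n = len(seq)
    if end is None:
        end = n                  # default: through the last element
    elif end < 0:
        end = n + end + 1        # -1 -> n (last element included), -2 -> n - 1
    if start < 0:
        start = n + start        # negative start also counts from the back
    return seq[start:end]


print(subseq("foobar", 2, -2))   # oba
print(subseq("foobar", 2, 5))    # oba, the explicit form from the post
```

Under this rule -1 as an end means "through the last element", so subseq("hello", 0, -1) returns the whole string.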

6 points by fallintothis 6119 days ago | link

Incorporating the step parameter as in Python's indexing notation could also be interesting:

  "abcdefghijklmnopqrstuvwxyz"[10:2:-1] #=> "kjihgfed"
  "abcdefghijklmnopqrstuvwxyz"[2:20:2] #=> "cegikmoqs"
The one way that it might not work so hot in an s-expression is Python's blanking-out of indexes to indicate the default, e.g.,

  "abcdefghi"[::-2] #=> "igeca"
I'd like to see as much of this sort of range functionality as possible -- it's quite convenient -- but in an s-expression the above would be something like

  ("abcdefghi" -2) ;no distinction!
Perhaps use _ or some other symbol for the default? The cleanest would probably be to stick to the one-plus-optional-second-or-third format and just have a function that collects every nth element (then negative steps with default begins / ends would also be a chain of rev and firstn / nthcdr). Just my two cents.
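
The "function that collects every nth element" idea can be sketched in Python as follows. The name every_nth is hypothetical, and the negative-step case is expressed as a reverse followed by the same function, as suggested above.

```python
def every_nth(seq, n):
    """Collect elements 0, n, 2n, ... of seq (n is assumed to be positive)."""
    return [x for i, x in enumerate(seq) if i % n == 0]


letters = "abcdefghi"
forward = "".join(every_nth(letters, 2))             # "acegi"
backward = "".join(every_nth(reversed(letters), 2))  # "igeca", same as letters[::-2]
```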

-----

9 points by simonb 6119 days ago | link

To me blanking is expressed most naturally as nil.
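
For comparison, Python itself represents a blanked-out index this way: the slice notation is shorthand for a slice object whose omitted fields are None.

```python
s = "abcdefghi"
blank = slice(None, None, -2)        # what s[::-2] builds behind the scenes
reversed_evens = s[blank]            # "igeca"

alphabet = "abcdefghijklmnopqrstuvwxyz"
stepped = alphabet[slice(2, 20, 2)]  # "cegikmoqs"
```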

-----

8 points by nex3 6119 days ago | link

I vote yea!

I think lists should support this as well, if strings and arrays do. Also, indexing from the back also comes in very handy:

  > ("foobar" -3)
  #\b
  > ("foobar" -4 -2)
  "oba"

-----

17 points by pg 6119 days ago | link

This looks promising. Thanks, Emmett!

-----

1 point by pg 6107 days ago | link

I've changed subseq (whose name is also going to become cut).

I don't want to add the second arg in string references, though, because omitting the cut implies you're getting a pointer within the original string, which you could then modify by setting elements.

Are pointers within strings useful? I.e. would it be useful if this worked:

  arc> (= ss "foobar")
  "foobar"
  arc> (= ((ss 2 -2) 0) #\x)
  #\x
  arc> ss
  "foxbar"
  
There's no built-in way to make a var point into the middle of a string in MzScheme, but I could make it happen if there was a need for it.
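
As a point of comparison only, Python draws the same line between a copy and a pointer into the original: an ordinary slice copies, while a memoryview slice is a window onto the underlying buffer. This sketch uses a bytearray because Python strings are immutable, and Python's exclusive end, so [2:-1] covers the "oba" span.

```python
buf = bytearray(b"foobar")

window = memoryview(buf)[2:-1]   # view of b"oba", no copy is made
window[0] = ord("x")             # writes through to buf

copy = buf[2:-1]                 # ordinary slice: an independent copy
copy[0] = ord("z")               # buf is unaffected

print(buf)                       # bytearray(b'foxbar')
```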

-----

6 points by scn 6119 days ago | link

This is the subseq from arc.arc with an extra (commented) line that would allow negative end parameters.

  (def subseq (seq start (o end (len seq)))
  ; (if (< end 0) (= end (+ (len seq) end)))
    (if (isa seq 'string)
      (let s2 (newstring (- end start))
        (for i 0 (- end start 1)
          (= (s2 i) (seq (+ start i))))
        s2)
      (firstn (- end start) (nthcdr start seq))))
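
A rough Python port of the function above with the commented line enabled, to show what that rule produces. Only end is normalized, as in the Arc code, and the end stays exclusive, so the result for a negative end follows Python's convention.

```python
def subseq(seq, start, end=None):
    if end is None:
        end = len(seq)
    if end < 0:                          # the commented-out Arc line
        end = len(seq) + end
    if isinstance(seq, str):
        # Mirror the Arc loop: copy characters start .. end-1 one at a time.
        return "".join(seq[start + i] for i in range(end - start))
    # Mirror (firstn (- end start) (nthcdr start seq)).
    return list(seq[start:])[: end - start]


print(subseq("hello", 0, -1))    # hell
print(subseq("foobar", 2, -2))   # ob
```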

-----

2 points by nlavine 6118 days ago | link

Hmm. The obvious implementation to me seems to be this:

  (def slice (seq start (o end (len seq)))
    (subseq seq (mod start (len seq)) (mod end (len seq))))
However, this would give slice meaning for indices outside the length of the sequence, by modding them back into the range. Is this ever actually useful? (Perhaps in some sort of loop that doesn't know the range of the sequence? Maybe you want something that looks like an infinitely long sequence but is actually a cycle of some length?)
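
A Python sketch of the same modular idea, assuming a non-empty sequence (the name slice_mod is hypothetical). It also shows one consequence of the definition as written: the default end, (len seq), is itself reduced to 0 by the mod, so omitting end yields an empty result here.

```python
def slice_mod(seq, start, end=None):
    """Slice with both indices wrapped into range by the sequence length."""
    n = len(seq)                 # assumed non-zero
    if end is None:
        end = n
    return seq[start % n : end % n]


print(slice_mod("foobar", 2, -2))   # ob
print(slice_mod("foobar", 8, 11))   # oba  (8 and 11 wrap to 2 and 5)
print(repr(slice_mod("foobar", 2))) # ''   (default end wraps to 0)
```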

-----

2 points by mdemare 6119 days ago | link

This topic came up earlier: http://arclanguage.org/item?id=218

-----

2 points by mdemare 6119 days ago | link

Did you mean to omit subseq in the last example?

-----

4 points by Xichekolas 6118 days ago | link

Yes, he did.

-----

2 points by mst 6119 days ago | link

It also strikes me that it'd be nice to be able to do N indices -

  > ("foobar" (list 0 2 -2))
  ("f" "o" "a")
(I probably meant characters there rather than one-char strings, but you get the point ...)

-----

4 points by vsingh 6118 days ago | link

The negative index syntax doesn't work yet, but for the rest of your example:

    (map "foobar" (list 0 2 4))
    (#\f #\o #\a)
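
For comparison, the same multi-index lookup in Python, where negative indices already work. This uses the standard library's operator.itemgetter with indices 0, 2 and -2.

```python
from operator import itemgetter

pick = itemgetter(0, 2, -2)      # a callable that fetches several indices at once
chars = pick("foobar")           # ('f', 'o', 'a')

# The map-style spelling gives the same characters as a list.
mapped = ["foobar"[i] for i in (0, 2, -2)]
```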

-----

2 points by rapp 6119 days ago | link

  (if (< end 0) (= end (+ end (len seq))))
  (if (< start 0) (= start (+ start (len seq))))

-----

4 points by scn 6119 days ago | link

That gives

  (subseq "hello" 0 -1)
  "hello"
I was going for

  (subseq "hello" 0 -1)
  "hell"
to match Python's

  "hello"[:-1]
  "hell"

Are reverse strings 0 or -1 indexed? :)

-----

3 points by mdemare 6119 days ago | link

Both. They use ranges, which have the inclusive (..) and the exclusive (...) notation:

    "ruby"[0 .. -1] #=> "ruby"
    "ruby"[0 ... -1] #=> "rub"

-----

2 points by nex3 6119 days ago | link

Ruby does it the other way:

  > "hello"[-3..-1]
  "llo"
However, given that we can do something like

  > (subseq "hello" -4)
  "ello"
if we want to take the last n characters, doing zero-based reverse indexing might make more sense.

-----

2 points by oddbod 6118 days ago | link

Tcl lrange: http://tcl.tk/man/tcl8.5/TclCmd/lrange.htm

    (subseq "hello" 0 'end)
    "hello"
    (subseq "hello" 0 -1)
    "hell"

-----

2 points by rapp 6119 days ago | link

You're 100% right. I was blindly matching the test case:

    > (subseq "foobar" 2 -2)
    "oba"

-----

1 point by immanuel 6119 days ago | link

"asdfa"[4..-3] ??? "asdfa"[9 -20] ?? "asdfa"[0..-0] ?? Only seems to make reasoning about code more difficult in return for brevity. Helpful for programmers whose main productivity obstacle is typing speed.

-----

1 point by sjs 6118 days ago | link

"aoeui"[3:1] ??? "aoeui"[12:42] ?? "aoeui"[0:0]

With or without negative indices one can do pointless things with slicing/subseq. Returning an empty string/list is the only sane thing to do in those cases.

    >>> s="aoeui"
    >>> s[3:1]
    ''
    >>> s[4:-3]
    ''
    >>> s[9:-20]
    ''
    >>> s[0:-0]
    ''
    >>> s[12:42]
    ''

-----

1 point by mec 6118 days ago | link

Why not return the reverse when going backwards within bounds?

    >>> s="aoeui"
    >>> s[3:1]
    "eoa"
    >>> s[4:-3]
    "ue"
    >>> s[9:-20]
    ''
    >>> s[0:-0]
    'a'
    >>> s[12:42]
    ''
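
For reference, Python itself only walks backwards when given an explicit negative step, which keeps a plain s[3:1] unambiguous. A short sketch of what the interpreter returns:

```python
s = "aoeui"

backwards = s[3:1:-1]    # indices 3, 2 -> "ue"
from_end = s[4:-3:-1]    # indices 4, 3 -> "iu" (-3 is index 2, excluded)
no_step = s[3:1]         # no step given, start past end -> ""
zero = s[0:-0]           # -0 is just 0, so this is s[0:0] -> ""
```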

-----

2 points by sjs 6118 days ago | link

That is too magic for my taste. [0:-0] returning the first char is madness. That should be '' no matter what, as 0 == -0 on modern cpus (thankfully).

-----

6 points by noahlt 6119 days ago | link

It just takes getting used to, like prefix notation.

-----

1 point by mayson 6119 days ago | link

>>"asdfa"[4..-3] ??? "asdfa"[9 -20] ?? "asdfa"[0..-0] ??

Either "" or an error (-0 doesn't strike me as very nice syntax; reverse indexing should probably start with -1)

-----

2 points by gugamilare 6118 days ago | link

Totally agreed! This is much more arc!

-----