Racket sequences in data/collection vs built-in sequences

Question

I've been playing with some of the interfaces in data/collection and I'm loving it so far. Having generic interfaces to different Racket collections like lists, streams, and sequences is really handy, especially given the diversity of interfaces to such types otherwise (list-*, vector-*, string-*, stream-*, sequence-*, ... !).

But do these interfaces play well with the built-in sequences in Racket? Specifically, I'm running into this error:

(require data/collection)
(take 10 (in-cycle '(1 2 3)))

=>

; take: contract violation
;   expected: sequence?
;   given: #<sequence>
;   in: the 2nd argument of
;       (-> natural? sequence? sequence?)
;   contract from: 
;       <pkgs>/collections-lib/data/collection/sequence.rkt
;   blaming: top-level
;    (assuming the contract is correct)
;   at: <pkgs>/collections-lib/data/collection/sequence.rkt:53.3

The function in-cycle returns a built-in "sequence," while the polymorphic take provided by data/collection expects its own special sequence interface.

In this particular case I could manually define a stream to replace the built-in in-cycle, something like:

(define (in-cycle coll [i 0])
  (stream-cons (nth coll (modulo i (length coll)))
               (in-cycle coll (add1 i))))

... which works, but there are an awful lot of built-in sequences defined so I'm wondering if there's a better, perhaps standard/recommended way to handle this. That is, can we take advantage of all the built-in sequences in terms of the sequences defined in data/collection, the same way the latter wraps other existing sequences like lists and streams?

Accepted Answer (mindthief, 5 revs)

After working through it some more, I think I have a better understanding of sequences in Racket and in data/collection. I'll take a crack at summarizing all the points that have been brought up in other answers and comments, and also include my own learnings.

Racket sequences, that is, the ones that come built-in, are intended to be a generic interface to all ordered collections, the same way that you can use dict-* functions to work with any dictionary type, including hashes. In addition, many handy utilities produce builtin sequences to make it easy to work with ordered data in different scenarios: a sequence of elements taken from a collection, a sequence of inputs received on an input port, or a sequence of key-value pairs taken from a dictionary. A dictionary isn't inherently an "ordered" collection, but it can be treated as one through the built-in sequence interface.

So we can think of builtin sequences as having a twofold purpose:

1. being a uniform interface for ordered data, and
2. making it convenient to work with sequences in diverse scenarios by providing a natural sequence interface implementation in each case.
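To make the second purpose concrete, here is a small sketch that uses only built-in Racket. The three sources (a list, a string port standing in for a real input port, and a hash) are illustrative choices, not taken from the original discussion; the point is that each one is iterated through the same for-loop interface.

```racket
#lang racket/base
;; Three different sources, one iteration interface (built-in sequences).

;; Elements taken from a collection.
(define from-list
  (for/list ([x (in-list '(a b c))]) x))

;; Values read from an input port; a string port stands in for a real one.
(define from-port
  (for/list ([datum (in-port read (open-input-string "1 2 3"))]) datum))

;; Key-value pairs of a dictionary; in-hash yields two values per step.
(define from-hash
  (for/list ([(k v) (in-hash (hash 'x 10))]) (cons k v)))
```

Here from-list is '(a b c), from-port is '(1 2 3), and from-hash is '((x . 10)).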

Now, although builtin sequences are intended to be a uniform interface for ordered collections in theory, in practice they are not especially usable for this purpose because their names are verbose, e.g. sequence-length and sequence-map instead of just the length and map that we'd use for lists.
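As a rough illustration of both halves of that claim, the sketch below applies the built-in sequence-* functions from racket/sequence to several collection types. The particular values are made up for the example.

```racket
#lang racket/base
(require racket/sequence)

;; The same sequence-* functions accept a list, a vector, a string, or a range.
(define lengths
  (list (sequence-length '(1 2 3))
        (sequence-length (vector 1 2 3 4))
        (sequence-length "hello")
        (sequence-length (in-range 6))))

;; sequence-map returns a sequence; sequence->list collects it into a list.
(define bumped
  (sequence->list (sequence-map add1 (vector 1 2 3))))
```

The interface really is uniform (lengths is '(3 4 5 6) and bumped is '(2 3 4)), but every call carries the sequence- prefix.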

data/collection sequences address this shortcoming by giving the operations short, canonical names, like length instead of sequence-length. They also provide drop-in replacements for many of the utilities that produce builtin sequences, like cycle and naturals instead of in-cycle and in-naturals, along with a generic in function to derive a lazy version of any sequence for use in iteration (like (in (naturals))). These data/collection versions are generally more "well-behaved" by virtue of being immutable, which builtin sequences do not guarantee. As a result, data/collection sequences can be thought of as a replacement for builtin sequences in many cases, largely taking over the first of the two purposes of builtin sequences.
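A minimal sketch of those replacements, assuming the collections-lib package is installed. It uses cycle where the question used in-cycle, and wraps a data/collection sequence with in so that a for loop can consume it.

```racket
#lang racket/base
(require data/collection)

;; cycle and naturals are infinite, so take a finite prefix before forcing.
(define cycled
  (sequence->list (take 7 (cycle '(1 2 3)))))

;; `in` adapts a data/collection sequence for use in a for clause.
(define firsts
  (for/list ([n (in (take 4 (naturals)))]) n))
```

With these definitions, cycled should be '(1 2 3 1 2 3 1) and firsts should be '(0 1 2 3), with no built-in sequence involved.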

That is, in places where you are dealing with sequences, consider using data/collection sequences in place of builtin sequences, rather than as a layer on top of them.

On point (2), however, the following are the types that are currently treatable as data/collection sequences:

lists
immutable hash tables
immutable vectors
immutable hash sets
immutable dictionaries
streams

(source)
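One way to check whether a given value is covered is to ask data/collection's own sequence? predicate, which is the same predicate named in the contract error from the question. The sketch below assumes collections-lib is installed; the sample values are illustrative.

```racket
#lang racket/base
(require data/collection)

;; sequence? here is data/collection's predicate, which shadows the built-in one.
(define accepted
  (list (sequence? '(1 2 3))                                   ; list
        (sequence? (vector->immutable-vector (vector 1 2 3)))  ; immutable vector
        (sequence? (hash 'a 1))))                              ; immutable hash table

;; The value from the question: a built-in sequence that is not in the list above.
(define rejected
  (sequence? (in-cycle '(1 2 3))))
```

Each entry of accepted should be #t, while rejected is #f, which is why take refused the in-cycle value.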

That's plenty, but there are still more scenarios in which a common-sense sequence could be derived. For cases not covered above, the builtin sequence utilities are still useful, like in-hash and in-port, which have no analogues in data/collection. In general, there are many cases in which we can easily derive a builtin sequence (see the sequence utilities in the Racket reference) but not a data/collection sequence. In these special cases, we can simply convert the builtin sequence into a stream via sequence->stream and then use it through the simpler data/collection interface, since streams are treatable as sequences of either type.
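Applied to the example from the question, the conversion might look like the sketch below. It assumes collections-lib is installed. The alias builtin-sequence->stream is a name introduced here only to make it explicit that the racket/base version of sequence->stream is the one being called.

```racket
#lang racket/base
(require data/collection
         ;; Local alias (hypothetical name) for the built-in converter.
         (only-in racket/base [sequence->stream builtin-sequence->stream]))

;; The failing call from the question, with the built-in sequence
;; converted to a stream before it reaches data/collection's take.
(define cycled
  (sequence->list
   (take 10 (builtin-sequence->stream (in-cycle '(1 2 3))))))

;; The same trick for in-port; a string port stands in for a real one.
(define words
  (sequence->list
   (builtin-sequence->stream
    (in-port read (open-input-string "alpha beta gamma")))))
```

Here cycled should come out as '(1 2 3 1 2 3 1 2 3 1), and words as the three symbols read from the port.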

Tags: racket, generic-programming
