
Group by Numerical Ranges in F#

Tags: f#, deedle

How would one group by a numerical range in F# and/or Deedle? I'm looking at data in feet, and I want to group it into buckets of 500 ft.

For example, I have data like

5000 5200 5700 5800 6100 6200 6300

And I want groups

{5000, 5200} {5700, 5800} {6100, 6200, 6300}

Tyler Cowan asked Jan 22 '26 09:01



2 Answers

As you mentioned Deedle in the question, I'm going to add an answer based on Deedle series. This would be useful if you had some observations and wanted to group data based on the keys (e.g. times of the observations). Say we have:

open Deedle  // brings `series`, `=>`, `Series` and `Stats` into scope

let obs = series [ 5000 => 1.0; 5200 => 2.0; 5700 => 3.0; 5800 => 4.0;
                   6100 => 5.0; 6200 => 6.0; 6300 => 7.0 ]

Now you can create a series containing one series of values for each bucket using:

obs |> Series.chunkWhile (fun k1 k2 -> k1/500 = k2/500)

This is the same trick as in Fyodor's answer: items stay in the same bucket as long as the key, integer-divided by 500, gives the same result for all of them. Note that chunking only merges adjacent keys, so this relies on the series being ordered by key.

This would be useful if you wanted to do some further calculation, such as computing the average per bucket, with each result keyed by the lower bound of its bucket:

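obs
// Split into chunks of adjacent keys that share the same 500 ft bucket index
|> Series.chunkWhile (fun k1 k2 -> k1 / 500 = k2 / 500)
// Re-key each chunk by the lower bound of its bucket (e.g. 5200 -> 5000)
|> Series.mapKeys (fun k -> (k / 500) * 500)
// Average the observations in each chunk
|> Series.mapValues Stats.mean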

However, if you are only interested in calculating the groups as in your question, then Deedle is probably overkill.
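For comparison, the same per-bucket average can be sketched without Deedle, using only the F# core library. This is a minimal illustration rather than a replacement for the Deedle pipeline: it assumes the observations are held as a plain list of (key, value) pairs, and the name `means` is made up for this example.

```fsharp
// Observations as plain (key, value) pairs; same data as the Deedle series above.
let obs =
    [ 5000, 1.0; 5200, 2.0; 5700, 3.0; 5800, 4.0
      6100, 5.0; 6200, 6.0; 6300, 7.0 ]

// Group by the lower bound of each 500 ft bucket, then average the values.
// `means` is a hypothetical name used only in this sketch.
let means =
    obs
    |> List.groupBy (fun (k, _) -> (k / 500) * 500)
    |> List.map (fun (bucket, items) -> bucket, items |> List.averageBy snd)

// means = [(5000, 1.5); (5500, 3.5); (6000, 6.0)]
```

Unlike `Series.chunkWhile`, `List.groupBy` does not depend on the input being ordered by key, since it collects all items with the same bucket key wherever they appear.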

Tomas Petricek answered Jan 25 '26 06:01



It's not entirely clear what you mean by "buckets of 500ft". If I assume that a "bucket" is defined as the half-open range from 500*N up to (but not including) 500*(N+1), where N is an integer, then you can get the index of the bucket to which a given number belongs via integer division by 500. Then you can group by that index:

let data = [5000; 5200; 5700; 5800; 6100; 6200; 6300]
let groups = data |> Seq.groupBy (fun x -> x/500)

> 
val groups : seq<int * seq<int>> =
  seq
    [(10, seq [5000; 5200]); (11, seq [5700; 5800]);
     (12, seq [6100; 6200; 6300])]
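If the lower bound of each bucket is more convenient than its index, a small variation keys the groups by that value instead. This is only a sketch: `bucketStart` is a hypothetical helper introduced here, and it assumes non-negative inputs, because integer division in F# truncates toward zero and would place small negative and small positive values in the same bucket.

```fsharp
// Hypothetical helper: lower bound of the bucket of the given width containing x.
// Assumes x >= 0; integer division truncates toward zero for negative values.
let bucketStart (width: int) (x: int) = (x / width) * width

let data = [5000; 5200; 5700; 5800; 6100; 6200; 6300]

// List.groupBy returns concrete lists rather than lazy sequences.
let groups = data |> List.groupBy (bucketStart 500)

// groups = [(5000, [5000; 5200]); (5500, [5700; 5800]); (6000, [6100; 6200; 6300])]
```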
Fyodor Soikin answered Jan 25 '26 06:01



