
Clojure: Chaining group-by :key with select-keys on remaining keys

Tags:

clojure

I'm trying to understand a workflow with Clojure maps that would be simple in other languages.

It basically comes down to this: How can I chain these operations?

  1. group-by :key on a vector of maps

  2. select-keys on remaining maps without the previous key

  3. group-by again (0..n times) and select-keys

  4. count unique key instances at the end.

See also my previous question: Aggregate and Count in Maps

Example:

Given a vector of maps

(def DATA [{:a "X", :b "M", :c "K", :d 10}
           {:a "Y", :b "M", :c "K", :d 20}
           {:a "Y", :b "M", :c "F", :d 30}
           {:a "Y", :b "P", :c "G", :d 40}])

performing group-by

(defn get-tree-level-1 [] (group-by :a DATA))

yields a map grouped by the value of that particular key.

{"X" [{:a "X", :b "M", :c "K", :d 10}],
 "Y" [{:a "Y", :b "M", :c "K", :d 20}
      {:a "Y", :b "M", :c "F", :d 30}
      {:a "Y", :b "P", :c "G", :d 40}]}

So far, so good. But what if I want to build a tree-like structure out of the data? That means keeping some of the remaining keys and ignoring others: for example, select :b and :c and ignore :d, which would yield the next level:

(def DATA2   [{ :X [{:b "M", :c "K"}],
                :Y [{:b "M", :c "K"}
                    {:b "M", :c "F"}
                    {:b "P", :c "G"}]}])

And finally, counting all instances of the remaining keys (e.g. count all unique values of the :b key under the Y-root):

(def DATA3   [{ :X [{:M  1}],
                :Y [{:M  2}
                    {:P  1}]}])

I tried doing a select-keys after the group-by, but the result was empty after the first step:

(defn get-proc-sums []
  (into {}
    (map
      (fn [ [k vs] ]
        [k (select-keys vs [:b :c])])
      (group-by :a DATA))))
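
The empty result has a direct cause: select-keys is handed the whole vector vs rather than each map inside it, and a vector has no :b or :c key, so every group collapses to {}. A minimal sketch of the per-map version, assuming the same sample data (data and select-per-group are names made up for this illustration):

```clojure
(def data [{:a "X", :b "M", :c "K", :d 10}
           {:a "Y", :b "M", :c "K", :d 20}
           {:a "Y", :b "M", :c "F", :d 30}
           {:a "Y", :b "P", :c "G", :d 40}])

;; Hypothetical helper, not part of the original question.
(defn select-per-group
  "Groups xs by k, then keeps only the keys ks in every map of every group."
  [k ks xs]
  (into {}
        (map (fn [[group-key vs]]
               ;; vs is a vector of maps, so select-keys is applied to each map
               [group-key (mapv #(select-keys % ks) vs)]))
        (group-by k xs)))

(select-per-group :a [:b :c] data)
;; => {"X" [{:b "M", :c "K"}],
;;     "Y" [{:b "M", :c "K"} {:b "M", :c "F"} {:b "P", :c "G"}]}
```

This matches the second-level shape described in the question, except that the group keys stay strings because group-by uses the values of :a as they are.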

frhd asked Dec 08 '25 08:12



1 Answer

Repeated application of group-by is the wrong tool: it doesn't compose with itself very well. Rather, go over your input maps and transform each of them, once, into a format that's useful to you (using for or map), and then reduce over that to build your tree structure. Here is a simple implementation:

(defn hierarchy [keyseq xs]
  (reduce (fn [m [ks x]]
            (update-in m ks conj x))
          {}
          (for [x xs]
            [(map x keyseq) (apply dissoc x keyseq)])))

user> (hierarchy [:a :b :c] '[{:a "X", :b "M", :c "K", :d 10}
                              {:a "Y", :b "M", :c "K", :d 20}
                              {:a "Y", :b "M", :c "F", :d 30}
                              {:a "Y", :b "P", :c "G", :d 40}])
{"Y" {"P" {"G" ({:d 40})},
      "M" {"F" ({:d 30}),
           "K" ({:d 20})}},
 "X" {"M" {"K" ({:d 10})}}}

This gives you the hierarchical format that you want, with a list of all maps with only the "leftover" keys. From this, you can count them, distinct them, remove the :d key, or whatever else you want, either by writing another function that processes this map, or by adjusting what happens in the reduce function, or the for comprehension, above.
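
As one possible way to get the counts from the question, the same reduce/update-in pattern can increment a counter instead of collecting the leftover maps. This is only a sketch: it assumes the goal is "how often does each :b value occur under each :a value", and count-by and data are hypothetical names introduced here.

```clojure
(def data [{:a "X", :b "M", :c "K", :d 10}
           {:a "Y", :b "M", :c "K", :d 20}
           {:a "Y", :b "M", :c "F", :d 30}
           {:a "Y", :b "P", :c "G", :d 40}])

;; Hypothetical helper: outer-key and inner-key are expected to be keywords
;; (or any function of one map).
(defn count-by
  "Returns a nested map: outer-key value -> inner-key value -> occurrences."
  [outer-key inner-key xs]
  (reduce (fn [m x]
            ;; fnil replaces the missing count (nil) with 0 before inc runs
            (update-in m [(outer-key x) (inner-key x)] (fnil inc 0)))
          {}
          xs))

(count-by :a :b data)
;; => {"X" {"M" 1}, "Y" {"M" 2, "P" 1}}
```

The result uses plain nested maps rather than the vectors of single-entry maps sketched in the question, which makes lookups such as (get-in result ["Y" "M"]) straightforward.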

amalloy answered Dec 10 '25 16:12



