Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

groupBy on List as LinkedHashMap instead of Map

I am processing XML using scala, and I am converting the XML into my own data structures. Currently, I am using plain Map instances to hold (sub-)elements, however, the order of elements from the XML gets lost this way, and I cannot reproduce the original XML.

Therefore, I want to use LinkedHashMap instances instead of Map, however I am using groupBy on the list of nodes, which creates a Map:

For example:

  def parse(n:Node): Unit = 
  {
    val leaves:Map[String, Seq[XmlItem]] =
      n.child
        .filter(node => { ... })
        .groupBy(_.label)
        .map((tuple:Tuple2[String, Seq[Node]]) =>
        {
          val items = tuple._2.map(node =>
          {
            val attributes = ...

            if (node.text.nonEmpty)
              XmlItem(Some(node.text), attributes)
            else
              XmlItem(None, attributes)
          })

          (tuple._1, items)
        })

      ...
   }

In this example, I want leaves to be of type LinkedHashMap to retain the order of n.child. How can I achieve this?

Note: I am grouping by label/tagname because elements can occur multiple times, and for each label/tagname, I keep a list of elements in my data structures.


Solution
As answered by @jwvh I am using foldLeft as a substitution for groupBy. Also, I decided to go with LinkedHashMap instead of ListMap.

  def parse(n:Node): Unit = 
  {
    val leaves:mutable.LinkedHashMap[String, Seq[XmlItem]] =
      n.child
        .filter(node => { ... })
        .foldLeft(mutable.LinkedHashMap.empty[String, Seq[Node]])((m, sn) =>
        {
          m.update(sn.label, m.getOrElse(sn.label, Seq.empty[Node]) ++ Seq(sn))
          m
        })
        .map((tuple:Tuple2[String, Seq[Node]]) =>
        {
          val items = tuple._2.map(node =>
          {
            val attributes = ...

            if (node.text.nonEmpty)
              XmlItem(Some(node.text), attributes)
            else
              XmlItem(None, attributes)
          })

          (tuple._1, items)
        })
like image 932
user826955 Avatar asked Oct 26 '25 09:10

user826955


1 Answers

To get the rough equivalent to .groupBy() in a ListMap you could fold over your collection. The problem is that ListMap preserves the order of elements as they were appended, not as they were encountered.

import collection.immutable.ListMap

List('a','b','a','c').foldLeft(ListMap.empty[Char,Seq[Char]]){
  case (lm,c) => lm.updated(c, c +: lm.getOrElse(c, Seq()))
}
//res0: ListMap[Char,Seq[Char]] = ListMap(b -> Seq(b), a -> Seq(a, a), c -> Seq(c))

To fix this you can foldRight instead of foldLeft. The result is the original order of elements as encountered (scanning left to right) but in reverse.

List('a','b','a','c').foldRight(ListMap.empty[Char,Seq[Char]]){
  case (c,lm) => lm.updated(c, c +: lm.getOrElse(c, Seq()))
}
//res1: ListMap[Char,Seq[Char]] = ListMap(c -> Seq(c), b -> Seq(b), a -> Seq(a, a))

This isn't necessarily a bad thing since a ListMap is more efficient with last and init ops, O(1), than it is with head and tail ops, O(n).

To process the ListMap in the original left-to-right order you could .toList and .reverse it.

List('a','b','a','c').foldRight(ListMap.empty[Char,Seq[Char]]){
  case (c,lm) => lm.updated(c, c +: lm.getOrElse(c, Seq()))
}.toList.reverse
//res2: List[(Char, Seq[Char])] = List((a,Seq(a, a)), (b,Seq(b)), (c,Seq(c)))
like image 69
jwvh Avatar answered Oct 29 '25 07:10

jwvh



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!