
Separate multi-value obs with pairs of values and count

Tags:

r

I have a data frame that mixes single-value and multi-value observations.

 dataset <- c("Apple;Banana;Kiwi",  "orange", "Apple;Banana", "orange" )

 dataset <- as.data.frame(dataset)

My output :

           dataset
1 Apple;Banana;Kiwi
2            orange
3      Apple;Banana
4            orange

What I want: split every combination of values into pairs across two columns, then count each pair so I can build a graph.

from   | to     | weight
Apple  | Banana | 2
Apple  | Kiwi   | 1
Banana | Kiwi   | 1
orange | NA     | 2

What I tried (this only gives one value per row, not pairs):

library(dplyr)
library(tidyr)
dataset2 <- dataset %>%
  separate_rows(dataset, sep = ";")
Wilcar asked Oct 20 '25 04:10



1 Answer

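We can apply combn to each row to build every pair of values, leave single-value rows as they are, and then use table to get the frequency of each result.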

stack(table(unlist(lapply(strsplit(dataset$dataset, ";"), 
   function(x) if(length(x) > 1) combn(x, 2, FUN = toString) else x))))[2:1]

-output

            ind values
1 Apple, Banana      2
2   Apple, Kiwi      1
3  Banana, Kiwi      1
4        orange      2
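
The output above keeps each pair in a single comma-separated column. If separate from/to columns are needed for the graph, one possible variation is sketched below. It is not part of the answer's code: the helper name pairs_for_row and the intermediate objects are hypothetical, and sorting the values within a row is an assumption so that "Banana;Apple" would be counted together with "Apple;Banana".

```r
# Sample data from the question
dataset <- data.frame(
  dataset = c("Apple;Banana;Kiwi", "orange", "Apple;Banana", "orange"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns a two-column character matrix (from, to)
# for the values of one observation.
pairs_for_row <- function(x) {
  x <- sort(unique(x))                     # assumption: order within a row is irrelevant
  if (length(x) > 1) {
    t(combn(x, 2))                         # one row per pair
  } else {
    matrix(c(x, NA_character_), nrow = 1)  # single value: "to" is NA
  }
}

# Split each observation on ";" and stack all pairs into one data frame
pair_list <- lapply(strsplit(dataset$dataset, ";", fixed = TRUE), pairs_for_row)
pairs <- as.data.frame(do.call(rbind, pair_list), stringsAsFactors = FALSE)
names(pairs) <- c("from", "to")

# Count identical (from, to) rows; paste() turns NA into the text "NA",
# so single-value rows keep their own key
key <- paste(pairs$from, pairs$to, sep = ";")
counts <- table(key)
first <- !duplicated(key)
edges <- pairs[first, ]
edges$weight <- as.integer(counts[key[first]])
rownames(edges) <- NULL

print(edges)
```

For the sample data this should give the four rows asked for in the question, with NA in the to column for orange. Counting through a pasted key rather than aggregate() is deliberate here, since aggregate() drops groups whose grouping value is NA.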
akrun answered Oct 21 '25 17:10



