
R: Retrieve data from split string in a column based on value in another column

Tags:

split

r

strsplit

I have a very large data frame like:

df = data.frame(nr = c(3,3,4), dependeny = c("6/3/1", "9/3/1",
  "5/4/4/1"), token=c("Trotz des Rückgangs", 
  "Trotz meherer Anfragen", "Trotz des ärgerlichen Unentschiedens"))

  nr dependeny                                token
1  3     6/3/1                  Trotz des Rückgangs
2  3     9/3/1               Trotz meherer Anfragen
3  4   5/4/4/1 Trotz des ärgerlichen Unentschiedens

I would like to add a 4th column containing an extract from "token", chosen by the values in "nr" and "dependency". More precisely, I want the elements of "token" at the positions where the corresponding element of "dependency" equals "nr".

Examples: Row 1: I want "des", because "nr" is 3, and 3 is the second element in "dependency". The second element in "token" is "des".

Row 3: I want "des ärgerlichen", because "nr" is 4, and 4 is the second and third element in "dependency". The second and third elements in "token" are "des ärgerlichen".

I've tried strsplit and str_split, but I do not know how to address the resulting elements.

Simone asked Dec 06 '25 05:12



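1 Answer

We can create the 4th column with base R: split "dependeny" and "token" into their elements, then use Map to compare each row's "nr" with its split "dependeny" values and paste together the tokens at the matching positions.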

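# x = nr, y = split "dependeny", z = split "token" (one row at a time).
# The new column name "extract" is arbitrary.
df$extract <- unlist(Map(function(x, y, z) paste(z[x == y], collapse = ' '),
         df$nr, strsplit(as.character(df$dependeny), '/'),
            strsplit(as.character(df$token), ' ')))
df$extract
#[1] "des"             "meherer"         "des ärgerlichen"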
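
The same idea can be broken into steps to show how the split elements are addressed. This is a minimal sketch rather than the only way to do it: the helper `extract_tokens` and the column name `extract` are hypothetical names chosen for illustration. It assumes every row has as many "dependeny" values as it has space-separated tokens.

```r
# Sample data from the question (column name "dependeny" kept as spelled there)
df <- data.frame(
  nr = c(3, 3, 4),
  dependeny = c("6/3/1", "9/3/1", "5/4/4/1"),
  token = c("Trotz des Rückgangs", "Trotz meherer Anfragen",
            "Trotz des ärgerlichen Unentschiedens"),
  stringsAsFactors = FALSE
)

# strsplit() returns a list holding one character vector per row,
# e.g. dep_parts[[3]] is c("5", "4", "4", "1")
dep_parts   <- strsplit(df$dependeny, "/", fixed = TRUE)
token_parts <- strsplit(df$token, " ", fixed = TRUE)

# Hypothetical helper: keep the tokens at positions where the
# dependency value equals nr, then join them with a space.
extract_tokens <- function(nr, dep, tok) {
  keep <- as.numeric(dep) == nr
  paste(tok[keep], collapse = " ")
}

# mapply() walks the three inputs in parallel, one row at a time
df$extract <- mapply(extract_tokens, df$nr, dep_parts, token_parts)
df$extract
# "des" "meherer" "des ärgerlichen"
```

Converting the split values with as.numeric() before comparing avoids relying on R's implicit number-to-string coercion. A row with no matching position yields an empty string.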
akrun answered Dec 09 '25 20:12



