
With data.table, return between certain characters into a new column

I have a feeling this might be a simple question, but although I've searched through SO for a bit now and found many interesting related Q/As, I'm still stumped.

Here's what I need to learn (in honesty, I'm playing with the kaggle Titanic dataset, but I want to use data.table)...

Let's say you have the following data.table:

library(data.table)
dt <- data.table(name=c("Johnston, Mr. Bob", "Stone, Mrs. Mary", "Hasberg, Mr. Jason"))

I want my output to be JUST the titles "Mr.", "Mrs.", and "Mr." -- heck we can leave out the period as well.

I've been playing around (all night) and discovered that using regular expressions might hold the answer, but I've only been able to get that to work on a single string, not with the whole data.table.

For example,

substr(dt$name[1], gregexpr(",.", dt$name[1]), gregexpr("[.]", dt$name[1]))

Returns:

[1] ", Mr."

Which is cool, and I can do some further processing to get rid of the ", " and ".", but the optimist(/optimizer) in me feels that that's ugly, gross, and inefficient.

Besides, even if I wanted to settle on that, (it pains me to admit) I don't know how to apply that in the j of data.table....

So, how do I add a column to dt called "Title", that contains:

[1] "Mr"
[2] "Mrs"
[3] "Mr"

I firmly believe that if I'm able to use regular expressions to select and extract data within a data.table that I will probably use this 100x a day. So thank you in advance for helping me figure out this pivotal technique.

PS. I'm an excel refugee, in excel I would just do this:

=mid(data, find(", ", data), find(".", data))
wizard_draziw asked Dec 21 '25 12:12


1 Answer

Umm.. I may have figured it out:

dt[, Title:=sub(".*?, (.*?)[.].*", "\\1", name)]

But I'm going to leave this here in case anyone else needs help, or perhaps there's an even better way of doing this!
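For anyone comparing options, here is a minimal sketch that runs the sub() call above next to a position-based variant that is closer to the Excel find/mid idea. It assumes every name contains ", " followed later by a "."; the column name Title2 is only a label chosen for this illustration. regexpr() returns the first match position for each element of a vector, so substr() can work on the whole column at once.

```r
library(data.table)

dt <- data.table(name = c("Johnston, Mr. Bob", "Stone, Mrs. Mary", "Hasberg, Mr. Jason"))

# Regex approach: capture the text between ", " and the first "." and
# replace the whole string with that captured group.
dt[, Title := sub(".*?, (.*?)[.].*", "\\1", name)]

# Position approach: start two characters after the ", " match and stop
# one character before the first ".". Title2 is a hypothetical column name.
dt[, Title2 := substr(name, regexpr(", ", name) + 2L, regexpr("[.]", name) - 1L)]

print(dt)
```

Both columns should hold the same titles for these three names. The two approaches differ when a name does not match: sub() returns the input unchanged, so it may be worth checking for rows where Title equals name before relying on the result.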

wizard_draziw answered Dec 23 '25 02:12


