
r: how to insert a value into a specific pattern

Tags: regex, r

I'm trying to add a special character to a series of values. But I don't know how.

Here is the original input:

chemical <- "200mL of Ac2O3, 3.5mml of AgBF4, 10.0ml of AgBr, 100ml of AgCl3Cu2"

And I want:

"200mL of Ac~2~O~3~, 3.5mml of AgBF~4~, 10.0ml of AgBr, 100ml of AgCl~3~Cu~2~"

Basically, I want to add a "~" before and after every number that appears inside a chemical formula in the original data, while leaving the quantities (200, 3.5, 10.0, 100) untouched.

I was trying to use gsub but I am not sure how I am supposed to tell R to find just those numbers in a chemical formula and then do the insertion.

Does anyone have a thought on this? Thank you!

asked Dec 07 '25 06:12 by Connie


2 Answers

gsub("(?<=[A-Za-z])([0-9])", "~\\1~", chemical, perl = TRUE)
[1] "200mL of Ac~2~O~3~, 3.5mml of AgBF~4~, 10.0ml of AgBr, 100ml of AgCl~3~Cu~2~"

Here you need the positive lookbehind syntax (?<=...) to require that each digit is preceded by a letter, upper case or lower case: [A-Za-z]. The parentheses around the digit make a capture group, which you refer to in the replacement as \1. Inside an R string the backslash must itself be escaped, so the replacement is written "~\\1~". perl = TRUE is needed because R's default regex engine does not support lookbehind.
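One thing to watch with the pattern above: [0-9] matches a single digit, so a subscript of 10 or more would only have its first digit wrapped. A minimal sketch, assuming a made-up input (chemical2 is hypothetical, not from the question), with a + quantifier added so the whole run of digits is captured:

```r
# Hypothetical input with multi-digit subscripts (not from the question)
chemical2 <- "5g of C12H22O11, 10.0ml of AgBr"

# Single-digit pattern: only the first digit after each letter is wrapped
single <- gsub("(?<=[A-Za-z])([0-9])", "~\\1~", chemical2, perl = TRUE)
single
# [1] "5g of C~1~2H~2~2O~1~1, 10.0ml of AgBr"

# With +, the capture group takes the whole run of digits
multi <- gsub("(?<=[A-Za-z])([0-9]+)", "~\\1~", chemical2, perl = TRUE)
multi
# [1] "5g of C~12~H~22~O~11~, 10.0ml of AgBr"
```

The quantities 5 and 10.0 stay untouched in both cases because none of their digits directly follows a letter.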

answered Dec 09 '25 20:12 by denis


This succeeds on your example. Whether it will hold up on more varied input might be an issue:

gsub("([^ [:digit:].])([[:digit:]])", "\\1~\\2~", chemical)
#[1] "200mL of Ac~2~O~3~, 3.5mml of AgBF~4~, 10.0ml of AgBr, 100ml of AgCl~3~Cu~2~"

The logic is to match a {non-digit, non-space, non-decimal-point} character followed by a digit, and to flank the digit with tildes. If the "number" could ever exceed 9, you would want to put a quantifier after the digit class: "[[:digit:]]{1,30}" perhaps (with no space inside the braces).
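A rough sketch of both points, using hypothetical inputs that are not from the question. The + quantifier (used here instead of a bounded count) handles multi-digit subscripts, and a string such as "mix 1:2" shows the kind of more varied case where this pattern wraps a digit that is not part of a formula, because the colon also counts as a non-digit, non-space, non-dot character:

```r
# Hypothetical inputs, chosen only for illustration
sugar <- "5g of C12H22O11, 10.0ml of AgBr"
ratio <- "mix 1:2"

# Same pattern as above, with + so the second group takes a whole run of digits
pattern <- "([^ [:digit:].])([[:digit:]]+)"

sugar_out <- gsub(pattern, "\\1~\\2~", sugar)
sugar_out
# [1] "5g of C~12~H~22~O~11~, 10.0ml of AgBr"

# The ":" before the 2 satisfies the first group, so the 2 gets wrapped
ratio_out <- gsub(pattern, "\\1~\\2~", ratio)
ratio_out
# [1] "mix 1:~2~"
```

If input like the second string is possible, restricting the first group to letters, for example ([[:alpha:]]), would be one way to tighten the match.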

answered Dec 09 '25 19:12 by IRTFM