Add column based on pattern matching

jk454

I have a data frame with two columns and would like to fill the second column (which is currently empty) based on the value of the first. If there is "_ab", "_cd" or "_ef" at any position in the first column, "ab", "cd" or "ef" respectively should be in column 2.

That's how it should look:

c1         c2
s_d_ab     ab
a_cd_aa    cd
s_sar_ef   ef

In Excel I would copy down the formula =IF(FIND("_ab",A1), "ab", ""), then hide the rows where there was no match and overwrite the formula with =IF(FIND("_cd",A1), "cd", ""), and so on. Not the most elegant method, but it works.

What is the best way to approach this in R?

I managed to get a vector of logical values where the conditions apply (grepl("_ab", data$c1)), but I don't know how to set the value of the second column.



answered 5 years ago Justin #1

You can use the sub function:

x <- c('s_d_ab', 'a_cd_aa', 's_sar_ef')
sub('.*_(ab|cd|ef).*', '\\1', x)
# [1] "ab" "cd" "ef"
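
To put the result into the second column, something along these lines should work. It is only a sketch: the data frame name data and the columns c1 and c2 are taken from the question, and the extra "no_match" row is made up to show what happens when nothing matches. Note that sub returns its input unchanged when the pattern does not match, so the sketch fills only the rows where grepl reports a match and leaves the others as NA.

```r
# Example data from the question, plus one hypothetical non-matching row.
# c2 starts out empty (NA).
data <- data.frame(c1 = c("s_d_ab", "a_cd_aa", "s_sar_ef", "no_match"),
                   c2 = NA_character_,
                   stringsAsFactors = FALSE)

pattern <- ".*_(ab|cd|ef).*"

# sub() hands back the original string when there is no match,
# so restrict the assignment to rows where grepl() finds the pattern.
hit <- grepl(pattern, data$c1)
data$c2[hit] <- sub(pattern, "\\1", data$c1[hit])

print(data)
```

Because the leading .* is greedy, a string containing more than one of the suffixes gets the last one that occurs in it.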
