It seems grep is "greedy" in the way it returns matches. Suppose I have the following data:
Sources <- c(
  "Coal burning plant",
  "General plant",
  "coalescent plantation",
  "Charcoal burning plant"
)
Registry <- seq(from = 1100, to = 1103, by = 1)
df <- data.frame(Registry, Sources)
If I run grep("(?=.*[Pp]lant)(?=.*[Cc]oal)", df$Sources, perl = TRUE, value = TRUE), it returns:
"Coal burning plant"     
"coalescent plantation"  
"Charcoal burning plant" 
However, I only want exact matches, i.e. only entries where "coal" and "plant" occur as whole words. I don't want "coalescent", "plantation" and so on. For this data, I only want to see "Coal burning plant".
For reference, the command-line grep is a Linux tool that searches files for a string or regular expression and prints every line containing a match. Its -x option prints only lines that match the search string in their entirety, and its -w option restricts matches to whole words. These are flags of the command-line tool, not arguments of R's grep() function.
If you always want the order "coal" then "plant", then this should work:
grep("\\b[Cc]oal\\b.*\\b[Pp]lant\\b", Sources, perl = TRUE, value = TRUE)
Here we add \b, which matches a word boundary. You can add word boundaries to your original attempt as well:
grep("(?=.*\\b[Pp]lant\\b)(?=.*\\b[Cc]oal\\b)", Sources, 
    perl = TRUE, value = TRUE)
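To make the difference between the two patterns concrete, here is a small sketch. The fifth string, "Plant that burns coal", is a hypothetical entry added only for illustration; it is not part of the original data.

```r
Sources <- c(
  "Coal burning plant",
  "General plant",
  "coalescent plantation",
  "Charcoal burning plant"
)

# Hypothetical extra entry with the two words in reverse order
extra <- c(Sources, "Plant that burns coal")

# Ordered pattern: whole word "coal" must come before whole word "plant"
ordered <- grep("\\b[Cc]oal\\b.*\\b[Pp]lant\\b", extra,
                perl = TRUE, value = TRUE)

# Lookahead pattern: both whole words must occur, in any order
any_order <- grep("(?=.*\\b[Pp]lant\\b)(?=.*\\b[Cc]oal\\b)", extra,
                  perl = TRUE, value = TRUE)

print(ordered)    # "Coal burning plant"
print(any_order)  # "Coal burning plant" "Plant that burns coal"
```

Both patterns reject "coalescent plantation" and "Charcoal burning plant", because in those strings "coal" or "plant" is not delimited by word boundaries on both sides. Only the lookahead version picks up the reversed entry.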
You want to use word boundaries \b around your word patterns. A word boundary does not consume any characters. It asserts that on one side there is a word character, and on the other side there is not. You may also want to consider using the inline (?i) modifier for case-insensitive matching.
grep('(?i)(?=.*\\bplant\\b)(?=.*\\bcoal\\b)', df$Sources, perl = TRUE, value = TRUE)
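If you need the matching rows of the data frame rather than just the strings, grepl() can be used as a row filter. The sketch below also builds the lookahead pattern from a vector of words; all_words_pattern is a hypothetical helper name, not part of base R, and it assumes the words contain no regex metacharacters.

```r
Sources <- c(
  "Coal burning plant",
  "General plant",
  "coalescent plantation",
  "Charcoal burning plant"
)
Registry <- seq(from = 1100, to = 1103, by = 1)
df <- data.frame(Registry, Sources)

# Hypothetical helper: one case-insensitive lookahead per whole word.
# Assumes the words contain no regex metacharacters.
all_words_pattern <- function(words) {
  paste0("(?i)", paste0("(?=.*\\b", words, "\\b)", collapse = ""))
}

pattern <- all_words_pattern(c("plant", "coal"))

# grepl() returns a logical vector, so it can index the rows directly
hits <- df[grepl(pattern, df$Sources, perl = TRUE), ]
print(hits)
```

With the data above, this should keep only the row with Registry 1100 ("Coal burning plant").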