Replacing multiple strings in multiple files

Tags: bash, awk

I have a mapping file containing a list of regular expressions and literal replacement strings, one pair per line, in the following format:

OLD_REGEXP_1 NEW_STRING_1
OLD_REGEXP_2 NEW_STRING_2
...

I want to replace every string that matches OLD_REGEXP_X with NEW_STRING_X in multiple *.txt files.

I believe this is a common problem and that someone must have solved it before, but I couldn't find an existing solution written in bash.

For example, given this mapping file:

Tom Thompson
Billy Bill&Ted
goog1e\.com google.com
https?://www\.google\.com https://google.com

Input:

Tom and Billy are visiting http://www.goog1e.com

Expected output:

Thompson and Bill&Ted are visiting https://google.com

The major challenges are:

  • The strings to be replaced are described by POSIX Extended Regular Expressions, not literal strings. Any character that is not a POSIX ERE metacharacter must be treated as literal, including /, which some tools use as a regexp delimiter.
  • The replacement strings are literal and can contain any character, including sequences such as & and \1 that many tools treat as backreferences in replacement strings. Here they must stay literal.
  • Replacements must occur in the order they appear in the mapping file. If the mapping file lists A->B and then B->C, and A appears in the text file being changed, the output must contain "C" in place of "A", not "B".
asked Oct 14 '25 09:10 by kit


1 Answer

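You can convert your substitution list file into a sed script, then let sed do the job for you. Because sed applies the commands of a script to each line in order, the replacements happen in mapping-file order.

Give this a try with GNU sed: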

sed -E -i -f <(sed -E 's/^(\S*) (.*)/s@\1@\2@g/' listfile) *.txt
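One caveat with a generated sed script is that the replacement text is still parsed by sed's s command, so &, backslash sequences and the @ delimiter keep their special meaning there (Bill&Ted would come out as BillBillyTed). As a rough alternative, here is a minimal awk-based sketch that keeps the replacement fully literal by splicing strings with match() and substr() instead of sub()/gsub(). The names apply_map and replace_all are made up for this illustration. It assumes the first space on each mapping line separates the regexp from the replacement, and that no pattern matches the empty string.

```shell
# apply_map MAPFILE FILE...
# Hypothetical helper (not part of the original answer): rewrites each FILE
# in place, applying the mappings from MAPFILE in order.
apply_map() {
    local map=$1 f tmp
    shift
    for f in "$@"; do
        tmp=$(mktemp) || return 1
        awk '
            # Replace every match of ERE "re" in "s" with the literal text "lit".
            # Matching restarts on the rest of the line after each replacement,
            # so a ^ anchor can match more than once. Empty matches are skipped.
            function replace_all(s, re, lit,    out) {
                out = ""
                while (match(s, re) && RLENGTH > 0) {
                    out = out substr(s, 1, RSTART - 1) lit
                    s = substr(s, RSTART + RLENGTH)
                }
                return out s
            }
            BEGIN {
                # Read the mapping file here and hide it from the main loop.
                mapfile = ARGV[1]; ARGV[1] = ""
                while ((getline line < mapfile) > 0) {
                    sep = index(line, " ")      # first space splits the pair
                    if (sep == 0) continue
                    old[++n] = substr(line, 1, sep - 1)
                    new[n] = substr(line, sep + 1)
                }
                close(mapfile)
            }
            {
                # Apply the mappings in file order, so A->B then B->C yields C.
                for (i = 1; i <= n; i++) $0 = replace_all($0, old[i], new[i])
                print
            }
        ' "$map" "$f" > "$tmp" && cat "$tmp" > "$f"
        rm -f "$tmp"
    done
}
```

It would be called as apply_map listfile *.txt. Writing to a temporary file and copying it back keeps the original file's permissions, at the cost of not being atomic.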
answered Oct 17 '25 01:10 by Kent


