I want to take the string foofoofoo, replace each occurrence of foo with bar one at a time, and return the individual results as an array: ['barfoofoo', 'foobarfoo', 'foofoobar']
This is the best I have:
require 'pp'

def replace(string, pattern, replacement)
  results = []
  string.length.times do |idx|
    # Find the next match at or after idx
    match_index = (Regexp.new(pattern) =~ string[idx..-1])
    next unless match_index
    match_index = idx + match_index
    # Text before the match, if any
    prefix = ''
    if match_index > 0
      prefix = string[0..match_index - 1]
    end
    # Text after the match, if any
    suffix = ''
    if match_index < string.length - pattern.length
      suffix = string[match_index + pattern.length..-1]
    end
    results << prefix + replacement + suffix
  end
  # The same match is found from several starting indices, so deduplicate
  results.uniq
end

pp replace("foofoofoo", 'foo', 'bar')
This works (at least for this test case), but it seems too verbose and hacky. Can I do better, perhaps by using String#gsub with a block or something similar?
It is easy to do with pre_match ($`) and post_match ($'):
def replace_matches(str, re, repl)
  return enum_for(:replace_matches, str, re, repl) unless block_given?
  str.scan(re) do
    yield "#$`#{repl}#$'"
  end
end

str = "foofoofoo"

# block usage
replace_matches(str, /foo/, "bar") { |x| puts x }

# enum usage
puts replace_matches(str, /foo/, "bar").to_a
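Since the question asks for an array, the same idea can also be sketched with the MatchData accessors rather than the special globals. The method name `single_replacements` is made up for this illustration, the replacement is inserted literally (no backreferences), and only non-overlapping matches are covered.

```ruby
# Hypothetical helper (the name is not from the answer above): collects each
# single-occurrence replacement into an array.
def single_replacements(str, re, repl)
  results = []
  str.scan(re) do
    m = Regexp.last_match # the MatchData for the current match, same as $~
    results << m.pre_match + repl + m.post_match
  end
  results
end

p single_replacements("foofoofoo", /foo/, "bar")
# => ["barfoofoo", "foobarfoo", "foofoobar"]
```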
EDIT: If you have overlapping matches, it becomes harder, as regular expressions aren't really equipped to deal with them. You can do it like this:
def replace_matches(str, re, repl)
  return enum_for(:replace_matches, str, re, repl) unless block_given?
  re = /(?=(?<pattern>#{re}))/
  str.scan(re) do
    pattern_start = $~.begin(0)
    pattern_end = pattern_start + $~[:pattern].length
    yield str[0 ... pattern_start] + repl + str[pattern_end .. -1]
  end
end

str = "oooo"
replace_matches(str, /oo/, "x") { |x| puts x }
Here we abuse positive lookahead, which is zero-width, so we can get overlapping matches. However, we also need to know how many characters were matched, and we can no longer read that off the match itself now that it is zero-width. So we capture the contents of the lookahead in a named group and calculate the match width from that.
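To make the difference concrete, here is a small sketch comparing a consuming pattern with its lookahead-wrapped form on the same string. The variable names are only for illustration.

```ruby
str = "oooo"

# A consuming pattern finds only the non-overlapping matches.
plain_matches = str.scan(/oo/)

# A zero-width lookahead matches at every position where "oo" starts,
# because the scan advances by one character after each empty match.
overlap_starts = []
str.scan(/(?=(oo))/) { overlap_starts << Regexp.last_match.begin(0) }

p plain_matches   # => ["oo", "oo"]
p overlap_starts  # => [0, 1, 2]
```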
(Disclaimer: it will still only match once per character; if you want to consider multiple possibilities at each character, like in your /f|o|fo/ case, it complicates things yet more.)
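One possible way to cover that case, assuming the alternatives can be listed as separate regexes, is to try each of them at every start position. This is only a sketch: `replace_each_alternative` is a hypothetical name, the replacement is inserted literally, and it does not attempt to split an arbitrary regex into its alternatives.

```ruby
# Hypothetical helper: tries every alternative at every start position.
def replace_each_alternative(str, alternatives, repl)
  results = []
  (0...str.length).each do |pos|
    alternatives.each do |alt|
      m = alt.match(str, pos)
      # Keep only matches that begin exactly at this position.
      next unless m && m.begin(0) == pos
      results << str[0...pos] + repl + str[m.end(0)..-1]
    end
  end
  results
end

p replace_each_alternative("fo", [/f/, /o/, /fo/], "x")
# => ["xo", "x", "fx"]
```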
EDIT: A bit of a tweak and we can even support proper gsub-like behaviour:
def replace_matches(str, re, repl)
  return enum_for(:replace_matches, str, re, repl) unless block_given?
  new_re = /(?=(?<pattern>#{re}))/
  str.scan(new_re) do
    pattern_start = $~.begin(0)
    pattern_end = pattern_start + $~[:pattern].length
    new_repl = str[pattern_start ... pattern_end].gsub(re, repl)
    yield str[0 ... pattern_start] + new_repl + str[pattern_end .. -1]
  end
end

str = "abcd"
replace_matches(str, /(?<first>\w)(?<second>\w)/, '\k<second>\k<first>').to_a
# => ["bacd", "acbd", "abdc"]
(Disclaimer: the last snippet can't handle cases where the pattern uses lookbehind or lookahead to check outside the match region.)
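A short sketch of that limitation, reusing the gsub-like method from above: the matched substring is extracted before gsub runs, so a lookbehind has nothing to look back at and the replacement silently does not happen.

```ruby
# Same method as in the last snippet above, repeated so this example runs
# on its own.
def replace_matches(str, re, repl)
  return enum_for(:replace_matches, str, re, repl) unless block_given?
  new_re = /(?=(?<pattern>#{re}))/
  str.scan(new_re) do
    pattern_start = $~.begin(0)
    pattern_end = pattern_start + $~[:pattern].length
    # gsub only sees the matched slice, not the text around it
    new_repl = str[pattern_start...pattern_end].gsub(re, repl)
    yield str[0...pattern_start] + new_repl + str[pattern_end..-1]
  end
end

re = /(?<=a)b/ # "b", but only when preceded by "a"

p "ab".sub(re, "x")                    # => "ax"
p replace_matches("ab", re, "x").to_a  # => ["ab"], the lookbehind fails on the slice "b"
```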
I don't think Ruby provides such functionality out of the box. However, here's my two cents, which may be more elegant:
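# Note: pattern must be a String here, since it is also used to rejoin
# the fragments on either side of the replaced occurrence.
def replace(str, pattern, replacement)
  count = str.scan(pattern).count
  # The -1 limit keeps trailing empty fragments
  fragments = str.split(pattern, -1)
  count.times.map do |occurrence|
    fragments[0..occurrence].join(pattern)
      .concat(replacement)
      .concat(fragments[(occurrence+1)..count].to_a.join(pattern))
  end
end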
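The negative limit passed to split matters for this approach. As a small illustration (variable names are only for this sketch), without it the trailing empty fragments are dropped and the pieces can no longer be rejoined into the original string:

```ruby
str = "foofoofoo"

default_split = str.split('foo')      # trailing empty strings are dropped
full_split    = str.split('foo', -1)  # a negative limit keeps them

p default_split            # => []
p full_split               # => ["", "", "", ""]
p full_split.join('foo')   # => "foofoofoo"
```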