I'm trying to get the value of the value attribute from this XML line in the terminal, so I'm using sed.
abcs='<param name="abc" value="bob3" no_but_why="4"/>'
echo $abcs | sed -e 's/.*value="\(.*\)" .*/\1/'
echo $abcs | sed -e 's/.*value="\(.*\)".*/\1/'
The output is:
bob3
bob3" no_but_why="4
Why does the second command, without the space, print more than what I wanted? Why would \1 be affected by that?
The difference is the greedy pattern .* inside the capture group, combined with what follows the closing " in each regex.
In the second regex, \(.*\) grabs as much as it can, and the " after it only needs to match some double quote. There is another double quote later in the line (the one closing no_but_why="4"), so .* extends all the way to that last " before />.
In the first regex, the group must be followed by a " and then a space. The only place after value=" where a double quote is followed by a space is right after bob3, so the greedy .* is forced to back off to there and \1 is just bob3.
To avoid this, use a negated character class instead of .* inside the group.
Consider these sed commands:
sed -e 's/.*value="\([^"]*\)" .*/\1/' <<< "$abcs"
bob3
sed -e 's/.*value="\([^"]*\)".*/\1/' <<< "$abcs"
bob3
Now both commands produce the same output, bob3, because the negated character class [^"]* cannot match a double quote: it stops at the next " instead of running to the very last " in the input, as .* does.
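A quick sketch of why the version with the space is fragile, assuming a hypothetical extra attribute (extra="5", not in the original line) is appended after no_but_why. With that change there is a later double quote followed by a space, so the greedy group overshoots even with the space in the pattern, while the negated character class still stops at the first closing quote:

```shell
# Hypothetical input: same line as in the question, plus an assumed extra="5" attribute.
abcs='<param name="abc" value="bob3" no_but_why="4" extra="5"/>'

# Greedy group followed by quote + space: .* runs to the LAST '" ' in the line.
greedy=$(printf '%s\n' "$abcs" | sed -e 's/.*value="\(.*\)" .*/\1/')

# Negated character class: the group cannot cross a double quote.
negated=$(printf '%s\n' "$abcs" | sed -e 's/.*value="\([^"]*\)" .*/\1/')

printf '%s\n' "$greedy"    # bob3" no_but_why="4
printf '%s\n' "$negated"   # bob3
```

So the space in the first regex only worked because, in the original line, bob3 happened to be followed by the last quote-plus-space in the input.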