I'm having a hard time figuring out how to achieve what I want with awk, and after searching for quite some time I couldn't find the solution I'm looking for.
I have an input text that looks like this:
Some text (possibly containing text within parenthesis).
Some other text
Another line (with something here) with some text
(
Element 4
)
Another line
(
Element 1, span 1 to
Element 5, span 4
)
Another Line
I want to properly format the weird lines between ' (' and ')'. The expected output is as follows:
Some text (possibly containing text within parenthesis).
Some other text
Another line (with something here) with some text
(Element 4)
Another line
(Element 1, span 1 to Element 5, span 4)
Another Line
Searching Stack Overflow, I found this:
How to select lines between two marker patterns which may occur multiple times with awk/sed
So what I'm using now is echo $text | awk '/ \(/{flag=1;next}/\)/{flag=0}flag'
This almost works, except that it filters out the non-matching lines. Here's the output produced by that last command:
(Element 4)
(Element 1, span 1 to Element 5, span 4)
Does anyone know how to do this? I'm open to any suggestion, including not using awk if you know a better way.
Bonus points if you can teach me how to remove syntax highlighting from the code blocks in my question :)
Thanks a billion times
Edit: OK, so I accepted @EdMorton's solution as he provided something using awk (well, GNU awk). However, I'm currently using @aaron's sed voodoo incantations with great success and will probably continue doing so until I hit anything new on that specific use case.
I strongly suggest reading EdMorton's explanation; the last paragraph made my day. If anyone passing by has good resources on awk/sed to share, feel free to do so in the comments.
Here's how I would do it with GNU sed:
s/^\s*(/(/;/^(/{:l N;/)/b e;b l;:e s/\n//g}
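Which, for those who don't speak gibberish, means:
s/^\s*(/(/ strips any whitespace in front of a leading (
/^(/{...} runs the enclosed commands only on lines that start with (
:l defines a label l, which denotes the start of a loop
N appends the next input line to the pattern space
/)/b e jumps to the label e once the pattern space contains a )
b l otherwise jumps back to l
:e defines a label e, which denotes the end of the code
s/\n//g removes the newlines from the joined lines
This can probably be refined, but it does the trick: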
$ echo """Some text (possibly containing text within parenthesis).
Some other text
Another line (with something here) with some text
(
Element 4
)
Another line
(
Element 1, span 1 to
Element 5, span 4
)
Another Line """ | sed 's/^\s*(/(/;/^(/{:l N;/)/b e;b l;:e s/\n//g}'
Some text (possibly containing text within parenthesis).
Some other text
Another line (with something here) with some text
(Element 4)
Another line
(Element 1, span 1 to Element 5, span 4)
Another Line
Edit: if you can disable history expansion (set +H), this sed command is nicer: s/^\s*(/(/;/^(/{:l N;/)/!b l;s/\n//g}
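For anyone who would rather keep this in a script than retype the one-liner, here is a sketch of the same loop spread over several lines and wrapped in a shell function. The function name join_parens_sed is made up for illustration, and [[:space:]] stands in for the GNU-only \s. The last three substitutions are a variation on s/\n//g, not the command above: they put a single space between joined lines instead of relying on the input's indentation. Treat it as a starting point under those assumptions.

```shell
# Hypothetical helper (name invented for this example).
# Joins each block that opens with a line starting with "(" and runs
# until the pattern space contains ")" into one line.
join_parens_sed() {
  sed '
    # strip whitespace in front of a leading (
    s/^[[:space:]]*(/(/
    # only lines that now start with ( enter the block
    /^(/ {
      :l
      # append the next input line to the pattern space
      N
      # no closing ) collected yet: go back to label l
      /)/!b l
      # drop the newline (and indentation) right after the opening (
      s/(\n[[:space:]]*/(/
      # drop the newline (and indentation) right before the closing )
      s/[[:space:]]*\n[[:space:]]*)/)/
      # turn every remaining line break into a single space
      s/[[:space:]]*\n[[:space:]]*/ /g
    }
  '
}
```

It reads standard input, so it can be used as join_parens_sed < file or at the end of a pipeline. Like the one-liner, it assumes the opening ( sits at the start of its own line.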
sed is for simple substitutions on individual lines, that is all. If you try to do anything else with it then you are using constructs that became obsolete in the mid-1970s when awk was invented, are almost certainly non-portable and inefficient, are always just a pile of indecipherable arcane runes, and are used today just for mental exercise.
The following uses GNU awk for multi-char RS, RT and the \s shorthand for [[:space:]] and works by simply isolating the (...) strings and then doing whatever you want with them:
$ cat tst.awk
BEGIN {
    RS="[(][^)]+[)]"    # a regexp for the string you want to isolate in RT
    ORS=""              # disable appending of newlines so we print as-is
}
{
    gsub(/\n[[:blank:]]+$/,"\n")    # remove any blanks before RT at the start of each line
    sub(/\(\s+/,"(",RT)             # remove spaces after ( in RT
    sub(/\s+\)/,")",RT)             # remove spaces before ) in RT
    gsub(/\s+/," ",RT)              # compress each chain of spaces to one blank char in RT
    print $0 RT                     # print the result
}
$ awk -f tst.awk file
Some text (possibly containing text within parenthesis).
Some other text
Another line (with something here) with some text
(Element 4)
Another line
(Element 1, span 1 to Element 5, span 4)
Another Line
If you're considering using a sed solution for this also consider how you would enhance it if/when you have the slightest requirements change. Any change to the above awk code would be trivial and obvious while a change to the equivalent sed code would require first sacrificing a goat under a blood moon then breaking out your copy of the Rosetta Stone...
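If GNU awk is not available, a line-based approach should give the same result on the sample input with any POSIX awk. The sketch below is an alternative built on assumptions, not the script above: the function name join_parens_awk and the variables buf and inside are invented for illustration, and it assumes each block opens with a line holding only ( and closes with a line holding only ).

```shell
# Hypothetical helper (name invented for this example).
# Collects the lines between a lone "(" and a lone ")" and prints them
# as one "(...)" line; every other line is printed unchanged.
join_parens_awk() {
  awk '
    # a line holding only "(" opens a block: start a fresh buffer
    /^[ \t]*\($/ { buf = "("; inside = 1; next }

    # a line holding only ")" closes the block: print what was collected
    inside && /^[ \t]*\)$/ { print buf ")"; inside = 0; next }

    # inside a block: trim the line and append it, space-separated
    inside {
      line = $0
      gsub(/^[ \t]+|[ \t]+$/, "", line)
      buf = buf (buf == "(" ? "" : " ") line
      next
    }

    # outside a block: pass the line through untouched
    { print }
  '
}
```

It reads standard input, for example join_parens_awk < file. The [ \t] bracket expression is used instead of [[:space:]] so that older awk builds without POSIX character classes still work.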