
Using sed, awk, etc. to separate after middle dot characters

Tags: regex, bash, sed, awk

I could use your assistance with something; I promise I tried really hard to search for answers, but had no luck.

I want to separate text between every occurrence of the "·" (middle dot) character (by syllables, basically).

echo con·grat·u·late | sed -e 's/·.*$/·/1'

The code above outputs:

con·

That is the first part of what I want, but ultimately I would like an output of:

con·
grat·
u·
late

This will involve getting the characters between the 1st and 2nd, and between the 2nd and 3rd, occurrences of "·".

If anyone can guide me in the right direction, I will really appreciate it, and I will figure the rest out on my own.

EDIT: My apologies, I displayed my desired output incorrectly. Your solutions worked great, however.

Since it is important for me to keep everything on a single line, how would I print only the text between the first dot and the second one, so that the output is:

grat·

I am doing it in UTF-8, Jonathan

Once again, sorry for asking the wrong thing.

TuxForLife asked Dec 05 '25 10:12


1 Answer

In GNU sed you can do this:

echo con·grat·u·late | sed -e 's/·/&\n/g'

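The & stands for the text matched by the pattern, in this example the ·, so every dot is replaced by itself followed by a newline. Unfortunately this doesn't work in BSD sed, which does not interpret \n in the replacement as a newline the way GNU sed does.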
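If you would rather stay with sed on a BSD system, one option is to put a real newline in the replacement, escaped with a backslash, which is the form POSIX describes. This is only a sketch and not part of the original answer; the function name split_syllables is made up for illustration, and the continuation line has to start in the first column so that no extra spaces end up in the output.

```shell
# Split a line after every middle dot.
# The replacement is "&" (the matched dot) followed by a backslash and a
# literal newline, which sed treats as a newline in the output.
# split_syllables is a hypothetical helper name.
split_syllables() {
  printf '%s\n' "$1" | sed 's/·/&\
/g'
}

split_syllables 'con·grat·u·late'
```

This should print con·, grat·, u· and late on separate lines.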

For a more portable solution, I recommend this AWK command, which should work on both GNU and BSD systems:

echo con·grat·u·late | awk '{ gsub("·", "&\n") } 1'
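For the follow-up in the question's edit (printing only the piece between the first and second dot, such as grat·), a possible approach is to let AWK split on the dot and print a single field. This is a sketch under the assumption that the input is UTF-8, as the asker says; nth_syllable is a hypothetical name, and I have not checked every AWK implementation.

```shell
# Print the Nth "·"-separated piece of a word on one line.
# -F'·' makes the middle dot the field separator; the dot is added back
# unless the requested piece is the last one.
# nth_syllable is a hypothetical helper name.
nth_syllable() {
  printf '%s\n' "$1" | awk -F'·' -v n="$2" '{ print $n (n < NF ? "·" : "") }'
}

nth_syllable 'con·grat·u·late' 2   # prints grat·
```

Changing the second argument selects a different piece, for example 4 for late.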
janos answered Dec 07 '25 01:12


