
How can I add a carriage return in a text using regex?

Tags: c#, .net, regex

I have a text file with multiple lines. I'm trying to write a pattern that adds a carriage return to some lines of the text. These lines look like this:

lorem ipsum.
dolor sit amet, consectetur adipiscing elit [FIS] Donec feugiat

The pattern is a line followed by another line that contains some characters and a '[' character. If '[' is not present, the pattern should fail and no carriage return should be added.

How can I do this using regular expressions?

I'm using C#, both as the programming language and for its regex engine.

jaloplo asked Dec 20 '25 08:12


2 Answers

If you want to add a line break after a . then you just replace it with itself and a line break. To make sure it is the last character, use a lookahead to check it is followed by whitespace, i.e. (?=\s)


So, to replace with a newline character (recommended for most situations):

Regex.Replace(input, @"\.(?=\s)", ".\n")


If you must use carriage return (and there are very few places that require it, even on Windows), you can simply add one:

Regex.Replace(input, @"\.(?=\s)", ".\r\n")
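As a minimal sketch of the two replacements above in C# (the class and method names are hypothetical, chosen only for illustration), something like the following should work with System.Text.RegularExpressions. Note that the replacement is written as a regular string literal, so the C# compiler turns \n and \r\n into real control characters, and the . in the replacement needs no escaping.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class; the names are not from the original answer.
public static class LineBreakSketch
{
    // Inserts "\n" after every '.' that is directly followed by whitespace.
    // The lookahead (?=\s) does not consume that whitespace.
    public static string AddNewline(string input) =>
        Regex.Replace(input, @"\.(?=\s)", ".\n");

    // Same match, but inserts a carriage return plus newline ("\r\n").
    public static string AddCrLf(string input) =>
        Regex.Replace(input, @"\.(?=\s)", ".\r\n");
}
```

Because the lookahead requires a whitespace character, a . at the very end of the input (with nothing after it) is left alone, and so is a . inside something like "a.b".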


If you want to ensure that a . is always followed by two line breaks, without adding extra line breaks if they are already there, then it gets a little more complex and requires a negative lookahead, but looks like this:

Regex.Replace(input, @"\.(?!\S)(?:\r?\n){0,2}", ".\r\n\r\n")

The negative lookahead for a non-space, (?!\S), makes sure the . is actually at the end of a word. Because quantifiers are greedy by default, the {0,2} will then try to match two line breaks, then one, then zero.

(If you might have more than two newlines and want to reduce to two, you can just use {0,} instead, which has * as a shortcut notation.)
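A rough C# sketch of both variants, again with hypothetical class and method names, might look like this. It assumes line breaks in the input are either \n or \r\n, as in the pattern above.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class; the names are not from the original answer.
public static class ParagraphBreakSketch
{
    // Replaces a '.' at the end of a word, plus up to two existing line
    // breaks, with the '.' and exactly two "\r\n" breaks.
    public static string EnsureTwoBreaks(string input) =>
        Regex.Replace(input, @"\.(?!\S)(?:\r?\n){0,2}", ".\r\n\r\n");

    // Same idea, but * swallows any number of existing line breaks,
    // so three or more breaks are reduced to two.
    public static string CollapseToTwoBreaks(string input) =>
        Regex.Replace(input, @"\.(?!\S)(?:\r?\n)*", ".\r\n\r\n");
}
```

Running EnsureTwoBreaks on text that already has two line breaks after the . leaves it unchanged, which is the point of consuming the existing breaks in the match. Unlike (?=\s), the negative lookahead (?!\S) also succeeds at the very end of the input, so a trailing . gets the two line breaks as well.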


It's probably worth pointing out that none of the above will consume any spaces/tabs. If this is desired, the lookaheads can be changed from (?=\s) to \s+, or you can do a second replace of \n[ \t]+ with \n to remove any leading spaces/tabs, or something similar, depending on exactly what you're trying to do.
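The two whitespace options mentioned above could be sketched in C# roughly as follows (hypothetical names again). Which one fits depends on whether existing line breaks after the . should survive.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class; the names are not from the original answer.
public static class WhitespaceSketch
{
    // Option 1: consume the whitespace after the '.' instead of only
    // looking ahead at it. \s+ also matches existing line breaks, so the
    // '.' ends up followed by exactly one "\n".
    public static string BreakAndConsume(string input) =>
        Regex.Replace(input, @"\.\s+", ".\n");

    // Option 2: a second pass that removes spaces/tabs at the start of
    // each line, leaving the line breaks themselves untouched.
    public static string StripLeadingBlanks(string input) =>
        Regex.Replace(input, @"\n[ \t]+", "\n");
}
```

Option 2 can be chained after any of the earlier replacements, since it only touches spaces and tabs that directly follow a \n.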

Peter Boughton answered Dec 21 '25 20:12


I believe you can use \r for a carriage return and \n for a new line.
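One C#-specific detail, sketched below with hypothetical names: \r and \n are turned into control characters by the C# compiler in a regular string literal, not by the regex engine. .NET replacement patterns do not process character escapes, so a verbatim literal such as @".\n" in the replacement position inserts a backslash followed by the letter n.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class EscapeSketch
{
    // Regular literal: "\r\n" is already the characters U+000D and U+000A
    // by the time Regex.Replace sees it.
    public static string WithRealBreak(string input) =>
        Regex.Replace(input, @"\.", ".\r\n");

    // Verbatim literal: the replacement is '.', '\', 'n', and the regex
    // engine does not translate it into a newline.
    public static string WithLiteralBackslashN(string input) =>
        Regex.Replace(input, @"\.", @".\n");
}
```

In the pattern argument the situation is different: the regex engine itself understands \r and \n, so @"\r?\n" works as expected there.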

northpole answered Dec 21 '25 20:12


