Regex matching only numbers

Tags: regex, shell, sed

I am having problems understanding what my regex in bash shell is doing exactly.

I have the string abcde 12345 67890testing. I want to extract 12345 from this string using sed.

However, using sed -re 's/([0-9]+).*/\1/' on the given string will give me abcde 12345.

Alternatively, using sed -re 's/([\d]+).*/\1/' would actually only extract abcd.

Am I wrong in assuming that the expression [0-9] and [\d] ONLY capture digits? I have no idea how abcd is being captured yet the string 67890 is not. Plus, I want to know why the space is being captured in my first query?

In addition, sed -re 's/^.*([0-9]+).*/\1/' gives me 0. In this instance, I completely do not understand what the regex is doing. I'd thought that the expression ^.*[0-9]+ would only capture the first instance of a string of only numbers? However, it's matching only the last 0.

All in all, I'd like to understand where I am going wrong with all of these, and how the problem should be solved WITHOUT using [\s] in the regex to isolate the first string of numbers.

Gin asked Sep 19 '25 07:09



2 Answers

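sed -E 's/([0-9]+).*/\1/' <<< "$s"

Here $s holds the string from the question. The above command means: find the first run of digits together with everything after it, and replace that whole match with just the digits. So it matches 12345 67890testing and replaces it with 12345. Nothing before the match is touched, which is why abcde and the space survive.

The final string will be abcde 12345.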
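To make the matching concrete, here is a small sketch that runs the three commands from the question against the same string. The variable names (s, first, second, third) are only for illustration, and the explanations in the comments are my reading of how sed applies each pattern.

```shell
s='abcde 12345 67890testing'

# 1) The leftmost match starts at the first digit, so "abcde " is never part
#    of the match and is left in place.
first=$(printf '%s\n' "$s" | sed -E 's/([0-9]+).*/\1/')
echo "$first"     # abcde 12345

# 2) Inside [...] sed does not treat \d as "any digit"; the bracket matches
#    the letter d (and, depending on the sed implementation, a backslash),
#    so the match starts at the d of "abcde".
second=$(printf '%s\n' "$s" | sed -E 's/([\d]+).*/\1/')
echo "$second"    # abcd

# 3) The leading .* is greedy: it takes as much as it can and leaves only a
#    single digit, the last one on the line, for the group.
third=$(printf '%s\n' "$s" | sed -E 's/^.*([0-9]+).*/\1/')
echo "$third"     # 0
```

In all three cases the pattern itself is behaving consistently; what differs is where the match starts and how much the greedy parts swallow.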

If you want to get only 12345, you should use grep:

grep -oE '[0-9]+' <<< "$s" | head -n 1

Or with sed you can use:

sed -E 's/[a-zA-Z ]*([0-9]+).*/\1/' <<< "$s"

This also consumes the letters and spaces before the digits, so only 12345 is left.
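If the text before the number is not limited to letters and spaces, a slightly more general variant may help. This is only a sketch: it assumes you want the first run of digits on each line, and it uses [^0-9]* (any non-digits) instead of listing the allowed characters, which also avoids [\s]. The variable names are made up for the example.

```shell
s='abcde 12345 67890testing'

# ^[^0-9]* skips every leading non-digit, whatever it is.
first=$(printf '%s\n' "$s" | sed -E 's/^[^0-9]*([0-9]+).*/\1/')
echo "$first"     # 12345

# With -n and the p flag, a line without digits prints nothing instead of
# passing through unchanged.
none=$(printf '%s\n' 'no digits here' | sed -nE 's/^[^0-9]*([0-9]+).*/\1/p')
echo "[$none]"    # []
```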

drolando answered Sep 21 '25 01:09



You can use:

sed 's/^[^0-9]*\([0-9]*\).*$/\1/' <<< "$s"
12345

Or else, modifying your sed:

sed 's/[^0-9]*\([0-9]\+\).*/\1/' <<< "$s"
12345

Without the extended-regex flag (-r or -E), you need to escape +, ( and ) in sed. Note that \+ in a basic regex is a GNU extension.

With -r it will be:

sed -r 's/[^0-9]*([0-9]+).*/\1/' <<< "$s"
12345
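
As a side-by-side illustration of the escaping rule, the sketch below writes the same extraction once as a basic regex and once as an extended regex. It assumes the string from the question; the variable names bre and ere are just labels for this example. The basic form uses \{1,\} rather than \+ so that it does not rely on GNU sed.

```shell
s='abcde 12345 67890testing'

# Basic regex (no flag): groups are written \( \), and \{1,\} means
# "one or more" in POSIX basic syntax.
bre=$(printf '%s\n' "$s" | sed 's/^[^0-9]*\([0-9]\{1,\}\).*/\1/')

# Extended regex (-E): ( ) and + are operators without backslashes.
ere=$(printf '%s\n' "$s" | sed -E 's/^[^0-9]*([0-9]+).*/\1/')

echo "$bre"   # 12345
echo "$ere"   # 12345
```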

UPDATE: You don't really need any external utility for this, as you can do it in Bash itself using its regex capabilities:

[[ "$s" =~ ([0-9]+) ]] && echo "${BASH_REMATCH[1]}"
12345
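
A slightly longer sketch of the same idea, assuming Bash (the =~ operator and BASH_REMATCH are not available in plain sh). Because the pattern is unanchored, the match starts at the leftmost digit, and BASH_REMATCH[0] holds the whole matched text. The variable name first is only illustrative.

```shell
#!/usr/bin/env bash
s='abcde 12345 67890testing'

# [[ ... =~ ... ]] returns success if the regex matches anywhere in $s.
if [[ $s =~ [0-9]+ ]]; then
    first=${BASH_REMATCH[0]}   # text of the whole match
    echo "$first"              # 12345
fi
```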
anubhava answered Sep 21 '25 00:09
