
Regex match digits between strings

Tags: c++, regex, c++11

I need to extract integer values from the following text, between strings "start:" and "end:", and only between.

 111222 garbage  999888 start:        123456       end:     start:         654321     end:

wanted results:

123456
654321

Here is what I have, but I need it to exclude the unknown number of spaces around the integer.

std::regex

(?<=start:)(.*?)(?=end:)


Boy asked Dec 02 '25 10:12


2 Answers

You may use

std::regex reg(R"(start:\s*(\d+)\s*end:)");


The pattern start:\s*(\d+)\s*end: matches start:, then zero or more whitespace characters, captures one or more digits into Group 1, and then matches zero or more whitespace characters followed by the end: substring.

Note that in case you cannot use raw string literals (R"(...)" notation), you may define the pattern with a regular string literal where all backslashes should be doubled: "start:\\s*(\\d+)\\s*end:".
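As a quick check of the note above, the following sketch writes the pattern both ways and runs a search with each. The helper name first_number is hypothetical and introduced only for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the first integer found between "start:" and
// "end:" using the given pattern, or an empty string if nothing matches.
std::string first_number(const std::string& text, const std::string& pattern) {
    const std::regex reg(pattern);
    std::smatch match;
    if (std::regex_search(text, match, reg)) {
        return match[1].str();  // Group 1 holds the digits
    }
    return "";
}

// The same pattern written as a raw string literal and as a regular literal.
const std::string raw_pattern = R"(start:\s*(\d+)\s*end:)";
const std::string escaped_pattern = "start:\\s*(\\d+)\\s*end:";
```

Both strings hold the same characters, so calling first_number with either one should give the same result.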

To obtain all matches, use std::sregex_token_iterator and pass 1 as the submatch argument so that only the Group 1 values are returned:

#include <regex>
#include <string>
#include <vector>

const std::regex reg(R"(start:\s*(\d+)\s*end:)");
std::string s = "garbage 111222 garbage ... 999888 fewfew... start:        123456       end:     start:         654321     end:";
// The trailing 1 selects capture group 1 (the digits) from each match.
std::vector<std::string> results(std::sregex_token_iterator(s.begin(), s.end(), reg, 1),
                           std::sregex_token_iterator());


If any value can appear between start: and end:, replace \d+ with .*? (which matches zero or more characters other than line break characters, as few as possible).
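A minimal sketch of that variant, assuming the values never span multiple lines. The function name extract_between is hypothetical and not part of the original answer.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every value found between "start:" and "end:",
// with the surrounding whitespace consumed by the \s* parts of the pattern.
std::vector<std::string> extract_between(const std::string& text) {
    // .*? is lazy, so it stops at the first point where \s*end: can match.
    static const std::regex reg(R"(start:\s*(.*?)\s*end:)");
    return std::vector<std::string>(
        std::sregex_token_iterator(text.begin(), text.end(), reg, 1),
        std::sregex_token_iterator());
}
```

For an input such as "start:   foo bar   end:  start: 42 end:", this should return the two values foo bar and 42 without the padding spaces.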

Wiktor Stribiżew answered Dec 05 '25 00:12


To extract the integer values between start: and end: without a lookbehind (which the default ECMAScript grammar of std::regex does not support), you could capture one or more digits in a capturing group:

  • start:\s* Match start: followed by zero or more whitespace characters
  • (\d+) Capture one or more digits in a group
  • (?=\s*end:) Positive lookahead that asserts that what follows is zero or more whitespace characters and end:

start:\s*(\d+)(?=\s*end:)
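One way to use this pattern from C++ is sketched below with std::sregex_iterator. The function name extract_ints is hypothetical, and the conversion with std::stoi assumes every value fits in an int.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the integers found between "start:" and "end:".
// Assumes each value fits in an int; std::stoi throws std::out_of_range otherwise.
std::vector<int> extract_ints(const std::string& text) {
    // The lookahead checks for \s*end: without consuming it.
    static const std::regex reg(R"(start:\s*(\d+)(?=\s*end:))");
    std::vector<int> values;
    for (std::sregex_iterator it(text.begin(), text.end(), reg), end; it != end; ++it) {
        values.push_back(std::stoi((*it)[1].str()));  // Group 1 holds the digits
    }
    return values;
}
```

Applied to the sample text from the question, this should yield 123456 and 654321 and skip the numbers that appear before the first start:.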

The fourth bird answered Dec 05 '25 00:12


