regex performance java vs c++11

I am learning regex in C++ and Java, so I ran a performance test on C++11 regex and Java regex with the same expression and the same number of inputs. Strangely, Java regex is much faster than C++11 regex. Is there anything wrong with my code? Please correct me.

Java code:

import java.util.regex.*;

public class Main {
    private final static int MAX = 1_000_000;
    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        Pattern p = Pattern.compile("^[\\w._]+@\\w+\\.[a-zA-Z]+$");
        for (int i = 0; i < MAX; i++) {
            p.matcher("[email protected]").matches();
        }
        long end = System.currentTimeMillis();
        System.out.print(end-start);
    }
}

C++ code:

#include <iostream>
#include <Windows.h>
#include <regex>

using namespace std;

int main()
{
    long long start = GetTickCount64();
    regex pat("^[\\w._]+@\\w+\\.[a-zA-Z]+$");
    for (long i = 0; i < 1000000; i++) {
        regex_match("[email protected]", pat);
    }
    long long end = GetTickCount64();
    cout << end - start;
    return 0;
}

Performance:

Java -> 1003ms
C++  -> 124360ms
Eddy asked Feb 25 '26 10:02


1 Answer

Made the C++ sample portable:

#include <iostream>
#include <chrono>
#include <regex>

using C = std::chrono::high_resolution_clock;
using namespace std::chrono_literals;

int main()
{
    auto start = C::now();
    std::regex pat("^[\\w._]+@\\w+\\.[a-zA-Z]+$");
    for (long i = 0; i < 1000000; i++) {
        regex_match("[email protected]", pat);
    }
    std::cout << (C::now() - start)/1.0ms;
}

On Linux, compiled with clang++ -std=c++14 -march=native -O3 -o clang ./test.cpp, I get 595.970 ms. See also Live On Wandbox

The Java version runs in 561 ms on the same machine.

Update: Boost Regex runs much faster; see the comparative benchmark below.

Caveat: synthetic benchmarks like these are very error-prone. For example, the compiler might notice that the loop has no observable side effects and optimize it away entirely.

More Fun: Adding Boost To The Mix

Using Boost 1.67 and Nonius Micro-Benchmarking Framework

[Image: benchmark results chart]

We can see that Boost's Regex implementations are considerably faster.

See the detailed sample data interactive online: https://plot.ly/~sehe/25/

Code Used

#include <iostream>
#include <regex>
#include <boost/regex.hpp>
#include <boost/xpressive/xpressive_static.hpp>
#define NONIUS_RUNNER
#include <nonius/benchmark.h++>
#include <nonius/main.h++>

template <typename Re>
void test(Re const& re) {
    regex_match("[email protected]", re);
}

static const std::regex std_normal("^[\\w._]+@\\w+\\.[a-zA-Z]+$");
static const std::regex std_optimized("^[\\w._]+@\\w+\\.[a-zA-Z]+$", std::regex::ECMAScript | std::regex::optimize);
static const boost::regex boost_normal("^[\\w._]+@\\w+\\.[a-zA-Z]+$");
static const boost::regex boost_optimized("^[\\w._]+@\\w+\\.[a-zA-Z]+$", static_cast<boost::regex::flag_type>(boost::regex::ECMAScript | boost::regex::optimize));

static const auto boost_xpressive = []{
    using namespace boost::xpressive;
    return cregex { bos >> +(_w | '.' | '_') >> '@' >> +_w >> '.' >> +alpha >> eos };
}();

NONIUS_BENCHMARK("std_normal",      [] { test(std_normal);      })
NONIUS_BENCHMARK("std_optimized",   [] { test(std_optimized);   })
NONIUS_BENCHMARK("boost_normal",    [] { test(boost_normal);    })
NONIUS_BENCHMARK("boost_optimized", [] { test(boost_optimized); })
NONIUS_BENCHMARK("boost_xpressive", [] { test(boost_xpressive); })

Note: here's the output of the HotSpot JVM JIT compiler:

  • http://stackoverflow-sehe.s3.amazonaws.com/fea76143-b712-4df9-97c3-4725b2f9e695/disasm.a.xz

This was generated using:

LD_PRELOAD=/home/sehe/Projects/stackoverflow/fcml-1.1.3/example/hsdis/.libs/libhsdis-amd64.so ./jre1.8.0_171/bin/java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly Main 2>&1 > disasm.a

sehe answered Mar 01 '26 01:03
