boost.regex: switching between ascii and unicode

Is there a convenient way in boost.regex to switch between ascii and utf?

The only way I see right now is to switch between boost::u32regex and boost::regex.

Is this the only way to switch between unicode and ascii?
I was hoping to be able to just pass a parameter to boost specifying my character encoding, and thereby avoid duplicating a lot of code.

user695652 asked Dec 05 '25 07:12


1 Answer

Is this the only way to switch between unicode and ascii?

Pretty much. What you think of as boost::regex is really a type alias:

namespace boost{
    template <class charT, class traits = regex_traits<charT>  >
    class basic_regex;

    typedef basic_regex<char>      regex;
    typedef basic_regex<wchar_t>   wregex;
}

Note that the character type is a template parameter, not a runtime parameter. Since boost::regex is built on char, it works on individual char units and cannot be switched to Unicode-aware matching at runtime.

boost::u32regex is the same way:

typedef basic_regex<UChar32,icu_regex_traits> u32regex;

In order to really generalize between them, you would have to write everything as a template too. Instead of taking a boost::regex, you take a boost::basic_regex<charT, traits>. That's one of the downsides of templates - they kind of just permeate everything.
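As a rough sketch of what "write everything as a template" looks like, here is a helper that is written once and works for any character type. To keep it self-contained it uses std::basic_regex from the standard library as a stand-in, since it has the same charT/traits shape as boost::basic_regex; the helper name count_matches is made up for illustration. With Boost you would take a boost::basic_regex<charT, traits> instead, and note that boost::u32regex is normally paired with its own u32regex_search family of algorithms, which this sketch does not cover.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper (not part of Boost or the standard library):
// counts non-overlapping matches of `re` in `text`.
// It is templated on the character type and traits, so the same code
// accepts std::regex, std::wregex, or any other basic_regex instantiation.
template <class CharT, class Traits>
std::size_t count_matches(const std::basic_string<CharT>& text,
                          const std::basic_regex<CharT, Traits>& re)
{
    using StrIter = typename std::basic_string<CharT>::const_iterator;
    using Iter = std::regex_iterator<StrIter, CharT, Traits>;

    std::size_t n = 0;
    // A default-constructed regex_iterator acts as the end-of-sequence marker.
    for (Iter it(text.begin(), text.end(), re), end; it != end; ++it) {
        ++n;
    }
    return n;
}
```

Called as count_matches(std::string("a1b22c333"), std::regex("[0-9]+")) or with the std::wstring / std::wregex equivalents, the compiler picks the character type from the arguments. The choice is still made at compile time, so this removes the duplicated code but not the need to decide which regex type each call site uses.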

Barry answered Dec 06 '25 22:12
