Once pcre2_match() succeeds, I use pcre2_get_ovector_pointer() to get the ovector and then build a data structure containing (1) the matched string (using ovector[0] and ovector[1]); and (2) the matched capture groups (using ovector[2*i] and ovector[2*i+1], for i in [1..rv), where rv is the return value of pcre2_match()).
For each capture group number, I'd like to include in the data structure the matched string (no problem, that's in ovector), the length of the match (likewise, it can be computed from ovector), and, this being the difficulty, the name of the capture group (obviously, only if the matched group has a name).
Helper functions are available to fetch matched capture groups by name. In particular, pcre2_substring_number_from_name() can be used to translate a capture group name into a group number (i.e. name-to-number translation). What I need is the exact opposite behaviour: given a group number, get its associated group name, if any, or NULL otherwise (i.e. number-to-name translation). I assume I am missing something obvious here, but I'm not able to find a way to do that using the PCRE2 API. Is it possible?
This is not the simple number-to-name API function I was looking for, but the following snippet from https://www.pcre.org/current/doc/html/pcre2demo.html contains enough inspiration to implement what I need :)
    (void)pcre2_pattern_info(
      re,                   /* the compiled pattern */
      PCRE2_INFO_NAMECOUNT, /* get the number of named substrings */
      &namecount);          /* where to put the answer */

    if (namecount == 0) printf("No named substrings\n"); else
      {
      PCRE2_SPTR tabptr;
      printf("Named substrings\n");

      /* Before we can access the substrings, we must extract the table for
      translating names to numbers, and the size of each entry in the table. */

      (void)pcre2_pattern_info(
        re,                       /* the compiled pattern */
        PCRE2_INFO_NAMETABLE,     /* address of the table */
        &name_table);             /* where to put the answer */

      (void)pcre2_pattern_info(
        re,                       /* the compiled pattern */
        PCRE2_INFO_NAMEENTRYSIZE, /* size of each entry in the table */
        &name_entry_size);        /* where to put the answer */

      /* Now we can scan the table and, for each entry, print the number, the name,
      and the substring itself. In the 8-bit library the number is held in two
      bytes, most significant first. */

      tabptr = name_table;
      for (i = 0; i < namecount; i++)
        {
        int n = (tabptr[0] << 8) | tabptr[1];
        printf("(%d) %*s: %.*s\n", n, name_entry_size - 3, tabptr + 2,
          (int)(ovector[2*n+1] - ovector[2*n]), subject + ovector[2*n]);
        tabptr += name_entry_size;
        }
      }
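Building on that snippet, the number-to-name lookup can be sketched as a linear scan over the same name table. This is only a minimal sketch: group_name_from_number() is a hypothetical helper of my own, not a PCRE2 function, and it assumes the 8-bit library layout described in the comment above (two bytes of group number, most significant first, followed by the zero-terminated name). The name_table, namecount and name_entry_size arguments are assumed to have been filled in by pcre2_pattern_info() exactly as shown above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the PCRE2 API): return the name of
   capture group `number`, or NULL if that group has no name.
   Assumes the 8-bit library's name table layout: each entry starts with
   the group number in two bytes (most significant first), followed by
   the zero-terminated group name. */
const char *group_name_from_number(const unsigned char *name_table,
                                   uint32_t namecount,
                                   uint32_t name_entry_size,
                                   uint32_t number)
{
    const unsigned char *entry = name_table;
    for (uint32_t i = 0; i < namecount; i++) {
        /* Decode the big-endian group number stored in this entry. */
        uint32_t n = ((uint32_t)entry[0] << 8) | entry[1];
        if (n == number)
            return (const char *)(entry + 2);  /* name follows the number */
        entry += name_entry_size;              /* advance to next entry */
    }
    return NULL;  /* no entry for this group number: group is unnamed */
}
```

The returned pointer refers to memory inside the compiled pattern, so it should only be used while re is still alive. Since the table is scanned on every call, it may be worth doing the lookup once per group after compiling and caching the results if the same pattern is matched many times.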