Bug 31404 - regcomp does not honour the documented behaviour.
Summary: regcomp does not honour the documented behaviour.
Status: RESOLVED WONTFIX
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: All Linux
Priority: Normal, Severity: normal
Target Milestone: ---
Assignee: Thomas Backlund
QA Contact:
URL: https://github.com/gnudatalanguage/gd...
Whiteboard: MGA8TOO
Keywords: UPSTREAM
Depends on:
Blocks:
 
Reported: 2023-01-14 18:21 CET by gilles d
Modified: 2023-01-20 16:33 CET

See Also:
Source RPM: glibc-devel-2.32-30.mga8
CVE:
Status comment:



Description gilles d 2023-01-14 18:21:09 CET
Description of problem:

regcomp() should correctly find the occurrences of '{ ' in a string, since the manual (man 7 regex) says:
A '{' followed by a character other than a digit is an ordinary character, not the beginning of a bound(!).

Version-Release number of selected component (if applicable):


How reproducible:
Always.

Steps to Reproduce:
1. Compile and run the small C program below (a slightly edited copy of the man page example).
2. The result is correct on, e.g., OSX. On Mageia 8 an error is issued instead; the program should find the positions of '{ ' in the string "1234 G!t!rk{ ss { zz...\n".
3. The code:
      #include <stdint.h>
       #include <stdio.h>
       #include <stdlib.h>
       #include <regex.h>

       #define ARRAY_SIZE(arr) (sizeof((arr)) / sizeof((arr)[0]))

       static const char *const str =  "1234 G!t!rk{ ss { zz...\n";
       static const char *const re = "{ ";

       int main(void)
       {
           static const char *s = str;
           regex_t     regex;
           regmatch_t  pmatch[1];
           regoff_t    off, len;
           int cflags = REG_EXTENDED;
           int res = regcomp(&regex, re, cflags);
           if (res) {
               printf("regcomp error:");
               if (res == REG_BADBR   ) printf(" REG_BADBR   ");
               if (res == REG_BADPAT  ) printf(" REG_BADPAT  ");
               if (res == REG_BADRPT  ) printf(" REG_BADRPT  ");
               if (res == REG_EBRACE  ) printf(" REG_EBRACE  ");
               if (res == REG_EBRACK  ) printf(" REG_EBRACK  ");
               if (res == REG_ECOLLATE) printf(" REG_ECOLLATE");
               if (res == REG_ECTYPE  ) printf(" REG_ECTYPE  ");
               /* if (res == REG_EEND    ) printf(" REG_EEND    "); */
               if (res == REG_EESCAPE ) printf(" REG_EESCAPE  ");
               if (res == REG_EPAREN  ) printf(" REG_EPAREN  ");
               if (res == REG_ERANGE  ) printf(" REG_ERANGE  ");
               /* if (res == REG_ESIZE   ) printf(" REG_ESIZE   ");  */
               if (res == REG_ESPACE  ) printf(" REG_ESPACE  ");
               if (res == REG_ESUBREG ) printf(" REG_ESUBREG ");
               printf("\n");
               exit(EXIT_FAILURE);
           }

           printf("String = \"%s\"\n", str);
           printf("Matches:\n");

           for (int i = 0; ; i++) {
               if (regexec(&regex, s, ARRAY_SIZE(pmatch), pmatch, 0))
                   break;

               off = pmatch[0].rm_so + (s - str);
               len = pmatch[0].rm_eo - pmatch[0].rm_so;
               printf("#%d:\n", i);
               printf("offset = %jd; length = %jd\n", (intmax_t) off,
                       (intmax_t) len);
               printf("substring = \"%.*s\"\n", (int) len, s + pmatch[0].rm_so);

               s += pmatch[0].rm_eo;
           }

           exit(EXIT_SUCCESS);
       }
Comment 1 Lewis Smith 2023-01-16 19:38:42 CET
Trying this for starters on Cauldron:
$ gcc regexptest.c
$ ./a.out
regcomp error: REG_BADRPT  

Is this the error you see?
I am off to try it on M8 and another distro.

CC: (none) => lewyssmith

Comment 2 Lewis Smith 2023-01-16 20:21:11 CET
Same result on Mageia 8, and Linux Mint.
So not a Mageia problem, need to report upstream.

The man excerpt you cite "A '{' followed by a character other than a digit is an ordinary character, not the beginning of a bound" is exact.

Where did you, Gilles, discover this very obscure fault which requires a specially crafted program to show it?!

https://sourceware.org/glibc/wiki/FilingBugs
https://www.gnu.org/software/libc/manual/html_node/Reporting-Bugs.html
https://tldp.org/HOWTO/Glibc2-HOWTO-9.html

Assigning this to tmb.
If Gilles found out about this from a published report, it is almost certainly already bugged chez glibc. If he found it himself, he can raise the necessary bug (and please put its URL in this bug's URL field).

Assignee: bugsquad => tmb
Version: 8 => Cauldron
Keywords: (none) => UPSTREAM
Whiteboard: (none) => MGA8TOO

Comment 3 gilles d 2023-01-17 14:57:31 CET
(In reply to Lewis Smith from comment #2)
> Same result on Mageia 8, and Linux Mint.
> So not a Mageia problem, need to report upstream.
> 
> The man excerpt you cite "A '{' followed by a character other than a digit
> is an ordinary character, not the beginning of a bound" is exact.
> 
> Where did you, Gilles, discover this very obscure fault which requires a
> specially crafted program to show it?!
> 

Hi, indeed this is very indirect. The problem was found here: https://github.com/gnudatalanguage/gdl/issues/1444 

The example by our user was with '{' and not '{ ' as I've demonstrated. I believe the sentence " '{' followed by a character..." holds even if there is no character at all (and indeed the OSX version behaves the same with '{'), so the exact extent of this glibc bug is to be determined.
Comment 4 Lewis Smith 2023-01-17 20:27:36 CET
Thank you for the upstream bug reference. You did not get a friendly reaction...
No point in us sitting on this.
"'{' followed by a character other than a digit" seems to include being followed by nothing, as you point out.

Please take 'wontfix' as 'cannot fix'. We do not have the resources to meddle with upstream code, especially at such a basic level.

Resolution: (none) => WONTFIX
URL: (none) => https://github.com/gnudatalanguage/gdl/issues/1444
Status: NEW => RESOLVED

Comment 5 gilles d 2023-01-17 23:47:19 CET
Hi, no problem.
Should I report this directly upstream, to the glibc bug tracker, or will you do it?
Thanks
Comment 6 Lewis Smith 2023-01-18 21:36:45 CET
Please do it yourself.
TIA
Comment 7 gilles d 2023-01-20 16:33:00 CET
Reported here: https://sourceware.org/bugzilla/show_bug.cgi?id=30024 but it was rejected as not a bug.

Will be resolved at our level (GDL).
Thanks and sorry for the inconvenience.
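
For readers who hit the same REG_BADRPT error, here is a minimal sketch of one way to match a literal '{ ' without relying on how a given regcomp() treats a bare brace: put the brace in a bracket expression ("[{] ") or escape it ("\\{ " in C source). This is only an illustration and not necessarily the fix adopted in GDL; the helper name count_matches is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: count non-overlapping matches of an extended
   regular expression in text. Returns -1 if regcomp() rejects the pattern. */
int count_matches(const char *pattern, const char *text)
{
    regex_t regex;
    regmatch_t pmatch[1];
    int count = 0;

    if (regcomp(&regex, pattern, REG_EXTENDED) != 0)
        return -1;

    for (const char *s = text; regexec(&regex, s, 1, pmatch, 0) == 0; ) {
        count++;
        /* An empty match at the current position would never advance s,
           so stop instead of looping forever. */
        if (pmatch[0].rm_eo == 0)
            break;
        s += pmatch[0].rm_eo;
    }

    regfree(&regex);
    return count;
}
```

With the test string from the description, count_matches("[{] ", "1234 G!t!rk{ ss { zz...\n") should report the two occurrences of '{ ', and the escaped form "\\{ " should give the same count.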
