
Minimal bison/flex-generated code has memory leak

In debugging a memory leak on a large project, I found that the source of the leak seemed to be some flex/bison-generated code. I was able to recreate the leak with the following minimal example consisting of two files, sand.l and sand.y:

in sand.l:

%{
#include <stdlib.h>
#include "sand.tab.h"
%}

%%
[0-9]+ { return INT; }
. ;
%%

in sand.y:

%{
#include <stdio.h>
#include <stdlib.h>

int yylex();
int yyparse();
extern FILE* yyin;  /* defined by flex in lex.yy.c */

void yyerror(const char* s);
%}

%token INT

%%
program:
       program INT { puts("Found integer"); }
       | 
       ;
%%

int main(int argc, char* argv[]) {
    yyin = stdin;
    do {
        yyparse();
    } while (!feof(yyin));
    return 0;
}

void yyerror(const char* s) {
    puts(s);
}

The code was compiled with

$ bison -d sand.y
$ flex sand.l
$ gcc -g lex.yy.c sand.tab.c -o main -lfl

Running the program with valgrind gave the following error:

8 bytes in 1 blocks are still reachable in loss record 1 of 3
at 0x4C2AC3D: malloc (vg_replace_malloc.c:299)
by 0x40260F: yyalloc (lex.yy.c:1723)
by 0x402126: yyensure_buffer_stack (lex.yy.c:1423)
by 0x400B89: yylex (lex.yy.c:669)
by 0x402975: yyparse (sand.tab.c:1114)
by 0x402EC4: main (sand.y:24)

64 bytes in 1 blocks are still reachable in loss record 2 of 3
at 0x4C2AC3D: malloc (vg_replace_malloc.c:299)
by 0x40260F: yyalloc (lex.yy.c:1723)
by 0x401CBF: yy_create_buffer (lex.yy.c:1258)
by 0x400BB3: yylex (lex.yy.c:671)
by 0x402975: yyparse (sand.tab.c:1114)
by 0x402EC4: main (sand.y:24)

16,386 bytes in 1 blocks are still reachable in loss record 3 of 3
at 0x4C2AC3D: malloc (vg_replace_malloc.c:299)
by 0x40260F: yyalloc (lex.yy.c:1723)
by 0x401CF6: yy_create_buffer (lex.yy.c:1267)
by 0x400BB3: yylex (lex.yy.c:671)
by 0x402975: yyparse (sand.tab.c:1114)
by 0x402EC4: main (sand.y:24)

It seems that bison and/or flex is holding on to a substantial amount of memory. Is there any way to force them to free it?

iafisher asked Oct 19 '25 01:10


1 Answer

The default flex skeleton allocates an input buffer and a small buffer stack, and it never frees them on its own. Valgrind reports all three blocks as "still reachable" rather than definitely lost, which is consistent with allocations made once and kept until exit. You can free the input buffer manually with yy_delete_buffer(YY_CURRENT_BUFFER); but that call leaves the buffer stack allocated. If you have a sufficiently non-ancient version of flex [see Note 1], you can call yylex_destroy() once you are done scanning, which frees the buffer stack along with any buffers still on it. (If you don't, the buffer stack is only 8 bytes in your application, so it's not a disaster.)
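
As a rough sketch of the yylex_destroy() approach, the epilogue of sand.y from the question could end as follows. It assumes flex 2.5.9 or later and the non-reentrant scanner from the question. The prototype is written out by hand because sand.tab.h does not declare it.

```c
/* Epilogue of sand.y (everything after the second %%). */

/* Generated by flex in lex.yy.c and declared here by hand.
   Assumption: flex >= 2.5.9, non-reentrant scanner. */
int yylex_destroy(void);

int main(void) {
    yyin = stdin;
    do {
        yyparse();
    } while (!feof(yyin));
    /* Release the scanner's input buffer and buffer stack before exit. */
    yylex_destroy();
    return 0;
}

void yyerror(const char* s) {
    puts(s);
}
```

The build commands stay the same as in the question. The three yyalloc blocks in the valgrind report are the allocations this call is meant to release.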

If you want to write a clean application, you should generate a reentrant scanner, which puts all persistent data into a scanner context object. Your code must allocate and free this object, and freeing it will free all memory allocations. (You might also want to generate a pure parser, which works roughly the same way.)
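
To illustrate the context object on its own, here is a minimal sketch of a standalone reentrant scanner with no bison involved. The file name scan.l and the stand-in INT constant are assumptions made for this example. yylex_init(), yyset_in(), and yylex_destroy() are the functions flex generates for reentrant scanners.

```lex
%{
/* Hypothetical standalone scanner (call it scan.l); no bison involved. */
#include <stdio.h>
enum { INT = 258 };   /* stand-in for the token code bison would define */
%}

%option reentrant noyywrap

%%
[0-9]+  { return INT; }
.|\n    ;
%%

int main(void) {
    yyscan_t scanner;                 /* opaque scanner context object */
    if (yylex_init(&scanner) != 0) {  /* allocate the context */
        perror("yylex_init");
        return 1;
    }
    yyset_in(stdin, scanner);
    while (yylex(scanner) == INT)     /* yylex returns 0 at end of input */
        puts("Found integer");
    yylex_destroy(scanner);           /* frees buffers, stack, and context */
    return 0;
}
```

This should build with flex scan.l followed by gcc lex.yy.c -o scan. The noyywrap option removes the need for -lfl.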

However, the reentrant scanner has a very different API, so you will need to get your parser to pass through the scanner context object. If you use a reentrant (pure) parser as well, you'll need to modify your scanner actions because with the reentrant parser, yylval is a YYSTYPE* instead of YYSTYPE.
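
A sketch of what sand.y might look like with a pure parser driving a reentrant scanner is shown below. It assumes bison 3.0 or later (for %param, %empty, and api.pure full), and it assumes sand.l is changed to use %option reentrant bison-bridge noyywrap. The parameter name scanner is arbitrary, and the flex prototypes are declared by hand rather than taken from a generated header.

```yacc
%define api.pure full
%param { yyscan_t scanner }

%code requires {
/* yyscan_t must be visible wherever sand.tab.h is included. */
#ifndef YY_TYPEDEF_YY_SCANNER_T
#define YY_TYPEDEF_YY_SCANNER_T
typedef void *yyscan_t;
#endif
}

%code {
#include <stdio.h>

/* Prototypes of the flex-generated functions, declared by hand. */
int yylex(YYSTYPE *yylval_param, yyscan_t scanner);
int yylex_init(yyscan_t *scanner);
int yylex_destroy(yyscan_t scanner);
void yyset_in(FILE *in, yyscan_t scanner);

void yyerror(yyscan_t scanner, const char *s);
}

%token INT

%%
program:
       program INT { puts("Found integer"); }
       | %empty
       ;
%%

int main(void) {
    yyscan_t scanner;
    if (yylex_init(&scanner) != 0) {
        perror("yylex_init");
        return 1;
    }
    yyset_in(stdin, scanner);
    yyparse(scanner);        /* the context is passed on to yylex */
    yylex_destroy(scanner);  /* frees all scanner allocations */
    return 0;
}

void yyerror(yyscan_t scanner, const char *s) {
    (void)scanner;           /* unused; required by the %param signature */
    puts(s);
}
```

The scanner actions in the question never touch yylval, so they need no changes here. An action that did set a semantic value would have to write through the pointer, for example *yylval = atoi(yytext); with the default int YYSTYPE.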


Notes:

  1. In fact, you can delete the buffer stack using yylex_destroy(), as recently pointed out in a comment, as long as your flex version is at least 2.5.9. Since that version was released almost two decades ago, you'd think this note would be unnecessary, but unfortunately v2.5.4 continues to be the default MinGW installation and it had a surprisingly long life on various Linux distros as well (although I think these days you're not so likely to find it installed).
rici answered Oct 21 '25 15:10


