Unable to get a block of code into my regex match groups

So yeah, the title is pretty weird, but I have no other idea how to describe my problem properly. Whatever... let's get to the problem.

Job to get done

My boss wants a function that reads all the functions in a Python file and returns a DataTable containing the functions it found. The function should be written in IronPython (Python running on .NET, so it can use C# libraries).

The Problem

I am relatively new to Python and I have no idea what the language is capable of, so I just started writing my function. It works pretty well, except for one weird problem. I wrote a regular expression to find the functions, and to test it I downloaded a regex tester. The regex tester showed the results I wanted: Group 1 - the function name, Group 2 - the function's parameters, and Group 3 - the content of the function.

For some magical reason, it doesn't work in live testing. By "doesn't work" I mean that Group 3 has effectively no output. After testing the expression with another (online) regex tester, I saw that Group 3 does not hold the content of the function; it holds only a small part of it, starting with a newline/return character.

In my test cases, the results of Group 3 were all the same: they started with a newline/return character and ended with the function's return statement (e.g. return objDic).

Question: What the hell is going wrong there? I have no idea what is wrong with my regex.

The Regex

objRegex = Regex(r"(?i)def[\s]+([\w]+)\(([\, [\w]+)\)(?:[\:{1}]\s*)([\n].*(?!\ndef[\s]+))+")

The Data

def test_function(some_parameter):
    try:
        some_cool_code_goes_here()
        return obj
    except Exception as ex:
        DetailsBox.Show(ex)

def another_cool_function(another_parameter):
    try:
        what_you_want()
        return obj
    except Exception as ex:
        DetailsBox.Show(ex)

The Result

Match: def test_function(some_parameter):...
Position: ..
Length: ..
Group 1: test_function
Group 2: some_parameter
Group 3: (newline/return character) return obj

But Group 3 should be:

    try:
        some_cool_code_goes_here()
        return obj
    except Exception as ex:
        DetailsBox.Show(ex)
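
The gap between the actual and the expected Group 3 is consistent with how a quantified capturing group behaves: when the quantifier sits on the group itself, as in `(...)+`, the group reports only its last iteration. The following is a minimal sketch of that effect using Python's `re` module rather than the .NET `Regex` class from the question. The sample string and both patterns are simplified, hypothetical stand-ins, not the original expression.

```python
import re

# Hypothetical one-function sample; it ends on the return line (no trailing newline).
SOURCE = "def f(x):\n    a = 1\n    b = 2\n    return a + b"

# Quantifier on the capturing group itself: every iteration overwrites group 3.
REPEATED = re.compile(r"def\s+(\w+)\(([^)]*)\):(\n.*)+")

# Quantifier moved onto an inner non-capturing group: group 3 spans all lines.
WRAPPED = re.compile(r"def\s+(\w+)\(([^)]*)\):((?:\n.*)+)")

last_only = REPEATED.search(SOURCE).group(3)
whole_body = WRAPPED.search(SOURCE).group(3)

print(repr(last_only))   # only the final line of the body
print(repr(whole_body))  # the complete body
```

In .NET the earlier iterations are still reachable through the group's `Captures` collection, but wrapping the repetition in a single outer group is usually the simpler route.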

I hope you can help me :3 Thank you guys!

Fortuna asked Dec 12 '25 03:12


2 Answers

Although @Hamza said in his comment that your regex has several problems, I think most of them are unneeded complexity. The reason the body is not matched might be that you haven't let the . metacharacter match newlines, so it stops at the first newline character after the first try: statement.

To fix this, you need to let the . match newline characters (the single-line option, e.g. the s inline flag). Here is a stripped-down version of your regex that works with that option enabled:

(?i)def\s+(\w+)\s*\(([\, \w]+)\)(?:\s*:\s*)(.+?)(?=def|$)
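
To see the effect of letting `.` match newlines, here is a small sketch that runs the stripped-down pattern with and without that option. It assumes Python's `re` module, where the option is `re.DOTALL`, instead of the .NET `Regex` class; the `bodies` helper is a hypothetical name, and the sample is the data from the question.

```python
import re

SAMPLE = '''def test_function(some_parameter):
    try:
        some_cool_code_goes_here()
        return obj
    except Exception as ex:
        DetailsBox.Show(ex)

def another_cool_function(another_parameter):
    try:
        what_you_want()
        return obj
    except Exception as ex:
        DetailsBox.Show(ex)
'''

# The stripped-down pattern from the answer, unchanged.
PATTERN = r"(?i)def\s+(\w+)\s*\(([\, \w]+)\)(?:\s*:\s*)(.+?)(?=def|$)"


def bodies(flags=0):
    """Return group 3 (the function body) of every match in SAMPLE."""
    return [m.group(3) for m in re.finditer(PATTERN, SAMPLE, flags)]


without_dotall = bodies()          # '.' stops at every newline
with_dotall = bodies(re.DOTALL)    # '.' is allowed to cross newlines

print(len(without_dotall), len(with_dotall))
```

One caveat with this pattern: the lookahead `(?=def|$)` ends the body at any occurrence of `def`, so a body containing an identifier such as `default` would be cut short.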
Ibrahim Najjar answered Dec 14 '25 17:12


Thanks to HamZa for the quick help (and of course thanks to all the other helpers); he actually solved the problem. Only a few adjustments were necessary (to make it work for C# :-)), but the main point comes from him. Thanks a lot.

Solution for my problem:

Regex(r"(?is)def\s*(?<name>\w+)\s*\((?<parameter>[^)]+)\)\s*:\s*(?:\r?\n)+(?<body>.*?)(?=\r?\ndef|$)")
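
For readers without IronPython, the same expression can be tried with Python's built-in `re` module. This is a sketch under a few assumptions: named groups are written `(?P<name>...)` because `re` does not accept the .NET `(?<name>...)` form, a list of dicts stands in for the DataTable, and `find_functions` is a hypothetical helper name. The sample is the data from the question.

```python
import re

# Same structure as the .NET pattern above, with Python-style named groups.
FUNCTION_RE = re.compile(
    r"(?is)def\s*(?P<name>\w+)\s*\((?P<parameter>[^)]+)\)\s*:\s*"
    r"(?:\r?\n)+(?P<body>.*?)(?=\r?\ndef|$)"
)


def find_functions(source):
    """Return one dict per matched function with keys name, parameter, body."""
    return [m.groupdict() for m in FUNCTION_RE.finditer(source)]


SAMPLE = '''def test_function(some_parameter):
    try:
        some_cool_code_goes_here()
        return obj
    except Exception as ex:
        DetailsBox.Show(ex)

def another_cool_function(another_parameter):
    try:
        what_you_want()
        return obj
    except Exception as ex:
        DetailsBox.Show(ex)
'''

rows = find_functions(SAMPLE)
for row in rows:
    print(row["name"], "|", row["parameter"])
```

Two limits of the pattern are worth keeping in mind: `[^)]+` needs at least one character between the parentheses, so functions without parameters are skipped, and the `\r?\ndef` lookahead only recognizes a following `def` that starts at the beginning of a line, so indented methods are folded into the preceding body.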
Fortuna answered Dec 14 '25 16:12


