RegEx for capturing groups using dictionary key

Question

I'm having trouble displaying the right named capture in my dictionary function. My program reads a .txt file and then transforms the text in that file into a dictionary. I already have a regex that captures the parts I need.

Here is my File.txt:

file Science/Chemistry/Quantum 444 1
file Marvel/CaptainAmerica 342 0
file DC/JusticeLeague/Superman 300 0
file Math 333 0
file Biology 224 1

Here is the regex link that is able to capture the ones I want:

In that link, the parts I want to display are highlighted in green and orange.

This part of my code works:

rx = re.compile(r'file (?P<path>.*?)( |\/.*?)? (?P<views>\d+).+')
i = rx.match(data) # 'data' is from the .txt file
x = (i.group(1), i.group(3))
print(x)

But since I'm turning the .txt file into a dictionary, I couldn't figure out how to use .group(1) or .group(3) as keys for my display function. I don't know how to get those groups to show up when I call print("Title: %s | Number: %s" % (key[1], key[3])). I hope someone can help me implement that in my dictionary function.

Here is my dictionary function:

def create_dict(data):
    dictionary = {}
    for line in data:
      line_pattern = re.findall(r'file (?P<path>.*?)( |\/.*?)? (?P<views>\d+).+', line)
      dictionary[line] = line_pattern
      content = dictionary[line]
      print(content)
    return dictionary

I'm trying to make my output look like this from my text file:

Science 444
Marvel 342
DC 300
Math 333
Biology 224

Wiktor Stribiżew · Accepted Answer

You may create and populate a dictionary with your file data using

def create_dict(data):
    dictionary = {}
    for line in data:
        m = re.search(r'file\s+([^/\s]*)\D*(\d+)', line)
        if m:
            dictionary[m.group(1)] = m.group(2)
    return dictionary

Basically, it does the following:

Defines a dictionary named dictionary
Reads data line by line
Searches for a file\s+([^/\s]*)\D*(\d+) match, and if there is a match, the two capturing group values are used to form a dictionary key-value pair.

The regex I suggest is

file\s+([^/\s]*)\D*(\d+)
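
To make the pieces of this pattern easier to see, here is a minimal sketch that writes the same regex with re.VERBOSE and runs it on the sample lines from the question. The names PATTERN and samples are only for illustration and are not part of the original answer.

```python
import re

# Same pattern as above, spelled out with re.VERBOSE so each part can be commented.
PATTERN = re.compile(r"""
    file\s+       # literal "file" followed by one or more whitespace characters
    ([^/\s]*)     # group 1: characters up to the first slash or whitespace
    \D*           # skip non-digit characters (rest of the path, spaces)
    (\d+)         # group 2: the first run of digits after that
""", re.VERBOSE)

# Sample lines copied from the question's File.txt.
samples = [
    "file Science/Chemistry/Quantum 444 1",
    "file Marvel/CaptainAmerica 342 0",
    "file DC/JusticeLeague/Superman 300 0",
    "file Math 333 0",
    "file Biology 224 1",
]

for line in samples:
    m = PATTERN.search(line)
    if m:
        print(m.group(1), m.group(2))
```

Run on these five lines, the loop prints the pairs the question asks for, starting with Science 444 and ending with Biology 224.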

[Figure: Regulex graph explaining the regex]
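
One caveat that may be worth keeping in mind: \D* stops at the first digit it meets, so a digit inside a later path segment would be captured instead of the view count. The sketch below shows this on a made-up line that is not in the question's file, next to a hypothetical variant (VARIANT is my own naming, not part of the answer) that skips the rest of the path with \S* instead.

```python
import re

# Pattern from the answer above.
ORIGINAL = re.compile(r'file\s+([^/\s]*)\D*(\d+)')

# Hypothetical variant: \S* consumes the rest of the path, digits included,
# and \s+ requires whitespace before the number.
VARIANT = re.compile(r'file\s+([^/\s]+)\S*\s+(\d+)')

# Made-up line (not in the question's file) with a digit inside the path.
line = 'file Math/Algebra2 333 0'

print(ORIGINAL.search(line).groups())  # ('Math', '2')
print(VARIANT.search(line).groups())   # ('Math', '333')
```

For the five lines shown in the question, both patterns give the same result, so this only matters if paths can contain digits.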

Then, you may use it like

res = {}
with open(filepath, 'r') as f:
    res = create_dict(f)
print(res)

See the Python demo.
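
Since the linked demo may not be at hand, here is a self-contained sketch of the same usage. It assumes io.StringIO as a stand-in for the opened file (it is iterated line by line the same way), repeats create_dict from above, and prints each pair in the format the question asks for.

```python
import io
import re

def create_dict(data):
    """Map the first path segment of each 'file ...' line to its first number."""
    dictionary = {}
    for line in data:
        m = re.search(r'file\s+([^/\s]*)\D*(\d+)', line)
        if m:
            dictionary[m.group(1)] = m.group(2)
    return dictionary

# Stand-in for open(filepath): the sample lines from the question.
sample = io.StringIO(
    "file Science/Chemistry/Quantum 444 1\n"
    "file Marvel/CaptainAmerica 342 0\n"
    "file DC/JusticeLeague/Superman 300 0\n"
    "file Math 333 0\n"
    "file Biology 224 1\n"
)

res = create_dict(sample)
for title, number in res.items():
    print(title, number)
# Science 444
# Marvel 342
# DC 300
# Math 333
# Biology 224
```

Note that the values are stored as strings; wrap m.group(2) in int() if numbers are needed.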

SuShiS · Answer

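You already use named groups in your 'line_pattern'; simply put them into your dictionary. re.findall will not work here because it returns a list of tuples rather than a match object, so the groups cannot be looked up by name. Also, the escape character '\' before '/' is redundant. Thus your dictionary function would be: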

import re

def create_dict(data):
    dictionary = {}
    for line in data:
        line_pattern = re.search(r'file (?P<path>.*?)( |/.*?)? (?P<views>\d+).+', line)
        if line_pattern:  # skip lines that do not match
            path = line_pattern.group('path')
            # the key is the named 'path' group, the value the named 'views' group
            dictionary[path] = line_pattern.group('views')
            print(path, dictionary[path])
    return dictionary
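
To illustrate the difference between the two calls, here is a small sketch that runs the question's pattern (without the redundant escape) on one sample line. The variable names are only for illustration.

```python
import re

PATTERN = r'file (?P<path>.*?)( |/.*?)? (?P<views>\d+).+'
line = 'file Science/Chemistry/Quantum 444 1'

# re.findall returns a list of plain tuples, one item per group, without names.
found = re.findall(PATTERN, line)
print(found)  # [('Science', '/Chemistry/Quantum', '444')]

# re.search returns a match object, so the named groups can be read by name.
m = re.search(PATTERN, line)
print(m.group('path'), m.group('views'))  # Science 444
print(m.groupdict())  # {'path': 'Science', 'views': '444'}
```

The unnamed middle group shows up in the findall tuple but not in groupdict(), which only contains named groups.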

Tags: python, regex, dictionary, regex-negation, regex-group

smokingpenguin
