
Split list based on first character - Python

I am new to Python and can't quite figure out a solution to my problem. I would like to split a list into two lists, based on what each list item starts with. My list looks like this; each line represents an item (yes, this is not correct list notation, but for a better overview I'll leave it like this):

***
**
.param
+foo = bar
+foofoo = barbar
+foofoofoo = barbarbar
.model
+spam = eggs
+spamspam = eggseggs
+spamspamspam = eggseggseggs

So I want one list that contains all lines starting with a '+' between .param and .model, and another list that contains all lines starting with a '+' after .model until the end.

I have looked at enumerate() and split(), but since I have a list and not a string and am not trying to match whole items in the list, I'm not sure how to implement them. What I have is this:

paramList = []
for line in newContent:
    while line.startswith('+'):
        paramList.append(line)
        if line.startswith('.'):
            break

This is just my attempt to create the first list. The problem is that the code reads the second block of '+'s as well, because break just exits the while loop, not the for loop. I hope you can understand my question, and thanks in advance for any pointers!

moringana Avatar asked Mar 23 '26 10:03



2 Answers

What you want is really a simple task that can be accomplished using list slices and list comprehensions:

data = ['**','***','.param','+foo = bar','+foofoo = barbar','+foofoofoo = barbarbar',
     '.model','+spam = eggs','+spamspam = eggseggs','+spamspamspam = eggseggseggs']

# First get the interesting positions.
param_tag_pos = data.index('.param')
model_tag_pos = data.index('.model')
# Get all elements between tags.
params =  [param for param in data[param_tag_pos + 1: model_tag_pos] if param.startswith('+')]
# Slice to the end of the list; an end index of -1 would drop the last item.
models =  [model for model in data[model_tag_pos + 1:] if model.startswith('+')]

print(params)
print(models)

Output

>>> ['+foo = bar', '+foofoo = barbar', '+foofoofoo = barbarbar']
>>> ['+spam = eggs', '+spamspam = eggseggs', '+spamspamspam = eggseggseggs']

Answer to comment:

Suppose you have a list containing numbers from 0 up to 5.

l = [0, 1, 2, 3, 4, 5]

Then using list slices you can select a subset of l:

another = l[2:5]   # another is [2, 3, 4]

That is what we are doing here:

data[param_tag_pos + 1: model_tag_pos]
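To make the slice boundaries concrete, here is a small sketch using the same list l as above. The names middle, tail and no_last are only illustrative. It shows that the start index is included, the end index is excluded, and that leaving the end index out runs to the end of the list, while -1 stops one item early.

```python
l = [0, 1, 2, 3, 4, 5]

middle = l[2:5]     # start index included, end index excluded
tail = l[3:]        # no end index: runs to the end of the list
no_last = l[3:-1]   # -1 as the end index stops before the last item

print(middle)   # [2, 3, 4]
print(tail)     # [3, 4, 5]
print(no_last)  # [3, 4]
```

So if you want everything after a tag up to the end of the list, leave the end index empty.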

And for your last question: "...how does Python know param are the lines in data it should iterate over, and what exactly does the first param in param for param do?"

Python doesn't know; you have to tell it.

The first param is just a variable name I'm using here; it could be x, list_items, or whatever you want.

I will translate the line of code into plain English for you:

# Pythonian
params =  [param for param in data[param_tag_pos + 1: model_tag_pos] if param.startswith('+')]

# English
params is a list of "things": one for each "thing" we can see in the list `data`
from position `param_tag_pos + 1` up to (but not including) position `model_tag_pos`, but only if that "thing" starts with the character '+'.
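If the comprehension is still hard to read, it may help to see it written as an ordinary for loop. This is only a sketch: params_loop is a name made up for the comparison, and data is the same sample list used at the top of this answer.

```python
data = ['**', '***', '.param', '+foo = bar', '+foofoo = barbar', '+foofoofoo = barbarbar',
        '.model', '+spam = eggs', '+spamspam = eggseggs', '+spamspamspam = eggseggseggs']

param_tag_pos = data.index('.param')
model_tag_pos = data.index('.model')

# Explicit loop: visit each item between the two tags, keep the '+' lines.
params_loop = []
for param in data[param_tag_pos + 1:model_tag_pos]:
    if param.startswith('+'):
        params_loop.append(param)

# The list comprehension does the same work in a single expression.
params = [param for param in data[param_tag_pos + 1:model_tag_pos] if param.startswith('+')]

print(params_loop == params)  # True
```

Both versions build the same list; the comprehension is just the compact form of the loop.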
Raydel Miranda Avatar answered Mar 25 '26 00:03



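data = {}
cur_dict = None  # no '.section' line seen yet
for line in newContent:
    if line.startswith('.'):
        # A '.name' line opens a new section, stored under 'name'.
        cur_dict = {}
        data[line[1:]] = cur_dict
    elif line.startswith('+') and cur_dict is not None:
        # A '+key = value' line belongs to the current section.
        key, value = line[1:].split(' = ', 1)
        cur_dict[key] = value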

This creates a dict of dicts:

{'model': {'spam': 'eggs',
           'spamspam': 'eggseggs',
           'spamspamspam': 'eggseggseggs'},
 'param': {'foo': 'bar',
           'foofoo': 'barbar',
           'foofoofoo': 'barbarbar'}}
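For anyone who wants to try this out, here is a self-contained sketch. It assumes newContent holds the lines from the question (the list literal below is typed in by hand), and it assumes Python 3.7 or later, where dicts keep insertion order. The param_lines name and the None guard for '+' lines that appear before any '.' line are additions for illustration only.

```python
newContent = ['***', '**', '.param', '+foo = bar', '+foofoo = barbar',
              '+foofoofoo = barbarbar', '.model', '+spam = eggs',
              '+spamspam = eggseggs', '+spamspamspam = eggseggseggs']

data = {}
cur_dict = None  # stays None until the first '.section' line
for line in newContent:
    if line.startswith('.'):
        cur_dict = {}
        data[line[1:]] = cur_dict
    elif line.startswith('+') and cur_dict is not None:
        key, value = line[1:].split(' = ', 1)
        cur_dict[key] = value

# Look up a single value by section and key.
print(data['param']['foo'])  # bar

# Rebuild the original '+key = value' lines of one section if a list is needed.
param_lines = ['+{} = {}'.format(key, value) for key, value in data['param'].items()]
print(param_lines)
```

The dict form is handy when you need to look values up by name; if you only need the raw lines, the two plain lists are enough.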
eumiro Avatar answered Mar 25 '26 00:03
