Can Python normalize an array of objects?

Question

I am doing an assignment for a machine learning class in Python. I started learning Python only yesterday, so I am not yet familiar with common Python practices.

Part of my task is to load data from a CSV file (a 2D array, let's call it arr_2d) and normalize it.

I've found sklearn and numpy solutions online, but they expect a 2D array as input.

My approach after loading arr_2d is to parse the rows into an array of objects (data: [HealthRecord]).

My solution was code similar to this (note: roughly pseudocode):

result = [] # 2D array of property values, one row per property
for key in ['age','height','weight',...]:
    # getattr takes the object first, then the attribute name
    tmp = [getattr(item, key) for item in data]
    result.append(tmp)

Result now contains one row per property (3 rows of len(data) values each). I would use sklearn to normalize each row of result, then rotate it back and parse the normalized values into HealthRecord objects.
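The column-wise round trip described above can be sketched with the standard library alone. This is only an illustration under assumptions: z-score scaling stands in for whatever normalization the assignment requires, the Person constructor is hypothetical, and normalize_records is a made-up helper name.

```python
from statistics import mean, stdev


class Person:
    """Stand-in for HealthRecord; this constructor is an assumption."""

    def __init__(self, age, height, weight):
        self.age = age
        self.height = height
        self.weight = weight


def normalize_records(data, keys=("age", "height", "weight")):
    # One list per attribute (a "column"), as in the loop above.
    columns = [[getattr(item, key) for item in data] for key in keys]

    # Z-score each column: subtract the mean, divide by the sample std.
    # Needs at least two records and a non-constant column.
    scaled = []
    for column in columns:
        mu, sigma = mean(column), stdev(column)
        scaled.append([(value - mu) / sigma for value in column])

    # zip(*scaled) rotates the columns back into rows, one tuple per record.
    return [Person(**dict(zip(keys, row))) for row in zip(*scaled)]


data = [
    Person(60 * 365, 125, 65),
    Person(30 * 365, 195, 125),
    Person(13 * 365, 116, 53),
]
normalized = normalize_records(data)
```

Each object in normalized carries the scaled values under the same attribute names as the input objects.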

This seems overcomplicated to me. I would like an easier way to do it, such as passing [HealthRecord] directly to sklearn.normalize.

Code below shows my (simplified) loading and parsing:

class Person:
    age: int
    height: int
    weight: int


def arr_2_obj(data: list) -> Person:
    # data is a single row: [age, height, weight]
    person = Person()
    person.age = data[0]
    person.height = data[1]
    person.weight = data[2]

    return person


# age (days), height (cm), weight (kg)
# named arr_2d rather than input, which would shadow the built-in input()
arr_2d = [
    [60*365, 125, 65],
    [30*365, 195, 125],
    [13*365, 116, 53],
    [16*365, 164, 84],
    [12*365, 125, 96],
    [10*365, 90, 46],
]

parsed = [arr_2_obj(row) for row in arr_2d]

Note: the Person class here stands in for HealthRecord.

Thank you for any input or insights.

Edit: typo sci-learn -> sklearn

Yuri Feldman · Accepted Answer

You can't. In practice, you're dealing with tabular data. The standard package (as in most popular, not part of the standard library) for processing tabular data in Python is pandas, so you can do something like:

import pandas as pd
df = pd.DataFrame([d.__dict__ for d in data])
normalized_df = (df-df.mean())/df.std() # example normalization
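For a version that runs end to end and also turns the normalized rows back into objects, here is a minimal sketch. It assumes Person is written as a dataclass (not how the question defines it) and uses z-score scaling only as an example normalization.

```python
from dataclasses import asdict, dataclass

import pandas as pd


@dataclass
class Person:  # assumed stand-in for HealthRecord
    age: float
    height: float
    weight: float


data = [
    Person(60 * 365, 125, 65),
    Person(30 * 365, 195, 125),
    Person(13 * 365, 116, 53),
]

# One row per object, one column per attribute.
df = pd.DataFrame([asdict(p) for p in data])

# Column-wise z-score; DataFrame.std() uses the sample std (ddof=1) by default.
normalized_df = (df - df.mean()) / df.std()

# Rebuild objects from the normalized rows (each row becomes a dict of keyword arguments).
normalized = [Person(**row) for row in normalized_df.to_dict(orient="records")]
```

The subtraction and division broadcast per column, so each attribute is scaled independently of the others.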

If you insist on working with arrays of objects instead of tables, you can write a class that does the required conversions and shortens the notation, e.g. something like:

class ObjectList: 
    def __init__(self, object_type, records): 
        self.objects = [object_type(**record) for record in records]

    def to_data_frame(self): 
        return pd.DataFrame([d.__dict__ for d in self.objects])

class PersonList(ObjectList): 
    def __init__(self, records): 
        super().__init__(Person, records)

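The above assumes that class Person has an __init__ method accepting the arguments height, age, and weight, and that each record is a dict keyed by those names, since it is unpacked with **record.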
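A possible usage, sketched under those assumptions: Person is declared as a dataclass here so that it gets a suitable __init__, and the two list classes are repeated only so the snippet runs on its own. The columns and rows names are illustrative.

```python
from dataclasses import dataclass

import pandas as pd


@dataclass
class Person:
    # @dataclass generates __init__(self, age, height, weight).
    age: int
    height: int
    weight: int


class ObjectList:
    def __init__(self, object_type, records):
        self.objects = [object_type(**record) for record in records]

    def to_data_frame(self):
        return pd.DataFrame([d.__dict__ for d in self.objects])


class PersonList(ObjectList):
    def __init__(self, records):
        super().__init__(Person, records)


# Records must be mappings, because they are unpacked with **record.
columns = ("age", "height", "weight")
rows = [[60 * 365, 125, 65], [30 * 365, 195, 125], [13 * 365, 116, 53]]
people = PersonList([dict(zip(columns, row)) for row in rows])

df = people.to_data_frame()
normalized_df = (df - df.mean()) / df.std()  # example normalization
```

Plain CSV rows are lists, so the dict(zip(columns, row)) step is what bridges them to the keyword-based constructor.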

You can also try to shorten the notation further by overloading operators, but unless you're writing library code I don't see why you would want to.

Tags:

python

python-3.x

Chiffie
