
Storing data in cells as a function

I'm given a pandas dataframe (or even just a 2 dimensional array).

Let's assume I have a variable a, and one of the cells of the above dataframe (say the cell at position (0,0)) is set to a:

import pandas
df = pandas.DataFrame()
df.at[0, 0] = a

How should I write the above code so that if the value of a changes, the cell at position (0,0) changes automatically?

I tried what I wrote above. I also tried using a lambda function.

Or Shahar asked Jan 19 '26 12:01


1 Answer

What you ask for is not something you would want to do (as all the comments on your question point out). The reason is that, to achieve this with "simple" types, you would need pointers (the memory address where the actual value is stored), an extra piece of risky complexity that vanilla Python usually hides away. When you access an element by index with df.iloc[0,0], pandas returns a copy of the stored value, which is what most people manipulating data expect, and not a memory address, which is dangerous to expose and carries the extra hassle of having to dereference it to get at the value. Thus there is no way to alias a variable of a simple type (make two variables point to the same value, so that changing one changes the other).
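To make the point concrete without pandas, here is a minimal sketch in which a nested list stands in for the 2-dimensional array mentioned in the question (the name grid and the values are assumptions for illustration only). Assigning a new value to a rebinds the name; it does not reach into the cell.

```python
# A plain nested list stands in for the 2-dimensional array from the question.
a = 1
grid = [[None, None], [None, None]]
grid[0][0] = a        # the cell now holds the value 1

a = 2                 # rebinds the name `a`; the cell is not touched
print(grid[0][0])     # 1
print(a)              # 2
```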

In case you really need this behaviour you could use a mutable data structure/object to store your value as a workaround (as @MichaelButscher explained). A simple way would be to use a list with a single value and simply access the value with mylist[0]. A more sophisticated way would be to define a custom class. If you work with numeric types the following is an option:

class referred_number:

    def __init__(self, value=None):
        self._value = value

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, value):
        self._value = value

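    # The arithmetic methods mutate this object in place and return it,
    # so every cell or variable that refers to it sees the new value.
    def __add__(self, add_obj):
        self._value = self._value + add_obj.value
        return self

    def __sub__(self, sub_obj):
        self._value = self._value - sub_obj.value
        return self

    # Python calls __mul__ for `*`; a method named __mult__ is never invoked.
    def __mul__(self, mult_obj):
        self._value = self._value * mult_obj.value
        return self

    # Python 3 calls __truediv__ for `/`; __div__ was only used by Python 2.
    def __truediv__(self, div_obj):
        self._value = self._value / div_obj.value
        return self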

    def __str__(self):
        return str(self._value)

This is underwhelming: not only do you lose all the power of the pandas methods that work with numeric types (the cells now hold arbitrary Python objects rather than numbers), you also have to write a lot of dunder methods to recover just some of the basic behaviour of the built-in types. The following code works as you wanted and can even do some arithmetic. In addition, because we implemented __str__, printing the variable or the dataframe shows the actual values.

import pandas as pd

df = pd.DataFrame()
x = referred_number(1)

df.at[0,0] = x
df.at[0,1] = referred_number(5)
print(f"df:\n{df} \n\nx:{x}\n")

x.value = 5
print(f"df:\n{df} \n\nx:{x}\n")

df[0] = df[0] + referred_number(4)
print(f"df:\n{df} \n\nx:{x}\n")
df[0] = df[0] + referred_number(3)
print(f"df:\n{df} \n\nx:{x}\n")
df[0] = df[0] + referred_number(2)
print(f"df:\n{df} \n\nx:{x}\n")
df[0] = df[0] + referred_number(2)
print(f"df:\n{df} \n\nx:{x}\n")

# Expected but dangerous behaviour since x is added twice!!
df.at[0,2] = x
df[0] = df[0] + referred_number(4)
print(f"df:\n{df} \n\nx:{x}\n")
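The hazard of placing the same object in more than one cell can be reproduced without pandas. The following is only a sketch: Ref is a hypothetical, stripped-down stand-in for referred_number with the same in-place __add__, and a plain list stands in for a row of cells. Any operation that visits every cell updates the shared object once per cell that holds it.

```python
class Ref:
    """Hypothetical stand-in for referred_number: addition mutates in place."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        self.value = self.value + other.value
        return self


x = Ref(1)
row = [x, Ref(5), x]            # the same object sits in two cells

# Add 4 to every cell; x is visited twice, so it is incremented twice.
row = [cell + Ref(4) for cell in row]

print(x.value)                  # 9, not 5
print(row[0] is row[2])         # True: both cells still share one object
```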

You just have to be careful not to assign or operate with a regular number; use a number built with the constructor referred_number(...) instead, e.g. referred_number(2) + 2 will raise an error. As you may have noticed, what you intended involves more work, loses functionality and is error prone, which is why it is considered a bad idea.
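For comparison, the simpler workaround mentioned earlier, a list holding a single value, might look like the sketch below. The names box and grid are assumptions for illustration, and a nested list again stands in for the dataframe. The stored value stays a regular number, so ordinary arithmetic works, but the link only survives as long as you mutate the list rather than rebind the name.

```python
box = [1]                              # one-element list used as a mutable container
grid = [[box, None], [None, None]]     # stand-in for the dataframe

box[0] = 5                             # mutate the container in place
print(grid[0][0][0])                   # 5: the cell sees the new value

box[0] = box[0] + 2                    # arithmetic with a regular number works
print(grid[0][0][0])                   # 7

box = [100]                            # rebinding the name breaks the link
print(grid[0][0][0])                   # still 7
```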

AlGM93 answered Jan 22 '26 00:01