Getting all possible values from an array in python

I have a file with more than 1000 columns and rows, and the column and row names do not follow any pattern. An example is shown below:

file1.txt

IDs     AABC  ABC6    YHG.8     D78Ha 
Ellie   12            48.70    33        
Kate    98      34    21       76.36        
Joe     22      53    49                    
Van     77            40       12.1
Xavier                         88.85   

First, I have to fill the blanks with NA, so that it looks like this:

file1.txt



IDs     AABC  ABC6    YHG.8    D78Ha 
Ellie   12      NA    48.70    33        
Kate    98      34    21       76.36         
Joe     22      53    49       NA                
Van     77      NA    40       12.1
Xavier  NA      NA    NA       88.85   

Then I want to get every combination of an ID and one of the other columns (AABC, ABC6, YHG.8 and D78Ha), such as:

Ellie , AABC --> 12
Ellie, ABC6 --> NA
Ellie, YHG.8 --> 48.70  ( without rounding )
Ellie, D78Ha --> 33
Kate,AABC --> 98
Kate, ABC6 --> 34
...

So the desired output should be 20 lines (4 columns x 5 IDs), as follows:

output.txt


Ellie  AABC   12
Ellie  ABC6   NA
Ellie  YHG.8  48.70
Ellie  D78Ha  33
Kate   AABC   98
Kate   ABC6   34
..

To do this, I filled the blanks with NA manually, read the file with pandas, and set the IDs column as the index, so that I can look up a value by ID and column name.

But I could not iterate over it. My attempt was:

import pandas as pd
tablefile = pd.read_csv('file1.txt',sep='\t')
print(tablefile)
df2=tablefile.set_index("IDs")
print("Ellie AABC " , df2.loc["Ellie", "AABC" ])
print("Kate AABC " , df2.loc["Kate", "AABC" ])
print("Xavier AABC " , df2.loc["Xavier", "AABC" ])

It prints:

('Ellie AABC ', 12.0)
('Kate AABC ', 98.0)
('Xavier AABC ', nan)

How can I fill the blanks with NA and iterate over this table without writing out every name one by one? Maybe by incrementing i in something like [i, i]?

bapors asked Feb 11 '26 04:02


2 Answers

If I understand correctly, you can use stack with dropna=False:

df.set_index('IDs').stack(dropna=False).astype(object).reset_index()

Out[915]: 
       IDs level_1      0
0    Ellie    AABC     12
1    Ellie    ABC6    NaN
2    Ellie   YHG.8   48.7
3    Ellie   D78Ha     33
4     Kate    AABC     98
5     Kate    ABC6     34
6     Kate   YHG.8     21
7     Kate   D78Ha  76.36
8      Joe    AABC     22
9      Joe    ABC6     53
10     Joe   YHG.8     49
11     Joe   D78Ha    NaN
12     Van    AABC     77
13     Van    ABC6    NaN
14     Van   YHG.8     40
15     Van   D78Ha   12.1
16  Xavier    AABC    NaN
17  Xavier    ABC6    NaN
18  Xavier   YHG.8    NaN
19  Xavier   D78Ha  88.85
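
For the literal question of iterating without typing each name, a plain nested loop over rows and columns visits the same 20 pairs. The following is only a sketch: the inline `raw` string is a hypothetical stand-in for file1.txt, assumed to be tab-separated with the blanks left as empty fields, and `pairs` is an illustrative name.

```python
from io import StringIO
import pandas as pd

# Hypothetical stand-in for file1.txt: tab-separated, blanks left empty.
raw = (
    "IDs\tAABC\tABC6\tYHG.8\tD78Ha\n"
    "Ellie\t12\t\t48.70\t33\n"
    "Kate\t98\t34\t21\t76.36\n"
    "Joe\t22\t53\t49\t\n"
    "Van\t77\t\t40\t12.1\n"
    "Xavier\t\t\t\t88.85\n"
)

# Empty fields are read as NaN, so no manual filling is needed.
df2 = pd.read_csv(StringIO(raw), sep="\t").set_index("IDs")

# Visit every (ID, column) pair without naming any of them.
pairs = []
for id_name, row in df2.iterrows():
    for col_name, value in row.items():
        pairs.append((id_name, col_name, value))

for id_name, col_name, value in pairs[:4]:
    print(id_name, col_name, value)
```

This is slower than stack or melt on a table with over 1000 rows and columns, so it is mainly useful when each pair needs custom handling.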
BENY answered Feb 12 '26 17:02


Simply use melt to reshape the DataFrame:

Data

from io import StringIO 
import pandas as pd

txt = """IDs     AABC  ABC6    YHG.8    D78Ha 
Ellie   12      NA    48.70    33        
Kate    98      34    21       76.36         
Joe     22      53    49       NA                
Van     77      NA    40       12.1
Xavier  NA      NA    NA       88.85"""

tabledf = pd.read_table(StringIO(txt), sep=r"\s+")

Melt

melted_df = pd.melt(tabledf, id_vars = "IDs").sort_values('IDs').reset_index(drop=True)
print(melted_df)

#        IDs variable  value
# 0    Ellie     AABC  12.00
# 1    Ellie     ABC6    NaN
# 2    Ellie    YHG.8  48.70
# 3    Ellie    D78Ha  33.00
# 4      Joe     AABC  22.00
# 5      Joe    D78Ha    NaN
# 6      Joe     ABC6  53.00
# 7      Joe    YHG.8  49.00
# 8     Kate     AABC  98.00
# 9     Kate     ABC6  34.00
# 10    Kate    YHG.8  21.00
# 11    Kate    D78Ha  76.36
# 12     Van     AABC  77.00
# 13     Van     ABC6    NaN
# 14     Van    D78Ha  12.10
# 15     Van    YHG.8  40.00
# 16  Xavier     ABC6    NaN
# 17  Xavier     AABC    NaN
# 18  Xavier    YHG.8    NaN
# 19  Xavier    D78Ha  88.85
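
To get closer to the requested output.txt (original ID order, NA for blanks, and 48.70 kept unrounded), one possible variation reads every cell as text and uses a stable sort. This is a sketch under assumptions: the `raw` string is a hypothetical tab-separated stand-in for file1.txt with empty fields for the blanks, and `to_long`, `order`, and `long_df` are illustrative names.

```python
from io import StringIO
import pandas as pd

# Hypothetical stand-in for file1.txt: tab-separated, blanks left empty.
raw = (
    "IDs\tAABC\tABC6\tYHG.8\tD78Ha\n"
    "Ellie\t12\t\t48.70\t33\n"
    "Kate\t98\t34\t21\t76.36\n"
    "Joe\t22\t53\t49\t\n"
    "Van\t77\t\t40\t12.1\n"
    "Xavier\t\t\t\t88.85\n"
)

def to_long(source):
    # dtype=str keeps values such as "48.70" exactly as written;
    # empty fields are still read as missing values.
    df = pd.read_csv(source, sep="\t", dtype=str)
    # Remember the order in which the IDs appear in the file.
    order = {name: i for i, name in enumerate(df["IDs"])}
    long_df = df.melt(id_vars="IDs", var_name="column", value_name="value")
    # A stable sort groups rows by ID and keeps the columns in file order.
    long_df = long_df.sort_values(
        "IDs", key=lambda s: s.map(order), kind="mergesort"
    )
    return long_df.reset_index(drop=True)

long_df = to_long(StringIO(raw))

# na_rep="NA" writes missing values as NA; pass a path instead of
# leaving it out to write output.txt directly.
text = long_df.to_csv(sep="\t", header=False, index=False, na_rep="NA")
print(text)
```

Reading as text means the values are strings, so convert with pd.to_numeric before doing arithmetic on them.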
Parfait answered Feb 12 '26 18:02


