In answering this question, I found that after using melt on a pandas DataFrame, a column that previously had an ordered Categorical dtype becomes object. Is this intended behaviour?
Note: I'm not looking for a solution, just wondering whether there is a reason for this behaviour or whether it is unintended.
Example:
Using the following dataframe df:
  Cat  L_1  L_2  L_3
0   A    1    2    3
1   B    4    5    6
2   C    7    8    9
df['Cat'] = pd.Categorical(df['Cat'], categories = ['C','A','B'], ordered=True)
# As you can see `Cat` is a category
>>> df.dtypes
Cat    category
L_1       int64
L_2       int64
L_3       int64
dtype: object
melted = df.melt('Cat')
>>> melted
  Cat variable  value
0   A      L_1      1
1   B      L_1      4
2   C      L_1      7
3   A      L_2      2
4   B      L_2      5
5   C      L_2      8
6   A      L_3      3
7   B      L_3      6
8   C      L_3      9
Now, if I look at Cat, its dtype has become object:
>>> melted.dtypes
Cat         object
variable    object
value        int64
dtype: object
Is this intended?
The reverse of the melt operation is called pivoting. Pivoting (or reverse melting) spreads the distinct values of one column out into separate columns of their own. The pivot() method on a DataFrame takes two main arguments, index and columns.
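As a minimal sketch of the pivoting step, the following rebuilds the question's df, melts it, and pivots it back. The names wide and restored are only illustrative, and pivot is also given its optional values argument here.

```python
import pandas as pd

# Rebuild the question's DataFrame (plain string column, no Categorical yet).
df = pd.DataFrame({
    "Cat": ["A", "B", "C"],
    "L_1": [1, 4, 7],
    "L_2": [2, 5, 8],
    "L_3": [3, 6, 9],
})

# Wide -> long: one row per (Cat, variable) pair.
melted = df.melt("Cat")

# Long -> wide: the distinct values of "variable" become columns again.
wide = melted.pivot(index="Cat", columns="variable", values="value")

# Move "Cat" back from the index into a regular column and drop the
# leftover "variable" label on the column axis.
restored = wide.reset_index()
restored.columns.name = None

print(restored)
```

Note that pivot raises an error if an index/columns pair occurs more than once, so this round trip only works when the identifier column is unique per row, as it is here.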
The pandas melt() function changes a DataFrame from wide to long format. It is useful for massaging a DataFrame into a format where one or more columns are identifier variables, while all other columns, considered measured variables, are unpivoted to the row axis, leaving just two non-identifier columns, variable and value.
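A short sketch of the identifier/measured-variable split, using the question's data. The column names label and reading passed to var_name and value_name are made up for illustration; leaving them out gives the default names variable and value.

```python
import pandas as pd

wide = pd.DataFrame({
    "Cat": ["A", "B", "C"],
    "L_1": [1, 4, 7],
    "L_2": [2, 5, 8],
    "L_3": [3, 6, 9],
})

# "Cat" is the identifier variable; only L_1 and L_2 are unpivoted here,
# so L_3 is dropped from the result.
long = wide.melt(
    id_vars="Cat",
    value_vars=["L_1", "L_2"],
    var_name="label",      # hypothetical name replacing "variable"
    value_name="reading",  # hypothetical name replacing "value"
)

print(long)
```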
There are many ways to convert numerical data into categorical data. One of them is pandas' built-in cut function, which bins continuous numerical values into discrete categories.
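For illustration, a minimal use of cut might look like the following. The bin edges and the labels low, mid and high are arbitrary choices made for this sketch.

```python
import pandas as pd

values = pd.Series([1, 4, 7, 9])

# Bins are right-inclusive by default: (0, 3], (3, 6], (6, 9].
binned = pd.cut(values, bins=[0, 3, 6, 9], labels=["low", "mid", "high"])

print(binned)
print(binned.dtype)
```

The result has a category dtype, and with the default settings the categories are ordered from the lowest bin to the highest.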
In the source code for 0.22.0 (my old version):
for col in id_vars:
    mdata[col] = np.tile(frame.pop(col).values, K)

mcolumns = id_vars + var_name + [value_name]
np.tile converts the Categorical's values to a plain NumPy array, so the column comes back with dtype object.
It is fixed in 0.23.4 (the version I updated to):
df.melt('Cat')
Out[6]: 
  Cat variable  value
0   A      L_1      1
1   B      L_1      4
2   C      L_1      7
3   A      L_2      2
4   B      L_2      5
5   C      L_2      8
6   A      L_3      3
7   B      L_3      6
8   C      L_3      9
df.melt('Cat').dtypes
Out[7]: 
Cat         category
variable      object
value          int64
dtype: object
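To check which behaviour an installed pandas version has, a small sketch along these lines could be used. It rebuilds the question's df; the expected result assumes a version where the id column goes through the newer code path (0.23.4 or later, going by the output above).

```python
import pandas as pd

df = pd.DataFrame({
    "Cat": ["A", "B", "C"],
    "L_1": [1, 4, 7],
    "L_2": [2, 5, 8],
    "L_3": [3, 6, 9],
})
df["Cat"] = pd.Categorical(df["Cat"], categories=["C", "A", "B"], ordered=True)

melted = df.melt("Cat")

print(pd.__version__)
# True when the Categorical dtype survived the melt.
print(isinstance(melted["Cat"].dtype, pd.CategoricalDtype))
# The category order C < A < B should be kept as well.
print(melted["Cat"].cat.categories.tolist(), melted["Cat"].cat.ordered)
```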
More info on how it was fixed:
for col in id_vars:
    id_data = frame.pop(col)
    if is_extension_type(id_data):  # returns True for a Categorical, so concat is used instead of np.tile
        id_data = concat([id_data] * K, ignore_index=True)
    else:
        id_data = np.tile(id_data.values, K)
    mdata[col] = id_data
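The difference between the two branches can be reproduced outside of melt. The sketch below is not the pandas source, just the same two calls applied to a standalone Series; K = 3 stands in for the number of value columns being melted.

```python
import numpy as np
import pandas as pd

cat = pd.Series(
    pd.Categorical(["A", "B", "C"], categories=["C", "A", "B"], ordered=True)
)
K = 3  # assumed number of value columns

# Old path: np.tile first converts the Categorical to a plain NumPy
# array, so the category information is gone in the result.
tiled = np.tile(cat.values, K)
print(type(tiled), tiled.dtype)

# New path: concatenating K copies of the Series keeps the dtype,
# because every piece shares the same categories and ordering.
repeated = pd.concat([cat] * K, ignore_index=True)
print(repeated.dtype)
```

Both results contain the same nine values; only the concat version still knows that the ordering is C < A < B.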