Saving multidimensional numpy array from pandas cell into Excel

Question

I would like to save a multidimensional NumPy array, which is stored in a pandas cell, into an Excel file. But Excel converts the array into a string. My pandas DataFrame looks like this:

df_data

     relationalAtt
0    [[0.87159, 0.88042, 0.88042, 0.81962, 0.81962,...
1    [[2.7428, 2.4265, 2.4265, 2.3447, 2.3447, 2.33...
2    [[0.9799, 1.0028, 1.0028, 0.81538, 0.81538, 1....
3    [[0.96582, 1.1887, 1.1887, 1.1342, 1.1342, 1.0...
4    [[-1.8861, -1.4923, -1.4923, -1.8474, -1.8474,...
...  ...
270  [[0.66787, 0.5834, 0.53637, 0.53637, 0.64349, ...
271  [[1.6233, 1.5928, 1.5977, 1.4355, 1.4355, 1.62...
272  [[1.2729, 1.3988, 1.3772, 1.3143, 1.3143, 1.40...
273  [[1.9266, 1.7625, 1.7722, 1.7722, 2.0019, 2.05...
274  [[1.6942, 1.5156, 1.6347, 1.7582, 1.7582, 1.60...

275 rows × 1 columns

We can go one level deeper into the array:

df_data[df_data.columns[0]][0]

array([[ 0.87159, 0.88042, 0.88042, ..., -0.95541, -0.64258, -0.64258],
       [ 0.7453 , 0.82313, 0.82313, ..., 2.1161 , 2.2079 , 2.2079 ],
       [ 1.1533 , 1.0887 , 1.0887 , ..., 1.045 , 1.286 , 1.286 ],
       ...,

So far so good. My problem occurs when I try to save the pandas DataFrame to an Excel file. The array cells are saved as strings:

n = name.split("/")[-1]
name = n.split(".")[0]
path = "../Random_Data/" + name + ".csv"
df_data.to_csv(path)
df = pd.read_csv(path, index_col=0)
df[df.columns[0]][0]

'[[ 0.87159 0.88042 0.88042 ... -0.95541 -0.64258 -0.64258]\n [ 0.7453 0.82313 0.82313 ... 2.1161 2.2079 2.2079 ]\n [ 1.1533 1.0887 1.0887 ... 1.045 1.286 1.286 ]\n ...\n [ 0.88441 0.85476 0.85476 ... -0.40933 -0.44269 -0.44269]\n [ 1.137 0.63292 0.63292 ... -2.5608 -2.3481 -2.3481 ]\n [ 1.2429 1.4795 1.4795 ... -1.0315 -1.0025 -1.0025 ]]'

Do you know a way to keep the original data types? Or do you know another format that is more suitable for storing this data? Or is there a way to convert the strings back into arrays? Thanks for your help!

I tried to save as .arff and as Excel. Neither worked. I also couldn't convert the strings back to arrays. Any approach would be helpful!

Do you need a CSV file? Do you want to change the data or just view it? — mike, Commented Jan 3, 2024 at 11:28
Thank you for your input. When loading a dataset using the pd.read_csv() function with pandas, I encounter instances where the cells in the resulting DataFrame contain string representations of arrays. Specifically, I observe this behavior when inspecting the DataFrame cells. Is there a recommended approach to convert these string representations back into arrays? Alternatively, should I consider saving the DataFrame in a different format to preserve the array structure during loading? Your guidance on resolving this matter would be greatly appreciated. — Petros Tsialis, Commented Jan 6, 2024 at 13:06
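
On converting the strings back: the string shown in the question contains "...", which means NumPy abbreviated the array when it was turned into text, so the elided values are not in the CSV file and cannot be recovered from it. If a CSV is really needed, one possible sketch is to write each array as JSON text before saving and parse it after loading. The small DataFrame and the in-memory buffer below are hypothetical stand-ins for df_data and the real file path.

```python
import io
import json

import numpy as np
import pandas as pd

# Hypothetical stand-in for df_data: one 2-D array per cell.
cells = np.empty(3, dtype=object)
for i in range(3):
    cells[i] = np.arange(6, dtype=float).reshape(2, 3) + i
df_data = pd.DataFrame({"relationalAtt": cells})

# Serialise every array as JSON text (nested lists) so no value is abbreviated.
as_text = df_data.copy()
as_text["relationalAtt"] = as_text["relationalAtt"].map(
    lambda a: json.dumps(a.tolist())
)

# An in-memory buffer stands in for the CSV file on disk.
buffer = io.StringIO()
as_text.to_csv(buffer)
buffer.seek(0)

# After loading, each cell is a string again; parse it back into an array.
df = pd.read_csv(buffer, index_col=0)
df["relationalAtt"] = df["relationalAtt"].map(lambda s: np.array(json.loads(s)))

restored = df["relationalAtt"].iloc[0]
print(type(restored), restored.shape)
```

The cells in the CSV are still text, so Excel will show them as long strings; this only makes the round trip through pandas lossless.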

mike · Accepted Answer · 2024-01-07 14:23:26Z

Just save it as a pickle. This will preserve your dataframe.

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_pickle.html

path = "../Random_Data/" + name + ".pck"
df_data.to_pickle(path)
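
To read the file back, pandas offers pd.read_pickle. Here is a minimal round-trip sketch, assuming a small made-up DataFrame in place of df_data and a temporary directory in place of "../Random_Data/" (both are illustrative only):

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical stand-in for df_data: one 2-D array per cell.
rng = np.random.default_rng(0)
cells = np.empty(4, dtype=object)
for i in range(4):
    cells[i] = rng.normal(size=(3, 5))
df_data = pd.DataFrame({"relationalAtt": cells})

# A temporary directory stands in for the real output folder.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.pck")
    df_data.to_pickle(path)          # write the DataFrame, arrays included
    restored = pd.read_pickle(path)  # load it back as a DataFrame

first = restored["relationalAtt"].iloc[0]
print(type(first), first.shape)
```

The cells come back as NumPy arrays rather than strings. Keep in mind that a pickle file is a Python-specific binary format: Excel cannot open it, and pickle files from untrusted sources should not be loaded.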
