I would like to save a multidimensional NumPy array that is stored in a pandas cell to an Excel file, but Excel converts the array into a string. My pandas DataFrame looks like this:

df_data

                                         relationalAtt
0    [[0.87159, 0.88042, 0.88042, 0.81962, 0.81962,...
1    [[2.7428, 2.4265, 2.4265, 2.3447, 2.3447, 2.33...
2    [[0.9799, 1.0028, 1.0028, 0.81538, 0.81538, 1....
3    [[0.96582, 1.1887, 1.1887, 1.1342, 1.1342, 1.0...
4    [[-1.8861, -1.4923, -1.4923, -1.8474, -1.8474,...
..                                                 ...
270  [[0.66787, 0.5834, 0.53637, 0.53637, 0.64349, ...
271  [[1.6233, 1.5928, 1.5977, 1.4355, 1.4355, 1.62...
272  [[1.2729, 1.3988, 1.3772, 1.3143, 1.3143, 1.40...
273  [[1.9266, 1.7625, 1.7722, 1.7722, 2.0019, 2.05...
274  [[1.6942, 1.5156, 1.6347, 1.7582, 1.7582, 1.60...

275 rows × 1 columns

We can go one level deeper into the array:

df_data[df_data.columns[0]][0]

array([[ 0.87159,  0.88042,  0.88042, ..., -0.95541, -0.64258, -0.64258],
       [ 0.7453 ,  0.82313,  0.82313, ...,  2.1161 ,  2.2079 ,  2.2079 ],
       [ 1.1533 ,  1.0887 ,  1.0887 , ...,  1.045  ,  1.286  ,  1.286  ],
       ...,

So far so good. My problem occurs when I try to save the pandas DataFrame to an Excel file. Excel saves the array cells as strings:

n = name.split("/")[-1]
name = n.split(".")[0]
path = "../Random_Data/" + name + ".csv"
df_data.to_csv(path)
df = pd.read_csv(path, index_col=0)
df[df.columns[0]][0]

'[[ 0.87159 0.88042 0.88042 ... -0.95541 -0.64258 -0.64258]\n [ 0.7453 0.82313 0.82313 ... 2.1161 2.2079 2.2079 ]\n [ 1.1533 1.0887 1.0887 ... 1.045 1.286 1.286 ]\n ...\n [ 0.88441 0.85476 0.85476 ... -0.40933 -0.44269 -0.44269]\n [ 1.137 0.63292 0.63292 ... -2.5608 -2.3481 -2.3481 ]\n [ 1.2429 1.4795 1.4795 ... -1.0315 -1.0025 -1.0025 ]]'

Do you know a way to keep the original data types? Or do you know another format that is more suitable for storing this data? Or is there a way to convert the strings back into arrays? Thanks for your help!

I tried saving as .arff and as Excel; neither worked. I also couldn't convert the strings back to arrays. Any approach would be helpful!

  • Thanks for the recommendation, I have replaced the pictures.
    Commented Jan 3, 2024 at 10:59
  • Do you need a CSV file? Do you want to change the data or just view it?
    – mike
    Commented Jan 3, 2024 at 11:28
  • Thank you for your input. When I load the dataset with pd.read_csv(), the cells of the resulting DataFrame contain string representations of arrays. Is there a recommended way to convert these string representations back into arrays? Alternatively, should I save the DataFrame in a different format that preserves the array structure on loading? Your guidance would be greatly appreciated.
    Commented Jan 6, 2024 at 13:06

1 Answer

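Save it as a pickle. Pickling preserves the DataFrame as it is, including the NumPy arrays stored in its cells, so loading it back with pd.read_pickle returns arrays rather than strings.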

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_pickle.html

path = "../Random_Data/" + name + ".pck"
df_data.to_pickle(path)
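To illustrate the difference, here is a minimal round-trip sketch. It assumes a small stand-in DataFrame with one 2-D array per cell (the random data, the temporary directory, and the file names `example.pck` and `example.csv` are made up for the example and are not from the question). It writes the same frame once as a pickle and once as CSV, then checks what comes back.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

# Hypothetical stand-in for df_data: one 2-D NumPy array per cell.
rng = np.random.default_rng(0)
arrays = [rng.normal(size=(3, 4)) for _ in range(5)]

# Fill an object array element by element so each cell holds a whole array.
cells = np.empty(len(arrays), dtype=object)
for i, arr in enumerate(arrays):
    cells[i] = arr
df_data = pd.DataFrame({"relationalAtt": cells})

with tempfile.TemporaryDirectory() as tmp:
    pkl_path = Path(tmp) / "example.pck"  # assumed file name
    csv_path = Path(tmp) / "example.csv"  # assumed file name

    # Pickle round trip: the Python objects in the cells are stored as they are.
    df_data.to_pickle(pkl_path)
    df_pickle = pd.read_pickle(pkl_path)

    # CSV round trip: every cell is written as text and read back as text.
    df_data.to_csv(csv_path)
    df_csv = pd.read_csv(csv_path, index_col=0)

restored = df_pickle["relationalAtt"].iloc[0]
from_csv = df_csv["relationalAtt"].iloc[0]

print(type(restored), restored.shape)  # a NumPy array with its original shape
print(type(from_csv))                  # a plain Python string
```

Two caveats: the text that CSV stores for a large array contains `...` where NumPy abbreviated the printout, so those values cannot be recovered from the file afterwards, and pickle files should only be loaded from sources you trust.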
