2

since I am able to save string using to_csv using 'utf-8' encoding, I am expecting to do be able to do the same using to_excel. This is not an issue on my side with encoding. None of the previous thread I saw, discuss this issue.

I am using python 2.7.12 on Windows 7(Anaconda) and pandas 0.18.1

I have 2 questions related to saving a panda dataframe containing special character (encoding as 'utf-8') as a .csv or .xlsx file.

For example:

import pandas as pd # Create a Pandas dataframe from the data. df = pd.DataFrame({'Data': ['1', 'as', '?%','ä']}) 

I can save the dataframe as a .cvs file without any issue:

df.to_csv('test_csv.csv',sep=',', encoding='utf-8') 

and it works. When importng the data I need to choose 'utf-8' in Excel and everything is fine.

Now if I try to save the same dataframe as an .xlsx then it doesn't work.

I have the following code:

 # Create a Pandas Excel writer using XlsxWriter as the engine. writer = pd.ExcelWriter('pandas_simple.xlsx', engine='xlsxwriter', options={'encoding':'utf-8'}) # Convert the dataframe to an XlsxWriter Excel object. df.to_excel(writer, sheet_name='Sheet1',encoding='utf-8') writer.save() 

and I got the following error message:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 7: ordinal not in range(128)

I am not 100% sure to use the correct option to set the encoding:

options={'encoding':'utf-8'} 

and

encoding='utf-8' 

since it is not clear to me how to proceed from the documentation.

Any idea how to have this working ?

Bonus question related to df.to_csv. Is there a way to use some special character as separator ? Some reason, the code I am migrated from R to python is using sep='¤'. tried to encode this special character in all possible way but is always failed. Is it possible to do that ?

Thanks a lot

Cheers

Fabien

2
  • Possible duplicate of Pandas: save to excel encoding issueCommentedOct 19, 2016 at 8:05
  • favorite This question may already have an answer here: Pandas: save to excel encoding issue 2 answers since I am able to save the string using to_csv using 'utf-8' encoding, I am expecting to do be able to do the same using to_excel. This is not an issue on my side with encoding. None of the previous thread I saw, discuss this issue. Please correct me if I am wrong.CommentedOct 19, 2016 at 8:25

1 Answer 1

1

If you are using xlsxwriter as the Excel writing engine then the encoding='utf-8' is ignored because the XlsxWriter module doesn't use it.

XlsxWriter requires that the string data is encoded as utf8. After that it handles the strings automatically.

So you will need to ensure that the string data you are writing is encoded as utf8 via Pandas: either when you read it or after the data is in the data frame.

1
  • Thanks @jmcnamara, this explain my issue. By fixing the encoding in the dataframe fix my issue.CommentedOct 20, 2016 at 12:43

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.