When using special characters in the naming of my procedure's DATA_COLUMNS (e.g. when writing '°C'), the result file is written correctly, but PyMeasure can fail to read it back in and no curve is displayed in the plotter window.
When writing the result files, PyMeasure uses the open function (in Results.__init__ for the header and column titles) and logging.FileHandler (in Recorder.__init__), which both default to the system's standard encoding (locale.getencoding()). On my German Windows 10 machine, this is 'cp1252'. However, reading the data back in uses pandas' read_csv function, which uses encoding='utf-8' by default. This raises an error for special characters (such as '°'), which is silently ignored by PyMeasure, and an empty DataFrame is returned instead. Special characters in the header do not pose an issue, as read_csv skips the commented lines and Results.header() uses encode("unicode_escape") to replace tricky characters (e.g. µL is written as \xb5L).
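The mismatch can be reproduced without pandas at all; it is just the usual cp1252-bytes-decoded-as-UTF-8 failure. A minimal sketch (no PyMeasure involved):

```python
# '°' is the single byte 0xb0 in cp1252, which is not a valid
# UTF-8 sequence, so decoding the written bytes as UTF-8 fails.
raw = "Temperature (°C)".encode("cp1252")

try:
    raw.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

print(decoded_ok)  # False: the error pandas hits (and PyMeasure swallows)
print(raw.decode("cp1252"))  # decoding with the matching codec works fine
```

This is exactly why the data section disappears while the header survives: the header bytes were escaped before writing, the data bytes were not.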
Solution 1: Avoid special characters
Of course, I can just write degC and uL (also for possible compatibility with other software). But that is a workaround rather than a real solution.
Solution 2: Set the global default encoding to UTF-8
I did not check how this can be done in the operating system itself. But in Python, the UTF-8 mode can be used to change most encoding defaults to 'utf-8', e.g. by setting the environment variable PYTHONUTF8=1. Again, this is a user-side workaround and not a fix.
Solution 3: Using the default encoding in Results
Wherever pandas is used to read the results file, add the argument encoding=locale.getencoding(). I found two occurrences of pd.read_csv, in Results.reload and Results.data, but maybe there are other places? Adding the new argument in both these places fixed the issue for me. (And of course, import locale has to be added to the script.)
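A stdlib-only sketch of what this fix amounts to (the csv module stands in for pd.read_csv here): write with open()'s implicit locale default, as PyMeasure does, then read back with the same encoding passed explicitly, which is exactly the encoding=... argument one would add to the pd.read_csv calls. Note that locale.getencoding() only exists from Python 3.11; locale.getpreferredencoding(False) gives the same default on earlier versions.

```python
import csv
import locale
import os
import tempfile

# locale.getencoding() is Python 3.11+; fall back for older versions.
enc = getattr(locale, "getencoding",
              lambda: locale.getpreferredencoding(False))()

path = os.path.join(tempfile.mkdtemp(), "results.csv")

# Write the way PyMeasure does: open() without an explicit encoding,
# i.e. with the locale default (e.g. cp1252 on a German Windows machine).
with open(path, "w", newline="") as f:
    csv.writer(f).writerow(["Time (s)", "Temperature (°C)"])

# The fix: pass the same locale encoding when reading the file back,
# just as one would pass encoding=enc to pd.read_csv.
with open(path, newline="", encoding=enc) as f:
    header = next(csv.reader(f))

print(header)
```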
Solution 4: Explicitly specify the encoding
Set an encoding when creating a Results instance (either as class parameter or argument) and pass it down to every function interacting with the file. This would probably be the most robust solution, as it does not implicitly rely on all functions using the default encoding (which, as we have seen here, already failed with pandas). But I cannot say with confidence in how many places this would have to be implemented, so I would rather not mess around with this solution.
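To illustrate the idea (this is a hypothetical sketch, not PyMeasure's actual Results class): the encoding is chosen once at construction time and threaded through every read and write, so no call site depends on an implicit default. The csv module stands in for the pandas/logging machinery.

```python
import csv


class EncodedResults:
    """Hypothetical sketch of Solution 4: one encoding, chosen at
    construction, used for every file operation."""

    def __init__(self, path, encoding="utf-8"):
        self.path = path
        self.encoding = encoding

    def write_header(self, columns):
        # In PyMeasure this would cover Results.__init__ and the
        # logging.FileHandler in Recorder.__init__.
        with open(self.path, "w", newline="", encoding=self.encoding) as f:
            csv.writer(f).writerow(columns)

    def append_row(self, row):
        with open(self.path, "a", newline="", encoding=self.encoding) as f:
            csv.writer(f).writerow(row)

    def data(self):
        # In PyMeasure this would be pd.read_csv(..., encoding=self.encoding).
        with open(self.path, newline="", encoding=self.encoding) as f:
            return list(csv.reader(f))
```

Because reader and writer share self.encoding, the file round-trips regardless of the system locale, and choosing encoding="utf-8" additionally makes the files portable across machines.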
Thanks for your support.
I personally prefer to make it all in utf-8, as the files will be portable between operating systems.
Using the locale for reading and writing can be an issue if you read and write on different machines.
However, backward compatibility requires the local encoding.