utf-8 fix in unpickle #2653

Closed
wants to merge 3 commits into from

piotr1212
Member

Should fix #2652.

Co-authored-by: Adam Stephens <2071575+adamcstephens@users.noreply.github.com>
@obfuscurity
Member

Per @adamcstephens' fix, we should probably be passing utf8 instead of bytes. Looks like the builds are broken, but otherwise this change appears to be working for us in limited testing.

@piotr1212
Member Author

@obfuscurity Thanks for reporting. We are probably breaking Python 2 support with this change, thus the failed tests.

@piotr1212
Member Author

Hmm, actually py2 should still work; only the test is incompatible.


    def test_load(self):
        unpickler = util.unpickle()
        p = b'\x80\x04\x95\r\x00\x00\x00\x00\x00\x00\x00\x8c\ttest.d\xc3\xb8d\x94.'
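
A minimal sketch of why this payload is probably tied to Python 3: its first two bytes are the PROTO opcode and the protocol number 4, while Python 2's pickle module only goes up to protocol 2. The variable names below are illustrative and do not come from the PR; the snippet assumes Python 3.4 or later.

```python
import pickle

# Payload copied from the test under review.
p = b'\x80\x04\x95\r\x00\x00\x00\x00\x00\x00\x00\x8c\ttest.d\xc3\xb8d\x94.'

# A binary pickle begins with the PROTO opcode followed by the protocol number.
has_proto_header = p[0:1] == pickle.PROTO
protocol = p[1]  # indexing bytes yields an int on Python 3

# Protocol 4 stores text as length-prefixed UTF-8, so Python 3 decodes it
# without any encoding hint.
value = pickle.loads(p)
```

Because the protocol number is embedded in the data, a Python 2 interpreter would be expected to reject this payload before it ever reaches the string, regardless of how the unpickler handles encodings.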
Contributor

I have a possibly better example that may work as intended for both Python 2.7 and Python 3: b"S'test.d\\xc3\\xb8d'\np0\n."

python2.7:

>>> s = 'test.død'
>>> pickle.dumps(s)
"S'test.d\\xc3\\xb8d'\np0\n."
>>> pickle.loads(b"S'test.d\\xc3\\xb8d'\np0\n.")
'test.d\xc3\xb8d'
>>> print(pickle.loads(b"S'test.d\\xc3\\xb8d'\np0\n."))
test.død

python3.7:

>>> pickle.loads(b"S'test.d\\xc3\\xb8d'\np0\n.")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 6: ordinal not in range(128)
>>> pickle.loads(b"S'test.d\\xc3\\xb8d'\np0\n.", encoding='utf8')
'test.død'
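
The Python 3 behaviour above can be sketched as a small script comparing the three encoding choices. This is only an illustration for Python 3; `load_py2_str` is a hypothetical helper and not part of graphite-web or of this PR.

```python
import pickle

# Protocol 0 payload from the comment above: a Python 2 str holding UTF-8 bytes.
payload = b"S'test.d\\xc3\\xb8d'\np0\n."


def load_py2_str(data, encoding="utf8"):
    """Hypothetical helper: unpickle Python 2 str data on Python 3."""
    return pickle.loads(data, encoding=encoding)


as_text = load_py2_str(payload)            # decoded to a str
as_bytes = load_py2_str(payload, "bytes")  # left as raw bytes

# With the default encoding (ASCII) the non-ASCII bytes cannot be decoded.
try:
    pickle.loads(payload)
    default_failed = False
except UnicodeDecodeError:
    default_failed = True
```

With `encoding="bytes"` the caller receives the undecoded UTF-8 bytes and has to decode them separately, which seems to be the difference behind the suggestion to pass utf8 instead of bytes.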

@deniszh deniszh mentioned this pull request Dec 23, 2020
@deniszh
Member

deniszh commented Dec 23, 2020

I added a test in #2660; let's continue there.

@deniszh deniszh closed this Dec 23, 2020
Successfully merging this pull request may close these issues.

[BUG] Remote Finder unable to parse UTF-8 characters