Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Black crashes on files containing \r, from e.g. old MacOS #3700

Open
Zac-HD opened this issue May 24, 2023 · 3 comments · May be fixed by #4536
Open

Black crashes on files containing \r, from e.g. old MacOS #3700

Zac-HD opened this issue May 24, 2023 · 3 comments · May be fixed by #4536
Labels
T: bug Something isn't working

Comments

@Zac-HD
Copy link
Contributor

Zac-HD commented May 24, 2023

Black currently fails on files containing a carriage return in many positions, while Python itself is happy to treat \r as a newline.
For example:

import black
black.format_str("{\r}", mode=black.Mode())

This is obviously a trivial example, but the problem occurs for a fairly wide range of old Python code from e.g. MacOS, or when files are loaded without universal-newlines (which does still happen occasionally).

As a fix, I don't think it's worth trying to fix lib2to3, but 'on error check for \r, rewrite and retry' seems reasonable.

@Zac-HD Zac-HD added the T: bug Something isn't working label May 24, 2023
@cooperlees
Copy link
Collaborator

Could it be unsafe in any way at all to change these newlines for this old code? They are probably using an old python runtime too maybe ... I can't think of any but just wondering ...

If there is any edge case, since this is very niche, maybe we error and state we don't support '\r' newline terminated files and recommend dos2unix -c Mac mac_file then black? My thinking here is so it's an explicit change from \r to \n? I can see pros and cons each way so I'm open to what majority thinks best here. I guess if people are using black they have lost caring for \r ...

@MeGaGiGaGon
Copy link
Contributor

I recently left a more in depth comment on this here, but funny coincidence that the same issue got opened twice by Zac-HD almost four years apart :).
Maybe worth closing as a duplicate?
The TLDR from that other comment is that this is still unsolved, but unreachable from normal use because of how black opens files/reads stdin, and is in a weird spot since black doesn't yet have a decided public API #779
This might also get resolved by #4536 since the issue is a bug in the parser.

@tusharsadhwani
Copy link
Contributor

Okay very interesting.

The new parser will fix this bug, but funnily enough the behaviour of this bit of code has changed between Python 3.11 and 3.12:

$ printf '{\r}' | python3.11 -m tokenize
1,0-1,1:            OP             '{'            
1,1-1,2:            ERRORTOKEN     '\r'           
1,2-1,3:            OP             '}'            
1,3-1,4:            NEWLINE        ''             
2,0-2,0:            ENDMARKER      ''             

$ printf '{\r}' | python3.12 -m tokenize
1,0-1,1:            OP             '{'            
1,1-1,3:            OP             '\r}'          
1,3-1,4:            NEWLINE        ''             
2,0-2,0:            ENDMARKER      ''   

Fairly sure that this is a benign bug in CPython. I've made an issue.

@tusharsadhwani tusharsadhwani linked a pull request Dec 25, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T: bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants