`hex_codes_in_unicode_sequences`: restrictions around f-strings are too strict #4523

MeGaGiGaGon · 2024-12-04T07:56:17Z

I had this idea while reading #4522. This probably shouldn't block hex_codes_in_unicode_sequences from being stabilized since it is an edge case, but would be nice to have.

Currently, `f"{'\xFF'}"` does not get formatted. It should be formatted to `f"{'\xff'}"`.
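For readers unfamiliar with the feature, here is a minimal sketch of the kind of normalization being asked for. It is not Black's implementation: `lowercase_hex_escapes` is a hypothetical helper, it only handles `\xXX` escapes, and it works on the source text of a literal rather than on a parsed tree.

```python
import re

# Matches either an escaped backslash (left untouched) or a \xXX escape.
# Consuming backslash pairs first keeps source text like \\xFF unchanged.
_ESCAPE = re.compile(r"\\(?:\\|x([0-9A-Fa-f]{2}))")


def lowercase_hex_escapes(literal_source):
    # Hypothetical helper: lowercase the hex digits of \xXX escapes
    # in the *source text* of a non-raw string literal.
    def fix(match):
        if match.group(1) is None:
            # An escaped backslash, not a hex escape: keep as-is.
            return match.group(0)
        return "\\x" + match.group(1).lower()

    return _ESCAPE.sub(fix, literal_source)


print(lowercase_hex_escapes(r"'\xFF'"))  # '\xff'
print(lowercase_hex_escapes(r"'\\xFF'"))  # unchanged: '\\xFF'
```

The request in this issue is for the same rewrite to apply to a literal that sits inside an f-string replacement field.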

Note that any change here has to be careful with f-string debug expressions (`=`). At the moment this formatting would have no observable effect, since `\xXX` is resolved in the string literal before the debug output is produced:

>>> f"{'\xFF'=}"
"'ÿ'='ÿ'"
>>> f"{'\xff'=}"
"'ÿ'='ÿ'"

but if there is ever a situation where the formatting is applied when the escape isn't resolved, then the behavior would change observably.

JelleZijlstra · 2024-12-04T14:09:53Z

Thanks, good catch. I think this doesn't need to block the stabilization; we can tweak it next year. Though I'd accept a PR fixing it before the end of the year.

MeGaGiGaGon · 2024-12-05T02:50:15Z

I have found why this happens, but not how to fix it.

The issue was introduced by #4401. An f-string is treated like a normal string for formatting only if none of its `{}` replacement fields contains a `\`, which is too broad a test.
You can see this from the fact that `f"\xFF{"\a"}"` also does not get its hex value formatted, even though the `\xFF` is outside the replacement field.
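To make the "too broad" point concrete, here is a rough sketch of a check of that shape. It is an illustration only, not the code from #4401: `replacement_fields` and `has_backslash_in_field` are hypothetical names, and the scanner is simplified (it handles `{{`/`}}` escapes and nested braces, but not braces inside nested string literals).

```python
def replacement_fields(body):
    # Return the text of each top-level {...} field in an f-string body.
    fields = []
    depth = 0
    start = 0
    i = 0
    while i < len(body):
        # Outside any field, {{ and }} are literal braces: skip them.
        if depth == 0 and body[i:i + 2] in ("{{", "}}"):
            i += 2
            continue
        ch = body[i]
        if ch == "{":
            if depth == 0:
                start = i + 1
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0:
                fields.append(body[start:i])
        i += 1
    return fields


def has_backslash_in_field(body):
    # The broad test described above: any backslash inside any field
    # disables hex normalization for the whole f-string.
    return any("\\" in field for field in replacement_fields(body))


# Body of f"\xFF{"\a"}": the \xFF is outside the field, yet the \a
# inside the field makes the whole string ineligible.
print(has_backslash_in_field(r'\xFF{"\a"}'))  # True
print(has_backslash_in_field(r"\xFF{x}"))     # False
```

Under a check like this, a backslash anywhere in any field is enough to skip the entire f-string, which matches the behavior reported above.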

Note that the current check also accidentally avoids a possible bug: if the f-string `f"{r"\xFF"}"` were formatted as a normal string to `f"{r"\xff"}"`, it would change program behavior (though this would be caught by the "not equivalent to the source" sanity check).
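The difference between the two cases can be checked with plain (non-f) strings, which avoids needing Python 3.12 for the nested quotes. The `same_ast` helper below is a hypothetical stand-in that only loosely resembles an equivalence check; it is not Black's actual sanity check.

```python
import ast


def same_ast(source_a, source_b):
    # Hypothetical helper: compare two snippets by their dumped ASTs,
    # ignoring line and column information.
    return ast.dump(ast.parse(source_a)) == ast.dump(ast.parse(source_b))


# Non-raw literal: both spellings denote the same one-character string,
# so changing the case of the hex digits is safe.
print(same_ast(r'x = "\xFF"', r'x = "\xff"'))    # True

# Raw literal: the backslash is literal text, so changing the case
# changes the value of the string.
print(same_ast(r'x = r"\xFF"', r'x = r"\xff"'))  # False
```

So any fix has to look at the prefix of the nested literal, not just at whether a replacement field contains `\x`.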

Since fixing this naively both introduces a bug and, in my attempt, breaks a ton of tests, I'm not sure what to do/what approach to take. Thoughts @JelleZijlstra? Also cc @tusharsadhwani

This could also be considered a byproduct of the actual f-string formatting code being commented out with `# TODO: Uncomment Implementation to format f-string children`, but I couldn't find any context in #3822 on why this is the case.

JelleZijlstra · 2024-12-05T15:53:09Z

Thanks for looking into this!

> This could also be considered a byproduct of the actual f-string formatting code being commented out with `# TODO: Uncomment Implementation to format f-string children`, but I couldn't find any context in #3822 on why this is the case.

#3822 was a big change by itself to add parsing support for new f-strings. I didn't want to make the change even more complicated by adding formatting changes, especially because (as I recall) the initial version made some questionable formatting choices. Ideally we should format code inside f-strings, and that might provide a principled way to fix this issue too. However, it's a bigger project.

> Since fixing this naively both introduces a bug and, in my attempt, breaks a ton of tests, I'm not sure what to do/what approach to take.

Since this is a fairly obscure bug, the best approach might be to leave this simple bug around for now, and investigate broader changes to format f-strings in a principled way, like I discussed above.

tusharsadhwani · 2024-12-05T15:59:21Z

It hadn't (and afaik still hasn't) been decided how we want to approach nested string formatting, so #4401 simply restored the pre-3.12 behaviour.

We could, for example, adopt Ruff's decisions on how to format them, i.e., continue this discussion: astral-sh/ruff#9785

MeGaGiGaGon · 2024-12-12T18:48:26Z

Update from the corresponding Ruff issues I've opened: the current Python behavior might be a bug in the parser, in which case the current formatting would be correct. See python/cpython#124363.

MeGaGiGaGon added the "T: style" label (What do we want Blackened code to look like?) Dec 4, 2024

JelleZijlstra added the "F: strings" label (Related to our handling of strings) Dec 4, 2024

MeGaGiGaGon mentioned this issue Dec 11, 2024

[ruff] Hex code formatting restrictions around debug f-strings are too strict on preview astral-sh/ruff#14926

Closed
