Replace the blib2to3 tokenizer with pytokens #4536

Open
wants to merge 23 commits into
base: main

Conversation

tusharsadhwani
Contributor

@tusharsadhwani tusharsadhwani commented Dec 22, 2024

Description

Replaces black's tokenizer with pytokens, a from-scratch rewrite that I wrote. We could vendor the code into black itself, but I would recommend either pinning it or keeping it as-is, since the tokenizer can then be used by multiple tools for perfect compatibility.

Resolves #4520
Resolves #970
Resolves #3700

Tests passing so far: 381/381 (!)

What's left

  • ERRORTOKENs are not produced yet, and there are no tests for them either. Let's add those.
  • Fix any leftover primer regressions.

@tusharsadhwani
Contributor Author

@JelleZijlstra with this, the test suite is fully passing. Primer is still failing, mostly because some file in hypothesis failed to parse. I'll be working on that, but the code should be ready for a first review.

@MeGaGiGaGon
Contributor

I don't think we currently have any tests for this, but I linked the two issues above because they are the same bug in the parser, where \r characters cause issues. The minimal reproduction is {\r}, and since this is a parser rewrite, hopefully it can be fixed. Note that this is currently only observable by calling internal methods directly, because of how black reads input.

@tusharsadhwani
Contributor Author

Thanks for linking this, I'll make sure these parse identically to how CPython does it.
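
For reference, here is a minimal sketch of one way to see how CPython's own tokenizer handles the {\r} reproduction, using only the standard-library tokenize module. The helper name token_names is made up for illustration and is not part of black or pytokens. The result for a bare \r may differ between CPython versions, so no expected output is shown for it.

```python
import io
import tokenize


def token_names(source):
    """Return (token type name, token string) pairs from CPython's tokenizer.

    Hypothetical helper for illustration only. If the tokenizer rejects the
    input, a single ("ERROR", message) pair is returned instead of raising.
    """
    try:
        # io.StringIO does not translate "\r" by default, so the tokenizer
        # sees the carriage return exactly as written in the source string.
        readline = io.StringIO(source).readline
        return [
            (tokenize.tok_name[tok.type], tok.string)
            for tok in tokenize.generate_tokens(readline)
        ]
    except (tokenize.TokenError, SyntaxError) as exc:
        return [("ERROR", str(exc))]


if __name__ == "__main__":
    # The minimal reproduction from the comment above.
    for name, string in token_names("{\r}"):
        print(name, repr(string))
```

Running the same helper on the output of another tokenizer, converted to the same (name, string) shape, would give a quick way to compare the two token streams for inputs like this one.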

@tusharsadhwani
Contributor Author

Okay, primer is fixed, and all tests are green.
