Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Python 3.13 #2003

Merged
merged 8 commits into from
Jul 6, 2024
Merged

Support Python 3.13 #2003

merged 8 commits into from
Jul 6, 2024

Conversation

PeterJCLaw
Copy link
Collaborator

@PeterJCLaw PeterJCLaw commented Jun 23, 2024

Various fixes to get the tests passing on Python 3.13, all within the tests themselves.
Also some small tidyups in one of the tests.
Moves to the 3.13 grammar from parso 0.8.4 and add testing on 3.13.

Reviewing by commit may be useful.

Fixes #2002.

Discovered issues:

@davidhalter
Copy link
Owner

Thanks a lot for looking into this! Please ping me once you are finished. I'm also happy to help with issues. It feels like you are almost done, though.

@PeterJCLaw
Copy link
Collaborator Author

I'm also happy to help with issues. It feels like you are almost done, though.

Heh, the "obvious" things are fixed. I don't immediately have great ideas for the pathlib thing though. I also haven't looked at why the test_imports.py tests are failing. This could be the same thing or they might be something else.

An idea I have around the pickles issue, but which I don't like, is that we could adjust the binary content of the pickle itself, to rename the path. That can be made to work (and I believe it could be done in a way that's reliable), but it doesn't feel right. It feels like a better solution would be to hook the pickling logic somehow, but I don't know if there are hook points which would enable that?

@PeterJCLaw
Copy link
Collaborator Author

86f185e addresses the pathlib move, turns out there is a hook in pickle which solves exactly this issue and was easy to get working :)

@davidhalter
Copy link
Owner

davidhalter commented Jun 29, 2024

I don't think I would find a better solution to load those pickled objects. However, I think theres also a question of why these pickled objects need to be loaded across Python versions. iter_module_names AFAIK simply returns a list of strings, so if there's suddenly some weird pathlib object in there, I think we might also be able to fix something there. But I can also think about that a bit later once I have time (in a few hours).

@PeterJCLaw
Copy link
Collaborator Author

Interesting. I thought I saw a function (get_module_info) trying to be pickled, though maybe that was from a separate call and I misread the logging. I'll circle back to that and confirm what was happening.

I'm currently looking into why Python 3.13 is seeing flaky tests; I can intermittently reproduce it locally, so it's not just CI.

@PeterJCLaw
Copy link
Collaborator Author

On the flakes: I've got as far as it seeming to be that an inference state id for a state which the sub-process doesn't know about (or seems not to know about) is being requested to be deleted. From what I can tell the inference state in question has been used with its subprocess (and that use doesn't appear to have errored).

@davidhalter
Copy link
Owner

I thought I saw a function (get_module_info) trying to be pickled, though maybe that was from a separate call and I misread the logging. I'll circle back to that and confirm what was happening.

You are right about that. get_module_info is called from a pytest test with params from pytest. But as far as I can see even for that function there shouldn't be any special classes in there. However it might be that in some places a Path is returned now instead of a str. We can probably just use str(...) wherever that is needed and we will probably be fine?

Does that help? I feel like the Unpickler would work, but I think it's probably bad that we transfer complex Python objects anyway that we don't control, because we're working across Python versions.

I'm happy to also start debugging here if you don't feel like working on this anymore.

On the flakes: I've got as far as it seeming to be that an inference state id for a state which the sub-process doesn't know about (or seems not to know about) is being requested to be deleted. From what I can tell the inference state in question has been used with its subprocess (and that use doesn't appear to have errored).

That sounds pretty bad. This is probably some fallout from my hack of reusing the InferenceState datastructure in subprocesses. Sucks, but I'm not sure how to help. This is something I feel kind of guilty about and will of course work on it, if you don't find a solution quickly. There's not much I can recommend here, this is just annoying work and trying to understand code. Something I always try first is to use pytest -I to use the interpreter and avoid any subprocesses. Typically that solves the problem. I would then try to run 3.13 against 3.13 and see if that works. If only 3.13 against the other versions fails, I would guess that there's issues with pickling sometimes.

I haven't seen any signs about inference states in sub-processes. Is that a result of your debugging? If so, how did you debug this? Asking for a friend, because I'm pretty sure this is also going to haunt me :D

@PeterJCLaw
Copy link
Collaborator Author

PeterJCLaw commented Jun 29, 2024

On the pathlib stuff: I haven't looked in detail at which functions are called, though through debugging the other issue I feel I have a better understanding of what's going on with the subprocesses (in general) now. Having a very brief look at the functions there which could be causing the issue, get_sys_path feels like it could be one -- I believe sys.path supports pathlib.Path entries (and has for a while now), so that could easily have Paths in the returned value. That doesn't quite match up though as I had thought the issue the tests showed were when Python 3.13 was the parent and the error was in the child process. So that would suggests that the Path was in the arguments. Can't remember though - would need to check the test output to be sure.

On the subprocess stuff, I have observed that InferenceStateSubprocess.__del__ is called more than once for an instance with the same self._inference_state_id. I don't know if the id is a guarantee of the same object (especially since it's not the id of itself); can Python re-use ids within the same process if the previous instance has been garbage collected? Might that have changed in Python 3.13? (I know there's a bunch of work with a JIT happening, so GC changes seem plausible). If this is a change it feels a pretty big one and I don't see anything in the release notes.

In terms of how I've been debugging this, it's a lot of print statements I'm afraid. I've added a small hook to get logs back from the child process, but it's not pretty (and relies on collaboration from the subprocess, so it's not full logging; it also avoids using the stderr stream as I couldn't work out how to read that in a non-blocking manner). I've attached a diff to this message in case it's useful though.
debugging.diff.txt

I've then been running a subset of the tests in a loop until they fail:

export JEDI_TEST_ENVIRONMENT=3.10
while pytest --capture=no test/test_integration.py > out.txt 2>&1 ; do echo ; done

edit: added mention of JEDI_TEST_ENVIRONMENT.

I'm going to add some more tracking of the inference state ids to get a sense of where they get created, which will help clarify the GC interaction I hope.

I'm not sure how much more time I'll have to spend on this this weekend and probably won't get back to the pathlib stuff soon. Hopefully I'll make some more progress on the ids today though.

@PeterJCLaw
Copy link
Collaborator Author

PeterJCLaw commented Jun 29, 2024

On the subprocess stuff, I have observed that InferenceStateSubprocess.__del__ is called more than once for an instance with the same self._inference_state_id. I don't know if the id is a guarantee of the same object (especially since it's not the id of itself); can Python re-use ids within the same process if the previous instance has been garbage collected? Might that have changed in Python 3.13?

Looking in the docs for id:

Return the “identity” of an object. This is an integer which is guaranteed to be unique and constant for this object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value.

So maybe there was always scope for an issue here?

Either way, I think the use of ids to convey identity to the sub-process is the issue as they're not sufficiently unique with the current approach. I can't actually see anything amiss with the handling of the removals though - that seems pretty solid based on how I'd expect __del__ to work and this being essentially single-threaded. The only way I can see this happening is if __del__ for an instance is allowed to happen after the "replacement" instance has been created. Adding logging for that I do actually have evidence for that! This feels like it might actually be a Python bug at this point, since that doesn't seem like a good idea to me (allowing __del__ after another object has been __init__ with the same id and memory address).

Edit: for clarity: that's still looking at InferenceStateSubprocess.__init__ and InferenceStateSubprocess.__del__ (I slightly misread my logs), so not quite the smoking gun I was hoping for just yet.

@PeterJCLaw
Copy link
Collaborator Author

PeterJCLaw commented Jun 29, 2024

Aaah, aha, correcting myself then. I don't think this needs a Python bug.

I believe there's a race of the form:

  • InferenceState A created, gets id 1
  • InferenceStateSubprocess A' created, uses InferenceState A which it stores as a weakref and an id
  • InferenceStateSubprocess A' is used, the sub-process learns about an InferenceState with id 1
  • InferenceState A goes away, InferenceStateSubprocess A' is not yet garbage collected
  • InferenceState B created, gets id 1
  • InferenceStateSubprocess B' created, uses InferenceState B which it stores as a weakref and an id
  • InferenceStateSubprocess B' is used, the sub-process re-uses its entry for an InferenceState with id 1

Edit: have confirmed this from logging.

At this point the order of operations between the two InferenceStateSubprocess instances going away is immaterial -- both will trigger a removal of a state with id 1. As long as B' doesn't try to use the sub-process again after the first removal has happened then the second removal will fail.

I suspect that this could have happened under any Python version. My guess is that something in the GC behaviour on 3.13 is different and has highlighted the issue.

I think the solution is to create our own ids for the InferenceState instances. A global counter is probably fine I suspect, though it feels less neat than using the id. Another alternative might be to use the id of the InferenceStateSubprocess instance as that's the entity which we're actually locking around. Assuming that __del__ is guaranteed to happen before the next __init__ of an object with the same id, I think that'd guarantee correct behaviour (relying on the existing removals logic being otherwise correct, which I believe it is).

@davidhalter
Copy link
Owner

I think the solution is to create our own ids for the InferenceState instances. A global counter is probably fine I suspect, though it feels less neat than using the id. Another alternative might be to use the id of the InferenceStateSubprocess instance as that's the entity which we're actually locking around. Assuming that del is guaranteed to happen before the next init of an object with the same id, I think that'd guarantee correct behaviour (relying on the existing removals logic being otherwise correct, which I believe it is).

I'm probably fine with either solution, as long as it feels good to you. I generally am really happy that you were able to pin this down.

About the Path stuff: I can also try and install Python3.13 and see if I can reproduce it. I feel like once I can reproduce it, the fix will probably be quick. Whereas the whole InferenceState stuff is way way more complicated.

@PeterJCLaw
Copy link
Collaborator Author

About the Path stuff: I can also try and install Python3.13 and see if I can reproduce it. I feel like once I can reproduce it, the fix will probably be quick. Whereas the whole InferenceState stuff is way way more complicated.

For the Path stuff we actually hit the issue in both directions:

  • when Python 3.13 is the running environment and the target environment is <3.13 the child process crashes in test_imports.py, in test_find_module_not_package_zipped, test_relative_imports_without_path. While this might be to do with test setup, it seems reasonable to me that a Path object could be passed in as the context somehow.
  • when Python 3.13 is the target environment and the running environment is <3.13 then the parent process can fail to load data returned to it. This case has far more general issues in the tests -- causing something like 90% of tests to fail. I've had a quick look using test/test_cache.py (only as that's a small suite which hits the issue) and I can't actually immediately see what data in the return is the issue.

Either way around, I actually think it's fine that we're passing a Path here. (There are far more interesting types that are being passed here!) I agree there's definitely some unknown with pickling across python versions, though I don't think it's too bad. Typically my concerns with pickle would be security based, though here we're (already) treating both processes as essentially part of the same process/security boundary so I think it's fine on that front.

Do you feel we need to invest in a solution other than the unpickler?

@PeterJCLaw
Copy link
Collaborator Author

Another alternative might be to use the id of the InferenceStateSubprocess instance as that's the entity which we're actually locking around. Assuming that __del__ is guaranteed to happen before the next __init__ of an object with the same id, I think that'd guarantee correct behaviour (relying on the existing removals logic being otherwise correct, which I believe it is).

I've implemented this in PeterJCLaw#3. Not sure how we want to arrange the branches -- it's not really related to the Python 3.13 upgrade, so I'm tempted to pull it out onto its own branch and then base this one (the upgrade) on that one.

@davidhalter
Copy link
Owner

davidhalter commented Jun 30, 2024

Do you feel we need to invest in a solution other than the unpickler?

Not necessarily. I also don't think it's bad that we are passing Path. I just thought that it's weird but

I've implemented this in PeterJCLaw#3.

Thanks a lot! It looks good.

Not sure how we want to arrange the branches

I'm personally indifferent. :) Just do it however you feel like it works best for you.

This should make it easier to add new entries as well as clarifying
the intent of this filter.
In Python 3.13 the `locals` function now returns a fresh mapping
each time it's called (when called in a function). We thus need
to store a reference to the mapping being used, rather than
re-fetching it each time.

Since we don't actually need to modify the locals within the scope
of the test function itself, it suffices to use our own mapping
here rather than the result of calling `locals`, which fully
isolates this test from the nature of that function.

Fixes davidhalter#2002
This moves to using the 3.13 grammar as well as testing 3.13 in CI.
Jedi passes pickles to subprocesses which are running the target
version of Python and thus may not be the same as the version
under which Jedi itself is running. In Python 3.13, pathlib is
being refactored to allow for easier extension and has thus moved
most of its internal implementation to a submodule. Unfortunately
this changes the paths of the symbols, causing pickles of those
types to fail to load in earlier versions of Python.

This commit introduces a custom unpickler which accounts for this
move, allowing bi-directional passing of pickles to work.
@PeterJCLaw
Copy link
Collaborator Author

PeterJCLaw commented Jul 2, 2024

The latest set of failures here look related to CI now using beta 3 (up from beta 2) of Python 3.13. Based on the failures we're seeing, one of python/cpython#121025 or python/cpython#121027 seems possibly related. Don't have time this evening to dig further, but should do later in the week.

Have confirmed by running CI pinned to beta 2 (PeterJCLaw@2cbe074) that the issue doesn't reproduce there, so this is definitely related to changes in beta 3.

@PeterJCLaw
Copy link
Collaborator Author

PeterJCLaw commented Jul 4, 2024

I've pinned down the test failures around partial under beta-3 to definitely being caused by python/cpython#121027.

The specific issue which Jedi faces is that partial now looks like a method descriptor, so the check for that in DirectObjectAccess.py__name__ has a different result than it did before. This means we end up using the instance rather than its type to look up the __name__, which then raises an AttributeError.

I haven't dug too deeply into why a __get__ method is being added to functools.partial, though I'm not sure the details matter a great deal to us.

What I'm not sure about is what the right fix is here. We could detect that the thing we're looking at is a functools.partial and special case it in DirectObjectAccess.py__name__, but that doesn't really feel the right solution. @davidhalter do we have any examples of handling different names I could look at?

@PeterJCLaw PeterJCLaw changed the base branch from ensure-unique-subprocess-reference-ids to master July 4, 2024 21:40
@davidhalter
Copy link
Owner

I finally managed to get Python3.13 onto my computer and it seems like I got an older beta and therefore cannot reproduce it.

Regardless I would say that the name in that signature is of a low priority anyway. It doesn't provide a lot of value. So I would even be fine with changing the assert and just check that a string is returned.

My assumptions from reading the code is that there is that _is_class_instance returns False and therefore the repr is returned. Can you maybe post the output for the following statements?

cls = partial(str, 1).__class__
cls == type
isinstance(cls, type)

Sorry I don't seem to be able to get a Python 3.13 shell quickly and I'm a bit lazy to set up all the docker stuff. I hope you don't mind.

@PeterJCLaw
Copy link
Collaborator Author

No worries, I had to build beta-3 myself.

>>> from functools import partial
>>> cls = partial(str, 1).__class__
>>> cls
<class 'functools.partial'>
>>> type == cls
False
>>> isinstance(cls, type)

For completeness, _is_class_instance returns True.

Something which did occur to me is whether we should react to ismethoddescriptor returning True by working through the py__get__ logic. That's probably the most technically correct thing here, though I'm not sure what other impacts that'd have nor if it's really what we'd even want to show (e.g: is that going to mess with properties).

In my mind there's a couple of cases to solve here -- the immediate one, where the descriptor actually returns itself, and also the future case where it'll return something else. In the latter I'm not actually sure what it'll return -- whether that's going to be another partial (but with a self argument) or whether we're going to get another type.

Another thing I'm slightly puzzled by is that testing the signature of partials manually with sith.py I'm not actually seeing a signature printed at all. Yet the tests clearly do better than that.

I appreciate the suggestion to make the test check for any string, though I'd really hope we can do that in the long term. I might do that as a short-term workaround though (for 3.13 specifically) and then defer this to being a known bug.

@PeterJCLaw
Copy link
Collaborator Author

Another thing I've just realised is that this is actually only an issue for the interpreter. When running as Script we don't have the issue with partial, since we don't have any of the mixed-mode behaviours and thus rely entirely on PartialObject etc. Ironically the comment in MixedObject.get_signatures implies that it expects the signature via inspect.signature to be better than code inspection. That also seems out of step with the codebase though, since inspect.signature isn't actually used anywhere that I can tell.

Aaanyway, given how limited the impact from this seems likely to be in light of this being interpreter only I've broken it out to #2012 and I'll update the test here to ignore 3.13+ for the name assertion.

@PeterJCLaw PeterJCLaw marked this pull request as ready for review July 5, 2024 21:06
@davidhalter
Copy link
Owner

Thanks for all your work here. I generally agree. It would of course be nicer if this didn't regress. But it doesn't really matter. There are probably things we can improve around this, but partial is not the name a user wants there anyway. A user would actually want foo(..., b, c) as a signature for a call like partial(foo, 1) where def foo(a, b, c): ....

This is why I don't really care about this bug. It wasn't really correct earlier. One could even argue that a signature like <functools.partial object at 0x7f2b26605850>(b, c) is more correct than partial(b, c), because the latter should really be something like partial(callable, *args, **kwargs).

Feel free to merge.

@davidhalter
Copy link
Owner

That also seems out of step with the codebase though, since inspect.signature isn't actually used anywhere that I can tell.

Did you see def _get_signature in jedi/inference/compiled/access.py?

@PeterJCLaw
Copy link
Collaborator Author

That also seems out of step with the codebase though, since inspect.signature isn't actually used anywhere that I can tell.

Did you see def _get_signature in jedi/inference/compiled/access.py?

Apparently not. I thought I searched for it, but seemingly missed that 🤷

@PeterJCLaw PeterJCLaw merged commit 82d1902 into davidhalter:master Jul 6, 2024
146 checks passed
@PeterJCLaw PeterJCLaw deleted the python-3.13 branch July 6, 2024 10:39
@glaubitz
Copy link

glaubitz commented Aug 6, 2024

Are there already any plans for a new release which contains the fixes for Python 3.13?

I tried backporting these so I can fix the package in openSUSE Tumbleweed, but there is still one test failing for me after the backport:

[  181s]         if is_exception:
[  181s]             # Replace the attribute error message with a the traceback. It's
[  181s]             # way more informative.
[  181s]             result.args = (traceback,)
[  181s] >           raise result
[  181s] E           KeyError: 'Traceback (most recent call last):\n  File "/home/abuild/rpmbuild/BUILD/jedi-0.19.1/jedi/inference/compiled/subprocess/__init__.py", line 343, in listen\n    result = False, None, self._run(*payload)\n                          ~~~~~~~~~^^^^^^^^^^\n  File "/home/abuild/rpmbuild/BUILD/jedi-0.19.1/jedi/inference/compiled/subprocess/__init__.py", line 311, in _run\n    del self._inference_states[inference_state_id]\n        ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^\nKeyError: 139708397551184\n'
[  181s] 
[  181s] /home/abuild/rpmbuild/BUILD/jedi-0.19.1/jedi/inference/compiled/subprocess/__init__.py:270: KeyError
[  181s] =========================== short test summary info ============================
[  181s] FAILED test/test_integration.py::test_completion[usages:86] - KeyError: 'Trac...
[  181s] = 1 failed, 3761 passed, 114 skipped, 7 deselected, 5 xfailed in 86.41s (0:01:26) =

@davidhalter
Copy link
Owner

Is master also failing for you?

@davidhalter
Copy link
Owner

I'm going to release Jedi once Python 3.13 is finally out. Feel free to use master. The master branch will probably release as 0.19.2 in October.

@glaubitz
Copy link

glaubitz commented Aug 6, 2024

I'm going to release Jedi once Python 3.13 is finally out. Feel free to use master. The master branch will probably release as 0.19.2 in October.

This is about RPM package that is part of the distribution. I am currently fixing all packages that fail with Python 3.13 so that the Python system in openSUSE is ready for the Python 3.13 release. So, using a git snapshot is not really the feasible way here.

@PeterJCLaw
Copy link
Collaborator Author

@glaubitz you'll need to include the fix from #2008 as well I expect.

@davidhalter perhaps it would make sense for us to publish a pre-release which downstream maintainers could pull in?

@davidhalter
Copy link
Owner

@PeterJCLaw I'm currently not willing to do the extra work, since the git master will be pushed as a release anyway and maintainers can just use that instead.

@glaubitz
Copy link

glaubitz commented Aug 8, 2024

@PeterJCLaw I'm currently not willing to do the extra work, since the git master will be pushed as a release anyway and maintainers can just use that instead.

If all upstream projects handled it that way, Linux distributions would have a very hard time.

@davidhalter
Copy link
Owner

davidhalter commented Aug 8, 2024

Why? Maybe I don't understand? What's the difference between me now making a pre-release and me telling you to use a specific git commit that is fine?

Also the distros that I have used and come in contact professionally don't even try to have every package ready before the Python release. Debian obviously is notorious for lagging behind a lot. Ubuntu is typically behind just a bit (24.04 is Python 3.11). Even Arch Linux/Gentoo users have only started complaining when we were actually late (AFAIK).

It feels like it's typically Fedora (indirectly RHEL? with the commercial backing of Red Hat?)/Suse users that have these issues. Why is it that you guys want to have a package like Jedi ready before the release of Python 3.13?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

test_completion_param_annotations fails in Python 3.13
3 participants