Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cherrypick 2 snapshot fixes for 2.1.13 #15187

Merged
merged 2 commits into from
Aug 25, 2023

Conversation

rincebrain
Copy link
Contributor

Motivation and Context

Someone pinged #13327 and I noticed I never got either of these into 2.1. Oops.

Description

See #12670 and #14462, but briefly, sometimes you can wind up with snapshots never getting cleaned up because the unmount fails, and on debug builds, you can trip asserts racing yourself to unmount.

How Has This Been Tested?

I've been running this as a local patchset for a long time now, but never pushed it to upstream 2.1. Oopsie.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

On Linux, sometimes, when ZFS goes to unmount an automounted snap,
it fails a VERIFY check on debug builds, because taskq_cancel_id
returned ENOENT after not finding the taskq it was trying to cancel.

This presumably happens when it already died for some reason; in this
case, we don't really mind it already being dead, since we're just
going to dispatch a new task to unmount it right after.

So we just ignore it if we get back ENOENT trying to cancel here,
retry a couple times if we get back the only other possible condition
(EBUSY), and log to dbgmsg if we got anything but ENOENT or success.

(We also add some locking around taskqid, to avoid one or two cases
of two instances of trying to cancel something at once.)

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Nguyen <[email protected]>
Signed-off-by: Rich Ercolani <[email protected]>
Closes openzfs#11632
Closes openzfs#12670
Unfortunately, even after e79b680, I still, much more rarely,
tripped asserts when playing with many ctldir mounts at once.

Since this appears to happen if we dispatched twice too fast, just
ignore it. We don't actually need to do anything if someone already
started doing it for us.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rich Ercolani <[email protected]>
Closes openzfs#14462
@behlendorf behlendorf added the Status: Accepted Ready to integrate (reviewed, tested) label Aug 25, 2023
@behlendorf behlendorf merged commit 426d07d into openzfs:zfs-2.1.13-staging Aug 25, 2023
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants