ZTS: OOM in raidz_002_pos #16566

Open
tonyhutter opened this issue Sep 24, 2024 · 0 comments
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

Comments

@tonyhutter
Contributor

System information

| Type                 | Version/Name |
| -------------------- | ------------ |
| Distribution Name    | Fedora       |
| Distribution Version | 40           |
| Kernel Version       | 6.10         |
| Architecture         | x86_64       |
| OpenZFS Version      |              |

Describe the problem you're observing

On the new GitHub runners, we're seeing an occasional OOM in functional/raidz/raidz_002_pos. The OOM killer is killing the raidz_test program:

Test: /usr/share/zfs/zfs-tests/tests/functional/raidz/raidz_002_pos (run as root) [03:30] [FAIL]
08:41:42.14 /usr/share/zfs/zfs-tests/tests/functional/raidz/raidz_002_pos.ksh[49]: log_must[70]: log_pos: line 265: 918355: Killed
08:41:42.14 20/176... 40/165... 60/165... 80/165... 100/165... 120/165... ERROR: raidz_test -S -e -t 300 exited 265
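
The exit status 265 in the log above is consistent with a SIGKILL from the OOM killer: ksh93 reports a signal death as 256 plus the signal number, and 265 - 256 = 9. Below is a minimal Python sketch of that decoding. It assumes the ksh93 convention applies to the shell running the test suite, and the function name `decode_ksh_status` is hypothetical.

```python
import signal

def decode_ksh_status(status: int) -> str:
    """Describe a ksh93-style exit status.

    Assumption: a status above 256 encodes 256 + signal number,
    which is the ksh93 convention for a process killed by a signal.
    """
    if status > 256:
        signum = status - 256
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited normally with status {status}"

print(decode_ksh_status(265))
```

On Linux, signal 9 is SIGKILL, which matches the "Killed" message on the preceding log line.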

When it was killed, raidz_test had about 5.3GB of anonymous memory resident (anon-rss):

Out of memory: Killed process 918355 (raidz_test) total-vm:13275572kB, anon-rss:5306400kB, file-rss:56kB, shmem-rss:0kB, UID:0 pgtables:24564kB oom_score_adj:0
 [ 7605.935208] systemd-userdbd invoked oom-killer: gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=0
  [ 7605.938835] CPU: 1 PID: 708 Comm: systemd-userdbd Tainted: P           OE      6.10.10-200.fc40.x86_64 #1
  [ 7605.941634] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
  [ 7605.944347] Call Trace:
  [ 7605.945197]  <TASK>
  [ 7605.945978]  dump_stack_lvl+0x5d/0x80
  [ 7605.947381]  dump_header+0x44/0x18d
  [ 7605.948634]  oom_kill_process.cold+0xa/0xaa
  [ 7605.949964]  out_of_memory+0x219/0x4b0
  [ 7605.951262]  __alloc_pages_slowpath.constprop.0+0xb4e/0xe00
  [ 7605.953023]  __alloc_pages_noprof+0x31f/0x350
  [ 7605.954412]  alloc_pages_mpol_noprof+0xd7/0x1e0
  [ 7605.955867]  ? __filemap_get_folio+0x37/0x2e0
  [ 7605.957254]  vma_alloc_folio_noprof+0x63/0xc0
  [ 7605.958667]  ? __swap_duplicate+0xdb/0x190
  [ 7605.960007]  do_swap_page+0x4a9/0xd60
  [ 7605.961215]  ? srso_alias_return_thunk+0x5/0xfbef5
  [ 7605.962768]  ? __handle_mm_fault+0x829/0x1080
  [ 7605.964150]  ? srso_alias_return_thunk+0x5/0xfbef5
  [ 7605.965656]  ? __pte_offset_map+0x1b/0x180
  [ 7605.966971]  __handle_mm_fault+0x829/0x1080
  [ 7605.968335]  ? srso_alias_return_thunk+0x5/0xfbef5
  [ 7605.969820]  ? mt_find+0x21c/0x580
  [ 7605.971016]  handle_mm_fault+0xf0/0x300
  [ 7605.972239]  do_user_addr_fault+0x15d/0x620
  [ 7605.973660]  ? srso_alias_return_thunk+0x5/0xfbef5
  [ 7605.975112]  ? asm_exc_page_fault+0x26/0x30
  [ 7605.976458]  exc_page_fault+0x7e/0x180
  [ 7605.977673]  asm_exc_page_fault+0x26/0x30
  [ 7605.978963] RIP: 0010:__get_user_8+0x11/0x20
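
To compare runs, it can help to pull the memory counters out of the "Out of memory: Killed process" summary line. The following is a rough Python sketch that parses the line quoted above; the function name `parse_oom_line` is hypothetical, and the regular expressions assume the field layout shown in this report (other kernel versions may format the line differently).

```python
import re

# Summary line copied from the kernel log in this report.
OOM_LINE = (
    "Out of memory: Killed process 918355 (raidz_test) "
    "total-vm:13275572kB, anon-rss:5306400kB, file-rss:56kB, "
    "shmem-rss:0kB, UID:0 pgtables:24564kB oom_score_adj:0"
)

def parse_oom_line(line: str) -> dict:
    """Extract the pid, process name, and every '<field>:<N>kB' counter."""
    proc = re.search(r"Killed process (\d+) \(([^)]+)\)", line)
    if proc is None:
        raise ValueError("not an OOM kill summary line")
    # Only fields with a kB suffix are collected; UID and oom_score_adj are skipped.
    fields = {k: int(v) for k, v in re.findall(r"([\w-]+):(\d+)kB", line)}
    return {"pid": int(proc.group(1)), "comm": proc.group(2), "kB": fields}

info = parse_oom_line(OOM_LINE)
for name, kb in info["kB"].items():
    # The kernel's "kB" is 1024 bytes, so kB / 1024**2 gives GiB.
    print(f"{name:>10}: {kb:>10} kB = {kb / 1024**2:.2f} GiB")
```

The anon-rss counter is the one that corresponds to the roughly 5GB of resident memory; total-vm is the size of the virtual address space and is much larger.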

Full examples:
https://github.com/openzfs/zfs/actions/runs/10978174081/job/30481019124
https://github.com/openzfs/zfs/actions/runs/10998799735/job/30537538603

Describe how to reproduce the problem

Include any warning/errors/backtraces from the system logs

tonyhutter added the Type: Defect Incorrect behavior (e.g. crash, hang) label Sep 24, 2024
tonyhutter added a commit to tonyhutter/zfs that referenced this issue Oct 3, 2024
raidz_002_pos can take over 5GB of RAM and will sometimes OOM.
Enable 16GB of swap space to help mitigate this.

Fixes: openzfs#16566
Signed-off-by: Tony Hutter <[email protected]>
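
One way to sanity-check whether a runner has enough headroom for the test is to add MemAvailable and SwapFree from /proc/meminfo and compare the sum with the anon-rss figure from the OOM report. This is only a sketch: the function name `headroom_kb` and the sample values are hypothetical and were not taken from the failing runner, and the sum is a crude estimate rather than a guarantee that the OOM killer will stay quiet.

```python
def headroom_kb(meminfo_text: str) -> int:
    """Return MemAvailable + SwapFree in kB from /proc/meminfo-style text."""
    values = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and rest.split():
            # The first token after the colon is the numeric value.
            values[key.strip()] = int(rest.split()[0])
    return values.get("MemAvailable", 0) + values.get("SwapFree", 0)

# Hypothetical sample, not taken from the failing runner.
SAMPLE = """\
MemTotal:       16000000 kB
MemAvailable:    3000000 kB
SwapTotal:      16777216 kB
SwapFree:       16777216 kB
"""

NEEDED_KB = 5306400  # anon-rss of raidz_test from the OOM report

print(headroom_kb(SAMPLE) >= NEEDED_KB)
```

On a real Linux host, the text could come from `open("/proc/meminfo").read()` instead of the sample string.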