Bug 22496

Summary: Kernel BUG when creating raid5
Product: Mageia
Reporter: Herbert Poetzl <herbert>
Component: RPM Packages
Assignee: Kernel and Drivers maintainers <kernel>
Status: RESOLVED OLD
QA Contact:
Severity: critical
Priority: Normal
CC: herbert, ouaurelien, tmb
Version: 6
Keywords: NEEDINFO
Target Milestone: ---
Hardware: All
OS: Linux
Whiteboard:
Source RPM: kernel-4.14.13-1.mga6.src.rpm
CVE:
Status comment:

Description Herbert Poetzl 2018-01-31 05:43:15 CET
Description of problem:
When creating a RAID5 array with certain options (see the mdadm command below), the kernel fails a memory allocation and then hits a NULL pointer dereference.

Version-Release number of selected component (if applicable):
kernel-server-4.14.13-1.mga6-1-1.mga6

How reproducible:
Always

Steps to Reproduce:
1. mdadm --create /dev/md0 --chunk=524288 --bitmap=internal --bitmap-chunk=16384 -e 1.0 -l 5 -n 5 /dev/sd{a,b,c,d,e}
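
For scale: mdadm(8) documents --chunk and --bitmap-chunk values without a unit suffix as kibibytes, so the sizes requested in step 1 are far above the mdadm default chunk of 512 KiB. The sketch below only does that unit arithmetic; the helper name kib_to_mib is made up for illustration, and whether the chunk size is what makes the allocation in raid456_cpu_up_prepare fail is an assumption that the log itself does not confirm.

```python
# Sizes as passed on the mdadm command line in step 1.
# Assumption: no unit suffix means kibibytes, as documented in mdadm(8).
CHUNK_KIB = 524288
BITMAP_CHUNK_KIB = 16384
DATA_DISKS = 5 - 1  # RAID5 over 5 devices: one chunk per stripe holds parity


def kib_to_mib(kib):
    """Convert kibibytes to mebibytes (hypothetical helper)."""
    return kib // 1024


chunk_mib = kib_to_mib(CHUNK_KIB)
bitmap_chunk_mib = kib_to_mib(BITMAP_CHUNK_KIB)
full_stripe_mib = chunk_mib * DATA_DISKS

print(f"chunk size:       {chunk_mib} MiB per device")
print(f"bitmap chunk:     {bitmap_chunk_mib} MiB")
print(f"full data stripe: {full_stripe_mib} MiB")
```

That is a 512 MiB chunk per device and a 2048 MiB full data stripe, which may be worth keeping in mind when reading the allocation failure in the log.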

[  146.277719] raid456_cpu_up_prepare: failed memory allocation for cpu0
[  146.284344] BUG: unable to handle kernel NULL pointer dereference at           (null)
[  146.292310] IP: __cpuhp_state_remove_instance+0x92/0x180
[  146.297735] PGD 800000042afe1067 P4D 800000042afe1067 PUD 42afe2067 PMD 0 
[  146.304763] Oops: 0002 [#1] SMP PTI
[  146.308334] Modules linked in: raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xt_recent ip6table_nat nf_nat_ipv6 nf_nat xt_comment ip6t_REJECT nf_reject_ipv6 xt_addrtype bridge stp llc xt_mark ip6table_mangle nf_conntrack_snmp xt_tcpudp xt_CT ip6table_raw xt_multiport nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack xt_NFLOG nfnetlink_log xt_LOG nf_log_ipv6 nf_log_common nf_conntrack_tftp nf_conntrack_sip nf_conntrack_sane nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netlink nfnetlink nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp ts_kmp nf_conntrack_amanda nf_conntrack ip6table_filter ip6_tables x_tables af_packet snd_hda_codec_hdmi msr sunrpc snd_hda_codec_ca0132 i915 drm_kms_helper drm i2c_algo_bit snd_hda_intel intel_rapl
[  146.380822]  snd_hda_codec x86_pkg_temp_thermal snd_hda_core intel_powerclamp nls_utf8 nls_cp437 coretemp vfat fat kvm_intel snd_hwdep iTCO_wdt kvm snd_pcm iTCO_vendor_support snd_timer snd irqbypass intel_cstate intel_uncore soundcore mxm_wmi intel_rapl_perf e1000e ixgbe i2c_i801 alx ptp ov5693(C) v4l2_common fan pps_core mdio dca mei_me hci_uart evdev joydev input_leds mei shpchp thermal btbcm videodev serdev btqca btintel media bluetooth battery ecdh_generic rfkill tpm_infineon tpm_tis video wmi pinctrl_sunrisepoint pinctrl_intel intel_lpss_acpi intel_lpss acpi_als kfifo_buf tpm_tis_core tpm button acpi_pad industrialio sch_fq_codel gpio_it87 it87 hwmon_vid efivarfs ipv6 crc_ccitt autofs4 algif_skcipher af_alg dm_crypt uas usb_storage hid_generic usbhid xhci_pci xhci_hcd crct10dif_pclmul crc32_pclmul
[  146.453509]  crc32c_intel ghash_clmulni_intel pcbc usbcore aesni_intel aes_x86_64 crypto_simd cryptd glue_helper usb_common i2c_hid hid dm_mirror dm_region_hash dm_log dm_mod ide_pci_generic ide_core
[  146.471562] CPU: 1 PID: 2177 Comm: mdadm Tainted: G         C      4.14.13-server-1.mga6 #1
[  146.480124] Hardware name: Gigabyte Technology Co., Ltd. Z170X-Gaming 7/Z170X-Gaming 7, BIOS F22j 01/11/2018
[  146.490145] task: ffffa00aeb4bbb00 task.stack: ffffb65042924000
[  146.496218] RIP: 0010:__cpuhp_state_remove_instance+0x92/0x180
[  146.502183] RSP: 0018:ffffb65042927bb8 EFLAGS: 00010246
[  146.507537] RAX: 0000000000000000 RBX: 0000000000000080 RCX: 0000000000000001
[  146.514809] RDX: 0000000000000000 RSI: 0000000000000080 RDI: ffffffff9513eae0
[  146.522063] RBP: ffffa00af1a569a8 R08: 0000000000000000 R09: 000000000000000f
[  146.529351] R10: ffffa00afa5bcdf8 R11: ffffffff9561f40d R12: 000000000000002b
[  146.536622] R13: 0000000000016320 R14: 0000000000000001 R15: ffffa00af1a56800
[  146.543876] FS:  00007f3100867700(0000) GS:ffffa00b0ec80000(0000) knlGS:0000000000000000
[  146.552136] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  146.557953] CR2: 0000000000000000 CR3: 000000042b64c004 CR4: 00000000003606e0
[  146.565216] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  146.572494] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  146.579765] Call Trace:
[  146.582257]  free_conf+0xab/0x160 [raid456]
[  146.586528]  setup_conf+0x3c2/0xa80 [raid456]
[  146.590973]  ? null_show+0x10/0x10
[  146.594475]  raid5_run+0x84d/0xaf0 [raid456]
[  146.598833]  ? bioset_create+0x1e8/0x280
[  146.602862]  md_run+0x445/0xb50
[  146.606098]  ? kmem_cache_free+0x1aa/0x1b0
[  146.610271]  ? kmem_cache_free+0x1aa/0x1b0
[  146.614458]  do_md_run+0xf/0xb0
[  146.617665]  md_ioctl+0x1b47/0x1d90
[  146.621245]  ? tomoyo_execute_permission+0x20/0xa0
[  146.626125]  blkdev_ioctl+0x4a4/0x910
[  146.629859]  block_ioctl+0x39/0x40
[  146.633320]  do_vfs_ioctl+0x9f/0x5e0
[  146.636994]  SyS_ioctl+0x74/0x80
[  146.640293]  entry_SYSCALL_64_fastpath+0x1e/0x81
[  146.644991] RIP: 0033:0x7f3100188687
[  146.648682] RSP: 002b:00007ffe9fd6f928 EFLAGS: 00000246
[  146.648682] Code: 00 8b 15 32 75 fc 00 85 d2 0f 85 d6 00 00 00 48 c7 c7 60 9f 03 95 e8 2e 54 78 00 45 84 f6 75 5f 48 8b 45 00 48 8b 55 08 48 85 c0 <48> 89 02 74 04 48 89 50 08 48 b8 00 01 00 00 00 00 ad de 48 c7 
[  146.673280] RIP: __cpuhp_state_remove_instance+0x92/0x180 RSP: ffffb65042927bb8
[  146.680757] CR2: 0000000000000000
[  146.684121] ---[ end trace e52c59d7cd5821b3 ]---
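
Note on reading the oops above: the number after "Oops:" is the x86 page-fault error code in hexadecimal. The following is a minimal sketch of how its bits can be decoded; the function name decode_oops_code is hypothetical, and the bit meanings are the standard x86 ones rather than anything taken from this log.

```python
# x86 page-fault error code bits, as printed after "Oops:" (hexadecimal).
PF_PROT = 1 << 0   # 0: page not present, 1: protection violation
PF_WRITE = 1 << 1  # 0: read access,      1: write access
PF_USER = 1 << 2   # 0: kernel mode,      1: user mode
PF_RSVD = 1 << 3   # reserved bit set in a page-table entry
PF_INSTR = 1 << 4  # fault on an instruction fetch


def decode_oops_code(code_hex):
    """Return a short description of an x86 page-fault error code."""
    code = int(code_hex, 16)
    parts = [
        "protection violation" if code & PF_PROT else "page not present",
        "write" if code & PF_WRITE else "read",
        "user mode" if code & PF_USER else "kernel mode",
    ]
    if code & PF_RSVD:
        parts.append("reserved bit set")
    if code & PF_INSTR:
        parts.append("instruction fetch")
    return ", ".join(parts)


print(decode_oops_code("0002"))  # page not present, write, kernel mode
```

A kernel-mode write to a non-present page matches the rest of the dump: CR2 and RDX are both zero, and the marked bytes <48> 89 02 in the Code line decode to a 64-bit store of RAX through RDX.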
Comment 1 Thomas Backlund 2018-01-31 09:40:39 CET
Does it work if you use an older kernel such as 4.14.10-1.mga6 (from core updates)?

Also, please try the new 4.14.15-3.mga6 (from core updates testing).

The reason for these tests is that 4.14.10 does not have the page table isolation restrictions, and 4.14.15 has several follow-up fixes.

Keywords: (none) => NEEDINFO
CC: (none) => tmb

Thomas Backlund 2018-01-31 09:40:57 CET

Assignee: bugsquad => kernel

Comment 2 Herbert Poetzl 2018-02-27 11:42:36 CET
Tested on 4.14.20-server-1.mga6, same problem ...

[ 2105.401407] raid456_cpu_up_prepare: failed memory allocation for cpu0
[ 2105.408004] BUG: unable to handle kernel NULL pointer dereference at           (null)
[ 2105.415982] IP: __cpuhp_state_remove_instance+0x92/0x180
[ 2105.421406] PGD 8000000439a36067 P4D 8000000439a36067 PUD 341c30067 PMD 0 
[ 2105.428462] Oops: 0002 [#1] SMP PTI
[ 2105.432023] Modules linked in: raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xt_recent ip6table_nat nf_nat_ipv6 nf_nat xt_comment ip6t_REJECT nf_reject_ipv6 xt_addrtype bridge stp llc xt_mark ip6table_mangle nf_conntrack_snmp xt_tcpudp xt_CT ip6table_raw xt_multiport nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack xt_NFLOG nfnetlink_log xt_LOG nf_log_ipv6 nf_log_common nf_conntrack_tftp nf_conntrack_sip nf_conntrack_sane nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netlink nfnetlink nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp ts_kmp nf_conntrack_amanda nf_conntrack ip6table_filter ip6_tables x_tables af_packet msr sunrpc snd_hda_codec_hdmi nls_utf8 nls_cp437 intel_rapl vfat fat x86_pkg_temp_thermal intel_powerclamp
[ 2105.503973]  ses coretemp snd_hda_codec_ca0132 kvm_intel kvm i915 iTCO_wdt iTCO_vendor_support irqbypass enclosure drm_kms_helper drm intel_cstate intel_uncore i2c_algo_bit snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd mxm_wmi intel_rapl_perf soundcore e1000e ixgbe i2c_i801 ptp pps_core mpt3sas raid_class scsi_transport_sas alx mdio dca thermal mei_me evdev input_leds joydev fan acpi_pad wmi video mei button shpchp sch_fq_codel gpio_it87 it87 hwmon_vid efivarfs ipv6 crc_ccitt autofs4 algif_skcipher af_alg dm_crypt uas usb_storage hid_generic usbhid hid crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc xhci_pci xhci_hcd aesni_intel aes_x86_64 crypto_simd cryptd glue_helper usbcore usb_common dm_mirror dm_region_hash dm_log dm_mod ide_pci_generic ide_core
[ 2105.576219] CPU: 2 PID: 3574 Comm: mdadm Not tainted 4.14.20-server-1.mga6 #1
[ 2105.583500] Hardware name: Gigabyte Technology Co., Ltd. Z170X-Gaming 7/Z170X-Gaming 7, BIOS F22j 01/11/2018
[ 2105.593519] task: ffff8d99eab99d80 task.stack: ffffa34741e7c000
[ 2105.599593] RIP: 0010:__cpuhp_state_remove_instance+0x92/0x180
[ 2105.605582] RSP: 0018:ffffa34741e7fba0 EFLAGS: 00010246
[ 2105.610955] RAX: 0000000000000000 RBX: 0000000000000080 RCX: 0000000000000001
[ 2105.618233] RDX: 0000000000000000 RSI: 0000000000000080 RDI: ffffffff8f33eba0
[ 2105.625497] RBP: ffff8d99f94601a8 R08: 0000000000000000 R09: 000000000000000f
[ 2105.632776] R10: ffff8d99fa5bf7f8 R11: ffffffff8f82840d R12: 000000000000002b
[ 2105.640073] R13: 0000000000016320 R14: 0000000000000001 R15: ffff8d99f9460000
[ 2105.647374] FS:  00007f4f82f97700(0000) GS:ffff8d9a0ed00000(0000) knlGS:0000000000000000
[ 2105.655692] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2105.661585] CR2: 0000000000000000 CR3: 00000001f71f6004 CR4: 00000000003606e0
[ 2105.668814] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2105.676118] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2105.683442] Call Trace:
[ 2105.685933]  free_conf+0xab/0x160 [raid456]
[ 2105.690213]  setup_conf+0x3c2/0xa80 [raid456]
[ 2105.694685]  ? null_show+0x10/0x10
[ 2105.698202]  raid5_run+0x84d/0xaf0 [raid456]
[ 2105.702589]  ? bioset_create+0x1e8/0x280
[ 2105.706608]  md_run+0x45f/0xb80
[ 2105.709827]  ? link_path_walk+0x7b/0x510
[ 2105.713845]  do_md_run+0xf/0xb0
[ 2105.717062]  md_ioctl+0x1b9c/0x1df0
[ 2105.720650]  ? tomoyo_execute_permission+0x50/0xa0
[ 2105.725589]  blkdev_ioctl+0x4ae/0x930
[ 2105.729343]  block_ioctl+0x39/0x40
[ 2105.732801]  do_vfs_ioctl+0xa2/0x600
[ 2105.736492]  ? __fput+0x15e/0x1d0
[ 2105.739855]  SyS_ioctl+0x74/0x80
[ 2105.743157]  ? exit_to_usermode_loop+0x94/0xc0
[ 2105.747705]  do_syscall_64+0x6e/0x120
[ 2105.751416]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 2105.756558] RIP: 0033:0x7f4f828b96d7
[ 2105.760219] RSP: 002b:00007ffd88531cd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 2105.768011] RAX: ffffffffffffffda RBX: 00007ffd885327d0 RCX: 00007f4f828b96d7
[ 2105.775326] RDX: 00007ffd88531e30 RSI: 00000000400c0930 RDI: 0000000000000004
[ 2105.782606] RBP: 0000000000000003 R08: 0000000000f6e970 R09: 0000000000000001
[ 2105.789877] R10: 0000000000007250 R11: 0000000000000246 R12: 0000000000f6e798
[ 2105.797184] R13: 00007ffd88532280 R14: 0000000000000000 R15: 0000000000f6c040
[ 2105.804506] Code: 00 8b 15 22 64 1c 01 85 d2 0f 85 d6 00 00 00 48 c7 c7 60 9f 23 8f e8 5e ce 78 00 45 84 f6 75 5f 48 8b 45 00 48 8b 55 08 48 85 c0 <48> 89 02 74 04 48 89 50 08 48 b8 00 01 00 00 00 00 ad de 48 c7 
[ 2105.823791] RIP: __cpuhp_state_remove_instance+0x92/0x180 RSP: ffffa34741e7fba0
[ 2105.831252] CR2: 0000000000000000
[ 2105.834625] ---[ end trace 4e212b3ae66810e5 ]---
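
The 4.14.20 trace follows the same path as the 4.14.13 one: the ioctl from mdadm reaches setup_conf, which calls free_conf (apparently on its error path, right after the failed allocation message), and the fault is again in __cpuhp_state_remove_instance+0x92. A rough sketch for pulling that path out of a pasted trace is below; the names FRAME and reliable_frames are made up for illustration. Entries prefixed with "?" are addresses found on the stack that the unwinder could not confirm as part of the call chain, so the sketch skips them.

```python
import re

# Excerpt of the call trace above; entries marked "?" are stack values the
# unwinder could not verify, so they are skipped here.
TRACE = """\
[ 2105.685933]  free_conf+0xab/0x160 [raid456]
[ 2105.690213]  setup_conf+0x3c2/0xa80 [raid456]
[ 2105.694685]  ? null_show+0x10/0x10
[ 2105.698202]  raid5_run+0x84d/0xaf0 [raid456]
[ 2105.702589]  ? bioset_create+0x1e8/0x280
[ 2105.706608]  md_run+0x45f/0xb80
[ 2105.709827]  ? link_path_walk+0x7b/0x510
[ 2105.713845]  do_md_run+0xf/0xb0
[ 2105.717062]  md_ioctl+0x1b9c/0x1df0
"""

# [timestamp], optional "? " marker, then symbol+0xoffset/0xsize
FRAME = re.compile(r"^\[\s*\d+\.\d+\]\s+(\?\s+)?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def reliable_frames(trace):
    """Return the symbol names of call-trace entries not marked with '?'."""
    frames = []
    for line in trace.splitlines():
        match = FRAME.match(line)
        if match and match.group(1) is None:
            frames.append(match.group(2))
    return frames


print(" <- ".join(reliable_frames(TRACE)))
```

Running the same function over the first trace in the description gives the same sequence of symbols, which suggests the fault is not specific to one kernel build.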

Best,
Herbert

CC: (none) => herbert

Comment 3 Aurelien Oudelet 2020-08-16 16:16:14 CEST
Mageia 6 changed to end-of-life (EOL) status on 2019-09-30. It is no longer 
maintained, which means that it will not receive any further security or bug 
fix updates.

Package Maintainer: If you wish for this bug to remain open because you plan 
to fix it in a currently maintained version, simply change the 'version' to 
a later Mageia version.

Bug Reporter: Thank you for reporting this issue and we are sorry that we 
weren't able to fix it before Mageia 6's end of life. If you are able to 
reproduce it against a later version of Mageia, you are encouraged to click 
on "Version" and change it against that version of Mageia.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a more recent
Mageia release includes newer upstream software that fixes bugs or makes them
obsolete.

If you would like to help fix bugs in the future, don't hesitate to join the
packager team via our mentoring program [1] or join the teams that fit you 
best [2].

[1] https://wiki.mageia.org/en/Becoming_a_Mageia_Packager
[2] http://www.mageia.org/contribute/

Best regards,
Aurélien
Bugsquad Team

CC: (none) => ouaurelien
Status: NEW => RESOLVED
Resolution: (none) => OLD