Bug 18843 - [6sta1] WLAN connection with iwlwifi drops out from time to time
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: x86_64 Linux
Priority: Normal, Severity: normal
Target Milestone: ---
Assignee: Mageia Bug Squad
Depends on: 18881
Reported: 2016-07-02 22:10 CEST by Max Perl
Modified: 2016-07-11 11:02 CEST

Source RPM: kernel

Description Max Perl 2016-07-02 22:10:09 CEST
Description of problem:

My WLAN turns off from time to time and I cannot reactivate it. I have to restart the notebook, after which the WLAN works fine again. My WLAN device is an Intel Wireless 3160, the module is iwlwifi, and the kernel version is 4.6.3-desktop-1.mga6.

Here are the parts of the dmesg output that seem relevant to me. I can send the full output if that would help:

[  295.547565] WARNING: CPU: 1 PID: 0 at drivers/net/wireless/intel/iwlwifi/iwl-trans.h:1161 iwl_pcie_txq_stuck_timer+0x29c/0x360 [iwlwifi]
[  295.547571] Modules linked in: [...]
[  295.744406] iwlwifi 0000:01:00.0: Q 6 is active and mapped to fifo 2 ra_tid 0xa5a5 [90,1515870810]
[... these warnings are repeated several times]


Then, after some time, I get the following:
[  298.895800] iwlwifi 0000:01:00.0: Q 30 is active and mapped to fifo 2 ra_tid 0xa5a5 [90,1515870810]
[  298.961548] clocksource: timekeeping watchdog on CPU1: Marking clocksource 'tsc' as unstable because the skew is too large:
[  298.961555] clocksource:                       'acpi_pm' wd_now: 4d606c wd_last: 72e923 mask: ffffff
[  298.961558] clocksource:                       'tsc' cs_now: 8918dec238 cs_last: 875bd8d002 mask: ffffffffffffffff
[  298.962397] clocksource: Switched to clocksource acpi_pm
[  306.475322] audit: type=1326 audit(1467487849.363:291): auid=1000 uid=1000 gid=1000 ses=3 pid=3377 comm="chrome" exe="/usr/lib64/chromium-browser/chrome" sig=0 arch=c000003e syscall=273 compat=0 ip=0x7fdfd831860d code=0x50000
[... the audit message is repeated several times]
[  348.429389] iwlwifi 0000:01:00.0: Failed to wake NIC for hcmd
[  348.429449] iwlwifi 0000:01:00.0: Error sending STATISTICS_CMD: enqueue_hcmd failed: -5
[  348.462304] iwlwifi 0000:01:00.0: Failed to wake NIC for hcmd
[  348.462354] iwlwifi 0000:01:00.0: Error sending STATISTICS_CMD: enqueue_hcmd failed: -5


I tried to "restart" the device with rfkill block 1 and rfkill unblock 1.
Then I got the following errors:
[  536.768681] iwlwifi 0000:01:00.0: Q 30 is active and mapped to fifo 2 ra_tid 0xa5a5 [90,1515870810]
[  536.801830] iwlwifi 0000:01:00.0: Failed to wake NIC for hcmd
[  536.801878] iwlwifi 0000:01:00.0: Error sending MAC_CONTEXT_CMD: enqueue_hcmd failed: -5
[  536.801883] iwlwifi 0000:01:00.0: Failed to remove MAC context: -5
[  536.834821] iwlwifi 0000:01:00.0: Failed to wake NIC for hcmd
[  536.834853] iwlwifi 0000:01:00.0: Error sending SCD_QUEUE_CFG: enqueue_hcmd failed: -5
[  536.834858] iwlwifi 0000:01:00.0: Failed to disable queue 0 (ret=-5)
[  536.867768] iwlwifi 0000:01:00.0: Failed to wake NIC for hcmd
[  536.867805] iwlwifi 0000:01:00.0: Error sending SCD_QUEUE_CFG: enqueue_hcmd failed: -5
[  536.867811] iwlwifi 0000:01:00.0: Failed to disable queue 1 (ret=-5)
[  536.900733] iwlwifi 0000:01:00.0: Failed to wake NIC for hcmd
[  536.900772] iwlwifi 0000:01:00.0: Error sending SCD_QUEUE_CFG: enqueue_hcmd failed: -5
[  536.900776] iwlwifi 0000:01:00.0: Failed to disable queue 2 (ret=-5)
[  536.933699] iwlwifi 0000:01:00.0: Failed to wake NIC for hcmd
[  536.933740] iwlwifi 0000:01:00.0: Error sending SCD_QUEUE_CFG: enqueue_hcmd failed: -5
[  536.933746] iwlwifi 0000:01:00.0: Failed to disable queue 3 (ret=-5)
[  540.426785] audit: type=1130 audit(1467488083.082:311): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-rfkill comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[  545.437982] audit: type=1131 audit(1467488088.088:312): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-rfkill comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[  546.712788] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled
[  546.976465] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled
[  547.310386] audit: type=1130 audit(1467488089.955:313): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-rfkill comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[  548.765501] iwlwifi 0000:01:00.0: Could not load the [0] uCode section
[  548.765527] iwlwifi 0000:01:00.0: Failed to start INIT ucode: -5
[  548.765532] iwlwifi 0000:01:00.0: Failed to run INIT ucode: -5
[  552.321870] audit: type=1131 audit(1467488094.964:314): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-rfkill comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[  553.505133] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled
[  553.769337] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled
[  555.549610] iwlwifi 0000:01:00.0: Could not load the [0] uCode section
[  555.549636] iwlwifi 0000:01:00.0: Failed to start INIT ucode: -5
[  555.549641] iwlwifi 0000:01:00.0: Failed to run INIT ucode: -5
[  559.981190] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled
[  560.244148] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled
[  562.025501] iwlwifi 0000:01:00.0: Could not load the [0] uCode section
[  562.025546] iwlwifi 0000:01:00.0: Failed to start INIT ucode: -5
[  562.025551] iwlwifi 0000:01:00.0: Failed to run INIT ucode: -5
[  566.454705] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled
[  566.718707] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled
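
For anyone trying to reproduce this, here is a minimal sketch (not part of the original report) for counting the "Failed to wake NIC" lines in a kernel log and for repeating the rfkill reset described above. The function names are hypothetical, and `rfkill block wifi` addresses all Wi-Fi radios by type instead of the index 1 used in the report, which is an assumption. As the log above shows, this kind of reset did not recover the device here.

```shell
# Count iwlwifi "Failed to wake NIC" lines in kernel log text read from stdin.
count_wake_failures() {
    # grep -c prints 0 and exits non-zero when nothing matches; "|| true"
    # keeps the function's exit status at 0 in that case.
    grep -c 'iwlwifi .*Failed to wake NIC for hcmd' || true
}

# Soft-block and then unblock all Wi-Fi radios (needs root).
# Assumption: the "wifi" type is used instead of the index 1 from the report.
reset_wifi_rfkill() {
    rfkill block wifi && rfkill unblock wifi
}

# Usage:
#   dmesg | count_wake_failures
#   reset_wifi_rfkill
```

A count that keeps growing after the reset would suggest the device is still hung and that only a reboot helps, as described in this report.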


Version-Release number of selected component (if applicable):
Mageia 6 sta1 DVD

How reproducible:
Not reproducible on demand. The problem occurs from time to time, not consistently. My impression is that it happens more often when several programs are accessing the internet.
Comment 1 Marja Van Waes 2016-07-03 05:36:34 CEST
CC'ing some people who understand a lot more about networking than I do.

@ tv, tmb, blino:

Without understanding the description of this bug, it feels foolish to blindly ask for the information that's suggested here:
https://wiki.mageia.org/en/Triage_guide#Networking_issues

Should I do that for networking issues, anyway?

CC: (none) => mageia, marja11, thierry.vignaud, tmb

Thierry Vignaud 2016-07-03 10:30:08 CEST

Source RPM: (none) => kernel

Comment 2 Thomas Backlund 2016-07-03 11:07:55 CEST
That looks like a possible firmware hang, but since I'm about to do both a firmware and a kernel update, please wait a little and test with that.

It will be kernel-4.7.0-0.rc6.1.mga6, landing tonight or tomorrow.

If it's still an issue after that, we'll dig deeper into this.
Comment 3 Max Perl 2016-07-06 12:10:54 CEST
Dear Thomas,
thank you for your answer. No problem, I will wait and then test again.
By the way, I also noticed the problem on my Mageia 5 install (not as often as on my Mageia 6 install, but two or three times I had this or a similar issue).

Oh, I just saw that my Mageia 6 system is installing the new kernel as I write this :-) I will report back on the issue in the next few days. Thanks a lot for all your work!
Comment 4 Max Perl 2016-07-06 15:12:41 CEST
Hello all,
Unfortunately the problem still exists after updating the kernel, although the WLAN connection seems to stay stable longer than before.

Perhaps the driver loads the wrong firmware?

dmesg shows the following:
[   13.712361] iwlwifi 0000:01:00.0: loaded firmware version 17.340337.0 op_mode iwlmvm
[   13.853098] iwlwifi 0000:01:00.0: Detected Intel(R) Dual Band Wireless AC 3160, REV=0x164


https://wireless.wiki.kernel.org/en/users/drivers/iwlwifi recommends iwlwifi-7265-ucode-16.242414.0.tgz as the firmware for the 3160 device with kernels above 4.3, but I don't really understand this.

Please let me know if you need further information.
Comment 5 Thomas Backlund 2016-07-06 15:40:46 CEST
You are using a 3160 device, so it should use the 3160 firmware, not 7265.

You can try to move the api 17 fw away by doing:

mv /lib/firmware/iwlwifi-3160-17.ucode /root


and reboot. Does it work better then?
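
To confirm which firmware the driver actually picked up after such a change, something like the following sketch may help. It is not from the original thread: the function name is hypothetical, and it only parses the "loaded firmware version" line in the format quoted in comment 4.

```shell
# Print the firmware version(s) that iwlwifi reports in kernel log text on stdin.
loaded_fw_version() {
    sed -n 's/.*iwlwifi .*loaded firmware version \([^ ]*\).*/\1/p'
}

# Usage:
#   dmesg | loaded_fw_version
#   ls /lib/firmware/iwlwifi-3160-*.ucode   # firmware files the driver can choose from
```

With the API 17 file moved away, the reported version would be expected to start with 16 instead of 17.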
Comment 6 Max Perl 2016-07-06 16:30:16 CEST
Oh, sorry, I copied the wrong filename ;-)
I will test with version 16 of the firmware over the next few days.
One thing doesn't work: the WLAN connection cannot be activated at boot. I have to connect to the router manually after booting.

In dmesg I get the following messages:
[   13.737312] Intel(R) Wireless WiFi driver for Linux
[   13.737318] Copyright(c) 2003- 2015 Intel Corporation
[   13.824751] iwlwifi 0000:01:00.0: Direct firmware load for iwlwifi-3160-17.ucode failed with error -2
[   13.824758] iwlwifi 0000:01:00.0: Falling back to user helper

but later dmesg also shows:
[   73.916056] iwlwifi 0000:01:00.0: loaded firmware version 16.242414.0 op_mode iwlmvm
[   74.040684] iwlwifi 0000:01:00.0: Detected Intel(R) Dual Band Wireless AC 3160, REV=0x164
[   74.040817] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled
[   74.041069] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled
[   74.192366] ieee80211 phy0: Selected rate control algorithm 'iwl-mvm-rs'
[   74.230975] audit: type=1130 audit(1467814151.736:188): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-rfkill comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   74.232346] audit: type=1130 audit(1467814151.738:189): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-networkd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   74.248272] iwlwifi 0000:01:00.0 wlp1s0: renamed from wlan0
[   74.419607] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled
[   74.419866] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled
[   74.528250] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled
[   74.528508] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled
[   74.544020] IPv6: ADDRCONF(NETDEV_UP): wlp1s0: link is not ready
[   74.607041] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled
[   74.607337] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled
[   74.714988] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled
[   74.715251] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled


And in /var/log/boot I can find the following message:
[FAILED] Failed to start LSB: Bring up/down networking.
See 'systemctl status network.service' for details.

The output of systemctl status network.service is:
 network.service - LSB: Bring up/down networking
   Loaded: loaded (/etc/rc.d/init.d/network; generated; vendor preset: enabled)
   Active: failed (Result: exit-code) since Wed 2016-07-06 16:08:25 CEST; 20min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 995 ExecStart=/etc/rc.d/init.d/network start (code=exited, status=1/FAILURE)

Jul 06 16:08:23 localhost systemd[1]: Starting LSB: Bring up/down networking...
Jul 06 16:08:23 localhost systemd-sysctl[1029]: Couldn't write '0' to 'net/bridge/bridge-nf-call-ip6tables', ignoring: No such file or directory
Jul 06 16:08:24 localhost network[995]: Bringing up loopback interface:  [  OK  ]
Jul 06 16:08:24 localhost network[995]: Bringing up interface wlp1s0:  ERROR: [/etc/sysconfig/network-scripts/ifup-eth] iwlwifi device wlp1s0 does not seem to be present, delaying initialization.
Jul 06 16:08:24 localhost network[995]: [FAILED]
Jul 06 16:08:25 localhost systemd[1]: network.service: Control process exited, code=exited status=1
Jul 06 16:08:25 localhost systemd[1]: Failed to start LSB: Bring up/down networking.
Jul 06 16:08:25 localhost systemd[1]: network.service: Unit entered failed state.
Jul 06 16:08:25 localhost systemd[1]: network.service: Failed with result 'exit-code'.

Whether the WLAN is more stable apart from this boot issue is something I will need to test for longer.
Comment 7 Max Perl 2016-07-06 17:12:37 CEST
Unfortunately the problem also occurs with firmware version 16.
But as I said, after upgrading the kernel the connection stays stable a little longer before the problem occurs.
Comment 8 Max Perl 2016-07-06 18:24:25 CEST
The problem also seems to occur on Arch Linux, with the same WLAN chip! See https://bbs.archlinux.de/viewtopic.php?id=29140 (unfortunately in German).
I will try the modprobe.conf settings later.
Comment 9 Max Perl 2016-07-06 21:16:22 CEST
The modprobe.conf options (options iwlwifi 11n_disable=1 \ options iwlwifi swcrypto=1) didn't help. Any other ideas?
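
For reference, a sketch of how these two options could be written to a modprobe configuration file. This is not from the original thread: the function name and the file name `iwlwifi.conf` are assumptions, and as noted above the options did not help in this case.

```shell
# Emit the two modprobe option lines tried in this report.
iwlwifi_options() {
    printf '%s\n' \
        'options iwlwifi 11n_disable=1' \
        'options iwlwifi swcrypto=1'
}

# Usage (as root; the file name is an assumption):
#   iwlwifi_options > /etc/modprobe.d/iwlwifi.conf
# The options only take effect once the module is loaded again, for example
# after a reboot.
```

Removing the file again restores the driver defaults.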
Comment 10 Max Perl 2016-07-07 16:01:43 CEST
It is definitely an upstream kernel bug. I have just tested a Linux Mint 18 live system with a 4.x kernel, and the same error occurs there. I have also filed a bug report at the kernel Bugzilla.
Comment 11 Thomas Backlund 2016-07-07 16:08:20 CEST
Also please test the kernel-4.7.0-0.rc6.2.mga6 when it lands (currently building)... 
I've pulled in 2 upstream iwlwifi fixes for interrupt and firmware issues...
Comment 12 Max Perl 2016-07-09 01:13:06 CEST
Dear Thomas,
Unfortunately the WLAN also drops out with kernel-4.7.0-0.rc6.2.mga6, although the error messages repeat less often than with the old kernel.

But I am testing a new solution at the moment:
As described in another bug report (see https://bugs.mageia.org/show_bug.cgi?id=18881), I suffer from many freezes. The problem seems to be a kernel bug with Bay Trail processors. In the forum (see https://forums.mageia.org/en/viewtopic.php?f=8&t=11194) I got the good advice to add the kernel option "intel_idle.max_cstate=1" as a workaround. I need more time to test whether this helps with the freezes (but the positive reports on the kernel Bugzilla give me hope :-) ).

I don't understand why, but since I added this kernel option my WLAN connection also seems much more stable (even with the preinstalled 4.6 kernel!).
That's all for tonight. I need more time to test the WLAN issue as well, but at the moment I am confident that the "intel_idle.max_cstate=1" workaround could solve both problems.

(The question then is whether and how to automate this solution for beginners. If that is not possible or doesn't make sense, we should at least add a hint to the errata.)
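
A small sketch for checking whether the running kernel was actually booted with this option. It is not from the original thread and the function name is hypothetical. Making the option permanent is typically done by adding it to the kernel command line in the boot loader configuration (for GRUB 2, usually the GRUB_CMDLINE_LINUX_DEFAULT line in /etc/default/grub, followed by regenerating the GRUB configuration); that is an assumption about the boot loader setup, not something stated in this report.

```shell
# Print the value of intel_idle.max_cstate found in a kernel command line
# passed as the first argument; return non-zero if the option is absent.
max_cstate_from_cmdline() {
    for word in $1; do
        case "$word" in
            intel_idle.max_cstate=*)
                printf '%s\n' "${word#intel_idle.max_cstate=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Usage:
#   max_cstate_from_cmdline "$(cat /proc/cmdline)" || echo "option not set"
```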
Comment 13 Thomas Backlund 2016-07-09 01:19:13 CEST
Ah, yes... the Bay Trail bug... it's not really a kernel bug, it's a hardware design flaw...

We have ~50 laptops at work running Windows where we had to disable deep C-state sleep to get them stable :)

I wonder if I should simply patch the kernel to prevent the deep C-states on all Bay Trail systems...
Comment 14 Marja Van Waes 2016-07-09 14:55:51 CEST
(In reply to Max Perl from comment #12)

> but at the moment I am confident, that the "intel_idle.max_cstate=1"
> workaround could solve both problems...
> 

So setting this report to depend on bug 18881

Depends on: (none) => 18881

Comment 15 Max Perl 2016-07-11 11:02:48 CEST
Just for your information: after adding the kernel option, the WLAN works just fine! I am marking this bug as resolved; I hope this is okay. We can discuss the Bay Trail problem in the other report.

Status: NEW => RESOLVED
Resolution: (none) => FIXED

