Bug 31380

Summary: NFS mount fails at boot for server kernel edition, but not with desktop kernel. To do with network initialisation.
Product: Mageia Reporter: Herman Viaene <herman.viaene>
Component: RPM Packages Assignee: Kernel and Drivers maintainers <kernel>
Status: NEW --- QA Contact:
Severity: normal    
Priority: Normal CC: davidwhodgins, ftg
Version: 8   
Target Milestone: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Source RPM: initscripts, kernel CVE:
Status comment:
Attachments: Journal edited to include only net and NFS items
full compressed journal
journal from server kernel
journal from desktop kernel with server kernel installed
journal server after removing enp0s18f2u3

Description Herman Viaene 2023-01-08 11:37:56 CET
Description of problem:
When booting there is a difference between the desktop and server edition for mounting remote NFS-shares. The shares are configured so that users are allowed to mount them.
On the desktop edition the mounting is OK during boot, and shares are immediately available.
On the server edition the mounting is running before the network is up, and therefore fails. Once the user is logged in, the mount command makes it available, but this is a PITA.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Define NFS-shares on one PC, and run NFS-server on that one.
2. Install M8 (or M9 for that matter) on a second PC with the server edition.
3. Have the first PC running, boot the second one, use MCC to define access to the remote NFS-shares, reboot this second PC and note that the remote NFS-shares are not mounted.
Comment 1 Dave Hodgins 2023-01-08 17:55:41 CET
By server edition, do you mean kernel-server-latest?

CC: (none) => davidwhodgins

Comment 2 Lewis Smith 2023-01-08 18:48:18 CET
Guessing this is initscripts.

"On the desktop edition the mounting is OK during boot, and shares are immediately available"
"On the server edition the mounting is running before the network is up, and therefore fails"
I wonder whether this difference is due possibly to the fact that in one case the network does not need to be already up, in the other it does?

"1. Define NFS-shares on one PC, and run NFS-server on that one.
 2. Install M8 (or M9 for that matter) on a second PC with the server edition."
Assuming desktop kernel for the 1st PC.
Do you know whether you see the problem if you install the desktop (rather than server) kernel on the 2nd one?

Can you attach either the compressed journals, or just the parts that embrace Network Up and the NFS mounting.

CC: (none) => lewyssmith
Source RPM: (none) => initscripts

Comment 3 Herman Viaene 2023-01-09 09:09:19 CET
@ Dave: yes
@Lewis: the first PC is also on the server edition. But that doesn't matter since the folders it exports as NFS-shares are on its local disk.
And I can confirm that the problem does not occur when the 2nd PC is installed with the desktop version (I don't know a way to do a fresh install as server), and appears as soon as I install the server kernel and reboot with the server kernel. I've done that on M8, and repeated it XXX times with the M9's we've got up to now. And for that matter, all on the second PC, it makes no difference whether I first configure the NFS-access and then install the server kernel, or first install the server kernel and then configure the NFS-access.
Journals coming up later today.
Comment 4 Herman Viaene 2023-01-09 11:19:32 CET
Created attachment 13631 [details]
Journal edited to include only net and NFS items
Comment 5 Herman Viaene 2023-01-09 11:20:19 CET
Created attachment 13632 [details]
full compressed journal
Comment 6 Herman Viaene 2023-01-09 11:22:53 CET
Added journal files. I made these as root with
# journalctl -b > journalNFS.txt
and derived the two uploaded files from that one.
And the NFS-shares were not mounted.
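
For anyone reproducing this, a reduced journal could be derived from the full one along the following lines. This is only a sketch: the function name filter_net_nfs and the keyword list are assumptions, not the filter actually used for attachment 13631.

```shell
#!/bin/sh
# Hypothetical helper: keep only journal lines that mention networking or NFS.
# The keyword list is an assumption; adjust it to the interfaces on the machine.
filter_net_nfs() {
    grep -i -E 'nfs|mount|network|enp[0-9]|wlp[0-9]|dhc|resolv'
}

# Usage (as root), reading the journal of the current boot:
#   journalctl -b --no-hostname | filter_net_nfs > journalNFS-net.txt
```

The filter reads standard input, so it can also be run on an already saved journalNFS.txt.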
Comment 7 Lewis Smith 2023-01-09 20:02:58 CET
Thanks for this information (from the 2nd machine).
I wonder whether the problem stems from:

Jan 09 10:37:02 mach7.hviaene.thuis systemd[1]: Starting LSB: Bring up/down networking...
...
Jan 09 10:37:17 mach7.hviaene.thuis systemd[1]: Failed to start LSB: Bring up/down networking.

(In reply to Herman Viaene from comment #3)
> And I can confirm that the problem does not occur when the 2nd PC is
> installed with the desktop version (I don't know a way to do a fresh install
> as server), and appears as soon as I install the server kernel and reboot
> with the server kernel
This observation is important.
Are you in a position to furnish the same journal extract with the Desktop kernel in use on the 2nd machine (mounts work)? For comparison.

BTAIM
Assigning to kernel. It may be with Basesystem if initscripts are in question.

Source RPM: initscripts => initscripts, kernel
Assignee: bugsquad => kernel
CC: lewyssmith => (none)
Summary: NFS mount fails at boot for server edition => NFS mount fails at boot for server kernel edition, but not with desktop kernel. To do with network initialisation.

Comment 8 Dave Hodgins 2023-01-09 22:24:35 CET
Comment on attachment 13631 [details]
Journal edited to include only net and NFS items

Looks like the network interface has been moved.
enp0s18f2u3 does not seem to be present, delaying initialization.

Please remove /etc/sysconfig/network-scripts/ifcfg-enp0s18f2u3
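
A quick way to spot other leftover interface configurations might look like this. It is a sketch under two assumptions: that the interface name is the part of the file name after "ifcfg-", and that every interface currently present has an entry under /sys/class/net. The function name list_stale_ifcfg is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print ifcfg-* files whose interface is not present.
# Both directories can be overridden, which also makes the function testable.
list_stale_ifcfg() {
    cfgdir=${1:-/etc/sysconfig/network-scripts}
    sysdir=${2:-/sys/class/net}
    for f in "$cfgdir"/ifcfg-*; do
        [ -e "$f" ] || continue        # glob did not match anything
        dev=${f##*/ifcfg-}             # interface name taken from the file name
        [ -e "$sysdir/$dev" ] || printf '%s\n' "$f"
    done
}

# Usage:
#   list_stale_ifcfg
```

A removable adapter that is simply unplugged will be listed too, so the output is a list of candidates to review, not of files that are safe to delete.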
Comment 9 Dave Hodgins 2023-01-10 00:53:23 CET
Also, please be careful with the terminology. IIRC, mandrake had a server
edition. The correct terminology for this case is "a boot using the server
kernel flavor".

For testing, I would install both the kernel-server-latest and
kernel-desktop-latest on the same system, and capture "journalctl -b --no-h"
for both boots to minimize the differences between the two cases.
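
To compare the two captures, timestamps and process IDs need to be stripped first, since they differ on every boot. A minimal sketch, assuming the line layout produced by "journalctl -b --no-hostname" (month, day, time, then the unit name with an optional PID in brackets); normalize_journal is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: drop the leading timestamp and the [PID] suffix so that
# two boots can be compared with diff.
normalize_journal() {
    sed -E -e 's/^[A-Z][a-z]{2} [ 0-9][0-9] [0-9]{2}:[0-9]{2}:[0-9]{2} //' \
           -e 's/^([^ ]*)\[[0-9]+\]:/\1:/'
}

# Usage:
#   normalize_journal < journal-server.txt  > server.norm
#   normalize_journal < journal-desktop.txt > desktop.norm
#   diff server.norm desktop.norm | less
```

The file names in the usage lines are placeholders. Ordering differences between the boots will still show up in the diff, which is the point of the comparison.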
Comment 10 Herman Viaene 2023-01-11 14:48:49 CET
Answering Dave's request in Comment 9:
$ uname -a
Linux mach7.hviaene.thuis 5.15.82-server-1.mga8 #1 SMP Thu Dec 8 23:38:11 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
then
# journalctl -b --no-h > journalNFSm8server.txt
File attached
Now installing desktop.
Comment 11 Herman Viaene 2023-01-11 14:49:30 CET
Created attachment 13636 [details]
journal from server kernel
Comment 12 Herman Viaene 2023-01-11 15:27:55 CET
Installed desktop without problems.
This laptop in test is the Acer Aspire 5253, legacy BIOS.
Rebooted and checked the grub entries: desktop was not present. I more or less expected that, since grub2 has problems with multiple systems and kernels on a PC. So I forced it by going to MCC - boot options and making the desktop kernel the default.
Rebooted and:
$ uname -a
Linux mach7.hviaene.thuis 5.15.82-desktop-1.mga8 #1 SMP Thu Dec 8 21:42:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
# journalctl -b --no-h > /home/tester8/Documents/journalNFSm8desktopserverpresent.txt
Attaching that file, but I noticed that this time the NFS-shares were not mounted.
enp0s18f2u3 error is present in the journal.
I will now remove the server kernel and see what that gives.
Comment 13 Herman Viaene 2023-01-11 15:28:52 CET
Created attachment 13637 [details]
journal from desktop kernel with server kernel installed
Comment 14 Herman Viaene 2023-01-11 16:04:59 CET
No success, NFS-shares do not mount anymore.
Renamed /etc/sysconfig/network-scripts/ifcfg-enp0s18f2u3 to oldifcfg-enp0s18f2u3 and rebooted: NFS-shares are mounted OK.
However, this laptop has an Atheros Wifi chipset, but I have the same problem on my Lenovo B50, which has an Intel wifi. So, this solution might be ad-hoc for this particular HW, but the issue seems to have broader implications.
Comment 15 Frank Griffin 2023-01-11 16:39:03 CET
Just a guess or two.

Long ago and far away I had a problem where a server running the default desktop kernel would not export its NFS shares unless once the boot was complete you issued "systemctl restart nfs-server".  This stopped happening at some point, but if this workaround works for you it might be significant.

It sounds like you're using nm-applet (ifcfg) rather than NM.  Is it possible that when nm-applet initiation fails, it does not automatically retry, and nfs-server doesn't detect that the network is now available?  Does your desktop system use NM or something like it?  Could this be a difference in network recovery between the two systems?

CC: (none) => ftg

Comment 16 Herman Viaene 2023-01-11 16:48:21 CET
Not yet finished: reinstated server kernel and rebooted that one: NFS-shares are not mounted. I will attach the journal file, but I noticed that the mount command in the journal appears before any reference to wlp7s0 (my wifi device) except "Bringing up interface wlp7s0:
Jan 11 16:27:54 network[1136]: Error for wireless request "Set Encode" (8B2A) :
Jan 11 16:27:54 network[1136]:     SET failed on device wlp7s0 ; Invalid argument."
Needless to say that after boot the wifi is up and active.
@ Frank: this has nothing to do with nm-applet (ifcfg) or NM, it all happens at boot time.
Comment 17 Herman Viaene 2023-01-11 16:50:22 CET
Created attachment 13638 [details]
journal server after removing enp0s18f2u3
Comment 18 Dave Hodgins 2023-01-12 00:01:31 CET
They both show nfs failing.
$ grep -e iurt -e mount.nfs *
desktop:Jan 11 15:09:14 kernel: Linux version 5.15.82-desktop-1.mga8 (iurt@rabbit.mageia.org) (gcc (Mageia 10.4.0-3.mga8) 10.4.0, GNU ld (GNU Binutils) 2.36.1) #1 SMP Thu Dec 8 21:42:04 UTC 2022
desktop:Jan 11 15:09:56 mount[1320]: mount.nfs: Failed to resolve server mach1.hviaene.thuis: Name or service not known
desktop:Jan 11 15:09:56 mount[1322]: mount.nfs: Failed to resolve server mach1.hviaene.thuis: Name or service not known
desktop:Jan 11 15:09:56 mount[1317]: mount.nfs: Failed to resolve server mach1.hviaene.thuis: Name or service not known
server:Jan 11 16:27:17 kernel: Linux version 5.15.82-server-1.mga8 (iurt@rabbit.mageia.org) (gcc (Mageia 10.4.0-3.mga8) 10.4.0, GNU ld (GNU Binutils) 2.36.1) #1 SMP Thu Dec 8 23:38:11 UTC 2022
server:Jan 11 16:27:59 mount[1232]: mount.nfs: Failed to resolve server mach1.hviaene.thuis: Name or service not known
server:Jan 11 16:27:59 mount[1227]: mount.nfs: Failed to resolve server mach1.hviaene.thuis: Name or service not known
server:Jan 11 16:27:59 mount[1229]: mount.nfs: Failed to resolve server mach1.hviaene.thuis: Name or service not known

Looking into it more.
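
Since every failure above is a name-resolution error, one way to test the timing theory after login is to wait until the server name resolves and only then mount. This is a sketch, not a proposed fix: wait_for_host is a hypothetical helper, and the RESOLVER variable is an assumption added so that the lookup command can be swapped out (it defaults to "getent hosts").

```shell
#!/bin/sh
# Hypothetical helper: return 0 as soon as the host name resolves,
# or 1 after the given number of attempts.
wait_for_host() {
    host=$1
    tries=${2:-30}     # number of attempts
    delay=${3:-1}      # seconds between attempts
    i=0
    while [ "$i" -lt "$tries" ]; do
        # Unquoted on purpose so "getent hosts" splits into command + argument.
        if ${RESOLVER:-getent hosts} "$host" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Usage (as root):
#   wait_for_host mach1.hviaene.thuis && mount -a -t nfs
```

If the mounts succeed this way, that would point at ordering between name resolution and the mount attempt, not at NFS itself.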
Comment 19 Dave Hodgins 2023-01-12 00:12:26 CET
Two questions: what are the contents of
/etc/sysconfig/network-scripts/ifcfg-wlp7s0
and how is the address of mach1.hviaene.thuis defined?
Comment 20 Herman Viaene 2023-01-12 09:19:14 CET
contents of /etc/sysconfig/network-scripts/ifcfg-wlp7s0

DEVICE=wlp7s0
BOOTPROTO=static
IPADDR=192.168.2.7
NETMASK=255.255.255.0
GATEWAY=192.168.2.15
ONBOOT=yes
METRIC=30
MII_NOT_SUPPORTED=no
USERCTL=yes
DNS1=192.168.2.1
DNS2=212.71.0.33
DOMAIN=hviaene.thuis
RESOLV_MODS=no
WIRELESS_MODE=Managed
WIRELESS_ESSID=via8ene9
WIRELESS_ENC_KEY=s:XXXXXXXXXX
WIRELESS_WPA_DRIVER=wext
WIRELESS_WPA_REASSOCIATE=no
KEY_MGMT=WPA-PSK
WPA_PSK=XXXXXXXXXXXX
IPV6INIT=yes
IPV6TO4INIT=no
ACCOUNTING=no

Part of your question on the name is given above. The DNS1 is my desktop PC that runs a DNS server so that all 4 machines on my LAN can resolve the FQDN's.
This setup has been in place since I-can't-remember-how-long and never gave problems. Do I really need that? Of course not, but I do that as a challenge. So, I will not give up on that setup unless ???????
Comment 21 Dave Hodgins 2023-01-13 08:31:04 CET
I'd like to try fixing the wireless error to see if that makes a difference ...
Jan 11 16:27:54 network[1136]: Error for wireless request "Set Encode" (8B2A) :
Jan 11 16:27:54 network[1136]:     SET failed on device wlp7s0 ; Invalid argument.

From what I can find, either the router is expecting wep, which requires a
text key that is either 5 or 13 characters (10 or 26 if using hex), or it's
using wpa but the password isn't within the range of 8 to 63 characters.

Double check that the router is configured for wpa-psk using aes (tkip is
no longer considered secure), and that the key length is in the right range.

I finally found an explanation for the Encode 8B2A error at
https://superuser.com/a/353818
Comment 22 Dave Hodgins 2023-01-13 09:19:11 CET
I also found https://discussions.apple.com/thread/2525083 that indicates
the key must be 'printable ASCII characters' (in the range of 32 to 126
decimal).
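
The two constraints mentioned so far (length and printable ASCII) can be checked mechanically. The following is a sketch assuming a WPA passphrase of 8 to 63 characters drawn from bytes 32 to 126; a raw 64-hex-digit key is not handled. The function name check_wpa_passphrase is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: check a WPA passphrase for length and character set.
check_wpa_passphrase() {
    key=$1
    len=${#key}
    if [ "$len" -lt 8 ] || [ "$len" -gt 63 ]; then
        echo "length $len is outside 8..63"
        return 1
    fi
    # LC_ALL=C makes the bracket range byte-based: space (32) through tilde (126).
    if printf '%s' "$key" | LC_ALL=C grep -q '[^ -~]'; then
        echo "contains characters outside printable ASCII"
        return 1
    fi
    echo "ok"
}

# Usage:
#   check_wpa_passphrase 'the-passphrase-from-ifcfg'
```

Passing the check only rules out these two causes; it does not show that the router and the ifcfg file agree on the key.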
Comment 23 Herman Viaene 2023-01-13 14:25:25 CET
I did notice that error, but I can't figure out why it's there. Once the boot is completed, the wifi connection is OK, so it must be set up somewhere later in the boot sequence.
As far as the router is concerned, it is a Fritzbox 7490, and the security setting is to WPA encryption selected, WPA mode WPA(CCMP) by default, other choices being WPA+WPA2 or WPA2+WPA3. Wikipedia says WPA(CCMP) is based on AES.
The network key is 8 characters, and that is definitely the one listed in ifcfg-wlp7s0. Guest access is not enabled.
Comment 24 Dave Hodgins 2023-01-13 16:04:09 CET
Does "file /etc/sysconfig/network-scripts/ifcfg-w*" show ASCII text,
UTF-8 Unicode text, or something else?
Comment 25 Dave Hodgins 2023-01-13 16:05:24 CET
My thinking is that the Encode 8B2A error may be impacting the timing of
other things.
Comment 26 Herman Viaene 2023-01-13 17:15:34 CET
Kwrite tells me UTF-8.
Comment 27 Dave Hodgins 2023-01-13 19:54:30 CET
kwrite is not helpful as ascii is a subset of utf-8, and it's just telling you
that it supports utf-8 characters such as ė. Ascii does not include accented
characters. The router may accept utf-8 and work with windows, and linux, but
will not work with apple, and may be causing the error, which may be what's
messing up the order.
Comment 28 Dave Hodgins 2023-01-13 22:06:24 CET
Just fyi, I've setup nfs sharing on two of my systems.

On x3.hodgins.homeip.net
$ grep nfs /etc/fstab
hodgins.homeip.net:/home/dave/bin /mnt/bin nfs wsize=8192,rsize=8192,nosuid,soft 0 0

On hodgins.homeip.net
x3.hodgins.homeip.net:/s3/Downloads /mnt/Downloads nfs rsize=8192,nosuid,wsize=8192,soft 0 0

They are both working, regardless of the boot order. Whichever boots first
will hang if I run something that tries to access the other, such as df, until
the other is up too, but it works once the second does start.

Both systems are m8. One x86_64, the other aarch64.

I'm running the server kernel flavor on both.
Comment 29 Dave Hodgins 2023-01-13 22:07:26 CET
I'm also using bind for my name server.
Comment 30 Dave Hodgins 2023-01-14 01:29:52 CET
(In reply to Herman Viaene from comment #26)
> Kwrite tells me UTF-8.

Further example that shows the difference ...
[dave@x3 tmp]$ echo 'asdf'>test
[dave@x3 tmp]$ file test
test: ASCII text
[dave@x3 tmp]$ echo 'ásdf'>test
[dave@x3 tmp]$ file test
test: UTF-8 Unicode text

According to the IEEE standards, the wireless passphrase is supposed to be
limited to printable ascii characters. While utf-8 characters outside of
the ascii range may work, it's technically a violation of the standard,
and may be what's causing the 8B2A encoding error.

Anything that causes a non-zero return code is likely going to change
the order systemd does things.
Comment 31 Herman Viaene 2023-01-14 09:20:50 CET
I wouldn't know how to check whether a file is pure ascii or whatever. But anyway, the passphrase I use consists of plain upper- and lower-case characters and numbers, just plain from the ASCII times, long ago.
Comment 32 Dave Hodgins 2023-01-14 20:48:51 CET
(In reply to Herman Viaene from comment #31)
> I wouldn't know how to check whether a file is pure ascii or whatever. But

Use the command "file"

# rpm -q -i file|grep -e ^So -e ^Su
Source RPM  : file-5.39-4.mga8.src.rpm
Summary     : A utility for determining file types
# urpme --test file
Removing the following package will break your system:
  basesystem-minimal-8-0.4.mga8.x86_64
   (due to missing less)
Comment 33 Dave Hodgins 2023-02-06 20:33:47 CET
Herman, what is the output of the command
"file /etc/sysconfig/network-scripts/ifcfg-wlp7s0" (run as root)?