Bug 20016

Summary: urpmi hangs indefinitely when called by harddrake during Live system boot
Product: Mageia Reporter: Martin Whitaker <mageia>
Component: RPM Packages Assignee: Mageia tools maintainers <mageiatools>
Status: RESOLVED FIXED QA Contact:
Severity: major    
Priority: release_blocker CC: marja11, thierry.vignaud
Version: Cauldron Keywords: PATCH
Target Milestone: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Source RPM: rpm-4.13.0-6.mga6.src.rpm CVE:
Status comment:
Attachments: Patch for rpm systemd-inhibit plugin that fixes this bug

Description Martin Whitaker 2016-12-23 18:54:11 CET
When booting a Live system, if harddrake chooses to use a proprietary graphics driver, it attempts to install the necessary packages to build and install that driver, using urpmi. This gets as far as running the first RPM transaction, then hangs indefinitely.

By patching the mandriva-everytime init script to run

  strace urpmi --auto --verbose --debug gcc &> /var/log/urpmi.log

and adding a timeout, I've captured information about where it hangs. In the journal we have

  Dec 22 19:29:02 localhost urpmi[706]: called with: --auto --verbose --debug gcc   
  Dec 22 19:29:09 localhost urpmi[706]: transaction on / (remove=0, install=0, upgrade=8)
  Dec 22 19:29:09 localhost [RPM][706]: Transaction ID 585c2985 started   

before eventually

  Dec 22 19:31:44 localhost systemd[1]: mandriva-everytime.service: Start operation timed out. Terminating.

The last few actions captured by strace before the hang are

  socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0) = 19
  connect(19, {sa_family=AF_UNIX, sun_path="/run/dbus/system_bus_socket"}, 29) = 0
  fcntl(19, F_GETFL)                      = 0x2 (flags O_RDWR)
  fcntl(19, F_SETFL, O_RDWR|O_NONBLOCK)   = 0
  geteuid()                               = 0
  getsockname(19, {sa_family=AF_UNIX}, [128->2]) = 0
  poll([{fd=19, events=POLLOUT}], 1, 0)   = 1 ([{fd=19, revents=POLLOUT}])
  sendto(19, "\0", 1, MSG_NOSIGNAL, NULL, 0) = 1
  sendto(19, "AUTH EXTERNAL 30\r\n", 18, MSG_NOSIGNAL, NULL, 0) = 18

I guess it's waiting for a response to that message.
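
For readers unfamiliar with the D-Bus handshake: in the EXTERNAL mechanism the client sends its user ID as a decimal ASCII string, hex-encoded. The "30" in the last sendto() is therefore the hex encoding of "0", i.e. root, which matches the geteuid() result above. A server that accepts the request normally answers with an OK line, and that reply never shows up in the trace. A minimal sketch in Python (the function names are hypothetical, chosen only for illustration):

```python
def external_auth_line(uid: int) -> bytes:
    """Build the AUTH EXTERNAL line a D-Bus client sends for a given uid."""
    # The uid is written as a decimal ASCII string, then hex-encoded.
    hex_uid = str(uid).encode("ascii").hex()
    return f"AUTH EXTERNAL {hex_uid}\r\n".encode("ascii")


def decode_external_uid(line: bytes) -> int:
    """Recover the uid from an AUTH EXTERNAL line."""
    # Third whitespace-separated token is the hex-encoded uid.
    hex_uid = line.decode("ascii").strip().split()[2]
    return int(bytes.fromhex(hex_uid).decode("ascii"))
```

Calling external_auth_line(0) reproduces the 18-byte message seen in the strace output.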

Setting as a release blocker, as this is the only way Live DVDs can use proprietary drivers, which may be necessary for some hardware.
Marja Van Waes 2016-12-23 20:45:18 CET

CC: (none) => marja11
Assignee: bugsquad => mageiatools

Comment 1 Martin Whitaker 2016-12-23 21:01:08 CET
A little further investigation indicates it is trying to contact the systemd-logind service to block reset/shutdown events. But systemd-logind has not been started at that point in the boot sequence.
Comment 2 Thierry Vignaud 2016-12-23 22:21:00 CET
That's not urpmi but polkit I think

CC: (none) => thierry.vignaud
Source RPM: (none) => polkit

Comment 3 Martin Whitaker 2016-12-23 23:50:29 CET
(In reply to Thierry Vignaud from comment #2)
> That's not urpmi but polkit I think

Could well be. I should have added that comment 1 came from running urpmi on a booted system, using dbus-monitor to identify the messages. This isn't something I'm familiar with, so I may be misinterpreting.

I see from systemd-analyze that polkit hasn't been started at the point urpmi hangs during boot.
Comment 4 Martin Whitaker 2016-12-29 18:33:31 CET
Created attachment 8822 [details]
Patch for rpm systemd-inhibit plugin that fixes this bug

The rpm systemd-inhibit plugin is responsible for this bug. Bug 16950 explains why it is seen in mga6 and not in mga5. I guess there are two ways we could try to fix it:

  - get systemd-logind to start earlier in the boot sequence
  - disable the systemd-inhibit plugin when systemd-logind is not running

The attached patch does the latter. I'd like a better way of detecting that systemd-logind is running, but this is simple and works. According to a comment in the systemd-logind source code, checking for the existence of /run/systemd/seats/ should be enough, but that turns out not to be true - systemd-tmpfiles creates that directory in advance.
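
The idea of the second option can be sketched as follows. This is not the attached patch (the real plugin is written in C, and its detection method is not reproduced here); it is only one possible heuristic, written in Python for brevity. The names process_running and should_inhibit are hypothetical, and the sketch assumes a Linux-style /proc tree where each process directory contains a comm file.

```python
import os


def process_running(name: str, proc_root: str = "/proc") -> bool:
    """Return True if some process under proc_root reports `name` in its comm file."""
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return False
    for entry in entries:
        if not entry.isdigit():
            continue  # only numeric directories describe processes
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if f.read().strip() == name:
                    return True
        except OSError:
            continue  # process exited in the meantime, or is unreadable
    return False


def should_inhibit(proc_root: str = "/proc") -> bool:
    """Only attempt to take an inhibitor lock when logind appears to be running."""
    return process_running("systemd-logind", proc_root)
```

Unlike the /run/systemd/seats/ check, this does not give a false positive when systemd-tmpfiles has created the directory in advance. It is still only a heuristic: the daemon may have been started without yet being ready to answer on the bus.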
Martin Whitaker 2016-12-29 18:35:03 CET

Keywords: (none) => PATCH
Source RPM: polkit => rpm-4.13.0-6.mga6.src.rpm

Comment 5 Thierry Vignaud 2016-12-29 22:02:09 CET
Thanks for the good work (as usual :-) )

Status: NEW => RESOLVED
Resolution: (none) => FIXED