Bug 20016 - urpmi hangs indefinitely when called by harddrake during Live system boot
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: RPM Packages
Version: Cauldron
Hardware: All Linux
Priority: release_blocker major
Target Milestone: ---
Assignee: Mageia tools maintainers
QA Contact:
URL:
Whiteboard:
Keywords: PATCH
Depends on:
Blocks:
 
Reported: 2016-12-23 18:54 CET by Martin Whitaker
Modified: 2016-12-29 22:02 CET

See Also:
Source RPM: rpm-4.13.0-6.mga6.src.rpm
CVE:
Status comment:


Attachments
Patch for rpm systemd-inhibit plugin that fixes this bug (429 bytes, text/plain)
2016-12-29 18:33 CET, Martin Whitaker

Description Martin Whitaker 2016-12-23 18:54:11 CET
When booting a Live system, if harddrake chooses to use a proprietary graphics driver, it attempts to install the necessary packages to build and install that driver, using urpmi. This gets as far as running the first RPM transaction, then hangs indefinitely.

By patching the mandriva-everytime init script to run

  strace urpmi --auto --verbose --debug gcc &> /var/log/urpmi.log

and adding a timeout, I've captured information about where it hangs. In the journal we have

  Dec 22 19:29:02 localhost urpmi[706]: called with: --auto --verbose --debug gcc   
  Dec 22 19:29:09 localhost urpmi[706]: transaction on / (remove=0, install=0, upgrade=8)
  Dec 22 19:29:09 localhost [RPM][706]: Transaction ID 585c2985 started   

before eventually

  Dec 22 19:31:44 localhost systemd[1]: mandriva-everytime.service: Start operation timed out. Terminating.

The last few actions captured by strace before the hang are

  socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0) = 19
  connect(19, {sa_family=AF_UNIX, sun_path="/run/dbus/system_bus_socket"}, 29) = 0
  fcntl(19, F_GETFL)                      = 0x2 (flags O_RDWR)
  fcntl(19, F_SETFL, O_RDWR|O_NONBLOCK)   = 0
  geteuid()                               = 0
  getsockname(19, {sa_family=AF_UNIX}, [128->2]) = 0
  poll([{fd=19, events=POLLOUT}], 1, 0)   = 1 ([{fd=19, revents=POLLOUT}])
  sendto(19, "\0", 1, MSG_NOSIGNAL, NULL, 0) = 1
  sendto(19, "AUTH EXTERNAL 30\r\n", 18, MSG_NOSIGNAL, NULL, 0) = 18

I guess it's waiting for a response to that message.
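
For reference, that last sendto() is the opening of the D-Bus EXTERNAL authentication handshake. The argument is the caller's uid written as a decimal ASCII string and then hex-encoded, so "30" is uid 0 (root), matching the geteuid() result in the trace. A minimal Python sketch of that encoding follows; the function name is made up for illustration and is not part of any D-Bus library:

```python
def auth_external_line(uid: int) -> bytes:
    """Build the D-Bus 'AUTH EXTERNAL' line for a numeric uid (illustrative helper)."""
    # The uid is sent as decimal ASCII text, hex-encoded byte by byte.
    hex_uid = str(uid).encode("ascii").hex()
    return b"AUTH EXTERNAL " + hex_uid.encode("ascii") + b"\r\n"


if __name__ == "__main__":
    line = auth_external_line(0)
    print(line, len(line))
```

The server is expected to answer this with an OK line; the trace ends before any reply is read.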

Setting as a release blocker, as this is the only way Live DVDs can use proprietary drivers, which may be necessary for some hardware.
Marja Van Waes 2016-12-23 20:45:18 CET

CC: (none) => marja11
Assignee: bugsquad => mageiatools

Comment 1 Martin Whitaker 2016-12-23 21:01:08 CET
A little further investigation indicates it is trying to contact the systemd-logind service to block reset/shutdown events. But systemd-logind has not been started at that point in the boot sequence.
Comment 2 Thierry Vignaud 2016-12-23 22:21:00 CET
That's not urpmi but polkit I think

CC: (none) => thierry.vignaud
Source RPM: (none) => polkit

Comment 3 Martin Whitaker 2016-12-23 23:50:29 CET
(In reply to Thierry Vignaud from comment #2)
> That's not urpmi but polkit I think

Could well be. I should have added that comment 1 came from running urpmi on a booted system, using dbus-monitor to identify the messages. This isn't something I'm familiar with, so I may be misinterpreting.

I see from systemd-analyze that polkit hasn't been started at the point urpmi hangs during boot.
Comment 4 Martin Whitaker 2016-12-29 18:33:31 CET
Created attachment 8822
Patch for rpm systemd-inhibit plugin that fixes this bug

The rpm systemd-inhibit plugin is responsible for this bug. Bug 16950 explains why it is seen in mga6 and not in mga5. I guess there are two ways we could try to fix it:

  - get systemd-logind to start earlier in the boot sequence
  - disable the systemd-inhibit plugin when systemd-logind is not running

The attached patch does the latter. I'd like a better way of detecting that systemd-logind is running, but this is simple and works. According to a comment in the systemd-logind source code, checking for the existence of /run/systemd/seats/ should be enough, but that turns out not to be true - systemd-tmpfiles creates that directory in advance.
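
To illustrate the kind of check being discussed (this is not the attached patch, whose contents are not reproduced here), one alternative to testing for a directory would be to look for a live process named systemd-logind under /proc. A rough Python sketch, where the function name and the proc_root parameter are hypothetical and exist only so the logic can be exercised against a fake tree:

```python
import os


def process_running(name: str, proc_root: str = "/proc") -> bool:
    """Return True if some process under proc_root reports `name` in its comm file."""
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return False  # proc_root missing or unreadable
    for entry in entries:
        if not entry.isdigit():  # only numeric entries are process IDs
            continue
        comm_path = os.path.join(proc_root, entry, "comm")
        try:
            with open(comm_path, encoding="utf-8", errors="replace") as f:
                if f.read().strip() == name:
                    return True
        except OSError:
            continue  # process exited (or is unreadable) while scanning
    return False


if __name__ == "__main__":
    print(process_running("systemd-logind"))
```

The kernel truncates comm to 15 characters, which "systemd-logind" fits within. A running process is still no guarantee that the service is ready to answer on D-Bus, so this only narrows the window rather than closing it.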
Martin Whitaker 2016-12-29 18:35:03 CET

Keywords: (none) => PATCH
Source RPM: polkit => rpm-4.13.0-6.mga6.src.rpm

Comment 5 Thierry Vignaud 2016-12-29 22:02:09 CET
Thanks for the good work (as usual :-) )

Status: NEW => RESOLVED
Resolution: (none) => FIXED

