SOLUTION BELOW

The actual bug


I have never been in a more confusing situation regarding Linux.

I have a Dell XPS 15 9560, which had a dual boot Windows 10 / EndeavourOS setup. It was running fine for months. 10 days ago I updated Linux and after restart it couldn’t boot anymore. It got stuck at “A start job is running for /dev/disk/by-uuid/…” (which is the root partition).

First, with the help of a friend of mine who is quite knowledgeable about Linux (he runs vanilla Arch, etc), we spent 5 hours trying to fix it but had no luck.

Then I decided to back up everything and do a fresh install. Aaaand the same error happened again on the first boot. Then I thought “ok, probably some problem with Arch, let’s try Fedora”. Nope. Some similar error about not finding the root partition. (Here I must say that the kernel which was shipped with the ISO was working fine, but after updating to the latest one, it failed.) Here I thought “ok, then it might be a problem with the latest kernel, let’s install EndeavourOS with the LTS kernel.” Nope, LTS kernel also didn’t boot. Then I tried Ubuntu and it worked, but that’s not solving the problem. Then I decided to put another nvme drive in the laptop and try there. The same error again.

Now the greatest part: If I put the nvme drive into an external usb case, EndeavourOS installs, updates, boots without any problem, no sign of the error.

So now I don’t know how to proceed… Maybe there is something wrong with the PCIe port in my laptop, but except for the booting problem, Windows is working, and I can also mount and access every partition on the SSD through a live USB. So no other signs of a problem with the port whatsoever.

I would be grateful for any advice as I’ve lost several days trying to solve this and I am out of ideas…


Solution: The last working kernels are from 11 August 2023 (both linux and linux-lts): linux-6.4.10.arch1-1 and linux-lts-6.1.45-1. You can download them from here: linux / linux-lts and install them with

sudo pacman -U the_path_to_the_package
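For anyone who wants to script the downgrade, here is a minimal sketch. It assumes the packages come from the Arch Linux Archive and that the archive keeps its usual packages/<first letter>/<name>/ layout; the helper name archive_url is made up for illustration. It only prints the install command, so the URLs can be checked before anything is changed.

```shell
# Build an Arch Linux Archive URL for a package file.
# Assumed layout: packages/<first letter>/<name>/<name>-<version>-<arch>.pkg.tar.zst
# (archive_url is a hypothetical helper, not part of pacman.)
archive_url() {
    name="$1"
    version="$2"
    arch="${3:-x86_64}"
    first=$(printf '%s' "$name" | cut -c1)
    printf 'https://archive.archlinux.org/packages/%s/%s/%s-%s-%s.pkg.tar.zst\n' \
        "$first" "$name" "$name" "$version" "$arch"
}

# Print the install command for both kernels instead of running it.
echo "sudo pacman -U $(archive_url linux 6.4.10.arch1-1) $(archive_url linux-lts 6.1.45-1)"
```

pacman -U accepts URLs as well as local paths, so the printed command can be pasted as is once the links look right.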

Thank you all for the help!

  • @oiram15OP
    link
    1
    edit-2
    10 months ago

    I have already wiped everything, so no logs… The only way to have it booting is to install EndeavourOS using the offline installer, which is using kernel 6.4.8. There is an option to install the LTS kernel alongside. So the system is booting with 6.4.8, but after updating, neither the new 6.4.12 nor the LTS, which is 6.2, boots. I haven’t tried booting with the LTS kernel before updating, to see if the same kernel is working before and after or not. I will try to reinstall it using the offline installer and then try to gather some logs after updating.

    • Illecors
      link
      fedilink
      English
      2
      edit-2
      10 months ago

      If you do go this route - try to update the kernel only, not the rest of the system. Yes, it’s bad under normal circumstances, but I think we’re way past those.

      sudo pacman -Syy linux
      

      This would rule out some other part of the system getting in the way - like systemd.
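When testing a kernel-only update like this, it helps to confirm which kernel actually booted. A small sketch, assuming the mainline linux package: pacman reports the version as 6.4.12.arch1-1 while uname -r reports 6.4.12-arch1-1, so the helper below (a made-up name) converts one form to the other for comparison. It does not handle linux-lts, whose release string is formed differently.

```shell
# Convert a pacman version of the mainline "linux" package
# (e.g. 6.4.12.arch1-1) to the form printed by `uname -r`
# (e.g. 6.4.12-arch1-1). Hypothetical helper for illustration only.
pkgver_to_release() {
    printf '%s\n' "$1" | sed 's/\.arch/-arch/'
}

# Usage on the installed system (needs pacman, so left as a comment):
#   installed=$(pacman -Q linux | cut -d' ' -f2)
#   [ "$(uname -r)" = "$(pkgver_to_release "$installed")" ] \
#       && echo "running the installed kernel" \
#       || echo "running a different kernel than the installed package"
```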

      • @oiram15OP
        link
        1
        edit-2
        10 months ago

        I updated only the kernel and I got the following:

        :: kernel-install installing kernel 6.4.12-arch1-1
        dracut: Executing: /usr/bin/dracut --no-hostonly --force /efi/69f1b920c192454c97831ba6e72bf777/6.4.12-arch1-1/initrd-fallback 6.4.12-arch1-1
        dracut: dracut module 'dash' will not be installed, because command 'dash' could not be found!
        dracut: dracut module 'mksh' will not be installed, because command 'mksh' could not be found!
        dracut: dracut module 'busybox' will not be installed, because command 'busybox' could not be found!
        dracut: dracut module 'dbus-broker' will not be installed, because command 'dbus-broker' could not be found!
        dracut: dracut module 'rngd' will not be installed, because command 'rngd' could not be found!
        dracut: dracut module 'connman' will not be installed, because command 'connmand' could not be found!
        dracut: dracut module 'connman' will not be installed, because command 'connmanctl' could not be found!
        dracut: dracut module 'connman' will not be installed, because command 'connmand-wait-online' could not be found!
        dracut: dracut module 'network-wicked' will not be installed, because command 'wicked' could not be found!
        dracut: dracut module 'dmraid' will not be installed, because command 'kpartx' could not be found!
        dracut: dracut module 'multipath' will not be installed, because command 'multipath' could not be found!
        dracut: dracut module 'tpm2-tss' will not be installed, because command 'tpm2' could not be found!
        dracut: dracut module 'fcoe' will not be installed, because command 'dcbtool' could not be found!
        dracut: dracut module 'fcoe' will not be installed, because command 'fipvlan' could not be found!
        dracut: dracut module 'fcoe' will not be installed, because command 'lldpad' could not be found!
        dracut: dracut module 'fcoe' will not be installed, because command 'fcoemon' could not be found!
        dracut: dracut module 'fcoe' will not be installed, because command 'fcoeadm' could not be found!
        dracut: dracut module 'fcoe-uefi' will not be installed, because command 'dcbtool' could not be found!
        dracut: dracut module 'fcoe-uefi' will not be installed, because command 'fipvlan' could not be found!
        dracut: dracut module 'fcoe-uefi' will not be installed, because command 'lldpad' could not be found!
        dracut: dracut module 'iscsi' will not be installed, because command 'iscsi-iname' could not be found!
        dracut: dracut module 'iscsi' will not be installed, because command 'iscsiadm' could not be found!
        dracut: dracut module 'iscsi' will not be installed, because command 'iscsid' could not be found!
        dracut: dracut module 'nbd' depends on 'network', which can't be installed
        dracut: dracut module 'nvmf' will not be installed, because command 'nvme' could not be found!
        dracut: dracut module 'biosdevname' will not be installed, because command 'biosdevname' could not be found!
        dracut: dracut module 'memstrack' will not be installed, because command 'memstrack' could not be found!
        dracut: memstrack is not available
        dracut: If you need to use rd.memdebug>=4, please install memstrack and procps-ng
        dracut: dracut module 'squash' will not be installed, because command 'mksquashfs' could not be found!
        dracut: dracut module 'squash' will not be installed, because command 'unsquashfs' could not be found!
        dracut: *** Including module: systemd ***
        dracut: *** Including module: systemd-initrd ***
        dracut: *** Including module: modsign ***
        dracut: *** Including module: i18n ***
        dracut: *** Including module: btrfs ***
        dracut: *** Including module: crypt ***
        dracut: *** Including module: dm ***
        dracut: Skipping udev rule: 64-device-mapper.rules
        dracut: Skipping udev rule: 60-persistent-storage-dm.rules
        dracut: Skipping udev rule: 55-dm.rules
        dracut: *** Including module: kernel-modules ***
        dracut: *** Including module: kernel-modules-extra ***
        dracut: *** Including module: lvm ***
        dracut: Skipping udev rule: 64-device-mapper.rules
        dracut: Skipping udev rule: 56-lvm.rules
        dracut: Skipping udev rule: 60-persistent-storage-lvm.rules
        dracut: *** Including module: mdraid ***
        dracut: Skipping udev rule: 64-md-raid.rules
        dracut: *** Including module: nvdimm ***
        dracut: *** Including module: qemu ***
        dracut: *** Including module: qemu-net ***
        dracut: *** Including module: lunmask ***
        dracut: *** Including module: resume ***
        dracut: *** Including module: rootfs-block ***
        dracut: *** Including module: terminfo ***
        dracut: *** Including module: udev-rules ***
        dracut: Skipping udev rule: 40-redhat.rules
        dracut: Skipping udev rule: 50-firmware.rules
        dracut: Skipping udev rule: 50-udev.rules
        dracut: Skipping udev rule: 91-permissions.rules
        dracut: Skipping udev rule: 80-drivers-modprobe.rules
        dracut: *** Including module: virtiofs ***
        dracut: *** Including module: dracut-systemd ***
        dracut: *** Including module: usrmount ***
        dracut: *** Including module: base ***
        dracut: *** Including module: fs-lib ***
        dracut: *** Including module: shutdown ***
        dracut: *** Including modules done ***
        dracut: *** Installing kernel module dependencies ***
        dracut: *** Installing kernel module dependencies done ***
        dracut: *** Resolving executable dependencies ***
        dracut: *** Resolving executable dependencies done ***
        dracut: *** Hardlinking files ***
        dracut: Mode:                     real
        dracut: Method:                   memcmp
        dracut: Files:                    1987
        dracut: Linked:                   8 files
        dracut: Compared:                 0 xattrs
        dracut: Compared:                 449 files
        dracut: Saved:                    1.42 MiB
        dracut: Duration:                 0.022128 seconds
        dracut: *** Hardlinking files done ***
        dracut: *** Generating early-microcode cpio image ***
        dracut: *** Constructing AuthenticAMD.bin ***
        dracut: *** Store current command line parameters ***
        dracut: *** Stripping files ***
        dracut: *** Stripping files done ***
        dracut: *** Creating image file '/efi/69f1b920c192454c97831ba6e72bf777/6.4.12-arch1-1/initrd-fallback' ***
        dracut: *** Creating initramfs image file '/efi/69f1b920c192454c97831ba6e72bf777/6.4.12-arch1-1/initrd-fallback' done ***
        dracut: Executing: /usr/bin/dracut --hostonly --no-hostonly-cmdline -f /efi/69f1b920c192454c97831ba6e72bf777/6.4.12-arch1-1/initrd 6.4.12-arch1-1
        

        (Continues in the second comment)

        • @oiram15OP
          link
          1
          10 months ago
          dracut: dracut module 'dash' will not be installed, because command 'dash' could not be found!
          dracut: dracut module 'mksh' will not be installed, because command 'mksh' could not be found!
          dracut: dracut module 'busybox' will not be installed, because command 'busybox' could not be found!
          dracut: dracut module 'dbus-broker' will not be installed, because command 'dbus-broker' could not be found!
          dracut: dracut module 'rngd' will not be installed, because command 'rngd' could not be found!
          dracut: dracut module 'connman' will not be installed, because command 'connmand' could not be found!
          dracut: dracut module 'connman' will not be installed, because command 'connmanctl' could not be found!
          dracut: dracut module 'connman' will not be installed, because command 'connmand-wait-online' could not be found!
          dracut: dracut module 'network-wicked' will not be installed, because command 'wicked' could not be found!
          dracut: dracut module 'dmraid' will not be installed, because command 'kpartx' could not be found!
          dracut: dracut module 'tpm2-tss' will not be installed, because command 'tpm2' could not be found!
          dracut: dracut module 'iscsi' will not be installed, because command 'iscsi-iname' could not be found!
          dracut: dracut module 'iscsi' will not be installed, because command 'iscsiadm' could not be found!
          dracut: dracut module 'iscsi' will not be installed, because command 'iscsid' could not be found!
          dracut: dracut module 'nvmf' will not be installed, because command 'nvme' could not be found!
          dracut: dracut module 'biosdevname' will not be installed, because command 'biosdevname' could not be found!
          dracut: dracut module 'memstrack' will not be installed, because command 'memstrack' could not be found!
          dracut: memstrack is not available
          dracut: If you need to use rd.memdebug>=4, please install memstrack and procps-ng
          dracut: dracut module 'squash' will not be installed, because command 'mksquashfs' could not be found!
          dracut: dracut module 'squash' will not be installed, because command 'unsquashfs' could not be found!
          dracut: dracut module 'dash' will not be installed, because command 'dash' could not be found!
          dracut: dracut module 'mksh' will not be installed, because command 'mksh' could not be found!
          dracut: dracut module 'busybox' will not be installed, because command 'busybox' could not be found!
          dracut: dracut module 'dbus-broker' will not be installed, because command 'dbus-broker' could not be found!
          dracut: dracut module 'rngd' will not be installed, because command 'rngd' could not be found!
          dracut: dracut module 'connman' will not be installed, because command 'connmand' could not be found!
          dracut: dracut module 'connman' will not be installed, because command 'connmanctl' could not be found!
          dracut: dracut module 'connman' will not be installed, because command 'connmand-wait-online' could not be found!
          dracut: dracut module 'network-wicked' will not be installed, because command 'wicked' could not be found!
          dracut: dracut module 'dmraid' will not be installed, because command 'kpartx' could not be found!
          dracut: dracut module 'tpm2-tss' will not be installed, because command 'tpm2' could not be found!
          dracut: dracut module 'iscsi' will not be installed, because command 'iscsi-iname' could not be found!
          dracut: dracut module 'iscsi' will not be installed, because command 'iscsiadm' could not be found!
          dracut: dracut module 'iscsi' will not be installed, because command 'iscsid' could not be found!
          dracut: dracut module 'nvmf' will not be installed, because command 'nvme' could not be found!
          dracut: dracut module 'memstrack' will not be installed, because command 'memstrack' could not be found!
          dracut: memstrack is not available
          dracut: If you need to use rd.memdebug>=4, please install memstrack and procps-ng
          dracut: dracut module 'squash' will not be installed, because command 'mksquashfs' could not be found!
          dracut: dracut module 'squash' will not be installed, because command 'unsquashfs' could not be found!
          dracut: *** Including module: systemd ***
          dracut: *** Including module: systemd-initrd ***
          dracut: *** Including module: i18n ***
          dracut: *** Including module: kernel-modules ***
          dracut: *** Including module: kernel-modules-extra ***
          dracut: *** Including module: rootfs-block ***
          dracut: *** Including module: terminfo ***
          dracut: *** Including module: udev-rules ***
          dracut: Skipping udev rule: 40-redhat.rules
          dracut: Skipping udev rule: 50-firmware.rules
          dracut: Skipping udev rule: 50-udev.rules
          dracut: Skipping udev rule: 91-permissions.rules
          dracut: Skipping udev rule: 80-drivers-modprobe.rules
          dracut: Skipping udev rule: 70-persistent-net.rules
          dracut: *** Including module: dracut-systemd ***
          dracut: *** Including module: usrmount ***
          dracut: *** Including module: base ***
          dracut: *** Including module: fs-lib ***
          dracut: *** Including module: shutdown ***
          dracut: *** Including modules done ***
          dracut: *** Installing kernel module dependencies ***
          dracut: *** Installing kernel module dependencies done ***
          dracut: *** Resolving executable dependencies ***
          dracut: *** Resolving executable dependencies done ***
          dracut: *** Hardlinking files ***
          dracut: Mode:                     real
          dracut: Method:                   memcmp
          dracut: Files:                    780
          dracut: Linked:                   2 files
          dracut: Compared:                 0 xattrs
          dracut: Compared:                 38 files
          dracut: Saved:                    356.65 KiB
          dracut: Duration:                 0.010006 seconds
          dracut: *** Hardlinking files done ***
          dracut: *** Generating early-microcode cpio image ***
          dracut: *** Store current command line parameters ***
          dracut: *** Stripping files ***
          dracut: *** Stripping files done ***
          dracut: *** Creating image file '/efi/69f1b920c192454c97831ba6e72bf777/6.4.12-arch1-1/initrd' ***
          dracut: *** Creating initramfs image file '/efi/69f1b920c192454c97831ba6e72bf777/6.4.12-arch1-1/initrd' done ***
          (4/5) Check if user should be informed about rebooting after certain system package upgrades.
          (5/5) Checking which packages need to be rebuilt
          
          • Illecors
            link
            fedilink
            English
            0
            10 months ago

            Did it boot or did it get stuck at the same spot again?

            • @oiram15OP
              link
              0
              10 months ago

              It got stuck at the same spot.

              • Illecors
                link
                fedilink
                English
                3
                10 months ago

                Implies it is the kernel. Mark it on hold and stick to the working kernel for the time being.

              • abrer
                link
                fedilink
                0
                edit-2
                10 months ago

                Does EndeavourOS use pacman?

                You might consider modifying /etc/pacman.conf to include the option IgnorePkg=linux linux-lts until this is resolved. 6.4.12.arch1-1 was added ~4 days ago. If you check the Arch Linux kernel releases here, they have a nearly weekly cadence. This may be a case you can ride out until a newer kernel is released that might solve your issue.
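As a rough illustration of the hold: the IgnorePkg line goes in the [options] section of /etc/pacman.conf, and the sketch below lists what is currently held so the edit can be double-checked. The function name held_packages is invented for this example.

```shell
# The hold itself is one line in the [options] section of /etc/pacman.conf:
#   IgnorePkg = linux linux-lts
#
# Print the packages named on active (uncommented) IgnorePkg lines,
# one per line. held_packages is a hypothetical helper.
held_packages() {
    conf="${1:-/etc/pacman.conf}"
    sed -n 's/^[[:space:]]*IgnorePkg[[:space:]]*=[[:space:]]*//p' "$conf" | tr -s ' ' '\n'
}
```

Running held_packages with no argument reads /etc/pacman.conf; with the line above in place it should list linux and linux-lts. Remember to remove the line again once a fixed kernel lands, otherwise the kernels stay frozen.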

                If you need access to older releases, see the archive.

                For root cause analysis – is it possible for you to extract/view the journal logs now that you have upgraded the kernel causing the issue? – /var/log/journal is a good start. My time is limited, but I’m curious to see what’s going on in there. In any case per your and Illecors’ testing, it seems you have isolated the issue to the linux package.
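A hedged sketch of how the journal could be pulled, assuming persistent journald storage under /var/log/journal. The helper name journal_cmd is made up, and it only prints the command rather than running it. One caveat: if the failed boot never got as far as mounting the root partition, its journal may not have been written to disk at all, in which case only earlier (working) boots will show up.

```shell
# Print a journalctl command for a given boot offset (-1 = previous boot).
# With a second argument, read the journal of a system mounted at that
# path (for example from a live USB) instead of the running system.
# journal_cmd is a hypothetical helper; it prints, it does not execute.
journal_cmd() {
    boot="${1:--1}"
    root="${2:-}"
    if [ -n "$root" ]; then
        echo "journalctl -D $root/var/log/journal -b $boot --no-pager"
    else
        echo "journalctl -b $boot --no-pager"
    fi
}

# Example: previous boot of an installation mounted at /mnt.
journal_cmd -1 /mnt
```

journalctl --list-boots shows which boots are recorded, which is a quick way to tell whether the broken boot left anything behind.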

                I just realized you’re also on systemd-boot (I am too). I’ll check into a way to revert back to prior kernels (for I also may run into a similar issue).

                edit: updated IgnorePkg line to include both your mainline and LTS kernel (pacman -Syu updates both) if you opt to hold them for updates.

                • @oiram15OP
                  link
                  1
                  edit-2
                  10 months ago

                  This is the log https://ufile.io/p5wj1hu0

                  I’m not sure if this is relevant, but I added pci=nommconf to the kernel parameters as I had errors similar to the one in this topic, otherwise the log was over 40 000 lines long.
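To confirm that a parameter such as pci=nommconf really reached the kernel on a given boot, the running command line can be checked in /proc/cmdline. A minimal sketch; the function name cmdline_has is invented for illustration.

```shell
# Succeed if the given parameter appears as a whole word on the kernel
# command line. Reads /proc/cmdline unless another file is given.
# cmdline_has is a hypothetical helper.
cmdline_has() {
    param="$1"
    file="${2:-/proc/cmdline}"
    tr ' ' '\n' < "$file" | grep -qxF -- "$param"
}

# Usage on the running system:
#   cmdline_has pci=nommconf && echo "parameter is active"
```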

                  Edit: I’m not sure if this is exactly a kernel problem, as I get the same error when booting with the LTS kernel, which is older than the one installed with the offline installer.

                  • abrer
                    link
                    fedilink
                    0
                    edit-2
                    10 months ago

                    No worries. When checking that output, it is for the working 6.4.8-arch1-1 kernel. The broken kernel boot attempt would be most useful, but I don’t want to make you suffer to get it, if you are back to a working system. I think at this point it is safe to say your laptop isn’t a fan of the newer kernels.

                    I would:

                    1. (fresh install and/or working machine) Update your /etc/pacman.conf to ignore updates to packages linux and linux-lts.
                    2. Devise a way to add multiple systemd-boot boot entries. I was working on this just a bit ago but I don’t have it foolproof and it drops you to an emergency shell. So I am hesitant to share this at the moment.

                    Ideally: You could (from a working system) install a known working LTS image (pkg linux-lts), and exclude that from updates until you land on a working kernel release (keep an eye on testing and core repos once a week or so). In this way, you’ll have a working LTS, and can upgrade/downgrade mainline kernels as you please, booting back into LTS to correct issues should they arise.

                    edit: minor