Via Andy Miller (2007), an amusing metaphor for Linux memory overcommit. Originally posted by Andries Brouwer to the linux-kernel mailing list, 2004-09-24, in the thread titled “oom_pardon, aka don’t kill my xlock”:

An aircraft company discovered that it was cheaper to fly its planes with less fuel on board. The planes would be lighter and use less fuel and money was saved. On rare occasions however the amount of fuel was insufficient, and the plane would crash. This problem was solved by the engineers of the company by the development of a special OOF (out-of-fuel) mechanism. In emergency cases a passenger was selected and thrown out of the plane. (When necessary, the procedure was repeated.) A large body of theory was developed and many publications were devoted to the problem of properly selecting the victim to be ejected. Should the victim be chosen at random? Or should one choose the heaviest person? Or the oldest? Should passengers pay in order not to be ejected, so that the victim would be the poorest on board? And if for example the heaviest person was chosen, should there be a special exception in case that was the pilot? Should first class passengers be exempted? Now that the OOF mechanism existed, it would be activated every now and then, and eject passengers even when there was no fuel shortage. The engineers are still studying precisely how this malfunction is caused.

Twenty years later, as far as I know, the OOM killer is still going strong. In fact, if you don’t like the airline’s policy on what counts as an “emergency” (for example, that it might exhaust your swap partition too before killing any bad actor at all), you can hire your own hit man, in the form of the userspace daemon earlyoom.
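To give a rough idea of what such a userspace daemon checks, here is a minimal sketch in Python. It is not earlyoom's actual code. The function names and the 10% thresholds are my own assumptions for illustration; the only real interface used is the `MemTotal`, `MemAvailable`, `SwapTotal`, and `SwapFree` fields of `/proc/meminfo`. It reports whether both available RAM and free swap have dropped below a threshold, which is the point where a daemon would pick a victim instead of waiting for the kernel.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> value in kiB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def percent(part, total):
    # Treat a missing resource (e.g. no swap configured) as 0% free.
    return 100.0 * part / total if total > 0 else 0.0


def should_intervene(fields, mem_min_percent=10.0, swap_min_percent=10.0):
    """Return True when both available RAM and free swap are below their thresholds.

    The default thresholds are illustrative assumptions, not a recommendation.
    """
    mem_free = percent(fields.get("MemAvailable", 0), fields.get("MemTotal", 0))
    swap_free = percent(fields.get("SwapFree", 0), fields.get("SwapTotal", 0))
    return mem_free < mem_min_percent and swap_free < swap_min_percent


# Hypothetical sample data, used when /proc/meminfo is not readable.
SAMPLE = """\
MemTotal:       16000000 kB
MemAvailable:     800000 kB
SwapTotal:       4000000 kB
SwapFree:         200000 kB
"""

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE
    print(should_intervene(parse_meminfo(text)))
```

A real daemon would run this check in a loop and then choose and signal a process, which is the hard part the airline's engineers are still arguing about.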

Explanation of the OOM-Killer: Understanding Out of Memory Killer (OOM Killer) in Linux

  • Beej Jorgensen
    5 months ago

    I started using one of the userspace OOM killers a while ago and have been much happier. Instead of the system becoming unresponsive, suddenly Slack just dies. It’s great.

    • @ReversalHatchery@beehaw.org
      5 months ago

      Why is it, though, that the system just becomes unresponsive? That is always my experience too, but shouldn’t the kernel’s OOM killer just kill something?

      • @Mixel@feddit.org
        5 months ago

        Yes, that’s true. However, the default OOM killer tries its best to save the processes first, and that can take a while. Sometimes it took over 30 minutes for me until it killed the bad process, and then everything became responsive again.

      • Beej Jorgensen
        5 months ago

        I don’t know the details behind it, but it sure takes its sweet time figuring it out. I’ve let it sit 20 minutes before giving up.