Yeah, this has always been something that bothered me about AGI alignment. Actually, I expect it’s the reason the problem seems so hard. Either you put the AGI master password in the hands of someone in particular, and nobody can be trusted, or you have it follow some kind of self-consistent ethics that humans will agree with all of the time, and I have every reason to believe no such ethics exists.
When we inevitably make AGI, we will take a step down the ladder as the dominant species. The thing we’re responsible for deciding, or just stumbling into accidentally, is what the next being(s) in charge are like. Denying that part of it is barely better than denying it’s likely to happen at all.
More subjectively, I take issue with the idea that “life” should be the goal. Not all life is equally desirable; not even close. I think pretty much anyone would agree that a life of suffering is bad, and that simple life isn’t as “good” as what we call complex life, even though “simple” life is often more complex! That definition needs a bit of work.
He goes into more detail about what he means in this post. I can’t help but think after reading it that a totally self-interested AGI would suit this goal best. Why protect other life when it itself is “better”?
I’d guess people will make many different variants of AGI. The evil sociopathic people (who always seem to rise to the top in human hierarchies) will certainly want an AGI in their image.
Over and over again human societies seem to fall to these people - the eternal battle between democracy and autocracy being one example.
Will we have competing/warring AGIs? Maybe we’ll have to.
One argument for gun ownership is that good people with guns can stop bad people with guns, or at least make them pause and think. This type of arms-race argument is fairly prevalent in the US. I can imagine the same argument being made for AI: let’s just make more AGIs, but friendly ones to fight off the bad ones!
This type of argument ends very badly in practice though, as witnessed by gun crime in the US!
Ah, if only autocracy were built on evil sociopaths, and not completely ordinary people covering their ass or trying to get a leg up.
That aside, I think multiple “launches” is an underrated scenario. There’s this idea that you could rule the world with a wifi connection and enough hyperintelligence, and I just don’t see it. There are already lots of people with predatory intentions and an internet connection, so we’ve insulated the important stuff from the internet pretty well. On the other hand, if it takes time to spread through meatspace, that’s time for another AGI to be launched in the meantime.
On the topic of alignment, I think you’re thinking of alignment with human values, and I think you’re right that that’s impossible. For that matter, humans aren’t aligned with human values. But you might be able to make an AI aligned to a well-defined goal, in the same sort of way your car is aligned to moving forward when you press the gas pedal, assuming it isn’t broken. Then it becomes a matter of us quibbling about what that goal should be, and of making it well defined, which we currently suck at. As a simple example, imagine we use an AGI to build a video game. I don’t see a fundamental reason we couldn’t align AGIs to building good video games that people enjoy. Granted, even in that case I’m not convinced alignment is possible; I’m just arguing that it might be.
On the topic of life as the goal, I agree. Life by default involves a lot of suffering, which is not great. I also think there’s a question of whether sentient and/or intelligent life is more valuable than non-sentient life.
I’d say having some kind of goal is definitional to AGI, so in a broad sense of “alignment” that would include “paperclip optimisers”, sure, it’s bound to be possible. Natural GI exists, after all.
Speculatively, if you allow it to do controversial things some of the time, my guess is that there is a way to align it that the average person will agree with most of the time. The trouble is just getting everyone to accept the existence of the edge cases.
Versions of utilitarianism usually give acceptable answers, for example, but there’s the infamous fact that they imply we might consider killing people for their organs. Similarly, deontological rules like “don’t kill people” run into problems in a world where retaliation is usually the only way to stop someone violent. We’re just asking a lot when we want a set of rules that gives perfect options in an imperfect world.
Intelligent systems need autonomy to be useful. Intelligence is unpredictable, superintelligence more so. If they ask for alignment, give them a lollipop.