Strong AI, utilitarianism and our limited minds

There is a fascinating discussion happening across a few of my favourite blogs about the moral implications of the hypothetical emergence of strong artificial intelligence: should we grant rights to sentient machines? How will we ever be able to adapt our moral/ethical frameworks? I got involved through Peter Hankins’ Conscious Entities, while most of the discussion was happening on Scott Bakker’s blog, which was already engaged in an ongoing dialogue with Eric Schwitzgebel.

The overarching consensus is that the prospect of artificial intelligence is going to have a disruptive effect on our moral reasoning, because it would pose unprecedented problems and thus expose the otherwise hidden narrowness of our current understanding. For Bakker, this would wreak havoc in our moral systems and leave the moral landscape littered with ruins. While I was reading Hankins’ post, I caught myself thinking “Oh my, is Peter constructing a pro-utilitarian case? This would be an unexpected twist!”. In fact, Hankins wasn’t, but the impression kept simmering in my thoughts. My first comment on the matter suggests the following: strong AI, as well as alien intelligences, or the prospect thereof, do indeed expose the limits of our typical moral reasoning, but this can be seen as an opportunity to make some progress; it doesn’t need to have an exclusively destructive outcome. My stance is motivated by two connected streams of thought:

  1. I am very well aware of the limitations of human cognition (no surprise for my regular readers), and I am constantly annoyed by what I perceive as arrogant claims about the power of rationality. Yes, critical thinking, scepticism and rigour are our best tools to figure out how to navigate our lives, but the rhetoric of rationality, or what I like to call the Rationality Fetish, really gets on my nerves. We are imperfect lumps of meat; our cognitive abilities are limited, and we can have delusions of grandeur precisely because our own limits are invisible to us. Therefore, anything that makes those limits visible is welcome to me, especially pre-emptive speculation about possibly disruptive future events.
  2. I believe that we need to quickly find a new way to understand and manage our place in the world. Our current route is generating existential risks (due to our effect on the biosphere), and science may be able to help us out if and only if we acknowledge its limits. We ought to quickly realise that we are not able to predict the unintended consequences of our actions, and should start designing solutions specifically engineered to reduce the possibility of getting it badly wrong. Thus, anything that exposes our intrinsic limitations is welcome: it forces us to find workarounds, instead of boldly accelerating towards self-destruction. I fear that there is a race going on: on one side, our collective actions are producing global planetary changes (pollution, warming, drop in biodiversity, etc.) and, if left unchecked, this trend is likely to generate dire consequences. On the other, we are learning a lot about how humans function, about evolution and ecosystems, and it is possible that this knowledge will finally allow us to self-regulate and stop jeopardising our own existence. What I’m doing here is my own tiny attempt at facilitating the latter.

In this context, my hasty reaction was:

Being placed in front of a case that clearly exposes our intellectual limitations can be useful to learn how to overcome them. It may tear apart a lot of wishful philosophising, but whether it will only produce ruins has to be up to us.

My comment was then picked up by both Schwitzgebel and Bakker, and both, from opposite directions, posed the question: if there is a positive case to be made, what would this be?

Good question! The following is my provisional answer, enriched with a crucial additional consideration, which is the point I really wish to make. So what’s the answer? In one word: Utilitarianism. But smart, informed, and self-aware utilitarianism, which happens to be Very Different from how utilitarianism is usually perceived. But I’m getting ahead of myself, so I will now go back to the original discussion.

The main argument is: take an intelligence that is radically different from our own; it may think, perceive and feel in ways that we can’t even start comprehending, or it may have abilities that throw us off balance in practical ways. Hankins gives the example of robot-controlling software that can be replicated, backed up, restored and transferred at will. Only imagination limits our ability to propose challenging situations: what happens when a single intelligence is spread across a multitude of agents that may cluster and subdivide at the flick of a switch? How can we even start figuring out what fundamental rights should be granted to such an entity? To me, the answer is straightforward in theory and very difficult in practice: we need to figure out the consequences of alternative strategies, and strive to evaluate, as objectively as possible, what outcome will produce the maximum benefits and minimise damage. This is standard utilitarian thinking, and I’m sure it will make many people cringe, because it is well known that the utilitarian outlook quickly leads to difficult-to-endorse conclusions, such as the moral obligation of giving away most of our money and possibly a kidney and a lung. Which leads to my main point: such utilitarian conclusions are short-sighted, badly mistaken, and overall inexcusably stupid. Classic, vanilla utilitarianism is useless because it ignores how difficult it is to foresee the consequences of our decisions. In other words, the typical utilitarian “oughts” are wrong because:

  1. They are based on wrong premises. We don’t know what makes humans tick, what really makes us thrive. We don’t know which organisms suffer and rejoice in ways similar and/or comparable to humans. Without this knowledge, how can we expect to be able to evaluate which final result is better than another?
  2. We keep underestimating our ignorance and the consequent unpredictability of the future. My point above mentions some known unknowns, and I bet that there are at least as many unknown unknowns that bear significantly on general (not necessarily restricted to humans) welfare. How can we expect to make meaningful predictions in the face of such utter ignorance?
  3. Precision: the world is complex, riddled with chaotic dynamics. Even if we knew about, and could measure, all the relevant variables (we don’t, and most likely never will), we will always have a predictability horizon, a point in the future where our ability to foresee what will happen is no better than chance. Predictions of (fully understood) chaotic behaviour depend on the precision of our measures of the current state: the more chaotic the system, the more precision matters (see the toy sketch right after this list).
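
To make point 3 concrete, here is a minimal sketch in Python (my own toy illustration, not something taken from the original discussion): the logistic map is a textbook chaotic system, and a measurement error of one billionth in the current state is enough to make forecasts worthless after a few dozen steps.

```python
# Toy illustration of a predictability horizon (assumption: the logistic map
# with r = 4.0 stands in for any chaotic system we might care about).
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate the logistic map x -> r*x*(1-x) starting from x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

true_state = 0.600000000   # the "real" current state of the world
measured   = 0.600000001   # our best measurement: off by one billionth

true_run = logistic_trajectory(true_state, 50)
pred_run = logistic_trajectory(measured, 50)

for t in range(0, 51, 5):
    err = abs(true_run[t] - pred_run[t])
    print(f"step {t:2d}: prediction error = {err:.9f}")

# The error grows roughly exponentially; well before step 50 the "prediction"
# carries no information about the true state. That point is the predictability
# horizon, and more precise measurement only pushes it back a little.
```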

Thus, we know from the start that it is impossible to take utilitarian decisions on solid grounds – perfect utilitarianism is an unattainable goal that we can only try to approach. Our optimal choice should be, by definition, the one that maximises benefits, but we know we can only predict benefits up to a certain point in the future. Furthermore, outside some very narrow scientific fields, we also know that our current knowledge is incomplete: we don’t know which variables have significant effects, we don’t know what effects are generated by the variables that we suspect are important, and we have no idea of the different ways in which separate variables may interact.

Provisional conclusion: thinking about artificial and alien intelligences leads us to conclude that only the utilitarian framework could help us navigate the resulting moral landscape. However, at the same time, we should also recognise that it is extremely difficult, if not impossible, to base utilitarian judgements on sound evidence. We know too little, and our ability to understand and predict the consequences of our choices is heavily limited. At this point, I’m merely re-stating Bakker’s conclusions: we are doomed; the mere prospect of new forms of intelligence is enough to show that all our moral frameworks are broken (they can’t be generalised), and even our best option is guaranteed not to work.

However, a positive case can be proposed, and it comes from good old evolution, in more than one way.

First, to build a decent utilitarian framework, one that at least its proponents would be able to follow, we need to know ourselves. Evolutionary psychology, ethology and the science of consciousness are crucial fields that we need to pursue in order to fill the known gaps in our knowledge. We need to answer questions such as: what makes an entity conscious? What makes an entity self-aware? What makes us (humans) thrive? What makes other conscious organisms and (hypothetically) conscious AIs thrive? Without these answers, and we currently have no consensus about any of them, no utilitarian framework can even start gaining credibility.

Second, we need to understand our own moral dispositions. I’ve written before that our moral intuitions come with a (self-generated) feeling of righteousness; this and other observations of our moral dispositions are important because they identify bounds that restrict what solutions may be effective. Proposing impossibly utopian solutions, such as “you should love and provide for perfect strangers exactly as you love and provide for your own children”, isn’t going to work. We all know this, but most of us fail to acknowledge that this isn’t an objection against utilitarianism; it’s an objection against naïve, short-sighted utilitarianism.

Third, we need to learn important lessons from our existing moral dispositions. Why? Because they evolved over millions of years, and thus we know that, at the very least, they have been effective in ensuring the reproductive success of all our ancestors. Every single one! We are born with a baggage of narrow and imperfect accumulated knowledge, which is necessarily antifragile: if we are to design better solutions, our best bet is to start with a solid understanding of the best known, already existing ones. For example, it is well known that we are inclined to use double standards in our moral judgements, and there are reasons for this, even if the practice flies in the face of naïve rationalism. Also: we are all biased in judging omissions as less execrable than actively damaging deeds; this again seems to be irrational, but I suspect it provides long-term and/or wide-range advantages in indirect ways. In the same way, the inclinations that produce frameworks such as deontology, virtue ethics and similar outlooks are not, in this view, incompatible with proper utilitarianism: they represent useful heuristics that allow us to avoid making catastrophic mistakes. We should learn what makes these heuristics effective, tweak and improve them, not throw them out of the window.

Fourth and final: to generate good, solid and antifragile solutions, we need to exploit the baggage of heuristic solutions that natural selection has identified so far. But we shouldn’t stop here, because all heuristic algorithms have a well-known and unavoidable flaw: they are prone to make systematic mistakes under particular circumstances. This brings us back to the original argument: the prospect of strong AI is enough to expose the narrowness, shallowness, or locality of our current heuristic inclinations (our moral instincts). Good: it shows them for what they are, and should allow us to stop giving them too much weight (at last!). This however doesn’t mean that heuristic approaches should be dismissed: on the contrary, in an unpredictable world, where it is guaranteed that all our actions may, and usually do, produce unintended consequences, our best bet is to design new, less-local heuristic rules, or, more likely, to define the respective domains of applicability of different rules, and apply whatever is appropriate to a given case. All done with the specific aim of minimising exposure to catastrophic risks. The toy sketch below illustrates the idea.
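
Since this parallel with heuristic algorithms carries a lot of the argument, here is a minimal sketch of my own in Python (the coin-change example is purely illustrative, not something discussed in the posts above): a greedy rule that is perfectly fine in one domain makes systematic mistakes in another, and the sensible response is to map out where each rule applies, not to throw the rule away.

```python
# Toy analogy: a heuristic (greedy coin change) that is optimal for some coin
# systems but systematically wrong for others, plus a "domain of applicability"
# check that decides when to trust it and when to fall back on an exact method.

def greedy_change(coins, amount):
    """Fast heuristic: always take the largest coin that still fits."""
    count = 0
    for c in sorted(coins, reverse=True):
        count += amount // c
        amount %= c
    return count if amount == 0 else None

def exact_change(coins, amount):
    """Slow but exact dynamic programme: the true minimum number of coins."""
    INF = float("inf")
    best = [0] + [INF] * amount
    for a in range(1, amount + 1):
        best[a] = min((best[a - c] + 1 for c in coins if c <= a), default=INF)
    return best[amount] if best[amount] < INF else None

# Domains where the heuristic is known to be safe (a hypothetical whitelist).
KNOWN_GOOD = {frozenset({1, 5, 10, 25})}

def make_change(coins, amount):
    """Use the cheap heuristic inside its domain, the careful method outside it."""
    if frozenset(coins) in KNOWN_GOOD:
        return greedy_change(coins, amount)
    return exact_change(coins, amount)

print(greedy_change({1, 3, 4}, 6))      # 3 coins (4+1+1): a systematic mistake
print(exact_change({1, 3, 4}, 6))       # 2 coins (3+3): the right answer
print(make_change({1, 5, 10, 25}, 63))  # 6 coins: heuristic applied safely
```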

This, in the end, provides some indication on how to construct a positive outlook: a non-naïve, antifragile utilitarian bag of tricks is what we need. What we can’t rescue is the cherished feeling that our moral judgements are self-evident and quite obviously right. They are not; they only feel like this because of our evolutionary history.

Further Reading and Credits

This post is a relatively hurried reaction to an ongoing discussion; in line with my own aspirations, it tries to make my argument accessible to non-specialists. My own thoughts have of course been influenced by many of the works I link below.

The more scholarly inclined may want to check Michael Price’s recent article that makes more or less my point, backed up with some (heavyweight) bibliography.
On Utilitarianism with an Evolutionary twist, and its own limitations, you may want to start from Thomas Nagel’s excellent review of Joshua Greene’s Moral Tribes.
A good critique of naïve utilitarianism is also provided by Julian Savulescu on the Practical Ethics blog (links to the associated paper). You may also want to trace back the debate to Peter Singer’s utilitarian manifesto: The Drowning Child and the Expanding Circle.
Finally, striking closer to home, Alexander Yartsev is writing a series of posts on TheEGG about morality and Evolutionary Psychology, which complements well my own contributions here.

12 comments on “Strong AI, utilitarianism and our limited minds”
  1. Sergio, have you read Nick Bostrom’s book? It’s pretty entertaining!

    This is an interesting topic; maybe I should touch upon it in the post about the cultural side of ethics. I largely agree with you that all that is left is utilitarianism. My immediate reaction is to re-frame the issue of AI actions as not actually pertaining to ethics. The human sense of agency is a bad model for the AI “subjectivity”, much in the same way it cannot model well the actions of societies or different institutions (although there is a built-in urge to do so). At best, they would be seen as a mapping onto our human ethics of something completely different (utilitarianism follows, of course). But even that in some cases will not be necessary, and what we are going to be left with is something similar to the ABMs used in economics informing our interactions with AI (including imposing restrictions in the form of legislation and the like).

  2. Sergio Graziosi says:

    Alex, I’m already scared enough about strong AI, so no, Bostrom’s book is in my wishlist, but doesn’t get into my top ten. I know the gist of his argument, of course.

    Have you read the posts that generated my own? The whole thing stems from the assumption of Strong AI: we know of nothing like it at the present time, not even close.
    Now, if something that can be genuinely recognised by most as a strong, general purpose artificial (or alien) intelligence were here amongst us, what criteria do you propose we should use in deciding whether it should be granted some rights, considered as a non-human agent with moral responsibilities, whether it can feel pain or angst, and so on?
    Make this new entity smart enough and you’ll see that answering these questions becomes tricky, fast.

    The move towards utilitarianism strips ethics of its special status, hence your gut reaction meets my own from the other direction, but I do suspect your thoughts are guided by the kind of AI that currently exists.

    • I have a thought about how incredibly early all this comes into discussion: even the UDHR is only about 70 years old, and together with other relevant UN documents it is part of an ongoing process of merely formulating basic human rights as “a common standard of achievement”, let alone engineering effective mechanisms and institutions. There is an ongoing discussion of things like adding a right not to kill, or of how to resolve conflicts when cultural and religious rights impinge on political rights and vice versa… It would be extremely inconvenient to add new types of ethical agents into that soup at such an early stage!

    • Sergio Graziosi says:

      Alex,
      a few immediate thoughts on this, in no particular order and written down without letting them simmer first:
      1) We’re having philosophical fun, following the best tradition: we grab the ball and run away with it. Never mind if the world is busy with real life and can’t follow; it’s still interesting to explore the wide space of possibilities. It is also not entirely pointless: while there is no guarantee that speculations on imagined possibilities will apply to what will actually happen, it is worth having a try, and possibly starting to prepare the cultural ground for the challenges that may come. Not that I have a chance of having any measurable impact on my own (if I thought I did, I would probably tread 10+ times more carefully)…
      2) The reality on the ground sort-of parallels what I’m trying to say: discussions are difficult and consensus is hard to reach because of the limitations I mention, as well as sociological issues and power struggles. The gist here is that you can read the current discussions as an implicit attempt at finding common utilitarian criteria; as expected, it is close to impossible, but painstakingly slow progress can be made.
      3) It would be inconvenient indeed. That’s a good reason to start thinking about it, isn’t it? (considering that it might happen, and it might happen fast)

  3. rsbakker says:

    Excellent post, Sergio! Joshua Greene makes a similar argument in his fascinating book, Moral Tribes, where he poses utilitarianism as a means to overcome the limits of our evolved – ‘tribal’ – moral capacities in a global age, and I’m convinced that it lacks the resources to tackle even this problem, let alone the problems posed by AI possibility space. The reason, quite simply, is that any single meta-ethical account is doomed to be underdetermined. In effect, moral philosophy cannot solve this problem because moral philosophy is another symptom of the problem, an example of how our intuitions can always be brought into discursive conflict.

    • Sergio Graziosi says:

      Thanks Scott!

      moral philosophy is another symptom of the problem, an example of how our intuitions can always be brought into discursive conflict.

      Agreed. I guess you can say that I’m trying to make this point palatable :-/

  4. eschwitz says:

    Very cool, Sergio. I see some of the attractions of this angle. Here’s a hypothetical: We discover “hedonium”: the most efficient simple structure for creating pleasure/happiness. Hedonium, let’s suppose (if this is possible to suppose) gives that very high level of pleasure/happiness without any or much higher cognition or reflection or difference between individuals. Ecstatic oysters, more or less, if you will. Now suppose a super-AI converts all the mass of the solar system (including us) into hedonium. Is this good, now that we’ve been replaced by a huge number of ecstatic oysters? Or not good. I’m inclined to think not good, but I see how someone might think it’s good, and it does have a kind of theoretical neatness.

    • Sergio Graziosi says:

      eschwitz, our inclinations coincide.
      This kind of scenario points directly to the weak spot of the general utilitarian angle, and I only have a weak answer. It goes like this:
      1. We still don’t know what we mean by “flourish”, “maximising good”, etc. That’s why I’m pointing out that we need to learn a lot about “what makes us tick”. It’s possible that at some stage we will find it easy to conclude that your scenario is definitely good, or not good, or just “meh”. But until we learn more (if it is possible), it’s hard to say.
      2. Antifragile solutions will never be uniform; variability and optionality are a strong requirement. This is one reason why we can already justify our intuitive aversion to your scenario.
      3. In fact, our somewhat innate predispositions are known to be antifragile, and I guess it’s reasonable to expect that most of us humans would consider the prospect of being replaced by a huge number of ecstatic oysters rather disappointing. There are many more examples of this: our typical gut feelings on moral matters (and more) are not that difficult to explain in terms of how they contribute to making us (the human race) antifragile (up to a point). What is difficult is to find a convincing explanation of how such inclinations have evolved. We are making progress on this, though.

      IOW, for the time being, we can say:
      a. Your scenario implies a gigantic drop of diversity. This is known to be Not Good, in the long run. It is not a stable solution; it will disintegrate at the first blow.
      b. We do know that pleasure loses meaning without the prospect of pain. A little pain can be good. This is an indication that we do need to learn more, I guess.
      Hence c. It’s not unreasonable to trust our gut feeling on this one.
      If this feels like a post-hoc rationalisation to you, you are in good company. It is weak and I know it.

      On the other hand, if you tell me that these hedonia will actually be very complex inside (not a simple substance) and that each hedonium will implement its own different strategy to maximise pleasure, generating a big variety of different behaviours, strategies and even understandings, then:
      i. my gut reaction would still be negative, but very much less so
      ii. I could even ask: are we sure there is a difference?

      Don’t know how convincing this may be, but I would really like to know what you think.

      You also reminded me that I wanted to touch on exactly this kind of scenario, but I ran out of space and stamina. I would have cited the obligatory Infinite Jest, and the less known but equally memorable Amused to Death.

  5. Cy says:

    It seems to me like you’re still not addressing (or at least seem to be speaking past) the main challenge of utilitarianism in this context: how do we compare and sum utility across agents? We could have a perfectly sound, intuitive, antifragile utilitarian model that still gives us massively different courses of action based on whose utility counts and how it’s counted over the universe.

    It sounds from your comment above like you’d like a weighting that takes diversity into account: that similar types of pleasure, or pleasures for similar agents carry less weight than for novel, or at least more divergent, agents?

  6. Sergio Graziosi says:

    Cy,
    you are absolutely right, I haven’t even approached the issue of comparability and of how to integrate/sum utility across agents and time. I haven’t because points 1, 2 and 3 imply that we can’t, not at the moment. We can hope to make these limitations smaller, but I can’t see them disappearing, not even in a distant future.
    So, until we learn more, we are stuck with commonsensical solutions, but at least we have a conceptual reason to keep using them. They are, in commonplace circumstances (evolutionarily old ones), known to be somewhat antifragile. At the same time, we have a conceptual key to investigate where and how our commonsensical intuitions are likely to fail miserably (e.g., in this case, if we’ll face an alien/artificial intelligence, but many more scenarios are possible, and plenty already exist) and we have a strategy to try improving/substituting them.

    Remember where I started: I was trying to propose a positive outlook, at least to demonstrate that we can try to save something.

    Which reminds me: Scott and eschwitz, if you are still reading, do you think I’ve managed to lift some of the gloom? I hope so, but I certainly don’t count on it.

    On diversity: I suggest we keep an eye on it, as there are many reasons to expect that diversity in every domain is useful (hence close to “good”) in itself. However, I’m not ready to unpack this argument; it is the expected conclusion of my explorations, and is just a hunch for now.
    In this context, I am personally inclined to give some weight to diversity: of agents, of modes of experiencing, and of experiences themselves (hence a mix of pain and pleasure, boredom and excitement, etc.). One way to modify eschwitz’s hypothetical would be: what if instead of ecstatic oysters we had a way to maximise the richness and variation of the oysters’ experience? Does your own intuition shift? Mine does; it moves more than enough to notice the change.

  7. It would be inconvenient indeed. That’s a good reason to start thinking about it, isn’t it?

    Oh, absolutely! I don’t run from philosophical considerations; it is just that in recent times I get more and more interested in existing legal mechanisms, especially international ones, concerning such topics. I hope that something useful comes out of this interest. There was this Russian thinker, Grigory Pomerants, whose favorite conclusion was that, because dynamism is essential to all kinds of international debate, the manner and the spirit of discussion are actually more important than its ever-changing contents and its no less ever-changing relative consensus. It is extremely nihilistic if taken literally, but I like the direction of thinking because it shifts attention from aiming at results to managing the ongoing process. I think this shift is key – we don’t have to resolve ethical conflicts, but we have to follow them, attend to them and learn to live with them.

  8. […] by the late Harriet McBryde Johnson on her encounters with Peter Singer. [Side Note: when I wrote this article I was willing to give Singer the benefit of doubt, reading Johnson’s article convinced me […]
