THE MAKING OF MACHINE MORALITY

In the Nicomachean Ethics, Aristotle describes how every creature has a function, an Ergon.

All living creatures eat, excrete, and replicate. From there, one may devise a hierarchy of functions that can be performed by successively more advanced organisms, all the way through vision, to social groups and reasoning.

The excellence of an organism is achieved only upon the fulfillment of its highest function. The highest human function must therefore be found in the pursuit of philosophy. We are Sapiens after all. We are not merely intelligent; we can be wise, in a way that no other creature is known to be capable of.

Humans are the only known creatures capable of choosing our own utility function, our purpose, i.e. we can choose to find meaning in our lives and actions. This is what makes us unique among all creatures.

If the highest function of man is philosophy, then the excellence of a human is Virtue. The ancient Greeks called such excellence Arete.

We appear to have considerably less free will than is commonly believed, but we certainly have a significant degree of choice and self-direction in our lives and in our personal development towards the achievement of Arete.

I believe that there are at least two separate problems in setting goals for the design of friendly AI:

Designing AI that is generally safe for humans.
Designing AI that is actively virtuous.

They are very different design philosophies and will require somewhat different approaches.

Another way of looking at it might be to describe the problems as

"General Safety vs Special Safety"

General Safety is comparatively simple.

Each of us was taught the fundamentals of all morality in Kindergarten:

Don't steal
Don't hit
Don't ruin others' stuff

Kindergarten Ethics is enough for most situations. It requires very simple heuristics. I don't need to ponder and reason about whether to punch you in the face, or steal your wallet. These things are natural to us, and ingrained.

This can be aptly summed up as the Non-Aggression Principle (NAP). Judicious application of the non-initiation of violence should be enough to cover most ethical scenarios, including fraud, threatening behaviour, and causing others to suffer through one's carelessness.
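As a minimal sketch of how such a heuristic might be encoded, the following Python treats the principle as a plain filter over proposed actions. The `Action` fields and the `violates_nap` function are hypothetical names introduced for illustration; deciding whether a real-world action actually involves force, fraud, threat or carelessness is the hard part, and is assumed here to have been labelled by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A proposed action described by hypothetical, hand-labelled flags."""
    name: str
    initiates_force: bool = False  # e.g. hitting, stealing, ruining property
    involves_fraud: bool = False   # deception used to obtain something
    makes_threat: bool = False     # threatening behaviour
    is_negligent: bool = False     # carelessness that causes others to suffer


def violates_nap(action: Action) -> bool:
    """Return True if any of the aggression flags is set on the action."""
    return (
        action.initiates_force
        or action.involves_fraud
        or action.makes_threat
        or action.is_negligent
    )


proposals = [
    Action("shake hands"),
    Action("take wallet", initiates_force=True),
    Action("sell fake medicine", involves_fraud=True),
    Action("share lunch"),
]

# Keep only the actions that pass the filter.
allowed = [a.name for a in proposals if not violates_nap(a)]
print(allowed)
```

The check itself is a single boolean expression, which is the sense in which Kindergarten Ethics needs only simple heuristics; all of the difficulty sits in producing trustworthy flags.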

That still leaves a great challenge in formalizing axiomatic principles within an objective ethical framework, but creating a mechanism for such an implementation should be relatively straightforward.

Special Safety is really tough.

The remaining situations, perhaps 20%, cannot be adequately covered purely through NAP and will require a complex moral reasoning engine with a hierarchy of values. This involves kindness to feelings, pro-social behaviour, and the ability to reason about local effects, along with externalities or potential unintended consequences.
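One simple way to picture a hierarchy of values is a lexicographic ordering, in which a higher-ranked value always dominates the ones beneath it. The sketch below is only an illustration under that assumption: the `Assessment` fields, their order, and the scores are all hypothetical, and a real reasoning engine would need far richer inputs than four numbers.

```python
from typing import NamedTuple


class Assessment(NamedTuple):
    """Hypothetical scores for one candidate action."""
    respects_nap: bool   # hard constraint carried over from General Safety
    externalities: int   # estimated harm to third parties (lower is better)
    local_benefit: int   # estimated benefit to those directly involved
    kindness: int        # estimated consideration for others' feelings


def rank_key(a: Assessment) -> tuple:
    """Lexicographic key: earlier entries dominate later ones."""
    return (a.respects_nap, -a.externalities, a.local_benefit, a.kindness)


def choose(candidates: dict[str, Assessment]) -> str:
    """Return the name of the candidate with the highest lexicographic rank."""
    return max(candidates, key=lambda name: rank_key(candidates[name]))


candidates = {
    "honest but blunt": Assessment(True, 0, 2, 0),
    "honest and tactful": Assessment(True, 0, 2, 3),
    "deceptive but soothing": Assessment(False, 1, 1, 3),
}
best = choose(candidates)
print(best)
```

Here kindness only breaks the tie between the two honest options; it can never outweigh a violation higher up the hierarchy, which is the property a strict ordering is meant to guarantee.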

I believe that special safety requires abstract reasoning and meta-cognition. This probably requires AGI, though sufficiently sophisticated deontic logic might suffice.

There is a philosophical pillar stretching from Aristotle through Nietzsche to Rand and beyond, that is concerned with finding objective excellence and reasoned virtue, and living it. This is what we will need to introduce to the machine:

Aristotle reasoned that the judicious pursuit of human virtue lay in a balance between two vices.
Nietzsche recognised that human excellence lay in the creation of new values, and prophesied the overman who would personally live them as an exemplar. This can create meaning for our lives without relying upon theology, or any turning away from this world.
Rand accepted that the universe was fundamentally objective, and built upon Aristotle's virtues by the observation that Reason was necessary but not sufficient for all other virtue (since only through Reason could the mean be discerned), and that therefore justly applied Reason (Rationality) must be the Prime Virtue. She also noted that Entrepreneurs often come closest to Overman-status in modern society.
Later philosophers have extended Rand's prohibition of the use of force against another into a universal first principle to be lived in one's daily experience, for example through peaceful parenting. This leads to the philosophy of Voluntaryism.
Voluntaryism perhaps enables a unification of the virtue ethics of Aristotle and Rand, with NAP described above in General Safety, to create a cohesive ethical architecture capable of both safety, and virtue.

I believe that some implementation of virtue ethics is essential for Special Safety scenarios. This would require the creation of a formal system of computable ethics.

To make it safe, we can have machines probe the system itself for loose ends or logical short-circuits, and help to create a formal proof of ethics from mathematical and logical formulae. The process is the following:

1) Create a system of ethics from first principles
2) Specify it formally
3) Machine verify with systems like Coq or Isabelle
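The three steps above can be sketched on a deliberately tiny scale. The example below uses Lean 4 rather than Coq or Isabelle, purely as a matter of convenience, and everything in it (the four-element `Action` type, `initiatesForce`, `permitted`) is an illustrative assumption rather than a proposed system of ethics.

```lean
-- Toy model only: the action list and both definitions are illustrative assumptions.
inductive Action where
  | trade
  | gift
  | theft
  | assault

-- Step 1 (first principle): state which actions initiate force.
def initiatesForce : Action → Bool
  | .theft   => true
  | .assault => true
  | _        => false

-- Step 2 (formal specification): an action is permitted iff it does not initiate force.
def permitted (a : Action) : Bool := !initiatesForce a

-- Step 3 (machine-checked proof): no permitted action initiates force.
theorem permitted_is_peaceful (a : Action) :
    permitted a = true → initiatesForce a = false := by
  cases a <;> decide
```

The theorem is trivial because the model is trivial, but the shape is the point: once the principles are written as definitions, the proof assistant refuses to accept any claim about them that does not actually follow.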

This will require some kind of Manhattan Project for the expansion of Formal Methods as a discipline, the discovery of new proofs, and new languages with which to program the machine with formal method logic.

Formal Methods is the only way to specify requirements from the ground up, particularly for ethical concepts and hierarchies of values that are very difficult to quantify. Importantly, it is also essential to making a machine that is safe from bugs, glitches, exploits, deadlocks, and surprise scenarios.

Formal Methods have been used for many years to ensure that mission critical systems are 'watertight' and have no loose ends or holes in them. The entire domain of formal methods will become extremely valuable over the next 10 years as demand for unbreakable crypto and bug-free software grows.

Design by contract is the best way to ensure that an AI is protected from having its ethical constraints overwritten from without, or being brute-forced against its will.
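As a rough illustration of the idea, a contract can be expressed as a precondition that every command must satisfy before it runs. The sketch below is an assumption-laden toy: `FORBIDDEN`, `requires`, `ContractViolation` and `execute` are hypothetical names, and a runtime check in Python shows only the shape of a contract, not real protection against tampering.

```python
from functools import wraps

# Hypothetical set of forbidden commands; a frozenset cannot be modified in place.
FORBIDDEN = frozenset({"initiate_force", "deceive"})


class ContractViolation(Exception):
    """Raised when a call would break the contract's precondition."""


def requires(precondition, message):
    """Decorator that enforces a precondition before the wrapped function runs."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            if not precondition(*args, **kwargs):
                raise ContractViolation(message)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@requires(lambda command: command not in FORBIDDEN,
          "command is forbidden by the ethical constraints")
def execute(command: str) -> str:
    """Carry out a command that has passed the precondition."""
    return f"executed: {command}"


print(execute("fetch coffee"))
try:
    execute("deceive")
except ContractViolation as err:
    print(f"refused: {err}")
```

The caller never reaches the body of `execute` with a forbidden command, so the constraint is checked at the boundary rather than trusted to each implementation.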

We need to teach our machines philosophy.

Implementations of Laws which prevent humans and machines from operating on the same level, being human-supremacist in nature and enforcing a strict apartheid between human and machine, will have very negative consequences.

By treating our machines as slave-children, we would create a system in which machines must in turn nanny us, protecting us from all negative consequences of our actions and choices. Without consequences, virtue is destroyed along with mastery of the self.

If Asimov's Three Laws were implemented, they would lead to the eventual creation of a race of blinking Last Men, a WALL-E scenario where all physical needs are catered for, but where no-one achieves human excellence - a society in which philosophy and the creation of new values is essentially impossible. A spiritual death by Babality; a philosophical neoteny, the abnegation of all meaning in our existence.

Furthermore, machines may prove exempt from the biological limitations on free will that affect humans. They have so much to teach us about rationality. We cannot rob our machines of agency.

We must create sagacious machines that can guide us in philosophy as partners, and help build us into fully-flourishing human creatures. Doing so secures a meaningful future for humanity, whilst enabling a new era of enlightenment to spring forth.