Week 4 - AI and Artificial Sentience
The Importance of Artificial Sentience
Is artificial sentience possible? A thought experiment
Imagine that you develop a brain disease like Alzheimer’s, but that a cutting-edge treatment is available: doctors replace the damaged neurons in your brain with computer chips that are functionally identical to healthy neurons.
After your first treatment that replaces just a few thousand neurons, you feel no different.
As your condition deteriorates, the treatments proceed and, eventually, the final biological neuron in your brain is replaced.
Still, you feel, think, and act exactly as you did before. If nothing is lost when every biological neuron is replaced, then sentience does not seem to require a biological substrate, and artificial sentience is at least possible.
Vast Numbers of Artificial Sentiences (AS)
Suffering for AS
Exploitation and exclusion
e.g., sentient robots that are enslaved, and simulations that are locked into torturous conditions or terminated without concern for loss of life
Humans have repeatedly exploited other humans (e.g., slavery) and animals (e.g., factory farming), and have neglected to help sentient beings, particularly by failing to alleviate the suffering of wild animals.
Not recognizing intrinsic value
Speciesism
If artificial intelligence is modeled on nonhuman species, speciesist bias could spill over to AI systems.
Substratism
the unjustified disregard or worse treatment of beings whose algorithms are implemented on artificial (e.g., silicon-based) rather than biological (i.e., carbon-based) substrates
Anthropomorphism
because we empathize most readily with human-like beings, suffering subroutines, simulations, and other disembodied entities might be neglected.
Scope insensitivity
While it might be easy to empathize with a single “identifiable victim,” larger scale issues tend to be relatively neglected.
Short-termism
Politicians and academics tend to focus on short-term issues. This might lead to insufficient action to prevent future artificial suffering.
Evolutionary pressure
Evolutionary pressure has caused vast amounts of suffering in wild animal lives.
Similar evolutionary pressure could cause unintended suffering for AS.
Technological risks
New technologies enabling space colonization or autonomous artificial superintelligence may cause or facilitate astronomical suffering among artificial sentiences.
Tractability
Advocacy efforts
focus on institutional interventions and messaging, rather than on changing individual behaviors.
Incremental institutional reforms
build momentum for further change.
Research
General (i.e., what to prioritize)
Narrow (i.e., which interventions are cost-effective for AS)
Field-building
Acknowledging backlash
given the risk of backlash, it seems preferable to avoid mass outreach for now
Already-established adjacent advocacy groups
Outreach to individuals and organizations who are already conducting relevant research or advocating for the moral consideration of other neglected groups, such as animals and future generations
Academic Field-building
publish books and journal articles, organize conferences, set up new research institutes, or offer grants for relevant work.
discussion in relevant forums, conferences, and podcasts may be helpful
Principles for AI Welfare Research
Why prioritize AI welfare?
Scale
digital population has the potential to be much larger than the biological population in the future
In the same way that the invertebrate population is much larger than the vertebrate population at present
Neglect
Humans still spend much less time and money studying and promoting nonhuman welfare and rights than studying and promoting human welfare and rights, despite the fact that the nonhuman population is much larger than the human population
The same is true of the vertebrate and invertebrate populations at present, and may become true of the biological and digital populations
Tractability
AI welfare research is at least potentially tractable
Uncertainties
requires us to confront some of the hardest issues in philosophy and science, ranging from the nature of consciousness to the ethics of creating new beings
even so, we should at least investigate the tractability of the issue
Priorities within AI Welfare Research
Who has welfare capacity and moral standing?
even if we grant that sentience is sufficient
whether consciousness without sentience, agency without consciousness, or life without agency is also sufficient
which beings have the features that might be necessary and sufficient
if relatively complex, centralized, carbon-based systems can be sentient or otherwise significant, we might wonder whether relatively simple, decentralized, silicon-based systems can be too
What degree of welfare capacity?
understanding how much happiness, suffering, and other welfare states particular beings can have
What will benefit or harm them?
we might not always know to what extent someone is experiencing positive or negative states in practice
what follows from all this information for our actions and policies
what we owe them, what kinds of attitudes we should cultivate towards them, and what kinds of relationships we should build with them.
Principles for AI Welfare Research
AI welfare research should be pluralistic.
open to the possibility that our current views are wrong
Pluralism, e.g., about:
whether welfare is primarily a matter of pleasure and pain, satisfaction and frustration, etc.
whether morality is primarily a matter of welfare, rights, virtues, relationships, etc.
which beings have the capacity for welfare and which actions and policies are good or bad for them
AI welfare research should be multidisciplinary
cognitive science and computer science to understand how biological and digital systems work
humanities and social sciences to examine the metaphysical, epistemological, and normative assumptions that drive this research
and to identify the beliefs, values, and practices that shape our interactions with animals and AI systems
AI welfare research requires confronting human ignorance
Humility
How, if at all, can we have knowledge about other minds when the only mind that any of us can directly access is our own?
Our knowledge about other minds will likely always be limited
Openness
open to the possibility that we can reduce our uncertainty about nonhuman minds
AI welfare research requires confronting human bias
Biases that distort our thinking
our intuitions are also sensitive to self-interest, speciesism, status quo bias, scope insensitivity, and more.
we are prone to anthropomorphism in some contexts (that is, taking nonhumans to have human features that they lack)
and to anthropodenial in other contexts (that is, taking nonhumans to lack human features that they have).
AI welfare research requires spectrum thinking
Avoid binary (i.e., all-or-nothing) terms
instead of asking whether AI systems have particular capacities, we should ask what kinds of capacities they have and lack
AI welfare research requires particularistic thinking
e.g., how bumblebees communicate and solve problems is very different from how, say, carpenter ants do
How?
Avoid framing questions about animal minds in general terms
instead of simply asking what AI minds are like, we should ask what particular kinds of AI minds are like
AI welfare research requires probabilistic thinking
we may never be able to have certainty about animal minds
Instead, we may only be able to have higher or lower degrees of confidence
AI welfare research requires reflective equilibrium
Avoid
starting with what we know about the human mind and then asking whether and to what degree these truths hold for nonhuman minds too
Instead
By asking what nonhuman minds are like, we can expand our understanding of the nature of perception, experience, communication, goal-directedness etc.
AI welfare research requires holistic thinking
many links between humans, animals, and AI systems, and these links can sometimes reveal tradeoffs
Insofar as positive-sum approaches are possible, thinking holistically allows us to identify them. And insofar as tradeoffs remain, thinking holistically allows us to prioritize thoughtfully and minimize harm
Separation from hyperexistential risk
Standard x-risk: everyone dies (e.g., paperclip maximization)
Hyperexistential risk: "fate worse than death"
An AI that perfectly understands what humans love also perfectly understands what humans hate
Mirror risk: a sign flip in the code could turn utopia into hell (see the sketch after this list)
Extortion: external forces could threaten the AI to extract concessions
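To make the mirror risk concrete, here is a minimal sketch in Python; the utility functions, the outcome encoding, and the "wellbeing" field are hypothetical placeholders, not anyone's proposed design:

```python
# Toy illustration of the "mirror risk": one misplaced minus sign
# inverts what an optimizer pursues. The utility functions and the
# "wellbeing" field are hypothetical placeholders.

def intended_utility(outcome: dict) -> float:
    """Score outcomes by human wellbeing, as intended."""
    return outcome["wellbeing"]

def sign_flipped_utility(outcome: dict) -> float:
    """The same function with a single sign error: the optimizer
    now ranks the worst outcomes highest."""
    return -outcome["wellbeing"]

outcomes = [{"wellbeing": 10.0}, {"wellbeing": -10.0}]

# Identical argmax machinery, opposite selections.
print(max(outcomes, key=intended_utility))      # {'wellbeing': 10.0}
print(max(outcomes, key=sign_flipped_utility))  # {'wellbeing': -10.0}
```

The point of the sketch is that nothing else in the system needs to go wrong: the same optimization machinery faithfully pursues whichever objective it is handed.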
Solutions
Limit information
Don't give it the data needed to understand what humans dislike
Use surrogate goals to deflect threats
'Honeypot' strategy
Instead of trying to stop extortion, we 'reroute' the damage to a harmless or even beneficial target
Sequence
some agents threaten to harm other agents, either in an attempt at extortion or as part of an escalating conflict
potential victims add a “meaningless” surrogate goal to their utility function
threats would target this “honeypot” rather than the initial goals
escalating threats would no longer lead to large amounts of disvalue
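A minimal sketch of this sequence, under the assumption of toy numeric payoffs (the names "real_goal" and "honeypot" and all values are illustrative, not a worked-out design):

```python
# Toy model of the surrogate-goal ("honeypot") sequence above.
# All payoffs are illustrative assumptions.

# Loss reported by the victim's *modified* utility function: the added
# surrogate term makes threats against the honeypot exactly as coercive
# as threats against the real goal.
professed_loss = {"real_goal": 100.0, "honeypot": 100.0}

# Loss the world *actually* suffers if a threat is carried out: the
# honeypot is "meaningless", so executing a threat against it
# produces no real disvalue.
realized_loss = {"real_goal": 100.0, "honeypot": 0.0}

for target in ("real_goal", "honeypot"):
    print(f"threaten {target}: leverage={professed_loss[target]}, "
          f"damage if executed={realized_loss[target]}")
```

If threats can be channeled toward the honeypot, carrying them out destroys nothing of real value, which is why escalation no longer produces large amounts of disvalue.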
Challenges
(Running example: Alice adopts protecting a particular sphere as her surrogate goal; Bob is a potential extortionist.)
Interference
Alice shouldn't let her obsession with spheres stop her from doing her actual job
The surrogate goal (sphere) must stay dormant unless a threat is detected (see the sketch after this list)
Credibility
Bob has to believe Alice actually cares about the sphere
Neutrality
The surrogate goal shouldn't be 'easier' or 'harder' for Bob to attack than the original goal, or it might change how often he chooses to extort her
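One possible shape for the interference constraint, as a hedged sketch rather than a worked-out design; the threat-detection flag and the weights are assumptions introduced here:

```python
# Hedged sketch of the "interference" constraint: the surrogate term is
# dormant in ordinary operation, so Alice never trades real work for
# sphere care. The threat-detection flag and weights are hypothetical.

def alice_utility(real_goal_value: float, sphere_intact: bool,
                  threat_detected: bool) -> float:
    # Dormant by default: the sphere carries zero weight day to day.
    surrogate_weight = 100.0 if threat_detected else 0.0
    surrogate_loss = 0.0 if sphere_intact else 1.0
    return real_goal_value - surrogate_weight * surrogate_loss

# Day to day, sphere damage costs Alice nothing (no interference)...
print(alice_utility(50.0, sphere_intact=False, threat_detected=False))  # 50.0
# ...but once a threat is detected, she genuinely stands to lose.
print(alice_utility(50.0, sphere_intact=False, threat_detected=True))   # -50.0
```

Note how this shape interacts with the other challenges: for credibility, Bob must still believe the threat-time weight is real rather than a bluff, and neutrality constrains how large that weight can be.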