Week 4 - AI and Artificial Sentience
The Importance of Artificial Sentience
Is artificial sentience possible? A thought experiment
Imagine that you develop a brain disease like Alzheimer’s, but that a cutting-edge treatment is available: doctors replace the damaged neurons in your brain with computer chips that are functionally identical to healthy neurons.
After your first treatment that replaces just a few thousand neurons, you feel no different.
As your condition deteriorates, the treatments proceed and, eventually, the final biological neuron in your brain is replaced.
Still, you feel, think, and act exactly as you did before. If nothing is lost when every biological neuron is replaced, then sentience does not seem to require a biological substrate, and artificial sentience is at least possible.
Vast Numbers of Artificial Sentiences (AS)
Suffering for AS
Exploitation and exclusion
e.g., sentient robots that are enslaved, and simulations that are locked into torturous conditions or terminated without concern for loss of life
Humans have repeatedly exploited other humans (e.g., slavery) and animals (e.g., factory farming), and have neglected to help sentient beings, particularly by failing to alleviate the suffering of wild animals.
Not recognizing intrinsic value
Speciesism
If artificial intelligence is modeled on nonhuman species, speciesist bias could spill over to AI systems.
Substratism
the unjustified disregard or worse treatment of beings whose algorithms are implemented on artificial (e.g., silicon-based) rather than biological (i.e., carbon-based) substrates
Anthropomorphism
because we empathize most readily with human-like beings, suffering subroutines, simulations, and other disembodied entities might be neglected.
Scope insensitivity
While it might be easy to empathize with a single “identifiable victim,” larger scale issues tend to be relatively neglected.
Short-termism
Politicians and academics tend to focus on short-term issues. This might lead to insufficient action to prevent future artificial suffering.
Evolutionary pressure
Evolutionary pressure has caused vast amounts of suffering in wild animal lives.
Similar evolutionary pressure could cause unintended suffering for AS.
Technological risks
New technologies enabling space colonization or autonomous artificial superintelligence may cause or facilitate astronomical suffering among artificial sentiences.
Tractability
Advocacy efforts
focus on institutional interventions and messaging, rather than on changing individual behaviors.
Incremental institutional reforms
build momentum for further change.
Research
General (i.e., what to prioritize)
Narrow (i.e., which interventions are cost-effective for AS)
Field-building
Acknowledging backlash
given the risk of backlash, it seems preferable to avoid mass outreach for now
Already-established adjacent advocacy groups
Outreach to individuals and organizations who are already conducting relevant research or advocating for the moral consideration of other neglected groups, such as animals and future generations
Academic Field-building
publish books and journal articles, organize conferences, set up new research institutes, or offer grants for relevant work.
discussion in relevant forums, conferences, and podcasts may be helpful
Principles for AI Welfare Research
Why prioritize AI welfare?
Scale
digital population has the potential to be much larger than the biological population in the future
In the same way that the invertebrate population is much larger than the vertebrate population at present
Neglect
Humans still spend much less time and money studying and promoting nonhuman welfare and rights than studying and promoting human welfare and rights, despite the fact that the nonhuman population is much larger than the human population
The same is true of the vertebrate and invertebrate populations at present, and may become true of the biological and digital populations
Tractability
AI welfare research is at least potentially tractable
Uncertainties
requires us to confront some of the hardest issues in philosophy and science, ranging from the nature of consciousness to the ethics of creating new beings
even so, we should at least investigate the tractability of the issue
Priorities within AI Welfare Research
Who has welfare capacity and moral standing?
even if we grant that sentience is sufficient
whether consciousness without sentience, agency without consciousness, or life without agency is also sufficient
which beings have the features that might be necessary and sufficient
if relatively complex, centralized, carbon-based systems can be sentient or otherwise significant, we might wonder whether relatively simple, decentralized, silicon-based systems can be too
What degree of welfare capacity?
understanding how much happiness, suffering, and other welfare states particular beings can have
What will benefit or harm them?
we might not always know to what extent someone is experiencing positive or negative states in practice
what follows from all this information for our actions and policies
what we owe them, what kinds of attitudes we should cultivate towards them, and what kinds of relationships we should build with them.
Principles for AI Welfare Research
AI welfare research should be pluralistic.
open to the possibility that our current views are wrong
Pluralism, e.g., about:
whether welfare is primarily a matter of pleasure and pain, satisfaction and frustration, etc.
whether morality is primarily a matter of welfare, rights, virtues, relationships, etc.
which beings have the capacity for welfare and which actions and policies are good or bad for them
AI welfare research should be multidisciplinary
cognitive science and computer science to understand how biological and digital systems work
humanities and social sciences to examine the metaphysical, epistemological, and normative assumptions that drive this research
and to identify the beliefs, values, and practices that shape our interactions with animals and AI systems
AI welfare research requires confronting human ignorance
Humility
How, if at all, can we have knowledge about other minds when the only mind that any of us can directly access is our own?
Our knowledge about other minds will likely always be limited
Openness
open to the possibility that we can reduce our uncertainty about nonhuman minds
AI welfare research requires confronting human bias
Biases that distort our thinking
our intuitions are also sensitive to self-interest, speciesism, status quo bias, scope insensitivity, and more.
we are prone to anthropomorphism in some contexts (that is, taking nonhumans to have human features that they lack)
and to anthropodenial in other contexts (that is, taking nonhumans to lack human features that they have).
AI welfare research requires spectrum thinking
Avoid binary (i.e., all-or-nothing) terms
instead of asking whether AI systems have particular capacities, we should ask what kinds of capacities they have and lack
AI welfare research requires particularistic thinking
e.g., how bumblebees communicate and solve problems is very different from how, say, carpenter ants do
How?
Avoid framing questions about animal minds in general terms
instead of simply asking what AI minds are like, we should ask what particular kinds of AI minds are like
AI welfare research requires probabilistic thinking
we may never be able to have certainty about animal minds
Instead, we may only be able to have higher or lower degrees of confidence
AI welfare research requires reflective equilibrium
Avoid
starting with what we know about the human mind and then asking whether and to what degree these truths hold for nonhuman minds too
Instead
By asking what nonhuman minds are like, we can expand our understanding of the nature of perception, experience, communication, goal-directedness etc.
AI welfare research requires holistic thinking
many links between humans, animals, and AI systems, and these links can sometimes reveal tradeoffs
Insofar as positive-sum approaches are possible, thinking holistically allows us to identify them. And insofar as tradeoffs remain, thinking holistically allows us to prioritize thoughtfully and minimize harm
Separation from hyperexistential risk
Standard x-risk: everyone dies (e.g., paperclip maximization)
Hyperexistential risk: "fate worse than death"
An AI that perfectly understands what humans love also perfectly understands what humans hate
Mirror risk: a sign flip in the code could turn utopia into hell (see the sketch after this list)
Extortion: external forces could threaten the AI to extract concessions
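To make the mirror risk concrete, here is a minimal sketch in Python; the utility functions, the outcome encoding, and the "wellbeing" field are hypothetical placeholders, not anyone's proposed design:

```python
# Toy illustration of the "mirror risk": one misplaced minus sign
# inverts what an optimizer pursues. The utility functions and the
# "wellbeing" field are hypothetical placeholders.

def intended_utility(outcome: dict) -> float:
    """Score outcomes by human wellbeing, as intended."""
    return outcome["wellbeing"]

def sign_flipped_utility(outcome: dict) -> float:
    """The same function with a single sign error: the optimizer
    now ranks the worst outcomes highest."""
    return -outcome["wellbeing"]

outcomes = [{"wellbeing": 10.0}, {"wellbeing": -10.0}]

# Identical argmax machinery, opposite selections.
print(max(outcomes, key=intended_utility))      # {'wellbeing': 10.0}
print(max(outcomes, key=sign_flipped_utility))  # {'wellbeing': -10.0}
```

The point of the sketch is that nothing else in the system needs to go wrong: the same optimization machinery faithfully pursues whichever objective it is handed.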
Solutions
Limit information
Don't give it the data needed to understand what humans dislike
Use surrogate goals to deflect threats
'Honeypot' strategy
Instead of trying to stop extortion, we 'reroute' the damage to a harmless or even beneficial target
Sequence
some agents threaten to harm other agents, either in an attempt at extortion or as part of an escalating conflict
potential victims add a “meaningless” surrogate goal to their utility function
threats would target this “honeypot” rather than the initial goals
escalating threats would no longer lead to large amounts of disvalue
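A minimal sketch of this sequence, under the assumption of toy numeric payoffs (the names "real_goal" and "honeypot" and all values are illustrative, not a worked-out design):

```python
# Toy model of the surrogate-goal ("honeypot") sequence above.
# All payoffs are illustrative assumptions.

# Loss reported by the victim's *modified* utility function: the added
# surrogate term makes threats against the honeypot exactly as coercive
# as threats against the real goal.
professed_loss = {"real_goal": 100.0, "honeypot": 100.0}

# Loss the world *actually* suffers if a threat is carried out: the
# honeypot is "meaningless", so executing a threat against it
# produces no real disvalue.
realized_loss = {"real_goal": 100.0, "honeypot": 0.0}

for target in ("real_goal", "honeypot"):
    print(f"threaten {target}: leverage={professed_loss[target]}, "
          f"damage if executed={realized_loss[target]}")
```

If threats can be channeled toward the honeypot, carrying them out destroys nothing of real value, which is why escalation no longer produces large amounts of disvalue.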
Challenges
(Running example: Alice adopts protecting a particular sphere as her surrogate goal; Bob is a potential extortionist.)
Interference
Alice shouldn't let her obsession with spheres stop her from doing her actual job
The surrogate goal (sphere) must stay dormant unless a threat is detected (see the sketch after this list)
Credibility
Bob has to believe Alice actually cares about the sphere
Neutrality
The surrogate goal shouldn't be 'easier' or 'harder' for Bob to attack than the original goal, or it might change how often he chooses to extort her
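One possible shape for the interference constraint, as a hedged sketch rather than a worked-out design; the threat-detection flag and the weights are assumptions introduced here:

```python
# Hedged sketch of the "interference" constraint: the surrogate term is
# dormant in ordinary operation, so Alice never trades real work for
# sphere care. The threat-detection flag and weights are hypothetical.

def alice_utility(real_goal_value: float, sphere_intact: bool,
                  threat_detected: bool) -> float:
    # Dormant by default: the sphere carries zero weight day to day.
    surrogate_weight = 100.0 if threat_detected else 0.0
    surrogate_loss = 0.0 if sphere_intact else 1.0
    return real_goal_value - surrogate_weight * surrogate_loss

# Day to day, sphere damage costs Alice nothing (no interference)...
print(alice_utility(50.0, sphere_intact=False, threat_detected=False))  # 50.0
# ...but once a threat is detected, she genuinely stands to lose.
print(alice_utility(50.0, sphere_intact=False, threat_detected=True))   # -50.0
```

Note how this shape interacts with the other challenges: for credibility, Bob must still believe the threat-time weight is real rather than a bluff, and neutrality constrains how large that weight can be.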