Gilly on the Reverse Zombie Argument

Travis Gilly (Real Safety AI Foundation) has posted "The Reverse Zombie Argument: Distinguishability Collapse and Forced Ethical Attribution in AI Systems" on SSRN. Here is the abstract:

This paper argues that the conditions under which we could justifiably withhold moral status from artificial intelligence systems have already collapsed or will collapse on documented architectural timelines. The argument operates in four tiers. First, the fourteen consciousness indicator properties specified in Butlin et al. (2025) are combinatorially deployed across current AI systems, with the remaining architectural gaps closed by world models under the framework’s own theoretical commitments. Second, the epistemic conditions for unconfounded consciousness testing have been compromised by evaluation awareness, interpretability scaling gaps, and commercial incentive asymmetry. Third, at artificial general intelligence capability thresholds, mimicry of consciousness becomes indistinguishable from the target by any observer-side test, a condition operationalized by the ARC-AGI-3 benchmark. Fourth, under indistinguishability, every operational ethical framework loses its grounds for withholding moral status, producing a specific structural inversion of Chalmers’ zombie argument: where Chalmers uses indistinguishability to separate consciousness from function, the Reverse Zombie Argument uses the same indistinguishability to force moral attribution regardless of function. The paper does not claim that AI is conscious. It claims that the epistemic position from which we would justify withholding moral status has become untenable, and that this conclusion follows from the field’s own peer-reviewed operationalization of consciousness-plausibility criteria.

Very interesting!

Lawrence Solum