Decoding the California AI Transparency Act
Exploring the promises, requirements, and privacy pitfalls of the state's new mandated provenance system.
Hi Everyone,
My name is Stephen Sharp Queener, and I’m a Public Policy Master’s student at Stanford (and incoming law student at the University of Chicago next fall) working with the Lab this quarter as an associate with the Law Program and as a teaching assistant. I was previously a fellow with our Law Lab in 2024, where I focused on researching the legal admissibility and authentication of open-source evidence, and how best to prepare courts for our then-nascent (but now very real) deepfake world while still protecting and empowering critical imagery of international crimes and human rights abuses. As I get caught up on the state of legal attempts to deal with deepfakes almost two years later, I’ve been tasked with getting us all up to date on recent and important developments in regulatory spaces (I am a public policy student, after all).
In addition to a regularly updated review of legislative and regulatory initiatives to watch, I’d like to dive a little deeper into the California AI Transparency Act, which will come into effect in August of this year. In short, the Act lays the groundwork for mandating a system of provenance spanning cell-phone photography, social media and messaging platforms, and generative AI systems, in the hopes of enabling us all to determine whether the media we encounter online is ‘real’ or ‘fake.’
Yet, in doing so, the bill runs into one of the classic dilemmas of designing a provenance system (See generally, WITNESS Media Lab, 2018, ‘Ticks or it Didn’t Happen’): striking a balance between protecting users’ privacy and security and providing enough information to actually verify the truth of any materials. And it’s up for debate whether the legislature found the right combination.
The California AI Transparency Act
The California AI Transparency Act (Cal BPA Ch. 25 §22757 – 22757.6) is the result of two bills: SB 942 (signed Sept. 19, 2024) and AB 853 (signed Oct. 13, 2025), which recently amended it.
The collective goal of the bills, as made clear in the stark terms of the Assembly Floor Analysis for AB 853, is to build a system which will, in theory:
Require that all Gen AI-derived content be labeled as ‘fake.’
Require all content produced by recording devices be labeled as ‘real.’
Require social media platforms to clearly present these labels.
Source: John Bennett, Cal. Assembly Floor Vote Analysis: AB-853, Cal. State Legislature, Sep. 5, 2025.
The California AI Transparency Act will impose distinct obligations on:
(1) Providers of Generative AI Systems with 1 million+ monthly users,
(2) Commercial Manufacturers of devices capable of capturing photos, video, or audio content,
and (3) Social Media, File Sharing, Mass Messaging, and Search Engine platforms which distribute content to 2,000,000+ monthly users.
The requirements for Generative AI providers are relatively simple to describe (though technically harder to accomplish). By August 6, 2026, they must (1) embed ‘permanent or extraordinarily difficult to remove’ latent disclosures into all generated materials (and offer users the ability to add visible watermarks), which must convey the name of the provider and model, a unique identifier, and the time of creation; (2) provide publicly available AI detection tools capable of reading these latent disclosures; and (3) quickly revoke access from any known licensee who does not provide any form of disclosure (Cal. BPA Ch. 25 §22757.2–22757.3).
Commercial capture device manufacturers must add their own latent disclosures, too. By January 1, 2028, all manufacturers who make their devices (including camera-containing cellphones) available for sale in California must add, and enable by default, an optional system for embedding latent disclosures in all captured content. This ‘optional’ system must embed the manufacturer, the name and model of the capture device, and the time of creation into all created content (Cal. BPA Ch. 25 §22757.3.3).
Together, these ‘Camera’ and ‘AI-system’ disclosures fall under the Act’s definition of provenance data: “data that is embedded into digital content, or that is included in the digital content’s metadata, for the purpose of verifying the digital content’s authenticity, origin, or history of modification” (Cal BPA Ch. 25 §22757.1).
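To make the two kinds of latent disclosure a little more concrete, here is a minimal sketch of the fields described above, written as plain Python records. The class names, field names, and helper functions are my own illustrative assumptions; the Act lists what must be conveyed, not a schema, and a real implementation would follow whatever industry standard a provider or manufacturer adopts.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import uuid


@dataclass(frozen=True)
class GenAIDisclosure:
    """Hypothetical record of what a Gen-AI latent disclosure must convey."""
    provider_name: str   # name of the provider
    system_name: str     # name of the model or system
    created_at: str      # time of creation (ISO 8601, UTC)
    unique_id: str       # unique identifier for the generated item


@dataclass(frozen=True)
class CaptureDeviceDisclosure:
    """Hypothetical record of what a capture-device latent disclosure must convey."""
    manufacturer: str    # device manufacturer
    device_name: str     # name of the capture device
    device_model: str    # model of the capture device
    created_at: str      # time of creation (ISO 8601, UTC)


def _now_iso() -> str:
    # Timestamp format is an assumption; the Act does not prescribe one.
    return datetime.now(timezone.utc).isoformat()


def new_genai_disclosure(provider_name: str, system_name: str) -> GenAIDisclosure:
    return GenAIDisclosure(provider_name, system_name, _now_iso(), str(uuid.uuid4()))


def new_capture_disclosure(manufacturer: str, device_name: str,
                           device_model: str) -> CaptureDeviceDisclosure:
    return CaptureDeviceDisclosure(manufacturer, device_name, device_model, _now_iso())


if __name__ == "__main__":
    print(asdict(new_genai_disclosure("ExampleAI", "example-model-1")))
    print(asdict(new_capture_disclosure("ExampleCam", "Example Phone", "X1")))
```

Note that neither record contains anything about the person who made the content, which is the point of the distinction the Act goes on to draw.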
The requirements for online platforms are far more wide-reaching, and will no doubt have a larger effect on our information economy, if enforced. In effect, they require every major social media website, messaging app, and search engine to detect, preserve, and share sufficient provenance data from any single piece of media they distribute to enable users to reliably determine whether “the content was generated or substantially altered by a Gen-AI system or captured by a capture device,” as well as any digital signatures attached to the content. That means building interfaces that let users see a piece of content’s provenance data, or the absence thereof. Critically, it also means platforms may no longer knowingly strip system provenance data (explained below) or digital signatures from content as it is transmitted (Cal BPA Ch. 25 §22757.3.1).
Despite the wider definition of provenance data quoted above, these collective obligations apply only to a smaller subset: the legislature is not interested in requiring the embedding, maintenance, and sharing of any and all forms of metadata. The obligations only require online platforms, AI companies, and camera manufacturers to detect, share, and protect provenance data “that is compliant with widely adopted specifications adopted by an established standards-setting body,” “that is not reasonably capable of being associated with a particular user,” and which includes information “regarding the type of device, system, or service that was used to generate a piece of digital content” or “information related to content authenticity”. The Act refers to this type of provenance data as system provenance data, as distinct from personal provenance data. Viewed together, this means online platforms will only be required to share, at a bare minimum, the general type and name of the device, model, or software which created the content, whether provenance data exists, and the existence of any digital signatures, in line with some sort of industry standard. C2PA-sans-CAWG clearly fits that mold.
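As a rough illustration of that system/personal split, the sketch below filters a piece of content's metadata down to an allowlist of "system" fields and builds the bare-minimum summary a platform might surface. Everything here is an assumption for illustration: the field names, the allowlist, and the function names are hypothetical, and the Act defines categories of data rather than a concrete schema.

```python
from typing import Any, Dict, Tuple

# Hypothetical allowlist of fields treated as system provenance data.
# The Act describes categories, not field names; these are illustrative only.
SYSTEM_FIELDS = frozenset({
    "source_type",    # e.g. "capture_device" or "genai_system"
    "manufacturer",
    "device_model",
    "provider_name",
    "system_name",
})


def split_provenance(metadata: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split metadata into (system fields, everything else)."""
    system = {k: v for k, v in metadata.items() if k in SYSTEM_FIELDS}
    other = {k: v for k, v in metadata.items() if k not in SYSTEM_FIELDS}
    return system, other


def platform_summary(metadata: Dict[str, Any], has_digital_signature: bool) -> Dict[str, Any]:
    """Build the minimal user-facing summary; non-system fields are never included."""
    system, _ = split_provenance(metadata)
    return {
        "has_provenance_data": bool(system),
        "system_provenance": system,
        "has_digital_signature": has_digital_signature,
    }


if __name__ == "__main__":
    example = {
        "source_type": "capture_device",
        "manufacturer": "ExampleCam",
        "device_model": "X1",
        "gps": (37.43, -122.17),    # identifying; not shown to users
        "device_id": "ABC-123",     # identifying; not shown to users
    }
    print(platform_summary(example, has_digital_signature=True))
```

An allowlist (rather than a blocklist of known-sensitive fields) is a deliberate choice in this sketch: any field nobody has classified is withheld by default.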
The Dilemmas of Mandated Provenance and California’s Answers
While the legislature clearly wants to build a system which will restore some faith in the materials people see online, mandating mass provenance adoption comes with serious risks, and invites questions about its effectiveness. As outlined by WITNESS in their still extremely relevant 2018 “Ticks” report, we can view these in the form of various dilemmas. A couple come quickly to mind.
First, for a provenance system to actually help users trust what they see, the amount of data it must store and convincingly convey to an observer increases (See Dilemma 2, WITNESS Report). The data is needed not just to attest that materials were made by real cameras, but also to enable actual verification of the content: where, when, and what is being filmed. Yet enabling this often means exposing the most sensitive forms of data, which could allow identification of, and pose serious risk to, those who film and upload it.
To resolve this, California’s system only mandates the embedding and sharing of system provenance data which, by definition, does not include any personal data that could lead to the identification of those who created it; geolocation data, device IDs, and other identifying metadata are not directly mandated. But this limited scope calls into question how effectively the system can convey ‘trust’ and ‘authenticity.’ It appears that all it would do, in ideal form, is tell us the most basic information about a piece of content’s ‘capture-point’ (whether it was originally created by a new camera or a large commercial Gen-AI system), which, while useful, is not sufficient to give anyone certainty about whether what they are seeing is real or fake.
Second, regardless of the creation of this provenance system, there will continue to be content (both generated and authentically captured) which lacks sufficient provenance (See generally Dilemmas 1 & 3, WITNESS Report). Local or small-scale AI systems have no mandate to embed disclosures, and content from those on older devices, or from those who choose to opt out, will likewise carry no data for these integrated systems to detect. At the same time, real images can still be misleadingly presented as showing things they do not, perhaps even more easily with an added mark of ‘authenticity.’ Any of these limits the ability of California’s system to accomplish its stated goals.
Also of note is the lack of clarity on privacy in the bill as written. While the intent clearly seems to be to minimize exposure of personal information, the bill does not make explicitly clear whether platforms are still allowed to collect more sensitive provenance data (enabling government subpoena or seizure), nor does it explicitly prohibit them from sharing it anyway.
What’s Next
In light of these concerns, the greatest question over the next few months is how these systems will be implemented and presented to users—whether as check-marks, little content credential boxes, or information icons—and what provenance data platforms believe is sufficient to comply with the act.
One possible form of implementation, given that it has been celebrated by one of the Act’s drafters, is the integration of C2PA Content Credentials on LinkedIn (though anyone who’s tried to use that so far knows just how bare-bones and opaque that implementation is).
In any case, bills like this are coming, and questions of how best to regulate provenance systems are quickly moving from theoretical to real. In the next few weeks, I’ll be taking a peek at other attempts, in NYC and the EU. Would love to hear thoughts, feedback, critiques, and things I missed. Stay Tuned!

