Performative AI Governance: 6 Tests From the New CARMA Paper

Q: What are the four governance functions in 'governance as flow'?

The paper reconceives governance not as discrete acts (laws passed, rules promulgated, decisions rendered) but as four continuous functions operating in coordinated relation. Sensing is ongoing observation of capability development, deployment patterns, and incident occurrence, structurally independent from the entities being observed. Evaluation translates sensed information into normative judgments using public, contestable, revisable criteria. Response translates evaluations into protective action through pre-authorized mechanisms with graduated options matched to severity. Learning systematically incorporates experience into the other three functions through mandatory incident analysis and public after-action review. Each function must be structurally separated to prevent capture.

A new paper from the Center for AI Risk Management & Alignment introduces a term every executive setting up AI oversight should know: performative adaptivity. The paper defines it as governance arrangements that look adaptive on paper but lack the structural properties needed for substantive oversight. Authors Kyle A. Kilian (CARMA, RAND) and Richard Mallah (CARMA, Future of Life Institute) argue that this failure mode is more dangerous than acknowledged absence of governance, because it provides cover for inaction while creating an illusion of oversight. The paper proposes six diagnostic tests that separate genuine adaptive governance from its performative simulation. Those tests were written for national policy, but they apply with almost no translation to the AI committee a CEO has already stood up inside the company.

Listen on AI Ready Podcast · 28 min

An episode of AI Ready Podcast with Harrison Painter. Listen wherever you get your podcasts.

Full transcript

Welcome: Why this paper belongs on every CEO's desk this week

Welcome everybody to the AI Ready Podcast, where we are setting the standard for AI readiness. My name is Harrison Painter, and the topic of today is performative AI governance. I'm going to give you six tests from the new CARMA paper.

The paper we're talking about came out on May 20th, which is literally just six days ago from when I'm recording this episode. I personally think that every executive, general counsel, and board chair should have a printed copy of this on their desk by the end of the week. It's called Adaptive Governance for Advanced AI: A Conceptual Foundation for Managing Complex Risks. The two authors are Kyle A. Kilian, who holds positions at the Center for AI Risk Management and Alignment, and at RAND. His co-author is Richard Mallah, who holds positions at the same center and at the Future of Life Institute.

For the sake of time, I'm going to use the acronym CARMA, which is the center that both publish through. That's the Center for AI Risk Management and Alignment.

If you're hearing "CARMA" and assuming this is an academic exercise about international policy, I totally get it. The paper is 67 pages. The argument runs through about page 49 before the reference section. It cites complexity science, commons governance, and antifragility theory. None of that sounds like it belongs on the agenda of your next board meeting. But here's why it does, so stick with me here because this is very, very valuable.

The most practical piece of the paper is a six-test diagnostic for evaluating whether any governance arrangement is real or whether it just looks real. The authors wrote the diagnostic for national policy bodies, statutory agencies, and international cooperation forums. But the diagnostic works without translation on the AI committee, the AI council, or the AI steering body your company already has. You probably set this up sometime in 2024, 2025. These tests do not care whether the body is public or private. They evaluate the same structural properties either way.

Here's what we're going to do in the next 20 minutes. I'll cover what the paper is and why this particular pair of authors writing it should have your full attention. I'll walk you through the load-bearing concept, which the paper calls performative adaptivity. I'll explain the four governance functions that every AI committee needs to operate. I'll walk you through all six diagnostic tests. I'll show you where The 7 Levels of AI Proficiency comes in, because the architecture here only works when the people operating it have the proficiency to run it. Then I'll give you three things to do with this paper by the end of the week.

Who CARMA is and why the institutional pedigree matters

Let's start with who wrote this paper and where they're publishing from, because the credentials are part of why this paper carries so much weight.

Kyle A. Kilian is at RAND. If you're not familiar with RAND, it is a research organization founded in 1948, and they built much of the analytical posture the US government uses for high-stakes technical policy. Nuclear strategy, cybersecurity, pandemic response, you get the picture. RAND researchers spend years studying how institutions handle low-probability, high-consequence events. When a RAND-affiliated researcher writes about how to govern advanced AI, that's coming from a particular tradition of thinking about systems where you cannot afford to be wrong.

The co-author is Richard Mallah, and he's at the Future of Life Institute, or FLI, which was founded in 2014. Its work has centered on long-term technology risk. They publish technical papers, they support research grants, and they have been one of the few institutional voices treating advanced AI as a very serious governance problem long before it became fashionable to do so.

Both of these authors also hold positions at the Center for AI Risk Management and Alignment, also known as CARMA, which is the center the paper publishes through. So you have RAND-tradition rigor and FLI-tradition long-horizon thinking, working through a dedicated AI risk management center. That is the institutional pedigree behind this document, and why we all really need to be looking at this.

The paper's central move is to argue that AI must be governed as a complex adaptive system, not as some stable engineering project. The distinction sounds academic, but it's not. The authors cite complexity science from outside AI, including Edward Lorenz's work on sensitivity to initial conditions. They cite the research on what AI researchers call grokking, a sudden learning dynamic where a model jumps from low performance to high performance after extended training. And they cite the broader observation that large language models can show emergent capabilities at scale thresholds. The structural conclusion is that governance designed for stable, linear systems will systematically miss the dynamics of the system that it's trying to govern.

That's where the term "adaptive governance" comes from. The paper proposes a positive philosophy built on four continuous functions, five operating principles, and six diagnostic tests. The diagnostic tests are what makes this paper operational. They're written for national policy, and they translate without effort to the internal AI committee that you should already be running.

The spine concept: performative adaptivity

This paper's load-bearing claim is found on page two. The authors define a term that I think all of us should absolutely know by name. Here's the direct quote from the paper:

This performative adaptivity is in certain respects more dangerous than acknowledged rigidity, because it provides political cover for inaction while creating an illusion of oversight.

The full sentence in the paper goes on to say that this illusion of oversight discourages the development of real institutional capacity over time. That's the big claim of this paper.

I really want you to take a look at that. Performative adaptivity is governance that looks adaptive. A charter is ratified. A working group is convened. A quarterly review is scheduled. A risk register is updated. Each activity is absolutely real. None of them, individually or in combination, guarantees the organization can detect a new AI risk, at least not in time to respond to it.

So the authors' argument is that this failure mode is worse than openly admitting you have no governance. When a company openly says, "We don't have AI oversight yet," the absence is visible and the build can begin. When a company says, "We have an AI committee," and the committee fails the underlying structural tests, the political and organizational energy that would otherwise build real protective capacity gets consumed by maintaining that appearance.

I see this pattern everywhere inside different companies. An AI committee meets monthly. A policy document is approved. A vendor questionnaire is sent to procurement. The board hears that AI governance is in place. The CEO hears that the committee is functioning. The committee itself hears that it's meeting its charter. And underneath all of that, the protection the board thinks it's getting is just completely missing.

The CARMA authors are very clear about the stakes. The illusion of oversight delays the recognition that real oversight is missing. By the time an incident surfaces the truth, the company has often spent 12 to 24 months congratulating itself on a structure that was never built to handle the question it was supposed to answer.

This is the paper's central contribution. It gives executives a vocabulary to recognize the pattern before an incident does it for them. And it gives boards and audit committees a question they can ask: are we actually governing AI here, or are we performing adaptivity?

The four governance functions every AI committee needs

What does adaptive governance look like? The paper proposes that governance is not a sequence of discrete acts, so not like laws passed, decisions rendered, things like that, but four continuous functions that operate together and must remain structurally separated.

Sensing is the first function. Continuous observation of what's actually being deployed, where, with what authority, and what's happening as a result. The structural requirement is independence from the entities being observed. At the enterprise level, this is the toughest one to get right. Most AI committees rely on vendor self-reports, departmental self-reports, and the summaries from the team that built the system. The CARMA paper says that's not sensing. That's trust dressed up as oversight. Sensing requires independent technical capacity to verify what's being deployed and where.

Shadow AI, which is a term everyone should know, is where AI agents are deployed outside formal IT processes. It is precisely what sensing is supposed to detect, and precisely what self-report-based committees miss.

Evaluation is the second function. Sensing produces information, and evaluation turns that information into normative judgments using explicit criteria. The paper requires the criteria to be public, contestable, and revisable. At the enterprise level, that translates to internal standards that the engineering team, the legal team, the security team, and the affected business units can all see, dispute, and propose changes to. The failure mode the paper warns about is closed-door evaluation, where a single executive or a single committee chair makes the decision on what counts as acceptable risk without any articulated standard that others can challenge.

Response is the third function. Evaluation produces a judgment, but response turns that judgment into some kind of protective action through pre-authorized mechanisms with graduated options matched to the severity. The paper uses an analogy, and I find this really, really clarifying.

Financial market circuit breakers act in milliseconds. Why? Because the legislative process that authorized them took months. The slow democratic work happened upfront so that the fast operational work could happen reflexively whenever that threshold was crossed.

At the enterprise level, this means the AI committee needs pre-authorized authority to pause a deployment, to restrict an integration, or to escalate a finding to the board without having to negotiate that authority in the middle of a crisis. A committee that has to ask permission to act is functioning, really, let's be honest, as an advisor. Response means the authority to act without renegotiating it at the moment of the crisis.

Learning is the fourth function. Experience feeds back into sensing, evaluation, and response. The paper requires mandatory incident analysis and public after-action review. At the enterprise level, that translates to a practice of named incident write-ups, structured root-cause analysis, and visible criteria revision. Learning that happens only inside one person's head fails the test. What lives there is institutional memory that decays the moment that person leaves the role.

The structural requirement that runs across all four is separation. The entity that senses cannot also be the entity that evaluates, responds, and decides what was learned. When one body performs all four functions, a single capture point can distort the entire system. Inside a mid-market company, that may mean separating the AI committee from the AI deployment team, separating the technical review from the legal review, and separating the after-action analysis from both.

The six diagnostic tests

Now we get to the most practical piece of the paper. If you have it in front of you, if you're following along, it is in section 10. Section 10 articulates six criteria that you can apply to any governance arrangement, public or private. The tests don't demand perfection on all six. They demand honest assessment of where the arrangement falls short. I'd ask you to listen to each of these twice. I feel that they are that important. Once with public-sector AI governance in mind, like a state AI initiative or a federal advisory board. Once with the AI committee, council, or working body inside of your own company in mind. So two different lenses to take a look at.

Test number one is independence. Is the body making evaluative judgments structurally independent from the entities being evaluated and from political principals with conflicting interests? The paper's specific question is whether the evaluating entity's continued operation depends on the cooperation of those being evaluated. Inside a company, this asks whether your AI committee can deliver an unfavorable read on the engineering team's preferred deployment without putting its own staffing or budget at risk. If the answer is no, independence is compromised regardless of how personally honest the committee members are.

Test number two is transparency. Are evaluation criteria, methodologies, and decisions visible enough to be scrutinized? The point is not that everything must be public to every employee. The point is that a governance system has to be observable to those whose work it governs. Closed-door committees that produce no documentation fail this test, even when they produce useful internal information.

Test number three is durability. Does the mechanism survive changes in leadership, institutional turnover, and political mood? The CARMA authors offer a sharp concrete test here. Would this governance mechanism continue functioning if the next CEO or board chair were openly hostile to its mission? Inside a company, the equivalent question is whether the AI committee depends on the current CEO's personal support. If a new CEO could dissolve the committee at will with no procedural friction, what the committee actually represents is a permission slip from the current leadership. Durability has not been built into the structure.

Test number four is accountability. Are there defined consequences when governance failures occur? The paper distinguishes activity metrics, things like evaluations conducted and meetings held, from outcome metrics: harms prevented, risks identified before realization, accuracy of assessments. Inside a company, an AI committee that reports its activity to the board but cannot report its outcomes is reporting compliance, not governance. Accountability requires measurable standards and visible consequences when the committee demonstrably fails.

Test number five is authority. Does the governance body have actual power to compel compliance, restrict deployment, or impose costs for non-cooperation? The paper writes that pure information-gathering without enforcement capacity is monitoring, not governance. This is the test most internal AI committees fail. They review, they advise, they escalate. They don't have the authority to halt a deployment over the objection of the business unit that wants it. If the committee can only persuade, it's exercising influence and not authority. That may be the right operating posture for a specific company. The paper's point is the company should know which posture it has chosen.

Test number six is scope adequacy. Does the governance mandate cover the actual risk surface, or only a politically convenient subset? The paper warns that governance focused exclusively on the most visible risk category, often national security at the policy level and often regulatory compliance at the corporate level, may be politically expedient but is categorically incomplete. Inside a company, this asks whether the AI committee's charter covers economic disruption, customer-facing harms, civil rights exposure, cascading security risks, and the full surface of how AI is being used, or whether the charter was drawn narrowly because narrow scope was just easier to ratify.

So that's the diagnostic: independence, transparency, durability, accountability, authority, scope adequacy. Six tests. Score honestly. A pass on three of the six is common and is not a crisis. A pass on five out of six is excellent for an enterprise body stood up in the last two years. The output is a written diagnostic that the executive sponsor can take to the board, and that the redesign can be built against.

The paper notes a useful warning property. When arrangements that score poorly are presented as adequate, the performative adaptivity failure mode is likely in operation. Political and organizational energy spent celebrating inadequate arrangements is energy not spent building real institutional capacity.

Where The 7 Levels of AI Proficiency comes in

A quick disclosure before we go into the next part here. The mapping I'm about to walk you through between the CARMA governance functions and The 7 Levels of AI Proficiency is my applied interpretation. The paper does not make this claim. The authors did not write about The 7 Levels of AI Proficiency. The connection is one I'm proposing because the architecture and the proficiency question are inseparable in practice.

Here's the connection. A structurally well-designed governance body cannot compensate for under-proficient operators. The CARMA paper is about institutional architecture. The companion question, which the paper doesn't address directly, is who's qualified to operate that architecture.

Sensing requires Level 3 of The 7 Levels of AI Proficiency or above, which we tag as Critical Thinker. Sensing without the capacity to read primary signals, to distinguish vendor marketing from technical reality, and to recognize emerging risk patterns is sensing in name only.

Evaluation requires Level 4, which we call the Context Engineer. Evaluation involves translating information into normative judgments under uncertainty, with the technical literacy to push back on engineering claims and the structured thinking to articulate criteria the rest of the organization can dispute and revise.

Response requires Level 4 at minimum, with Level 5, the Design Thinker, preferred. Designing pre-authorized mechanisms that act at the speed of the threat without breaking organizational legitimacy is a design task, not a procedural one.

Learning requires Level 5 or above. Recognizing patterns across incidents and articulating structural causes is the work of someone who can see the system from above its individual cases.

The executive overseeing the whole architecture needs to be Level 6, which is a Systems Integrator. Below Level 6, the executive will tend to react to the most recent crisis rather than to the structural property of the system that produced it.

Before redesigning the governance body using the six tests, I highly recommend that you measure the proficiency of the people who are running it. A committee that's structurally sound but operated by Level 2 staff will fail the same way that a structurally weak committee operated by Level 6 staff will fail. The work is to align the architecture with the proficiency of the people inside of it.

One more thing here worth naming before we get to the action steps. A couple of days after this paper came out, the Vatican publicly presented Pope Leo XIV's first encyclical, Magnifica Humanitas, which I covered in a separate article and a podcast right before this. That document is written from a completely different starting point. It cites Catholic Social Doctrine going back to 1891. It's published by the oldest continuously operating institution in the world. And in paragraph 72, it warns against the concentration of decision-making power over data, algorithms, and digital platforms in the hands of a small number of actors.

The CARMA paper, written by a RAND-affiliated complexity researcher and a Future of Life Institute risk analyst, reaches a converging diagnostic. Concentration of authority, capture of independent oversight, and decision-rights collapse are the central failure modes the paper warns about. When the Vatican and a major secular think tank converge on the same structural finding from completely different starting points, the finding is worth taking seriously. That convergence itself is something that we really need to pay attention to.

Three things to do with this paper this week

Let's get to three action items that you can do with this paper. These are very important. I highly recommend taking this paper, reading it, reading the Pope's encyclical, and then just seeing where they converge.

Number one, run the six-test diagnostic on the AI governance body that you already have. For each test (independence, transparency, durability, accountability, authority, scope adequacy), score the current arrangement honestly. Just be honest about this. Write it down. A pass on three of the six is common and not a crisis. A pass on five of six is excellent for an enterprise body that was stood up within the last 24 months. The written diagnostic is what the executive sponsor takes to the board, and what the redesign is built against. Twenty minutes to score it, two hours to write the diagnostic. It is absolutely worth the time.

Number two, audit sensing and evaluation for self-report dependence. The CARMA paper is explicit that these two functions cannot operate credibly on developer self-reports alone. Inside the company, that translates to vendor self-reports, business unit self-reports, and engineering team self-reports. Identify where the AI committee is relying on summaries it has no independent capacity to verify. Those are the points where sensing has effectively collapsed into trust, and where the institutional build needs to add independent verification.

Number three, measure the proficiency of the people running the governance functions. The 7 Levels of AI Proficiency assessment, you can take that for free. Your team can take that for free. Just visit assess.launchready.ai. It will place each person on a measurable scale in about 10 minutes. Run it on the AI committee chair, the technical evaluator, the legal partner, and the executive sponsor. The pattern that surfaces will explain a great deal about which of the six structural tests the body is currently failing. The fix is rarely about replacing people. The fix is about raising proficiency through training and process so the architecture and the operators are aligned.

Close: the work in front of every leadership team this quarter

One more thing. The CARMA paper supplies something that's been missing from enterprise AI governance conversations until now. A credible, technically grounded, independently authored standard for evaluating whether the body you stood up actually does what you think it does. Most internal AI committees were created in 2024, maybe 2025, in response to a vendor pitch, a board question, or a peer group conversation. They were not designed against a structural standard because the standard simply didn't exist yet. But luckily, now it does.

That's the work in front of every leadership team this quarter. Run the diagnostic. Audit the self-report dependence. Measure the operators. The next governance incident, when it arrives, is going to ask the same six questions that the CARMA paper is asking right now. If you've put that work in, you are going to be in a totally different operating position than the companies that have not done this.

All right, that's it for me today. Thank you so much for listening. I really appreciate it. This has been the AI Ready Podcast, where we are setting the standard for AI readiness. My name is Harrison Painter, and until next time, keep creating, keep innovating, and most importantly, keep up. God bless.

What the new CARMA paper actually says

On May 20, 2026, the Center for AI Risk Management & Alignment (CARMA) published Adaptive Governance for Advanced AI: A Conceptual Foundation for Managing Complex Risks. The two authors are Kyle A. Kilian, who holds positions at CARMA and RAND, and Richard Mallah, who holds positions at CARMA and the Future of Life Institute. The PDF runs 67 pages, with the main argument ending around page 49 before the reference section.

The paper's central move is to argue that AI development and deployment must be governed as a complex adaptive system rather than as a stable engineering project. The authors cite complexity science, including Edward Lorenz's work on sensitivity to initial conditions (the origin of the "butterfly effect" idea), and AI research on sudden learning dynamics such as grokking (Power and colleagues, 2022), alongside the broader observation that large language models can show emergent capabilities at scale thresholds. The structural conclusion is that governance designed for stable, linear systems will systematically miss the dynamics of the system it is trying to govern.

From that base, the authors propose a positive philosophy they call governance as flow: governance reconceived not as a sequence of discrete acts (laws passed, rules promulgated, decisions rendered) but as four continuous functions operating in coordinated relation. They name five principles that any adaptive governance system must possess. They describe a layered defensive architecture spanning technical, infrastructure, institutional, civil society, and international levels. And they conclude with six diagnostic criteria for evaluating whether any specific arrangement constitutes genuine adaptive governance or its performative simulation.

The paper is written for policymakers, statutory bodies, and the international community. The reason it deserves attention from a CEO or general counsel is that the same six tests apply to the internal AI committee, AI council, or AI steering body the company has already stood up. The structural properties the authors require are not specific to public-sector arrangements. They are properties of governance itself.
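For a company applying the six criteria to its own AI committee, the scoring can be kept as a small written record. The sketch below is illustrative and is not taken from the paper. The names `CriterionScore` and `summarize` are hypothetical. The six test names follow the paper's diagnostic (independence, transparency, durability, accountability, authority, scope adequacy), and the pass-count bands follow the rule of thumb from the podcast episode (three of six is common, five of six is excellent). Plain pass/fail scoring is itself a simplifying assumption.

```python
from dataclasses import dataclass

# Test names follow the paper's six diagnostic criteria as summarized in this
# article. Everything else here (pass/fail scoring, labels) is an assumption.
SIX_TESTS = (
    "independence",
    "transparency",
    "durability",
    "accountability",
    "authority",
    "scope adequacy",
)


@dataclass
class CriterionScore:
    name: str
    passed: bool
    evidence: str  # one-sentence written justification from the scorer


def summarize(scores: list[CriterionScore]) -> dict:
    """Return the pass count, the failed tests, and a rough label."""
    if sorted(s.name for s in scores) != sorted(SIX_TESTS):
        raise ValueError("score each of the six tests exactly once")
    failed = [s.name for s in scores if not s.passed]
    passes = len(scores) - len(failed)
    # Bands from the episode's rule of thumb; the wording of the labels is assumed.
    if passes >= 5:
        label = "excellent for a recently formed body"
    elif passes >= 3:
        label = "common, not a crisis"
    else:
        label = "scores poorly"
    return {"passes": passes, "failed": failed, "label": label}
```

The `evidence` field matters more than the count: the written justification per test is what an executive sponsor could take to the board.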

The spine sentence: why performative adaptivity is the real problem

The paper's load-bearing claim sits on page two.

This performative adaptivity is in certain respects more dangerous than acknowledged rigidity, because it provides political cover for inaction while creating an illusion of oversight that discourages the development of robust institutional capacity.

The argument is structural. When a governance arrangement looks adaptive (a working group convened, a charter ratified, a quarterly review scheduled) but lacks the underlying properties of independence, authority, durability, and the rest, the political and organizational energy that would otherwise build real protective capacity gets consumed by maintaining the appearance of governance. The arrangement does not produce protection. It produces a permission structure that delays the recognition that protection is missing.

Inside companies, the same pattern is everywhere. An AI committee meets monthly. A policy document is approved. A vendor questionnaire is sent to procurement. A risk register is updated. Each activity is real. None of them, individually or in combination, ensures that the organization can detect a new AI risk in time to respond. The board hears that AI governance is in place. The CEO hears that the committee is functioning. The committee itself hears that it is meeting its charter. The structural finding the CARMA paper names is that this can all be true and the protection can still be absent.

That is what makes this paper useful, and what makes it worth applying inside the company before it is applied externally by a regulator.

The four governance functions every AI committee needs

The paper reorganizes governance into four continuous functions that must operate together and remain structurally separated. Translated to the enterprise level, they describe what an AI committee or AI council needs to do, week in and week out, to constitute real oversight.

Sensing is continuous observation of capability development, deployment patterns, incident occurrence, and emerging risk surfaces. The structural requirement is independence from the entities being observed. At the enterprise level, this means an internal AI committee cannot rely solely on vendor self-reports, departmental self-reports, or summaries from the team that built the system. There has to be independent technical capacity to verify what is being deployed, where, and with what authority. AI agents deployed outside formal IT processes (a pattern documented in the paper's discussion of shadow AI) are precisely what Sensing is designed to detect, and they are precisely what self-report-based committees miss.
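One way to picture the gap between self-report and independent Sensing is a comparison of two inventories: what teams and vendors say is deployed, and what an independent check actually finds. The sketch below is a minimal illustration under assumed inputs. The function name `compare_inventories` and the example system names are hypothetical, and how the independent inventory is gathered is left open.

```python
def compare_inventories(
    self_reported: set[str], independently_observed: set[str]
) -> dict[str, set[str]]:
    """Compare a self-reported AI inventory against an independently gathered one."""
    return {
        # Running in practice but never declared: candidate shadow AI.
        "unreported": independently_observed - self_reported,
        # Declared but not confirmed by the independent check.
        "unverified": self_reported - independently_observed,
    }


# Hypothetical example data.
gaps = compare_inventories(
    self_reported={"support-chatbot", "contract-summarizer"},
    independently_observed={"support-chatbot", "sales-email-agent"},
)
```

A committee that only ever holds the first set has no way to populate the "unreported" bucket, which is the point the paper makes about self-report-based oversight.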

Evaluation translates sensed information into normative judgments using explicit criteria that are public, contestable, and revisable. At the enterprise level, the equivalent is internal criteria that the engineering team, the legal team, the security team, and the affected business units can all see, dispute, and propose revisions to. Closed-door evaluations that produce conclusions without accountability are the failure mode the paper warns against. Inside the company, that shows up as a single executive or single committee chair determining what counts as acceptable risk, without an articulated standard that others can challenge.

Response translates evaluative judgments into protective action through pre-authorized mechanisms with graduated options matched to severity. The paper's analogy is financial market circuit breakers, which act in milliseconds because the legislative process that authorized them took the time necessary for legitimacy. At the enterprise level, this means the AI committee needs pre-authorized authority to pause a deployment, restrict an integration, or escalate a finding to the board without negotiating the authority at the moment of crisis. A committee that has to ask permission to act is not exercising Response. It is providing advice that the executive may take or decline.
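Pre-authorization with graduated options can be thought of as a lookup table agreed before any incident, so that acting on a finding requires no new approval. The sketch below is an assumption-laden illustration, not the paper's design. The severity tiers and the "log finding" action are invented for the example; pausing a deployment, restricting an integration, and escalating to the board are the actions named in the text.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical four-tier scale; the paper does not prescribe these tiers.
    LOW = 1
    MODERATE = 2
    HIGH = 3
    CRITICAL = 4


# Agreed and signed off in advance, so no negotiation is needed mid-crisis.
PRE_AUTHORIZED_ACTIONS = {
    Severity.LOW: ["log finding"],
    Severity.MODERATE: ["restrict integration"],
    Severity.HIGH: ["pause deployment"],
    Severity.CRITICAL: ["pause deployment", "escalate to board"],
}


def respond(severity: Severity) -> list[str]:
    """Return the actions the committee may take without asking permission."""
    # Return a copy so callers cannot alter the pre-authorized table.
    return list(PRE_AUTHORIZED_ACTIONS[severity])
```

The slow work is deciding what goes in the table. Once it exists, `respond(Severity.HIGH)` is a lookup, which mirrors the circuit-breaker analogy.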

Learning systematically incorporates experience back into Sensing, Evaluation, and Response. The paper requires mandatory incident analysis and public after-action review. At the enterprise level, the requirement translates to an internal practice of named incident write-ups, structured root-cause analysis, and visible criteria revision. Learning that happens only inside one person's head is not Learning. It is institutional memory that decays the moment the person leaves the role.

The structural requirement that runs across all four functions is separation. The entity that senses cannot also be the entity that evaluates, responds, and decides what was learned. When one body performs all four functions, a single capture point can distort the entire flow. The paper's prescription is that the four functions be allocated across different bodies with explicit coordination mechanisms among them. Inside a mid-market company, that may mean separating the AI committee from the AI deployment team, separating the technical review from the legal review, and separating the after-action analysis from both.
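The separation requirement can be checked mechanically once each function has a named owner. The sketch below takes a strict reading, flagging any body that owns more than one of the four functions. The function name capture_points and the example owners are hypothetical, and a given company may reasonably choose a looser threshold.

```python
FUNCTIONS = ("sensing", "evaluation", "response", "learning")


def capture_points(assignments: dict[str, str]) -> dict[str, list[str]]:
    """Return each body that owns more than one function, with the functions it owns."""
    owned: dict[str, list[str]] = {}
    for function in FUNCTIONS:
        # A missing function raises KeyError: every function needs a named owner.
        owned.setdefault(assignments[function], []).append(function)
    return {body: fns for body, fns in owned.items() if len(fns) > 1}


# Hypothetical allocation for a mid-market company.
example = {
    "sensing": "internal audit",
    "evaluation": "AI committee",
    "response": "AI committee",
    "learning": "risk office",
}
print(capture_points(example))
```

In this example the AI committee both evaluates and responds, so it is returned as the one place where capture of a single body would distort two functions at once.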

The five principles that make governance survive shocks

Building on the four functions, the paper names five principles every adaptive governance arrangement needs. Each principle carries a specific failure mode the authors are explicit about.

Collectivity. Diverse stakeholders deliberate together, including voices whose interests conflict with developers and deployers. The failure mode is performative simulation: arrangements that include nominally diverse participants but structure participation such that well-resourced incumbents dominate. Inside a company, this is the AI committee that lists three engineering leaders, one legal partner, and a single token voice from operations or affected functions. The structure looks inclusive. The deliberation reflects engineering and legal interests.

Adaptability with bounded flexibility. Rules and policies can change as the system being governed changes, but the adaptation process itself is stable. The failure modes are ossification (adaptive mechanisms calcify into static rules because political will or institutional capacity for revision is absent) and volatility (constant revision prevents stable operation and allows interested parties to exploit each revision cycle). Inside a company, ossification looks like an AI policy that was written in 2024 and has not been touched since. Volatility looks like a policy rewritten every quarter at the suggestion of whichever vendor is currently most influential.

Modularity. Decomposing the problem into independent parts with defined coordination across module boundaries. The failure mode is fragmentation without coordination, where modules operate independently and produce a governance system that is locally rational but globally blind. Inside a company, this is the AI governance setup where the marketing team has its own AI policy, the engineering team has its own AI policy, the customer service team has its own, and no one is responsible for the cross-cutting risks that emerge at the seams.

Redundancy. Multiple institutional nodes with overlapping authority so single-point failure does not collapse the system. The paper draws an analogy to biological monocultures: a single governance approach implemented through a single body is fragile because any disruption to that specific arrangement (leadership change, capture, resource withdrawal) produces total collapse. Inside a company, this argues for AI governance capacity at the committee level, the board level, the audit committee level, and the risk committee level, with overlapping authority and defined escalation between them.

Antifragility. Systems that grow stronger in response to stress, in the sense developed by Nassim Nicholas Taleb. The paper requires mandatory incident reporting, root-cause analysis that produces structural changes (not merely individual accountability), threshold tightening when failures occur, and public after-action reviews. The failure mode is superficial implementation: governance systems that describe themselves as adaptive because they survived a shock, without evidence that they actually improved from it. Survival is resilience, not antifragility.

The authors are careful on the last point. Not all governance functions can or should be antifragile. Some risks (catastrophic, irreversible harms) require robustness (prevention even under stress) rather than antifragility (improvement through stress). The distinction is part of the prescription, not a hedge against it.

The six diagnostic tests for your AI committee

The most practical piece of the paper is Section 10, where the authors articulate six diagnostic criteria that can be applied to any governance arrangement: legislative, executive, voluntary, international, or private. The criteria do not demand perfection on all six simultaneously. They demand honest assessment of where the arrangement falls short.

Read each test below twice. On the first read, hold the test against a current example of public-sector AI governance. On the second read, hold the test against the AI committee, AI council, AI working group, or AI steering body currently inside your own company. Most internal arrangements were stood up in 2024 or 2025 with speed in mind. They were not designed against this kind of structural standard.

Test 1: Independence. Is the body making evaluative judgments structurally independent from the entities being evaluated and from political principals with conflicting interests? The paper's specific test is whether the evaluating entity's continued operation depends on the cooperation of those being evaluated. Inside a company, this asks whether the AI committee can deliver an unfavorable read on the engineering team's preferred deployment without putting its own staffing or budget at risk. If the answer is no, Independence is compromised regardless of the personal integrity of the committee members.

Test 2: Transparency. Are evaluation criteria, methodologies, results, and decisions visible enough to be scrutinized? At the enterprise level, the question is whether the AI committee's standards and decisions are written down, available to the broader organization, and subject to internal challenge. Closed-door committees that produce no documentation fail this test even when they produce useful internal information. The point is not that everything must be public to all employees. The point is that the governance system must be observable to those whose work it governs.

Test 3: Durability. Does the mechanism survive changes in leadership, institutional turnover, and political mood? The paper offers a sharp concrete test: would this governance mechanism continue functioning if the next leadership transition brought a CEO or board chair ideologically hostile to its mission? Inside a company, the equivalent is whether the AI committee depends on the current CEO's personal support. If a new CEO could dissolve the committee at will, with no procedural friction, the committee is not durable. It is a permission slip from the current leadership.

Test 4: Accountability. Are there defined consequences when governance failures occur? Can decisions be challenged by affected parties? Are there feedback loops that surface failures rather than hiding them? The paper distinguishes activity metrics (evaluations conducted, meetings held) from outcome metrics (harms prevented, risks identified before realization, accuracy of assessments). Inside a company, an AI committee that reports its activity to the board but cannot report its outcomes is reporting compliance, not governance. Accountability requires the existence of measurable standards and visible consequences when the committee demonstrably fails.

Test 5: Authority. Does the governance body have actual power to compel compliance, restrict deployment, or impose costs for non-cooperation? The paper writes that pure information-gathering without enforcement capacity is monitoring, not governance. Inside a company, this is the test most internal AI committees fail. They review. They advise. They escalate. They do not have the authority to halt a deployment over the objection of the business unit that wants it. If the committee can only persuade, it is exercising influence, not authority. That may be the right operating posture for a specific company. The paper's point is that the company should know which posture it has chosen.

Test 6: Scope Adequacy. Does the governance mandate cover the actual risk surface, or only a politically convenient subset? The paper warns that governance focused exclusively on the most visible risk category (often national security at the policy level, often regulatory compliance at the corporate level) may be politically expedient but is categorically incomplete. Inside a company, this asks whether the AI committee's charter covers economic disruption, customer-facing harms, democratic process integrity, civil rights exposure, cascading security risks, and the full surface of how AI is being used, or whether the charter was drawn narrowly because narrow scope was easier to ratify.

The CARMA paper proposes six structural tests for separating genuine adaptive governance from its performative simulation: Independence, Transparency, Durability, Accountability, Authority, and Scope Adequacy. The authors are explicit that arrangements failing multiple tests should be identified as such and treated as governance shortfalls requiring remedy, not celebrated as progress.

Source: Kilian and Mallah, Adaptive Governance for Advanced AI, CARMA, May 20, 2026, Section 10.

The paper notes a useful warning property of the tests. When arrangements that score poorly are presented as adequate, the performative-adaptivity failure mode is likely in operation. Political and organizational energy spent celebrating inadequate arrangements is energy not spent building real institutional capacity. The same dynamic applies inside companies. A board that hears the AI committee is working, when the committee fails four of the six tests, is being given an illusion of oversight that delays the build of the real thing.

Where this applies in The 7 Levels of AI Proficiency

A structurally well-designed governance body cannot compensate for under-proficient operators. The CARMA paper is about institutional architecture. The companion question, which the paper does not address directly, is who is qualified to operate that architecture.

The CARMA paper does not map these functions to The 7 Levels of AI Proficiency. That mapping is my applied interpretation for enterprise teams.

The 7 Levels of AI Proficiency framework supplies the missing piece. Each governance function in the CARMA model implies a minimum proficiency level for the person running it.

Sensing requires Level 3 or above (Lieutenant in the framework, a Critical Thinker about AI). Sensing without the capacity to read primary signals, distinguish vendor marketing from technical reality, and recognize emerging risk patterns is sensing in name only. Someone at Level 1 or Level 2 cannot perform the function with credibility. They will rely on vendor self-reports because that is the only signal they know how to read.

Evaluation requires Level 4 or above (Commander, a Context Engineer). Evaluation involves translating sensed information into normative judgments under uncertainty, with the technical literacy to push back on engineering claims and the structured thinking to articulate criteria the rest of the organization can dispute and revise. Below Level 4, the evaluation cannot be defended on its merits, and the function falls back to political accommodation.

Response requires Level 4 at minimum, with Level 5 (Captain, Design Thinker) preferred. Response involves designing pre-authorized mechanisms that act at the speed of the threat without breaking democratic or organizational legitimacy. That is a design task, not a procedural one.

Learning requires Level 5 or above. Learning involves recognizing patterns across incidents, articulating structural causes, and proposing changes to the criteria themselves. That is the work of someone who can see the system from above its individual cases.

The executive overseeing the system as a whole needs Level 6 (Admiral, a Systems Integrator). Systems Integrator behavior is what makes the governance architecture survive shocks. Below Level 6, the executive will tend to react to the most recent crisis rather than to the structural property of the system that produced the crisis.
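The minimums described above can be written as a small lookup and compared against the people currently in each role. This is a sketch of my applied mapping, not anything from the CARMA paper. The names MIN_LEVEL and proficiency_gaps are hypothetical, and Response uses the Level 4 floor, not the preferred Level 5.

```python
# Minimum level per function, as stated in the text above.
MIN_LEVEL = {
    "sensing": 3,
    "evaluation": 4,
    "response": 4,  # Level 5 is preferred; 4 is the stated minimum
    "learning": 5,
    "executive oversight": 6,
}


def proficiency_gaps(staffing: dict[str, int]) -> dict[str, int]:
    """Return, per function, how many levels the operator falls short of the minimum."""
    return {
        function: MIN_LEVEL[function] - level
        for function, level in staffing.items()
        if level < MIN_LEVEL[function]
    }


# Hypothetical assessed levels for the people running each function.
current = {
    "sensing": 2,
    "evaluation": 4,
    "response": 4,
    "learning": 3,
    "executive oversight": 6,
}
print(proficiency_gaps(current))
```

Here Sensing is one level short and Learning is two levels short, which tells the sponsor where training is needed before any structural redesign.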

Before redesigning the governance body using the six tests, measure the proficiency of the people running it. A committee that is structurally sound but operated by Level 2 staff will fail, just as a structurally weak committee operated by Level 6 staff will fail. They fail in different directions and at different speeds, but both fail. The work is to align the architecture with the proficiency of the people inside it.

That is also what makes a measurement instrument central rather than peripheral. The AI Law Tracker that LaunchReady built and maintains is one operating example of the Sensing function applied to the regulatory environment, independent of vendor self-reports and political messaging. The 7 Levels of AI Proficiency assessment is one operating example of the Evaluation function applied to the people who run governance. Both exist because the structural properties the CARMA paper names cannot be assumed into existence. They have to be built and operated by named entities with verifiable independence.

How this applies to Indiana and the IN AI Initiative

Indiana sits inside the picture the CARMA paper draws.

Indiana's 2026 AI-related legislation included HB 1182, a digital sexual image abuse bill that was introduced and referred to the House Committee on Courts and Criminal Code, but does not appear to have advanced further this session. The broader Indiana AI policy posture is at the early stage of an adaptive governance build. The structures currently in play follow patterns the paper identifies as performative-adaptivity risk: working groups, advisory bodies, voluntary frameworks, and industry partnerships that lack the statutory authority to compel changes. The IN AI Initiative announced in spring 2026 follows a similar pattern of multi-stakeholder convening without independent enforcement capacity. These arrangements are not bad. They are early. The paper's point is that the appearance of governance, when the underlying structural properties are absent, consumes political space the durable institutional build needs.

For Indiana businesses, the read is more direct. Whatever the state's external posture, the six diagnostic tests apply to the internal AI committee, AI council, or steering body the business has stood up. Most of these were created in response to a vendor pitch, a board question, or a peer-group conversation. They were not designed against a structural standard. The CARMA paper supplies the standard. Running the diagnostic surfaces which tests the body passes and which it fails. The work after that is design, not paperwork.

For Indiana-based regulated industries (financial services, healthcare, insurance, legal services, K-12 and higher education), the same tests apply with industry-specific weighting. Regulators in each of these sectors will, over the next eighteen to thirty-six months, develop their own diagnostic posture. The structural properties the CARMA paper names are likely to overlap with what regulators, auditors, and boards increasingly ask for: independence, documentation, authority, accountability, and scope coverage. Companies that build to the standard before the audit arrives will be in a different operating position than companies that build in response to the first enforcement action.

Three things to do with this paper this week

The CARMA paper supplies a credible, technically grounded, independently authored standard for evaluating AI governance arrangements. Here is how to use it inside the company.

Run the six-test diagnostic on the AI governance body you already have.

For each of the six tests (Independence, Transparency, Durability, Accountability, Authority, Scope Adequacy), score the current body honestly. A pass on three of six is common and not a crisis. A pass on five of six is excellent for an enterprise body stood up in the last twenty-four months. The output is a written diagnostic that the executive sponsor can take to the board and that the redesign can be built against.
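A minimal scorecard for this step might look like the sketch below. The pass or fail judgment on each test is still a human call made against the paper's criteria. The code only forces every test to be scored and reports which ones failed. The function name diagnose and the example results are hypothetical.

```python
TESTS = (
    "Independence",
    "Transparency",
    "Durability",
    "Accountability",
    "Authority",
    "Scope Adequacy",
)


def diagnose(results: dict[str, bool]) -> tuple[int, list[str]]:
    """Return the number of tests passed and the names of the tests failed."""
    missing = [test for test in TESTS if test not in results]
    if missing:
        # Refuse to report a score until all six tests have been assessed.
        raise ValueError(f"unscored tests: {missing}")
    failed = [test for test in TESTS if not results[test]]
    return len(TESTS) - len(failed), failed


# Hypothetical self-assessment of an internal AI committee.
committee = {
    "Independence": True,
    "Transparency": True,
    "Durability": False,
    "Accountability": False,
    "Authority": False,
    "Scope Adequacy": True,
}
passed, failed = diagnose(committee)
print(f"{passed} of {len(TESTS)} passed; failed: {', '.join(failed)}")
```

Refusing to score an incomplete assessment is deliberate. A diagnostic that silently skips the uncomfortable tests would be its own small instance of performative adaptivity.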

Audit Sensing and Evaluation for self-report dependence.

The paper is explicit that Sensing and Evaluation cannot operate credibly on developer self-reports alone. Inside the company, this translates to vendor self-reports, business-unit self-reports, and engineering-team self-reports. Identify where the AI committee is relying on summaries it has no independent capacity to verify. Those are the points where the CARMA standard says Sensing has effectively collapsed into trust, and where the institutional build needs to add independent verification.
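The audit can start as a simple inventory: for each AI system, list the evidence the committee relies on and whether anyone independent of the reporter has verified it. The sketch below is an assumption-laden illustration. The Evidence record, its fields, and the example entries are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    system: str                   # the AI system the evidence describes
    source: str                   # who produced the report
    independently_verified: bool  # checked by someone other than the source


def self_report_dependence(evidence: list[Evidence]) -> list[str]:
    """Return systems for which no piece of evidence has been independently verified."""
    all_systems = {item.system for item in evidence}
    verified = {item.system for item in evidence if item.independently_verified}
    return sorted(all_systems - verified)


# Hypothetical inventory of what the committee currently relies on.
inventory = [
    Evidence("support chatbot", "vendor summary", False),
    Evidence("support chatbot", "business-unit report", False),
    Evidence("pricing model", "engineering report", False),
    Evidence("pricing model", "internal audit test", True),
]
print(self_report_dependence(inventory))
```

In this inventory only the support chatbot is returned, because every signal about it comes from a party with an interest in the deployment.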

Measure the proficiency of the people running the governance functions.

The 7 Levels of AI Proficiency assessment at assess.launchready.ai places each person on a measurable scale in about ten minutes. Run it on the AI committee chair, the technical evaluator, the legal partner, and the executive sponsor. The pattern that surfaces will explain a great deal about which of the six structural tests the body is currently failing. The fix is rarely about replacing people. The fix is about raising proficiency through training and process so the architecture and the operators are aligned.

Frequently Asked Questions

What is the CARMA Adaptive Governance paper?

On May 20, 2026, Kyle A. Kilian (CARMA, RAND) and Richard Mallah (CARMA, Future of Life Institute) published Adaptive Governance for Advanced AI: A Conceptual Foundation for Managing Complex Risks through the Center for AI Risk Management & Alignment. The paper argues that AI must be governed as a complex adaptive system and proposes a positive philosophy of governance built on five principles (collectivity, adaptability, modularity, redundancy, antifragility), four continuous functions (sensing, evaluation, response, learning), and six diagnostic criteria (independence, transparency, durability, accountability, authority, scope adequacy) that separate genuine adaptive governance from its performative simulation.

What does "performative adaptivity" mean and why does it apply to enterprise AI governance?

The paper defines performative adaptivity as governance arrangements that appear adaptive but lack the structural properties needed for substantive oversight. The named examples at the policy level are voluntary pre-deployment evaluations conducted behind closed doors, emergency working groups convened without statutory authority, and industry-government partnerships that lack independence or enforcement capacity. The same pattern shows up inside companies: an AI committee that meets monthly, reviews vendor self-reports, has no authority to compel changes, and produces no public record. The paper's argument is that this is more dangerous than acknowledged absence of governance because it provides political cover for inaction while creating an illusion of oversight.

What are the six diagnostic tests for genuine adaptive AI governance?

The CARMA paper proposes six criteria for evaluating any governance arrangement. Independence (is the evaluating entity structurally independent from those being evaluated). Transparency (are evaluation criteria, methodologies, and results public). Durability (does the mechanism survive changes in leadership and political mood). Accountability (are there defined consequences when failures occur). Authority (does the entity have actual power to compel compliance or restrict deployment). Scope Adequacy (does the mandate cover the full risk surface, not only a politically convenient subset). Any arrangement, including an internal AI committee or AI council, can be evaluated against these six tests.

What are the four governance functions in "governance as flow"?

The paper reconceives governance not as discrete acts but as four continuous functions operating in coordinated relation. Sensing is ongoing observation of capability development, deployment patterns, and incident occurrence, structurally independent from the entities being observed. Evaluation translates sensed information into normative judgments using public, contestable, revisable criteria. Response translates evaluations into protective action through pre-authorized mechanisms with graduated options matched to severity. Learning systematically incorporates experience into the other three functions through mandatory incident analysis and public after-action review. Each function must be structurally separated to prevent capture.

What are the five principles of adaptive AI governance?

Collectivity (diverse stakeholders deliberate together, including voices whose interests conflict with developers and deployers). Adaptability (bounded flexibility, where the adaptation mechanism itself is stable even as outputs change). Modularity (decomposing problems into manageable parts with defined coordination across module boundaries). Redundancy (multiple institutional nodes with overlapping authority so single-point failure does not collapse the system). Antifragility (systems that grow stronger in response to stress, in the Taleb sense, through mandatory incident reporting, root-cause analysis, and binding commitments to address surfaced weaknesses). Each principle has a named failure mode the paper warns about.

How does this apply to my company's internal AI committee or AI council?

Run the six-test diagnostic on the body you already have. Most internal AI committees were stood up quickly in 2024 or 2025 and were never designed to pass these tests. They often rely on vendor self-reports (an Independence failure that begins in the Sensing function), produce no documentation visible inside the organization (Transparency failure), depend on the goodwill of current leadership (Durability failure), have no defined consequences for missed escalations (Accountability failure), can only advise rather than compel (Authority failure), and were scoped to a politically tractable subset of risk (Scope Adequacy failure). The diagnostic is the first step. The second is to redesign the body to pass the tests it currently fails.

What does this have to do with The 7 Levels of AI Proficiency?

Genuine adaptive governance is operated by people, and those people need a level of AI proficiency commensurate with the function they run. Sensing requires Level 3 (Lieutenant) or above and Evaluation requires Level 4 (Commander) or above, so neither can be performed credibly by someone at Level 1 or Level 2 of The 7 Levels of AI Proficiency. Response requires Level 4 at minimum, with Level 5 (Captain) preferred, because it involves designing pre-authorized mechanisms. Learning requires Level 5 or above, because it involves recognizing structural patterns across incidents. The executive overseeing the system needs Level 6 (Admiral) proficiency, because Systems Integrator behavior is what makes the architecture survive shocks. Before redesigning the governance body, measure the proficiency of the people running it. A structurally well-designed committee cannot compensate for Level 2 operators.

How does this apply to Indiana businesses and Indiana's AI legislation specifically?

Indiana's 2026 AI-related legislation included HB 1182, a digital sexual image abuse bill that was referred to the House Committee on Courts and Criminal Code without advancing further this session. The broader Indiana AI policy posture, including the IN AI Initiative announced in spring 2026, follows patterns the CARMA paper warns about: voluntary or advisory structures, working groups without enforcement capacity, and reliance on industry partnership in place of independent monitoring. The paper does not single Indiana out. It describes a category of arrangement that scores poorly on Authority, Scope Adequacy, and Durability. Indiana businesses should read this in two directions. First, public-sector AI governance in the state is at the early stage of an adaptive build. Second, the same diagnostic that the paper applies to government arrangements applies directly to the internal AI committee or council that an Indiana business has stood up. The structural tests do not care whether the body is public or private.

Harrison Painter

Executive AI Advisor. Founder, LaunchReady.ai and AI Law Tracker.

Harrison is an Indiana AI Advisor who helps business owners and executives get their time back by building AI systems that run the work for them. Nearly 20 years in business and author of You Have Already Been Replaced by AI. Creator of The 7 Levels of AI Proficiency.
