The TEST AI Act of 2025 establishes a pilot program, led by NIST, to develop measurement standards and build testbeds for evaluating the reliability and security of federal Artificial Intelligence systems.
Sponsored by Sen. Ben Luján (NM)
The TEST AI Act of 2025 establishes a pilot program, managed by NIST, to develop reliable measurement standards for evaluating federal Artificial Intelligence systems. This effort requires collaboration between the Departments of Commerce and Energy to create necessary testing facilities, known as "testbeds." A newly formed Working Group will advise on developing standards covering AI reliability, security, and bias, with a final report submitted to Congress detailing findings and future needs.
The TEST AI Act of 2025 is all about getting serious about how the federal government uses Artificial Intelligence. Think of it as the government saying, “Okay, we’re using AI for everything from processing benefits to national security, but we need to stop winging it.” This bill directs the National Institute of Standards and Technology (NIST) to set up a pilot program to develop reliable, measurable standards for evaluating AI systems.
Right now, when the government buys or builds an AI system—say, one that screens job applications or processes loans—there isn't one universal, technical checklist to make sure it’s safe, fair, and actually works as advertised. This bill (SEC. 3) aims to create that checklist. NIST, referred to in the bill as the "Institute," must work with the Department of Energy (DOE) to pool resources, staff, and facilities to build what they call "testbeds." These testbeds are basically dedicated labs or environments where AI systems can be rigorously put through their paces in a repeatable way (SEC. 2).
For the average person, this is critical because it addresses the fear of faulty or biased AI. If a government AI system is going to affect your life—whether deciding your tax refund or flagging you at an airport—it needs to be reliable. The standards developed here must specifically cover reliability, performance, security (including data leakage), privacy, and, crucially, data bias (SEC. 3(d)). This means the government is trying to build a system where an AI tool can’t unfairly penalize certain groups of people because of flaws in its training data.
To guide this massive effort, the bill establishes an Artificial Intelligence Testing Working Group (SEC. 3(c)). This group, which advises the Departments of Commerce and Energy, will include experts from government, industry, and academia. They’re tasked with creating the strategy for developing these measurement standards within a year of the group’s formation. This strategy will be posted publicly, giving developers a clear roadmap of what the government expects.
However, there’s a significant security layer built into the composition of this group: citizens from any “covered foreign country”—specifically China, Russia, North Korea, and Iran—are explicitly barred from being members (SEC. 2, SEC. 3(c)(3)). While this strengthens national security oversight, it does mean the Working Group might miss out on specialized technical expertise if that knowledge resides primarily with individuals from those nations. It’s a clear trade-off between security and technical breadth.
Once the strategy is set, NIST has two years from the law’s enactment to build and demonstrate the actual testbeds (SEC. 3(e)). These aren’t just theoretical models; they are physical or virtual environments where the new standards are proven to work on AI systems used by federal agencies. Think of it like a crash-test facility, but for algorithms. After the first round of demonstrations, NIST must report back to Congress, detailing what worked and what resources it needs to expand the testing to cover all the different ways the government uses AI (SEC. 3(f)). This report is the key to turning the pilot program into a permanent, standardized way of doing business.
Ultimately, the TEST AI Act is about building trust in the tools the government uses. By forcing agencies to collaborate and create transparent, measurable standards for AI reliability and fairness, it aims to reduce the risk of costly mistakes and biased outcomes that affect everyone from small business owners applying for grants to veterans seeking benefits.