Cloudflare Confidence Scorecard: Making AI Safer for the Internet

Security and IT teams face an impossible balancing act. Employees adopt AI tools every day, but each tool carries unique risks related to compliance, data privacy, and security practices. Employees who use these tools without prior approval create a new form of shadow IT known as shadow AI. Preventing shadow AI means manually reviewing each AI application to decide whether to approve it. This is not scalable. And blanket bans on AI applications only drive AI usage deeper underground, making it harder to secure.

So today we are launching Cloudflare Application Confidence Scorecards, part of our new suite of AI security features within the Cloudflare One SASE platform. These scores bring scale and automation to the laborious, time-consuming task of evaluating generative AI and SaaS applications one by one. Instead of spending hours hunting for an AI application's compliance certifications or data processing practices, evaluators get a clear score that reflects the application's safety and trustworthiness. That signal lets decision makers confidently set policies, apply guardrails where needed, and block risky tools, so organizations can embrace innovation without compromising security.

The Cloudflare Application Confidence Scorecard evaluates AI and SaaS applications on many factors, including whether they hold industry-recognized certifications, whether they follow specific data management and security practices, and how mature the company is. The generative AI confidence score, among other considerations, accounts for testing of bias, ethics, and safety, and awards higher scores to AI models that publish system cards and do not train on user input. We hope that this focus on privacy, security, and safety will help promote safer AI for everyone.

The rapid rise of shadow AI

Over the past decade, SaaS adoption has reshaped the enterprise. Employees can pick up new tools in minutes with nothing more than a credit card or a free trial link. Now, with the growth of generative AI, entire workflows are moving outside corporate oversight. From writing assistants to image generators, employees rely on these tools every day, often without knowing whether they comply with company or regulatory requirements.

The risks of these tools are broad. Sensitive data can be stored or transmitted outside of company control. Tools may lack certifications such as SOC 2 or ISO 27001. Many providers retain user data indefinitely or use it to train external models. Others face financial or operational instability that can disrupt the business if they go bankrupt or suffer a breach. Models can introduce compliance risks and generate biased output that leads to flawed business decisions. Security leaders tell us they can't keep up with auditing every new application.

Scoring them for you, at scale

To make this work, we needed two things: a rubric for assessing AI and SaaS applications, and a mechanism for scoring all of these applications at scale. This is how we did it.

The Application Posture Score (5 points) evaluates SaaS providers across five major categories.

Security and Privacy Compliance (1.2 points): Credit for SOC 2 and ISO 27001 certifications, which signal operational maturity.
Data Management Practices (1 point): How long data is retained and whether the provider shares it with third parties. Shorter retention and no sharing earn the best marks.
Security Controls (1 point): Support for MFA, SSO, TLS 1.3, role-based access, and session monitoring. These are table stakes for modern SaaS security.
Security Reports and Incident History (1 point): Availability of a trust or security page, bug bounty programs, and transparency in incident response. A recent material breach results in a full deduction.
Financial Stability (0.8 points): Public companies and heavily capitalized providers score highest, while startups with little funding or in financial distress score lower.

The Gen-AI Posture Score (5 points) assesses AI-specific risks.

Compliance (1 point): Presence of ISO 42001 certification for AI management systems.
Deployment Security Model (1 point): Whether access is authenticated, rate-limited or left public.
System Card (1 point): Publication of a model or system card that documents safety, bias, and risk assessments.
Data Governance Training (2 points): Whether user data is explicitly excluded from model training, or whether controls are available that let users opt in or out of training on their data.

Together, these scores give you a transparent view of how much confidence you can place in a provider.
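Each rubric amounts to a weighted sum over its categories. The sketch below illustrates that arithmetic; it is not Cloudflare's implementation. The maximum points per category come from the rubrics, while the dictionary keys, the function name `posture_score`, and the idea of expressing partial credit as a fraction between 0 and 1 are assumptions made for the example.

```python
# Maximum points per category, as listed in the two rubrics.
# The dictionary keys are hypothetical identifiers chosen for this sketch.
APPLICATION_POSTURE_WEIGHTS = {
    "security_privacy_compliance": 1.2,
    "data_management": 1.0,
    "security_controls": 1.0,
    "security_reports_incident_history": 1.0,
    "financial_stability": 0.8,
}

GEN_AI_POSTURE_WEIGHTS = {
    "compliance": 1.0,
    "deployment_security_model": 1.0,
    "system_card": 1.0,
    "data_governance_training": 2.0,
}


def posture_score(weights: dict[str, float], earned: dict[str, float]) -> float:
    """Return a score out of 5 from per-category credit.

    `earned` maps a category to the fraction of its points awarded
    (0.0 = no credit, 1.0 = full credit). Categories missing from
    `earned` get no credit. The fractional model is an assumption;
    the rubric does not say how partial credit is assigned.
    """
    unknown = set(earned) - set(weights)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")

    total = 0.0
    for category, max_points in weights.items():
        fraction = earned.get(category, 0.0)
        if not 0.0 <= fraction <= 1.0:
            raise ValueError(f"{category}: fraction must be between 0 and 1")
        total += max_points * fraction
    return round(total, 2)
```

For instance, a hypothetical provider with full compliance credit, half credit for data management, full credit for security controls, a full deduction for a recent breach, and half credit for financial stability would be scored with `posture_score(APPLICATION_POSTURE_WEIGHTS, {...})` using fractions 1.0, 0.5, 1.0, 0.0, and 0.5.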

Of course, keeping up with every new AI and SaaS tool by hand is not scalable either. Our team quickly realized that we had the same problem: AI applications are launching so fast that keeping pace manually would take a large number of people.

We knew we needed a way to do this automatically, so we designed infrastructure that crawls the Internet and answers the rubric's questions at scale. We built a system that scrapes public trust centers, privacy policies, security pages, and compliance documents. Large language models parse these documents to identify the relevant answers, and we hardened the process against hallucinations by requiring source validation and structured extraction.
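A minimal sketch of what source validation can look like, assuming the model is asked to return each answer as a structured record that includes a verbatim quote and the URL it came from. The class `ExtractedAnswer`, the function `is_grounded`, and the record fields are hypothetical names for this illustration; the post does not describe the actual pipeline at this level of detail.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedAnswer:
    """Structured output expected from the model (hypothetical schema)."""
    question: str        # rubric question, e.g. "Is the vendor SOC 2 certified?"
    answer: str          # the model's answer
    evidence_quote: str  # verbatim text the model claims supports the answer
    source_url: str      # page the quote was supposedly taken from


def _normalize(text: str) -> str:
    # Collapse whitespace and lowercase so line wrapping does not cause misses.
    return re.sub(r"\s+", " ", text).strip().lower()


def is_grounded(answer: ExtractedAnswer, crawled_pages: dict[str, str]) -> bool:
    """Accept an answer only if its quote really appears in the cited page.

    `crawled_pages` maps URL -> page text fetched by the crawler.
    An answer citing an unknown page, or quoting text that is not on
    the page, is treated as a possible hallucination and rejected.
    """
    page_text = crawled_pages.get(answer.source_url)
    if page_text is None:
        return False
    quote = _normalize(answer.evidence_quote)
    return bool(quote) and quote in _normalize(page_text)
```

Answers that fail this check would be dropped or routed to a human reviewer rather than scored automatically.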

All scores generated by the automation are reviewed and audited by Cloudflare analysts before they are published in the Application Library. This combination of automated crawling and extraction with human validation ensures that the scores are comprehensive and reliable.

Making the scores easy to act on

The confidence score is built directly into the Application Library and is actionable from day one. Click a score in the Cloudflare dashboard to see a detailed breakdown of how the application performed across each dimension of the rubric. As vendors improve their security and compliance, scores are updated, so you get a live view rather than a static report.

This approach makes life easier for every stakeholder. IT and security teams can spot high-risk tools at a glance. Procurement and governance, risk, and compliance teams can accelerate vendor reviews. Developers and employees can make smarter choices without waiting weeks for approval.

And it's getting even better

Visibility is just the beginning. Soon, these scores will drive enforcement across your Cloudflare One environment. You will be able to use Gateway to block low-scoring apps or warn employees about them, or to tie DLP policies directly to confidence scores. That way, untrusted AI and SaaS providers do not become a backdoor for sensitive information.
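To make the idea concrete, here is a sketch of score-based enforcement logic. It is not a Gateway API or policy syntax: the `Action` enum, the function `decide_action`, and both threshold values are hypothetical, chosen only to show how a score on the 5-point scale could map to allow, warn, or block.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    WARN = "warn"
    BLOCK = "block"


def decide_action(
    score: float,
    block_below: float = 2.0,  # hypothetical threshold, not a product default
    warn_below: float = 3.5,   # hypothetical threshold, not a product default
) -> Action:
    """Map a confidence score (0 to 5) to an enforcement action."""
    if not 0.0 <= score <= 5.0:
        raise ValueError("score must be between 0 and 5")
    if block_below > warn_below:
        raise ValueError("block_below must not exceed warn_below")

    if score < block_below:
        return Action.BLOCK  # low-scoring app: stop traffic outright
    if score < warn_below:
        return Action.WARN   # middling score: let the user through with a warning
    return Action.ALLOW
```

Keeping the thresholds as parameters reflects that each organization would likely pick its own risk tolerance.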

By embedding scores in both visibility and enforcement, we are turning them into a tool for keeping your corporate environment safer.

Are you interested in these scores?

Cloudflare Application Confidence Scorecards are now live in the Application Library. Explore them today in the Cloudflare dashboard, use them to evaluate the tools your team depends on, and soon enforce policies across the Cloudflare Zero Trust platform.

This is another step in our mission to make the Internet safer, faster and more reliable, not just for networks, but for applications and AI tools that empower modern work.

If you are a Cloudflare customer, explore the confidence scores in the Application Library and tell us what you think. And if you're not, don't worry: application scores are freely available to all users, including those on the free plan. You can get started simply by creating a free account and looking at these scores for yourself.

Finally, if you want to test new features or share insights related to AI security, we would love to hear from you. Express your interest in participating in our user research program.
