Imagine opening your security dashboard and finding 10,000 alerts. Which should you check first?
In 2024, GitGuardian detected 23.7 million new hardcoded secrets on public GitHub, a 25% spike. Of these, 58% are “common” secrets (passwords, database credentials, API keys) that traditional rules-based systems miss. Secrets appear in 31% of data breaches, take 292 days to remediate, and 70% of the secrets leaked in 2022 are still exploitable today.
GitGuardian's machine learning automatically ranks incidents by risk, converting an overwhelming flood of alerts into an actionable, prioritized queue. Our ML models examine the context of each incident and calculate a risk score, surfacing the most dangerous leaks first.
💡 Impact: Incident reviews are 3x faster
Our ML model increased security team review efficiency by 3x. Analysts discover nearly three times more critical threats when reviewing the same number of top-ranked incidents compared to traditional severity rules.
Building the foundation: data, features, and expertise
Teaching machines what “dangerous” means
Our ranking model uses supervised learning: it is trained on thousands of incidents that were manually labeled by cybersecurity experts across five severity levels: Informational, Low, Medium, High, and Critical.
Understanding severity in context: Not all secrets are created equal. Consider the following real-world examples.
Critical severity:
- AWS access key with the `AdministratorAccess` policy found in a public GitHub repository
- Production database credentials hardcoded in a Docker image on the main branch
- Stripe API key with full payment processing privileges exposed in client-side code
Low severity:
- Test API keys in your development sandbox without production access
- Expired credentials for deprecated services
- Example passwords in documentation (e.g. `password123` used for illustration)
The difference is blast radius and exploitability. We trained on repositories from the Good Samaritan program, where experts assessed each incident in context and focused on common secrets, the fastest-growing leak category.
What the model “sees”: Rich contextual features
We never feed the actual secret value to the model. Instead, it uses rich metadata such as location (GitHub, GitLab, Slack), file type, branch (main vs. development), accessibility, secret type, age, number of occurrences, and more.
The model also incorporates signals from two ML modules:
- Secret enricher: classifies common secrets by examining code context.
- False positive remover: filters harmless strings and reduces false positives by 80%.
This gives the model a 360-degree view of how exploitable each incident is.
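To make the "metadata only" idea concrete, here is a minimal sketch of turning an incident into model input. The `Incident` class, its field names, and `to_features` are hypothetical and chosen for illustration; they are not GitGuardian's actual schema. The one property the sketch is meant to show is that the raw secret value never reaches the feature set.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    """Hypothetical incident record; field names are assumptions."""
    secret_value: str   # raw secret, kept out of the model input
    source: str         # e.g. "github", "gitlab", "slack"
    file_type: str      # e.g. "py", "env", "md"
    branch: str         # e.g. "main", "development"
    is_public: bool     # accessibility of the location
    secret_type: str    # e.g. "aws_key", "password"
    age_days: int       # days since the secret first appeared
    occurrences: int    # number of places the secret was found


def to_features(incident: Incident) -> dict:
    """Build the model input from metadata only; the secret value is omitted."""
    return {
        "source": incident.source,
        "file_type": incident.file_type,
        "on_main_branch": int(incident.branch in ("main", "master")),
        "is_public": int(incident.is_public),
        "secret_type": incident.secret_type,
        "age_days": incident.age_days,
        "occurrences": incident.occurrences,
    }


incident = Incident(
    secret_value="EXAMPLE-NOT-A-REAL-KEY",
    source="github",
    file_type="env",
    branch="main",
    is_public=True,
    secret_type="aws_key",
    age_days=3,
    occurrences=2,
)
features = to_features(incident)
print(features)
```

Categorical fields such as `source` and `secret_type` would still need encoding before training; that step is left out here.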
Under the hood: Why choose XGBoost?
Why use XGBoost?
We chose XGBoost (eXtreme Gradient Boosting), an ensemble of hundreds of decision trees that learn from each other's mistakes. Why?
- Speed: Millisecond predictions for thousands of incidents
- Efficiency: Optimized for tabular security data
- Interpretability: Feature importance scores show which factors most affect risk (secret type, location, validity), building confidence for your security team.
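The phrase "trees that learn from each other's mistakes" describes gradient boosting: each new tree is fitted to the errors left by the trees before it. The sketch below is a toy, pure-Python version of that idea using one-split trees (stumps) on a single numeric feature. It is not XGBoost, which adds regularization, second-order gradients, and many optimizations, and the data (occurrence counts mapped to severity 0 to 4) is invented for illustration.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that minimizes squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue  # a split must put data on both sides
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_boosted(xs, ys, n_trees=50, learning_rate=0.3):
    """Each stump is fitted to the residuals of the ensemble built so far."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(preds, ys):
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


# Invented toy data: number of occurrences vs. severity (0 = informational, 4 = critical).
occurrences = [1, 2, 3, 10, 12, 15]
severity = [0, 0, 1, 3, 4, 4]

model = fit_boosted(occurrences, severity)
mean_severity = sum(severity) / len(severity)
baseline_error = mse([mean_severity] * len(severity), severity)
boosted_error = mse([predict(model, x) for x in occurrences], severity)
print(baseline_error, boosted_error)
```

On real tabular data, a library such as XGBoost would replace this loop and handle many features at once.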
Human-in-the-loop refinement
We implemented a feedback loop with security analysts. When a misranking occurred, analysts flagged it for iterative retraining. This ensures that the model reflects real-world security expertise, not just statistical patterns. We also tuned it for SecOps workflows, prioritizing the quality of the top-ranked incidents over raw accuracy.
Measuring success: beyond simple accuracy
Why “accuracy” fails
Imagine two models that are both 90% accurate.
❌ Model A
Correctly identifies:
- 9 out of 10 low severity incidents
Misses:
- Critical incidents
Result: false sense of security
✓ Model B
Correctly identifies:
- Critical incidents
Misclassifies:
- Some less severe incidents
Result: catches the real threats
Model B is far superior. We evaluate analyst value beyond accuracy using specialized metrics:
- Review utility: Measures the cumulative value of the top N incidents (critical = 10 points, high = 5 points, medium = 2 points, low = 1 point).
- Critical precision and recall: How often a “critical” flag is correct, and what share of critical incidents is detected.
- Coverage: Can every incident be scored?
- Safe pruning: Can you automatically close low-risk incidents without missing threats?
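These metrics can be sketched in a few lines. The point values come from the list above; everything else is an assumption made for illustration: "informational" is given 0 points, review utility is a plain sum with no normalization, and safe pruning is interpreted as the share of incidents that score below every critical or high incident. The example labels and scores are invented.

```python
# Points from the text; 0 for "informational" is an assumption.
POINTS = {"critical": 10, "high": 5, "medium": 2, "low": 1, "informational": 0}


def review_utility(ranked_labels, n):
    """Sum of points for the first n incidents of a ranked list of true severities."""
    return sum(POINTS[label] for label in ranked_labels[:n])


def critical_precision_recall(true_labels, predicted_labels):
    """Precision and recall for the "critical" class."""
    flagged = [t for t, p in zip(true_labels, predicted_labels) if p == "critical"]
    true_positives = sum(1 for t in flagged if t == "critical")
    actual = sum(1 for t in true_labels if t == "critical")
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


def safe_pruning_rate(scores, true_labels, dangerous=("critical", "high")):
    """Share of incidents scoring strictly below the lowest-scored dangerous one."""
    dangerous_scores = [s for s, t in zip(scores, true_labels) if t in dangerous]
    if not dangerous_scores:
        return 1.0  # nothing dangerous, so everything could be closed
    cutoff = min(dangerous_scores)
    return sum(1 for s in scores if s < cutoff) / len(scores)


# Invented example: the same six incidents ordered by two different rankers.
ml_order = ["critical", "critical", "high", "medium", "low", "low"]
rule_order = ["low", "critical", "low", "medium", "high", "critical"]
ml_utility = review_utility(ml_order, 3)      # 10 + 10 + 5
rule_utility = review_utility(rule_order, 3)  # 1 + 10 + 1

true_labels = ["critical", "low", "high", "low", "critical", "medium"]
predicted = ["critical", "low", "critical", "low", "high", "medium"]
precision, recall = critical_precision_recall(true_labels, predicted)

scores = [0.9, 0.1, 0.7, 0.2, 0.8, 0.4]
pruned = safe_pruning_rate(scores, true_labels)
print(ml_utility, rule_utility, precision, recall, pruned)
```

In this toy case an analyst reviewing three incidents collects 25 points from the first ordering and 12 from the second, which is the kind of gap review utility is designed to expose.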
Results: ML vs. rule-based prioritization
Our ML model delivers dramatically better performance than the rule-based baseline:
| Metric | ML model | Rule-based | Improvement |
|---|---|---|---|
| Review utility (top 30) | ~9.7 points | ~3.4 points | 3x the value |
| Critical precision | 75% | ~15% | 5x higher precision |
| Critical recall | ~72% | ~14% | 5x better detection |
| Coverage | 100% | ~18% | No blind spots |
| NDCG (ranking quality) | ~0.95 | ~0.81 | Near-perfect ordering |
| Safe pruning | 36.7% | ~2% | 18x noise reduction |
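NDCG (normalized discounted cumulative gain) compares a ranking against the ideal ordering of the same items, with a logarithmic discount for lower positions, so a value of 1.0 means the most severe incidents come first. The sketch below uses the standard formula; the gain values reuse the review utility points as an assumption, since the gains used in the actual evaluation are not stated, and the example rankings are invented.

```python
import math

# Gains per severity: an assumption that reuses the review utility points.
GAIN = {"critical": 10, "high": 5, "medium": 2, "low": 1, "informational": 0}


def dcg(labels):
    """Discounted cumulative gain: gains shrink logarithmically with rank."""
    return sum(GAIN[label] / math.log2(rank + 2)
               for rank, label in enumerate(labels))


def ndcg(ranked_labels):
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    ideal = sorted(ranked_labels, key=lambda label: GAIN[label], reverse=True)
    ideal_dcg = dcg(ideal)
    return dcg(ranked_labels) / ideal_dcg if ideal_dcg > 0 else 0.0


perfect = ["critical", "high", "medium", "low", "low"]
shuffled = ["low", "medium", "critical", "low", "high"]
perfect_score = ndcg(perfect)
shuffled_score = ndcg(shuffled)
print(perfect_score, shuffled_score)
```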
What this means for your team
- Faster triage: Discover 3x more critical threats in the same review time.
- Reliable alerts: 75% precision for “critical” flags (15% for rules), reducing false alarm fatigue.
- Comprehensive detection: Catch 72% of all critical leaks (14% for rules).
- No blind spots: 100% coverage, compared to 18% for rules.
- Significant noise reduction: Safely auto-close 36.7% of incidents without missing critical threats (2% for rules).
Real-world implications for SecOps teams
Transforming daily operations
Before: 10,000 unranked alerts, hours of manual triage, missed critical incidents, and an average of 292 days to remediate.
After: A risk-ranked dashboard where the top 10 alerts are 75% genuine threats, 72% of critical leaks surface, and lower-priority incidents are filtered automatically, significantly reducing detection time.
Rebuilding trust through ML prioritization: Analysts can trust “critical” flags (75% precision) and safely defer “low” flags (false negatives are minimized), eliminating alert fatigue and the fear of missing threats.
From detection to prevention
Our ML prioritization turns millions of raw detections into an actionable, risk-ranked queue. SecOps teams no longer have to guess which leaks are the most dangerous: the model identifies them with precision. This bridges the gap between detection and prevention.
The stakes: 70% of the secrets leaked in 2022 are still valid, and 31% of breaches involve secrets. Prioritization is the difference between proactive security and reactive crisis management.
Learn more about GitGuardian’s ML-powered security
Want to know how ML-based prioritization can transform your security operations?
Check out our resources:
Want to experience prioritization that actually works? Request a demo to see our ML models in action using your own security data.
