Cyber security research #636
Draft
jravenel wants to merge 24 commits into main from cyber-security
24 commits
28382a0
feat: Add comprehensive Cyber Security Analyst domain module
76e432a
Merge main branch while preserving cyber-security-analyst module
4910470
Restore cyber-security-analyst pipeline test directories
f5ce2fa
feat: Add Cyber Security Analyst domain module
2e4316a
Fix KeyError in TemplatableSparqlQuery by handling missing argument m…
4328eec
Add D3FEND-CCO semantic graph enrichment capabilities
bfefd92
Merge branch 'cyber-security-analyst-clean' into cyber-security
6a18fa6
Simplify cyber-security-analyst to competency-question pattern
70a705f
Clean up cyber-security-analyst: remove legacy, move data to samples
4c3cd6c
Add data loading pipeline for events.yaml
4348b96
Clean module structure: remove bloat
f76777d
Transform agent into competency-question interface
e1f3b0a
Consolidate all 22 CQs into CyberSecurityQueries.ttl
9a1cfa0
Fix model import - use langchain directly
e2a9ecb
Fix namespace and tool loading for CQ tools
eed861a
Add comment test to events.yaml for data validation
3499cbf
narrowing down CQs
giacomodecolle 687a3bb
updating cqs and first queries
giacomodecolle 6bc7e0c
adding more queries
giacomodecolle ea4c89e
updating queries
giacomodecolle eb507ca
Merge pull request #666 from giacomodecolle/cyber-security
jravenel 188742f
Merge branch 'main' into cyber-security
Dr0p42 3e68a32
feat: Add d3fend-slim.ttl
Dr0p42 d2b0b31
Add comprehensive attack simulation data
241 changes: 241 additions & 0 deletions
src/marketplace/domains/cyber-security-analyst/README.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,241 @@ | ||
| # Cyber Security Analyst Agent | ||
|
|
||
| A comprehensive AI agent for cyber security intelligence, threat analysis, and defensive recommendations using the D3FEND framework. | ||
|
|
||
| ## 🎯 Overview | ||
|
|
||
| This agent provides **conversational AI** for cyber security analysis with full auditability through SPARQL queries. It combines natural language processing with command-based interaction, following the ABI IntentAgent pattern. | ||
|
|
||
| ## 🚀 Quick Start | ||
|
|
||
| ### Conversational Interface (Recommended) | ||
| ```bash | ||
| # Start the conversational CLI | ||
| python apps/cli.py | ||
|
|
||
| # Example interactions (typed at the agent prompt, not in the shell): | ||
| # 💬 "Hello, what can you help me with?" | ||
| # 💬 "What were the biggest cyber threats in 2025?" | ||
| # 💬 "How do I defend against ransomware?" | ||
| # 💬 "Show me the timeline of events" | ||
| ``` | ||
|
|
||
| ### Command Interface (Power Users) | ||
| ``` | ||
| # Quick commands still work: | ||
| 💬 overview | ||
| 💬 timeline | ||
| 💬 critical | ||
| 💬 audit | ||
| ``` | ||
|
|
||
| ### Demo | ||
| Use the conversational CLI above to explore capabilities interactively. | ||
|
|
||
| ## 🛡️ Features | ||
|
|
||
| ### Natural Language Understanding | ||
| - **Intent Classification**: Understands natural questions and commands | ||
| - **Context Awareness**: Maintains conversation flow and context | ||
| - **Flexible Input**: Supports both casual questions and technical queries | ||
|
|
||
| ### Comprehensive Analysis | ||
| - **20 Major Cyber Events from 2025**: Complete incident database | ||
| - **D3FEND Integration**: MITRE defensive techniques mapping | ||
| - **Sector Analysis**: Healthcare, finance, government threat intelligence | ||
| - **Attack Vector Mapping**: Specific defensive recommendations | ||
|
|
||
| ### Full Auditability | ||
| - **SPARQL Transparency**: Every response shows underlying queries | ||
| - **Data Provenance**: Complete traceability to source data | ||
| - **Knowledge Graph**: 32,311 RDF triples with ontology integration | ||
| - **Audit Trails**: Full transparency for all analysis | ||
|
|
||
| ## 📊 Data Sources | ||
|
|
||
| ### Events Dataset | ||
| - **Source**: `events.yaml` - 20 major cyber security events from 2025 | ||
| - **Coverage**: Supply chain attacks, ransomware, data breaches, critical infrastructure | ||
| - **Metadata**: Attack vectors, sectors affected, severity levels | ||
|
|
||
| ### Knowledge Graph | ||
| - **D3FEND Ontology**: Complete MITRE defensive framework | ||
| - **CCO Integration**: Common Core Ontology mappings | ||
| - **Event Instances**: Real incident data mapped to ontological concepts | ||
| - **SPARQL Queries**: 6 predefined queries, plus support for custom queries | ||
|
|
||
| ### Storage Structure | ||
| ``` | ||
| /storage/datastore/cyber/ | ||
| ├── 2025/ | ||
| │ ├── 01/supply_chain_attack/cse-2025-001/ | ||
| │ │ ├── source_1_demo.html | ||
| │ │ ├── event_metadata.json | ||
| │ │ └── d3fend_mapping.json | ||
| │ └── [other events...] | ||
| └── cyber_security_ontology.ttl | ||
| ``` | ||
|
|
||
| ## 🤖 Agent Capabilities | ||
|
|
||
| ### Conversational Modes | ||
|
|
||
| #### Natural Language Examples | ||
| ``` | ||
| 💬 "What happened with cyber security this year?" | ||
| 🤖 Provides comprehensive 2025 threat landscape overview | ||
|
|
||
| 💬 "Tell me about ransomware attacks" | ||
| 🤖 Shows ransomware incidents with D3FEND defensive techniques | ||
|
|
||
| 💬 "How do I protect against supply chain attacks?" | ||
| 🤖 Detailed D3FEND implementation guidance with audit trail | ||
|
|
||
| 💬 "What threats affected healthcare?" | ||
| 🤖 Sector-specific analysis with defensive priorities | ||
| ``` | ||
|
|
||
| #### Command Examples | ||
| ``` | ||
| 💬 overview | ||
| 🤖 Dataset statistics and threat category breakdown | ||
|
|
||
| 💬 timeline | ||
| 🤖 Chronological analysis of 2025 cyber events | ||
|
|
||
| 💬 critical | ||
| 🤖 Critical incidents with defensive recommendations | ||
|
|
||
| 💬 audit | ||
| 🤖 Complete system transparency and data sources | ||
| ``` | ||
|
|
||
| ### Intent Classification | ||
| The agent automatically classifies user input into intents: | ||
| - **Greeting**: Hello, hi, what can you help with | ||
| - **Analysis**: Overview, timeline, critical events | ||
| - **Threats**: Ransomware, supply chain, phishing | ||
| - **Sectors**: Healthcare, financial, government | ||
| - **Defense**: D3FEND techniques, recommendations | ||
| - **Audit**: Transparency, SPARQL queries | ||
|
|
||
| ## 🔍 Technical Architecture | ||
|
|
||
| ### ABI Integration | ||
| - **IntentAgent Framework**: Natural language processing with intent mapping | ||
| - **System Prompts**: Conversational AI with cyber security expertise | ||
| - **Agent Configuration**: Seamless integration with ABI ecosystem | ||
|
|
||
| ### SPARQL Backend | ||
| - **Knowledge Graph**: RDF triples with complete cyber security ontology | ||
| - **Query Engine**: Real-time SPARQL execution for all analysis | ||
| - **Audit Trails**: Every response includes query transparency | ||
|
|
||
| ### Natural Language Processing | ||
| - **Pattern Matching**: Regex-based intent classification | ||
| - **Context Preservation**: Conversation state management | ||
| - **Flexible Responses**: Adaptive to user input style | ||
|
|
||
| ## 📈 Usage Patterns | ||
|
|
||
| ### For Security Analysts | ||
| ``` | ||
| # Start with overview | ||
| 💬 "Give me an overview of 2025 cyber threats" | ||
|
|
||
| # Dive into specific threats | ||
| 💬 "Tell me more about the critical incidents" | ||
|
|
||
| # Get defensive guidance | ||
| 💬 "How do I defend against these attacks?" | ||
|
|
||
| # Verify with audit trails | ||
| 💬 "Show me how you calculated this" | ||
| ``` | ||
|
|
||
| ### For Decision Makers | ||
| ``` | ||
| # Strategic overview | ||
| 💬 "What are the biggest cyber risks we face?" | ||
|
|
||
| # Sector-specific intelligence | ||
| 💬 "What threats are affecting our industry?" | ||
|
|
||
| # Implementation priorities | ||
| 💬 "What defensive measures should we prioritize?" | ||
| ``` | ||
|
|
||
| ### For Researchers | ||
| ``` | ||
| # Data exploration | ||
| 💬 audit | ||
|
|
||
| # Custom analysis | ||
| 💬 "Show me supply chain attack patterns" | ||
|
|
||
| # Methodology verification | ||
| 💬 "What SPARQL queries are available?" | ||
| ``` | ||
|
|
||
| ## 🛠️ Development | ||
|
|
||
| ### Setup | ||
| ```bash | ||
| # Install dependencies | ||
| pip install -r requirements.txt | ||
|
|
||
| # Generate knowledge graph (first time) | ||
| python pipelines/OntologyGenerationPipeline.py | ||
|
|
||
| # Test the agent | ||
| python apps/demo_conversational.py | ||
| ``` | ||
|
|
||
| ### Architecture Components | ||
| - **ConversationalCyberAgent.py**: ABI IntentAgent integration | ||
| - **cli.py**: Standalone natural language interface | ||
| - **CyberSecuritySPARQLAgent.py**: SPARQL query engine | ||
| - **OntologyGenerationPipeline.py**: Knowledge graph generation | ||
|
|
||
| ### Extending the Agent | ||
| 1. **Add New Intents**: Update intent patterns in `cli.py` | ||
| 2. **Custom Analysis**: Add methods for new threat categories | ||
| 3. **SPARQL Queries**: Extend query library in SPARQL agent | ||
| 4. **Data Sources**: Add new events to `events.yaml` | ||
|
|
||
| ## 🔒 Security & Compliance | ||
|
|
||
| ### Data Integrity | ||
| - **Immutable Sources**: Original HTML and metadata preserved | ||
| - **Audit Trails**: Complete query execution logging | ||
| - **Provenance**: Traceable from analysis back to source events | ||
|
|
||
| ### Transparency | ||
| - **Open Queries**: All SPARQL queries available for inspection | ||
| - **Methodology**: Clear analytical framework using D3FEND standards | ||
| - **Verification**: Independent validation of all analysis possible | ||
|
|
||
| ## 🎉 Key Achievements | ||
|
|
||
| ✅ **Conversational AI**: Natural language cyber security intelligence | ||
| ✅ **Full Auditability**: Every response exposes the SPARQL queries behind it | ||
| ✅ **D3FEND Integration**: Complete defensive technique mapping | ||
| ✅ **ABI Compatibility**: Seamless integration with IntentAgent framework | ||
| ✅ **Hybrid Interface**: Both natural language and command support | ||
| ✅ **Real Data**: 20 actual cyber security events from 2025 | ||
| ✅ **Knowledge Graph**: 32,311 RDF triples with ontological rigor | ||
|
|
||
| ## 🚀 Next Steps | ||
|
|
||
| The agent is ready for: | ||
| - **Production Deployment**: Full ABI integration with OpenAI API | ||
| - **Data Expansion**: Additional cyber security events and sources | ||
| - **Advanced Analytics**: Machine learning integration for predictive analysis | ||
| - **Custom Ontologies**: Domain-specific security frameworks | ||
|
|
||
| --- | ||
|
|
||
| **Start chatting with your cyber security intelligence agent:** | ||
| ```bash | ||
| python apps/cli.py | ||
| ``` | ||
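The README lists six intents and says classification is regex-based, but the patterns themselves are not part of this diff. A minimal sketch of how such a classifier could look is below. The pattern table, the ordering, and the `classify_intent` name are assumptions for illustration and are not taken from `cli.py`.

```python
import re

# Hypothetical pattern table; the real patterns in cli.py are not shown in this diff.
# Order matters: the first matching intent wins, so narrower intents come first.
INTENT_PATTERNS = [
    ("greeting", re.compile(r"\b(hello|hi|hey|what can you help)\b", re.I)),
    ("audit", re.compile(r"\b(audit|transparency|sparql|how you calculated)\b", re.I)),
    ("defense", re.compile(r"\b(defend|defen[cs]e|protect|d3fend|recommend\w*)\b", re.I)),
    ("sectors", re.compile(r"\b(healthcare|financ\w*|government|sector|industry)\b", re.I)),
    ("threats", re.compile(r"\b(ransomware|supply chain|phishing)\b", re.I)),
    ("analysis", re.compile(r"\b(overview|timeline|critical)\b", re.I)),
]


def classify_intent(text: str) -> str:
    """Return the first intent whose pattern matches, else 'unknown'."""
    for intent, pattern in INTENT_PATTERNS:
        if pattern.search(text):
            return intent
    return "unknown"


if __name__ == "__main__":
    for line in ("Hello, what can you help me with?", "timeline", "How do I defend against ransomware?"):
        print(f"{line!r} -> {classify_intent(line)}")
```

Because the first match wins, a question such as "How do I defend against ransomware?" lands on the defense intent rather than the threats intent. Whether that priority matches the module's behaviour would need checking against the actual implementation.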
22 changes: 22 additions & 0 deletions
src/marketplace/domains/cyber-security-analyst/__init__.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,22 @@ | ||
| """Cyber Security Analyst Domain - Data loading hook""" | ||
|
|
||
|
|
||
| def on_initialized(): | ||
|     """Load events.yaml into Oxigraph on module startup.""" | ||
|     from pathlib import Path | ||
|     from abi import logger | ||
|     from src import services | ||
|     from .pipelines import load_events_to_triplestore | ||
|
|
||
|     module_dir = Path(__file__).parent | ||
|     events_file = module_dir / "samples" / "events.yaml" | ||
|
|
||
|     if events_file.exists(): | ||
|         logger.info("📊 Loading cyber security events data...") | ||
|         triples_loaded = load_events_to_triplestore(str(events_file), services.triple_store_service) | ||
|         if triples_loaded > 0: | ||
|             logger.info(f"✅ Cyber security data loaded: {triples_loaded} triples") | ||
|         else: | ||
|             logger.warning("⚠️ No cyber security data loaded") | ||
|     else: | ||
|         logger.error(f"❌ Events file not found: {events_file}") |
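`load_events_to_triplestore` itself is not visible in this part of the diff. As a rough illustration of the shape such a loader might take, the sketch below flattens already-parsed event records into (subject, predicate, object) tuples and reports how many were produced, which is the value the hook compares with `triples_loaded > 0`. The `CYBER` namespace, the field names, and `events_to_triples` are assumptions for illustration. The real pipeline presumably parses the YAML, uses an RDF library, and maps to the module's ontology.

```python
from typing import Iterable

# Hypothetical namespace; the module's real IRIs are not shown in this diff.
CYBER = "http://example.org/cyber#"


def events_to_triples(events: Iterable[dict]) -> list[tuple[str, str, str]]:
    """Flatten event records into (subject, predicate, object) string triples."""
    triples = []
    for event in events:
        event_id = event.get("id")
        if not event_id:
            continue  # skip records that cannot be given a stable subject IRI
        subject = f"{CYBER}{event_id}"
        triples.append((subject, "rdf:type", f"{CYBER}CyberEvent"))
        for key, value in event.items():
            if key == "id" or value is None:
                continue
            # Lists (e.g. several affected sectors) become one triple per item.
            values = value if isinstance(value, list) else [value]
            for item in values:
                triples.append((subject, f"{CYBER}{key}", str(item)))
    return triples


# Sample records; the id and category reuse names from the README's storage tree.
sample = [
    {"id": "cse-2025-001", "category": "supply_chain_attack", "sectors": ["healthcare", "finance"]},
    {"name": "record without an id"},
]
triples = events_to_triples(sample)
print(len(triples))
```

Returning the triple count keeps the hook's "nothing loaded" warning meaningful: a file that parses but yields no usable events still produces a count of zero.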
I can't find this under storage/datastore
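One quick way to check is to build the path the README documents and test whether the expected files exist there. A sketch, assuming the layout is `<root>/<year>/<month>/<category>/<event-id>/` as in the README's Storage Structure block. `event_dir` and `missing_files` are hypothetical helpers, not functions from the module, and the root is given as a relative path here.

```python
from pathlib import Path

# File names taken from the README's storage tree.
EXPECTED_FILES = ("source_1_demo.html", "event_metadata.json", "d3fend_mapping.json")


def event_dir(root: Path, year: int, month: int, category: str, event_id: str) -> Path:
    """Build <root>/<year>/<month>/<category>/<event_id> with a zero-padded month."""
    return root / f"{year:04d}" / f"{month:02d}" / category / event_id


def missing_files(directory: Path) -> list[str]:
    """Return the expected file names that are absent from the directory."""
    return [name for name in EXPECTED_FILES if not (directory / name).is_file()]


path = event_dir(Path("storage/datastore/cyber"), 2025, 1, "supply_chain_attack", "cse-2025-001")
print(path.as_posix())
print(missing_files(path))
```

If every expected file is reported missing, either the data was never generated or the README describes a layout that this branch no longer uses.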