Agentic Knowledge Discovery Notebook

Tags: agents, artificial-intelligence, llms, partners, mongodb-genai-showcase, generative-ai, rag, langchain

Emergency Response System for Intelligent Crisis Management: Unlocking Enterprise Data with MongoDB Vector Search, LangChain, and LangGraph


Use Case Overview

In today's complex technical environment, organizations face critical incidents—ranging from network outages and security breaches to infrastructure failures and service disruptions. When these crises occur, teams must rapidly mobilize the right expertise, access relevant knowledge resources, and coordinate response efforts under significant time pressure.

Imagine:

  • a critical 5G network outage affecting multiple metropolitan areas,
  • a data center hardware failure impacting enterprise customers,
  • or a security breach requiring immediate containment.

Each crisis demands rapid response spanning multiple technical domains, requiring organizations to quickly assemble the right experts, access relevant procedures, and coordinate complex actions—all while business-critical services remain offline.

This solution transforms Emergency Response Management by:

  • Accelerating crisis detection: Automatically parsing incident reports to extract critical parameters, affected systems, and required skill sets.
  • Assembling optimal response teams: Identifying available experts with the precise skills needed for each unique crisis situation.
  • Mobilizing knowledge resources: Retrieving relevant technical procedures, best practices, and previous incident documentation.
  • Orchestrating coordinated response: Generating comprehensive response plans with prioritized action items, team assignments, and communication protocols.

This approach builds on MongoDB Atlas Vector Search for semantic search and document retrieval, and on LangChain and LangGraph for agentic workflow orchestration. Together they form an emergency response system intended to reduce incident resolution time and business impact.

Key Components

  1. Crisis Detection: Analyzes unstructured incident reports to extract structured data about the crisis type, severity, affected systems, and required expertise.
  2. Expert Identification: Searches employee records using semantic matching to identify personnel with crisis-relevant skills and availability.
  3. Knowledge Resource Gathering: Retrieves technical documentation, recovery procedures, and best practices specifically relevant to the current crisis.
  4. Response Plan Generation: Creates comprehensive response plans with team assignments, prioritized action items, communication protocols, and estimated resolution timelines.

Business Impact

  • Reduced Average Time to Resolution: Accelerates response time by automating the most time-consuming aspects of crisis management.
  • Optimal Team Composition: Ensures the most qualified experts are engaged based on real-time availability and precise skill matching.
  • Enhanced Decision Support: Provides response teams with only the most relevant knowledge resources and procedures.
  • Improved Stakeholder Communication: Generates structured briefings and updates for both technical teams and business stakeholders.

This intelligent system transforms crisis management from a reactive, often chaotic process into a structured, data-driven workflow that minimizes business impact and accelerates service restoration.

Cross-Industry Applications

This emergency response architecture can be adapted to various industries:

1. Healthcare

  • Mobilizing specialized medical teams for rare conditions or mass casualty events
  • Coordinating expertise during disease outbreaks or public health emergencies

2. Financial Services

  • Assembling fraud response teams for complex financial incidents
  • Coordinating technical and business experts during trading system failures

3. Energy and Utilities

  • Mobilizing technical teams during power grid failures or outages
  • Assembling environmental specialists during contamination events

4. Manufacturing

  • Coordinating experts to minimize downtime on critical production equipment
  • Assembling cross-functional teams for supply chain or quality control crises

5. Transportation

  • Mobilizing aviation or maritime experts during system failures or safety incidents
  • Coordinating response teams for logistics network disruptions

6. Government

  • Assembling emergency management teams during natural disasters
  • Mobilizing technical expertise for infrastructure failures or cybersecurity incidents

Objective:

Enable enterprise users to query and explore organizational knowledge across FAQs, project details, and employee expertise in natural language.

Key Benefits:

  • Reduced time-to-insight: Semantic search surfaces relevant results even when keywords differ.

  • Contextual reasoning: Agents chain multi-step queries (e.g., “Which engineer led Project P123?”).

  • Scalable architecture: Easily extend to new data sources (Confluence, emails, design documents).

Key Components:

  • MongoDB Atlas Vector Search: Dense vector indexing for semantic relevance.

  • Voyage AI: State-of-the-art embedding models and rerankers.

  • LangChain: Embedding pipelines and workflow management.

  • LangGraph: Agentic, graph-driven decision making for complex queries.

[1]
[2]
[3]
Enter your OPENAI API KEY: ··········

Part 0: Synthetic Data Creation


Part 1: Data Loading, Cleaning and Preparation

[5]
Enter your VOYAGE AI API key: ··········

Generating embeddings for data points


Part 2: Database Connection, Collection and Indexes

Connecting to MongoDB

MongoDB acts as both an operational and a vector database for the RAG system. MongoDB Atlas specifically provides a database solution that efficiently stores, queries and retrieves vector embeddings.

Setup

To use MongoDB as the data store for this notebook, complete the following steps:

  1. Register for a MongoDB account.

  2. Create a MongoDB cluster.

  3. Set up database access:

    • In the left sidebar, click "Database Access" under "Security".
    • Click "Add New Database User".
    • Create a username and a strong password. Save these credentials securely.
    • Set the appropriate permissions for the user (e.g., "Read and write to any database").
  4. Configure network access:

    • In the left sidebar, click "Network Access" under "Security".
    • Click "Add IP Address".
    • To allow access from anywhere (not recommended for production), enter 0.0.0.0/0.
    • For better security, allow only the specific IP addresses that need access.
  5. Follow MongoDB’s steps to get the connection string from the Atlas UI. After setting up the database and obtaining the Atlas cluster connection URI, securely store the URI within your development environment.
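One way to handle the connection string without hard-coding it is sketched below. This is a minimal sketch, not the notebook's own cell: the environment variable name `MONGODB_URI` and the helper names `get_mongodb_uri` and `connect` are assumptions made for illustration. The `connect` helper requires the `pymongo` package.

```python
import os
from getpass import getpass


def get_mongodb_uri(env_var: str = "MONGODB_URI") -> str:
    """Read the Atlas connection string from the environment, or prompt for it.

    The variable name is a hypothetical choice; use whatever your environment defines.
    """
    uri = os.environ.get(env_var) or getpass("Enter your MongoDB URI: ")
    # Standard and SRV connection strings are the two schemes MongoDB drivers accept.
    if not uri.startswith(("mongodb://", "mongodb+srv://")):
        raise ValueError("Expected a URI starting with mongodb:// or mongodb+srv://")
    return uri


def connect(uri: str):
    """Open a client and verify the connection with a ping."""
    from pymongo import MongoClient  # imported lazily; requires `pip install pymongo`

    client = MongoClient(uri)
    client.admin.command("ping")  # raises if the cluster cannot be reached
    print("Connection to MongoDB successful")
    return client
```

Typical usage would be `client = connect(get_mongodb_uri())`. The ping fails early if the network access list or the credentials are misconfigured.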

[7]
Enter your MongoDB URI: ··········
[8]
[9]
Connection to MongoDB successful

Create collections


Create Indexes

Create the vector search indexes
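As a rough illustration of what a vector search index looks like, the sketch below builds an Atlas Vector Search index definition and registers it through PyMongo. The field path `embedding`, the dimension count `1024`, the index name `vector_index`, and both helper names are assumptions. Match them to the embedding model and document schema you actually use. The `type="vectorSearch"` argument needs a recent PyMongo release.

```python
def vector_index_definition(
    path: str = "embedding",        # assumed name of the field holding the vectors
    num_dimensions: int = 1024,     # assumed; must equal the embedding model's output size
    similarity: str = "cosine",
) -> dict:
    """Build an Atlas Vector Search index definition for one vector field."""
    if similarity not in {"cosine", "euclidean", "dotProduct"}:
        raise ValueError(f"Unsupported similarity: {similarity}")
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,
                "numDimensions": num_dimensions,
                "similarity": similarity,
            }
        ]
    }


def create_vector_index(collection, name: str = "vector_index", **kwargs) -> str:
    """Create the index on a PyMongo collection and return the index name."""
    from pymongo.operations import SearchIndexModel  # requires pymongo

    model = SearchIndexModel(
        definition=vector_index_definition(**kwargs),
        name=name,
        type="vectorSearch",
    )
    return collection.create_search_index(model=model)
```

Index builds on Atlas are asynchronous, so a newly created index may take a short while before queries can use it.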


Create the search indexes


Data Ingestion


Part 3: Creating and Testing Retrieval Methods With LangChain

Text Search

[11]
[12]
/tmp/ipython-input-3957023351.py:19: LangChainDeprecationWarning: The method `BaseRetriever.get_relevant_documents` was deprecated in langchain-core 0.1.46 and will be removed in 1.0. Use :meth:`~invoke` instead.
  result = full_text_search.get_relevant_documents(query)
[Document(metadata={'_id': '68c02694dc3b288b36954711', 'emp_id': 'employees-0', 'role': 'Network Engineer', 'department': 'Network Operations', 'skills': ['IP networking', 'routing and switching', 'fiber optics', 'network security', 'VoIP'], 'bio': 'Jordan Singh is a seasoned network engineer specializing in telecom network infrastructure with expertise in routing, switching, and optical transmission technologies.', 'manager': None, 'start_date': '2020-07-15', 'end_date': '', 'current_projects': [], 'past_projects': [], 'mentors': [], 'mentees': [], 'frequent_collaborators': [], 'score': 0.9914655685424805}, page_content='Jordan Singh'),
, Document(metadata={'_id': '68fa0cb57e65d1c84f9e0465', 'emp_id': 'employees-4', 'role': 'Network Engineer', 'department': 'Network Operations', 'skills': ['Network Design', 'Cisco Routers', 'VoIP Implementation', 'VPN Configuration', 'Troubleshooting', 'Telecom Infrastructure'], 'bio': 'Jordan Kim is an experienced Network Engineer with expertise in designing, implementing, and maintaining large-scale telecommunication networks. Skilled in troubleshooting and optimizing network infrastructure.', 'manager': 'employees-2', 'start_date': '2021-05-17', 'end_date': '', 'current_projects': [], 'past_projects': [], 'mentors': ['employees-1'], 'mentees': ['employees-3'], 'frequent_collaborators': ['employees-0'], 'score': 0.9914655685424805}, page_content='Jordan Kim'),
, Document(metadata={'_id': '68fa0cb57e65d1c84f9e0461', 'emp_id': 'employees-0', 'role': 'Network Engineer', 'department': 'Network Operations', 'skills': ['Network Design', 'Cisco Routing & Switching', 'Fiber Optic Communication', 'Telecommunications Protocols', 'Network Security'], 'bio': 'Jordan Lee is an experienced network engineer specializing in the design and maintenance of robust telecommunications infrastructure. Proven expertise in optimizing large-scale networks and ensuring high reliability for carrier-grade operations.', 'manager': None, 'start_date': '2021-03-15', 'end_date': '', 'current_projects': [], 'past_projects': [], 'mentors': [], 'mentees': [], 'frequent_collaborators': [], 'score': 0.9914655685424805}, page_content='Jordan Lee')]

Vector Search

[13]
[14]
  0%|          | 0/1 [00:00<?, ?it/s]
[(Document(id='68fa0cb57e65d1c84f9e0464', metadata={'_id': '68fa0cb57e65d1c84f9e0464', 'emp_id': 'employees-3', 'name': 'Priya Deshmukh', 'role': 'Network Engineer', 'department': 'Network Operations', 'skills': ['Cisco networking', 'VoIP installation', 'Fiber optic infrastructure', 'Network security', 'BGP & OSPF routing', 'Troubleshooting WAN/LAN'], 'manager': 'employees-1', 'start_date': '2021-02-15', 'end_date': '', 'current_projects': [], 'past_projects': [], 'mentors': ['employees-2'], 'mentees': ['employees-0'], 'frequent_collaborators': ['employees-2', 'employees-1']}, page_content='Priya is a seasoned Network Engineer with 7 years of experience in designing, implementing, and optimizing telecom network infrastructures. She specializes in VoIP systems and high-capacity fiber-optic deployments for enterprise clients.'),
,  0.7213080525398254),
, (Document(id='68921a051d77d2d9c2b14100', metadata={'_id': '68921a051d77d2d9c2b14100', 'emp_id': 'employees-7', 'name': 'Sophia Kim', 'role': 'Network Engineer', 'department': 'Network Operations', 'skills': ['IP routing', 'network design', 'fiber optics', 'troubleshooting', 'VoIP'], 'manager': 'employees-3', 'start_date': '2021-04-12', 'end_date': '', 'current_projects': [], 'past_projects': [], 'mentors': ['employees-6'], 'mentees': ['employees-4'], 'frequent_collaborators': ['employees-0', 'employees-1']}, page_content='Sophia Kim is a dedicated network engineer with more than 5 years’ experience designing and maintaining high-capacity telecom networks. She specializes in optimizing network infrastructure for performance and reliability, and has strong expertise with fiber optic systems.'),
,  0.7185817956924438),
, (Document(id='68921a051d77d2d9c2b140ff', metadata={'_id': '68921a051d77d2d9c2b140ff', 'emp_id': 'employees-6', 'name': 'Samantha Riley', 'role': 'Network Engineer', 'department': 'Network Operations', 'skills': ['Network Design', 'Cisco Routers', 'VoIP Configuration', 'Telecommunications Protocols', 'Firewall Management', 'Network Troubleshooting'], 'manager': 'employees-1', 'start_date': '2019-04-15', 'end_date': '', 'current_projects': [], 'past_projects': [], 'mentors': ['employees-1'], 'mentees': ['employees-0'], 'frequent_collaborators': ['employees-3', 'employees-5']}, page_content='Samantha Riley is a skilled network engineer with over 7 years of experience in designing and maintaining large-scale telecommunications networks. She specializes in VoIP solutions and has a strong background in network security and troubleshooting complex network issues.'),
,  0.7175770998001099)]
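To help read the scores above, the sketch below computes cosine similarity and the normalized score that Atlas Vector Search reports for cosine indexes, `(1 + cosine) / 2`. Whether this notebook's index uses cosine similarity is an assumption. If it does, a score of about 0.72 corresponds to a raw cosine of about 0.44. The function names are illustrative.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def atlas_cosine_score(a: Sequence[float], b: Sequence[float]) -> float:
    """Map cosine similarity from [-1, 1] onto [0, 1]."""
    return (1 + cosine_similarity(a, b)) / 2
```

Because the three scores above differ only in the third decimal place, the ranking among these candidates is close. A reranker or additional filters may be worth considering when the top results are this tightly grouped.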

Hybrid Search

[ ]
[16]
  0%|          | 0/1 [00:00<?, ?it/s]
[Document(metadata={'_id': '68fa0cb57e65d1c84f9e0465', 'emp_id': 'employees-4', 'name': 'Jordan Kim', 'role': 'Network Engineer', 'department': 'Network Operations', 'skills': ['Network Design', 'Cisco Routers', 'VoIP Implementation', 'VPN Configuration', 'Troubleshooting', 'Telecom Infrastructure'], 'manager': 'employees-2', 'start_date': '2021-05-17', 'end_date': '', 'current_projects': [], 'past_projects': [], 'mentors': ['employees-1'], 'mentees': ['employees-3'], 'frequent_collaborators': ['employees-0'], 'vector_score': 0.01639344262295082, 'rank': 0, 'fulltext_score': 0, 'score': 0.01639344262295082}, page_content='Jordan Kim is an experienced Network Engineer with expertise in designing, implementing, and maintaining large-scale telecommunication networks. Skilled in troubleshooting and optimizing network infrastructure.'),
, Document(metadata={'_id': '68921a051d77d2d9c2b140fd', 'emp_id': 'employees-4', 'name': 'Maya Patel', 'role': 'Network Engineer', 'department': 'Network Operations', 'skills': ['Network Design', 'Cisco Routers & Switches', 'VoIP', 'Telecommunications Infrastructure', 'Network Security', 'Fiber Optic Communication', 'Linux Administration'], 'manager': 'employees-1', 'start_date': '2017-03-12', 'end_date': '', 'current_projects': [], 'past_projects': [], 'mentors': ['employees-1'], 'mentees': ['employees-2'], 'frequent_collaborators': ['employees-0', 'employees-3'], 'vector_score': 0.016129032258064516, 'rank': 1, 'fulltext_score': 0, 'score': 0.016129032258064516}, page_content='Maya Patel is a seasoned network engineer with over 7 years of experience in the telecommunications sector. Her expertise spans network design, deployment, and ongoing optimization, focusing on delivering high-availability communication platforms. Passionate about mentoring new engineers, she contributes to building robust technical teams.'),
, Document(metadata={'_id': '68c02694dc3b288b36954718', 'emp_id': 'employees-7', 'name': 'Anjali Patel', 'role': 'System Administrator', 'department': 'IT Operations', 'skills': ['Linux administration', 'Network security', 'Telecommunications systems', 'Firewall configuration', 'Cloud infrastructure', 'Incident response'], 'manager': 'employees-3', 'start_date': '2021-03-15', 'end_date': '', 'current_projects': [], 'past_projects': [], 'mentors': ['employees-5'], 'mentees': ['employees-6'], 'frequent_collaborators': ['employees-1', 'employees-4'], 'vector_score': 0.015873015873015872, 'rank': 2, 'fulltext_score': 0, 'score': 0.015873015873015872}, page_content='Anjali Patel is an experienced System Administrator specializing in telecom infrastructure. With a strong focus on network security and high-availability systems, Anjali ensures seamless IT operations and supports large-scale telecommunications environments.')]
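The scores in the hybrid output above equal 1/61, 1/62, and 1/63 for ranks 0, 1, and 2, which is consistent with reciprocal rank fusion using a penalty constant of 60. The sketch below reproduces that arithmetic in plain Python. The penalty value is inferred from the printed scores, and the function name and input shape are illustrative, not the retriever's actual implementation.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(
    rankings: Dict[str, List[str]],
    penalty: int = 60,  # inferred from the printed scores; an assumption
) -> Dict[str, float]:
    """Fuse several ranked ID lists into one score per document.

    Each list contributes 1 / (rank + penalty + 1) for a document,
    where rank 0 is the best match in that list.
    """
    fused: Dict[str, float] = defaultdict(float)
    for ranked_ids in rankings.values():
        for rank, doc_id in enumerate(ranked_ids):
            fused[doc_id] += 1.0 / (rank + penalty + 1)
    # Highest fused score first.
    return dict(sorted(fused.items(), key=lambda item: item[1], reverse=True))


scores = reciprocal_rank_fusion(
    {"vector": ["doc_a", "doc_b", "doc_c"], "fulltext": []}
)
```

In the output above, every `fulltext_score` is 0, so each final score is the vector contribution alone. A document that appears in both lists would receive the sum of both contributions and rise in the fused ranking.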

Graph Search

[17]
[18]
AIMessage(content='There are no entities related to the query about projects that share team members with good communication.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 18, 'prompt_tokens': 472, 'total_tokens': 490, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_cbf1785567', 'id': 'chatcmpl-CTni54CdmYkU5yr8PO7CNRmaSYXfV', 'service_tier': 'default', 'finish_reason': 'stop', 'logprobs': None}, id='run--dfb7c5f1-8ae2-4f2b-a559-8aaa151349dd-0', usage_metadata={'input_tokens': 472, 'output_tokens': 18, 'total_tokens': 490, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

Cross-Team Project Knowledge Discovery

  • Finding how different projects interconnect through shared team members
  • Identifying knowledge transfer paths when employees move between projects
  • Discovering dependencies between projects that aren't documented but exist through shared personnel

Expert Network Mapping

  • Tracing expertise flows when experts collaborate on projects
  • Finding indirect expertise paths (e.g., "Who can John reach out to for Android development help through his network?")
  • Discovering emerging expertise clusters around specific technologies
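The indirect-path idea can be illustrated with a breadth-first search over the relationship fields that appear in the employee documents (`frequent_collaborators`, `mentors`, `mentees`). This is an in-memory sketch under stated assumptions: `emp_id` values are unique, the function name `find_expert_path` is hypothetical, and skills are matched by simple substring comparison. On the database side, MongoDB's `$graphLookup` aggregation stage would be the likely counterpart.

```python
from collections import deque
from typing import Dict, List, Optional

RELATION_FIELDS = ("frequent_collaborators", "mentors", "mentees")


def find_expert_path(
    employees: Dict[str, dict],
    start_id: str,
    skill: str,
    max_hops: int = 3,
) -> Optional[List[str]]:
    """Return the shortest chain of emp_ids from start_id to someone with the skill."""
    wanted = skill.lower()
    queue = deque([[start_id]])
    visited = {start_id}
    while queue:
        path = queue.popleft()
        current = employees.get(path[-1], {})
        has_skill = any(wanted in s.lower() for s in current.get("skills", []))
        if path[-1] != start_id and has_skill:
            return path
        if len(path) - 1 >= max_hops:
            continue  # do not expand beyond the hop limit
        for field in RELATION_FIELDS:
            for neighbor in current.get(field, []):
                if neighbor in employees and neighbor not in visited:
                    visited.add(neighbor)
                    queue.append(path + [neighbor])
    return None


# Hypothetical sample data shaped like the employee documents above.
sample = {
    "employees-0": {"skills": ["VoIP"], "frequent_collaborators": ["employees-1"]},
    "employees-1": {"skills": ["Network Design"], "mentors": ["employees-2"]},
    "employees-2": {"skills": ["Android development"]},
}
path = find_expert_path(sample, "employees-0", "android")
```

Breadth-first order means the first match found is also the one with the fewest hops, which is usually the most practical introduction path.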

Part 4: Automated Workflow and Agentic AI Implementation

AUTOMATION SCENARIO: Critical 5G Network Issue Response (Workflow Automation)

  • Context: A major 5G network outage affects multiple regions. The system needs to quickly assemble an emergency response team with specific expertise.

  • Workflow Steps:

    • Step 1: Crisis Detection and Skill Requirements
    • Step 2: Expert Identification
    • Step 3: Team Composition Analysis
    • Step 4: Knowledge Asset Preparation
    • Step 5: Team Activation and Brief

Overview

[Figure: workflow overview (image.png)]

Create Collections and Indexes [Crisis]
[19]
crisis_events collection already exists
[ ]
Data Models
[22]
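The parser and workflow outputs in this notebook suggest a crisis event record with the fields sketched below. This is a standard-library approximation, not the notebook's own model definitions. Only `NETWORK_OUTAGE` and `CRITICAL` appear in the outputs, so the other enum members are hypothetical placeholders, and the choice of `dataclass` over a validation library is an assumption.

```python
from dataclasses import asdict, dataclass, field
from enum import Enum
from typing import List


class CrisisType(str, Enum):
    NETWORK_OUTAGE = "Network Outage"      # seen in the notebook output
    SECURITY_BREACH = "Security Breach"    # hypothetical
    HARDWARE_FAILURE = "Hardware Failure"  # hypothetical


class SeverityLevel(str, Enum):
    CRITICAL = "critical"  # seen in the notebook output
    HIGH = "high"          # hypothetical
    MEDIUM = "medium"      # hypothetical
    LOW = "low"            # hypothetical


@dataclass
class CrisisEvent:
    event_id: str
    event_type: CrisisType
    severity: SeverityLevel
    title: str
    description: str
    affected_systems: List[str] = field(default_factory=list)
    affected_regions: List[str] = field(default_factory=list)
    customer_impact: str = ""
    required_skills: List[str] = field(default_factory=list)

    def to_document(self) -> dict:
        """Convert to a plain dict with enum values as strings, ready for storage."""
        doc = asdict(self)
        doc["event_type"] = self.event_type.value
        doc["severity"] = self.severity.value
        return doc


event = CrisisEvent(
    event_id="CRISIS-20250505-001",
    event_type=CrisisType("Network Outage"),
    severity=SeverityLevel("critical"),
    title="Critical 5G Network Failure Across Major US Cities",
    description="Complete 5G outage during maintenance.",
    required_skills=["5G network engineering", "Hardware repair"],
)
```

Constraining `event_type` and `severity` to enums keeps free-form LLM output from introducing values that downstream steps do not expect.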
Incident Report Parser
[23]
Example Incident Report
[24]

The incident report can be a text document such as a PDF. If the PDF includes images and tables, we advise using Voyage AI's multimodal embedding models.

Testing the Incident Response Parser
[25]
=== Processing Incident Report ===
Event ID: CRISIS-20250505-001
Type: CrisisType.NETWORK_OUTAGE
Severity: SeverityLevel.CRITICAL
Title: Critical 5G Network Failure Across Major US Cities
Description: A complete 5G network outage affects the North America region, with service down in New York City, Boston, and Philadelphia due to equipment overheating during maintenance. Primary data center and majority of gNodeB stations have failed, causing major business and consumer disruptions.
Affected Systems: 5G Network Service, Core Network, gNodeB stations, Primary Data Center
Affected Regions: New York City metro area, Boston metropolitan region, Philadelphia and surrounding counties
Customer Impact: Approximately 2 million customers unable to access 5G services; enterprise customers experience business-critical disruptions; mobile data speeds reduced to 4G in adjacent areas.
Required Skills: 5G network engineering, Hardware repair, Crisis management, Customer communications

Issue Response Engine (Brings all processes together)
[26]
LangGraph State
[ ]
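A LangGraph workflow passes a shared state object between nodes, commonly declared as a `TypedDict`. The sketch below shows one plausible shape for this workflow, inferred from the log output (crisis event, experts, knowledge assets, briefing, and an error list). All field names and the `initial_state` helper are assumptions, not the notebook's actual definitions.

```python
from typing import Any, Dict, List, Optional, TypedDict


class EmergencyResponseState(TypedDict):
    incident_report: str                     # raw text supplied by the user
    crisis_event: Optional[Dict[str, Any]]   # structured output of the detection step
    experts: List[Dict[str, Any]]            # employees matched to the required skills
    knowledge_assets: List[Dict[str, Any]]   # procedures and documentation retrieved
    briefing: Optional[str]                  # generated response plan
    errors: List[str]                        # messages collected from failed steps
    retry_count: int                         # how many times the workflow has retried


def initial_state(incident_report: str) -> EmergencyResponseState:
    """Build an empty state for a new incident."""
    return {
        "incident_report": incident_report,
        "crisis_event": None,
        "experts": [],
        "knowledge_assets": [],
        "briefing": None,
        "errors": [],
        "retry_count": 0,
    }
```

Keeping errors and a retry counter in the state lets routing logic decide whether to retry or stop without relying on variables outside the graph.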
Workflow Definition
[ ]
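The execution log in the next section shows the workflow retrying after an error and the run eventually being interrupted manually. A retry cap is one way to keep such a loop bounded. The sketch below is a plain-Python stand-in for the step sequencing, not the LangGraph graph itself: `run_workflow`, the `attempts` key, and the `max_retries` default are all assumptions. Nodes are expected to return a new state that preserves existing keys.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Node = Callable[[State], State]


def run_workflow(nodes: List[Node], initial: State, max_retries: int = 2) -> State:
    """Run nodes in order; on failure, restart from the initial state up to max_retries times."""
    state: State = dict(initial)
    for attempt in range(1, max_retries + 2):
        # Every attempt starts from a clean copy of the input.
        state = {**initial, "errors": [], "attempts": attempt}
        for node in nodes:
            try:
                state = node(state)
            except Exception as exc:
                # Record the failure and abandon this attempt.
                state.setdefault("errors", []).append(f"{node.__name__}: {exc}")
                break
        if not state.get("errors"):
            break  # all nodes succeeded
    return state
```

Retrying helps with transient failures such as a timed-out model call. A deterministic error, such as a missing method, fails identically on every attempt, so the cap ensures the workflow returns its error list instead of looping.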
Workflow Execution
[29]
=== Emergency Response System Activated ===


1. Beginning crisis detecting and parsing provided information...
Crisis event saved into records
Crisis Event Generated:
{
  "event_id": "CRISIS-20250505-001",
  "event_type": "Network Outage",
  "severity": "critical",
  "title": "Critical 5G Network Failure Across Major North American Cities",
  "description": "A complete 5G network outage has affected the NYC, Boston, and Philadelphia regions. Core network is down due to equipment overheating during maintenance. Estimated 2 million customers impacted, including major enterprise clients. Mobile data speeds are degraded in surrounding areas. Immediate emergency technical and customer response required.",
  "affected_systems": [
    "5G Network Service",
    "Core Network",
    "gNodeB Stations",
    "Primary Data Center"
  ],
  "affected_regions": [
    "New York City metro area",
    "Boston metropolitan region",
    "Philadelphia and surrounding counties"
  ],
  "customer_impact": "Estimated 2 million customers without 5G access; business-critical disruptions reported; mobile data reduced to 4G in surrounding areas; significant revenue and SLA impacts.",
  "required_skills": [
    "5G network engineering",
    "Hardware repair",
    "Crisis management",
    "Customer communications"
  ]
}


2. Identifying experts within records suitable to handle crisis event...
Search Query: Find experts with 5G network engineering, Hardware repair, Crisis management, Customer communications in their skills and experience
  0%|          | 0/1 [00:00<?, ?it/s]
Below are the experts identified ⬇️
[{'emp_id': 'employees-9', 'name': 'Aisha Patel', 'role': 'Network Engineer', 'department': 'Network Operations', 'bio': None, 'skills': ['LTE/5G Networking', 'Network Security', 'Cisco Routers & Switches', 'RF Planning', 'Troubleshooting', 'Fiber Optic Communications', 'Data Center Networking'], 'current_projects': []}, {'emp_id': 'employees-5', 'name': 'Sarah Kim', 'role': 'Network Engineer', 'department': 'Network Operations', 'bio': None, 'skills': ['Network Design', 'Telecommunications Infrastructure', 'Fiber Optic Networking', 'Routing & Switching', 'VoIP', 'Troubleshooting', 'Cisco Certified'], 'current_projects': []}, {'emp_id': 'employees-7', 'name': 'Sophia Kim', 'role': 'Network Engineer', 'department': 'Network Operations', 'bio': None, 'skills': ['IP routing', 'network design', 'fiber optics', 'troubleshooting', 'VoIP'], 'current_projects': []}, {'emp_id': 'employees-4', 'name': 'Ravi Sharma', 'role': 'Network Engineer', 'department': 'Network Operations', 'bio': None, 'skills': ['Network Design', 'Troubleshooting', 'Cisco Routers', 'Optical Fiber Communication', 'Telecommunications Protocols', 'Packet Switching'], 'current_projects': []}, {'emp_id': 'employees-2', 'name': 'Priya Raman', 'role': 'Network Engineer', 'department': 'Network Operations', 'bio': None, 'skills': ['Network Design', 'Routing & Switching', 'Telecommunications Infrastructure', 'LAN/WAN Optimization', 'Fiber Optics', 'VoIP', 'Troubleshooting'], 'current_projects': []}]


3. Gathering knowledge assets to prep team on...
  0%|          | 0/1 [00:00<?, ?it/s]
Below are the knowledge assets gathered ⬇️
[{'asset_id': 'knowledge_assets-2', 'title': 'Technical Procedures and Best Practices for Cloud Data Migration', 'type': 'documentation', 'author': 'employees-6', 'content': '', 'creation_date': '2024-06-20'}, {'asset_id': 'knowledge_assets-4', 'title': 'Secure API Integration Procedures and Best Practices', 'type': 'best_practice', 'author': 'employees-3', 'content': '', 'creation_date': '2024-06-16T10:00:00Z'}, {'asset_id': 'knowledge_assets-3', 'title': 'Best Practices for API Deployment and Version Management', 'type': 'best_practice', 'author': 'employees-6', 'content': '', 'creation_date': '2024-06-18'}, {'asset_id': 'knowledge_assets-5', 'title': 'Automating Deployment Pipelines: Technical Procedures & Best Practices', 'type': 'documentation', 'author': 'employees-3', 'content': '', 'creation_date': '2024-05-12T09:30:00Z'}, {'asset_id': 'knowledge_assets-3', 'title': 'Standard Procedures and Best Practices for Secure API Development', 'type': 'best_practice', 'author': 'employees-4', 'content': '', 'creation_date': '2024-06-15T09:25:00Z'}]
4. Activating team and creating a response plan...
Briefing Generated:
---
## CRISIS RESPONSE TEAM BRIEFING  
**Event ID:** CRISIS-20250505-001  
**Event:** Critical 5G Network Failure Across Major North American Cities  
**Severity:** CRITICAL

---

### 1. Executive Summary

A total outage of the 5G network has simultaneously impacted New York City, Boston, and Philadelphia metropolitan areas due to equipment overheating during scheduled maintenance at the core network. This outage affects approximately 2 million customers, severely degrading service for enterprise clients and reducing mobile data speeds to 4G in adjacent regions. Immediate resolution is vital to reduce further SLA, financial, and customer trust damages.

### 2. Team Assignments & Roles

| Name           | Role               | Assignment                                 |
|----------------|--------------------|--------------------------------------------|
| **Aisha Patel**   | Technical Support  | Lead technical triage & data center escalation |
| **Sarah Kim**    | Technical Support  | Fault isolation: gNodeB & radio access analysis |
| **Sophia Kim**   | Technical Support  | Core network diagnostics & configuration review |
| **Ravi Sharma**  | Technical Support  | Overheating investigation and equipment coordination |
| **Priya Raman**  | Technical Support  | Customer/enterprise impact mapping & technical comms |

**All engineers**: On rotating shifts for continuous coverage, status escalation responsibility as per situation criticality.

### 3. Priority Action Items

1. **Immediate Core Network Restoration**
   - Diagnose overheating incident, restore failed components, and reroute traffic if possible.
2. **Outage Containment**
   - Isolate affected zones, stabilize gNodeB stations, and prevent cascading failures.
3. **Service Continuity**
   - Deploy fallback/temporary solutions to partially restore service or escalate to 4G fallback.
4. **Customer Impact Minimization**
   - Map affected business clients; prioritize mission-critical sectors.
5. **Root Cause Analysis**
   - Collect forensic data for post-mortem and long-term remediation.
6. **Ongoing Status Updates**
   - Maintain near real-time crisis dashboard for executives and customer service.

### 4. Available Resources & Documentation

- **Technical Procedures and Best Practices for Cloud Data Migration**
- **Secure API Integration Procedures and Best Practices**
- **Best Practices for API Deployment & Version Management**
- **Automating Deployment Pipelines: Technical Procedures & Best Practices**
- **Standard Procedures and Best Practices for Secure API Development**

(All resources are available on the shared crisis drive and may provide applicable guidance for rapid deployment or temporary reroute solutions.)

### 5. Expected Timeline & Milestones

- **0–1 hrs:** Situation triage, core isolation, and first technical update  
- **1–3 hrs:** Action on preliminary fix and start of targeted restoration  
- **3–6 hrs:** Progress update, phased service restoration (reprioritize if delays), post-outage impact assessment initiation  
- **6+ hrs:** Full network restoration, incident review, and executive summary

### 6. Communication Protocols

- **Incident Command:** Led by Aisha Patel; status via secure team Slack #crisis-response and standby phone bridge
- **Update Frequency:** Every 30 minutes or at major milestone completion
- **Stakeholder Reports:** Every 1 hour to executive leadership and customer service liaisons
- **Customer Messaging:** Drafted by Priya Raman in sync with PR; distributed via website, SMS, and enterprise client portals

### 7. Success Criteria

- **Restoration:** 5G network service fully restored to impacted metro areas
- **Stabilization:** No lingering or recurrent outages detected for 24 hours
- **Root Cause:** Documented and communicated, with preventive actions identified
- **Customer Communication:** Timely, clear, and accurate updates delivered throughout
- **SLA Compliance:** Post-crisis SLA review completed and breach instances mitigated

---

**ALL HANDS: Be vigilant, submit all findings through assigned channels, and prepare escalation summaries at each milestone.**
There were minor errors so retrying...
["Error activating team: 'EmergencyResponseWorkflow' object has no attribute '_generate_action_items'"]


1. Beginning crisis detecting and parsing provided information...
Crisis event saved into records
Crisis Event Generated:
{
  "event_id": "CRISIS-20250505-001",
  "event_type": "Network Outage",
  "severity": "critical",
  "title": "Critical 5G Network Failure in North America Impacting Millions",
  "description": "A complete 5G network failure has occurred across major North American cities due to equipment overheating during a maintenance window, causing core network and multiple gNodeB node failures. Business-critical outages and severe degradation of mobile data speeds are being reported.",
  "affected_systems": [
    "5G Network Service",
    "Core Network",
    "gNodeB Stations",
    "Primary Data Center"
  ],
  "affected_regions": [
    "New York City metro area",
    "Boston metropolitan region",
    "Philadelphia and surrounding counties"
  ],
  "customer_impact": "Approximately 2 million customers cannot access 5G services. Enterprise clients face business-critical disruptions; mobile data speed reduced to 4G in surrounding areas.",
  "required_skills": [
    "Network engineering (5G expertise)",
    "Hardware repair",
    "Crisis management",
    "Customer communications"
  ]
}


2. Identifying experts within records suitable to handle crisis event...
Search Query: Find experts with Network engineering (5G expertise), Hardware repair, Crisis management, Customer communications in their skills and experience
  0%|          | 0/1 [00:00<?, ?it/s]
Below are the experts identified ⬇️
[{'emp_id': 'employees-9', 'name': 'Aisha Patel', 'role': 'Network Engineer', 'department': 'Network Operations', 'bio': None, 'skills': ['LTE/5G Networking', 'Network Security', 'Cisco Routers & Switches', 'RF Planning', 'Troubleshooting', 'Fiber Optic Communications', 'Data Center Networking'], 'current_projects': []}, {'emp_id': 'employees-5', 'name': 'Sarah Kim', 'role': 'Network Engineer', 'department': 'Network Operations', 'bio': None, 'skills': ['Network Design', 'Telecommunications Infrastructure', 'Fiber Optic Networking', 'Routing & Switching', 'VoIP', 'Troubleshooting', 'Cisco Certified'], 'current_projects': []}, {'emp_id': 'employees-7', 'name': 'Sophia Kim', 'role': 'Network Engineer', 'department': 'Network Operations', 'bio': None, 'skills': ['IP routing', 'network design', 'fiber optics', 'troubleshooting', 'VoIP'], 'current_projects': []}, {'emp_id': 'employees-4', 'name': 'Ravi Sharma', 'role': 'Network Engineer', 'department': 'Network Operations', 'bio': None, 'skills': ['Network Design', 'Troubleshooting', 'Cisco Routers', 'Optical Fiber Communication', 'Telecommunications Protocols', 'Packet Switching'], 'current_projects': []}, {'emp_id': 'employees-0', 'name': 'Jordan Singh', 'role': 'Network Engineer', 'department': 'Network Operations', 'bio': None, 'skills': ['IP networking', 'routing and switching', 'fiber optics', 'network security', 'VoIP'], 'current_projects': []}]


3. Gathering knowledge assets to prep team on...
  0%|          | 0/1 [00:00<?, ?it/s]
Below are the knowledge assets gathered ⬇️
[{'asset_id': 'knowledge_assets-2', 'title': 'Technical Procedures and Best Practices for Cloud Data Migration', 'type': 'documentation', 'author': 'employees-6', 'content': '', 'creation_date': '2024-06-20'}, {'asset_id': 'knowledge_assets-4', 'title': 'Secure API Integration Procedures and Best Practices', 'type': 'best_practice', 'author': 'employees-3', 'content': '', 'creation_date': '2024-06-16T10:00:00Z'}, {'asset_id': 'knowledge_assets-3', 'title': 'Best Practices for API Deployment and Version Management', 'type': 'best_practice', 'author': 'employees-6', 'content': '', 'creation_date': '2024-06-18'}, {'asset_id': 'knowledge_assets-5', 'title': 'Automating Deployment Pipelines: Technical Procedures & Best Practices', 'type': 'documentation', 'author': 'employees-3', 'content': '', 'creation_date': '2024-05-12T09:30:00Z'}, {'asset_id': 'knowledge_assets-3', 'title': 'Standard Procedures and Best Practices for Secure API Development', 'type': 'best_practice', 'author': 'employees-4', 'content': '', 'creation_date': '2024-06-15T09:25:00Z'}]
4. Activating team and creating a response plan...
---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
/tmp/ipython-input-1171894299.py in <cell line: 0>()
     60 # Execute the complete workflow starting from incident report
     61 print("=== Emergency Response System Activated ===")
---> 62 result = emergency_workflow.respond_to_crisis(incident_report)  # Empty dict to trigger detection & parsing
     63 
     64 # Print results

/tmp/ipython-input-1663767310.py in respond_to_crisis(self, incident_report)
    297         # Run the workflow
    298         config = {"configurable": {"thread_id": 1}}
--> 299         final_state = self.workflow.invoke(initial_state, config)
    300 
    301         return final_state

/tmp/ipython-input-1663767310.py in _activate_team_and_create_plan(self, state)
    174 
    175             # Use IssueResponseEngine to create team briefing
--> 176             briefing_text = self.issue_engine.team_activation_and_brief(
    177                 crisis_event, selected_team, relevant_knowledge
    178             )

/tmp/ipython-input-1941078503.py in team_activation_and_brief(self, crisis_data, experts_identified, knowledge_assets)
    100 
    101       # Call GPT-4.1 to generate briefing
--> 102       response = openai_client.responses.create(
    103         model="gpt-4.1",
    104         input=prompt,

KeyboardInterrupt: 

AUTONOMY SCENARIO: Critical 5G Network Issue Response (Agentic AI)

Overview

image.png

Define Tools
[30]
[31]
[32]
[33]
Aggregate Tools
[34]
LLM Definition
[35]
Agent Definition
[36]
Node Definition
[37]
[38]
Autonomous Graph Agent Definition
[39]
<langgraph.graph.state.StateGraph at 0x78b771387710>
[40]
/tmp/ipython-input-3211739412.py:5: DeprecationWarning: AsyncMongoDBSaver is deprecated and will be removed in 0.3.0 release. Please use the async methods of MongoDBSaver instead.
  mongodb_checkpointer = AsyncMongoDBSaver(async_mongodb_client)
[41]
Output
[42]
[43]
Executing the Agent
[44]
User: Can you run this incident report """
  NETWORK CRISIS REPORT - PRIORITY CRITICAL

  Incident #: INC-20250505-3547
  Service: 5G Network Service
  Status: ACTIVE OUTAGE

  SUMMARY:
  Complete 5G network failure reported across North America region

  AFFECTED AREAS:
  - New York City metro area
  - Boston metropolitan region
  - Philadelphia and surrounding counties

  IMPACT ASSESSMENT:
  - Estimated 2 million customers unable to access 5G services
  - Enterprise customers reporting business-critical service disruptions
  - Mobile data speeds degraded to 4G in surrounding areas

  TECHNICAL DETAILS:
  - Core Network Status: DOWN
  - gNodeB Stations: 3/5 nodes failed
  - Data Center: Primary facility shows hardware failures
  - Root Cause: Equipment overheating during maintenance window

  TIMELINE:
  15:00 EST - Maintenance window begins
  15:25 EST - First customer complaints received
  15:30 EST - Network monitoring alerts triggered
  15:45 EST - Service outage confirmed

  REQUIRED RESPONSE:
  - Network engineers with 5G expertise
  - Hardware repair technicians
  - Crisis management team
  - Customer communications team

  BUSINESS IMPACT:
  - Revenue impact: $5,000/minute
  - SLA breach: Yes (2-hour response requirement)
  - Media attention: High (local news coverage)

  NEXT STEPS:
  1. Activate emergency response protocol
  2. Dispatch on-site technicians
  3. Prepare customer communications
  4. Assess backup systems deployment
"""
Assistant: 
/tmp/ipython-input-3900299550.py:20: PydanticDeprecatedSince20: The `dict` method is deprecated; use `model_dump` instead. Deprecated in Pydantic V2.0 to be removed in V3.0. See Pydantic V2 Migration Guide at https://errors.pydantic.dev/2.11/migration/
  result = AIMessage(**result.dict(exclude={"type", "name"}), name="assistant")
Crisis event saved into records
Crisis Event Generated:
{
  "event_id": "CRISIS-20250505-001",
  "event_type": "Network Outage",
  "severity": "critical",
  "title": "Critical 5G Network Outage in Major US Metro Areas - North America",
  "description": "Complete 5G network failure across North American metro regions, with core network and gNodeB station failures due to equipment overheating during maintenance. Affects millions and enterprise customers, causing business-critical disruptions and media attention.",
  "affected_systems": [
    "5G Network Service",
    "Core Network",
    "gNodeB Stations",
    "Primary Data Center"
  ],
  "affected_regions": [
    "New York City metro area",
    "Boston metropolitan region",
    "Philadelphia and surrounding counties"
  ],
  "customer_impact": "Estimated 2 million customers unable to access 5G services; enterprise and business customers suffer critical disruptions; 4G data speeds in surrounding areas.",
  "required_skills": [
    "Network engineers with 5G expertise",
    "Hardware repair technicians",
    "Crisis management team",
    "Customer communications team"
  ]
}

1. Crisis detected and parsed:
- Type: CrisisType.NETWORK_OUTAGE
- Severity: SeverityLevel.CRITICAL
- Required skills: Network engineers with 5G expertise, Hardware repair technicians, Crisis management team, Customer communications team
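
The summary above prints enum members (`CrisisType.NETWORK_OUTAGE`, `SeverityLevel.CRITICAL`), while the generated JSON carries plain strings (`"Network Outage"`, `"critical"`). The notebook's own class definitions are not shown in this output, so the following is only a minimal sketch of how such a mapping could work. It assumes string-valued enums; the `CrisisEvent` dataclass, the `parse_crisis_event` helper, and the extra enum members are hypothetical names introduced for illustration.

```python
import json
from dataclasses import dataclass, field
from enum import Enum


class CrisisType(Enum):
    # Only NETWORK_OUTAGE appears in the output above; the other member is hypothetical.
    NETWORK_OUTAGE = "Network Outage"
    SECURITY_BREACH = "Security Breach"


class SeverityLevel(Enum):
    # Only CRITICAL appears in the output above; the other member is hypothetical.
    CRITICAL = "critical"
    HIGH = "high"


@dataclass
class CrisisEvent:
    # Hypothetical container holding a subset of the fields in the generated JSON.
    event_id: str
    event_type: CrisisType
    severity: SeverityLevel
    title: str
    required_skills: list = field(default_factory=list)


def parse_crisis_event(raw: str) -> CrisisEvent:
    """Turn the JSON text of a crisis event into a typed record.

    Enum lookup by value raises ValueError for an unknown type or severity,
    which surfaces malformed model output early.
    """
    data = json.loads(raw)
    return CrisisEvent(
        event_id=data["event_id"],
        event_type=CrisisType(data["event_type"]),
        severity=SeverityLevel(data["severity"]),
        title=data["title"],
        required_skills=list(data.get("required_skills", [])),
    )


# A trimmed copy of the event shown above.
raw_event = json.dumps({
    "event_id": "CRISIS-20250505-001",
    "event_type": "Network Outage",
    "severity": "critical",
    "title": "Critical 5G Network Outage in Major US Metro Areas - North America",
    "required_skills": [
        "Network engineers with 5G expertise",
        "Hardware repair technicians",
        "Crisis management team",
        "Customer communications team",
    ],
})

event = parse_crisis_event(raw_event)
print(f"- Type: {event.event_type}")
print(f"- Severity: {event.severity}")
print(f"- Required skills: {', '.join(event.required_skills)}")
```

Looking up the enum by value keeps the JSON human-readable while giving downstream steps a closed set of types to branch on.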

[Tool Used: detect_crisis]
Tool Call ID: call_d8DN8mKlas8SZv8NVVKoo5yv
Content: {"event_id": "CRISIS-20250505-001", "event_type": "Network Outage", "severity": "critical", "title": "Critical 5G Network Outage in Major US Metro Areas - North America", "description": "Complete 5G network failure across North American metro regions, with core network and gNodeB station failures due to equipment overheating during maintenance. Affects millions and enterprise customers, causing business-critical disruptions and media attention.", "affected_systems": ["5G Network Service", "Core Network", "gNodeB Stations", "Primary Data Center"], "affected_regions": ["New York City metro area", "Boston metropolitan region", "Philadelphia and surrounding counties"], "customer_impact": "Estimated 2 million customers unable to access 5G services; enterprise and business customers suffer critical disruptions; 4G data speeds in surrounding areas.", "required_skills": ["Network engineers with 5G expertise", "Hardware repair technicians", "Crisis management team", "Customer communications team"]}
Assistant: Incident detected and analyzed:

- Crisis Type: Network Outage (Critical)
- Title: Critical 5G Network Outage in Major US Metro Areas - North America
- Description: Complete failure of the 5G network across New York, Boston, and Philadelphia due to equipment overheating during maintenance. Affects millions of customers and businesses, with major media coverage.
- Impact: 2 million customers affected, significant business disruption, degraded services, and high revenue loss.
- Affected Systems: 5G Network Service, Core Network, gNodeB Stations, Primary Data Center
- Required Response: 
  - Network engineers with 5G expertise
  - Hardware repair technicians
  - Crisis management team
  - Customer communications team

The crisis requires immediate activation of emergency protocols and multidisciplinary experts for rapid response. Would you like me to proceed gathering a crisis response team, related knowledge assets, and initiate response planning?


User: q
Goodbye!
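
The session above ends when the user types `q`. The cell that drives this loop is not visible here, so the sketch below is only an assumption about its general shape: a read, respond, repeat loop with a quit command. The `run_chat` function, the `respond` callback, and the additional quit words `quit` and `exit` are hypothetical; in the notebook the response would come from the agent graph rather than a plain function.

```python
from typing import Callable, Iterable, List


def run_chat(
    inputs: Iterable[str],
    respond: Callable[[str], str],
    quit_commands: frozenset = frozenset({"q", "quit", "exit"}),
) -> List[str]:
    """Run a simple chat loop and return the transcript lines.

    `inputs` stands in for interactive input so the loop can be exercised
    without a terminal; `respond` stands in for the agent invocation.
    """
    transcript: List[str] = []
    for user_input in inputs:
        transcript.append(f"User: {user_input}")
        # Stop before calling the agent when the user asks to quit.
        if user_input.strip().lower() in quit_commands:
            transcript.append("Goodbye!")
            break
        transcript.append(f"Assistant: {respond(user_input)}")
    return transcript


# Example run with a placeholder responder instead of a real agent.
demo = run_chat(
    ["Can you run this incident report?", "q"],
    respond=lambda message: f"Received {len(message)} characters.",
)
for line in demo:
    print(line)
```

Passing the inputs as an iterable keeps the loop easy to test; an interactive version could feed it from `iter(input, None)` or a similar generator.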