Incident Handling Process
1. Incident Handling Definition & Scope
1.1.1 Definition of Incident Handling
Incident Handling is a structured and systematic approach to managing and responding to computer and network security incidents.
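Security Incident: The NIST Computer Security Incident Handling Guide (SP 800-61) defines an event as any observable occurrence in a system or network, and an adverse event as an event with negative consequences. A security incident is a violation, or imminent threat of violation, of computer security policies, acceptable use policies, or standard security practices. Examples of adverse events include: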
System Crashes: Unplanned outages or failures.
Packet Floods: Denial-of-Service (DoS) attacks overwhelming network resources.
Unauthorized Use of System Privileges: Abuse of legitimate access rights.
Unauthorized Access to Sensitive Data: Breach of confidentiality.
Execution of Destructive Malware: Malicious software causing damage or data loss.
Alert Fatigue: Security Operations Centers (SOCs) and Computer Security Incident Response Teams (CSIRTs) often face an overwhelming number of alerts. This course aims to help incident handlers:
Identify Critical Alerts: Focus on alerts that pose the most significant risk.
Understand Log Formats: Interpret various log types to extract meaningful information.
Utilize Context-Providing Techniques: Correlate alerts with other data sources to assess their severity.
1.1.2 Scope of Incident Handling
Beyond Intrusions: While intrusions are a common focus, incident handling encompasses a broader range of issues, including:
Malicious Insiders: Trusted individuals who misuse their access for malicious purposes.
Availability Issues: Disruptions to system availability, such as outages or service interruptions.
Loss of Intellectual Property: Theft of sensitive information, trade secrets, or proprietary data.
2. Incident Handling Process
The incident handling process is a cyclical framework of four phases, often referred to as the Incident Response Life Cycle: Preparation; Detection & Analysis; Containment, Eradication & Recovery; and Post-Incident Activity. It serves as a roadmap for incident handlers to manage and respond to incidents effectively.

2.1 Preparation
Preparation is the foundation of effective incident response. It involves establishing the necessary resources, processes, and procedures to handle incidents efficiently.
Key Activities:
Building a Multidisciplinary Team:
Incident Handlers: Lead the response efforts.
Forensic Analysts: Collect and analyze evidence.
Malware Analysts: Analyze malicious software.
Support from NOC, Legal, and PR Departments: Ensure technical, legal, and communication aspects are addressed.
Establishing a Single Point of Contact (SPOC): Facilitates clear communication and coordination during an incident.
Developing Effective Reporting Capabilities: Enables timely and accurate information sharing among stakeholders.
Preparing an Incident Handling Starter Kit:
Data Acquisition Tools: Software for collecting digital evidence.
Read-Only Diagnostic Software: Prevents accidental alteration of data during analysis.
Bootable Linux Environment: Provides a secure environment for analysis.
Essential Hardware: Hard drives, Ethernet taps, cables, laptops, etc.
Implementing Defensive Measures:
Technical Controls: Antivirus, firewalls, intrusion detection/prevention systems (IDS/IPS), data loss prevention (DLP), endpoint detection and response (EDR), etc.
Security Information and Event Management (SIEM): Aggregates and analyzes security events.
Unified Threat Management (UTM): Combines multiple security functions into a single platform.
Threat Intelligence: Provides information about emerging threats and vulnerabilities.
Network Security Monitoring (NSM): Monitors network traffic for suspicious activities.
Central Logging: Collects logs from various sources for analysis.
Honeypots: Decoy systems to attract and detect attackers.
Creating Well-Defined Policies and Procedures:
Monitoring and Evidence Collection Rights: Ensure legal compliance and establish guidelines for data collection.
Response Procedures: Define steps for incident detection, analysis, containment, eradication, and recovery.
Breach/Incident Communication Plan: Outlines how to communicate with stakeholders, including employees, customers, and the media.
Maintaining Chain of Custody: Documents the handling of evidence to ensure its integrity and admissibility in legal proceedings.
Conducting Security Training and Awareness Programs: Educates employees about security risks and best practices, including recognizing social engineering attacks.
Planning for Incident Classification and Escalation:
Incident Classification: Categorizes incidents based on factors like system criticality, impact, and extent of compromise. This helps determine the appropriate response strategy.
Escalation Procedures: Defines when and how to involve senior management, such as the CISO or CIO, and other stakeholders.
Establishing Incident Tracking Mechanisms: Utilizes tools like Request Tracker for Incident Response (RTIR) to:
Consolidate incident information from various sources (e.g., system administrators, help desks, incident reporting systems).
Track the progress of incident response activities.
Facilitate collaboration among team members.
Additional Considerations:
Legal and Regulatory Compliance: Ensure incident handling practices adhere to relevant laws and regulations.
Business Continuity Planning: Integrate incident response with broader business continuity and disaster recovery plans.
Regular Review and Updates: Continuously assess and improve incident handling processes and procedures.
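The incident classification and escalation planning described above can be sketched in code. This is a minimal illustration, not a prescribed scheme: the three-level scale, the rule that a critical system raises severity by one step, and the contact names in the escalation matrix are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Level(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class Incident:
    system_critical: bool  # does the incident touch a business-critical system?
    impact: Level          # estimated impact on assets
    extent: Level          # how far the compromise has spread


def classify(incident: Incident) -> Level:
    """Return an overall severity (thresholds are illustrative assumptions)."""
    severity = max(incident.impact, incident.extent)
    # Raise severity by one step when a critical system is involved.
    if incident.system_critical and severity < Level.HIGH:
        severity = Level(severity + 1)
    return severity


def escalation_contacts(severity: Level) -> List[str]:
    """Hypothetical escalation matrix keyed on severity."""
    contacts = ["primary incident handler"]
    if severity >= Level.MEDIUM:
        contacts.append("CSIRT lead")
    if severity == Level.HIGH:
        contacts.append("CISO/CIO")
    return contacts
```

An organization would replace the scale and the contact list with the values agreed in its own policies; the point is only that classification inputs and escalation outcomes are decided before an incident occurs.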
2.2 Detection & Analysis
Detection and analysis are critical phases where incidents are identified and their scope and impact are assessed.
Detection Methods:
Technical Sensors:
Firewalls: Monitor and control incoming and outgoing network traffic.
Intrusion Detection Systems (IDS): Analyze network traffic for suspicious patterns.
Agents: Software installed on endpoints to monitor activity.
Logs: Records of system and application events.
Human Intelligence:
Trained Personnel: Security analysts who monitor alerts and identify suspicious activities.
User Reports: Employees reporting unusual or suspicious behavior.
Key Points:
Assign a Primary Incident Handler: Responsible for coordinating the response to specific incidents.
Establish Trust and Effective Information Sharing: Use secure communication channels to protect sensitive information.
Alternative Communications: Establish secure channels in case the network is compromised.
Avoid Solutions Vulnerable to Man-in-the-Middle Attacks: Use end-to-end encryption and avoid untrusted networks.
Logically Categorize the Network:
Network Perimeter: External-facing systems and networks.
Host Perimeter: Interfaces between hosts and the network.
Host-Level: Individual host systems.
Application-Level: Software applications running on hosts.
Establish Baselines and Extend Visibility:
Baselining: Understanding normal network and host behavior to detect deviations.
Visibility: Ensuring comprehensive monitoring across all network and host levels.
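Baselining can be illustrated with a small sketch that compares current per-port connection counts against historical counts. The data layout (a port mapped to a list of hourly counts), the function name, and the three-standard-deviation threshold are assumptions chosen for the example; real baselines usually account for time of day, seasonality, and many more features.

```python
from statistics import mean, stdev
from typing import Dict, List


def find_deviations(baseline: Dict[int, List[int]],
                    observed: Dict[int, int],
                    threshold: float = 3.0) -> List[int]:
    """Return ports whose observed count deviates from the baseline.

    baseline maps a port to historical per-hour connection counts;
    observed maps a port to the count for the current hour.
    """
    flagged = []
    for port, count in observed.items():
        history = baseline.get(port)
        if not history or len(history) < 2:
            # No usable baseline for this port: treat it as anomalous.
            flagged.append(port)
            continue
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            # Perfectly flat history: any different value is a deviation.
            if count != mu:
                flagged.append(port)
        elif abs(count - mu) / sigma > threshold:
            flagged.append(port)
    return sorted(flagged)
```

Ports that never appeared in the baseline are flagged deliberately, since a previously unseen listening or destination port is itself a deviation worth reviewing.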
Detection Examples:
Network Perimeter Level:
Analyze packet captures to identify scanning activities using tools like Wireshark.
Monitor for unusual traffic patterns, such as excessive connections to specific ports.
Host Perimeter Level:
Use tools like "lsof" and "netstat" to identify open ports and network connections.
Example: Detecting Linux malware that communicates over port 22 by reviewing the host's network connections.
Host-Level:
Endpoint protection solutions can warn users about malicious executables.
Example: A user being alerted about a quarantined malicious file named "cloudcar.exe".
Application-Level:
Analyze application logs for abnormal behavior.
Example: An analyst identifying a potential web shell by analyzing IIS logs for overly long execution times.
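The IIS example above can be approximated with a short script. This sketch assumes the logs use the W3C extended format, where a "#Fields:" header names the columns and "time-taken" is reported in milliseconds; the 10-second threshold and the function name are assumptions for illustration, and a long execution time is only a lead to investigate, not proof of a web shell.

```python
from typing import Iterable, List, Tuple


def slow_requests(lines: Iterable[str],
                  min_ms: int = 10_000) -> List[Tuple[str, int]]:
    """Return (cs-uri-stem, time-taken) pairs at or above min_ms."""
    fields: List[str] = []
    hits: List[Tuple[str, int]] = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header tells us which column holds which value.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        values = line.split()
        if len(values) != len(fields):
            continue  # skip malformed rows rather than guess
        row = dict(zip(fields, values))
        try:
            taken = int(row["time-taken"])
        except (KeyError, ValueError):
            continue
        if taken >= min_ms:
            hits.append((row.get("cs-uri-stem", "?"), taken))
    return hits
```

Passing an open log file to the function works because a file object iterates over its lines; sorting the result by time or grouping it by URI is a natural next step.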
Log-Reviewing Resources:
Microsoft Azure Logging & Auditing: https://docs.microsoft.com/en-us/azure/security/azure-log-audit
Apache HTTP Server Logs: http://httpd.apache.org/docs/current/logs.html
System Administrators & Detection:
Utilize cheat sheets to assist administrators in detecting:
Abnormal processes, services, files, network usage, scheduled tasks, user accounts, and third-party components.
Schedule periodic meetings with administrators to review detected abnormalities and refine the detection methodology.
Damage Estimation Questions:
What is the impact of the vulnerability exploitation? (e.g., remote code execution vs. information disclosure)
Are any critical assets affected?
What are the minimum requirements for effective exploitation? (e.g., privileged network position, internet connection, valid credentials, default configurations)
Is the vulnerability being actively exploited in the wild?
Is there a proposed remediation strategy?
Is there evidence of increased spreading capabilities?
2.3 Containment, Eradication & Recovery
2.3.1 Containment
Containment aims to prevent the incident from escalating and limit its impact.
Before Containment:
Identify if a malicious insider is involved: Assess the potential for insider threat.
Isolate the Investigation Area: Limit access to affected systems and data.
Utilize Incident Casualty Forms: Document the incident details and impact.
Classify the Incident: Based on factors like system criticality, impact, and extent of compromise.
Critical Systems vs. Non-Critical Systems: Affects response time and prioritization.
Impact on Assets: Determines the urgency of investigation.
Extent of Compromise: Influences escalation levels and communication with stakeholders.
Incident Communication:
Inform a senior management member (e.g., CIO, CISO) who is familiar with the incident handling team.
Ensure communication flows include both security and other management personnel to keep affected business units informed.
Incident Tracking:
Use tools like RTIR to consolidate incident information and avoid duplication of effort.
Coordinate with system administrators, help desks, and incident reporting systems to ensure comprehensive tracking.
Containment Sub-phases:
Short-term Containment:
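Render the intrusion ineffective without altering the machine’s hard drive, so that evidence is preserved.
Disable network connectivity or disconnect the machine from power. Note that cutting power destroys volatile data such as RAM contents, so acquire that data first if it is needed.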
Place the machine in a separate/isolated VLAN.
Change DNS records (e.g., point the affected host name at a different IP address).
Isolate the machine through router or firewall configurations.
Consider using Canarytokens (http://canarytokens.org/generate) to track attackers.
System Back-up:
Data Acquisition:
Order of Volatility: Acquire the most volatile data first (e.g., RAM) to prevent data loss.
Types of Data Acquisition:
Static Acquisition: Collect non-volatile data (e.g., hard disks, flash drives).
Dynamic Live Acquisition: Collect volatile data (e.g., RAM, temporary files) while the system is running.
Dead Acquisition: Collect data without relying on the suspect operating system (e.g., when a rootkit is suspected), typically by booting the system's hardware from trusted media or by removing the drive.
Data Acquisition Approaches:
Imaging: Create a bit-for-bit forensic image of the hard drive, stored as one or more files.
Cloning: Create an exact bit-for-bit copy of the hard drive on another drive.
Write Blockers: Prevent data alteration during acquisition by blocking write operations. Examples include:
WiebeTech Forensic UltraDock from CRU Inc.
Tableau Forensic Imager TD3 from Guidance Software.
Evidence Integrity:
Use hash functions (e.g., SHA-2, SHA-3) to validate the integrity of acquired data.
Store and communicate hash strings securely to prove data has not been altered.
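The evidence integrity step can be sketched with Python's standard hashlib module. The example uses SHA-256 (a member of the SHA-2 family) and reads the image in chunks so that large files do not need to fit in memory; the function names and chunk size are assumptions for illustration.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_image(path: str, recorded_hash: str) -> bool:
    """Compare a fresh digest with the hash recorded at acquisition time."""
    return sha256_of_file(path) == recorded_hash.strip().lower()
```

The hash is computed once at acquisition and recorded with the chain-of-custody documentation; recomputing it later and getting the same value supports the claim that the image has not changed.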
Long-term Containment:
Decide the containment approach based on the incident's nature and impact.
If the attacker's actions or motives are unclear, monitor the system closely.
If the system cannot be taken offline (e.g., critical systems), implement measures to contain the incident while minimizing disruption.
Long-term containment activities include:
Patching affected and related systems.
Installing (H)IDS.
Changing passwords and trust relationships.
Implementing additional ingress/egress rules.
Dropping packets associated with the incident's source or destination.
Eliminating attacker access.
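As one illustration of dropping packets associated with an incident's source or destination, the sketch below turns a list of attacker addresses into iptables commands. It only builds the command strings and does not execute them. The choice of iptables, the INPUT and OUTPUT chains, and the function name are assumptions; many environments would apply equivalent rules on a perimeter firewall or router instead.

```python
import ipaddress
from typing import Iterable, List


def drop_rules(addresses: Iterable[str]) -> List[str]:
    """Build iptables/ip6tables commands that drop traffic to and from addresses."""
    rules = []
    for raw in addresses:
        # ip_network validates the input and raises ValueError if it is malformed.
        net = ipaddress.ip_network(raw, strict=False)
        tool = "iptables" if net.version == 4 else "ip6tables"
        rules.append(f"{tool} -I INPUT -s {net} -j DROP")   # ingress from the source
        rules.append(f"{tool} -I OUTPUT -d {net} -j DROP")  # egress to the destination
    return rules
```

Validating each address before generating a rule avoids pushing a mistyped indicator into a firewall configuration during a stressful response.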
2.3.2 Eradication
Eradication focuses on eliminating the root cause of the incident and removing all attacker artifacts.
Identify the root cause and indicators of the incident using information from the Detection & Analysis and Containment phases.
Isolate the intrusion and identify the attack vector.
Eliminate attacker residuals:
Remove malware, including backdoors, rootkits, and malicious kernel-mode drivers.
For rootkits, zero the drive, reformat, and rebuild the system using trusted install media.
Analyze logs to identify credential reuse through protocols like Remote Desktop, SSH, and VNC.
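For SSH, the log review mentioned above can start from the "Accepted ... for USER from ADDRESS port N" lines that OpenSSH writes on successful logins. The sketch below groups source addresses by account and reports accounts seen from more than one source. The function names and the threshold of one source are assumptions, and several sources for one account can be legitimate, so the output is a list of leads rather than findings.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Set

# Matches successful OpenSSH logins, e.g.
# "Accepted password for alice from 10.0.0.5 port 51234 ssh2"
ACCEPTED = re.compile(
    r"Accepted (?P<method>\S+) for (?P<user>\S+) from (?P<src>\S+) port \d+"
)


def sources_per_account(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each account to the set of source addresses it logged in from."""
    seen: Dict[str, Set[str]] = defaultdict(set)
    for line in lines:
        match = ACCEPTED.search(line)
        if match:
            seen[match.group("user")].add(match.group("src"))
    return dict(seen)


def accounts_with_multiple_sources(lines: Iterable[str],
                                   limit: int = 1) -> Dict[str, Set[str]]:
    """Return accounts that logged in from more than `limit` sources."""
    return {user: srcs
            for user, srcs in sources_per_account(lines).items()
            if len(srcs) > limit}
```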
Improve Defenses:
Configure additional router and firewall rules.
Obscure the affected system's position (e.g., move it to a new host name or IP address).
Establish effective system hardening, patching, and vulnerability assessment procedures.
Assess other systems in the network for the same vulnerability.
2.3.3 Recovery
Recovery involves restoring affected systems to normal operation.
Restore affected systems to production:
Perform quality assurance activities to ensure the system's running condition.
Ensure the system includes everything needed for operations.
Consult with the business unit to determine when to bring the system back online.
Post-Recovery Monitoring:
Monitor for oversights and signs of re-infection or re-compromise.
Use network-based and host-based intrusion detection systems to detect patterns related to the original attack.
Analyze critical logs and events for signs of re-infection.
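Checking logs for signs of re-infection often comes down to searching for indicators collected during the original incident. The sketch below does a case-insensitive substring search and reports the line numbers where each indicator appears. The function name is an assumption, and plain substring matching is a deliberate simplification: an indicator such as an IP address can also match a longer address that contains it.

```python
from typing import Dict, Iterable, List


def match_indicators(lines: Iterable[str],
                     indicators: Iterable[str]) -> Dict[str, List[int]]:
    """Map each indicator to the 1-based line numbers where it appears."""
    wanted = [ioc.lower() for ioc in indicators if ioc]
    hits: Dict[str, List[int]] = {ioc: [] for ioc in wanted}
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        for ioc in wanted:
            if ioc in lowered:
                hits[ioc].append(number)
    # Report only the indicators that were actually seen.
    return {ioc: numbers for ioc, numbers in hits.items() if numbers}
```

Running the same indicator list against fresh logs on a schedule after recovery gives an inexpensive check for recurrence alongside the intrusion detection systems.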
3. Post-Incident Activity
The post-incident activity phase focuses on learning from the incident and improving incident handling processes.
Objective: Reflect on the incident and identify weaknesses, oversights, and blind spots in processes and technological measures.
Key Activities:
Create a Detailed Report:
Include both weaknesses and successful detection methods.
Highlight the effectiveness of response efforts against specific stages of the attack.
Schedule a Debrief Meeting:
Discuss the report with all involved parties, such as system administrators, affected business unit representatives, and the IT security team.
Focus on improving processes, technological measures, and visibility.
4. Incident Handling Forms
Examples of essential incident handling forms include:
Incident Contact List: Details of key personnel and stakeholders, such as:
CISO/CIO
SPOC of the incident handling or CSIRT team
Legal department contact
Public relations contact
ISP SPOC
Local cybercrime unit
Incident Detection: Information about the incident detection, such as:
The first person who detected the incident
The incident’s summary (type of incident, incident location, incident detection details, etc.)
Incident Casualties: Details of affected systems and their specifications, such as:
Location of affected systems
Date and time incident handlers arrived
Affected system details (e.g., hardware vendor, serial number, network connectivity details, host name, IP address, MAC address)
Incident Containment: Information about containment activities, such as:
Isolation activities per affected system (e.g., isolation status, date and time, method of isolation)
Back-up activities per affected system (e.g., handler who performed the back-up, back-up details)
Incident Eradication: Details of eradication efforts and root cause analysis, such as:
Handler(s) performing investigation on the system
Whether the incident’s root cause was discovered
Actions taken to ensure the incident’s root cause was remediated and the possibility of a new incident eliminated
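The forms above are essentially structured records, so they map naturally onto simple data structures. The sketch below models part of the Incident Casualties and Incident Containment forms; the class names, field names, and the helper method are assumptions made for illustration and do not reproduce any official form layout.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class AffectedSystem:
    host_name: str
    ip_address: str
    mac_address: str
    hardware_vendor: str = ""
    serial_number: str = ""
    isolated: bool = False                  # containment: isolation status
    isolation_method: str = ""              # e.g. "isolated VLAN"
    isolated_at: Optional[datetime] = None  # containment: date and time


@dataclass
class CasualtyForm:
    location: str
    handlers_arrived_at: datetime
    systems: List[AffectedSystem] = field(default_factory=list)

    def not_yet_isolated(self) -> List[str]:
        """Host names of affected systems that still need containment."""
        return [s.host_name for s in self.systems if not s.isolated]
```

Keeping the forms in a structured form like this makes it easier to feed them into an incident tracking tool and to see at a glance which systems are still awaiting containment.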