Security emergencies are never scheduled, and incident response to these emergencies must be immediate, effective, and thorough. An early step in an incident response is triage. During triage, a responder will assess the situation and determine the scope and severity of the incident. In this early stage, minutes count, and responders need the best information available to help them decide how to respond to the emergency. This information often comes in the form of security and information technology dashboards that represent the health and operational status of the network and devices, built from detailed device and host log information. As the response to the incident evolves from triage to investigation, the defenders expand their assessment to any system that could provide clues to the source, cause, or extent of the incident. Application and system logs play a critical part in the different phases of a security investigation. As an application developer or system administrator, you can help these defenders by configuring your systems to provide the most helpful log data to aid these efforts.
Most modern operating systems, applications, and devices can write events to log files that are forwarded to a centralized log server. These events often describe significant activity on the system, such as starting up, shutting down, errors, or interesting user activity. Forwarding logs to a central server helps ensure the logs remain available and are not tampered with if the host system or device is compromised. A centralized log repository provides one place where defenders can run queries about devices across their environment, which saves them time and effort. The size and complexity of your environment will guide your choice of a centralized logging server or platform. Configuring a very simple syslog server is a quick way to centralize your logs. For a more advanced approach, consider an event management platform or framework (like Elastic Stack’s Elasticsearch, Logstash, and Kibana) that enhances centralized logging with analytics features that make it easier to find relevant data through data reduction, parsing, and visualization. A framework and business intelligence engine that helps analyze and pivot raw log data can help a defender reduce piles of log data and convert event information into actionable alerts. These analytics depend on robust, quality log data as a foundation.
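As a minimal sketch of forwarding application events to a central syslog server, the Python example below uses the standard library's SysLogHandler, which sends over UDP by default. The logger name, the "example-app" tag, and the message are hypothetical. A local UDP socket stands in for the central server so the example runs on its own; in a real deployment, the host and port would be those of your organization's log collector.

```python
import logging
import logging.handlers
import socket


def build_forwarding_logger(host: str, port: int) -> logging.Logger:
    """Return a logger that forwards events to a syslog server over UDP."""
    logger = logging.getLogger("example.forwarder")  # hypothetical name
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(logging.INFO)
    # SysLogHandler uses UDP (SOCK_DGRAM) unless told otherwise.
    handler = logging.handlers.SysLogHandler(address=(host, port))
    handler.setFormatter(
        logging.Formatter("example-app: %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger


# Stand-in for the central log server: a UDP socket on the loopback
# interface. This is an assumption made only to keep the demo runnable.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
receiver.settimeout(2.0)
server_port = receiver.getsockname()[1]

log = build_forwarding_logger("127.0.0.1", server_port)
log.warning("service restarted unexpectedly")

# The "server" receives the event as a single syslog datagram.
datagram = receiver.recv(4096).decode("utf-8")
print(datagram)

for handler in log.handlers:
    handler.close()
receiver.close()
```

UDP delivery is not guaranteed, so environments that cannot afford to lose events may prefer a TCP or TLS transport to the collector.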
Choosing what data to log can be a balancing act. Logging too little information means your defenders may not have enough detailed information to assist them in a security investigation. Logging too much information might overwhelm systems and operators.
Many applications let you specify the level of logging verbosity and tag your events with metadata describing the severity of the event. An example would be differentiating an abnormal end of an important program from a new connection to a web server. Both may be interesting, but their criticality differs. Faster network connections, cheaper storage, and more advanced analytics programs mean organizations can increase the types of events they log. Even so, be careful about flooding your system with noise that makes valuable information harder to discern. Larger environments will still want to review which types of logs are triggered by which devices and manufacturers in order to balance total log volume against event usefulness.
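The effect of a verbosity threshold can be sketched with Python's standard logging module. In this illustration, the logger name and both messages are hypothetical. The threshold is set to WARNING, so the routine connection event is dropped while the abnormal termination is recorded.

```python
import io
import logging


def build_logger(threshold: int, stream) -> logging.Logger:
    """Return a logger that records only events at or above threshold."""
    logger = logging.getLogger("example.app")  # hypothetical name
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(threshold)
    handler = logging.StreamHandler(stream)
    # Tag each event with its severity so downstream tools can filter on it.
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = build_logger(logging.WARNING, buffer)

# Routine event: below the threshold, so it is not recorded.
log.info("new connection to web server from 172.16.0.25")
# High-severity event: above the threshold, so it is recorded.
log.critical("payment service terminated abnormally")

print(buffer.getvalue(), end="")
```

Lowering the threshold to logging.INFO would record both events, which is the trade-off between detail and volume described above.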
A well-structured logged event should include information such as:
No single log standard has been adopted by all manufacturers, and event record data schemas will differ. This is one reason why Security Information and Event Management (SIEM) systems have become so popular. SIEM systems attempt to normalize events across manufacturers to assist in triage efforts. These systems often present a single dashboard and centralized alerting that includes ingestion and correlation of log data from many different sources. As an application developer, ask the security defenders and responders in your organization which events and associated metadata are particularly useful or would fit well into their response framework and programs. Then, see if you can configure your environment’s devices to provide this information.
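To illustrate what normalization means in practice, the Python sketch below maps two made-up vendor schemas onto one common set of field names. The vendor names, field names, and sample events are all hypothetical and do not describe any real product or SIEM; the point is only that events with different schemas end up queryable by the same keys.

```python
# Hypothetical per-vendor maps: native field name -> common field name.
FIELD_MAPS = {
    "vendor_a": {
        "ts": "timestamp",
        "src": "source_ip",
        "usr": "user",
        "msg": "message",
    },
    "vendor_b": {
        "eventTime": "timestamp",
        "clientAddress": "source_ip",
        "account": "user",
        "description": "message",
    },
}


def normalize(vendor: str, raw: dict) -> dict:
    """Rename a vendor's native fields to the common schema."""
    mapping = FIELD_MAPS[vendor]
    event = {
        common: raw[native]
        for native, common in mapping.items()
        if native in raw
    }
    event["vendor"] = vendor  # keep the origin for later correlation
    return event


event_a = normalize("vendor_a", {
    "ts": "2021-01-15T08:30:00Z", "src": "172.16.0.10",
    "usr": "jdoe", "msg": "logon failed",
})
event_b = normalize("vendor_b", {
    "eventTime": "2021-01-15T08:30:02Z", "clientAddress": "172.16.0.10",
    "account": "jdoe", "description": "authentication failure",
})

# Both events now expose the same keys, so one query covers both sources.
print(sorted(event_a) == sorted(event_b))
```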
The body of the event contains the main message describing what happened. Take for example a user logon event. This event should include enough information to trace the activity back to the actual human that triggered the event. The event should specify whether the identity was device specific, domain credentials, or a federated identity. For more complex scenarios like federated or centralized identity management platforms, be sure you can track a user event at a specific device all the way across these interconnected systems to a specific individual. In the crisis of a security incident, the defenders are pressed for time. Running drills or tabletop exercises can help defenders determine how to quickly and efficiently analyze event data and correlate the data into actions. As an application designer, consider formatting your event message body in a way to facilitate machine parsing through scripted automation so that other systems can directly ingest the event message.
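One way to make an event body machine-parsable is to emit it as a JSON object with consistent field names. The Python sketch below builds a user logon event in that style. The field names, the three identity types, and the sample values are assumptions chosen for illustration, not a standard schema.

```python
import json
from datetime import datetime, timezone

# Assumed identity categories, mirroring the distinction in the text above.
IDENTITY_TYPES = {"local", "domain", "federated"}


def build_logon_event(username, identity_type, source_ip, hostname,
                      success, when=None):
    """Return a user logon event serialized as a JSON string."""
    if identity_type not in IDENTITY_TYPES:
        raise ValueError(f"unknown identity type: {identity_type}")
    when = when or datetime.now(timezone.utc)
    event = {
        # Always record time in UTC so events from many hosts line up.
        "timestamp": when.astimezone(timezone.utc).isoformat(),
        "event": "user_logon",
        "outcome": "success" if success else "failure",
        "user": username,
        "identity_type": identity_type,
        "source_ip": source_ip,
        "hostname": hostname,
    }
    # sort_keys gives a stable field order, which simplifies diffing.
    return json.dumps(event, sort_keys=True)


line = build_logon_event(
    username="jdoe",
    identity_type="domain",
    source_ip="172.16.0.25",
    hostname="sea1DC01",
    success=False,
    when=datetime(2021, 1, 15, 8, 30, tzinfo=timezone.utc),
)
print(line)

# Another system can ingest the event without custom text parsing.
parsed = json.loads(line)
```

Because each event is a single self-describing line, a log platform can index fields such as user and outcome directly instead of relying on fragile pattern matching.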
Beyond the message body, ensure that your application records are accurate and include useful metadata supporting the event. For example, be sure the date and time are correct and synchronized across all your devices using a Network Time Protocol (NTP) server, and that you can track the IP address and hostname to a unique device, system, or user. Because many IP addresses are dynamic—or reused across different offices, clouds, or datacenters—consider adding or associating location data to the event. A DNS/host naming convention might help in some cases. For example, an event containing the metadata hostname sea1DC01 with an IP address of 172.16.0.10 could be decoded as the first domain controller in the primary Seattle datacenter with a corresponding RFC 1918 private IP address of 172.16.0.10. Add IP address reservations for all servers and other critical devices to not only ensure that their IP addresses do not change—or are allocated to a different device—but also provide another data repository that a defender can leverage in their investigative toolkit. Also, consider some sort of IP lookup database that you can provide your defenders so they can quickly associate an IP address in a log with a system and owner.
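The hostname convention and IP lookup described above can be sketched in Python as follows. The regular expression encodes one possible reading of the sea1DC01 pattern (three-letter site, datacenter number, role code, two-digit instance), and the inventory table and owner name are hypothetical; a real environment would define its own convention and draw the lookup data from its own asset records.

```python
import ipaddress
import re

# Assumed convention: site (3 letters) + datacenter digit + role + instance.
HOSTNAME_PATTERN = re.compile(
    r"^(?P<site>[a-z]{3})(?P<datacenter>\d)"
    r"(?P<role>[A-Z]{2,})(?P<instance>\d{2})$"
)

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

# Hypothetical lookup table built from IP address reservations.
INVENTORY = {
    "172.16.0.10": {"hostname": "sea1DC01", "owner": "identity-team"},
}


def decode_hostname(hostname: str):
    """Split a hostname into its convention fields, or return None."""
    match = HOSTNAME_PATTERN.match(hostname)
    if match is None:
        return None
    return {
        "site": match["site"],
        "datacenter": int(match["datacenter"]),
        "role": match["role"],
        "instance": int(match["instance"]),
    }


def is_rfc1918(address: str) -> bool:
    """Return True if the address falls in an RFC 1918 private range."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)


def enrich(address: str) -> dict:
    """Attach inventory and location details to an IP seen in a log."""
    record = dict(INVENTORY.get(address, {}))
    record["ip"] = address
    record["private"] = is_rfc1918(address)
    if "hostname" in record:
        record["location"] = decode_hostname(record["hostname"])
    return record


print(enrich("172.16.0.10"))
```

An address that is missing from the inventory still returns its IP and private-range flag, which itself tells a defender that the device is not in the reservation records.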
Choosing which events to log depends on your environment and organization. Some devices only allow you to choose the severity level of the events to log. In this scenario, the only action an application developer or system administrator can take is to choose a severity level that might range from critical to information only. Other platforms provide a high degree of customization. For example, through Microsoft Windows Group Policy objects, administrators can configure in fine detail which specific events to log in their Windows Active Directory environment. Different cloud providers, including Software-as-a-Service (SaaS) applications, will each offer different levels of logging specificity and control, so be sure to understand what is available. Of course, if you develop applications, you will have much more flexibility to choose exactly what your application will record to an event log. For each of these situations, take the perspective of a defender responding to a security incident and ask yourself what information would be useful to a security investigation. Examples of useful events include:
These examples are just the starting point for consideration as you develop your event logging strategy. Whether you are an administrator or application developer, put yourself in the shoes of an information security defender and ask how each of the events you are enabling or creating might help resolve a security incident. Even this simple role-playing exercise might help identify data or process gaps that you can work to methodically address before an actual crisis.
Jeff Fellinge has over 25 years’ experience in a variety of disciplines ranging from Mechanical Engineering to Information Security. Jeff led information security programs for a large cloud provider to reduce risk and improve security control effectiveness at some of the world’s largest datacenters. He enjoys researching and evaluating technologies that improve business and infrastructure security and also owns and operates a small metal fabrication workshop.
Copyright ©2021 Mouser Electronics, Inc.
Mouser® and Mouser Electronics® are registered trademarks of Mouser Electronics, Inc.