Depths of SIEM: correlations "out of the box". Part 2. Data schema as a reflection of the "world" model
This is the second article in the series devoted to the methodology for creating correlation rules for SIEM systems that work "out of the box". In the previous article we set ourselves this task, described the benefits of solving it, and listed the main problems that stand in our way. In this article we begin the search for solutions, starting with the problem of the transformation of the "world" model and its manifestations at the stage of event normalization.
The transformation of the "world" model was described in the first article. To recall its essence briefly: when a phenomenon occurs at the source of events (for example, a process starts in the OS), it is recorded in different formats, first in memory, then in the OS event log, and then in the SIEM system. Each stage of processing is accompanied by a loss of data, because the OS itself has one model of the "world", while the OS log has another, limited by a set of fields: the log scheme. Thus one model with a large number of parameters is mapped (transformed) onto another with a smaller number. Normalizing and storing the event in the SIEM is yet another transformation, which also loses data, because the SIEM has its own internal model of the "world".
It is difficult to find a way to transform one model into another without loss. Knowing this limitation, we need an approach to normalization, and to choosing the list of fields in the event scheme, under which the information important for correlation and for the further investigation of IS incidents is not lost.
Within a SIEM, the model is represented by a scheme: a set of fields into which data from the original event is placed during normalization. Specialists later rely on this scheme when creating correlation rules. For incident investigators and the developers of correlation rules to interpret normalized events unambiguously, the scheme must satisfy the following basic properties:
be unified for events of any type and source;
clearly describe who worked with whom and how;
preserve the essence and context of interaction.
When developing normalization rules, the information about the interaction must be located in the original event and placed into specially designated fields. The same must be done with the context and the essence of the interaction (more on this in the next article).
The question arises: is it possible to single out typical interaction schemes that would fit any event created by any possible IT or IS source? If so, what do these schemes look like?
To answer these questions, we need to analyze as many of the normalization rules already developed and working in SIEM solutions as possible and look for common patterns. In the course of this work, more than 3000 normalization rules for more than 100 different sources were analyzed, taken from such solutions as Positive Technologies MaxPatrol SIEM and Micro Focus ArcSight. The analysis led to the following conclusions:
Typical schemes of interaction exist.
In each individual event, as a rule, there is information about the interaction at the network level and at the application level.
Typical schemes of interaction may differ at different levels, and this must be taken into account.
Schemes of interaction at network and application levels
Let's describe the typical schemes for each level. First, we need to identify the entities that are always present in events; the interaction schemes are then built from them. These entities are:
Subject. The entity that acts on the Object. For example, a user changing a registry key, or a host with one IP address sending a packet to a host with another IP address.
Object. The entity acted upon by the Subject.
Source. As a rule, the host that registers the interaction of the Subject with the Object and generates the event itself. For example, the Source is the firewall that recorded the transfer of packets from one host (the Subject) to another host (the Object).
Transmitter. Sometimes the SIEM receives events not directly from the Source but from an intermediate server through which the events pass. The simplest example is an intermediate syslog server. A more complicated example is a management server, such as Kaspersky Security Center, acting as the Transmitter. In that case the Source is a specific Kaspersky Endpoint Security agent.
However, not all of these entities are necessarily present in an event at the same time (more on this below), so it is important to agree in advance on how the corresponding fields of the scheme are filled in such cases. This makes it possible to clearly distinguish the cases in which a field was left empty because of a mistake by the specialist who developed the normalization rule from the cases in which the original event simply contained no data about the entity.
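One possible way to encode such an agreement is to reserve a distinct marker for "the original event carried no data about this entity", so that an empty field can only mean that the normalization rule never set it. The sketch below is an illustration under assumptions: the field names, the `ABSENT` marker and the `unfilled_fields` helper are hypothetical and not taken from any particular SIEM, and the IP addresses come from a documentation range.

```python
from dataclasses import dataclass, fields
from typing import Optional

# Hypothetical marker: the rule author checked the raw event
# and the entity is really not there.
ABSENT = "<absent>"


@dataclass
class NetworkEntities:
    """Network-level entities of a normalized event (illustrative names)."""
    # None means "the normalization rule never filled this field".
    subject_ip: Optional[str] = None
    object_ip: Optional[str] = None
    source_ip: Optional[str] = None
    transmitter_ip: Optional[str] = None


def unfilled_fields(event: NetworkEntities) -> list[str]:
    """Return the fields the rule left untouched (likely rule mistakes)."""
    return [f.name for f in fields(event) if getattr(event, f.name) is None]


# Event received directly from a firewall: no Transmitter, stated explicitly.
event = NetworkEntities(
    subject_ip="192.0.2.10",
    object_ip="192.0.2.20",
    source_ip="192.0.2.1",
    transmitter_ip=ABSENT,
)

# A rule that forgot to decide about the Object and the Transmitter.
incomplete = NetworkEntities(subject_ip="192.0.2.10", source_ip="192.0.2.1")
```

With this convention, a quality check over normalized events can flag `incomplete` for review while accepting `event`, even though neither of them names a Transmitter.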
We now turn to the interaction schemes and examples of events. For clarity, all examples are based on file logs, syslog messages or records in a relational database, but the schemes apply equally to other log formats, for example binary ones.
The primary identifier for network-level entities is the IP address. It is important to understand that there may be other related identifiers: MAC addresses at the link layer, FQDNs at the application level. The question arises: do they refer to the same entity or to different ones? Can the identifiers of one and the same entity change over time? A separate article will be devoted to this; for now we settle on the IP address as the main identifier in the network-level interaction models.
The typical interaction schemes at this level can be roughly divided into two classes: basic and degenerate.
The basic schemes of interaction
scheme 1, if there were no intermediate syslog server in it and we received the events directly from the firewall.
???.1 - Source (the firewall node),
???.1 - Subject (sends UDP packets),
???.1 - Object (receives UDP packets).
Scheme 3. Interaction with a set of Objects
This interaction scheme is quite rare at the network level and is, as a rule, typical of network equipment events. In this scheme one Subject interacts with multiple Objects; such an interaction is present in events describing multicast or broadcast delivery.
Note that a set of Objects is sometimes hidden behind a common identifier: a subnet address or a broadcast address. This must be kept in mind, because during event analysis, including at the level of correlation rules, it is easy to miss a potentially important interaction when the address of the individual Object is hidden behind the group address.
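A correlation rule that looks for one particular host as the Object can take this into account by also checking whether the Object address is a group address. The following is a minimal sketch using Python's standard `ipaddress` module; the function names and the idea of passing the local subnet as a parameter are assumptions made for illustration, and the addresses come from documentation ranges.

```python
import ipaddress


def is_group_address(object_ip: str, local_net: str) -> bool:
    """True if the Object address denotes a group of hosts, not one host."""
    addr = ipaddress.ip_address(object_ip)
    net = ipaddress.ip_network(local_net)
    return (
        addr.is_multicast
        or addr == net.network_address      # subnet address
        or addr == net.broadcast_address    # directed broadcast
        or addr == ipaddress.ip_address("255.255.255.255")  # limited broadcast
    )


def may_involve(host_ip: str, object_ip: str, local_net: str) -> bool:
    """Could host_ip be one of the Objects hidden behind object_ip?"""
    if host_ip == object_ip:
        return True
    if ipaddress.ip_address(object_ip).is_multicast:
        # Group membership cannot be derived from the address alone,
        # so the host cannot be ruled out.
        return True
    host = ipaddress.ip_address(host_ip)
    net = ipaddress.ip_network(local_net)
    return is_group_address(object_ip, local_net) and host in net
```

For example, `may_involve("192.0.2.20", "192.0.2.255", "192.0.2.0/24")` is true, so an event addressed to the subnet broadcast address is not silently skipped when investigating host 192.0.2.20.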
The following example shows an event from an IGMP Relay server, through which a membership request for a multicast address is relayed.
???.1 - Source (IGMP Relay server),
???.1 - Subject (requests membership in the group),
???.1 - Object (the group address).
Degenerate schemes of interaction
Subject, Object and Source are the main entities in the group of basic interaction schemes. However, it is not uncommon for one of these entities to be missing from an event.
Scheme 4. Interaction without the Object
This scheme is typical of situations in which the Subject reports a change in its own internal state, that is, it acts as the Subject and the Object at the same time. For example, this interaction can be observed in configuration change or malware detection events on a workstation. This information, however, is recorded not by the Subject itself but by a centralized management system, and it is stored in that system's log.
The example shows how Symantec Management Server records that a Symantec Endpoint Protection agent it manages has detected a malicious file on its node.
???.1 - Source (Symantec Management Server),
???.1 - Subject (Symantec Endpoint Protection agent).
Scheme 5. Combining the role of the Subject and the Object in Source
The last degenerate interaction scheme is typical of the situation in which the SIEM receives events from a Source that reports changes in its own internal state: for example, reconfiguration of a device or software, or a network port being enabled or disabled. In this scheme the role of the Source coincides with the roles of the Subject and the Object. Unlike the previous scheme, here the events reach the SIEM directly from the Source.
In this example, a Cisco IOS-based switch reports that one of its interfaces has come up.
Here ???.1 - Source (switch).
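The difference between the basic and the degenerate schemes comes down to which entities are present in the event, so a normalized event can be classified mechanically. The sketch below assumes that missing entities are passed as `None`; the function name and the returned labels are hypothetical, chosen only to mirror the scheme titles above.

```python
from typing import Optional


def classify_network_scheme(
    source: str,
    subject: Optional[str] = None,
    obj: Optional[str] = None,
) -> str:
    """Classify an event by the network-level entities it carries."""
    if subject is None and obj is None:
        # Only the Source is known: it also plays the Subject and Object roles.
        return "scheme 5: Source is also Subject and Object"
    if subject is not None and obj is None:
        # The Subject reports about itself through another system's log.
        return "scheme 4: interaction without the Object"
    if subject is not None and obj is not None:
        return "basic scheme"
    # An Object without a Subject is not described by any scheme.
    return "unexpected: Object without Subject"


# Addresses are from documentation ranges and purely illustrative.
switch_event = classify_network_scheme(source="192.0.2.1")
av_event = classify_network_scheme(source="192.0.2.1", subject="192.0.2.10")
fw_event = classify_network_scheme(
    source="192.0.2.1", subject="192.0.2.10", obj="192.0.2.20"
)
```

The last branch is deliberate: an event that names an Object but no Subject most likely points to a mistake in the normalization rule and is worth flagging rather than forcing into one of the schemes.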
At the application level the already familiar entities interact: the Subject and the Object. However, all information about the Source and the Transmitter stays at the network level and is not reflected at the application level.
Most event types carry interactions at both the network level and the application level. Note, however, that events generated directly by application software, for example 1C: Enterprise, Microsoft SQL Server or Oracle Database, can contain only application-level interactions.
In addition, a new entity appears at the application level: the Resource.
Resource - an intermediate entity through which the Subject influences the Object without direct interaction. For example, granting Alex access to the MyFile file for Bob. Here Alex is the Subject, Bob is the Object, MyFile is the Resource. Note, in this example, Alex does not directly interact with Bob.
Important: application-level events can contain additional parameters of the Subject and the Object as well as of the Resource itself. For example, the additional parameters of a resource such as a file can be the directory in which it is located, or its size.
In this case, the Subject, Object and Resource are identified by name or unique identifier: e-mail address, file name, directory name, table name in the database.
Consider the additional interaction schemes that are specific to the application layer.
Scheme 6. Interaction through resource
In this scheme, the Subject affects the Object indirectly, through an intermediate Resource. Events with this scheme typically appear in database audit logs or in records of operations on file and directory access rights at the OS level.
The example shows an entry from the Oracle Database audit log. It records a role being revoked from a user.
«ALEX» - Subject (the name of the user who revokes the role),
«BOB» - Object (the name of the user from whom the role is revoked),
«ROLE» - Resource (the name of the role being revoked).
Scheme 7. Interaction with a variety of resources
At the application level, as at the network level, there are types of events in which the Subject interacts with the Object through a set of Resources. Very rarely, the number of Objects is also more than one. Such events appear when mass operations are recorded: for example, granting one user access to several files, or changing the set of rules that make up a policy.
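Schemes 6 and 7 differ only in the number of Resources, so both can be held in one structure. The sketch below is an illustration, not the scheme of any particular SIEM: the `AppInteraction` class and its methods are hypothetical, and the second file name is made up. Expanding a mass operation into one triple per Resource lets a rule written for a single Resource match events that mention several.

```python
from dataclasses import dataclass
from typing import Iterator, Tuple


@dataclass(frozen=True)
class AppInteraction:
    """Application-level interaction: Subject acts on Object via Resources."""
    subject: str
    obj: str
    resources: Tuple[str, ...]

    def __post_init__(self) -> None:
        # Both schemes require at least one Resource.
        if not self.resources:
            raise ValueError("at least one Resource is required")

    def scheme(self) -> str:
        # One Resource corresponds to scheme 6, several to scheme 7.
        return "scheme 6" if len(self.resources) == 1 else "scheme 7"

    def triples(self) -> Iterator[Tuple[str, str, str]]:
        """Yield one (subject, resource, object) triple per Resource."""
        for resource in self.resources:
            yield (self.subject, resource, self.obj)


# The role revocation from the Oracle Database example.
revoke = AppInteraction(subject="ALEX", obj="BOB", resources=("ROLE",))

# A mass operation: Alex grants Bob access to several files at once.
grant = AppInteraction(
    subject="Alex", obj="Bob", resources=("MyFile", "MyFile2")
)
```

Here `revoke.scheme()` gives "scheme 6" and `grant.scheme()` gives "scheme 7", while `list(grant.triples())` produces one triple for each file.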
In the example, Security Code vGate, a solution for protecting virtual environments, records the addition of new policies to a set.
"Admin @ VGATE" - Subject (the name of the user who modifies the set of policies),
"Base" - Object (the set of policies),
"Install and maintain the integrity of the file system", "Check SNMP agent settings", "Prevent automatic installation of VMware Tools" - Resources (the names of the added policies).
Model of the channel of interaction between the Subject and the Object
In all the schemes we identified different entities (Subjects, Objects, Resources, Sources, Transmitters) and noted the so-called channel of interaction between them. Let us dwell in more detail on the penultimate component of the large "world" model that a SIEM should operate on: the model of the interaction channel between the Subject and the Object. The last component, the context of interaction, is the topic of the next article.
So, there are two entities that interact with each other. Within this interaction, data is transferred from one entity to the other. This can be network packets, files, or control commands. The resulting channel can be pictured as a "pipe" along which a directed stream of data and commands flows. Such a model is clearly visible at the network level, but less pronounced at the application level.
Figure. The data channel model.
Based on this model, each event that the SIEM receives can contain information describing:
the parameters of the channel itself (the "pipe"),
Data transmitted through this "pipe".
The channel is typically described by parameters such as the session ID, the data transfer protocol, the channel setup time, the end time and the duration. The data in events is characterized by its format, the encryption algorithms used, the number of packets transmitted and the number of bytes transmitted.
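These two groups of parameters can be kept apart in the event scheme. The following is a minimal sketch, assuming hypothetical class and field names; every field of the data description is optional because a given event may carry only some of the parameters, and the duration is derived from the setup and end times when the event does not state it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Channel:
    """Parameters of the "pipe" itself."""
    session_id: str
    protocol: str
    start_time: Optional[datetime] = None
    end_time: Optional[datetime] = None

    @property
    def duration(self) -> Optional[timedelta]:
        # Derivable only when both ends of the session are known.
        if self.start_time is None or self.end_time is None:
            return None
        return self.end_time - self.start_time


@dataclass
class ChannelData:
    """Description of the data carried through the "pipe"."""
    data_format: Optional[str] = None
    encryption: Optional[str] = None
    packets: Optional[int] = None
    bytes_total: Optional[int] = None


# Illustrative values only; they are not taken from a real event.
channel = Channel(
    session_id="session-1",
    protocol="RADIUS",
    start_time=datetime(2018, 1, 1, 12, 0, 0),
    end_time=datetime(2018, 1, 1, 12, 5, 0),
)
payload = ChannelData(packets=120, bytes_total=48_000)
```

Keeping the channel and the data in separate structures follows the split made in the list above: one describes the "pipe", the other what flows through it.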
Consider an example of an event that contains data about the interaction channel. Here is an event from an identity and access control system, Cisco Identity Services Engine (ISE), which records a user's network session as part of the accounting process.