CYBERATTACK DETECTION IN VEHICLES USING CHARACTERISTIC FUNCTIONS, ARTIFICIAL NEURAL NETWORKS AND VISUAL ANALYSIS

The connectivity of autonomous vehicles induces new attack surfaces and thus the demand for sophisticated cybersecurity management. Thus, it is important to ensure that in-vehicle network monitoring includes the ability to accurately detect intrusive behavior and analyze cyberattacks from vehicle data and vehicle logs in a privacy-friendly manner. For this purpose, we describe and evaluate a method that utilizes characteristic functions and compare it with an approach based on artificial neural networks. Visual analysis of the respective event streams complements the evaluation. Although the characteristic functions method is an order of magnitude faster, the accuracy of the results obtained is at least comparable to those obtained with the artificial neural network. Thus, this method is an interesting option for implementation in in-vehicle embedded systems. An important aspect for the usage of the analysis methods within a cybersecurity framework is the explainability of the detection results.

Controller Area Network (CAN) bus is flooded or ECUs are crashed by a high message load, and subsequent recovery. Furthermore, measures for the detection of malicious messages need to be implemented. In principle, the detection of anomalies in network traffic within a vehicle caused by attackers could be done remotely by sending all internal traffic to a Security Operation Center (SOC). However, this would be problematic from a privacy perspective, it would be inefficient, it would incur high costs, and it might not meet real-time response requirements. This paper is based on our own preliminary work presented in [4]. We propose a new method for in-vehicle anomaly detection that satisfies the following four requirements: (1) the recognition accuracy should be equivalent to or better than that of existing intrusion detection systems (IDS), (2) the method should be lightweight and resource efficient so that it can be executed on typical ECUs, (3) no hardware changes should be necessary and no additional third-party software libraries should be required (as they may not be available for specific ECUs), and (4) the anomaly detection results should be explainable so that informed decisions about countermeasures can be made.
To meet these requirements, we propose a logic analysis method that we compare to an artificial neural network-based method that could likely be used in embedded systems in vehicles. We aim for better accuracy, faster and more resource-efficient message characterization, portability to embedded systems without dependencies on libraries such as TensorFlow, and rule-based reasoning so that message evaluation results related to anomalies can be traced back to the responsible rules. We evaluate the proposed method on data sets of the CAN bus, which is the standard solution for communication between ECUs in vehicles.
The remainder of this paper is organized as follows: Section 2 gives an overview of the background and related work. Section 3 introduces data sets from two different vehicles that have been used to evaluate the proposed method. Section 4 presents the principles of the characteristic functions method, while Section 5 describes its implementation and the results of various detection setups. Section 6 describes some results from tests with neural networks in order to provide a benchmark for our work. Finally, Section 7 concludes this paper.
2. Background and Related Work. The security of a system can be improved by reducing its attack surface. In [8,9], for example, possible break-in points are listed together with suggestions for countermeasures such as cryptography, detection of anomalies and ensuring software integrity by separating critical systems. However, most of the intrusion prevention measures currently under discussion require hardware changes, which is inconsistent
files from which 4 meaningful fields can be extracted, that are then mapped in the structure that is presented in
The original data set does not contain ground truth flags for each message, but metadata files that describe the attack with regard to timing, content and target ID. For our method we reformatted the messages and marked each valid message in the log with a 1 and each malicious intrusion message with a -1. In the exemplary excerpt of a data log in Table 1, the first occurrence of a message with arbitration ID 208 is a valid message sent by the responsible ECU, whereas the second occurrence is a malicious message introduced by an intruder.

Table 1. Excerpt of a reformatted data log (Type: 1 = valid message, -1 = malicious message).
No.  Time     ID    B0   B1   B2   B3   B4   B5   B6   B7   Type
1    42.0256  1533  189  221  253  128  126  255  237  218   1
2    42.0271  208   10   115  4    100  136  5    110  0     1
3    42.0282  51    0    6    128  0    12   66   183  208   1
4    42.0282  263   0    0    0    0    0    0    0    0     1
5    42.0282  4095  0    0    0    0    0    0    0    0     1
6    42.0282  14    32   82   150  2    8    9    118  148   1
7    42.0292  208   10   115  4    100  136  255  110  0    -1
8    42.0292  293   144  0    65   31   64   255  163  96    1
9    42.0292  186   6    152  5    4    16   0    2    100   1

Attacks that are presented in the ORNL data set can be divided into 3 categories:
1. Fuzzing attack: an attacker injects messages with maximum payloads for many random CAN IDs.
2. Message injection target attack: an attacker injects a message with a specific CAN ID immediately after the legal message has appeared. Thus injected messages are superimposed on legal messages.
3. Message injection target attack with masquerade: this attack is similar to the previous one, but legitimate messages are removed, so injected messages replace legal ones.
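To make the reformatted log layout concrete, the following minimal Python sketch parses rows shaped like those of Table 1. The whitespace-separated layout and the names `CanRecord` and `parse_row` are assumptions made for this illustration; they are not taken from the original tooling.

```python
from typing import NamedTuple, Tuple


class CanRecord(NamedTuple):
    """One reformatted log row (hypothetical structure for illustration)."""
    index: int
    time: float
    can_id: int
    payload: Tuple[int, ...]  # 8 payload bytes, each in 0..255
    label: int                # 1 = valid message, -1 = malicious message


def parse_row(row: str) -> CanRecord:
    """Parse 'index time id b0 ... b7 label' into a CanRecord."""
    parts = row.split()
    if len(parts) != 12:
        raise ValueError(f"expected 12 columns, got {len(parts)}")
    payload = tuple(int(b) for b in parts[3:11])
    if any(not 0 <= b <= 255 for b in payload):
        raise ValueError("payload bytes must be in 0..255")
    label = int(parts[11])
    if label not in (1, -1):
        raise ValueError("label must be 1 or -1")
    return CanRecord(int(parts[0]), float(parts[1]), int(parts[2]), payload, label)


# Rows 2 and 7 of Table 1: same arbitration ID, second one is malicious.
rows = [
    "2  42.0271  208  10  115  4  100  136  5  110  0  1",
    "7  42.0292  208  10  115  4  100  136  255  110  0  -1",
]
records = [parse_row(r) for r in rows]
malicious = [r.index for r in records if r.label == -1]
```

In this excerpt the two messages with ID 208 differ only in the sixth payload byte, which is the kind of change the per-field tests of Section 4 are meant to catch.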
For a better understanding of what traffic looks like with injected and legal messages, we present these 3 attacks using the radial bar chart visualization originally presented in [29]. Examples of the visualization of these 3 attack types are presented in Figure 1, where malicious messages are marked with a red bar color and a red bubble. In this visualization, we present the attack influence by radial time intervals, where each CAN ID is represented as a bar whose height equals the number of messages. Bars consist of arcs that represent payloads: the more messages share the same payload, the higher the arc. Solid (or almost solid) arcs thus depict messages with the same payload, while thin or even transparent arcs (their thickness is less than a pixel) depict messages with a large payload variety. The fuzzing attack is the simplest to detect using visual analytics. Usually a fuzzing attack is characterized by many CAN IDs with almost empty bars (see red bubbles in Fig. 1a).
For the message injection attack there is a pattern in the radial chart: a CAN ID with injection has 2 types of message frequency distribution. The first distribution, which is without injected messages, consists of thin arcs with various payloads. The distribution of injected messages is a solid bar that indicates a lack of payload variability. The following patterns, depicted in Figure 1b, show how injected messages are superimposed on legal ones:
1. Sharp difference in frequency (orange indicator #1): the first part of the bar is very frequent (legal messages) and the second one is not (injected messages).
2. A part of such a bar (legal messages, green indicator #2) has almost the same number of messages as other bars.
3. The whole bar (legal plus injected messages, purple indicator #3) does not have the same number of messages as other bars.
For the masquerade attack the injection is not clearly detectable by visualization (see Fig. 1c). The bars with injected messages look normal except for a sharp difference in frequency (orange indicator #1), where legal messages have varied payloads and injected ones do not. The height of the bar (blue indicator #2), however, looks normal, as the attacker replaced the legal messages with malicious ones.
The ORNL data set also contains "Accelerator Attacks" data sets. This type of attack exploits a vulnerability that puts the ECUs into a compromised state. Therefore, there are no injected messages, so we do not analyze the "Accelerator Attacks" data sets in this paper.
4. Principles of the Characteristic Functions Method. Before proceeding to the presentation of characteristic functions, we start by formalising a generic notion of intrusion detection in Section 4.1, and prove that this setting is not usable for exhaustive search in practice. We then present in Section 4.2 criteria that we have considered, and the characteristic functions in Section 4.
A log of length $n$ is a finite sequence $(e_i)_{1 \le i \le n}$ of elements in $I \times P$. Let $\mathcal{L}$ be the set of logs. Given a log $L \in \mathcal{L}$ and $\iota \in I$, we let $\pi_\iota(L)$ be the subsequence of $L$ of elements whose ID is $\iota$. Given a log $L = (e_i)_{1 \le i \le n}$ of length $n$ and $1 \le k \le n$, we denote $L \setminus k = (e'_i)_{1 \le i \le n-1}$ where $e'_i = e_i$ if $i < k$, and $e'_i = e_{i+1}$ otherwise. I.e., $L \setminus k$ is the log $L$ in which the $k$th event has been removed. Under the same premisses, we denote $L_{<k}$ the log $(e_i)_{1 \le i < k}$.
Definition 1. (Evaluation functions) An evaluation function with memory $k$, or $k$-evaluation function, is a function $\varphi : (I \times P)^k \to \mathbb{B}$.
We say that a log $L = (e_i)_{1 \le i \le n}$ of length $n$ is accepted by a $k$-evaluation function $\varphi$ if, for all $k \le i \le n$, we have $\varphi(e_{i-(k-1)}, \ldots, e_i) = \top$. Conversely, for each $k \le i \le n$ such that $\varphi(e_{i-(k-1)}, \ldots, e_i) = \bot$, we say that the event $i$ is an anomaly.
Let us now define how an evaluation is applied on a log that may contain anomalies.
Definition 2. (Application of a $k$-evaluation function) Given a log $L$ of length $n$ and a $k$-evaluation function $\varphi$, the application of $\varphi$ on $L$ at step $k \le i \le n$ is denoted $\mu(\varphi, L, i)$. It is defined if $\varphi$ accepts $L_{<i}$ and when this is the case, we have:
The application of $\varphi$ on $L$ is denoted $\mu(\varphi, L)$ and is equal to $\mu(\varphi, L, 0)$.
We call the result of the application of an evaluation function ϕ on a log L the ϕ-accepted subsequence of the elements of a log L. Elements that have been eliminated are said to have been rejected by ϕ.
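The following Python sketch illustrates one plausible reading of the application of a k-evaluation function. It is an assumption-laden illustration rather than the definition itself: events are scanned from left to right, the first k−1 events are accepted unconditionally, and an event is rejected when the function, applied to the k−1 most recently accepted events followed by the current one, returns false. The names `apply_evaluation` and `slow_change` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Event = Tuple[int, Tuple[int, ...]]  # (ID, payload), an element of I x P


def apply_evaluation(
    phi: Callable[..., bool], k: int, log: Sequence[Event]
) -> Tuple[List[Event], List[int]]:
    """Return the accepted subsequence and the 1-based positions of rejected events."""
    accepted: List[Event] = []
    rejected: List[int] = []
    for pos, event in enumerate(log, start=1):
        if len(accepted) < k - 1:
            # Not enough history yet to evaluate phi (assumption: accept).
            accepted.append(event)
            continue
        # Window = last k-1 accepted events followed by the current event.
        window = accepted[len(accepted) - (k - 1):] + [event]
        if phi(*window):
            accepted.append(event)
        else:
            rejected.append(pos)  # the event is an anomaly and is removed
    return accepted, rejected


def slow_change(prev: Event, cur: Event) -> bool:
    """Example 2-evaluation function: first payload byte moves by at most 2."""
    return abs(cur[1][0] - prev[1][0]) <= 2


log = [(208, (10,)), (208, (11,)), (208, (200,)), (208, (12,))]
accepted, rejected = apply_evaluation(slow_change, 2, log)
```

Because the rejected third event is removed, the fourth event is compared with the second one and is accepted.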
The Intrusion detection problem. We let $\mathcal{L} = \{L_i\}_{i \in \mathbb{N}}$ be a set of logs. The intrusion detection computation problem consists in computing a parameter $k$ and a $k$-evaluation function $\varphi_{\mathcal{L}}$ such that, for all $L \in \mathcal{L}$, we have $\mu(\varphi_{\mathcal{L}}, L) \in \mathcal{L}$. Unsurprisingly, given the generality of the notions introduced, we have:
Theorem 1. Every $k$-evaluation function recognises a regular language.
Proof (Sketch) Given a $k$-evaluation function $\varphi$, we construct a finite automaton $A_\varphi$ as follows:
- All states are final, and are the elements in $\bigcup_{0 \le l \le k-1} (I \times P)^l$;
- Letters are all the elements in $I \times P$;
- There is a transition $(e_1, \ldots, e_{k-1}) \xrightarrow{e} (e_2, \ldots, e_{k-1}, e)$ if, and only if, $\varphi(e_1, \ldots, e_{k-1}, e) = \top$;
- For $l < k-1$, there is a transition $(e_1, \ldots, e_l) \xrightarrow{e} (e_1, \ldots, e_l, e)$;
- The initial state is the state $()$.
It is clear that $A_\varphi$ accepts a log $L$ if, and only if, $\varphi$ accepts $L$.
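The construction in the proof sketch can be simulated without materialising the state set. The Python sketch below is an illustration only: it uses integers as abstract letters instead of elements of I × P, and the function names are hypothetical. It runs the automaton on a log and compares the outcome with the direct definition of acceptance.

```python
from typing import Callable, Sequence, Tuple


def automaton_accepts(phi: Callable[..., bool], k: int, log: Sequence) -> bool:
    """Run the automaton built from a k-evaluation function on a log."""
    state: Tuple = ()  # initial state: the empty tuple
    for e in log:
        if len(state) < k - 1:
            # Filling transition: always available while the state is short.
            state = state + (e,)
        else:
            # Sliding transition exists only if phi accepts the window.
            if not phi(*state, e):
                return False  # no transition labelled e: the run is stuck
            state = (state + (e,))[1:]
    return True  # all states are final


def accepts_by_definition(phi: Callable[..., bool], k: int, log: Sequence) -> bool:
    """Direct definition: phi holds on every window of k consecutive events."""
    return all(phi(*log[i - k:i]) for i in range(k, len(log) + 1))


def step_ok(a: int, b: int) -> bool:
    """Example 2-evaluation function on integer letters."""
    return abs(b - a) <= 1


logs = [[], [5], [1, 2, 3, 2], [1, 2, 5], [1, 3]]
results = [automaton_accepts(step_ok, 2, log) for log in logs]
```

The state only ever stores the last k−1 letters, which is why the recognised language is regular for a finite alphabet.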
As a corollary of Theorem 1, since sets of logs $\mathcal{L}$ are not assumed to be rational, the intrusion detection problem usually does not have a solution. Beyond this formal impossibility, one can also note that the set of possible $k$-evaluation functions, even for $k = 1$, is too large to be computed explicitly.
For practical purposes, one thus has to rely on heuristics to find evaluation functions that are of practical use to detect intrusion.

Criteria for relevant evaluation functions.
Since it is unlikely that the possible logs of a non-trivial system form a rational language, we aim at learning a flight envelope for the system under analysis by overapproximating the set of possible logs of the system with a rational language. Towards this end we try to compute, given the values occurring in the different fields of the legitimate messages, what the possible acceptable values are for these fields. Just as there is an infinite number of possible bases in which the coordinates of a point in space can be expressed, there is in principle an infinite number of ways of looking at the values of the fields and their interactions with one another.
For this implementation, we have focused on two sources of regularity in the messages normally exchanged on the CAN bus:
- as a car is an example of a cyber-physical system, some field values represent "physical" values, and while their range may encompass the whole set of possible values, they are likely to change slowly from one message of a given ID to the next;
- the ECUs communicating over the CAN bus run computer programs, and those programs are likely to test for the presence of a specific value in the message, or its membership in a small set of possible values. Legitimate messages sent on the bus are constructed so as to pass these tests.
These considerations, explored in more detail below, led us to consider testing whether the value of a field stays in a small set, and whether the value of its differential stays in a similarly small set. The set of all possible tests is the test space. The tests that are consistently passed by all the messages of a given ID in a log are considered to be characteristic of that message ID, and the tests themselves are the characteristic functions.
Methodology. Each log file is read twice. In the first reading, anomalies are removed from the log file, and the analyzer computes for each message ID and each field a subset of the characteristic functions so that:
- that subset is small enough;
- each message occurring in the log passes at least one of the tests.
When no small subset is available, the analysis of the field is considered to be inconclusive. During the second read, for each message, the monitor scans each field for which at least one of the value or differential analyses was conclusive. Each field is accepted if one of the retained tests on its value succeeds, and is rejected otherwise. The message is accepted if no field has been rejected.
Methodology on the choice of the test space. The first step consists in choosing a set of simple tests that are likely to be relevant. The space of all possible message tests will then be all the possible conjunctions and disjunctions of these simple tests. We model packets by an ID and a sequence of bytes, i.e. 256-valued integers. This ID determines a class to which each packet belongs. We assume that all packets in a given class are similar enough so that some tests exist that are valid on all messages in the class and are not vacuous.
In principle the test space encompasses all boolean functions on messages or sequences of messages. However, a succinct analysis already delineates a few types of tests that may be useful for the analysis of logs:
- some tests are related to the syntactic content of the packet, such as the presence of a padding constant or the presence of a specific value, denoting e.g. a more precise type for the packet;
- some tests are computed on the whole packet, such as an error-correcting code;
- some tests are domain specific and relate to the possible evolution of physical data between consecutive packets or the set of possible values of some data;
- some tests depend on the internal state of the devices, a packet being acceptable at some point of their execution but not at another point.
For the sake of simplicity we consider in this paper only tests performed independently on the different fields of messages, as well as on their ID. That is, we consider only the first and third cases of the preceding list. We are currently working on implementing the second (whole message tests) and fourth (with an online process mining algorithm).
Automatic fields. A field is automatic if the device receiving and accepting this packet tests whether the value of the field is equal to a constant in its program. It is expected that, if different packets can be sent from one device to another, at least one automatic field exists so that the receiver can derive the type of the received packet. The statistical characteristics of such fields are that they should have only a few legitimate values, and that these values should have no other detectable relations. However, the difference between these values can be arbitrary as it is simply a case of a few bits switching value.
There is obviously some arbitrariness in deciding what "a few" means. Since the tests performed are not based on any hints from the protocol, we have arbitrarily decided to define a small set of different values by the square root of the total number of possible different values, that is, at most 16 values among the 256 possible ones. Tests relevant to automatic fields are value tests in which we record all the different values occurring in a field during training. If the number of different values is more than 16, the analysis is considered to be inconclusive, and no value test is performed on that field for that message ID during monitoring. Otherwise we verify during monitoring that the value in that field for a message is among the ones seen during training. To sum up, value tests are a conjunction, on all fields $f$, of a disjunction of equality tests between $f$ and each of the at most 16 values recorded during training, or of the true constant if more than 16 different values have been encountered for that field.
Physical values. These are values that are assumed to evolve slowly. For these values we assume a bound on the difference between the value present in the current packet and the value occurring in the last preceding similar packet. For these fields the analyzer keeps track of the value in the last accepted message and compares that value with the one in the current message. As in the case of value tests, these difference tests are performed during monitoring only if a small number of changes (at most 16, again based on a square root consideration) has been observed during the training phase. Re-using the same notation as above, but now denoting by $f$ the value of a field in the last accepted packet and by $f'$ its value in the packet under analysis, difference tests are a conjunction, on all fields $f$, of a disjunction of equality tests between $f' - f$ and each of the at most 16 differences recorded during training, or of the true constant if more than 16 different values have been encountered for the difference between the values for that field between a message and its predecessor.
Random values. There are fields for which no relation was found in the data set among the ones that we were searching for. In the data sets considered, a post-analysis of the rules has shown that in several cases these fields are related to the physical value fields, and that the data conveyed were actually 2-byte values. The analyzer does not perform any test on these fields since, by the construction described, both the value and the difference tests are reduced to the true constant for these fields.
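The two passes described above can be sketched in Python as follows. This is a simplified illustration, not the C implementation discussed later: it assumes that every payload byte is one field, that differences are plain integer differences, that a conclusive value test and a conclusive difference test must both pass, and that IDs unseen during training are rejected. All names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Set, Tuple

Message = Tuple[int, Tuple[int, ...]]  # (CAN ID, payload bytes)
Rules = Dict[Tuple[int, int], Optional[Set[int]]]  # (ID, field) -> set, None = inconclusive

THRESHOLD = 16  # at most sqrt(256) distinct values for a conclusive analysis


def learn(log: Iterable[Message]) -> Tuple[Rules, Rules]:
    """First pass over an anomaly-free log: collect values and differences."""
    values: Dict[Tuple[int, int], Set[int]] = defaultdict(set)
    diffs: Dict[Tuple[int, int], Set[int]] = defaultdict(set)
    last: Dict[int, Tuple[int, ...]] = {}
    for can_id, payload in log:
        prev = last.get(can_id)
        for field, byte in enumerate(payload):
            values[(can_id, field)].add(byte)
            if prev is not None:
                diffs[(can_id, field)].add(byte - prev[field])
        last[can_id] = payload

    def prune(table: Dict[Tuple[int, int], Set[int]]) -> Rules:
        # Too many distinct observations: the analysis is inconclusive.
        return {k: (v if len(v) <= THRESHOLD else None) for k, v in table.items()}

    return prune(values), prune(diffs)


def monitor(log: Iterable[Message], value_rules: Rules, diff_rules: Rules) -> List[bool]:
    """Second pass: True if a message is accepted, False if it is rejected."""
    known_ids = {can_id for can_id, _ in value_rules}
    last_accepted: Dict[int, Tuple[int, ...]] = {}
    verdicts: List[bool] = []
    for can_id, payload in log:
        ok = can_id in known_ids  # assumption: unknown IDs are rejected
        prev = last_accepted.get(can_id)
        for field, byte in enumerate(payload):
            if not ok:
                break
            allowed = value_rules.get((can_id, field))
            if allowed is not None and byte not in allowed:
                ok = False  # value test failed
            steps = diff_rules.get((can_id, field))
            if prev is not None and steps is not None and byte - prev[field] not in steps:
                ok = False  # difference test failed
        if ok:
            last_accepted[can_id] = payload  # differences use accepted messages only
        verdicts.append(ok)
    return verdicts


# Field 0 is constant (automatic); field 1 is a counter with step 1 (physical).
training = [(208, (10, v)) for v in range(40)]
value_rules, diff_rules = learn(training)
test_log = [(208, (10, 5)), (208, (10, 6)), (208, (10, 255)),
            (208, (10, 7)), (208, (11, 8)), (99, (10, 9))]
verdicts = monitor(test_log, value_rules, diff_rules)
```

In this toy run, field 1 takes 40 distinct values, so its value test is inconclusive, but its single observed difference makes the difference test conclusive. That test rejects the jump to 255, while the constant field 0 rejects the value 11.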

Characteristic functions.
We sum up the presentation above with the following criteria on a sufficiently good evaluation function $\varphi$:
1. We can disregard relations between the content of different message IDs by defining a log with multiple IDs as the coproduct of logs each restricted to one single ID $\iota$, i.e., each log $L$ is assumed equal to $\coprod_{\iota \in I} \pi_\iota(L)$;
2. The decomposition of each payload into a set of meaningful fields means that each $\varphi_\iota$ can further be decomposed into evaluation functions specific to each field $f$, i.e., $\varphi_\iota = \bigwedge_{f \in F_\iota} \varphi_{\iota,f}$;
3. To take into account physical values, it suffices to assume that each $\varphi_{\iota,f}$ is a 2-evaluation function, and to consider the difference between the present value of a field and its former value.
A final criterion, not introduced above but that we believe is necessary for the stability of the learning phase, is to refrain from having forbidden values, e.g. saying that the value in field 0 will never be 129. As a heuristic these criteria can certainly be relaxed, but we have already obtained good results even though they may seem very restrictive.
In order to define characteristic functions, it suffices now to introduce, for each ID $\iota$, a set of fields $F_\iota$. Each field $f \in F_\iota$ is a function $f : \{\iota\} \times [0, 255]^8 \to \mathbb{Z}$. Also, for each field $f \in F_\iota$ we introduce two sets of values $V_{\iota,f}$ and $D_{\iota,f}$ which, according to the above discussion, can either be finite and of cardinality between 1 and 16, or $\mathbb{Z}$.
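With these sets, one plausible way to write the per-field test is given below. It is stated as an assumption that is consistent with the criteria above, not as the original definition; $e_{i-1}$ and $e_i$ denote consecutive events of $\pi_\iota(L)$.

```latex
\varphi_{\iota,f}(e_{i-1}, e_i) = \top
  \iff
  f(e_i) \in V_{\iota,f}
  \;\wedge\;
  f(e_i) - f(e_{i-1}) \in D_{\iota,f},
\qquad
\varphi_\iota = \bigwedge_{f \in F_\iota} \varphi_{\iota,f}.
```

Under this reading, choosing $V_{\iota,f} = \mathbb{Z}$ or $D_{\iota,f} = \mathbb{Z}$ makes the corresponding test vacuous, which corresponds to an inconclusive value or difference analysis.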
Implementation. Characteristic functions, and thus the log analysis functions they define, have been implemented in C. As a first step, the log is translated if necessary into a binary file which is then mapped to an array of structures using the mmap call, with each structure representing a packet. Records in this array are then analyzed independently by two modules, one tracking for each ID and for each field the number of different values until the threshold of 16 is reached, the other tracking the differences between consecutive values, again for each ID and for each field of that ID. Each analysis module constructs a balanced binary tree mapping an ID and a field to the result of the analysis on this ID for this field. The monitor module then uses this structure to parse and iterate over another log file to classify each packet as to whether it should be accepted or not. The complexity of treating each event in this architecture is Θ(log |I|), as we assume the number of fields is bounded, and thus the number of elements in the balanced binary trees is Θ(|I|). Thus the treatment time for a log of N events with K different IDs is Θ(N · log K), both for learning and monitoring.
Memory footprint. During training the entire log file is virtually available in memory, and we rely on the operating system to optimize speed and memory consumption. During the rule evaluation the memory needed by the monitor is linear in both the number of different IDs and the number of fields within the payload. We note however that since each ϕ ι is a 2-evaluation function, both the learning and the monitoring can be performed online, with space requirements of Θ(log |I|).

Implementation and Evaluation of the Characteristic Functions Method.
Our characteristic functions method attempts firstly to classify messages into classes, and secondly to characterize messages in a given class by the set of rules they are required to pass. The monitor module only implements tests that are satisfied by all messages in a given class. The classification tool then outputs the specific rules that are to be used in message classification for each individual class. This information is provided in a human-readable format and may potentially be useful in future research as well.
First, it makes it possible to compute the probability that a random message satisfies all the tests in the class, and thus allows us to evaluate the robustness of the monitor against the injection of random messages. Assuming that in a given class there are $n$ fields classified as automatic and $m$ fields classified as physical, and that the tests on fields all accept the maximum of 16 values, a random message in that class has a probability $\left(\frac{16}{256}\right)^{n+m} = 2^{-4 \cdot (n+m)}$ of being accepted. This small but non-negligible probability explains the occurrences of false negatives in Table 2, where evaluation results for this approach are labelled with cf for characteristic functions.

Table 2 legend. Scenario: log file with simulated attacks; Precision (Positive Predictive Value) $PPV = \frac{TP}{TP + FP}$; Recall (True Positive Rate) $TPR = \frac{TP}{TP + FN}$; F1 score $F_1 = \frac{2 \cdot PPV \cdot TPR}{PPV + TPR}$. Different classification scenarios are characteristic functions (cf) or neural networks trained with either the original ORNL intrusion sets (nn) or randomly introduced intrusion messages (nn').
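As a quick numerical check of the acceptance probability and of the metrics used in Table 2, the following Python sketch may help. The function names are hypothetical, and the counts in the example are made up purely for illustration; they are not results from the evaluation.

```python
from typing import Tuple


def random_acceptance_probability(
    n_automatic: int, m_physical: int, accepted: int = 16, domain: int = 256
) -> float:
    """Probability that a uniformly random message passes all n + m field tests."""
    return (accepted / domain) ** (n_automatic + m_physical)


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """PPV, TPR and F1 as defined in the legend of Table 2 (0.0 when undefined)."""
    ppv = tp / (tp + fp) if tp + fp else 0.0
    tpr = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * ppv * tpr / (ppv + tpr) if ppv + tpr else 0.0
    return ppv, tpr, f1


# Two automatic fields and one physical field: (16/256)^3 = 2^-12.
p = random_acceptance_probability(2, 1)

# Illustrative counts only (not taken from the paper).
ppv, tpr, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
```

With three tested fields, about one random message in 4096 would be accepted, which shows how quickly the acceptance probability drops as more fields of a class are covered by conclusive tests.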
Second, given that the generated rules implement simple tests, it is also in theory possible for a human to better understand the system by looking at the rules produced, and eventually to produce new (and less generic) tests beyond those described in this paper. A side result of this is that it is also quite easy to build fake traffic that will be accepted by a monitor once we know its rules.
Third, it allows us to focus further classification work on classes for which only a few fields are tested. For example, some poorly classified messages seem to be frames in a more complex Multi-Frame Message (MFM). To handle this case we plan to implement MFM protocol recognition in future work. Also, and though this is outside of the scope of this paper, a manual analysis of the rules produced and of the messages in these classes strongly suggests new test functions, such as counter and checksum detection, to handle these currently poorly handled cases.
In addition to the discussion above, the results of the experiments in Table 2 show next to no false positive classifications. This is further visualized in Figure 2, where only one column for the characteristic functions method, here marked as logan, can be seen. The evaluation results show that, though arbitrarily selected, the heuristic threshold of 16 is not too high, as it does not classify a field that contains random values as an automatic field, i.e. no over-fitting has been observed. This however should not be interpreted as an impossibility for our method to suffer from over-fitting. In particular, a training data set which is too short would tend to produce illegitimate value tests, e.g. for the fields recording the timestamp of the packet. The high number of false positives on one intrusion scenario, however, shows a behaviour yet to be completely evaluated, where intrusions that alter existing messages instead of only introducing new messages potentially cause the internal state of the characteristic functions classifier to reject every message after the first malicious intrusion message has been classified. In a real-life scenario this would potentially not be harmful, because an intrusion has to be detected in order for this to occur.
For a better understanding of the classifier's errors, we visualized the number of FP and FN in the form of bar charts that are presented in Figures 2 and 3, so one can see how errors are distributed over different attack scenarios and classifiers. The logan classifier seen in said figures corresponds to the characteristic functions approach, whereas nn_orig and nn_fuzzed correspond to the different training scenarios for the neural network approach, discussed in Section 6.
We also map classification results in the form of radial bar charts, where blue represents TP and TN, red represents FP, and orange represents FN. An example is presented in Figure 4. All classification results in the form of radial bar charts are available via the link https://guardeec.github.io/ornl_dataset_vis/visualization.html. You can select a data set and classifier type to view the corresponding result. The CAN ID is displayed by clicking on the bar.
As can be seen in Table 2, the results are very encouraging against the different attacks considered. It is to be noted that using knowledge of the results and models from the analysis modules, it would potentially be easy to construct attacks (i.e., introduction of additional malicious messages on the bus) that follow a pattern that will be accepted by the analyzer.
In addition to the intrusion detection performance we have also evaluated the number of classified messages per second from all test scenarios, which averaged at approx. 1700 messages per second. This test was performed on a Raspberry Pi 3 Model B to test the performance of the classifier on a device similar to what could be used as an edge node in a vehicle.
6. Baseline Benchmark: Artificial Neural Network. As a benchmark for the evaluation of our approach we implemented an artificial neural network approach using the TensorFlow Keras API [30,31]. Neural networks are the standard for deep learning and can model very complex nonlinear relationships. A fully connected neural network utilizes a number of layers, with each layer supporting an arbitrary number of neurons. Data is propagated from the input to the output layer using weighted connections between the neurons of these layers. Specifically, a multilayer perceptron (MLP) based on the Sequential model from the Keras package of TensorFlow with two hidden layers of 25 neurons each was used. This results in a model with approximately 1200 trainable parameters. We specifically designed the network to perform well on weaker in-vehicle edge devices. As the activation function for the hidden layers we selected the rectified linear unit (ReLU), which is computationally cheap. In total we trained on the data set for a learning phase of 20 epochs, with a validation split of 0.2, so 20% of the input data was used for validation. For the loss function of the model we decided to use binary cross-entropy, which is based on a classification of values between 0 and 1 and is well suited for binary classification, as is required for our training data. In addition, the Adam [32] optimizer was used.
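The order of magnitude of the parameter count can be checked with a few lines of Python. The sketch assumes 17 input features (the arbitration ID plus the 8 current and 8 previous payload bytes) and a single output neuron; neither figure is stated explicitly in the text, so both are assumptions, as is the function name.

```python
from typing import Sequence


def dense_parameter_count(layer_sizes: Sequence[int]) -> int:
    """Trainable parameters of a fully connected network: weights plus biases."""
    return sum(
        n_in * n_out + n_out  # weight matrix plus one bias per output neuron
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Assumed layout: 17 inputs, two hidden layers of 25 neurons, 1 output neuron.
layer_sizes = [17, 25, 25, 1]
total = dense_parameter_count(layer_sizes)  # 450 + 650 + 26
```

Under these assumptions the count is 1126, which agrees with the stated figure of approximately 1200 trainable parameters.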
The data logs are then preprocessed into a structure where, for each message $m_i$ with arbitration ID $i$ and payload $p_{m_i}$, an input vector $(i, p_{m_i}, p_{(m-1)_i})$ is created, in which $p_{(m-1)_i}$ is the payload of the previous message with the same ID. For the first message of each ID, where there is no previous payload available, a vector of zeros is used as the previous payload. The timing of the message is disregarded in this approach. This structure was selected to make the neural network approach as comparable as possible to the characteristic functions approach, by providing the same information for the classification process as in the case of characteristic functions.
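The preprocessing step can be sketched in Python as follows. The function name `build_input_vectors` is hypothetical, the vectors are left unnormalised, and the previous payload is taken from the previous message with the same ID regardless of its label; these choices are assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Message = Tuple[int, Sequence[int]]  # (arbitration ID, payload bytes)


def build_input_vectors(log: Sequence[Message], payload_len: int = 8) -> List[List[int]]:
    """Build (ID, current payload, previous payload of the same ID) vectors."""
    previous: Dict[int, Tuple[int, ...]] = {}
    zeros = (0,) * payload_len  # used when an ID is seen for the first time
    vectors: List[List[int]] = []
    for can_id, payload in log:
        prev = previous.get(can_id, zeros)
        vectors.append([can_id, *payload, *prev])
        previous[can_id] = tuple(payload)
    return vectors


# Messages 2, 3 and 7 of the excerpt in Table 1.
log = [
    (208, (10, 115, 4, 100, 136, 5, 110, 0)),
    (51, (0, 6, 128, 0, 12, 66, 183, 208)),
    (208, (10, 115, 4, 100, 136, 255, 110, 0)),
]
vectors = build_input_vectors(log)
```

Each vector has 1 + 8 + 8 = 17 entries. The third vector carries the payload of the first message as its history, so the network sees the jump from 5 to 255 in the sixth byte.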
For each of the intrusion scenarios described in Section 3 a separate model was trained, where scenarios that consist of more than one log file are merged into one model. For the training only the unmodified log files from the ORNL road data set were used, because the masquerade intrusions alter the structure of the log and can potentially impede training.
To improve our evaluation using neural networks we have designed two different evaluation scenarios. One scenario utilizes the original data logs as training data, while the other uses artificially generated intrusion messages. In total six models were trained per evaluation scenario, namely CSA (correlated signal attack), Fuzz (fuzzing attack), MECTA (max engine coolant temp attack), MSA (max speedometer attack), RLoff (reverse light off attack) and RLon (reverse light on attack).
The results are shown in Table 2, where the evaluation scenario using original log files is annotated with nn and the scenario using artificially generated messages with nn'.
As expected, the results for the nn scenario are near perfect in most scenarios, except for the MECTA intrusion scenario, which contained too few intrusion messages for reliable training. In most other scenarios all introduced intrusion messages were classified correctly as intrusions, as indicated by a value of 1 in the $TPR_{nn}$ column. For the fuzzing attacks, a value below 1 in the $PPV_{nn}$ column indicates the occurrence of false positives in classification; a closer look shows that misclassification happens mostly at the beginning of the log, where a zero vector was used as the previous payload in the input vector for the message, as described above.
The results for the training with artificially introduced intrusion data, which are shown in the nn' column of Table 2, are more diverse. For intrusion scenarios where additional messages were introduced to obfuscate the normal behaviour, the classification performance of the models trained on artificially placed intrusions is often zero or close to zero. This shows that the context of the messages, most importantly the previous message, is decisive for correct classification. For the masquerade scenarios of each intrusion, many of the introduced and modified messages were classified correctly. A high positive predictive value here shows that the number of false positive classifications is close to zero, whereas the true positive rate varies significantly across log files. These scenarios show that, despite the low classification rates on the non-masquerade versions of the logs, the models are able to detect deviations from normal behaviour in the log files.
The results here highlight the complexity of the intrusion scenarios from the ORNL Road data set. The nn' model evaluation signifies that even if the message structure of an intrusion scenario is known, the correct classification is non-trivial.
To provide a better performance comparison to the CF approach, not only in terms of classification accuracy but also with regard to time performance, we have also run all evaluations on a Raspberry Pi 3 Model B for the ANN classifier. On this relatively low-performance device the ANN was only able to evaluate an average of 150 messages per second, which is less than a tenth of the performance shown by the CF classifier.
7. Conclusion. We have seen in previous work [6] that artificial neural network approaches to anomaly detection deliver good results but that it is hard to implement this kind of detection in-vehicle because of restrictions with respect to on-board resources of typical ECUs used in vehicular systems. Thus, we have started to analyze logs using a bind and branch approach that was very accurate but lacked robustness. From this experience we built a log analyzer in C that focuses on payload bytes having either a small set of different values or a small set of possible changes. We have evaluated this characteristic functions approach on state-of-the-art CAN bus intrusion data from real-life intrusion scenarios and obtained results that are significantly more robust and accurate in comparison to a standard implementation of an artificial neural network classifier. The evaluations regarding the time performance of both approaches have also shown a significant margin between them, with characteristic functions being able to evaluate roughly ten times more messages in the same time than even a relatively small artificial neural network.