1. Introduction
About 95% of traffic accidents are caused by erroneous decisions of drivers [1]. Driver assistance systems may cut down the number of accidents if their reliability is higher than that of human drivers. However, as such assistance systems become more powerful, they additionally have to compensate for the effect that a confident driver becomes less watchful. Reliability becomes even more critical when gradually switching from level-2 to level-3 or higher assistance [2], where automobile OEMs (original equipment manufacturers) must assume full responsibility for any failures of their control systems. Mercedes-Benz was the first to release an officially certified level-3 assistant for autonomous driving on German highways, called Drive Pilot [3,4]. Other major players, such as BMW and Volvo, are poised to introduce similar products with enhanced performance to the market.
High reliability and low failure rates will be essential for the acceptance of self-driving vehicles by both consumers and politicians. This, however, is somewhat at odds with the software architecture behind such products, as the realization of self-driving functionalities is closely tied to the development of AI methods. In particular, the advent of deep learning was a major game changer, also enforcing a major paradigm change compared to traditional product development: models are extracted from data correlations instead of being causally deduced from proven principles. The more powerful such models become, the more the interpretability of their predictions is lost [5]. To counteract this property, car manufacturers have to work on two fronts: (i) making AI algorithms safe enough for autonomous driving, and (ii) carefully assessing the failure rate of such systems.
The safety challenge cannot be tackled with traditional field testing since the mileage required for assessing the desired low failure rate would be much too high [6]. Therefore, software-in-the-loop simulation (SiL) is utilized, embedding the real ADAS and sensor models in virtual vehicle models [7,8,9]. To save costs, only critical traffic scenarios are simulated. The simulation then provides conditional probabilities of failure of the ADAS w.r.t. various scenarios. For a final assessment of the overall failure probability, the rates of occurrence of these scenarios have to be known.
In close cooperation of automobile manufacturers with research institutions, the German research project PEGASUS [10] defined various logical scenarios with associated parameters. Here, the three scenarios—cut-in, cut-out, and cut-through—will be considered in more detail; see Figure 1. Mercedes-Benz AG has been collecting a rich dataset over the years via a large measurement fleet of vehicles, called ego, equipped with radar, lidar, and several cameras monitoring the traffic in the surrounding area. Only a minor subset of this data was analyzed by human engineers to extract characteristic time snippets and label them by the corresponding scenario class. To automatically find such events in a large dataset of measurements, an algorithmic search has to be performed. One possibility is to use pattern recognition of activities of the ego vehicle and the surrounding traffic [6,11,12,13,14]. The question, however, arises whether the above-mentioned scenarios may simply be identified from measurements of the longitudinal distance $s\left(t\right)$ between the ego and the vehicle driving directly ahead of it, called the first ego, or from lateral distances $d\left(t\right)$ of vehicles running on neighboring lanes; see Figure 2. The measurements are typically made with a measurement frequency of ${f}_{M}=50\phantom{\rule{0ex}{0ex}}\mathrm{Hz}$, resulting in time steps of size $\Delta t=0.02\phantom{\rule{0ex}{0ex}}\mathrm{s}$. These raw signals are then preprocessed by filling up dropouts and smoothing signals by filtering. The lateral data are additionally corrected by eliminating road curvature effects [15].
Section 2 will use the $s\left(t\right)$ measurement to detect cut-in, cut-out, and cut-through scenarios from sudden jumps when a car is entering or leaving the ego driving tube, see the upper plot in Figure 2. Rules will be derived from kinematic considerations and their impact on the longitudinal distance signal. Section 3 will build a detection and classification algorithm upon measurements of lateral distance $d\left(t\right)$ and its characteristic time-behavior associated with the above maneuvers; see the lower plot in Figure 2. Preliminary studies [16] on several machine learning algorithms have shown that a strategy based on the Time Series Forest (TSF) approach [17] is suited best. As a machine learning approach, it requires training data, which is why Section 4 develops a simple procedure to generate idealized training maneuvers from pieces of $d\left(t\right)$ time series. Both classification algorithms are tested and compared in Section 5 using the measurements of a 10-h ride on a German highway.
2. Rule-Based Classification Tree
The basis for deterministic scenario detection involves distance measurements $s\left(t\right)$ in the longitudinal direction; see the upper plot in Figure 2. A sharp transition from high to low values indicates a cut-in (CI) maneuver: the measured distance suddenly switches from the first ego vehicle to the shorter distance of the vehicle that is cutting in. This vehicle enters the driving lane of the ego vehicle, becoming the new vehicle in front and reducing the gap; see Figure 1. In Figure 2, this occurs for $t\approx 10\phantom{\rule{0ex}{0ex}}\mathrm{s}$. Subsequently, the ADAS reacts and enlarges the distance again to a proper, velocity-dependent value. Conversely, for $t\approx 30\phantom{\rule{0ex}{0ex}}\mathrm{s}$, the distance suddenly jumps from a low to a high value, indicating a cut-out (CO) maneuver, where the distance increases to the vehicle in front of the former first ego, called the second ego, after the latter has left the ego driving tube. Cut-through (CT) maneuvers combine both characteristics, an abrupt drop and increase in $s\left(t\right)$ within a short period of time, where the distance first reduces to the crossing vehicle and then back to the former first ego. This can be observed for $t\approx $ 151–158 s.
These relations may be embedded in a rule-based decision tree, as shown in Figure 3. The first three nodes exclude situations irrelevant to safety assessment. They check whether the ego vehicle is driving on a highway with sufficient velocity, since only then may the ADAS take over control. Furthermore, ego lane changes are prohibited while driving in level-3 mode. The more interesting fourth node detects jumps in the longitudinal distance $s\left(t\right)$ to the vehicle driving ahead, where the difference of subsequent measurements has to exceed a user-defined threshold ${\epsilon}_{s}$, i.e., $\Delta {s}_{i}:=|s\left({t}_{i+1}\right)-s\left({t}_{i}\right)|>{\epsilon}_{s}$. This could indicate one of the relevant scenarios (cut-in, cut-out, or cut-through), which may be directly classified depending on the sign of the jump if there exists only a single neighbor object in the subsequent measurement snippet. A cut-in maneuver suddenly shortens the distance, yielding a negative jump; a cut-out enlarges $s\left(t\right)$ with a positive jump; and a cut-through combines both jumps if they occur within a predefined time interval $\Delta T<{\epsilon}_{t}$. In the following, ${\epsilon}_{s}=5\phantom{\rule{0ex}{0ex}}\mathrm{m}$ and ${\epsilon}_{t}=10\phantom{\rule{0ex}{0ex}}\mathrm{s}$ are chosen.
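The jump logic of the fourth node can be sketched as follows. This is a minimal numpy illustration, not the original toolchain: the function name, the event representation, and the assumption of an evenly sampled, gap-free signal are ours; the thresholds are the values chosen above.

```python
import numpy as np

def detect_jump_events(s, dt=0.02, eps_s=5.0, eps_t=10.0):
    """Classify CI/CO/CT events from jumps in the longitudinal distance s(t).

    A negative jump (|ds| > eps_s, s decreasing) suggests a cut-in,
    a positive jump a cut-out; a negative jump followed by a positive
    one within eps_t seconds is merged into a single cut-through.
    """
    ds = np.diff(s)
    jump_idx = np.flatnonzero(np.abs(ds) > eps_s)
    jumps = [(i * dt, 'CI' if ds[i] < 0 else 'CO') for i in jump_idx]

    events, k = [], 0
    while k < len(jumps):
        t, label = jumps[k]
        # merge a CI jump with a following CO jump into a cut-through
        if (label == 'CI' and k + 1 < len(jumps)
                and jumps[k + 1][1] == 'CO'
                and jumps[k + 1][0] - t < eps_t):
            events.append((t, 'CT'))
            k += 2
        else:
            events.append((t, label))
            k += 1
    return events
```

Applied to a distance signal that drops from 60 m to 30 m and recovers within a few seconds, the two opposite jumps are merged into one CT event.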
If multiple objects are located in the surrounding neighborhood of the ego vehicle and perform maneuvers, the distances between the ego and all objects of interest are measured at time points, ${t}^{*}$, where a vehicle enters the ego-driving tube. The vehicle with the minimum distance is then the most critical and relevant one for scenario classification, whereas the other vehicles are ignored.
When applying the rule-based decision tree to the measurement section in Figure 2, it detects four relevant scenarios; see Figure 4. Additionally, it extracts corresponding $20\phantom{\rule{0ex}{0ex}}\mathrm{s}$ snippets of the lateral distance $d\left(t\right)$ around the respective time point ${t}^{*}$ when the $s\left(t\right)$-jump arises. They confirm the correct classification: in the first CI-event the cut-in vehicle enters from the left adjacent lane (distance about $3.8\phantom{\rule{0ex}{0ex}}\mathrm{m}$ corresponding to the lane width), in the second CI-event from the right adjacent lane (distance about $-4\phantom{\rule{0ex}{0ex}}\mathrm{m}$). During cut-out, the vehicle leaves the ego lane to the left. Finally, the cut-through vehicle crosses from right to left.
What appears logical at first glance, and has thus become state of the art, has two drawbacks: First, noisy data, sensor dropouts, and ghost objects can disrupt the deterministic classification method described. Second, the tolerances for jump detection (${\epsilon}_{s}$) and cut-through duration (${\epsilon}_{t}$) have to be chosen by experience, which may have a major impact on the robustness of the classification. AI-based approaches, as discussed in the next section, are often more robust since they learn such parameters from training data.
3. AI-Based Classification Scheme
Analogous to the rule-based classification, the general conditions—such as the ego vehicle driving on a highway at sufficient velocity without performing a lane change—remain valid. The main difference is the use of the lateral distance behavior $d\left(t\right)$ instead of $s\left(t\right)$ for classification, aligning more closely with real drivers, who primarily rely on lateral information to assess driving maneuvers [18]. To identify scenario classes from a long sequence of measured data $d\left(t\right)$, a moving window approach is applied; see Figure 5. The sliding window of length ${T}_{W}=20\phantom{\rule{0ex}{0ex}}\mathrm{s}$ extracts pieces of the $d\left(t\right)-$measurement to classify each window separately. Figure 5b shows some selected 20 s-sequences in the time range from $30\phantom{\rule{0ex}{0ex}}\mathrm{s}$ to $50\phantom{\rule{0ex}{0ex}}\mathrm{s}$. From visual inspection, it can already be concluded that the $d\left(t\right)-$trajectory pieces have different characteristics. The first three windows show a jump from almost 4 m to $-4$ m in the lateral direction, moving from right to left because the window moves in time from left to right. This may be considered a cut-through scenario. After some time, the jump diminishes, and the next three selected windows exhibit a gradual increase in $d\left(t\right)$ already observed for a typical cut-in maneuver. Again, after a while, this increase shifts to the left, and the last three windows display an almost straight line, which does not fit any of the three defined scenarios and is, therefore, classified as “other”.
This procedure results in the classification sequence in Figure 5c, where the time-step of moving the window is $1\phantom{\rule{0ex}{0ex}}\mathrm{s}$, which is short enough to ensure that every driving maneuver is included in at least one of the analysis windows. Obviously, adjacent windows mostly yield the same classification results associated with the same event. Therefore, these sliding window classification results have to be aggregated into final scenario events. Here, we define that a sequence of at least five identical classification results is summarized as a single event, with an associated time frame centered around the time points marked by red crosses for the scenario classes CI, CO, and CT. In particular, the middle cross of each run is taken as the reference, and the final windows with the corresponding lateral trajectories $d\left(t\right)$ of these events are illustrated in Figure 5d.
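The aggregation step can be sketched as follows; a minimal illustration in plain Python, assuming a 1 s window stride and the minimum run length of five chosen above (the function name and the index-based event representation are ours):

```python
def aggregate_events(labels, min_run=5):
    """Collapse per-window labels (one per 1 s stride) into scenario events.

    A run of at least `min_run` identical labels from {CI, CO, CT}
    becomes a single event; the middle window of the run serves as
    its reference time point.
    """
    events = []
    i = 0
    while i < len(labels):
        j = i
        while j < len(labels) and labels[j] == labels[i]:
            j += 1                      # extend the run of equal labels
        if labels[i] != 'other' and j - i >= min_run:
            events.append(((i + j - 1) // 2, labels[i]))  # middle index
        i = j
    return events
```

Runs shorter than five windows, such as a brief misclassification flicker, are discarded rather than reported as events.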
The kernel of the procedure is the assignment of a label $c\in C=\left\{\mathrm{CI},\phantom{\rule{0ex}{0ex}}\phantom{\rule{0ex}{0ex}}\mathrm{CO},\phantom{\rule{0ex}{0ex}}\phantom{\rule{0ex}{0ex}}\mathrm{CT}\right\}$ to each window, W, referring to “cut-in”, “cut-out”, or “cut-through”, where classification is solely based on the measured lateral distances $d({t}_{j}\in W)$; see Figure 6. If none of these scenarios apply to W, the label $c\in \overline{C}=\left\{\mathrm{other}\right\}$ is assigned. Presuming a measurement frequency of $50\phantom{\rule{0ex}{0ex}}\mathrm{Hz}$, each window of length ${T}_{W}=20\phantom{\rule{0ex}{0ex}}\mathrm{s}$ contains $M=1000$ data points. Thus, the data basis of $20\phantom{\rule{0ex}{0ex}}\mathrm{s}$ sequences extracted from the measurement campaign of real driving tests is represented as a time series $\mathbf{d}$ with an associated class c to be determined:
$$\begin{array}{c}\begin{array}{cc}& (\mathbf{d},c)\phantom{\rule{0ex}{0ex}}\mathrm{with}\phantom{\rule{0ex}{0ex}}\mathbf{d}=({d}_{1},{d}_{2},\dots ,{d}_{M}),\phantom{\rule{0ex}{0ex}}{d}_{j}=d({t}_{j}\in W),\hfill \\ & c\in C\cup \overline{C},\phantom{\rule{0ex}{0ex}}C=\left\{\mathrm{CI},\phantom{\rule{0ex}{0ex}}\phantom{\rule{0ex}{0ex}}\mathrm{CO},\phantom{\rule{0ex}{0ex}}\phantom{\rule{0ex}{0ex}}\mathrm{CT}\right\},\phantom{\rule{0ex}{0ex}}\overline{C}=\left\{\mathrm{other}\right\}.\hfill \end{array}\end{array}$$
The applied classification method is based on the time series forest (TSF) approach [17] using a specific feature-engineering procedure. It splits the window, W, into randomly selected subintervals ${w}^{\left(j\right)}$ and assigns features like the mean, standard deviation, and slope to each subinterval. In total, J subintervals are obtained by the so-called bagging, a randomized splitting of the whole window, W, into fixed, equal-width subintervals ${w}^{\left(j\right)}\subset W$ with replacement (here, the bagging interval size is set to $2\phantom{\rule{0ex}{0ex}}\mathrm{s}$, resulting in $|{w}^{\left(j\right)}|=100$ data points; cf. Figure 6). In the following, $K=3$ interval features are considered and summarized in the feature vector, as follows:
$$\begin{array}{c}{\mathbf{x}}_{f}\left({w}^{\left(j\right)}\right)={\mathbf{x}}_{f}^{\left(j\right)}:={\left[{\mu}^{\left(j\right)},{\sigma}^{\left(j\right)},{\beta}^{\left(j\right)}\right]}^{T},\phantom{\rule{0ex}{0ex}}j=1,\dots ,J.\end{array}$$
The extracted features are the mean ${\mu}^{\left(j\right)}$, standard deviation ${\sigma}^{\left(j\right)}$, and the slope ${\beta}^{\left(j\right)}$ of a regression line, Figure 6, as follows:
$$\begin{array}{c}\begin{array}{ccc}{x}_{f,1}^{\left(j\right)}\hfill &:={\mu}^{\left(j\right)}\hfill &={\displaystyle \frac{1}{|{w}^{\left(j\right)}|}}\sum _{{t}_{i}\in {w}^{\left(j\right)}}d\left({t}_{i}\right),\hfill \\ {x}_{f,2}^{\left(j\right)}\hfill &:={\sigma}^{\left(j\right)}\hfill &=\sqrt{{\displaystyle \frac{1}{|{w}^{\left(j\right)}|-1}}\sum _{{t}_{i}\in {w}^{\left(j\right)}}{\left(d\left({t}_{i}\right)-{\mu}^{\left(j\right)}\right)}^{2}},\hfill \\ {x}_{f,3}^{\left(j\right)}\hfill &:={\beta}^{\left(j\right)}\hfill &={\displaystyle \frac{d\left({\widehat{t}}_{j}\right)-d\left({\check{t}}_{j}\right)}{{\widehat{t}}_{j}-{\check{t}}_{j}}},\phantom{\rule{0ex}{0ex}}{\widehat{t}}_{j}=\mathrm{arg}\underset{{t}_{i}\in {w}^{\left(j\right)}}{max}\phantom{\rule{0ex}{0ex}}d\left({t}_{i}\right),\phantom{\rule{0ex}{0ex}}{\check{t}}_{j}=\mathrm{arg}\underset{{t}_{i}\in {w}^{\left(j\right)}}{min}\phantom{\rule{0ex}{0ex}}d\left({t}_{i}\right)\hfill \end{array}\end{array}$$
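The three interval features can be computed directly from the formulas above; a minimal numpy sketch (the function name is illustrative, and a non-constant subinterval is assumed so that the slope denominator is nonzero):

```python
import numpy as np

def interval_features(d, t):
    """Mean, standard deviation, and max-min slope of one subinterval w(j)."""
    mu = d.mean()
    sigma = d.std(ddof=1)                  # 1/(|w|-1) normalization as above
    i_max, i_min = np.argmax(d), np.argmin(d)
    beta = (d[i_max] - d[i_min]) / (t[i_max] - t[i_min])
    return np.array([mu, sigma, beta])
```

For a linearly increasing subinterval, the slope feature reduces to the rise over the full time span.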
The calculation of the feature vectors ${\mathbf{x}}_{f}^{\left(j\right)}$ is repeatedly applied to all randomly selected subintervals, also known as sample generation. These samples are then used to build up a decision tree by learning splitting criteria ${x}_{f,k}^{\left(j\right)}\le {\tau}_{k}$ as the base component of a classification tree (see, e.g., Figure 7), classifying CI or non-CI. When satisfying this decision rule, the respective instances of the sample are sent to the left child node (blue); otherwise, to the right child node (yellow). These subsets are then further split until either the subsets become too small or a user-defined maximum depth ${N}_{f}$ of the classification tree is reached (in the following, ${N}_{f}=8$). In the example in Figure 7, ${N}_{f}$ is only three, and only binary classification is demonstrated for simplicity, although TSF is capable of dealing with multi-class problems, as in our application.
Similar to the rule-based decision tree, time series trees also employ a top-down, recursive strategy. However, in this approach, each node independently learns the best-split feature, ${x}_{f,k}$, and the best threshold ${\tau}_{k}$ for the split criterion from training, instead of pre-defining it. To achieve this, the best thresholds ${\tau}_{k}$ are found for all features ${x}_{f,k}$, $k=1,2,3$, individually by maximizing a combination of entropy gain and margin [17]. Let ${E}_{k}\left({\tau}_{k}\right)$ denote the maximum values of the split for each feature ${x}_{f,k}$; the one with the highest value is then selected for this specific node, as follows:
$$\begin{array}{c}{k}^{*}=\mathrm{arg}\underset{k}{max}{E}_{k}\left({\tau}_{k}\right).\end{array}$$
The resulting tree and, thus, its classification result, depends on the training set used. To make the classification procedure more robust, ensemble learning is applied: several different training subsets ${W}^{\left(l\right)}\subset W$ are generated by bagging, each resulting in its own decision tree $\left({\mathrm{DT}}^{\left(\mathit{l}\right)}\right)$. Applied to any test data, these multiple decision trees will predict different classification outcomes ${c}^{\left(l\right)}$. To end up with a unique result, this tree ensemble is combined, such that the majority of classes ${c}^{\left(l\right)}$ define the final class c; see Figure 8. In this paper, the classification model consists of $L=200$ time series decision trees, each resulting from $J=50$ subintervals ${w}^{\left(j\right)}$ consisting of $|{w}^{\left(j\right)}|=100$ data points, respectively. In order to train the classification trees, a training set with hundreds of examples for each class is required. Since manual labeling of real measurement data is rather time-consuming, trajectories are generated as idealized driving maneuvers, as described in Section 4. The advantage is that the labels are known a priori.
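The overall scheme—per-tree random intervals, interval features, and a majority vote over the ensemble—can be sketched as follows. This is a deliberately small illustration, not the model used in the paper: it relies on scikit-learn's entropy criterion instead of the combined entropy-gain/margin split of [17], uses a least-squares slope rather than the max-min slope, and all class names and hyperparameters are ours.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

class MiniTSF:
    """Tiny time-series-forest sketch: each tree sees mean/std/slope
    features from its own random intervals; the ensemble votes."""

    def __init__(self, n_trees=25, n_intervals=8, max_depth=8, seed=0):
        self.n_trees, self.n_intervals = n_trees, n_intervals
        self.max_depth = max_depth
        self.rng = np.random.default_rng(seed)

    def _features(self, X, intervals):
        feats = []
        for a, b in intervals:
            w = X[:, a:b]
            slope = np.polyfit(np.arange(b - a), w.T, 1)[0]  # per-sample slope
            feats += [w.mean(axis=1), w.std(axis=1), slope]
        return np.column_stack(feats)

    def fit(self, X, y):
        m = X.shape[1]
        self.trees = []
        for _ in range(self.n_trees):
            starts = self.rng.integers(0, m - 10, self.n_intervals)
            lengths = self.rng.integers(10, m // 2, self.n_intervals)
            intervals = [(s, min(s + l, m)) for s, l in zip(starts, lengths)]
            tree = DecisionTreeClassifier(criterion='entropy',
                                          max_depth=self.max_depth)
            tree.fit(self._features(X, intervals), y)
            self.trees.append((tree, intervals))
        return self

    def predict(self, X):
        votes = np.stack([t.predict(self._features(X, iv))
                          for t, iv in self.trees])
        # majority vote over the ensemble, column = one test window
        return np.array([max(set(col), key=list(col).count)
                         for col in votes.T])
```

On synthetic rising vs. falling trajectories, the vote of even this small forest separates the two classes reliably.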
The obtained time series-based machine learning classification model may then be applied to real-world data for automatic labeling of long-term measurements by performing classification of the moving windows. Figure 5 shows a short section of such a measurement with corresponding classification results. Similar to the rule-based approach in Figure 4, the AI approach finds the same four events, labeled 1, 2, 4, and 6 in Figure 5c,d. Beyond the rule-based approach, however, the AI classification detects three further events, 3, 5, and 7, highlighted in red in Figure 5d. These are misclassifications. The jump in event 3 from $d\approx 4$ to $d\approx -4$ was misinterpreted as a cut-through, although it actually resulted from switching the focus from the cut-in vehicle to another vehicle performing the next cut-in. Similarly, events 5 and 7 are misinterpretations of the different phases of the single cut-through maneuver 6. It should be noted that all these explanations are speculations, since the lack of causality in AI methods makes the interpretability of predictions difficult or even impossible. Admittedly, the sequence in Figure 5 was deliberately selected to demonstrate the possibility of misclassification. Therefore, the above comparison should not be taken too seriously; a rigorous assessment of the TSF performance is made in Section 5.
4. Generation of Idealized Maneuvers as Training Data
Apart from proper feature extraction and the model setup, the availability of enough training examples plays an important role in machine learning. In principle, labeled maneuvers obtained from measurements would be the best choice if available in a sufficient amount; otherwise, training data may be generated artificially [15,19]. To demonstrate the power of machine learning strategies, extremely idealized maneuvers will be used here, which are deduced from measurement data. According to Figure 1, ideal maneuvers have an initial phase with constant lateral distance $d\left(t\right)$, an S-shaped middle part describing the lane change, and a final phase that is also constant. Such an idealized cut-in maneuver is shown in Figure 9 as a red curve and can be described as a monotonic function ${y}_{i}=y\left({t}_{i}\right)$ by a cubic Hermite spline with three sections: (I) a constant initial phase with value ${d}^{0}$ for ${t}_{i}\le {t}^{0}$, (II) a 3rd-order polynomial, and (III) a constant end phase with value ${d}^{1}$ for ${t}_{i}\ge {t}^{1}$. Realistic values of the parameters determining this curve may be obtained by regression of a measured lateral trajectory $({t}_{i},\phantom{\rule{0ex}{0ex}}{d}_{i})$, $i=1,\dots ,100$; see the blue curve in Figure 9. Thereby, we assume $20\phantom{\rule{0ex}{0ex}}\mathrm{s}$ time snippets consisting of 100 points similar to the specifications above. For simplicity, the original real-time ${t}_{i}$ is substituted by its associated index, i, i.e.,
$${t}_{i}\in \left\{1,2,\dots ,100\right\},\phantom{\rule{0ex}{0ex}}\phantom{\rule{0ex}{0ex}}\phantom{\rule{0ex}{0ex}}{t}^{0},{t}^{1}\in \left[1,100\right],\phantom{\rule{0ex}{0ex}}\phantom{\rule{0ex}{0ex}}\phantom{\rule{0ex}{0ex}}{t}^{0}<{t}^{1}.$$
The goal of regression is to find the transition points ${t}^{0},\phantom{\rule{0ex}{0ex}}{t}^{1}$ as well as the initial and final position values ${d}^{0},\phantom{\rule{0ex}{0ex}}{d}^{1}$ in an optimal sense. This may be formulated as a two-stage minimization problem, as follows:
$$\underset{1\le {t}^{0}<{t}^{1}\le 100}{min}\phantom{\rule{0ex}{0ex}}\underset{{d}^{0},{d}^{1}}{min}\phantom{\rule{0ex}{0ex}}\sum _{i}{({d}_{i}-{y}_{i})}^{2}.$$
Generally, a cubic polynomial with boundaries $({t}_{0},\phantom{\rule{0ex}{0ex}}{f}_{0},\phantom{\rule{0ex}{0ex}}\dot{{f}_{0}})$ and $({t}_{1},\phantom{\rule{0ex}{0ex}}{f}_{1},\phantom{\rule{0ex}{0ex}}\dot{{f}_{1}})$ may be described as follows:
$$f\left(t\right)={f}_{0}{G}_{0}+{f}_{1}{G}_{1}+{\dot{f}}_{0}{H}_{0}+{\dot{f}}_{1}{H}_{1}$$
where
$${G}_{0}=2{\xi}^{3}-3{\xi}^{2}+1,\phantom{\rule{0ex}{0ex}}{G}_{1}=-2{\xi}^{3}+3{\xi}^{2},\phantom{\rule{0ex}{0ex}}{H}_{0}=({\xi}^{3}-2{\xi}^{2}+\xi )\Delta ,\phantom{\rule{0ex}{0ex}}{H}_{1}=({\xi}^{3}-{\xi}^{2})\Delta $$
with
$$\xi ={\displaystyle \frac{t-{t}_{0}}{\Delta}},\phantom{\rule{0ex}{0ex}}\Delta ={t}_{1}-{t}_{0}.$$
For zero derivatives $\dot{{f}_{0}}=\dot{{f}_{1}}=0$, Equation (7) simplifies to the following:
$$f\left(t\right)={f}_{0}{G}_{0}+{f}_{1}{G}_{1}.$$
Applied to the three sections, for the given ${t}^{0}$ and ${t}^{1}$, we obtain the following regression conditions:
$$\begin{array}{cccccc}\hfill 1:& \phantom{\rule{0ex}{0ex}}{y}_{i}=f\left({t}_{i}\right)\equiv {d}^{0}\phantom{\rule{0ex}{0ex}}\stackrel{!}{=}\phantom{\rule{0ex}{0ex}}{d}_{i}\hfill & \hfill \phantom{\rule{0ex}{0ex}}& \phantom{\rule{0ex}{0ex}}\forall {t}_{i}\le {t}^{0},\hfill & \hfill \phantom{\rule{0ex}{0ex}}& \\ \hfill 2:& \phantom{\rule{0ex}{0ex}}{y}_{i}=f\left({t}_{i}\right)\equiv {d}^{0}{G}_{0}+{d}^{1}{G}_{1}\phantom{\rule{0ex}{0ex}}\stackrel{!}{=}\phantom{\rule{0ex}{0ex}}{d}_{i}\phantom{\rule{0ex}{0ex}}\hfill & \hfill \phantom{\rule{0ex}{0ex}}& \phantom{\rule{0ex}{0ex}}\forall {t}^{0}\le {t}_{i}\le {t}^{1},\hfill & \hfill \phantom{\rule{0ex}{0ex}}& \\ \hfill 3:& \phantom{\rule{0ex}{0ex}}{y}_{i}=f\left({t}_{i}\right)\equiv {d}^{1}\phantom{\rule{0ex}{0ex}}\stackrel{!}{=}\phantom{\rule{0ex}{0ex}}{d}_{i}\phantom{\rule{0ex}{0ex}}\hfill & \hfill \phantom{\rule{0ex}{0ex}}& \phantom{\rule{0ex}{0ex}}\forall {t}_{i}\ge {t}^{1}.\hfill \end{array}$$
With submatrices $\mathbf{1}={[1\dots 1]}^{T}$, ${\mathbf{G}}_{\u2022}={[\dots {G}_{\u2022}\left({\xi}_{i}\right)\dots ]}^{T}$ and $\mathbf{d}={[\dots {d}_{i}\dots ]}^{T}$, this may be summarized as an overdetermined system of linear equations for ${d}^{0}$, ${d}^{1}$:
$$\underset{\mathbf{A}}{\underbrace{\left[\begin{array}{cc}\mathbf{1}& \mathbf{0}\\ {\mathbf{G}}_{\mathbf{0}}& {\mathbf{G}}_{\mathbf{1}}\\ \mathbf{0}& \mathbf{1}\end{array}\right]}}\underset{\mathbf{x}}{\underbrace{\left[\begin{array}{c}{d}^{0}\\ {d}^{1}\end{array}\right]}}=\underset{\mathbf{b}}{\underbrace{\left[\begin{array}{c}{\mathbf{d}}_{\mathbf{1}}\\ {\mathbf{d}}_{\mathbf{2}}\\ {\mathbf{d}}_{\mathbf{3}}\end{array}\right]}}$$
where $\mathbf{A}\in {\mathbb{R}}^{100\times 2}$, $\mathbf{x}\in {\mathbb{R}}^{2}$, $\mathbf{b}\in {\mathbb{R}}^{100}$. The values ${d}^{0}$ and ${d}^{1}$ then result from the least-squares solution ${\mathbf{x}}^{*}={\left({\mathbf{A}}^{T}\mathbf{A}\right)}^{-1}{\mathbf{A}}^{T}\mathbf{b}$, and the final Hermite spline is given as $\mathbf{y}={[\dots {y}_{i}\dots ]}^{T}=\mathbf{A}{\mathbf{x}}^{*}$.
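For fixed transition points, the inner least-squares step can be sketched in a few lines of numpy. Clipping $\xi$ to $[0,1]$ makes $G_0=1$, $G_1=0$ in section I and $G_0=0$, $G_1=1$ in section III, which reproduces the three stacked blocks of $\mathbf{A}$ in a single matrix (the function name is illustrative):

```python
import numpy as np

def fit_plateaus(d, t0, t1):
    """Least-squares fit of the plateau values d0, d1 for fixed t0 < t1.

    Builds the stacked system A x = b of the three sections and solves
    it; returns (d0, d1) and the fitted spline y = A x*.
    """
    n = len(d)
    i = np.arange(1, n + 1, dtype=float)           # t_i = 1..n (index time)
    xi = np.clip((i - t0) / (t1 - t0), 0.0, 1.0)   # xi=0 before t0, 1 after t1
    G0 = 2 * xi**3 - 3 * xi**2 + 1                 # Hermite blending, zero slopes
    G1 = -2 * xi**3 + 3 * xi**2
    A = np.column_stack([G0, G1])
    x, *_ = np.linalg.lstsq(A, d, rcond=None)
    return x, A @ x
```

Fitting a trajectory generated from known plateau values recovers them exactly, since the data lie in the column space of $\mathbf{A}$.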
The outer minimization problem in Equation (6) may be solved by any nonlinear optimization algorithm. In order to avoid constraint (5) for ${t}^{0}$ and ${t}^{1}$, we can re-parametrize the problem with normalized parameters ${p}_{1},{p}_{2}\in [0,1]$:
$$\begin{array}{cc}\hfill \phantom{\rule{0ex}{0ex}}& B:=1+({N}_{t}-2){p}_{1}\phantom{\rule{0ex}{0ex}}\in [1,{N}_{t}-1],\hfill \\ \hfill \phantom{\rule{0ex}{0ex}}& {t}^{0}=1+({N}_{t}-B-1){p}_{2}\phantom{\rule{0ex}{0ex}}\in [1,{N}_{t}-1],\hfill \\ \hfill \phantom{\rule{0ex}{0ex}}& {t}^{1}={t}^{0}+B\phantom{\rule{0ex}{0ex}}\phantom{\rule{0ex}{0ex}}\phantom{\rule{0ex}{0ex}}\in [2,{N}_{t}].\hfill \end{array}$$
A two-phase solution strategy is chosen, where global optimization by differential evolution [20] in the first phase is combined with local optimization by sequential quadratic programming (SQP) with the BFGS update [21] in the second phase to find globally optimal parameters ${p}_{1},{p}_{2}$ and, thus, ${t}^{0}$, ${t}^{1}$ for the regression problem (6). Each evaluation of the cost function requires a least-squares solution of Equation (12), which is computationally cheap. Differential evolution explores the parameter space to find an initial guess for the global minimum, while the SQP algorithm refines this solution to find a more precise minimizer.
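The two-phase strategy can be sketched with SciPy, where `differential_evolution` plays the role of the global phase and SciPy's SLSQP stands in for the SQP/BFGS implementation of [21]. The solver settings and the function name are illustrative, not the authors' configuration; the inner step is the linear least-squares solution described above.

```python
import numpy as np
from scipy.optimize import differential_evolution, minimize

def fit_maneuver(d, Nt=100):
    """Fit t0, t1 via the normalized parameters p1, p2 of Equation (11)."""
    def residual(p):
        p1, p2 = p
        B = 1 + (Nt - 2) * p1            # spline section length, B >= 1
        t0 = 1 + (Nt - B - 1) * p2
        t1 = t0 + B
        i = np.arange(1, Nt + 1, dtype=float)
        xi = np.clip((i - t0) / (t1 - t0), 0.0, 1.0)
        A = np.column_stack([2*xi**3 - 3*xi**2 + 1, -2*xi**3 + 3*xi**2])
        x, *_ = np.linalg.lstsq(A, d, rcond=None)   # inner LSQ for d0, d1
        return float(np.sum((d - A @ x) ** 2))

    # phase 1: global search; phase 2: local refinement of its result
    res = differential_evolution(residual, bounds=[(0, 1), (0, 1)],
                                 seed=0, maxiter=200)
    res = minimize(residual, res.x, method='SLSQP', bounds=[(0, 1), (0, 1)])
    return res.x, residual(res.x)
```

Because the re-parametrization keeps $p_1, p_2$ in the unit square, the box bounds replace the ordering constraint ${t}^{0}<{t}^{1}$ entirely.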
As a result, we find the idealized maneuvers in Figure 10. In the upper row, cut-in maneuvers start from lateral distances of about ${d}^{0}\approx 3.8\phantom{\rule{0ex}{0ex}}\mathrm{m}$, which is the typical lane width of German highways, and end at about ${d}^{1}\approx 0$. Cut-out maneuvers begin from the ego lane with ${d}^{0}\approx 0$ and involve a lane change to the neighboring lane, ending with ${d}^{1}\approx 3.8\phantom{\rule{0ex}{0ex}}\mathrm{m}$. Cut-through maneuvers typically start from the right neighboring lane with ${d}^{0}\approx -3.8\phantom{\rule{0ex}{0ex}}\mathrm{m}$ and proceed to ${d}^{1}\approx 3.8\phantom{\rule{0ex}{0ex}}\mathrm{m}$. In the lower row, analogous training maneuvers from the other side are shown. Obviously, measurements reveal a wide variety of slow and fast lane changes, which have to be detected by the algorithms described in Section 2 and Section 3. It should be noted that the training samples in Figure 10 could also be generated artificially by randomly selecting the parameters ${t}^{0}$, ${t}^{1}$, ${d}^{0}$, and ${d}^{1}$ within proper ranges and computing ${y}_{i}=y\left({t}_{i}\right)$ according to Equation (11). The classification model used here is trained with around 8000 generated trajectories for each scenario class.
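Such artificial sample generation can be sketched as follows; the parameter ranges, minimum transition duration, and function name are illustrative assumptions, not the ranges used by the authors.

```python
import numpy as np

def sample_maneuvers(n, d0_range, d1_range, Nt=100, seed=0):
    """Generate n idealized lateral trajectories y(t_i).

    Draws t0 < t1 and the plateau values d0, d1 at random and evaluates
    the zero-slope Hermite spline of the three-section maneuver model.
    """
    rng = np.random.default_rng(seed)
    i = np.arange(1, Nt + 1, dtype=float)
    out = np.empty((n, Nt))
    for k in range(n):
        t0 = rng.uniform(1, Nt - 20)          # transition start
        t1 = rng.uniform(t0 + 5, Nt)          # transition end, t1 > t0
        d0 = rng.uniform(*d0_range)
        d1 = rng.uniform(*d1_range)
        xi = np.clip((i - t0) / (t1 - t0), 0.0, 1.0)
        out[k] = d0 * (2*xi**3 - 3*xi**2 + 1) + d1 * (-2*xi**3 + 3*xi**2)
    return out

# e.g. cut-in from the left: d0 near 3.8 m down to d1 near 0 m
X_ci = sample_maneuvers(8000, d0_range=(3.3, 4.3), d1_range=(-0.3, 0.3))
```

Each generated trajectory starts exactly on its d0 plateau and ends on its d1 plateau, so the class label is known by construction.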
5. Comparison of Classification Strategies
In order to test the performance of the two classification approaches presented in Section 2 and Section 3 on real measurement data, 10 h of motorway driving are labeled manually, resulting in 77 cut-in (CI) maneuvers, 111 cut-out (CO) maneuvers, and 10 cut-through (CT) maneuvers. This is considered the ground truth. The classification quality of the rule-based (RB) decision tree and the TSF-based model may then be assessed from the confusion matrices in Table 1.
The absolute frequencies, or counts, of predictions from the corresponding classification models, compared to the actual scenario events that occurred, are displayed. For example, Table 1a can be interpreted as follows: the rule-based decision tree correctly classifies 66 of the 77 CI events but misses 11 CI events and incorrectly classifies 8 other events as CI. Obviously, both approaches avoid misclassification among the considered scenario types—CI, CO, or CT—but they do sometimes miss events or misclassify non-interesting measurement parts (= other) as one of the interesting scenarios. It is, therefore, interesting to cross-compare the classification errors of the two models. Table 2 lists the events that one method falsely predicted as ‘other’, together with the predictions of the respective other method for the same events.
Based on these results, it can be concluded that the TSF complements the rule-based classification approach: across all scenario classes, the TSF correctly predicts a significant number of events that were falsely labeled by the rule-based method. In contrast, the rule-based classifier correctly identifies only one event mislabeled by the TSF. At first glance, the TSF appears to perform better, but an objective assessment requires additional metrics.
To apply the usual classification quality metrics to this multi-class classification problem, a transformation is needed that decomposes it into several binary tasks. The simplest approach is the One-vs-all (OVA) transformation shown in Figure 11. It generates three binary problems for the three classes, $c\in C=\left\{\mathrm{CI},\phantom{\rule{0ex}{0ex}}\phantom{\rule{0ex}{0ex}}\mathrm{CO},\phantom{\rule{0ex}{0ex}}\phantom{\rule{0ex}{0ex}}\mathrm{CT}\right\}$, where each problem discriminates a selected class from the remaining three classes, including the class “other” [22]. The resulting binary classification then only distinguishes two discrete classes, e.g.,
$$\begin{array}{c}{c}_{b}=\left\{\begin{array}{cc}1\hfill & \phantom{\rule{0ex}{0ex}}\mathrm{if}\phantom{\rule{0ex}{0ex}}c=\mathrm{CI}\phantom{\rule{0ex}{0ex}}(\mathrm{cut}-\mathrm{in}),\hfill \\ 0\hfill & \phantom{\rule{0ex}{0ex}}\mathrm{if}\phantom{\rule{0ex}{0ex}}c\ne \mathrm{CI}\phantom{\rule{0ex}{0ex}}(\mathrm{non}-\mathrm{CI}).\hfill \end{array}\right.\end{array}$$
In our case, Table 1 can be transformed into the binary confusion matrices shown in Table 3.
An ideal classifier would classify all cut-in events as CI and all non-CI maneuvers as non-cut-in. Misclassification occurs when a cut-in maneuver is classified as non-CI, or a non-CI event is classified as CI. These two variants of misclassifications have rather different implications. They are, therefore, distinguished when determining the classification quality.
We can count the following four cases of classification results after the OVA transformation; see Table 3a:
${t}_{p}$: number of true-positive outcomes (CI is classified as CI).
${t}_{n}$: number of true-negative outcomes (non-CI is classified as non-CI).
${f}_{p}$: number of false-positive outcomes (non-CI is classified as CI).
${f}_{n}$: number of false-negative outcomes (CI is classified as non-CI).
Based on the absolute frequencies of these four cases, commonly used criteria may be applied. For instance, the accuracy, i.e.,
$${s}_{ACC}={\displaystyle \frac{\mathrm{correct}\phantom{\rule{0ex}{0ex}}\mathrm{predictions}}{\mathrm{all}\phantom{\rule{0ex}{0ex}}\mathrm{events}}}=({t}_{p}+{t}_{n})/({t}_{p}+{t}_{n}+{f}_{p}+{f}_{n})$$
indicates how well a binary classifier predicts the correct result. According to Table 3, the rule-based decision tree has a lower accuracy of ${s}_{ACC,DT}=196/215\approx 91.2\%$ compared to ${s}_{ACC,TSF}=195/205\approx 95.1\%$ of TSF and, thus, has a lower classification performance.
For problems with several classes, it is important to check the recall and precision of each class. The goal is to achieve maximum recall ${s}_{R}={t}_{p}/({t}_{p}+{f}_{n})$ and precision ${s}_{P}={t}_{p}/({t}_{p}+{f}_{p})$ for each label, which directly results in high accuracy. Recall is the probability that a real event is classified correctly, whereas precision is the probability that the prediction of an event is correct. In other words, high precision means that the identified scenarios are likely to belong to the corresponding class, whereas high recall means that a large proportion of the real scenarios is identified.
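Both definitions can be sketched directly from the cut-in counts of the rule-based decision tree in Table 3b (function names chosen for illustration):

```python
# Precision: of all events predicted as CI, how many are real cut-ins?
def precision(tp, fp):
    return tp / (tp + fp)

# Recall: of all real cut-ins, how many are found by the classifier?
def recall(tp, fn):
    return tp / (tp + fn)

print(f"{precision(66, 8):.1%}")  # 89.2%
print(f"{recall(66, 11):.1%}")    # 85.7%
```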
The results of the rule-based decision tree (DT) and time series forest (TSF) for all three scenario types are listed in Table 4. Additionally, the supervised TSF (sTSF) was investigated, which is an extension of TSF, where feature intervals are not randomly selected from the entire time series but are instead determined based on their discriminative ability, enhancing the efficiency of TSF [23]. Moreover, additional features such as median, minimum, maximum, and interquartile range were extracted. All three classification approaches demonstrate high precision and recall for all three scenarios. In contrast to the impression gained from the discussion in Section 3, the slight differences confirm that the TSF outperforms the other two classifiers. Notably, its highly sensitive approach results in outstanding precision, and its robustness, bolstered by a large number of decision trees with significant depth ${N}_{f}$, yields almost the best recall values for all scenario types.
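The per-class values in Table 4 can also be recomputed directly from the multi-class confusion matrix of Table 1, without an explicit OVA step. A sketch for the rule-based decision tree (matrix layout as in Table 1a; the helper function is illustrative):

```python
# Table 1a counts: rows = predicted class, columns = real class
# ("-" entries are zeros).
classes = ["CI", "CO", "CT", "other"]
cm = [
    [66,  0,  0,  8],   # predicted CI
    [ 0, 97,  0,  7],   # predicted CO
    [ 0,  0,  9,  2],   # predicted CT
    [11, 14,  1,  0],   # predicted other
]

def class_metrics(cm, i):
    tp = cm[i][i]
    fp = sum(cm[i]) - tp                 # others wrongly predicted as class i
    fn = sum(row[i] for row in cm) - tp  # class i predicted as something else
    return tp / (tp + fp), tp / (tp + fn)

for i, c in enumerate(classes[:3]):      # "other" is not a scenario class
    p, r = class_metrics(cm, i)
    print(f"{c}: precision {p:.1%}, recall {r:.1%}")
```

Running this reproduces the DT row of Table 4 (89.2%/85.7% for CI, 93.3%/87.4% for CO, 81.8%/90.0% for CT).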
Finally, an overall measure summarizing all three scenarios is computed. Typically, it should be taken into account that the evaluated dataset is imbalanced due to the different rates of occurrence of the scenario classes. However, because the classes are equally relevant for risk assessment, each scenario class is weighted equally here, defining the total precision ${s}_{P,t}$ and total recall ${s}_{R,t}$ as arithmetic means
$$s_{P,t}=\frac{1}{3}\sum_{c\in C}s_{P,c},\qquad s_{R,t}=\frac{1}{3}\sum_{c\in C}s_{R,c}.$$
This ultimately results in Table 5, highlighting that over all scenarios, the TSF outperforms the two other classification methods.
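The macro averages above can be sketched using the rounded decision-tree percentages from Table 4; up to rounding, this reproduces the DT row of Table 5:

```python
# Unweighted (macro) average of per-class precision and recall
# for the rule-based decision tree (values from Table 4).
prec = [0.892, 0.933, 0.818]   # CI, CO, CT precision
rec  = [0.857, 0.874, 0.900]   # CI, CO, CT recall

s_P_t = sum(prec) / len(prec)
s_R_t = sum(rec) / len(rec)
print(f"{s_P_t:.1%}, {s_R_t:.1%}")  # 88.1%, 87.7%
```

Note that each class contributes with weight 1/3 regardless of how often it occurs in the dataset, which is exactly the equal-relevance choice made above.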
6. Conclusions
Validating highly automated driver assistance systems requires accurate risk assessment of safety-relevant driving situations. Detecting and classifying such scenarios in large datasets of measurements is a crucial initial step in the validation process. The currently applied rule-based scenario classification fails to meet all quality needs and is therefore replaced here by a time series forest (TSF) approach, which achieves better classification results. Due to its use of lateral distances to all neighboring vehicles, the TSF approach misses fewer critical scenarios than the rule-based decision tree, which only looks ahead along the ego-driving tube. It is also more robust against misclassifications of non-critical scenarios, thus leading to more reliable predictions of failure probabilities. It should be emphasized that highly idealized samples have been used for training the machine learning model. To further improve its robustness, measured snippets of more realistic artificial maneuvers generated by [15] may be used instead. Additionally, features may be extracted from $s\left(t\right)$ and combined with the $d\left(t\right)$-features already used in the paper.
7. Patents
A patent application is in the preliminary examination.
Author Contributions
Methodology and paper writing, Z.K. and D.B.; software and investigation, Z.K.; supervision, D.B. All authors have read and agreed to the published version of the manuscript.
Funding
The second author is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), project no. 501840485.
Data Availability Statement
The datasets presented in this article are not available due to the privacy restrictions of the OEM. Requests to access the datasets should be directed to correspondence.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- European Parliament. Safer Roads: New EU Measures to Reduce Car Accidents. Available online: https://www.europarl.europa.eu/topics/en/article/20190307STO30715/safer-roads-new-eu-measures-to-reduce-car-accidents (accessed on 7 August 2024).
- Society of Automotive Engineers. Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles; SAE Standard J3016_202104; SAE International: Warrendale, PA, USA, 2021.
- van der Aalst, W. Six Levels of Autonomous Process Execution Management. arXiv 2022, arXiv:2204.11328.
- Richter, A.; Walz, T.P.; Dhanani, M.; Häring, I.; Vogelbacher, G.; Höflinger, F.; Finger, J.; Stolz, A. Components and their Failure Rates in Autonomous Driving. In Proceedings of the 33rd European Safety and Reliability Conference (ESREL 2023), Southampton, UK, 3–7 September 2023.
- Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016.
- Reichenbächer, C.; Rasch, M.; Kayatas, Z.; Wirthmueller, F.; Hipp, J.; Dang, T.; Bringmann, O. Identifying Scenarios in Field Data to Enable Validation of Highly Automated Driving Systems. In Proceedings of the 8th International Conference on Vehicle Technology and Intelligent Transport Systems (VEHITS), Online, 27–29 April 2022; SciTePress: Setúbal, Portugal, 2022; pp. 134–142. ISBN 978-989-758-573-9.
- Wagner, J.; Häring, J. Validierung von hochautomatisierten Fahrfunktionen mit cloud-basierter Simulation [Validation of Highly Automated Driving Functions with Cloud-Based Simulation]. ATZ 2020, 122, 50–55.
- Pfeffer, R.; He, J.; Sax, E. Development and Implementation of a Concept for the Meta Description of Highway Driving Scenarios with Focus on Interactions of Road Users. In Proceedings of the 6th International Conference on Vehicle Technology and Intelligent Transport Systems (VEHITS), Prague, Czech Republic, 2–4 May 2020.
- Bock, J.; Krajewski, R.; Eckstein, L.; Klimke, J.; Sauerbier, J.; Zlocki, A. Data Basis for Scenario-Based Validation of HAD on Highways. In Proceedings of the 27th Aachen Colloquium Automobile and Engine Technology, Aachen, Germany, 8–10 October 2018.
- PEGASUS. 2019. Available online: https://www.pegasusprojekt.de/en/pegasus-method (accessed on 14 April 2024).
- Muslim, H.; Endo, S.; Imanaga, H.; Kitajima, S.; Uchida, N.; Kitahara, E.; Ozawa, K.; Sato, H.; Nakamura, H. Cut-Out Scenario Generation with Reasonability Foreseeable Parameter Range from Real Highway Dataset for Autonomous Vehicle Assessment. IEEE Access 2023, 11, 45349–45363.
- Zlocki, A.; König, A.; Bock, J.; Weber, H.; Muslim, H.; Nakamura, H.; Watanabe, S.; Antona-Makoshi, J.; Taniguchi, S. Logical Scenarios Parameterization for Automated Vehicle Safety Assessment: Comparison of Deceleration and Cut-In Scenarios from Japanese and German Highways. IEEE Access 2022, 10, 26817–26829.
- Ma, X.; Ma, Z.; Zhu, X.; Cao, J.; Yu, F. Driver Behavior Classification under Cut-In Scenarios Using Support Vector Machine Based on Naturalistic Driving Data; SAE Technical Paper 2019-01-0136; SAE International: Warrendale, PA, USA, 2019.
- De Gelder, E.; Paardekooper, J.P.; Saberi, A.K.; Elrofai, H.; den Camp, O.O.; Kraines, S.; Ploeg, J.; De Schutter, B. Towards an Ontology for Scenario Definition for the Assessment of Automated Vehicles: An Object-Oriented Framework. IEEE Trans. Intell. Veh. 2022, 7, 300–314.
- Kayatas, Z.; Bestle, D.; Bestle, P.; Reick, R. Generation of Realistic Cut-in Maneuvers to Support Safety Assessment of Advanced Driver Assistance Systems. Appl. Mech. 2023, 4, 1066–1077.
- Pirouz, A. Identifizierung Kritischer Fahrszenarien aus Messdaten Mittels Machine Learning Methoden [Identification of Critical Driving Scenarios from Measurement Data Using Machine Learning Methods]. Master's Thesis, University of Stuttgart, Stuttgart, Germany, 2021.
- Deng, H.; Runger, G.; Tuv, E.; Vladimir, M. A Time Series Forest for Classification and Feature Extraction. Inf. Sci. 2013, 239, 142–153.
- Roesener, C.; Harth, M.; Weber, H.; Josten, J.; Eckstein, L. Modelling Human Driver Performance for Safety Assessment of Road Vehicle Automation. In Proceedings of the 21st International Conference on Intelligent Transportation Systems (ITSC), Maui, HI, USA, 4–7 November 2018.
- De Gelder, E.; Hof, J.; Cator, E.; Paardekooper, J.P.; den Camp, O.O.; Ploeg, J.; De Schutter, B. Scenario Parameter Generation Method and Scenario Representativeness Metric for Scenario-Based Assessment of Automated Vehicles. IEEE Trans. Intell. Transp. Syst. 2022, 23, 18794–18807.
- Storn, R.; Price, K. Differential Evolution—A Simple and Efficient Heuristic for Global Optimization over Continuous Spaces. J. Glob. Optim. 1997, 11, 341–359.
- Fletcher, R. Practical Methods of Optimization; Wiley: Chichester, UK, 1987.
- Rifkin, R.; Klautau, A. In Defense of One-vs-All Classification. J. Mach. Learn. Res. 2004, 5, 101–141.
- Cabello, N.; Naghizade, E.; Qi, J.; Kulik, L. Fast and Accurate Time Series Classification through Supervised Interval Search. In Proceedings of the IEEE International Conference on Data Mining, Sorrento, Italy, 17–20 November 2020.
Figure 1. Scenario identification from fleet measurements.
Figure 2. Measured data for longitudinal distance $s\left(t\right)$ and lateral distance $d\left(t\right)$.
Figure 3. Rule-based decision tree for scenario classification.
Figure 4. Scenarios identified from distance measurement $s\left(t\right)$ with corresponding snippets of lateral distance $d\left(t\right)$.
Figure 5. Selected moving windows (b) extracted from a lateral distance measurement (a) and corresponding classification (c), which finally results in aggregated scenarios (d).
Figure 6. Feature extraction from subinterval ${w}^{\left(j\right)}\subset W$.
Figure 7. Exemplary binary time series decision tree of depth ${N}_{f}=3$, where non-CI represents CO, CT, or other.
Figure 8. Exemplary ensemble of time series decision trees $\left({\mathrm{DT}}^{\left(l\right)}\right)$ with individual results ${c}^{\left(l\right)}$ combined to determine the final predicted class c.
Figure 9. Idealized parametric model (red) describing a real cut-in maneuver (blue) by the three-section Hermite-spline regression.
Figure 10. Samples of idealized maneuvers: (a) cut-in from left and right, (b) cut-out to left and right, and (c) cut-through from right and left.
Figure 11. One-vs-all transformation of multi-class classification problems into binary classification problems (colors and marker types refer to predicted classes, filled markers refer to correct classification, unfilled markers refer to misclassification).
Table 1. Confusion matrix for the scenario classification.

(a) rule-based decision tree

| prediction \ real scenario | CI | CO | CT | other |
|---|---|---|---|---|
| CI | 66 | - | - | 8 |
| CO | - | 97 | - | 7 |
| CT | - | - | 9 | 2 |
| other | 11 | 14 | 1 | - |

(b) TSF-based model

| prediction \ real scenario | CI | CO | CT | other |
|---|---|---|---|---|
| CI | 69 | - | - | 2 |
| CO | - | 103 | - | 4 |
| CT | - | - | 10 | 1 |
| other | 8 | 8 | 0 | - |
Table 2. Cross-comparison between prediction errors of the different classification approaches.

(a) wrong RB predictions and corresponding TSF predictions

| wrong RB | TSF: CI | TSF: CO | TSF: CT | TSF: other |
|---|---|---|---|---|
| CI = 11 | 4 | - | - | 7 |
| CO = 14 | - | 6 | - | 8 |
| CT = 1 | - | - | 1 | - |
| other | - | - | - | - |

(b) wrong TSF predictions and corresponding RB predictions

| wrong TSF | RB: CI | RB: CO | RB: CT | RB: other |
|---|---|---|---|---|
| CI = 8 | 1 | - | - | 7 |
| CO = 8 | - | - | - | 8 |
| CT = 0 | - | - | - | - |
| other | - | - | - | - |
Table 3. Confusion matrix for cut-in (CI) after OVA transformation.

(a) generally

| prediction \ real | CI | non-CI |
|---|---|---|
| CI | ${t}_{p}$ | ${f}_{p}$ |
| non-CI | ${f}_{n}$ | ${t}_{n}$ |

(b) rule-based decision tree

| prediction \ real | CI | non-CI |
|---|---|---|
| CI | 66 | 8 |
| non-CI | 11 | 130 |

(c) TSF-based model

| prediction \ real | CI | non-CI |
|---|---|---|
| CI | 69 | 2 |
| non-CI | 8 | 126 |
Table 4. Precision and recall of various classification methods.

| Classification Model | Cut-in ${s}_{P}$ | Cut-in ${s}_{R}$ | Cut-out ${s}_{P}$ | Cut-out ${s}_{R}$ | Cut-through ${s}_{P}$ | Cut-through ${s}_{R}$ |
|---|---|---|---|---|---|---|
| rule-based decision tree (DT) | 89.2% | 85.7% | 93.3% | 87.4% | 81.8% | 90.0% |
| time series forest (TSF) | 97.2% | 89.6% | 96.3% | 92.8% | 90.9% | 100.0% |
| supervised time series forest (sTSF) | 95.8% | 88.3% | 95.4% | 93.7% | 83.3% | 100.0% |
Table 5. Total precision and recall for various classification methods.

| Classification Model | ${s}_{P,t}$ | ${s}_{R,t}$ |
|---|---|---|
| rule-based decision tree (DT) | 88.1% | 87.7% |
| time series forest (TSF) | 94.8% | 94.1% |
| supervised time series forest (sTSF) | 91.5% | 94.0% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).