DECISION TREES AS A TOOL FOR REAL-TIME TRAVEL TIME ESTIMATION ON HIGHWAYS

This paper presents a travel time estimation model based on a decision tree. The input parameter for travel time prediction is the occupancy of detectors. The proposed model was tested on the most widely used arterial in Prague and also in the Czech Republic. This road section has many unmeasured inputs and outputs and, with only two detectors within the section, it is difficult to estimate the travel time. A temporary installation of an ALPR system is used for training the decision tree model, which consequently provides reliable travel time estimation.


Introduction
Travel time (TT) is an important indicator of traffic system performance. Its great advantage is that drivers and general users understand it easily, in contrast to the usual practice of expressing the Level of Service as an integer from 1 to 5 ("1" is free flow and "5" is congestion).
Travel time can be measured directly for every individual vehicle by a licence plate matching system, or estimated indirectly by many techniques, mostly using stationary traffic detectors, which are the most common measurement tool; see [1], [2]. However, automated licence plate recognition (ALPR) systems are still rather expensive, so the proposed method uses ALPR only to calibrate the TT estimates provided by stationary traffic detectors. The method described in this paper is based on a decision tree that estimates travel time from loop detector data and is calibrated by ALPR. The ALPR system can be installed for a limited time only, but it is highly recommended that the installation period cover a wide range of traffic situations.

Literature review
There are many methods for travel time estimation. A trajectory method takes spot speed measurements and transforms them into an estimate of the space-mean speed. The first conversion formula was described as early as 1952 by Wardrop [3]:

$$\bar{v}_\tau = \bar{v}_s + \frac{\sigma_s^2}{\bar{v}_s} \qquad (1)$$

where $\bar{v}_s$ is the average space-mean speed, $\bar{v}_\tau$ is the average time-mean speed and $\sigma_s^2$ is the variance of the space-mean speed distribution. In a real traffic environment the speed at a spot (time-mean speed) can be measured directly, but formula (1) calculates it from the space-mean speed, which cannot be obtained from standard inductive loops.
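The relation between time-mean and space-mean speed can be checked numerically. The sketch below is only an illustration: the spot speeds are hypothetical, and it assumes that the space-mean speed is the harmonic mean of the spot speeds and that the space variance weights each vehicle by 1/v (slower vehicles occupy the section longer).

```python
def time_mean_speed(speeds):
    """Arithmetic mean of spot speeds (what a point detector reports)."""
    return sum(speeds) / len(speeds)


def space_mean_speed(speeds):
    """Harmonic mean of spot speeds (assumed estimate of space-mean speed)."""
    return len(speeds) / sum(1.0 / v for v in speeds)


def space_speed_variance(speeds):
    """Variance of speed about the space mean, weighting each vehicle by 1/v."""
    v_s = space_mean_speed(speeds)
    weights = [1.0 / v for v in speeds]
    total = sum(weights)
    return sum(w * (v - v_s) ** 2 for w, v in zip(weights, speeds)) / total


# Hypothetical spot speeds in km/h, not taken from the paper.
spot_speeds = [60.0, 80.0, 100.0, 120.0]
v_t = time_mean_speed(spot_speeds)
v_s = space_mean_speed(spot_speeds)
var_s = space_speed_variance(spot_speeds)

# Wardrop's relation: time-mean speed = space-mean speed + variance / space-mean speed
wardrop_time_mean = v_s + var_s / v_s
```

Under these assumptions `wardrop_time_mean` agrees with `v_t` up to rounding, and the space-mean speed is always the lower of the two.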
Travel time is a more meaningful traffic parameter than traffic volume or detector occupancy. It can be used directly in intelligent transport systems to inform drivers through information displays placed on highways or roads. Travel time is also a measure of delay in a road network and can serve as an input variable in traffic control systems. A typical example is published in [4], whose authors put forward a new intersection control optimization method based on real-time delay estimation. The total delay consists of the average delay caused by signal control, the additional delay caused by the queue of preceding vehicles, and the stochastic delay. Obtaining the total delay analytically requires equations with many parameters that are complicated to set. The presented method avoids the need for analytical methods with many boundary conditions by using AI methods.
A neural network model optimized by a genetic algorithm for real-time traffic flow forecasting is presented in [5]. Traffic flow is a more predictable parameter than travel time, because the traffic flow curve varies slowly and is only "modulated" by pseudo-stochastic, randomly distributed clusters of vehicles. The average relative error of the traffic flow forecast in [5] is 5.53%. The deviation of the TT estimation method proposed here does not exceed 6%, while the method is considerably less complicated than the neural-network-oriented one.

A. Trajectory based methods
Trajectory methods are the simplest and most commonly used techniques for estimating travel time [6]. Both methods described here assume that the measured speed stays constant until the next detector (or over half the distance on either side of the detector). In the half-distance method, the speed measured by a detector applies to half the distance on both sides of it:

$$TT = \frac{1}{2}\left(\frac{\Delta x}{v_1} + \frac{\Delta x}{v_2}\right) \qquad (2)$$

In the average-speed method, the speed over the section is assumed to be the average of the speeds measured by detectors 1 and 2:

$$TT = \frac{2\,\Delta x}{v_1 + v_2} \qquad (3)$$

where $\Delta x$ is the distance between detectors 1 and 2 and $v_1$, $v_2$ are the speeds they measure. The main disadvantage of the constant-speed trajectory methods is that their performance decreases as traffic congestion increases.
The author of [1] describes a method that uses raw (not time-aggregated) data from loop detectors. The spot speed of an individual vehicle is considered constant until the next measurement of another available vehicle. The trajectory of an individual vehicle in time and space is constructed from the sequence of speed measurements obtained from that vehicle and a few of its successors. The author reports a mean error of 7 to 10%, assuming a constant shock wave speed of 22.4 km/h. Unfortunately, this method depends on raw input data, which are not commonly available online in the Czech Republic.
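The two constant-speed rules described above can be sketched as follows. The function names, the 1 km spacing and the speeds are illustrative assumptions, not values from the paper.

```python
def half_distance_tt(dx_km, v1_kmh, v2_kmh):
    """Travel time in seconds when each detector's speed applies to the
    half of the section nearest to it."""
    hours = 0.5 * dx_km / v1_kmh + 0.5 * dx_km / v2_kmh
    return hours * 3600.0


def average_speed_tt(dx_km, v1_kmh, v2_kmh):
    """Travel time in seconds when the section speed is the arithmetic
    mean of the two detector speeds."""
    hours = dx_km / ((v1_kmh + v2_kmh) / 2.0)
    return hours * 3600.0


# Hypothetical example: detectors 1 km apart, 100 km/h upstream, 50 km/h downstream.
tt_half = half_distance_tt(1.0, 100.0, 50.0)
tt_avg = average_speed_tt(1.0, 100.0, 50.0)
```

When the two speeds differ, the half-distance rule never gives a shorter travel time than the average-speed rule, because it averages the travel times of the two halves rather than the speeds.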

B. Traffic flow theory based methods
Theoretical models have also been developed for the estimation of travel time from loop detector data based on traffic flow theory.
The Adaptive Kalman Filter (AKF) is used in [2] to improve the travel time estimate from inductive loops using a small number of probe vehicles. Unfortunately, the probe vehicles were only simulated, using the microscopic simulation model PARAMICS. Because travel time estimation from point detectors is inaccurate, traffic data from other sources can be incorporated to improve the estimate. According to [2], recent advances in probe vehicle technologies, such as Global Positioning Systems (GPS), Automatic Vehicle Identification (AVI), cellular phone positioning, and vehicle re-identification, have shown the potential of probe vehicles as another valuable real-time traffic data source.
Neither data input (loops or probe vehicles) truly reflects the section travel time on its own, especially under recurrent or non-recurrent traffic congestion.
The method proposed in [2] consists of two parts. The first is based on the conservation (continuity) equation:

$$\frac{\partial q}{\partial x} + \frac{\partial k}{\partial t} = 0 \qquad (4)$$

where $q$ is flow (vehicles/hour), $k$ is density (vehicles/km), $x$ is location, and $t$ is time.
For a typical urban freeway section including one on-ramp and one off-ramp, the traffic flow passing the section during the time period $(t-1, t)$ is estimated by equation (5), where $\alpha$ is a smoothing parameter set to 0.5, $q_u(t)$ and $q_d(t)$ are the traffic flows at the upstream and downstream boundaries within $(t-1, t)$, and $q_{on}(t)$ and $q_{off}(t)$ are the total on-ramp and off-ramp traffic flows within $(t-1, t)$.
Assuming that the traffic inside the section is homogeneous, an intuitive estimate of the section travel time is given by equation (6), where $\Delta x$ is the length of the section between the upstream and downstream detectors. The section density $k(t)$ can be represented as a time series, equation (7), where $L$ is the number of lanes on the mainline freeway.
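Under the symbol definitions above, equations (5) to (7) can plausibly be read as follows. This is a hedged reading, not a quotation from [2]: the pairing of inflow with $\alpha$ and outflow with $1-\alpha$, the interval length $\Delta t$ (in hours), and the treatment of $k(t)$ as a per-lane density are assumptions chosen so that the units stay consistent.

```latex
\begin{align}
% (5), assumed form: smoothed section flow from total inflow and total outflow
q(t) &= \alpha \left[ q_u(t) + q_{on}(t) \right]
      + (1 - \alpha) \left[ q_d(t) + q_{off}(t) \right] \\
% (6), assumed form: homogeneous section, speed v(t) = q(t) / (L k(t))
tt(t) &= \frac{\Delta x}{v(t)} = \frac{\Delta x \, L \, k(t)}{q(t)} \\
% (7), assumed form: vehicle conservation over the interval (t-1, t)
k(t) &= k(t-1) + \frac{\Delta t}{\Delta x \, L}
        \left[ q_u(t) + q_{on}(t) - q_d(t) - q_{off}(t) \right]
\end{align}
```

With $\alpha = 0.5$ the first expression is simply the mean of the flow entering and the flow leaving the section.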
The Kalman filter is used to correct the section density, which is in turn used for travel time estimation. Section density is treated as a state variable and the section travel time is treated as a measurement variable. The performance of the proposed algorithm is evaluated by the Mean Absolute Percentage Error (MAPE). The author also simulated constant and time-varying error patterns on the loop detectors in order to evaluate how the proposed method handles these common kinds of error.
The AKF algorithm using 5% probe vehicles in the traffic stream (simulated in the PARAMICS model) has a MAPE in the range of 7.6 to 9.7%, compared with up to 16.6% for the two single-data-source methods, which is a significant improvement. The author also stated that better performance can be expected from the AKF model with a higher probe rate, although no further improvement is observed once the probe rate exceeds 20%.
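The state and measurement roles described above can be illustrated with a single scalar Kalman step. This is a minimal sketch under stated assumptions, not the algorithm of [2]: the density is taken as the total over all lanes, the measurement model is assumed to be tt = (Δx / q) · k, the noise variances are fixed by hand, and the adaptive estimation of those variances is left out. All names and numbers are hypothetical.

```python
def kalman_density_step(k_prev, p_prev, delta_k, tt_probe_h,
                        dx_km, q_vph, process_var, meas_var):
    """One predict/update cycle with section density as the state.

    k_prev, p_prev : previous density estimate (veh/km) and its variance
    delta_k        : density change predicted from detector counts (veh/km)
    tt_probe_h     : travel time measured by probe vehicles (hours)
    dx_km, q_vph   : section length (km) and section flow (veh/h)
    process_var    : assumed variance of the density prediction error
    meas_var       : assumed variance of the probe travel time error
    """
    # Predict: propagate the density and inflate its uncertainty.
    k_pred = k_prev + delta_k
    p_pred = p_prev + process_var

    # Assumed linear measurement model: tt = h * k with h = dx / q.
    h = dx_km / q_vph
    innovation = tt_probe_h - h * k_pred
    innovation_var = h * p_pred * h + meas_var

    # Update: blend prediction and measurement by the Kalman gain.
    gain = p_pred * h / innovation_var
    k_new = k_pred + gain * innovation
    p_new = (1.0 - gain * h) * p_pred
    return k_new, p_new


# Hypothetical numbers: 3.6 km section, 4000 veh/h, probes report 3 minutes.
k_est, p_est = kalman_density_step(
    k_prev=30.0, p_prev=25.0, delta_k=2.0, tt_probe_h=0.05,
    dx_km=3.6, q_vph=4000.0, process_var=1.0, meas_var=1e-6)
tt_estimate_h = 3.6 * k_est / 4000.0
```

A small `meas_var` pulls the density towards the value implied by the probe travel time, while a large one leaves the detector-based prediction almost unchanged.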
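For reference, MAPE can be computed as in the short sketch below. The travel times are made-up values used only to show the arithmetic.

```python
def mape(measured, estimated):
    """Mean Absolute Percentage Error, in percent, relative to the measured values."""
    if len(measured) != len(estimated) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs(m - e) / abs(m) for m, e in zip(measured, estimated)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical travel times in seconds.
example_mape = mape([100.0, 200.0], [110.0, 180.0])
```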

Proposed model
The travel time estimation model presented in this paper (see Fig. 1) and in [7] improves travel time estimates based only on data from a series of loop detectors. The automatic licence plate recognition (ALPR) system was temporarily used for training of the decision tree model. The section on which the travel time estimation was tested is 3.6 kilometres long and has three measurement profiles equipped with traffic loop detectors. The average distance between detectors is greater than one kilometre. Applying the analytical methods described in Chap. 2 was not successful: the difference between the measured and the estimated TT was often higher than 100 percent, depending on the actual traffic volume. The results were poorer for low or very high traffic volumes. The reason is simple: the distance between detectors is too large. For instance, the recommended distance between fixed detectors on highways in the Netherlands is about 500 m.
A road without many exits is the most appropriate for accurate TT estimation, but that is not the case for the tested road; see Fig. 2. Jizni spojka is an arterial road in Prague that carries more than 100 000 passenger cars and connects the southern and northern parts of Prague. The short tested section contains three significant exits and two petrol stations. Arriving and departing vehicles significantly influence the accuracy of TT estimation. To reduce the inaccuracy as far as possible, a method based on artificial intelligence is probably the best choice. Well-trained decision trees have the capacity to cover all traffic scenarios on this road.
A. Decision tree learning process
Decision trees (DT) can be used to discover features and extract patterns in large databases that are important for predictive models. They have an established position among artificial intelligence methods. Here, DT are used as a classification and prediction tool that estimates travel time from input parameters, namely the occupancy of detectors.
Detector occupancy is the ratio between the time during which vehicles are over the detector and the scanning period, expressed as a percentage.
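The definition translates directly into code. In this sketch the presence times and the 30-second scanning period are hypothetical.

```python
def occupancy_percent(presence_times_s, period_s):
    """Share of the scanning period during which vehicles were over the
    detector, in percent."""
    if period_s <= 0:
        raise ValueError("scanning period must be positive")
    return 100.0 * sum(presence_times_s) / period_s


# Three vehicles covering the detector for 0.5 s, 0.4 s and 0.6 s in a 30 s period.
example_occupancy = occupancy_percent([0.5, 0.4, 0.6], 30.0)
```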
The decision tree learning process requires input variables, which are traffic parameters (occupancy), and output values, which are travel times. Figure 2 depicts the relevant part of Jizni spojka in Prague. The whole section is enclosed by the green (start) and red (finish) flags. Two video-detectors (2 and 3) lie within the road link section; the first one is located outside it, at position -364 metres from the start. The ALPR system measured TT between the green flag and the red flag, a distance of 3644 metres. Data recorded over one month contain both free-flow and congested traffic states. 90% of the data was used for learning the decision tree model (training data); the rest was used as verification data. The preliminary tests of the decision tree model (DTM) used traffic volumes as well as detector occupancy. Many experiments showed that the trees are too complex in this case and that practically the same results can be achieved using occupancy only.
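A 90/10 division of the recorded data can be sketched as below. The paper does not say whether the split was random or chronological, so the seeded random shuffle here is an assumption, as are the function and parameter names.

```python
import random


def split_train_verify(records, train_fraction=0.9, seed=0):
    """Shuffle the records reproducibly and cut them into training and
    verification parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical record identifiers standing in for 5-minute samples.
all_records = list(range(100))
train_records, verify_records = split_train_verify(all_records)
```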
The final model uses five-minute aggregated occupancy values from all three video-detectors as input variables. Based on these inputs, the travel time estimate is provided at the same time step (5 minutes). Part of the resulting decision tree for travel time estimation is shown in Fig. 3, which illustrates how complex the created DT can be.
Very fast evaluation of binary (if-then) conditions is a significant advantage of decision trees in traffic engineering applications. Hence the decision-tree-based model can be implemented easily.
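To make the if-then structure concrete, the following is a minimal pure-Python regression tree that maps three occupancy values to a travel time. It is a sketch only: the splitting rule (greedy minimisation of the sum of squared errors), the depth limit and the data are assumptions for illustration, and the tool actually used to build the tree in Fig. 3 is not specified in the text.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, targets):
    """Return (feature index, threshold) with the lowest total SSE, or None."""
    best, best_cost = None, sse(targets)
    for f in range(len(rows[0])):
        for thr in sorted({r[f] for r in rows}):
            left = [t for r, t in zip(rows, targets) if r[f] <= thr]
            right = [t for r, t in zip(rows, targets) if r[f] > thr]
            if not left or not right:
                continue
            cost = sse(left) + sse(right)
            if cost < best_cost - 1e-12:
                best, best_cost = (f, thr), cost
    return best


def build_tree(rows, targets, depth=0, max_depth=3, min_samples=2):
    """Grow the tree recursively; leaves store the mean travel time."""
    split = None
    if depth < max_depth and len(rows) >= min_samples:
        split = best_split(rows, targets)
    if split is None:
        return {"leaf": mean(targets)}
    f, thr = split
    left = [(r, t) for r, t in zip(rows, targets) if r[f] <= thr]
    right = [(r, t) for r, t in zip(rows, targets) if r[f] > thr]
    return {
        "feature": f,
        "threshold": thr,
        "left": build_tree([r for r, _ in left], [t for _, t in left],
                           depth + 1, max_depth, min_samples),
        "right": build_tree([r for r, _ in right], [t for _, t in right],
                            depth + 1, max_depth, min_samples),
    }


def predict(tree, row):
    """Follow binary if-then tests from the root down to a leaf."""
    while "leaf" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["leaf"]


# Synthetic 5-minute occupancies (%) of three detectors and travel times (s).
occupancies = [(5, 6, 4), (8, 7, 6), (10, 9, 8), (25, 30, 28),
               (30, 35, 33), (45, 50, 55), (50, 55, 60)]
travel_times = [150, 155, 160, 300, 320, 600, 640]

tree = build_tree(occupancies, travel_times)
free_flow_tt = predict(tree, (6, 6, 5))
congested_tt = predict(tree, (48, 52, 58))
```

Prediction is a handful of comparisons per sample, which is why such a model is cheap to evaluate at every 5-minute step.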

B. Model validation and performance evaluation
The remaining input data (about 10%), which had not been used for training, was then used to validate model performance against the ground-truth TT from the ALPR system. Three performance indicators were used for evaluation: MSE (mean square error), MAE (mean absolute error) and SSE (sum of squared errors), all of average speed; see Table 1. The travel time error (in seconds) corresponds to the mean absolute error of the speed on the road section. The diagram in Fig. 4 (below) is a graphical presentation of the TT measured by ALPR (red curve) and the value predicted by the decision tree model (DTM, green curve). The x-axis represents time. The curve at the bottom is the residual.
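The three indicators can be computed as in the sketch below. The speed values are hypothetical and serve only to show the arithmetic; they are not the results reported in Table 1.

```python
def sum_squared_error(measured, estimated):
    """SSE: sum of squared differences between measured and estimated values."""
    return sum((m - e) ** 2 for m, e in zip(measured, estimated))


def mean_squared_error(measured, estimated):
    """MSE: SSE divided by the number of samples."""
    return sum_squared_error(measured, estimated) / len(measured)


def mean_absolute_error(measured, estimated):
    """MAE: average absolute difference."""
    return sum(abs(m - e) for m, e in zip(measured, estimated)) / len(measured)


# Hypothetical average speeds in km/h: ALPR-derived versus model-derived.
alpr_speed = [80.0, 60.0, 40.0]
model_speed = [78.0, 63.0, 40.0]

sse_value = sum_squared_error(alpr_speed, model_speed)
mse_value = mean_squared_error(alpr_speed, model_speed)
mae_value = mean_absolute_error(alpr_speed, model_speed)
```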

Practical application
A system for predicting travel time from "only" a very limited set of input data was implemented on the Czech website of the City Hall of Prague so that road users can check the accuracy of the proposed system; see Fig. 5.
Three different colours represent the three lanes measured on the tested link section (the legend translates as: levy / stredny / pravy pruh = left / middle / right lane). The occupancy of the three detectors is depicted in the figure above, and the predicted travel time is shown in the table. The TT was predicted by both decision tree and neural network tools. The displayed outputs of the two AI methods were compared with each other for research purposes.

Conclusion and future work
The presented method is very effective because it makes it possible to estimate travel times relatively reliably even when traffic detectors are far from each other. Compared with the neural network models, which were also tested within the project, the decision tree model is easier to control and to interpret.
Nevertheless, the temporary installation of an ALPR system for TT measurement also incurs operational expenditures. The research continues towards replacing the ALPR equipment with floating car data. The first measurements on the D1 highway showed that the penetration of floating cars in the traffic stream is about 5%. This share of floating cars could calibrate the travel times estimated by fixed detectors very precisely.