THE INFLUENCE OF TRAFFIC, GEOMETRIC AND CONTEXT VARIABLES ON URBAN CRASH TYPES: A GROUPED RANDOM PARAMETER MULTINOMIAL LOGIT APPROACH


ABSTRACT
Numerous road safety studies have been dedicated to the estimation of crash frequency and injury severity models. However, previous research has shown that different factors may influence the occurrence of crashes of different types. In this study, a dataset including information on crashes that occurred at segments and intersections of urban roads in Bari, Italy was used to estimate the likelihood of occurrence of various crash types. The crash types considered are: single-vehicle, angle, rear-end and sideswipe. Models were estimated through a mixed logit structure considering the crash types as outcomes of the dependent variable and several traffic, geometric and context-related factors as explanatory variables (both site- and crash-specific). To account for systematic, unobserved variations among the crashes that occurred on the same segment or intersection, the grouped random parameters approach was employed. This approach allows the estimation of segment- or intersection-specific parameters for the variables resulting in random parameters, and hence an assessment of the variability of results across the observations for individual segments/intersections.
Segment type and the presence of bus lanes were included as explanatory variables in the model of crash types for segments. Traffic volume per entering lane, total entering lanes, total number of zebra crossings and the balance between major and minor traffic volumes at intersections were included as explanatory variables in the model of crash types for intersections. Area type was included in both segment and intersection models. The typical traffic at the moment of the crash (from on-line traffic prediction tools) and the period of the day were associated with different crash type likelihoods for both segments and intersections. Significant variations in the effect of several predictors across different segments or intersections were identified. The applicability of the study framework is demonstrated in terms of identifying roadway sites with anomalous tendencies or high-risk sites with respect to specific crash types.

Introduction
Urban road crashes result in about 15,000 deaths per year in the European Union alone (EU-28: 1999-2014 Eurostat data). A recent study (Bauer et al., 2016) has pointed out that urban road fatalities are decreasing over time in the EU, but their percentage among all crashes is nearly stable (actually, it is slightly increasing). Moreover, in some South/Eastern European countries and Portugal (see Bauer et al., 2016), fatalities caused by urban crashes account for more than half of the total fatalities. In the United States, the number of urban fatalities has, on average, been increasing over the 10-year period up to 2017, and urban fatalities have exceeded rural fatalities in recent years (NHTSA, 2019). Since the crash involvement rate of vulnerable road users is notable in urban environments (especially in serious-injury crashes, see Aarts et al., 2016), the need for safer cities (in particular for vulnerable road users) requires a thorough understanding of the generation mechanism of severe urban crashes.
There is a considerable amount of research in the field of crash frequency modelling for urban road segments and intersections (Sayed and Rodriguez, 1999; Lord and Persaud, 2000; Persaud et al.; Harwood et al., 2007). However, as highlighted in Colonna et al. (2019a), most of these studies concern urban roads in the U.S., which may be significantly different from European urban environments. Transferability issues of models from the U.S. to European contexts (and even within the same country) have indeed already been raised (Sacchi et [...] Behnood and Mannering, 2019). However, also in the case of severity models, most studies were conducted with data from the U.S. and by considering the rural or mixed urban/rural environment.
Besides modelling crash frequency and crash severity, previous research (Kim et al., 2006, 2007; Jonsson et al., 2007, 2009) has shown the importance of differentiating crashes into crash types, in order to highlight variations in the influence of traditional predictors. However, this aspect is often overlooked in crash frequency and crash severity analyses, especially in urban environments. For instance, all the above-cited studies (Kim et al., 2006, 2007; Jonsson et al., 2007, 2009) refer to rural intersections. Differentiating crashes by type and studying the differences between influential predictors is also crucial for identifying specific countermeasures that can be effective for a given crash type (see e.g., Retting et al., 1995). In fact, some countermeasures can generally improve safety performance, e.g., those aimed at reducing speeds, leading in turn to crash reduction (Aarts and Van Schagen, 2006; Elvik, 2013). However, others are specifically targeted at particular crash types. For example, if there is a significant number of angle crashes at signalized intersections, then traffic light systems could be improved (e.g., by implementing dedicated turn signals, depending on the prevailing traffic flow and the intersection-specific crash patterns). This evidence could not emerge from a traditional crash frequency model or an injury severity analysis.
Hence, this study is focused on the analysis of the predictors of specific urban road crash types. Using a dataset of urban crashes and related site-specific and crash-specific explanatory variables, the probability of occurrence of a crash of a given type (conditional on a crash having occurred and been recorded through a crash report) is modelled.
This problem is typically addressed through a multinomial logit structure, in case of non-binary crash outcomes. Multinomial logit structures were extensively used in previous research concerning injury severity analysis (see e.g., Shankar and Mannering, 1996 [...] parameter multinomial logit structure for predicting crash types. As previously discussed, highlighting the specific influence of the considered predictors at the segment/intersection level may reveal local patterns, which is useful for practical purposes (i.e., selecting specific countermeasures).
The study answers the following main research questions:
• What are the main geometric and traffic-related predictors of crash types on urban segments and intersections?
• Is it possible to associate crash-specific variables (i.e., context variables, not directly related to the geometry of segments and intersections) with different urban crash types?
• Does the influence of predictors on crash types vary considerably across segments or intersections?
The research questions are addressed by analysing a dataset from an Italian city. Considering the aforementioned gaps in previous research, this study, which is exploratory in nature, expands the existing knowledge in several ways: a) conducting a safety analysis disaggregated by crash type, b) deepening knowledge related to urban road safety predictions, c) highlighting results from the application of a grouped random parameter multinomial logit structure to crash type prediction, d) using a dataset from a European city, considering the impact of the urban spatial setting on traffic safety.
The remainder of the paper is structured as follows. The methods used for data analysis are described in detail in the next section. Then the modelling results are presented and discussed, in light of previous relevant research.
The applicability of the results is shown in practice, by highlighting specific high-risk sites based on the modelling results. Finally, the main conclusions from the study are drawn.

Methods
The methods used in this article are described as follows, starting with the crash dataset and the predictors that were used for the statistical analysis of crash types. Next, the statistical methods used for model estimation are presented in detail.

Database
The study is part of a larger national research project ("Scientific Park for Road Safety", funded by the Italian Ministry of Transport and Infrastructures, leading agency: Municipality of Bari, Italy). In this project, evidence from local urban road safety studies is used to infer possible policies and strategies that may help reduce urban crashes at a higher level (e.g., at a national level). In the context of this research project, data about crashes that occurred on the road network of the Municipality of Bari between 2012 and 2016 were collected and combined with possible influential variables that may be related to crashes. The City of Bari is a medium-sized Southern Italian city, with a population of about 320,000 inhabitants and an area of about 120 km².
Crash data were provided by ASSET (http://asset.regione.puglia.it/), the local agency that manages these data in collaboration with the National Institute of Statistics (ISTAT). In addition to publicly available crash data, the exact localisation of the crash (GPS position) is included in the dataset provided. Note that the crash dataset provided, according to the European state of practice, includes only fatal+injury crashes, which are locally collected and standardized by ISTAT. The crash dataset includes information about the day, hour, crash type, the involved vehicles and users, the contributory factors and the boundary conditions (i.e., weather, pavement, etc.). Other information, such as road geometric data and traffic volumes, was instead manually matched with the crash data (more details are provided in Intini et al., 2019b; Colonna et al., 2019b).
Based on localisation, crash data were assigned to the road segments or intersections.
In cases where inaccuracies in the data localisation did not allow the crash site to be identified precisely, the records were removed from the initial dataset. Give-way/stop lines and zebra crossings (included in the intersection area if close to the intersections) were initially used as preliminary thresholds for intersection-related crashes. However, given the high probability of misclassifying crashes (as intersection- or segment-related) when the classification is based on fixed thresholds (e.g., distance from the intersection centre or position of stop lines/crossings), crash locations, types, circumstances and related features were manually explored to distinguish the intersection-related crashes from the segment-related crashes. This further level of preliminary analysis was necessary given that this study is focused on crash types, separately assessed for segments and intersections. Moreover, segments were divided into homogeneous sections on the basis of their internal geometric characteristics (e.g., a different number of lanes, or the presence of medians). In other words, if notable macro-differences were identified among different sections of the same segment located between two major intersections (excluding driveways and intersections with minor roads), that segment was split into two or more homogeneous sections (AASHTO, 2010). For this reason, the word "segment" henceforth refers to homogeneous sections. Descriptive statistics about crash data are reported as follows, differentiated for segments and intersections of the urban road network.
Since the study is focused on crash types, information about crash types was retrieved from the database. The most disaggregate classes found for crash types are: run-off-road, fixed object, pedestrian hit, fallen from vehicle, angle, head-on, sideswipe (not further classified by vehicle directions), rear-end.
Since some of these categories were significantly under-represented in the sample (e.g., the fallen from vehicle crash: only 2 crashes), crash types were grouped into broader categories. Run-off-road, fixed-object, pedestrian hit and fallen from vehicle crashes were grouped into a "single-vehicle" crash type, given that only one vehicle was involved. Moreover, head-on crashes account for only about 3% of the total sample (29 out of 1036 [...] variable such as e.g., more or less than four entering lanes, because the latter classification was deemed to assume a higher degree of arbitrariness in the threshold lanes with respect to the continuous variation. However, the authors are not interested here in specifically assessing the effect of each additional entering lane; rather, the number of entering lanes was used in this study as a proxy measure for the complexity of the intersection. In fact, it is assumed that the complexity can have an influence on different crash type outcomes.
Area type was defined with regard to different city areas, as shown in [...] city centre areas reflecting operating speeds significantly lower than 50 km/h and transition areas reflecting operating speeds significantly higher than 50 km/h. To capture this difference, the area type variable was introduced in the analysis. Segments in sparsely populated areas that lead to the main beltway connecting to the rural network were assigned to the "transition area" category, as were the intersections lying on them. Moreover, the transition area variable is also used as a surrogate measure of parking, since most of the sample sites included in this area have no on-street parking, contrary to the roads belonging to the other area types (city centre and neighbourhoods).
Crash-specific explanatory variables were obtained from the crash dataset.
They include basic information such as crash date and hour and pavement conditions at the moment of the crash. Based on this information, the following variables were defined: season, type of day (weekday or weekend/holiday), period of the day (6 a.m.-6 p.m. or 6 p.m.-6 a.m., henceforth referred to as "day" and "night", respectively), pavement conditions (dry or wet/slippery/icy). Moreover, a qualitative, crash-specific measure of the traffic volume present at the moment of the crash was inferred from the online Google Maps® tool for typical traffic at given hours and given days of the week, based on a colour scale (ranging from green, labelled as "fast", to dark red, labelled as "slow"). Hence, in this study, three classes were defined by aggregating information inferred from the colour scale: no delays expected (green colour), some delays expected (orange colour), delayed/congested traffic (red/dark red colours, grouped together since there are very few situations in which the dark red colour is observable on the road network under study). It should be noted that the measure is highly qualitative, since no numerical thresholds were considered and it is based on visual exploration of on-line sources. However, it was deemed an interesting potential measure for capturing real-time traffic conditions, which are otherwise very hard to obtain (while they are generally useful for safety modelling, see Christoforou [...]
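The coding of the two crash-specific variables described above (period of the day and typical traffic class) can be illustrated with a minimal Python sketch. The function and dictionary names are hypothetical, and the treatment of the exact boundary hours (6:00 counted as "day", 18:00 as "night") is an assumption, since the text does not specify it.

```python
from datetime import datetime

# Three traffic classes aggregated from the colour scale described in the text.
# Red and dark red are grouped together, as stated in the text.
TRAFFIC_CLASS_BY_COLOUR = {
    "green": "no delays expected",
    "orange": "some delays expected",
    "red": "delayed",
    "dark red": "delayed",
}


def period_of_day(timestamp: datetime) -> str:
    """Return "day" for 6 a.m.-6 p.m. and "night" for 6 p.m.-6 a.m.

    Assumption: 6:00 belongs to "day" and 18:00 belongs to "night".
    """
    return "day" if 6 <= timestamp.hour < 18 else "night"


def typical_traffic_class(colour: str) -> str:
    """Map a colour read from the typical-traffic layer to one of three classes."""
    return TRAFFIC_CLASS_BY_COLOUR[colour.strip().lower()]


# Hypothetical crash record, for illustration only.
crash_time = datetime(2014, 5, 3, 21, 30)
crash_period = period_of_day(crash_time)
crash_traffic = typical_traffic_class("Orange")
```

Because the colour is read visually from an online source, any such mapping inherits the qualitative nature of the measure noted in the text.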

Statistical methods
In this study, a multinomial logit structure was used to predict the likelihood of different crash types (with four possible outcomes: single-vehicle, angle, rear-end, sideswipe). The most disaggregate observational unit used for modelling is the individual crash in the dataset. Site-specific and crash-specific explanatory variables are used to predict the likelihood of different crash types. Note that, based on the data availability and sample size, the crash type outcome was chosen as dependent variable, [...] Mannering et al., 2016). In this specific case, the parameters are allowed to vary across the segments or intersections. As such, rather than having a single parameter estimate for each individual crash, the parameters were grouped for each set of crashes corresponding to each individual segment or intersection. In this way, it may be possible to capture some specific unobserved characteristics [...] be fixed and some others may be site-specific (segment- or intersection-specific): [...] Table 1). Thus, in this case, binary dummy variables were generated (1 - presence of the given attribute, 0 - absence of the given attribute, e.g., for winter season: 1 - winter, 0 - other seasons).
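The grouping idea described above can be sketched in Python as follows: a random coefficient is drawn once per site and shared by all crashes at that site, rather than being drawn once per crash. This is only an illustration under stated assumptions. The normal distribution, the mean and standard deviation values, the site identifiers and the function names are all hypothetical and are not taken from the paper.

```python
import random


def winter_dummy(season: str) -> int:
    """Binary dummy coding as in the text: 1 - winter, 0 - other seasons."""
    return 1 if season == "winter" else 0


def draw_site_coefficients(site_ids, mean, std_dev, seed=0):
    """Draw one coefficient per site (not per crash).

    Assumption: the random parameter is normally distributed across sites.
    """
    rng = random.Random(seed)
    return {site: rng.gauss(mean, std_dev) for site in sorted(set(site_ids))}


# Hypothetical crash records: (site identifier, night-time indicator).
crashes = [("S1", 1), ("S1", 0), ("S2", 1), ("S2", 1), ("S3", 0)]

site_beta = draw_site_coefficients(
    [site for site, _ in crashes], mean=-0.5, std_dev=1.0
)

# Contribution of the night-time indicator to the utility function of each crash:
# crashes at the same site share the same coefficient.
contributions = [site_beta[site] * night for site, night in crashes]
```

In an actual estimation the site-level coefficients are not drawn freely as above; the mean and standard deviation of their distribution are estimated by simulated maximum likelihood.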

The mixlogit command implemented in the STATA® software (based on Hole, 2007) was used for estimating the mixed logit models. The underlying software algorithm, based on a mathematical transformation from the standard mixed logit structure, estimates the logarithm of the odds of a given outcome with respect to a reference outcome (StataCorp, 2015) in the set, as follows: [...] (3)
Where:
$P_c(t_0)$ = probability of observing the reference crash type $t_0$ (among the set $T$) for the crash unit $c$;
all other terms were previously defined for Equations 1 and 2. Note that the estimate of the intercept may also be site-specific, or fixed.
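For orientation, the log-odds form described above can be written generically as follows. This is a standard multinomial logit expression under assumed notation, not a reproduction of the paper's Equation 3. All symbols are introduced here for illustration: $P_c(t)$ is the probability of crash type $t$ for crash unit $c$, $t_0$ is the reference crash type in the set $T$, $\beta_{0,t}$ is the intercept of the function for crash type $t$, $\boldsymbol{\beta}_t$ is the coefficient vector and $\mathbf{X}_c$ is the vector of explanatory variables.

```latex
\[
\ln\frac{P_c(t)}{P_c(t_0)} = \beta_{0,t} + \boldsymbol{\beta}_t^{\top}\mathbf{X}_c,
\qquad t \in T \setminus \{t_0\}
\]
\[
P_c(t_0) = \frac{1}{1 + \sum_{s \in T \setminus \{t_0\}} \exp\!\left(\beta_{0,s} + \boldsymbol{\beta}_s^{\top}\mathbf{X}_c\right)},
\qquad
P_c(t) = P_c(t_0)\,\exp\!\left(\beta_{0,t} + \boldsymbol{\beta}_t^{\top}\mathbf{X}_c\right)
\]
```

In a grouped random parameter specification, some elements of $\boldsymbol{\beta}_t$ (and possibly the intercept) would additionally carry a site index, taking one value per segment or intersection.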

This approach was previously applied for similar purposes (i.e., crash types as outcomes) in a standard [...] 3, and considering that the sum of the observed probabilities of all outcomes should be equal to 1, the probability of observing each crash type outcome t can be computed. In this case, using the transformation explained above for the model application leads to estimating three functions, by selecting the single-vehicle crash type as a reference.
According to literature, the mixed logit model was developed using a maximum likelihood estimation approach coupled with the Halton draws sampling technique (Halton, 1960 [...] for all the observations. Once elasticities and pseudo-elasticities are estimated for each crash unit, average elasticities are computed among the observations, to represent an overall effect.
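Two of the computational steps mentioned above can be sketched in Python: recovering the four outcome probabilities from three log-odds functions estimated against the single-vehicle reference, and generating Halton draws. The log-odds values below are hypothetical, and the function names are introduced only for this illustration; the sketch does not reproduce the estimation routine itself.

```python
import math


def probabilities_from_log_odds(log_odds, reference="single-vehicle"):
    """Convert log-odds against a reference outcome into probabilities summing to 1."""
    exp_terms = {outcome: math.exp(value) for outcome, value in log_odds.items()}
    # The reference outcome has log-odds 0 against itself, contributing exp(0) = 1.
    denominator = 1.0 + sum(exp_terms.values())
    probabilities = {o: term / denominator for o, term in exp_terms.items()}
    probabilities[reference] = 1.0 / denominator
    return probabilities


def halton(index, base=2):
    """Return the element at position `index` (1-based) of the Halton sequence."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


# Hypothetical values of the three estimated functions for one crash unit.
example_log_odds = {"angle": 0.4, "rear-end": -0.2, "sideswipe": -1.1}
example_probabilities = probabilities_from_log_odds(example_log_odds)

# First few Halton draws in base 2, as used in simulation-based estimation.
first_draws = [halton(i) for i in range(1, 6)]
```

Halton draws cover the unit interval more evenly than pseudo-random draws, which is why they are commonly used to reduce the number of draws needed in simulated maximum likelihood.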

Results
The results for the separate sub-sets of segment- and intersection-related crashes are reported in this section and discussed in the following one.

Model for segment crashes
The predictors and the related estimated coefficients associated with the likelihood of different crash types on segments (with respect to single-vehicle crashes) are presented in Table 2.

Predictors included in the model are: the segment type (undivided 2-way 4-lane segments in case of angle crashes), the area type (city centre in case of angle crashes, transition areas in case of rear-end crashes), the typical traffic (some delays expected in case of angle crashes), the day period (in case of both rear-end and sideswipe crashes). Traffic volume and segment length, as well as several other segment-specific and crash-specific variables, were not included as predictors in the model, due to the lack of statistically significant estimates.
The coefficient for the period of the day (night: 6 p.m.-6 a.m.) in the function of sideswipe crash likelihood (with respect to single-vehicle crashes) was estimated as a random parameter across the segments. This means that, given the approach selected, a specific coefficient estimate is calculated for each segment. The grouped random parameter approach leads to a statistically significant improvement with respect to the corresponding fixed parameters model (i.e., considering a fixed parameter for the period-of-the-day variable in the function of sideswipe crashes), based on the Likelihood Ratio Test (LRT, see Table 2); the latter reveals an overall significance for the estimated standard deviation (Hole, 2007). Moreover, the Wald test confirms that the selected predictors included in the model significantly improve the fit.
Based on the estimates presented in Table 2, elasticities are computed in Table 3. Given that all the predictors included in the segment model are indicators, pseudo-elasticities are computed.
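The comparison between the fixed and grouped random parameter models can be illustrated with a minimal likelihood ratio test in Python. The log-likelihood values are hypothetical (the actual values are those reported in Table 2), the function name is introduced here, and the use of a chi-square reference distribution with one degree of freedom (one additional standard deviation parameter) is an assumption of this sketch.

```python
import math


def likelihood_ratio_test(ll_restricted, ll_unrestricted):
    """Return the LRT statistic and its p-value for ONE additional parameter.

    Assumption: chi-square reference distribution with 1 degree of freedom,
    whose survival function is erfc(sqrt(x / 2)).
    """
    statistic = max(2.0 * (ll_unrestricted - ll_restricted), 0.0)
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods at convergence, for illustration only.
ll_fixed_parameters = -1250.0   # restricted model: all parameters fixed
ll_random_parameters = -1246.0  # unrestricted model: one grouped random parameter

lrt_statistic, lrt_p_value = likelihood_ratio_test(
    ll_fixed_parameters, ll_random_parameters
)
```

With more than one random parameter, as in the intersection model, the degrees of freedom would equal the number of additional standard deviations estimated, and the closed form used here would no longer apply.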

Model for intersection crashes
The predictors and the estimated coefficients associated with the likelihood of different crash types at intersections (with respect to single-vehicle crashes) are presented in Table 4.

Predictors included in the model are: the traffic volume per entering lane (in case of both angle and sideswipe crashes), the ratio of the minor to the major traffic volumes (for both angle and rear-end crashes), the total number of entering lanes (for sideswipe crashes), the total number of zebra crossings (for sideswipe crashes), the typical traffic (both some delays expected and delayed traffic in case of sideswipe crashes), the area type (transition areas for all crash types), the day period (in case of both angle and rear-end crashes). In this case, some intersection-related traffic and geometric variables are included in the selected model. However, the intersection type (with respect to traffic signals and legs) is not included, while the total number of entering lanes, which reflects the degree of complexity of the intersection, is a predictor of sideswipe crash likelihood (compared to single-vehicle crashes).
The coefficients for traffic volume per entering lane (in the angle function) and for day period (in the rear-end function) were estimated as random parameters across the intersections. Given the approach selected, a specific coefficient estimate for each of the two predictors listed above is then obtained for each intersection. The grouped random parameter approach leads to a statistically significant improvement with respect to the corresponding fixed parameters model, based on the LRT (see Table 4), which reveals an overall significance for the estimated standard deviations (Hole, 2007). Moreover, the Wald test confirms that the selected predictors included in the model significantly improve the fit.
Based on the estimates presented in Table 4, elasticities are computed in Table 5. In this case, some predictors included in the model are indicator variables and others are numerical variables. Hence, both elasticities and pseudo-elasticities are computed.

Predictors of urban segment and intersection crash types
Several traffic, geometric and context-related factors were investigated as potential predictors of the likelihood of different urban crash types. Among these variables, the presented models include: a) for intersections, the traffic volume per entering lane, the overall number of entering lanes, the total number of zebra crossings and the balance between major and minor traffic volumes; b) for segments, the segment type and the presence of bus lanes; c) for both segments and intersections, the area type context variable. Most of the influential geometric variables are specific to the considered road element (i.e., segments or intersections), so their influence is discussed separately for the two road element categories.
As regards segments, the undivided 2-way 4-lane segments are associated with an evident increase in the probability of observing an angle crash. This could be attributed to two possible mechanisms. Firstly, speeds may be higher on these urban arterial roads because of the increased road width (as highlighted, for example, by Silvano [...] This could be explained by drivers reducing speeds and adjusting headways when traffic is balanced among the intersection legs, because of the intrinsic intersection complexity. In fact, it was shown that, as the intersection complexity decreases, inadequate allocation of drivers' attention may occur, leading to more crashes (Werneke and Vollrath, 2012). Table 5 shows that higher traffic volumes and greater minor-to-major traffic ratios increase the likelihood of angle crashes at intersections. Both identified effects can be explained by the increased number of crossing conflicts, which may generate angle crashes.
Besides road element-specific geometric variables, some variables were taken into account for both segment and intersection models. Their association with the likelihood of different crash types is shown in Table 6, based on the elasticities and pseudo-elasticities computed in Tables 3 and 5. The influence of traffic per entering lane and total zebra crossings was previously discussed. It is worth noting here that these factors were not found to be influential on the likelihood of different crash types in the segment-based model.
Table 6 [...] between ±50% and ±100%, +++/--- for more than ±100% change).

The likelihood of different crash types changes if segments and intersections are located in the rural-to-urban transition areas. For both segments and intersections, a notable decrease in the single-vehicle and sideswipe crash likelihoods and a notable increase in the rear-end crash likelihood are noted. If drivers are not guided in the transition from the rural to the urban environment through appropriate design measures (see e.g., Lantieri et al., 2015), they may maintain a typically rural driving behaviour (Colonna and Berloco, 2011). In this case, the sub-urban characteristics of these road segments and intersections may allow drivers to maintain high speeds (see Liu, 2007 in case of approaching intersections) but also provide the ground for aggressive driving behaviour, possibly due to the presence of mind wandering and distraction (for further details, see also Fountas et al., 2019).

Such behavioural trends are typically observed in low-demand roadway environments (Lin et al., 2016), such as low-traffic rural highways. This may explain the increase in the rear-end crash likelihood. On the other hand, most of the urban single-vehicle crashes included in the dataset are pedestrian hit crashes (73% of single-vehicle crashes). Hence, the decrease in single-vehicle crash likelihood can be attributed to the nature of transition areas, which normally exhibit low pedestrian volumes. Another interesting aspect of the results arises from the identified differences in the effect on angle crashes for segments and intersections (namely, a notable decrease and a notable increase in angle crash likelihood, respectively). In this case, the underlying crash mechanisms are most likely different: on transition segments, there is a considerable decrease in the number of driveways/minor intersections related to angle crashes, while the causes of angle crashes at intersections are still relevant and their likelihood was actually found to increase.
The "city centre" area type is influential for segments only and it is mainly related to an evident decrease in the angle crash likelihood. In this specific dataset, segments in the city centre are considerably short (i.e., on average, between 50 and 100 m long) and often configured as one-way roadways, in several cases single-lane roadways with on-street parking on both sides. This may prevent drivers from reaching high speeds between two close intersections (see e.g., Silvano and Bang, 2015). Hence, drivers may experience possible angle conflicts without these resulting in angle crashes.
Finally, concerning excluded variables, it is worth noting that the intersection type is not found to have [...] variables such as e.g., traffic volume ranges.
However, on the one hand, the number of entering lanes (included in the intersection model) can serve as a proxy variable for the intersection type (likely presence of traffic signals in case of several entering lanes) and complexity. On the other hand, at night, some of the traffic control systems may not be active; as such, their presence may not be influential on safety performance. Moreover, there are instances where total crash frequencies of the two intersection types may be comparable for similar ranges of traffic volumes (see, for example, the models developed by Persaud et al., 2002), or the presence of traffic signals may not be influential for crash frequency predictions (Gomes et al., 2012).
The traffic volume for segments (contrary to the typical traffic, which is significant), the segment length and the presence of bike paths are other determinants of crash type likelihood that were not found to be statistically significant. The limited influence of segment length may be due to the low variability of lengths in the dataset (see Table 1) or it may partially be captured by the area type variable. Finally, all bike paths in the sample are physically separated from the main roadway, which may explain their limited influence.

Associating crash-specific variables to urban crash types
Several crash-specific variables, either extracted from the crash dataset or inferred using the available data, were modelled to predict different urban crash type likelihoods. Among these variables, the presented models include typical traffic and period of the day. A summary of their association with different crash type likelihoods is provided in Table 7, based on the pseudo-elasticities computed in Tables 3 and 5.

The typical traffic at the crash day/hour was included in both intersection and segment models, with the attributes "some delays expected" and "delayed" (the latter only for intersections). Where both the "delayed" and the "some delays expected" typical traffic can be associated with different crash types (i.e., at intersections), their effect is consistent. In fact, for each crash type, moving from "some delays expected" to "delayed" traffic preserves the direction of the effect (i.e., positive or negative) and amplifies its magnitude. In particular, delayed traffic results in a notable decrease in the likelihood of angle crashes. This finding could be explained by the expected decrease in speed in delayed traffic conditions, which may prevent collisions between traffic streams having conflicting angles at intersections (see e.g., Wang et al., 2009).

The variable representing traffic with "some delays expected" would capture intermediate conditions in which there is neither free-flow traffic nor congestion. In such conditions, drivers are still likely to have some freedom in choosing speeds and trajectories according to their desires, but their choices could be constrained by the presence of other drivers. For intersections, as already stated, traffic with some delays expected was found to affect different crash type likelihoods similarly to the delayed traffic variable, albeit to a lesser extent. Moreover, the different effects on crash types found for segments are similar to those discussed for the intersections.

The time of day when the crash occurred, and particularly night time, was also found to affect different crash type likelihoods at segments and intersections, but with substantial variations. A consistent reduction of rear-end and sideswipe crashes was identified for both segments and intersections at night.
Rear-end crashes can be associated with high speeds (Islam, 2016), short headways and driver distraction (Gao and Davis, 2017). Under conditions of reduced visibility (even in the presence of lighting), it is likely that drivers compensate with a more cautious (Bella et al., 2014) and attentive behaviour. Highly attentive behaviour could result in prompt reactions to abrupt braking of preceding vehicles. Moreover, the intentions of the drivers of preceding vehicles can be clearer because of the increased visibility of car lights, compared to the daylight condition. The reduced likelihood of rear-end crashes at night is more evident at intersections (coherently with [...] undertaking these types of manoeuvres on segments and, in fact, the reduced likelihood of sideswipe crashes at night is more evident at segments. An interesting difference stems from the indirect estimated effect of night-time on single-vehicle crashes: the latter are likely to decrease at intersections, but to increase at segments (consistent with Bham et al., 2012). However, an increase in the likelihood of angle crashes at night was noted, which can be linked to the lack of visibility of conflicting vehicles.

Seasonal and weekly variations are potentially related to different driving behaviour but also to different driver populations, but they were not found to be significant for crash types. The influence on safety of seasonal and weekly variation may be more evident in rural than in urban areas, for instance because of the presence of summer/weekend recreational drivers (Intini et al., 2019c). Moreover, the effect of wet pavements may be more influential in rural than in urban environments (e.g., on run-off-road crashes, see McLaughlin et al., 2009). However, note that in the study by Bham et al.
(2012), in which urban roadways were considered, weekends and wet pavements were associated with an increase in the single-vehicle crash likelihood compared to other crash types.

Site-specific variability of estimated parameters
The random parameter model structure used in this study allows the identification of the variable effect of some predictors across the sites, based on the model estimates. For these predictors, the grouped random parameter structure enables the computation of a separate parameter estimate (β) for each individual segment/intersection. The variables that were found to have statistically significant grouped random parameters, and for which segment- or intersection-specific parameters were estimated, are (see also Tables 2 and 4) [...]

The distribution of coefficients varies depending on the associated explanatory variable; specifically, the boxplots show a considerably broad range for the night variable, especially for segments, and a small range of variation for the traffic variable. All the distributions of estimated parameters in Figure 2 have some "outliers" (conventionally identified as values lying more than 1.5 times the interquartile range above the upper quartile or below the lower quartile). However, it is crucial to note that the effect of a given variable is generally positive/negative for all the segments/intersections, except for some of these outliers, where the effect is reversed. Those cases are discussed in the following.

Concerning the night effect in the segment model, it is directly related to a decrease in the sideswipe crash likelihood for 110 segments (92 % of the population). However, for 9 segments (8 % of the population), positive parameters were estimated. An investigation of the characteristics of these segments revealed that most of them are undivided roads with parked vehicles on both sides (in some cases coupled with narrow lanes and one-way traffic).
The mechanism of sideswipe crashes can be eased by the presence of side parking on narrow roads or in cases of roads with more than one lane, [...] intersections (98 % of the population), likely due to the increased angular conflicts. However, there are three intersections (2 % of the sample) for which the traffic volume parameter estimate is negative. At one intersection, there is one major two-way two-lane road and a one-way minor road, which traffic from the major road can only enter. Hence, in this case, angle crashes could only be caused by the left-turn manoeuvre from the major to the minor road. As the traffic volume increases, drivers may be more cautious while negotiating the left-turn manoeuvre; the risk-compensating behaviour of drivers in such cases may explain the reduction in the angle crash likelihood. In another case, the intersection is between an entering one-way road and a major two-lane road, meeting at an angle greater than 90°. In this case, the vehicle flow from the minor road (give-way regulated) enters almost parallel to the direction of vehicles on the main road. In fact, half of the crashes at this site are sideswipe crashes. Hence, in this case, the effect of traffic on angle crashes is not influential. The third case is a four-legged signalized intersection with highly unbalanced traffic between the major and the minor road. In this case, angular conflicts are largely independent of the average traffic per lane (mainly governed by the main road traffic). Indeed, the most frequent crash type at this intersection is the rear-end crash.
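
As an illustration of the screening described in this section, the following minimal sketch flags boxplot "outliers" among site-specific parameter estimates and lists the sites whose effect has the opposite sign to the typical one. It is not the authors' procedure: the function names, the site identifiers and the β values are hypothetical, and the quartiles are computed with the "inclusive" method of the Python standard library, which is only one of several conventions.

```python
import statistics


def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences: quartiles extended by k times the IQR."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def summarize_site_parameters(betas, k=1.5):
    """Summarize site-specific estimates given as {site_id: beta}.

    Outliers lie outside the fences; reversed sites have a sign
    opposite to the sign of the median estimate.
    """
    values = list(betas.values())
    lower, upper = tukey_fences(values, k)
    median = statistics.median(values)
    outliers = sorted(s for s, b in betas.items() if b < lower or b > upper)
    reversed_sites = sorted(s for s, b in betas.items() if b * median < 0)
    return {
        "fences": (lower, upper),
        "outliers": outliers,
        "reversed": reversed_sites,
        "share_reversed": len(reversed_sites) / len(betas),
    }


# Hypothetical night-time parameters for eight segments (not from the paper).
betas = {
    "seg01": -1.20, "seg02": -0.90, "seg03": -1.10, "seg04": -1.00,
    "seg05": -0.80, "seg06": -1.30, "seg07": -0.95, "seg08": 0.60,
}
summary = summarize_site_parameters(betas)
print(summary["outliers"], summary["reversed"], summary["share_reversed"])
```

Sites returned in both lists are the candidates for the kind of case-by-case inspection of local geometry and traffic arrangement reported above.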

Practical application of results
The estimated models can be used in practice to highlight high-risk sites with respect to a given crash type. In fact, based on the models and the dataset, individual probabilities of occurrence of crash type outcomes can be assessed. In the estimated models (for segments and intersections), some site-specific (segment- or intersection-related) and crash-specific variables were included (see Figure 3).

In this case, the identification of high-risk sites should aim at highlighting sites with a very high probability of occurrence of a specific crash type. This procedure is carried out for particular combinations of crash-specific variables (which can be seen as crash contributing factors), leading to different possible scenarios. The criteria used to generate scenarios for both segments and intersections are shown in Figure 3. In detail, the probabilities associated with different crash types were computed in four different scenarios for segments, and six different scenarios for intersections, as indicated in Figure 3. [...] distributions are provided in Figure 4 for both segments and intersections.

Based on this approach, high-risk sites with a high likelihood of occurrence of a given crash type can be identified in the different scenarios for both segments and intersections, by setting thresholds that depend on the scope of the high-risk site analysis. For example, starting from the population of all the computed probabilities of different crash types for all sites (segments or intersections), it is possible to define some threshold percentiles (e.g., 85th, 90th or 95th percentile). The definition of thresholds may depend on the scope of the analysis (exploratory purposes, network screening, inspection planning, etc.). Once thresholds are defined, the sites showing percentages of crashes of a given type exceeding the thresholds can be identified as "high-risk sites" for that crash type.
This detailed analysis may result in selecting countermeasures specifically related to given crash types.
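
The percentile-based screening outlined above can be sketched as follows. This is an illustrative example under stated assumptions, not the estimated models: the utilities per site are hypothetical placeholders for the values that would result from the site-specific parameters in one scenario, the single-vehicle outcome is arbitrarily taken as the base with zero utility, and the helper names (`mnl_probabilities`, `percentile`, `high_risk_sites`) are introduced here only for illustration.

```python
import math


def mnl_probabilities(utilities):
    """Multinomial logit probabilities from {crash_type: utility}."""
    u_max = max(utilities.values())  # subtract the maximum for numerical stability
    exp_u = {k: math.exp(v - u_max) for k, v in utilities.items()}
    total = sum(exp_u.values())
    return {k: e / total for k, e in exp_u.items()}


def percentile(values, q):
    """q-th percentile (0 to 100), linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def high_risk_sites(site_probs, crash_type, q=85):
    """Sites whose probability of crash_type exceeds the q-th percentile."""
    threshold = percentile([p[crash_type] for p in site_probs.values()], q)
    flagged = sorted(s for s, p in site_probs.items() if p[crash_type] > threshold)
    return flagged, threshold


# Hypothetical utilities for five sites in one scenario (not from the paper).
site_utils = {
    "S1": {"single": 0.0, "angle": 0.4, "rear_end": 0.1, "sideswipe": -0.5},
    "S2": {"single": 0.0, "angle": -0.2, "rear_end": 0.6, "sideswipe": 0.0},
    "S3": {"single": 0.0, "angle": -1.0, "rear_end": 1.8, "sideswipe": -0.4},
    "S4": {"single": 0.0, "angle": 0.9, "rear_end": -0.3, "sideswipe": 0.2},
    "S5": {"single": 0.0, "angle": 0.1, "rear_end": 0.3, "sideswipe": 0.5},
}
site_probs = {site: mnl_probabilities(u) for site, u in site_utils.items()}
flagged, threshold = high_risk_sites(site_probs, "rear_end", q=85)
print(flagged, round(threshold, 3))
```

Repeating the call for each scenario and crash type, with a percentile chosen according to the scope of the analysis, would give one list of candidate high-risk sites per combination.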

Conclusions
In this study, a dataset of urban segments and intersections was used to identify the factors influencing the likelihood of different crash types (single-vehicle, angle, rear-end and sideswipe). A multinomial logit approach, with different crash types serving as outcomes and several traffic, geometric and context-related variables serving as possible explanatory variables, was implemented. In detail, the mixed model structure was used to account for the variability of estimates across the crash observations. Parameter estimates were grouped per road site (segment/intersection), in order to account for unobserved effects and assess the influence of predictors on crash types at the individual site level, which, to the authors' knowledge, is a research novelty for crash type modelling, especially for urban crashes. The main aim of this study was to explore: a) the influence of geometric and traffic-related predictors on different urban crash types (both at segments and intersections); b) the association of crash-specific variables with urban crash types; c) the possible variability of results across the crash observations for individual segments and intersections.

The results show that the segment type and the presence of bus lanes are predictors of different types of crashes occurring on road segments. Traffic volume per entering lane, total number of entering lanes, total number of zebra crossings and the ratio between major and minor traffic volumes at intersections influence different crash types at intersections. The context variable, area type, is a predictor of different crash types for both urban segments and intersections.

The crash-specific variables that were significantly associated with different crash types (for both segments and intersections) are the typical traffic at the moment of the crash and the period of the day.
However, no significant seasonal and weekly variations were noted, nor any influence of different pavement conditions. It is important to note that a measure of the traffic conditions at the moment of the crash (even if inferred from online sources) was statistically associated with different crash types. Hence, the use of similar variables is encouraged for future research.

For the predictors associated with statistically significant grouped random parameters (period of the day for both segments and intersections, traffic volume per entering lane), substantial variability of their effect was identified across the crash observations. Occasionally, the direction of the effect of some variables is the opposite of what holds for all the other elements in the population. In these cases, further analyses conducted on these particular sites revealed the influence of some local factors on the estimation of parameters with a different sign. The disclosure of possible local relationships is a direct implication of the grouped random parameter approach and corroborates the choice of this approach. In fact, unlike the conventional mixed logit, the grouped random parameter approach can capture not only unobserved effects varying across the crash population, but also systematic variations arising from the unobserved interaction between the geometric or traffic characteristics of these sites and the drivers' behavioural response to them (Fountas et al., 2018b). In addition, the estimation of individual parameters can help better identify the potential sources of these unobserved interactions at a segment or intersection level.
Hence, this study contributes to the existing body of research since it is the first to show, to the authors' knowledge, how the grouped random parameter multinomial logit structure can be implemented to account for unobserved and grouped heterogeneity in crash type prediction. The introduction of the grouped random parameters to the multinomial logit formulation constitutes a significant comparative advantage of the presented models relative to state-of-practice approaches. In fact, the presented approach allows for capturing the impact of unobserved factors that may vary across the segments/intersections (i.e., unobserved heterogeneity) as well as grouped effects arising from the presence of multiple crash observations per segment or intersection. Over the last few years, the impact of segment- or intersection-specific grouped heterogeneity has been recognized in various safety dimensions, such as accident occurrence (Fountas et al., 2018b) or injury severity (Fountas et al., 2018a); however, the implications of grouped heterogeneity for crash type probability have not been thoroughly explored to date. It should be noted that the formulations of SPFs or other state-of-practice modelling approaches do not typically take into account unobserved or grouped heterogeneity, hence resulting in less accurate parameter estimates and statistical inferences (Washington et al., 2020).

Moreover, the results from the empirical analysis can be practically used to highlight high-risk segments or intersections with specific regard to given crash type outcomes, differentiated by particular scenarios (obtained as combinations of contributing factors, for example, specific time of the day or traffic conditions).
This can be considered a step forward in the selection of appropriate, site-specific countermeasures, based on predicted crash type outcomes and considering other influential conditions.

The present study is not without limitations. Firstly, as with most research in road safety, the transferability of the estimated models to other contexts requires further investigation. Secondly, the sample size used for this study was deemed large enough for the exploratory purposes of this research, but it should be enlarged for prediction purposes. Moreover, several other variables (e.g., related to human factors or the role of vulnerable road users) may affect the crash types. However, the employed grouped random parameter approach can account for this limitation to a reasonable extent (Mannering et al., 2016). Note that incorporating year-specific effects in the discrete outcome models may add further value to this modelling approach, which could be considered in further research. Nevertheless, since the grouped random parameters follow pre-determined distributions, the practical application of these models is not as straightforward as in the case of more parsimonious models (such as the SPFs), where the parameter estimates have fixed values regardless of the characteristics of the segment/intersection. However, this limitation of the grouped random parameter models stems from their generalized formulation, which has been set to account for various layers of heterogeneity. As a concluding note, given the exploratory nature of this study, further research should deepen these findings, possibly using larger datasets and different contexts, in order to compare results.
and Chemistry, Polytechnic University of Bari, Italy, is acknowledged for having partially funded the stay of the first author at the Edinburgh Napier University, Scotland (UK), essential for the realization of this article.