NCHRP 20-64 XML Schemas for the Exchange of Transportation Data     
Discussion Forum

Author Thread: Comments on Crash Records UML Models
Scott Came
Comments on Crash Records UML Models
Posted: Thursday, March 10, 2005 12:09 AM (EST)
Hello everyone,

I'm new to this group (though I've been watching for awhile), so if my comments are out in left field, forgive me. I've been quite involved over on the Justice XML side of things, mostly using the vocabulary as a basis for building schemas for individual information exchanges. I've also been working on how an XML vocabulary fits into an overall service-oriented architecture. On both of these fronts, I've been making good use of UML to communicate about data structures with non-technical stakeholders...then we map the UML entities into GJXDM structures (sometimes cleanly, sometimes not.) So in my view, FWIW, you're going down the right path modeling this first in UML, then deriving schemas from the models.

One comment on your model... I would recommend that you use inheritance very sparingly. In fact, I've come to believe that it's better not to use it unless there is really no other way to represent the semantics in a situation. Despite its promise of increasing reuse, in my experience inheritance hierarchies actually hinder reuse. This is because the amount of reuse inherent in a model is directly proportional to how flexible and interchangeable the parts are...that is, how resilient the components are to change. When you establish deep or wide inheritance hierarchies, changing the classes at the top of those hierarchies becomes very painful. The problem is compounded by the fact that (rightfully) most programming languages, as well as XML Schema, do not allow multiple inheritance.

Here is an example. In the VehicleDetail model, we currently have CommercialVehicle extending CrashVehicle extending Vehicle. At some point, someone is going to want to use the transxml vocabulary, but will want to extend it with their own additional vehicle information. So, their first inclination will be to extend the Vehicle class (or the schema representation of it). But that extension will have no impact on the CrashVehicle or CommercialVehicle classes, because they already extend the transxml Vehicle class, not the new derived Vehicle class. If this hypothetical user also has some extensions to CrashVehicle, then he/she will have to decide whether to extend the derived Vehicle class, or the transxml CrashVehicle class...it will not be possible to extend both.

There is also a semantic argument to consider. We are often drawn to represent associations as inheritance, when really we should just represent them as associations. In one sense, a CrashVehicle is a kind of Vehicle...that is, the kind of vehicle that has been in a crash. But that's not the most direct way to describe the semantics. In actuality, what we're after here is an association between a crash that has happened, and one or more vehicles.

A similar argument applies to person-role relationships. This might be a bit far-fetched, but...what if a firefighter is involved in a crash? (I'm looking here at the properties of a NonMotorist in the model.) The same human being may have some of the characteristics of an occupant (i.e., play the role of occupant), but may also have some of the characteristics of a non-motorist, such as taking actions during the crash (i.e., may play the role of a non-motorist.) Because of single inheritance, it's very difficult to allow something to be more than one kind of thing. That's why we should consider modeling these as people-playing-roles.

Finally, associations map much more cleanly to the hierarchical nature of XML and XML Schema. So life becomes easier downstream as well.

One other general comment...I think the transxml vocabulary's potential to foster reuse would be enhanced by breaking up some of the larger entities (e.g., Driver, CrashVehicle) into smaller components that can be independently composed and extended if necessary. In the ideal world, the name of a class should be an abstract encapsulation of what the properties and associations of that class represent. I've often found that to be a good litmus test for when my entities are getting too heavy.

I hope these comments are useful, or at least food for thought. Looks like really exciting, useful work is happening here.

Thanks.
--Scott

Scott Came
Principal Consultant
Justice Integration Solutions, Inc.
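
The association-based alternative described in the post above could be sketched roughly as follows. This is only an illustration in Python, not part of the TransXML models: the class names VehicleInvolvement and PersonInvolvement, the Role values, and all field names are hypothetical, chosen to show a crash associated with vehicles and with people playing one or more roles.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Set


class Role(Enum):
    # Hypothetical role names; the real vocabulary may differ.
    DRIVER = "driver"
    OCCUPANT = "occupant"
    NON_MOTORIST = "non-motorist"


@dataclass
class Person:
    name: str


@dataclass
class Vehicle:
    vin: str
    # Optional detail held on the vehicle instead of a CommercialVehicle subclass.
    commercial_carrier: Optional[str] = None


@dataclass
class VehicleInvolvement:
    """Association between a crash and one vehicle (replaces CrashVehicle)."""
    vehicle: Vehicle
    damage_description: str = ""


@dataclass
class PersonInvolvement:
    """Association between a crash and one person, who may play several roles."""
    person: Person
    roles: Set[Role] = field(default_factory=set)


@dataclass
class Crash:
    crash_id: str
    vehicles: List[VehicleInvolvement] = field(default_factory=list)
    people: List[PersonInvolvement] = field(default_factory=list)


# The firefighter example: one person, two roles, no inheritance needed.
firefighter = Person("A. Responder")
engine = Vehicle(vin="HYPOTHETICAL-VIN-1")
crash = Crash("C-001")
crash.vehicles.append(VehicleInvolvement(engine, "front bumper"))
crash.people.append(
    PersonInvolvement(firefighter, {Role.OCCUPANT, Role.NON_MOTORIST})
)
```

Because Vehicle is never subclassed here, someone who extends Vehicle with local fields can still use it anywhere a crash refers to a vehicle, which is the reuse argument made in the post.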


Comments:

Author Thread:
Scott Came
Comments on Crash Records UML Models
Posted: Thursday, March 10, 2005 12:16 AM (EST)

Sorry for the run-on nature of my somewhat lengthy post.  Firefox doesn't present the nice wysiwyg editor.  Lesson learned for next time!

--Scott

     

Al Butler
Comments on Crash Records UML Models
Posted: Wednesday, March 16, 2005 4:59 PM (EST)

I agree that smaller classes would be useful.  For example, we could decompose Crash into several classes that store various elements of the crash and the conditions existing at the time of occurrence.  To some degree, this change would accommodate some of the other comments made about the need to store data on multiple intersection approaches (not whole streets), because approaches of the same street can differ on either side of the intersection.  The class could be designed to allow the user to describe any number of street approach instances for a crash event, and we could offer the option of allowing the entry for one instance to apply to multiple approaches--or at least to populate the data for another approach with identical characteristics.

The proposed Approach class would be derived from a combination of the RoadCrashSite and IntersectionCrashSite classes.  For instance, there would be just one numberOfLanes field for each approach.  We would need a general CrashSite class, which could be instantiated once for several crash events at the same location, with time stamps to separate descriptions related to changing conditions.  If an existing record contains the same information, then there would be no need to re-enter the data for the next crash.

We also need to add date fields to the site descriptions, if we are to use a single instance rather than keep entering the same data for each crash at a given location.  We don't want to show an approach as being four lanes wide (current condition) for crashes that occurred when it was two lanes wide.  Approach would work for all crash types (bridge, rail, intersection, work zone, and midblock).  A single vehicle crash could suffice with only one Approach instance.
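
One possible shape for the dated site descriptions suggested above is sketched here in Python. It is an assumption-laden illustration only: the names ApproachDescription, effective_from, effective_to, and approaches_on are hypothetical, and the lane counts and dates are invented sample values.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class ApproachDescription:
    """One approach as it existed over a date range (hypothetical fields)."""
    approach_id: str
    number_of_lanes: int
    effective_from: date
    effective_to: Optional[date] = None  # None means the description is still current


@dataclass
class CrashSite:
    """A single site instance shared by every crash at the same location."""
    site_id: str
    descriptions: List[ApproachDescription] = field(default_factory=list)

    def approaches_on(self, when: date) -> List[ApproachDescription]:
        # Return the descriptions that were in effect on the crash date.
        return [
            d for d in self.descriptions
            if d.effective_from <= when
            and (d.effective_to is None or when < d.effective_to)
        ]


site = CrashSite("SITE-1")
# Invented sample data: the north approach was widened from two lanes to four.
site.descriptions.append(
    ApproachDescription("north", 2, date(1990, 1, 1), date(2003, 6, 1))
)
site.descriptions.append(ApproachDescription("north", 4, date(2003, 6, 1)))

old_crash = site.approaches_on(date(2001, 5, 20))
recent_crash = site.approaches_on(date(2005, 3, 16))
```

With this arrangement a crash record stores only its date and a reference to the site, and the lane count in effect at the time is looked up rather than re-entered.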

On a separate matter, I would like to suggest that the Injury and EMSResponse classes be tied to Person, not to the Crash class, since: (a) people are injured, not crashes; and (b) there may be multiple EMS service providers for a single crash.  There could be a severity indicator at the Crash level to save some searching for analysis. 
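
As a rough sketch of that suggestion, the Python below attaches Injury and EMSResponse to Person and derives a crash-level severity indicator from the people involved. The Severity scale, the field names, and the max_severity property are hypothetical; a real schema might store the indicator on the crash rather than compute it.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List


class Severity(IntEnum):
    # Hypothetical ordered scale, used only for this illustration.
    NONE = 0
    POSSIBLE = 1
    MINOR = 2
    SERIOUS = 3
    FATAL = 4


@dataclass
class EMSResponse:
    provider: str


@dataclass
class Injury:
    severity: Severity
    description: str = ""


@dataclass
class Person:
    name: str
    injuries: List[Injury] = field(default_factory=list)
    # Several EMS providers may respond to the same crash, one or more per person.
    ems_responses: List[EMSResponse] = field(default_factory=list)


@dataclass
class Crash:
    crash_id: str
    people: List[Person] = field(default_factory=list)

    @property
    def max_severity(self) -> Severity:
        # Crash-level indicator: the worst injury among all people involved.
        return max(
            (i.severity for p in self.people for i in p.injuries),
            default=Severity.NONE,
        )


driver = Person("Driver A", [Injury(Severity.MINOR)], [EMSResponse("County EMS")])
cyclist = Person("Cyclist B", [Injury(Severity.SERIOUS)], [EMSResponse("City Fire Rescue")])
bystander = Person("Bystander C")
crash = Crash("C-002", [driver, cyclist, bystander])
```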

Do we need to distinguish between bicycle riders and pedestrians in the NonMotorist class?  Low numbers may not make it much of an issue in many states, but it is quite a subject of local concern in Florida.

I can supply a sample UML model if there is any interest in these suggestions.