Temporal, Geographical and Categorical Aggregations Viewed through Coordinated Displays:

A Case Study with Highway Incident Data
 
 

Anna Fredrikson
 
 

Draft version, 1999-08-23







Abstract

An information visualization display can hold a certain number of data points before it gets crowded. One way to solve this problem for ever larger data sets is to create aggregates. An aggregate is a group of data points that is used as a summary. The aggregates are used in the visualization instead of all the data points to simplify the display.

The concept of aggregation can be used together with Snap-Together Visualization to tightly couple and coordinate different displays.

This paper presents a case study with incident data from the highways in Maryland where both aggregation and coupled displays are used. More than ten different prototypes with different kinds of aggregates have been developed and analyzed. Recommendations and advice regarding the use of aggregation will be given to people working with transportation systems, and to developers of coupled displays and visualization applications. Some general principles about aggregates found during the development process will also be presented.
 

Keywords

User interface, Information visualization, Aggregation, Multiple views, Tightly coupled.
 

Summary

A display with an information visualization can hold a certain number of data points before it becomes overcrowded. One way to solve the problem of ever larger data sets is to create aggregations (groupings). An aggregate is used as a summary of a group of data points. The groups are used in the visualization instead of the data points to make the display simpler and clearer.

The concept of aggregation can be used together with Snap-Together Visualization to connect and coordinate windows with different visualizations.

This report presents a case study with incident data from the highways in Maryland, USA, in which both aggregation and coupled displays have been used. More than ten different prototypes have been developed and analyzed. Recommendations and advice on the use of aggregation are given to people working with transportation systems and to developers of coupled displays and visualization tools. Some general principles of aggregation that emerged during the development work are also presented.
 

Keywords

User interface, information visualization, aggregation, multiple views, coupled windows.
 

Preface

This report is a Master’s thesis in Computer Science and Engineering at Chalmers University of Technology, located in Göteborg, Sweden. The thesis work was supported by Spotfire AB and done in cooperation with the Human-Computer Interaction Laboratory (HCIL) at the University of Maryland, College Park, where most of the work was done. I spent the summer of 1999 at the University of Maryland’s HCIL.

I would like to thank everyone at HCIL for making my stay in the US such a great time.

I would also like to thank Deepak Kumar Shrestha for providing me with the incident data for this study, and Phil Tarnoff for his valuable comments.
 

Anna Fredrikson
 
 

Table of Contents
 

INTRODUCTION

BACKGROUND

Information visualization and dynamic queries

Multiple window coordination

Snap-Together Visualization

Aggregation

Using aggregation with coupled displays

CASE STUDY

Application domain background

The incident report form

Users of the highway incident data

The incident database and its quality

PROTOTYPES

The first prototype

Exit number aggregation

Zooming via exit number aggregates

Weekday aggregation

Calendars

Incident code aggregation

RECOMMENDATIONS

Users of CHART

Developers of coupled displays

Spotfire

General principles

FUTURE WORK

REFERENCES

APPENDIX A: The database fields

APPENDIX B: System specifications and how to run demos

APPENDIX C: Notebook
 

INTRODUCTION

In today’s society, we are constantly surrounded by information. Information visualization is a method to facilitate the exploration and investigation of a data set by showing a view of the data in a starfield display. This type of display maps multidimensional data into a two-dimensional visualization where any of the records’ attributes can be used as an axis. A starfield display is a scatterplot with additional features to support selection and zooming. Every point in the visualization represents a record in the database. This method works well, but as the number of records in the database grows it becomes increasingly difficult to show all the records in a single display. The display fills up with data points and the individual records can no longer be distinguished.

There are different methods for managing displays with large quantities of data. The idea of aggregation is to create new meaningful groups (aggregates) of data that are used as summaries of other data sets. The aggregates are used instead of the data points to make it easier to visualize the data set and to find trends in the data.

A single visualization is sometimes not enough for exploring the database. Several different views of the data can be connected to work as a unit. This can be done with Snap-Together Visualization, a tool for coordinating several independent displays. Tightly coupled displays can help a user to seek information and explore data in a new way.

The purpose of this report is to investigate the different aspects of aggregations in tightly coupled information visualizations and to give recommendations on how to use them successfully.
 

BACKGROUND

This chapter describes the current state of the field and gives background information on information visualization, dynamic queries, aggregation and coupled displays.
 

Information visualization and dynamic queries

When analyzing and exploring a database, it is not easy to look only at the database tables and ask questions using SQL (Structured Query Language). Even though SQL is a powerful query language, it is not easy to learn and use. Therefore information visualization and dynamic queries were developed. They let the user manipulate a database query without knowing anything about SQL. Two early applications of dynamic queries were built at the University of Maryland’s Human-Computer Interaction Laboratory. They were a periodic table of elements [AWS92] and the HomeFinder [WS92], which showed real estate data. The HomeFinder shows a map of Washington, DC and 1100 points of light indicating homes for sale. Users can mark the workplace for both members of a couple and then adjust sliders to select circular areas of varying radii. Other sliders select number of bedrooms and cost, with buttons for air conditioning, garage, etc. Within seconds users can see how many homes match their query, and adjust accordingly.

Figure 1: The HomeFinder shows homes for sale in the Washington, DC area.

Figure 2: The Dynamic Periodic Table of Elements highlights the selected elements in red.

The next step for HCIL was to make a display where you could zoom, filter and color-code the data points. These ideas were first implemented in the FilmFinder [AS94], where you can manipulate several different sliders to find a movie.

Figure 3: The FilmFinder showing all movies starring Michelle Pfeiffer and directed by George Miller.

After the FilmFinder was developed, Christopher Ahlberg continued to work with dynamic queries and created an application with starfield displays called IVEE (Information Visualization and Exploration Environment), which later was renamed Spotfire [AW95].

In Spotfire users can change 2D and 3D visualizations by manipulating range sliders and different kinds of buttons. Any of the data record’s fields can be set as X, Y or Z-axis in the visualization and the information can be presented in a scatterplot (2D or 3D), histogram, bar chart or pie chart. The data points can be color coded, size and shape coded and an image can serve as background for the visualization.

Figure 4: Spotfire showing four different visualizations of a gene database.

Spotfire has an API (Application Programming Interface) that lets other applications start up Spotfire and control the program (including the visualizations, query devices and the settings) by calling various functions.

Other interactive visualization tools for exploring data include DEVise [LRB97] and Visage [DKR97A] [DKR97B]. DEVise has the powerful ability to handle distributed data sets. In the Visage system, queries and visualizations are dynamically linked together, and it is possible to create visualizations that are derived from multiple objects.
 

Multiple window coordination

In current windowing environments, each window is treated as independent and isolated. This makes it difficult for users to coordinate information across multiple windows, even when separate windows are related by content or task. In designing user interfaces for information exploration, a strategy that has proven very effective is multiple window coordination [Shn98]. In this strategy two or more separate windows containing related information are used cooperatively for accomplishing a task. The windows respond in the same manner to user actions in a certain task domain. This strategy breaks the traditional windows paradigm with independent windows by tightly coupling interface components. User interaction in one window has a direct visual effect in other coupled windows. Research shows that potential gains from coordination strategies are significant and can improve user performance up to ten times [KS97].

Several existing software packages use this strategy today. For example, Microsoft Word’s ‘Document Map’ feature displays the table of contents of the text in a frame to the left. Selecting a heading in the map scrolls the document text directly to that section. Likewise, scrolling the document text highlights the current section in the map. As a second example, Windows Explorer 4.0 has three views: the left panel contains the directories, the right panel shows the detailed contents of a selected directory, and, with "View as Web Page" turned on, a third area displays details of a selected file, including a miniature quick-view.
 

Snap-Together Visualization

Chris North at HCIL has built a tool called Snap-Together Visualization (STV) that allows users to tightly couple different displays and to define actions [NS99] [Nor98]. STV was developed from LinkKit, an image-browser tool that was built to coordinate multiple images [NS98]. STV helps users to create the coordinated visualizations they need. Users query their relational database and load results into desired visualizations. Then they define the coordination between the different visualizations for selecting, navigating, or re-querying.

The user interface consists of two different parts: a menu that shows the tables and queries in the database together with a listing of visualizations, and a dialog for specifying the coordination between two visualizations.

Figure 5: The Snap-Together Visualization menu lists the tables and queries in the incident database and displays a list of available visualizations.

When users open a database with STV, the menu (Figure 5) displays the tables and the queries defined in the database. The available visualization tools are shown to the right. Users can drag-and-drop a table or a query onto one of the visualizations to open up a new window with the data loaded in it. STV only adds a menu to the new visualization window for STV-related actions (see below).

Figure 6: STV adds this menu to the visualizations.

The new window displays the data and can be used as a stand-alone application without any coordination features.

To coordinate any two views users make a drag-and-drop action between the two STV menus of the visualization windows. This action causes the Snap Specification dialog (Figure 7) to open, and users can now select the actions to coordinate in each window.

Figure 7: In the Snap Specification dialog, users select how two views should be coordinated. In this figure, selecting an exit in Spotfire will load the incidents that occurred near that exit in a textual list view.

When users click on the Snap button in the dialog, the two views are tightly coupled so that interaction in one window causes the desired effects in the other. Any number of visualizations can be coordinated this way, allowing users to build their own interfaces for exploring data.
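The coupling described above can be illustrated with a minimal sketch. It is not STV's implementation, which is written in Visual Basic and uses COM; the `View` class, its methods and the `snap` function are hypothetical names chosen only to show how a selection in one window can trigger a load action in another.

```python
class View:
    """A toy stand-in for a visualization window (hypothetical, not STV's API)."""

    def __init__(self, name):
        self.name = name
        self.loaded = None      # key whose details are currently shown
        self._listeners = []    # callbacks run when the user selects an item

    def on_select(self, callback):
        """Register a callback that receives the key of the selected item."""
        self._listeners.append(callback)

    def select(self, key):
        """Simulate the user selecting an item; notify all coupled views."""
        for callback in self._listeners:
            callback(key)

    def load(self, key):
        """Simulate loading the details that belong to the given key."""
        self.loaded = key


def snap(source, target):
    """Couple two views: selecting in source loads matching details in target."""
    source.on_select(target.load)


overview = View("exit map")
details = View("incident list")
snap(overview, details)
overview.select(19)   # details.loaded is now 19
```

Because each coupling is just one registered callback, any number of views can be snapped together in the same way, which mirrors the idea that users build their own interfaces from independent displays.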

The STV application is developed on the Windows platform using Visual Basic 6.0 and the APIs of the visualization applications. ODBC is used for data access and COM for inter-process communication.
 

Aggregation

When managing displays with larger and larger quantities of data, it becomes difficult to fit all the data points in one display. The display becomes cluttered and confusing when it gets too crowded (see Figure 8). One of two ideas to solve this problem [Li99] is to use aggregation. The other approach is data reduction, i.e. to reduce the size of the graphics.

Figure 8: A Spotfire display with incidents around Baltimore marked on a map. The display is crowded with data points and it is difficult to identify any high hazard locations.

An aggregate is a group of data points that is used as a summary. Aggregates can be formed as a result of decomposition or aggregation [GR94]. They are used in the visualization instead of all the data points to simplify the display. The aggregates can be divided into at least three different categories depending on the kind of data records that are summarized.

The aggregates have data characterizations that are derived from the data characterization of the elements, and they can be defined in advance in the database or specified on the fly. In traditional databases, aggregation is specified as a query with a group function that is submitted to the system. The system processes a large volume of data and delivers the answer. Executing the following SQL query created the automobile aggregates in one of the prototypes:

SELECT Auto, Count(Auto) AS [Number of incidents]

FROM RECORDS

GROUP BY Auto;

The different aggregate functions used in SQL queries are: average, count (as in the example above), minimum and maximum value, sum, variance, and standard deviation.
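As a minimal sketch of the query above, the following Python snippet runs an equivalent GROUP BY query against an in-memory SQLite table. SQLite is used only because it ships with Python; the database system used in the study may differ. The table contents are invented for illustration, and the column alias is simplified to the assumed name `n_incidents`.

```python
import sqlite3


def count_by_auto(rows):
    """Return {auto_value: number_of_incidents} using a GROUP BY query."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE RECORDS (id INTEGER PRIMARY KEY, Auto INTEGER)")
    con.executemany("INSERT INTO RECORDS (Auto) VALUES (?)", [(a,) for a in rows])
    result = dict(con.execute(
        "SELECT Auto, COUNT(Auto) AS n_incidents FROM RECORDS GROUP BY Auto"
    ).fetchall())
    con.close()
    return result


# Hypothetical data: number of automobiles involved in each of six incidents.
counts = count_by_auto([1, 2, 1, 0, 2, 1])
print(counts)
```

Each key in the result is one aggregate, and its value is the number of incident records that the aggregate summarizes.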

Online aggregation [HHW97] is a new interaction interface that lets users observe the progress of the aggregate query execution and control it on the fly. Another tool for aggregates is Aggregation Eye [Moc98], which is used for manipulating the extent of an aggregate dynamically.

One of the interesting problems about aggregation is to decide the granularity of the aggregate. Depending on the task and the application domain, different aggregates are needed. For example, in an application with highway incident data it is interesting to look at both the number of incidents per year and the average number of incidents per hour on one day.
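The effect of granularity can be sketched in a few lines of Python. The same records are grouped once by year (coarse) and once by hour of the day (fine). The timestamps are hypothetical and not taken from the CHART data, and the function name `aggregate` is chosen only for this illustration.

```python
from collections import Counter
from datetime import datetime


def aggregate(timestamps, key):
    """Count incidents per group, where key maps a datetime to a group label."""
    return Counter(key(t) for t in timestamps)


# Hypothetical incident times (not taken from the CHART data).
times = [
    datetime(1997, 7, 1, 8, 15),
    datetime(1997, 7, 1, 8, 40),
    datetime(1997, 7, 1, 17, 5),
    datetime(1996, 12, 24, 8, 30),
]

per_year = aggregate(times, lambda t: t.year)   # coarse granularity
per_hour = aggregate(times, lambda t: t.hour)   # fine granularity, hour of day
```

Only the key function changes between the two aggregations, which is why the choice of granularity can be left to the task at hand.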
 
 

Using aggregation with coupled displays

The traditional starfield displays are, as mentioned before, not sufficient for databases with many records. The display will be too crowded with data points and it will be difficult to draw any conclusions about the data set and possible trends in the data. Aggregation can be used to provide an overview of the data, and together with another coupled display the details from the aggregates can be explored. This allows the user to maintain the overview and at the same time look at the details. An analogy that is often used as an example of this is the galaxy-stars analogy. There are too many stars in the sky to explore and look at, so you must first decide which galaxy the interesting stars belong to. After that, you can look at the individual stars.

The visualization displays are tightly coupled, so that when the user selects an aggregate the details in the aggregate are shown in the other display. This drill-down technique is also used in the Visage system [LR96], but there the user has to drag and drop the aggregate onto a new display to see the details.

This study is the first to utilize both the strategy of aggregation to handle displays with a large amount of data and Snap-Together Visualization to coordinate actions in multiple displays.
 

CASE STUDY

Application domain background

CHART, or Chesapeake Highway Advisories Routing Traffic, is the highway incident management program of Maryland State Highway Administration (MSHA). The program utilizes advanced technologies, including closed-circuit television cameras, traveler advisory radio (TAR), variable message signs (VMS) and pavement weather sensors to assist in monitoring responses and clearing roadway incidents and backups. It is a statewide program with its headquarters in Hanover, Maryland, where the newly built and integrated statewide operations center (SOC) is located. Figure 9 shows the room where the traffic operators monitor the traffic. The SOC is supported by several traffic operations centers (TOC) located near College Park, Baltimore, Rockville, and Annapolis.

Figure 9: The Statewide Operations Center in Hanover, Maryland.

The current network covered by CHART consists of 375 miles of freeways and 170 miles of highway arterial roads, mostly in the Washington, Baltimore, Annapolis and Frederick metropolitan areas.

CHART comprises four major components: traffic monitoring, incident response, traveler information, and traffic management. Among those four components, the incident response and traveler information systems have received increasing attention from the general public, media, and transportation professionals. More information about the CHART program and live traffic maps (Figure 10) can be found on the CHART web site (http://www.chart.state.md.us).

Figure 10: CHART provides live traffic maps with information about incidents on the Internet.
 
 
 
 

The incident report form

When an incident occurs, a traffic operator at one of the centers fills in an incident report form. The form (Figure 11) is divided into different sections representing different types of information about the incident, including location, time and date, weather conditions, vehicles involved in the incident etc. On the back of the report (Figure 12), notes about actions and events during the incident are written.

All the data in this study is based upon incident report forms from CHART. They were obtained from SOC, TOC-3, and TOC-4. Today, the forms are only used internally at the Maryland State Highway Administration to evaluate the management program. All information and statistics provided to the public are based upon the official police reports. As one can see from the figures, it is not very easy to read and interpret the incident reports. Ambiguous statements, missing fields and bad writing are very common.

When the forms are filled in, the reports are put away and stored. There exist no digital versions of the reports and this is one of the reasons why they are practically not used at all.
 
 
 
 


 
 

Figure 11: The front page of a CHART incident report form. The information about the incident is divided into different sections. This report is from TOC-4, and shows an incident with an overturned car that took 1 hour to clean up.

Figure 12: Notes from the incident are written on the backside of the incident report form.
 
 

Users of the highway incident data

If the highway incident data from the reports existed in digital form, four different classes of potential users of the data can easily be found: system operators, traffic engineers, transportation planners, and the public.
 
 

System operators

The system operators monitor and operate the traffic management system on a day-to-day basis. They must identify incidents, dispatch appropriate resources, and create messages for variable message signs and highway advisory radio.

Operators would like to have the data as a reference guide regarding the actions to be taken for the current incident and also as an aid in comparing actions taken during an incident.
 
 

Traffic engineers

The traffic engineers are responsible for managing the freeways by setting speed limits, designing road parameters etc. They identify high hazard locations, trends by season/day-of-week/weather, and accidents by type of vehicles among other things.

An application for doing incident data exploration would be very good for traffic engineers, since all their work is performed to identify the need for physical, operational, regulatory or enforcement changes.
 
 

Transportation planners

Planners are responsible for identifying the need of new roadways based on capacity and safety.

The planners may use incident data to identify the need for new facilities (traffic congestion is traditionally used for this) and to calculate benefits of new facilities.

The public

Information provided to the public may influence their travel behavior by influencing their choice of route, time of the day, and type of vehicle.
 
 

The incident database and its quality

The first database in the study was based on 40 incident reports. Information from the paper reports was typed in by hand, and only a small fraction of all the information was included to keep it simple. Only date, location, critical times, blockages, type of vehicles involved, weather conditions and type of incident were included for each incident in this database.

Later on, contacts with the Civil Engineering Department of the University of Maryland were established. They had done and are doing studies on the performance of CHART [CP98] [Shr99], so they had a database with the incidents that occurred during 1997 from SOC, TOC-3 and TOC-4. There were 5,401 incident reports in the database, with 3,352 records from SOC, 1,450 records from TOC-3, and 599 records from TOC-4. Each record had 59 different fields. A listing of the fields and their descriptions is found in APPENDIX A: The database fields.

Since the incident report forms are so difficult to interpret and read, the database was unfortunately filled with errors. All the information about an incident was typed in as text even if the fields only contained numbers. It took two days of work to correct the mistakes in the database and to redefine some of the database fields.

Some other changes were also made to the database:

After these changes the database contained 1,611 records. A subset of the records (incidents from Baltimore Beltway, road number 695) was used in most of the prototypes. This choice was motivated by the fact that the incidents in the first database were from 695 and therefore all the data about the exits on the Baltimore Beltway had already been obtained. The database with the incidents from 695 contained only 175 records.
 
 

PROTOTYPES

This study includes more than ten different prototypes of coupled visualization displays with highway incident data. There were aggregates present in almost every prototype and the aggregation query was always pre-defined in the database, never specified dynamically. I first defined the aggregates in the database by writing SQL queries, and then created the views with the specification of the coordination between the different views. For each prototype I took a snapshot and documented it by writing down advantages, disadvantages, and other details about the view or the aggregation.

This chapter presents brief descriptions of the prototypes together with their advantages and disadvantages. A more detailed description of the prototypes can be found in APPENDIX C: Notebook.
 
 

The first prototype

The first prototype consisted of two coupled Spotfire displays. One of the displays showed the 40 incidents in the first database, and the other display showed a map with the exits on the Baltimore Beltway. When the user selected an incident, the exit closest to the actual location of the incident was highlighted on the map.

Figure 13: The first prototype highlights the exit closest to where the selected incident occurred.

It was very easy to look at the map and see where an incident occurred, but a disadvantage was that it was not possible to select an exit and highlight all the incidents at that exit number. At that point Snap-Together Visualization highlighted only one object per display and did not support highlighting multiple records, which was needed in this case. After improvements of STV this prototype could highlight several records, and the user could select either an exit number or an incident.

Another variant of this prototype was showing the map with exit numbers and a table grid. If the user selected an exit, data about the incidents was loaded into the table. This prototype did not include any aggregates, but it gave me proof that these coupled displays were working and looked promising.

Exit number aggregation

The next step was to create aggregates for the exit numbers (geographical aggregation) and to use the incident database with the records from the Baltimore Beltway. I added data about the exits and calculated how many incidents occurred close to each of them. To show Spotfire’s ability to color-code markers, a fictional distance to a response unit was added for each exit number. When an exit was selected, all the incidents at that exit were shown in a table grid at the bottom of the screen (Figure 14).

Figure 14: With exit aggregates on a map it is easy to see where most of the incidents occurred. The size of the markers depends on the number of incidents and the color depends on the distance to a response unit.

The map makes it clear and simple to see where most of the incidents occurred, since the size of each exit depends on the number of incidents close to that exit. The distance to a response unit is used as color-coding, with dark blue as the longest distance and white as the shortest. For users working with CHART this view could serve as an aid in placing the response units where they are most needed. The exits with dark blue color and rather large size are probably in need of an extra unit.
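A minimal sketch of how such exit aggregates could be computed is given below, assuming two hypothetical tables, `exits` and `incidents`, with invented contents and column names. A LEFT JOIN is used so that exits without any incidents still appear, with a count of zero. SQLite stands in for the actual database only because it ships with Python.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE exits (exit_no INTEGER PRIMARY KEY, distance REAL);
    CREATE TABLE incidents (id INTEGER PRIMARY KEY, exit_no INTEGER);
    -- Hypothetical data: three exits and four incidents.
    INSERT INTO exits VALUES (17, 2.0), (19, 9.5), (25, 4.0);
    INSERT INTO incidents (exit_no) VALUES (19), (19), (25), (19);
""")

# One row per exit: the incident count would drive the marker size,
# and the (fictional) distance to a response unit would drive the color.
rows = con.execute("""
    SELECT e.exit_no, COUNT(i.id) AS n_incidents, e.distance
    FROM exits AS e
    LEFT JOIN incidents AS i ON i.exit_no = e.exit_no
    GROUP BY e.exit_no
    ORDER BY e.exit_no
""").fetchall()
con.close()
```

In this invented data set, exit 19 combines the highest incident count with the longest distance, which is the kind of exit the text identifies as a candidate for an extra response unit.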
 
 

Zooming via exit number aggregates

The idea of exit number aggregates worked well, but there was a need for another way to show the incidents at each exit. The table grid was primitive, and a detailed map of each exit would be much better. This discussion led to my next prototype, which was used as a zooming map.

Figure 15: Zooming in at exit number 19 on the Beltway, where five different incidents occurred.

When an exit number was selected on the map, the detailed view over the exit was loaded in the other display. The exact location of each incident was shown on the detailed map as a green triangle. This prototype makes it very simple to zoom in and still maintain the overview of all the incidents, and it is easy to find an answer to questions like the following: Where are the high hazard locations at exit 25?
 

Weekday aggregation

Inspired by my prototypes with geographical aggregation, I changed my focus towards temporal aggregations. Since traffic during one week is similar to traffic during other weeks, it seemed promising to group the incidents by day of the week. The number of incidents each day was shown in a display with a bar chart. Each bar represented one day of the week. When a bar was selected, a map with markers of the incidents was loaded in the other display. The size of the markers in this display depended on the duration of the incident. Surprisingly, there were far fewer incidents on the weekend than on weekdays.

Figure 16: A bar chart displays the distribution of incidents during the week. All the incidents on Mondays are shown on the map.

The next problem to tackle concerned the update of the aggregates. Imagine you only want to look at the truck incidents on Mondays. You can filter the incidents on the map to show only incidents with trucks, but the aggregates in the bar chart won’t be updated accordingly. To support this recalculation and updating of the aggregates, STV had to be changed (see Recalculation of aggregates).

After changing STV to support update and recalculation of the aggregates the user could filter the incidents on the map and the aggregates in the bar chart were updated accordingly. In the figure below only the incidents on Mondays with trucks involved are displayed on the map.

Figure 17: When the map with incidents is manipulated to show only truck incidents, the aggregates in the bar chart are updated accordingly. The user can now see that most of the truck incidents occurred on Fridays.
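The recalculation can be sketched as follows: the weekday aggregates are recomputed from only those incidents that pass the current filter. This is an illustration under assumed names and invented data, not the mechanism that was added to STV; the record layout and the `trucks` field are hypothetical.

```python
from collections import Counter
from datetime import date

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def weekday_counts(incidents, keep=lambda inc: True):
    """Count incidents per weekday, using only those that pass the filter."""
    return Counter(DAYS[inc["date"].weekday()] for inc in incidents if keep(inc))


# Hypothetical records; "trucks" is the number of trucks involved.
incidents = [
    {"date": date(1997, 7, 7), "trucks": 0},    # a Monday
    {"date": date(1997, 7, 7), "trucks": 1},    # a Monday
    {"date": date(1997, 7, 11), "trucks": 2},   # a Friday
    {"date": date(1997, 7, 11), "trucks": 1},   # a Friday
]

all_counts = weekday_counts(incidents)
truck_counts = weekday_counts(incidents, keep=lambda inc: inc["trucks"] > 0)
```

Passing the filter into the aggregation, rather than applying it only to the detail view, is what keeps the bar chart and the map consistent with each other.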
 

Calendars

Instead of grouping the incidents by day of the week, they were grouped by date in this prototype. A calendar was shown in one display, and if a date was selected, information about the incidents was loaded into a table grid. Two different prototypes were developed to show the importance of the aggregates’ granularity. The first one showed a calendar of July 1997 and the other displayed a calendar of one entire year. In the view of one year, data from two different years (in different colors) was shown together to simplify the task of comparing the number of incidents over a period of two years. Another approach would be to show data from one year together with the average number of incidents during a 5-year period.

Figure 18: Incident data from 1997 (blue markers) together with fictional data from 1996 (red markers).

It was clear that the view over one entire year was much more appealing than the view over only a single month.
 
 

Incident code aggregation

When an incident occurs it is classified depending on its seriousness. Different codes are used for the classification:

1050PD = Property Damage

1050MPI = Minor Personal Injury

1050SPI = Serious Personal Injury

1050F = Fatal incident

1046R = Roadway incident

CF = Car Fire

OTHER = Other type of incident

ROADWORK = Road work

This prototype was developed to detect differences between the incident types and also to see which types of incidents were most common. As in the other prototypes, the details of the incidents are loaded in the upper display (Figure 19) when one of the aggregates is selected.

Figure 19: The incidents are aggregated by the incident code. The incidents of 1050PD-type are shown in the upper display, where the user can examine the incidents closer.

One of the interesting things found during the development of this prototype was that the duration of an incident increased with the number of cars involved for incidents of 1050SPI-type, but decreased for 1050PD incidents. If the incident isn’t so serious, it obviously does not matter how many cars were involved in the incident.
 

RECOMMENDATIONS

While developing the different prototypes several problems arose and some of them were solved at once by changing the STV application. This chapter will discuss those changes and also give recommendations about the prototypes and how to use aggregation. Descriptions on what developers of coupled displays and visualization tools need to think about regarding aggregation, and some general advice about the use of aggregates will also be presented.
 

Users of CHART

The incident report forms are today only for internal use at the center in Hanover, Maryland. The primary goal is to calculate response time for the incidents and to evaluate the benefits of CHART. For other statistical tasks the official police reports are used and thus about 30% of the incidents are not covered.

The prototypes show that the highway incident data can have many different purposes. They seem very promising for exploring data and the personnel working with collecting data should be encouraged to save it electronically. Even though a lot of information is available for each incident there is data missing. If the incidents’ exact location were saved in a proper format, it would make the exploring both more interesting and easier. Other examples of valuable data include the time and distance response units have to travel, average traffic volumes, and speed limits.

If the system operators can explore databases on how incidents were handled, this might influence their way of working. The problem is that the data is not gathered today, so we do not really know what we might be able to do with it. I believe that these coupled displays together with aggregation will be useful for people working with CHART.

Geographical aggregation, as implemented in the prototype with exit number aggregates, worked very well. The display gives users a quick overview of the most dangerous spots, and it is easy to investigate these areas in more detail. When traffic engineers want to know where the high hazard locations are, I think this type of tool will be very helpful.

Transportation planners identify the need for new roadways. If data about the roads were available, it could be displayed together with the highway incident data. Selecting an incident could bring up the parameters of the road in another display and, conversely, selecting road parameters could bring up the incidents that occurred on roads with those parameters.
 

Developers of coupled displays

Software developers of linked or coupled displays need to focus their attention on a couple of important things. This section explains the issues that were found through this case study: both specific problems with STV and general aspects of coupled displays. Where a problem has been solved, the solution is presented; in other cases only a description of the problem is given.
 
 

Primary key

When a table or a query is loaded into a visualization display through STV one of the fields of the records is set to be the primary key. The values from this field will be passed over to other linked windows if a coordination link between the displays is specified. For example, when a user selects the aggregate for exit number 19 the incidents close to exit 19 will be loaded into the grid (Figure 14). The field with exit numbers is the primary key and the value (19) is passed to the query "Load incidents with exit number = x".
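The link described above can be sketched as a parameterized query. This is only an illustration, not the STV implementation: the in-memory SQLite table, its sample rows and the function names `load_incidents_for_exit` and `on_select` are assumptions made for the example.

```python
import sqlite3

# Hypothetical miniature of the incident table (sample rows are invented).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Incidents ("
    "CaseID INTEGER PRIMARY KEY, ExitNumber INTEGER, Description TEXT)"
)
conn.executemany(
    "INSERT INTO Incidents VALUES (?, ?, ?)",
    [
        (1, 17, "Disabled vehicle"),
        (2, 19, "Collision"),
        (3, 19, "Debris"),
        (4, 22, "Collision"),
    ],
)


def load_incidents_for_exit(conn, exit_number):
    """Stand-in for the query 'Load incidents with exit number = x'."""
    cursor = conn.execute(
        "SELECT CaseID, ExitNumber, Description FROM Incidents "
        "WHERE ExitNumber = ? ORDER BY CaseID",
        (exit_number,),
    )
    return cursor.fetchall()


def on_select(selected_record, key_field):
    """Coordination link: pass the key value of the selected record on."""
    return load_incidents_for_exit(conn, selected_record[key_field])


# The user selects the aggregate for exit 19 in the display with aggregates.
selected_aggregate = {"ExitNumber": 19, "Number": 2}
details = on_select(selected_aggregate, key_field="ExitNumber")
```

Here `details` holds the two sample incidents recorded at exit 19, which is what the grid would show.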

STV sets the field whose name contains the substring ID to be the primary key. This is not always appropriate or desirable. In the first prototype (Figure 13) the field ExitNumber is a better field to use as a parameter than the primary key CaseID.

A solution to this problem was to let the user choose which of the fields is the primary key. A dialogue lists all the fields in the data set, and the user has to choose one of them as the primary key (Figure 20). Every time a new data set is loaded into a Spotfire display the user must choose a primary key (or parameter), except when the display is used for loading details of an aggregate.

Figure 20: A dialogue lets the user choose one of the fields to be primary key.

This new dialogue makes it possible to combine the first two prototypes into one application. Totally different data sets can be linked together as long as they have one field in common. Another advantage is that the database designer does not have to give one of the fields a name with ID in it.

With this improvement, a data set still has one and the same primary key for all of its links, but it would probably be better to choose a new key for each link. This would enable users to coordinate windows in new, creative ways.
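The two ways of picking the key discussed in this section can be sketched as follows. The function names are hypothetical, and the assumption that the substring match is case-sensitive and picks the first matching field is mine; the text does not say how STV resolves several candidates.

```python
def default_key(fields):
    """Original rule: the first field whose name contains 'ID' (assumed case-sensitive)."""
    for name in fields:
        if "ID" in name:
            return name
    return None


def choose_key(fields, user_choice=None):
    """Improved rule: the choice made in the dialogue wins over the default."""
    if user_choice is None:
        return default_key(fields)
    if user_choice not in fields:
        raise ValueError(f"unknown field: {user_choice}")
    return user_choice


fields = ["CaseID", "Date", "ExitNumber", "IncidentCode"]
automatic = choose_key(fields)                # falls back to the ID rule
chosen = choose_key(fields, "ExitNumber")     # the field picked in the dialogue
```

Choosing a key per link, as suggested above, would amount to calling `choose_key` once for every coordination link instead of once per data set.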
 
 

Multiple selection

STV does not support multiple selection today; that is, it does not let the user select a group of records and coordinate actions with all of them. It is, however, possible to highlight multiple records when a single record is selected. For example, in the variant of the first prototype, selecting an exit number highlights all the incidents with that exit number. This improvement to STV was made after the first prototype, when it became clear that the feature was really needed.

If a coordination tool like STV is to support multiple selection, there is one very important issue to consider. Imagine two visualizations that are tightly coupled so that selection in one display causes selection in the other. The user selects several records in display A, which highlights the corresponding records in display B. The highlighted records in display B may in turn highlight new records in display A, creating a cascade effect. It is not obvious how the cascading should be prevented (or whether it should be prevented at all), and a solution to this problem is still unknown.
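To make the cascade effect concrete, the following sketch compares a single propagation step with the full cascade. It does not settle whether the cascade should be prevented; it only shows how far a selection can spread. The link table and the function names are invented for the illustration.

```python
def one_step(links, selected_a):
    """Highlight in B only the records directly linked to the selection in A."""
    return {b for a, b in links if a in selected_a}


def cascade(links, selected_a):
    """Propagate highlighting back and forth until nothing new is highlighted."""
    a_sel, b_sel = set(selected_a), set()
    while True:
        new_b = {b for a, b in links if a in a_sel}
        new_a = a_sel | {a for a, b in links if b in new_b}
        if new_a == a_sel and new_b == b_sel:
            return a_sel, b_sel
        a_sel, b_sel = new_a, new_b


# Hypothetical links between records in display A (numbers) and display B (letters).
links = {(1, "x"), (2, "x"), (2, "y"), (3, "y"), (4, "z")}

direct = one_step(links, {1})
spread_a, spread_b = cascade(links, {1})
```

With these sample links, selecting record 1 directly highlights only "x", while the full cascade ends up highlighting records 1, 2 and 3 in A and "x" and "y" in B. The loop terminates because the highlighted sets only grow and the number of records is finite.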
 
 

Loading data

In many of the prototypes one of the coupled displays is used for showing details of an aggregate. The details are loaded depending on the selected aggregate. Every time new data is loaded into a Spotfire display because an aggregate is selected, the display settings are reset. This includes the settings for the color-coding, size coding, background image, and the X- and Y-axes. The coordination application must remember the preceding settings and apply them to the newly loaded data set. Otherwise, the user must redo the settings over and over again.

An option was added to STV so that the settings for the color-coding, size coding and the X- and Y-axes are restored when new data is loaded into a Spotfire display. Unfortunately, the background image could not be saved the same way because the necessary Spotfire API functions were not implemented (see the section API functions). To avoid setting the background image every time a new data set was loaded, an ad hoc solution was implemented. Since a file in the Spotfire file format includes all information about the current visualization, including the background image, a temporary file was saved. When the new data set was loaded, the settings from the file were applied to it. This was very slow, but it worked.

The two different approaches to saving the settings were added as options in the STV menu (Figure 21). Restoring the settings is of great importance during the exploration of the aggregates.
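The save-and-restore behaviour can be sketched as below. The `Display` class is a hypothetical stand-in for a visualization display that resets its settings on every load; it does not use or imitate the Spotfire API, and the setting names are assumptions.

```python
DEFAULT_SETTINGS = {"color_by": None, "size_by": None, "x_axis": None, "y_axis": None}


class Display:
    """Hypothetical display that forgets its settings whenever data is loaded."""

    def __init__(self):
        self.settings = dict(DEFAULT_SETTINGS)
        self.records = []

    def load(self, records):
        self.records = list(records)
        self.settings = dict(DEFAULT_SETTINGS)  # the reset described in the text


def load_keeping_settings(display, records):
    """Remember the current settings, load the new data, then reapply them."""
    saved = dict(display.settings)
    display.load(records)
    display.settings.update(saved)


display = Display()
display.settings.update(
    {"color_by": "IncidentCode", "size_by": "Duration", "x_axis": "Date", "y_axis": "LanesBlocked"}
)
load_keeping_settings(display, [{"CaseID": 2}, {"CaseID": 3}])
```

After the call, the display shows the new records but still uses the color, size and axis settings the user had chosen.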

A display that is reloaded with data and used for showing details of an aggregate can have couplings to other windows than the display with the aggregates. These links can either be lost or kept as if nothing has happened when the display is reloaded. Since the coordination actions should be set up once and not several times, it was easy to make a choice in this case. STV was changed so the other links remained when an aggregate was selected and new data was loaded into the display.
 
 

Recalculation of aggregates

When two displays are tightly coupled the actions taken in one display should have direct visual effect in the other display. Selection of an aggregate in one display will immediately load the details in the other display.

Consider the following scenario: The incidents in the database are aggregated by weekday and Monday is selected. All the incidents on Mondays are displayed in the other window. The user now filters the incidents to show only those involving trucks, and wants the aggregates to be updated to reflect the filtering action. Today, the update cannot be done without sending a new query to the database to recalculate the aggregates.

The solution that was implemented takes the SQL query from the display with the details and combines it with the aggregate query. The database executes the combined query, and the display with the aggregates is reloaded with the new results. Every time the user changes the query devices in the Spotfire display, the aggregates are recalculated. An extra option for this feature was added to the STV menu (Figure 21).
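One way to combine the two queries is to use the filtered details query as a subquery of the aggregate query. The sketch below assumes this form; the actual SQL generated by STV is not given in this report. The table, its sample rows and the function name are made up, although the field names SUTruck and TTrailer follow Appendix A.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Incidents ("
    "CaseID INTEGER PRIMARY KEY, Weekday TEXT, SUTruck INTEGER, TTrailer INTEGER)"
)
conn.executemany(
    "INSERT INTO Incidents VALUES (?, ?, ?, ?)",
    [
        (1, "Monday", 0, 0),
        (2, "Monday", 1, 0),
        (3, "Monday", 0, 2),
        (4, "Tuesday", 0, 0),
        (5, "Friday", 1, 1),
    ],
)

# The aggregate query counts incidents per weekday over whatever the
# details query returns.
AGGREGATE_QUERY = (
    "SELECT Weekday, COUNT(*) AS Number FROM ({details}) "
    "GROUP BY Weekday ORDER BY Weekday"
)


def recalculate_aggregates(conn, details_query):
    """Wrap the (filtered) details query in the aggregate query and run it."""
    return conn.execute(AGGREGATE_QUERY.format(details=details_query)).fetchall()


all_incidents = "SELECT * FROM Incidents"
trucks_only = "SELECT * FROM Incidents WHERE SUTruck + TTrailer > 0"

before = recalculate_aggregates(conn, all_incidents)
after = recalculate_aggregates(conn, trucks_only)
```

Note that the details query passed in here contains only the user's filter, not the restriction to the selected weekday; otherwise every aggregate except the selected one would disappear. With the sample rows, Monday drops from three incidents to two and Tuesday vanishes once the truck filter is applied.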

Updating the aggregates correctly is a very difficult thing to do. It is conceptually also difficult to understand what is happening with the aggregate query when the details are filtered.

If tightly coupled displays are going to be used together with aggregation, this problem has to be solved. The tight coupling is meant to help users explore the data, and updating the aggregates is very important for this task.

Figure 21: The STV menu with the new options for updating aggregates and restoring the settings added at the bottom.
 
 

Screen space

When large sets of data are explored with visualization tools, screen space is a limiting factor. At HCIL, two-monitor workstations are often used for demonstrations of STV to show its full potential. The coordination tool itself should not occupy too much of the screen and should perhaps be visible only while the coordination specifications are being defined.
 

Spotfire

Spotfire is a very good application for visualizing data, but here are some recommendations on how to improve the program. Many of them concern the available Spotfire API (Application Programming Interface), which enables developers to control the program's visualizations and interface. The API is used in STV to create Spotfire visualizations.
 

Bar charts

When a record is selected in a scatterplot, the API fires an event that lets other applications know about the selection. Unfortunately, no events are fired when a user selects a bar in a bar chart. This makes it impossible to coordinate selections from a bar chart in one display with another display. Figure 16 and Figure 17 show a prototype with bar charts, but the windows in these figures are actually not tightly coupled. I notified Spotfire about this problem; their development team had not thought about it before.
 

Loading data

If Spotfire is going to be used as a display for showing details of aggregates the loading of data must be much faster. When a user selects an aggregate and wants to see the details in the other display, the user must wait for the contents to be loaded.
 
 

Painting markers

When Spotfire paints the data records, it paints all the points of one color first. It should be possible to paint them in an order that depends on the size of the markers. In the prototype showing data from an entire year (Figure 18), all the red markers are probably painted first, since some of the smaller ones are hidden behind the bigger blue markers.
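A size-dependent painting order could be as simple as sorting the markers so that the largest are painted first and the smallest last, which keeps small markers on top. The sketch below assumes that markers painted later cover those painted earlier; the marker records are invented for the example.

```python
def paint_order(markers):
    """Return the markers largest first, so the smallest are painted on top."""
    return sorted(markers, key=lambda marker: marker["size"], reverse=True)


markers = [
    {"color": "red", "size": 2},
    {"color": "blue", "size": 9},
    {"color": "red", "size": 5},
]
ordered = paint_order(markers)
```

In this example the large blue marker is painted first and the two smaller red markers end up in front of it, regardless of their color.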
 
 

API functions

Today there are many things you can do with the Spotfire interface, and the perfect API would let you do exactly the same things, or perhaps even more. This is not the current state of the API, since many important functions are not implemented yet. It is, for example, not possible to manipulate the query devices, to set the range of a range slider, or to check the buttons. Functions for saving settings from visualizations would have been very useful in my prototypes. You can save the settings for the X- and Y-axes, color-coding, shape coding and size coding, but the settings in categorical color-coding cannot be saved. Nor is it possible to save or set the background image.
 
 

General principles

Aggregation is a good way of reducing the number of data points in a display. Instead of showing all the points, groups of data points (aggregates) are created to simplify the view. Tightly coupled displays work very well together with aggregation, since the coupling makes it easy to explore the aggregates in detail.
 
 

Pre-calculated versus dynamic

Aggregates can be pre-calculated and defined in the database, but they can also be specified on the fly. If they are to be calculated dynamically, the database management system must be fast enough that the user does not have to wait for the results. The advantage of specifying the aggregates on the fly is that the user can decide the extent and the granularity of the aggregation. When the aggregation is done in advance, care must be taken to specify the most useful aggregates. In most cases it is probably possible to predict which kinds of aggregates will be used, but there are situations where many different combinations of aggregates could be needed. Another possibility is to specify and calculate the aggregates dynamically and save them for future reference.
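The last option, calculating aggregates on demand and saving them for future reference, can be sketched with a simple cache. The table, the sample rows and the names `aggregate_by` and `ALLOWED_FIELDS` are assumptions for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Incidents (CaseID INTEGER PRIMARY KEY, Weekday TEXT, ExitNumber INTEGER)"
)
conn.executemany(
    "INSERT INTO Incidents VALUES (?, ?, ?)",
    [(1, "Monday", 17), (2, "Monday", 19), (3, "Friday", 19)],
)

# Field names are checked against a fixed list because they are
# interpolated into the SQL text.
ALLOWED_FIELDS = {"Weekday", "ExitNumber"}
saved_aggregates = {}


def aggregate_by(conn, field):
    """Count incidents per value of `field`, reusing a saved result if one exists."""
    if field not in ALLOWED_FIELDS:
        raise ValueError(f"cannot aggregate by {field}")
    if field not in saved_aggregates:
        sql = f"SELECT {field}, COUNT(*) FROM Incidents GROUP BY {field} ORDER BY {field}"
        saved_aggregates[field] = conn.execute(sql).fetchall()
    return saved_aggregates[field]


by_exit = aggregate_by(conn, "ExitNumber")      # calculated on the fly
by_exit_again = aggregate_by(conn, "ExitNumber")  # served from the saved copy
```

The user decides the granularity by choosing the field, and repeated requests no longer wait for the database. A saved aggregate goes stale if the underlying incidents change, so a real tool would need some way to discard it.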
 
 

Resetting the displays

In some of the prototypes the initial views of the displays are the aggregates and all the incidents. When an aggregate is selected, the view with all the incidents changes to show only the incidents from that aggregate. There is currently no way to return to the view with all the incidents. This could be regarded as an interface problem or as a matter of the data. One solution is to add an extra aggregate containing all the incidents to the view with the aggregates. Another solution would be to add a reset button to the interface.
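The first solution, an extra aggregate that contains all the incidents, can be sketched as follows. The label "All", the sample incidents and the function names are assumptions made for the example.

```python
from collections import Counter

ALL = "All"  # hypothetical label for the extra aggregate


def weekday_aggregates(incidents):
    """One row per weekday, preceded by an extra row covering every incident."""
    counts = Counter(incident["Weekday"] for incident in incidents)
    return [(ALL, len(incidents))] + sorted(counts.items())


def load_details(incidents, selected):
    """Selecting the extra aggregate brings back the view with all incidents."""
    if selected == ALL:
        return list(incidents)
    return [incident for incident in incidents if incident["Weekday"] == selected]


incidents = [
    {"CaseID": 1, "Weekday": "Monday"},
    {"CaseID": 2, "Weekday": "Monday"},
    {"CaseID": 3, "Weekday": "Friday"},
    {"CaseID": 4, "Weekday": "Monday"},
]
aggregates = weekday_aggregates(incidents)
mondays = load_details(incidents, "Monday")
everything = load_details(incidents, ALL)
```

This treats resetting as a matter of the data: no new interface element is needed, since the reset is just another selection.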
 
 

Speed

The visualization application that is used for exploring the details of an aggregate must be fast enough to switch between different data sets. It is frustrating to wait for data to load all the time.
 
 

Differences between aggregates

It is not easy to find obvious differences between the different kinds of aggregation, but certain prototypes nevertheless seemed more promising and valuable than others. The geographical aggregation by exit number with a map in the background was very easy to understand and work with. Another prototype that seemed very promising was the date aggregation displayed over one year. It seemed much better than the display with weekday aggregation, but it all depends on what the task is.
 
 

FUTURE WORK

If user interfaces with aggregation and tightly coupled views are going to be successful, more research has to be conducted in this area. Here are a few suggestions for future research areas.

 
 

APPENDIX A: The database fields

Field | Description | Data Range
CaseID | Primary key | 1 – 5479
Source | Reporting center | SOC, TOC3, TOC4
Date | Date | 1/1/97 – 12/31/97
Location | Description of location |
RoadName | Road with direction (e.g. north-bound) |
RoadNumber | Number of the road | 0 – 895
LocationNumber | Unknown | (only empty values)
DetectedBy | Person/organization detecting the incident |
StartTime | Time when the incident occurred | 00:00 – 24:00
ReportingTime | Time when CHART was notified | 00:00 – 24:00
DispatchTime | Time when resources were dispatched | 00:00 – 24:00
ArrivalTime | Time when the unit(s) arrived on | 00:00 – 24:00
PartialOpenTime | Time when some (not all) lanes opened | 00:00 – 24:00
FullyOpenTime | Time when all the lanes opened | 00:00 – 24:00
DepartureTime | Time when the unit(s) left the incident | 00:00 – 24:00
Description | Short description of the incident |
LanesBlocked | Number of blocked lanes | 0 – 12
TotalLanes | Total number of lanes | 0 – 12
ShoulderBlocked | Number of blocked shoulders | 0 – 1
ShoulderUtilized | Number of shoulders used for cleaning up | 0 – 3
RampBlocked | Number of blocked ramps | 0 – 2
QLength | The length of the queue in miles | 0 to 6 miles
QlengthTime | Time when the queue length was measured | 00:00 – 24:00
Auto | Number of cars that were involved | 0 – 9
Pickup/Van | Number of pickups/vans | 0 – 4
SUTruck | Number of single unit trucks | 0 – 5
TTrailer | Number of tractor trailers | 0 – 4
Motorcycle | Number of motorcycles | 0 – 3
Other | Other vehicles involved |
WeatherCode | Rain, snow, fog, wet or dry pavement |
Hazmat | Hazardous material involved | 0 or 1
FITM | Freeway Incident Traffic Management | 0 – 2
Medivac | | 0 – 3
SignalOPS | | 0 – 1
IncidentCode | The type of the incident (seriousness) | Codes
SHA | State Highway Administration | Responded / N
MSP | Maryland State Police | R / Notified
FireDept | Fire Department | R / N
LocalPolice | Local police | R / N
SOC | Statewide Operations Center | R / N
TOC3 | Traffic Operations Center # 3 | R / N
TOC4 | Traffic Operations Center # 4 | R / N
TOC5 | Traffic Operations Center # 5 | R / N
MdTa | Maryland Transportation Authority | R / N
MDE | | R / N
Shop | Maintenance office (shop) of SHA | 0, 1, 2, R / N
HT | Highway Technician | 0 – 9
ERU | Emergency Response Unit | 0 – 3
ETP | Emergency Traffic Patrol | 0 – 3
PortVMS | Portable Variable Message Sign | 0 – 2
DumpTruck | | 0 – 7
ArrowBoard | Number of arrow boards that were used | 0 – 6
SandTruck | Number of sand trucks | 0 – 2
Sweeper | Number of sweepers | 0 or 1
PortTAP | | 0 or 1
LightPlan | | 0 or 1
FrontEndLoader | Number of front end loaders | 0 – 2
OtherEquip | Other equipment that was needed |
District | District where the incident occurred | (only empty values)
 
 

APPENDIX B: System specifications and how to run demos

System specifications

Windows 95/98/NT and Visual Basic 6.0 are needed to run STV, and Spotfire 3.0 or higher must be installed on the machine to show the visualizations.
 
 

Demos

APPENDIX C: Notebook contains a description of how to run a demo for each of the prototypes.

Each description lists the database used, the visualization tools, the parameters and the specified coordination actions. For example, the table below shows Exit aggregation, where the first query (Exit aggregates) is shown in Spotfire and the second query is shown in a table grid. When the user selects an exit, the incidents are loaded. The parameter ExitID should be chosen when the Spotfire display is loaded with data; for the table the parameter does not matter.

Database: Exit aggregation.mdb
 
Table or Query | Visualization | Parameter | Action
Exit aggregates | Spotfire | ExitID | Select
Incidents at an exit | Table | --- | Load

 

The option Updating aggregates on the STV menu can be demonstrated with weekday aggregates. Open the database Weekday aggregation, load the query Weekday aggregates in a Spotfire display and choose WeekdayID as the parameter. Then load the table Only 695 in another Spotfire display and choose Weekday as the parameter. Snap the two windows so that selection of an aggregate loads the incidents.

With the option set, the aggregates will be recalculated when a user filters the incidents to show for example only incidents with cars. The recalculation is done when any of the query devices in the display with incidents are manipulated.

All image files and databases are located in the directory g:\users\anna\Transportation\.

The enhanced version of STV is called linkit.exe and located in the directory g:\users\anna\devel_snap\.
 
 

APPENDIX C: Notebook

06/01/99 – 08/23/99
 
 

  1. OPEN HOUSE DEMO 1 – 06/17/99

     
     
     

    Figure 22: Highlighting the exit closest to a selected incident.
     
     

    Questions

    Which was the closest exit to this incident?

    Which type of incident blocked 3 lanes?

    Did the duration time increase with the number of lanes that were blocked?

    Did the type of incident influence the number of blocked lanes?
     
     

    Description

    The incidents are shown in Spotfire to the right. The duration of an incident is used to set the size of a marker and the type of incident is used for coloring. The date is on the X-axis and the number of blocked lanes is on the Y-axis. In the display to the left there is a map over the exits of the Baltimore Beltway. If you select an incident in the rightmost display, the exit closest to the incident will be highlighted in the other display.
     
     

    Demo

    Database: Accident data.mdb
     
    Table or Query | Visualization | Parameter | Action
    Accident data | Spotfire | ID | Select
    Exits | Spotfire | CaseID | Select

     

    Advantages

    It is easy to see where an incident occurred and how long it took to handle it.
     
     

    Disadvantages (not anymore…)

    It is not possible to select an exit and see the incidents at that exit (a question of multiple selections, which are not supported by Snap Together). The exits do not have any parameters (number of accidents etc.), so you cannot easily see where most of the incidents occurred.

    Figure 23: After the Snap was improved so that the passed parameter can be chosen, Demo 1 and Map to Grid can be combined into one. The exit number is passed from Spotfire to the map and the list.

  2. MAP TO GRID – 06/17/99

     
     
     

    Figure 24: The grid shows the incidents close to exit 17.
     
     

    Question

    What incidents (and how many) occurred at this exit?
     
     

    Description

    Spotfire shows a map over the exits on the Baltimore Beltway. If you select an exit, you can see the incidents at that exit in a grid display. In the picture above exit number 17 is selected and the grid shows that there were three incidents near that exit.
     
     

    Demo

    Database: Accident data.mdb
     
    Table or Query | Visualization | Parameter | Action
    Exits and their coordinates | Spotfire | ExitID | Select
    Accidents at an exit | Table | --- | Load

     

    Advantages

    It’s easy to see what incidents occurred at a selected exit.
     
     

    Disadvantages

    In this view you can’t see how many incidents occurred at each exit. The Snap Together interface does not support selection of multiple records (it can only pass one exit number to the query), so it was not possible to show the incidents from two or more exits.

  3. EXIT AGGREGATION – 06/29/99

     
     
     

    Figure 25: The map shows exit aggregates. If you select an exit, the grid shows the incidents.
     
     

    Questions

    Where are the exits with the most incidents?

    Are the exits with many incidents far away from a response unit?
     
     

    Description

    A map over the exits on the Beltway is shown in Spotfire. The size of each exit marker depends on how many incidents occurred near that exit. The color depends on the distance to the closest response unit (fictional data). When you click on an exit you can see its incidents in a grid. The number of incidents at an exit ranges from 0 to 13. The total number of incidents is 175. The aggregate has the parameters Number and Distance.
     
     

    Demo

    Database: Exit aggregation.mdb
     
    Table or Query | Visualization | Parameter | Action
    Exit aggregates | Spotfire | ExitID | Select
    Incidents at an exit | Table | --- | Load

     

    Advantages

    It is easy to see the frequency of incidents (most of the incidents are on the northwest parts of the Beltway) and to look at the incidents in detail. You can also see which exits are far away from the fictional response units.
     
     

    Disadvantages

    You can’t change the parameters of an incident dynamically, for example to see where the incidents with trucks were. The aggregates must then be recalculated.
     
     

    Note

    Fictional data about the distances to the response units
     
     

  4. ZOOMING VIA SPOTFIRE (GEOGRAPHICAL AGGREGATION) – 06/30/99

     
     
     

    Figure 26: Zooming in at exit 17. There are 7 different incidents close to exit 17 on the Beltway.
     
     

    Questions

    Where are the high hazard locations?

    Where exactly did the incidents occur at this exit?

    Were they all on the ramp to the exit or on the inner loop?

    Are there one or several intersections that seem to be more dangerous?
     
     

    Description

    A map over the exits on the Beltway is shown in Spotfire (the same view as in the previous example). When you select an exit a more detailed view over the exit is shown in another display. You can now see exactly where the incident occurred. Exit 17 is shown in the figure above and exit 19 is shown below.
     
     

    Demo

    Database: Exit aggregation.mdb
     
    Table or Query | Visualization | Parameter | Action
    Exit aggregates | Spotfire | ExitID | Select
    Zooming | Spotfire | 17, --- | Load

     

    Advantages

    It’s an excellent way of zooming at a selected exit. While you zoom in you can still have the overview in the other display.
     
     

    Disadvantages

    When you select an exit it takes time to load the incidents in Spotfire. You also have to apply a map as a background image each time a new set is loaded into Spotfire. To do this smoothly we’ll probably need some other display to view the detailed map.

    Figure 27: Zooming in at exit 19. There are five incidents that occurred close to this exit.
     
     

    Improvements

    Instead of having the color depend on the distance to the response units, it can depend on the average traffic volume at each intersection.

    If the Snap Together could manipulate the interface and not only the objects, a map with all the incidents could be shown in Spotfire. When you select an exit you just zoom in to that intersection.
     
     

    Note

    Only exits 17 and 19 have map backgrounds.

    Don’t move the mouse to the beltway map once you have loaded the background map… or it will load another data set.

  5. WEEKDAY AGGREGATION – 06/30/99

     
     
     

    Figure 28: Weekday aggregation. The incidents on a selected weekday are shown in another display.
     
     

    Questions

    On which day of the week did most of the incidents occur?

    Where did the incidents on Fridays occur?

    How many truck incidents were there on Mondays and where did they occur?
     
     

    Description

    The display to the left shows the different days of the week; Monday is marked in red. The size of a marker depends on how many incidents occurred on that day of the week. If you select one of the days you can see where the incidents occurred in a different display. In this display the duration of the incidents decides the size of the markers, and the color of the markers depends on the type of incident (the seriousness).
     
     

    Demo

    Database: Weekday aggregation.mdb
     
    Table or Query | Visualization | Parameter | Action
    Weekday aggregate | Spotfire | WeekdayID | Select
    Incidents on a weekday | Spotfire | 2, Weekday | Load

     

    Advantages

    It is easy to see that very few incidents occurred during the weekends. Seems to be a nice idea to be able to manipulate different time aspects in one display and show the locations of the incidents in another display. For example show only the incidents on Monday mornings and afternoons. You can’t choose two different time intervals in Spotfire….
     
     

    Disadvantages

    It takes time for Spotfire to load the data set for a selected day. The map has to be set but this can be avoided if the Snap can manipulate the interface. You will then be able to quickly filter out the incidents you want to see.
     
     

    Improvements

    Show the different days as bars instead of just markers. Bar charts don’t generate any highlight events and are not included at all in Snap, but I made a fake version of bar charts anyway.

    Figure 29: Weekday aggregates are shown with bars and Monday is selected. The map shows the incidents on Mondays, but the displays are unfortunately not coupled.

    But what if I only want to see the truck incidents on Mondays? I can manipulate the map and show only the incidents with trucks, but the aggregates will not be recalculated and the display will not be updated. To show what it could look like, I made a fake version of this. The map shows only the truck incidents on the selected weekday (Monday) and the aggregates are recalculated.

    Figure 30: The map shows only the truck incidents on Mondays and the weekday aggregates are updated according to truck incidents.
     
     

    Question

    Is there a day where a kind of vehicle is more likely to be involved in an incident?
     
     

  6. CALENDAR OVER JULY (TEMPORAL AGGREGATION) – 07/01/99

     
     
     

    Figure 31: Temporal aggregation by date. A calendar of July 1997 is shown in Spotfire and if you select a date the incidents on that day are shown in a grid.
     
     

    Questions

    Which day in July 1997 had the most incidents?

    How many incidents occurred on 4th of July 1997?
     
     

    Description

    A calendar over a month is shown in Spotfire. There were incidents on the dates with markers, and the size and color of a marker depend on the number of incidents. If you select a date, a grid will show details of the incidents on that date. In the figure above the incidents on July 16, 1997 are shown in the grid.
     
     

    Demo

    Database: Calendar.mdb
     
    Table or Query | Visualization | Parameter | Action
    Calendar aggregate | Spotfire | DateID | Select
    Incidents on a date | Table | --- | Load

     

    Advantages

    Could be used with data from different years to make comparisons.
     
     

    Disadvantages

    Probably better to show a year instead of only a month – next example shows this.
     
     

    Improvements

    The average number of incidents each date could be added to the display. This makes it easier to compare and see if the number of incidents has increased or decreased. The color of the markers could depend on the weather that date.
     
     

    Note

    For some unknown reason the incidents cannot be shown in Spotfire (it fails to open the data table). It does not work on Yosemite either, so the problem is not specific to Spotfire Pro 4.
     
     

  13. DATE AGGREGATION (AND COMPARISON) – 07/02/99


     
     
     

    Figure 32: Date aggregation. Spotfire shows data from both 1996 (red) and 1997 (blue), but the displays are not coupled.
     
     

    Questions

    Has the number of incidents in July increased compared to last year?

    Compared to the average number of incidents in July?

    Which months or dates are more dangerous?
     
     

    Description

    Spotfire shows data from both 1996 (red markers) and 1997 (blue markers). The size of the markers depends on the number of incidents on that day. The Y-axis shows the months of the year and the X-axis the days of the month. When a day is selected, the incidents on that day (independent of year) are shown in a grid.
     
     

    Demo

    Database: One-year-view.mdb

    Table or Query        Visualization   Parameter   Action
    Date aggregate        Spotfire        ---         ---
    Incidents on a date   Table           ---         ---

     

    Advantages

    It is easy to see whether the number of incidents increased or decreased and whether certain months or days have more incidents.
     
     

    Disadvantages

    The Snap-Together interface is not enough in this case. It only allows one parameter to be passed to another display, and in this case both the day and the month are needed.

    Spotfire shows December (12) at the top and January (1) at the bottom. Adding a new day column that contains a number from 1 to 365 could solve this problem.
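One way such a day column could be computed is sketched below. The function names and the choice of 1997 as a non-leap reference year are assumptions made for illustration. A single number of this kind might also serve as the one parameter passed between displays, although the prototype does not do this.

```python
from datetime import date

def day_number(month, day, reference_year=1997):
    """Day of the year (1-365) for a month/day pair in a non-leap reference year."""
    return date(reference_year, month, day).timetuple().tm_yday

def month_and_day(number, reference_year=1997):
    """Inverse mapping, so a receiving display can recover the month and the day."""
    first = date(reference_year, 1, 1).toordinal()
    d = date.fromordinal(first + number - 1)
    return d.month, d.day

july_4 = day_number(7, 4)          # the same number for every year
back = month_and_day(july_4)       # recovers (7, 4)
```

February 29 does not exist in the reference year, so leap years such as 1996 would need separate handling.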
     
     

    Improvements

    Show the data from one year compared to the average over 5 years. The difficulty is that weekends and holidays do not fall on the same dates every year. On the other hand, you want to be able to see, for example, the 4th of July every year. How do you display it?
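The tension between the two alignments can be made concrete with two grouping keys. This is only a sketch of the choice, not a proposed solution, and the function names are hypothetical.

```python
from datetime import date

def date_key(d):
    """Align by calendar date: the 4th of July lines up across years."""
    return (d.month, d.day)

def weekday_key(d):
    """Align by ISO week and weekday: weekends line up across years."""
    _, week, weekday = d.isocalendar()
    return (week, weekday)

a = date(1996, 7, 4)   # a Thursday
b = date(1997, 7, 4)   # a Friday

same_date = date_key(a) == date_key(b)               # holidays coincide
same_weekday_slot = weekday_key(a) == weekday_key(b) # weekdays do not
```

Averaging with `date_key` keeps fixed-date holidays together but mixes weekdays with weekends; averaging with `weekday_key` does the opposite.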

    Note

    In the figure above the displays are not coupled.

    The data from 1996 is actually the data from 1997 with road number = 0.
     
     

  15. AUTOMOBILE AGGREGATION – 07/06/99


     
     
     

    Figure 33: Automobile aggregation. The incidents with 2 cars are shown in the display at the bottom.
     
     

    Questions

    How many incidents involved 5 cars?

    Did the incidents with two cars last longer than the incidents with four cars?
     
     

    Description

    At the top, a Spotfire display shows automobile aggregates. The X-axis shows the number of cars and the Y-axis the number of incidents. The size of the marker also depends on the number of incidents. If you select the marker with 4 cars, all the incidents with 4 cars are shown in the other display. In that display, the date and the duration of an incident are on the X- and Y-axes. The color of the markers depends on which day of the week the incident occurred (blue = weekday, green = Saturday, red = Sunday).
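The aggregate, the selection and the color rule described above can be sketched in a few lines of Python. The tuples below are hypothetical stand-ins for the incident table, and the function names are assumptions.

```python
from collections import Counter
from datetime import date

# Hypothetical incidents: (date, number of cars, duration in minutes).
incidents = [
    (date(1997, 7, 7), 2, 25),    # Monday
    (date(1997, 7, 12), 2, 40),   # Saturday
    (date(1997, 7, 13), 4, 90),   # Sunday
    (date(1997, 7, 15), 1, 15),   # Tuesday
]

def marker_color(day):
    """Color rule from the description: weekday, Saturday or Sunday."""
    weekday = day.weekday()       # Monday = 0 ... Sunday = 6
    if weekday == 5:
        return "green"
    if weekday == 6:
        return "red"
    return "blue"

# Upper display: one aggregate per number of cars.
aggregates = Counter(cars for _, cars, _ in incidents)

def incidents_with(cars):
    """Lower display: (date, duration, color) for the selected aggregate."""
    return [(d, minutes, marker_color(d))
            for d, n, minutes in incidents if n == cars]

two_cars = incidents_with(2)
```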
     
     

    Demo

    Database: Automobile aggregation.mdb

    Table or Query          Visualization   Parameter        Action
    Auto aggregate          Spotfire        Number of cars   Select
    Incidents with x cars   Spotfire        ---              Load

     

    Advantages

    You can see that incidents with 0-2 cars were the most frequent. At the same time you can examine whether the duration is shorter when fewer cars are involved.
     
     

    Disadvantages (not anymore…)

    It was not possible to add a description display to the Spotfire display at the bottom (if you click on an incident you can see its description in a third display). It works fine with two displays, but when you have three displays and reload the incidents in Spotfire, the connection between the description and the incidents seems to disappear.

    This is fixed now!
     
     
     
     
     
     

  17. INCIDENT CODE AGGREGATION – 07/07/99


     
     
     

    Figure 34: Incident code aggregates. The upper display shows all incidents of type 1050PD (Property Damage). The color depends on weekday and most of the incidents occurred on Mondays and Fridays.
     
     

    Questions

    Which is the most common incident type?

    What are the characteristics of an incident type?

    Does it take longer to clear up a serious incident than a minor one?

    Is there a day of the week when all the car fire incidents occur?
     
     

    Demo

    Database: IncidentCodes.mdb

    Table or Query          Visualization   Parameter      Action
    Code aggregate          Spotfire        IncidentCode   Select
    Incidents with code x   Spotfire        ---            Load

     

    Description

    The lower display shows incident code aggregates, and when you select a type you can see all the incidents of that type in the upper display. In the upper display the X-axis holds the number of cars and the Y-axis the duration of the incident. The color of the markers depends on the weekday. You can see that the duration is shorter when more cars are involved. This is not very surprising, since the incidents are all of the 1050PD type and they are not very serious.

    The difference between the incident types is obvious if you also look at the figure below. In this figure the incidents of the 1050SPI (Serious Personal Injury) type are shown, and the duration seems to increase with the number of cars involved.

    Figure 35: The incidents of type 1050SPI are shown in the upper display.
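The comparison between the two incident types could also be made numerically by averaging the duration per number of cars for each code. The sketch below uses invented values chosen only to mimic the two trends visible in the figures; they are not taken from the incident data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical incidents: (incident code, number of cars, duration in minutes).
# The values are invented to mirror the trends described, not measured results.
incidents = [
    ("1050PD", 1, 40), ("1050PD", 2, 30), ("1050PD", 2, 20), ("1050PD", 3, 15),
    ("1050SPI", 1, 60), ("1050SPI", 2, 90), ("1050SPI", 3, 150),
]

def mean_duration_by_cars(records, code):
    """Average duration per number of cars for one incident code."""
    durations = defaultdict(list)
    for incident_code, cars, minutes in records:
        if incident_code == code:
            durations[cars].append(minutes)
    return {cars: mean(values) for cars, values in sorted(durations.items())}

pd_trend = mean_duration_by_cars(incidents, "1050PD")
spi_trend = mean_duration_by_cars(incidents, "1050SPI")
```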
     
     

    Disadvantages

    The problem with updating the aggregates according to the other display is still present. The markers in the lower display should probably be replaced with a bar chart to make the view simpler.

    For each data set that is loaded in the upper display, the color-coding must be set. This is not fixed yet, since the functions in the API are not working correctly.

  19. DURATION AGGREGATION – 07/08/99

Figure 36: Incidents with duration time between 30 and 60 minutes are shown in the lower display. The green marker in the upper right corner is selected and the location is marked with a star on the map.
 
 

Questions

How many incidents take between 30 and 60 minutes to clean up?

Where are the incidents that cause the longest queues?
 
 

Demo

Database: Duration aggregation.mdb

Table or Query       Visualization   Parameter    Action 1   Action 2
Interval aggregate   Spotfire        IntervalID   Select
Incidents            Spotfire        CaseID       Load       Select
Map                  Spotfire        ---                     Load

 

Description

The upper display to the left shows the number of incidents that had a duration between 0-30 min, 30-60 min, 1-2 hours, 2-3 hours, and more than 3 hours.
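Assigning each incident to one of these five intervals is a simple binning step. A minimal sketch follows. The text does not say which interval a boundary value such as exactly 30 minutes belongs to, so placing it in the higher interval is an assumption, as are the names used.

```python
from bisect import bisect_right
from collections import Counter

# Upper bounds in minutes for all but the last, open-ended interval.
BOUNDS = [30, 60, 120, 180]
LABELS = ["0-30 min", "30-60 min", "1-2 hours", "2-3 hours", "more than 3 hours"]

def interval_id(minutes):
    """Index of the duration interval; a boundary value goes to the higher interval."""
    return bisect_right(BOUNDS, minutes)

# Hypothetical durations in minutes.
durations = [5, 30, 45, 59, 75, 200]

# One aggregate per interval, as in the upper display.
counts = Counter(LABELS[interval_id(m)] for m in durations)
```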

When a time interval is selected the incidents are shown in the lower display. On the Y-axis is the traffic volume (fictional) at the time the incident occurred and on the X-axis is the duration time within the selected interval. The incidents with long duration time and high traffic volume will most likely cause the longest queues. The color of the markers in the upper display depends on weekday.

When an incident in the lower display is selected the location of the incident is marked on the map.
 
 

Disadvantages (not anymore…)

The windows in the figure are not fully coupled. The map is not coupled with the lower display, and this is probably a problem with Snap; see also Automobile aggregation.

The problem arises when there are 3 displays and the lower display (in this case) is reloaded with data. The link between that display and the map is somehow lost when you reload it with data. It works just fine if you only have that display and the map and do not reload the display with incidents from another duration interval. This is also fixed, now that the visualization object is updated correctly.
 
 

Note

The fictional traffic volume is constant at each exit and does not depend on the time at all.