Exploring the correlations between spatiotemporal daily activity-travel patterns and stated interest and perception of risk with self-driving cars

The key to the successful market penetration of Autonomous Vehicles (AVs) lies in identifying the specific needs that AVs satisfy in individuals' daily activity-travel participation. In this paper we explore whether and to what extent people's exhibited spatiotemporal activity-travel patterns correlate with their stated perceptions about self-driving cars. Using sequence analysis, we investigate the 2017 travel diaries of 3,411 survey respondents living in the Puget Sound region of the U.S. In parallel, we apply hierarchical clustering to identify people's attitudes based on their stated interest in and perception of risks about AVs. A multinomial regression model is built to examine the correlations between AV attitude clusters and daily activity-travel patterns, and statistically significant correlations are identified. The model results suggest that people exhibiting different activity-travel behavior patterns also express distinct attitudes towards the uses of AVs. People who travel to work during the day are more likely to be positive towards AVs. In particular, the group traveling to work later than the regular 8-to-5 schedule shows stronger interest in and fewer concerns about AVs, which can be partially explained by the diverse activities they do throughout the day, the variety of travel modes they use, and presumably their need for more schedule flexibility than the public transportation system offers.


Introduction and Overview
We consider autonomous vehicles to be the highest level of autonomy/robotization, in which the car makes most if not all of the driving decisions except selecting the origin, destination, and departure time of a trip. In the specialist literature, these vehicles are called autonomous vehicles, automated vehicles, self-driving cars, driverless cars, and robocars; we treat these terms as synonyms and refer to them as Autonomous Vehicles (AVs). AVs are considered a potentially disruptive and transformative mode of transportation, also when combined with sharing, because they could reshape the landscape of the current transportation system and mobility services. The development of AVs has progressed rapidly due to a push to the market by technology companies and the automotive industry. The Society of Automotive Engineers (SAE) International [1] defines levels of driving automation ranging from 0 (no automation) to 5 (full automation). Automobiles at levels 2 and 3 with self-parking functions and advanced warning systems are already on the market. Although the reality of fully automated vehicles may seem distant, there is an increasing need to understand the impact of AVs on transportation systems and mobility services.
In the growing body of literature, various aspects of AVs are examined (mainly through simulations), including the positive and negative impacts of AVs on our lives and environment. The reported advantages of adopting AVs are numerous: increased mobility options for everyone, especially for people who are disabled, drunk, or inattentive, as well as seniors and children, who would gain better access to options that fit their transportation needs [2]; more effective traffic flow and reduced traffic congestion [3]; increased safety and declining traffic accident and fatality rates [4]; improved productivity and gains in pleasure while traveling in a car [5]; smoother, more comfortable, and less stressful rides [2]; and lower greenhouse gas (GHG) emissions [6]. The negative impacts lie in the possible consequences of safety, security, privacy, and liability related issues [2,7]. While many studies have focused on assessing the impact of AVs, public acceptance of AVs and its determinants have not been fully investigated. Evaluating public acceptance and assessing the type of services desired by markets are critical to the adoption of AVs. The key to the successful market penetration of AVs is to identify the best market segment for early adoption of the technology.
A majority of research examines people's stated preference, acceptance, attitudes, and perceived risk towards AVs using (online) surveys, and correlates them with survey participants' socio-demographic traits such as age, gender, income, and education. Schoettle & Sivak [8,9] show that men are less concerned about adopting this new technology. Young respondents also exhibit fewer concerns [9] and more interest [10] in using AVs. In addition, people with high income are more interested in owning an AV [11].
Yet few studies have attempted to examine the relationship between individuals' dispositions towards AVs and their observed daily activity-travel behavior (e.g., using survey participants' daily travel diaries), which could enable a sharper focus on the market niche(s). To fill this knowledge gap, we pose a central research question in this paper: How do individuals' daily activity-travel patterns relate to their disposition towards the use of AVs?
To answer this question, we analyze 3,411 responses to survey questions about positive and negative dispositions toward self-driving cars from people living in the Puget Sound region of the United States, based on data from the 2017 Puget Sound Regional Household Travel Survey. We extract travel diary information from the same respondents to derive their daily activity patterns using sequence analysis and hierarchical clustering. We then investigate the association between daily activity-travel patterns and AV dispositions.
The remainder of this paper is structured as follows. Section 2 introduces the data used in this study. Section 3 presents the methodology to address the research question, followed by results and findings in Section 4. Conclusions are presented in Section 5.

Data
The data used in this study come from the 2017 Puget Sound Regional Household Travel Survey [12]. The Puget Sound Region in the Northwestern United States is the area that surrounds and includes the City of Seattle. The region encompasses the entire Puget Sound Regional Council (PSRC) four-county region, which includes King, Kitsap, Pierce, and Snohomish counties. The region, which includes 82 cities and towns, has a population of approximately four million persons (and approximately 1,548,788 households), with 730,000 in the City of Seattle and the rest distributed throughout the region in smaller cities. The share of persons in the labor force approaches 70%, and the median household income exceeds $75,000 per year. The region houses many aerospace and information technology companies, and it is the home of major education institutions. Seattle is also consistently found to be one of the most congested cities in the United States [13]. The region is therefore a promising AV market, with the knowledge and income to afford the most expensive car technology.
The PSRC Household Travel Survey, conducted between April and June 2017, collected information at the household and person levels, including socio-demographic (e.g., gender, age, education, employment), geographic (e.g., place of residence at census tract level) and vehicle ownership (e.g., car ownership and fuel type) information, and travel diaries from every respondent within households. In particular, the travel diaries consist of one-day weekday travel diaries from approximately 80% of participants and entire one-week travel diaries from the remaining 20% of participants. In each travel diary, respondents reported every trip they made, travel party, trip purposes, origin and destination type of places and timings, travel mode(s), trip costs and details associated with each mode, and other trip information.
This survey also contains twelve questions about interest and concerns regarding the use of AVs for participants above 18 years old. There are seven questions on the interest of various AV uses (e.g., use for commuting and short trips) and five questions on concerns of AV related issues like concerns on system safety and legal liability.
The data provided through the PSRC portal comprise survey results from 6,254 persons in 3,285 households. From these we select persons who answered the AV questions and completed one-day diaries. This yields 3,411 people who traveled between 3:00 AM on the survey day and 3:00 AM on the following day.

Methodology
Our methodology includes three basic steps:
1. Identify groups of individuals that share similar daily activity-travel patterns by applying sequence analysis.
2. Identify groups of dispositions towards AVs from the questionnaire on interest in and concerns about AVs, utilizing cluster analysis for discrete data.
3. Investigate the correlations between daily activity-travel patterns and attitudes towards AVs using a Multinomial Logit regression model.

Identify Spatiotemporal Daily Activity-Travel Patterns
A sequence is a series of time periods at which a subject can move from one discrete "state" to another. Sequences have been used to describe individuals' activity-travel episodes [14]. They are efficient in capturing many details of the activities and travel, such as the ordering and duration of activities, and the transition from one to another. In this section, we derive daily activity-travel patterns using sequence analysis. First, we construct individuals' daily activity-travel sequences using the one-day travel diary records from the 3,411 participants. For each record, we use the departure and arrival times of trips, and the origin and destination trip purposes (each of which can be a place or an activity), to create the sequence. The finest temporal resolution of all trips is 5 minutes. Therefore, we generate a sequence for each person as a series of 288 states, one for every 5 minutes of the survey day starting at 3:00 AM and ending at 3:00 AM the next day, where each state is an activity, a place, or the state of traveling between places. The eight states used in this study are Home, Work, School, Shopping, Drop off / Pickup (passengers), Travel, Mode Transfer, and Others. Examples of the daily activity-travel sequences are shown in Table 1.
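As a rough illustration of the sequence construction described above, the following sketch expands a list of trips into 288 five-minute states. The trip tuple layout (departure and arrival in minutes after 3:00 AM, plus a destination state), the function name `build_sequence`, and the assumption that the day starts at Home are hypothetical choices made for this example, not the survey's actual data format.

```python
SLOT_MINUTES = 5
SLOTS_PER_DAY = 24 * 60 // SLOT_MINUTES  # 288 five-minute slots


def build_sequence(trips, start_state="Home"):
    """Expand trips into a list of 288 states.

    trips: list of (depart_min, arrive_min, destination_state), with times
    given in minutes after 3:00 AM and sorted by departure (assumed layout).
    """
    sequence = []
    current_state = start_state  # assumption: the diary day starts at Home
    cursor = 0  # index of the next slot to fill
    for depart_min, arrive_min, destination in trips:
        depart_slot = depart_min // SLOT_MINUTES
        arrive_slot = arrive_min // SLOT_MINUTES
        # Stay in the current state until the trip departs.
        sequence.extend([current_state] * (depart_slot - cursor))
        # Mark the trip itself as the Travel state.
        sequence.extend(["Travel"] * (arrive_slot - depart_slot))
        current_state = destination
        cursor = arrive_slot
    # Remain at the last destination until 3:00 AM the next day.
    sequence.extend([current_state] * (SLOTS_PER_DAY - cursor))
    return sequence


# Hypothetical commuter: leaves home 8:00, arrives 8:30, leaves work 17:00, home 17:30.
example = build_sequence([(300, 330, "Work"), (840, 870, "Home")])
print(len(example), example[59:62], example[65:67])
```

A real diary would also need rules for Mode Transfer episodes and for trips that cross the 3:00 AM boundary, which this sketch does not attempt.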
To identify daily travel behavior patterns is to group activity-travel sequences that resemble each other. Sequence alignment is a technique developed to make one sequence the same as another. The operations applied to sequence alignment are substitution and indel (insertion, and deletion). Distance (dissimilarity) between two sequences is defined as the cost to align two sequences, i.e., the number of operations performed and sum of penalties accumulated in the alignment. Penalties for different operations can differ. There are usually many combinations of operations to achieve sequence alignment. In this study specifically, Optimal Matching (OM) edit distance is applied to measure the dissimilarity between sequences. It is defined as the minimal cost to transform one sequence to another. The penalty for substitution is derived from the transition rates between two states in the sequences, i.e., the conditional probability to switch from one state to another.
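The Optimal Matching distance can be sketched with a standard dynamic program. The text states only that substitution penalties are derived from transition rates; the specific form used here (2 minus the two directed transition rates) and the indel cost of 1 are assumptions borrowed from common sequence-analysis practice, and all function names are hypothetical.

```python
from collections import Counter


def transition_rates(sequences):
    """Estimate p(b | a): the share of slots in state a that are followed by b."""
    pair_counts = Counter()
    from_counts = Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            pair_counts[(a, b)] += 1
            from_counts[a] += 1
    return {pair: n / from_counts[pair[0]] for pair, n in pair_counts.items()}


def substitution_cost(a, b, rates):
    """Assumed cost form: 2 - p(b|a) - p(a|b); identical states cost nothing."""
    if a == b:
        return 0.0
    return 2.0 - rates.get((a, b), 0.0) - rates.get((b, a), 0.0)


def om_distance(x, y, rates, indel=1.0):
    """Minimal total cost of substitutions and indels that turns x into y."""
    prev = [j * indel for j in range(len(y) + 1)]
    for i, a in enumerate(x, start=1):
        curr = [i * indel]
        for j, b in enumerate(y, start=1):
            curr.append(min(
                prev[j] + indel,                               # delete a
                curr[j - 1] + indel,                           # insert b
                prev[j - 1] + substitution_cost(a, b, rates),  # substitute
            ))
        prev = curr
    return prev[-1]


toy = [list("HHWWH"), list("HWWWH")]  # H = Home, W = Work (toy data)
rates = transition_rates(toy)
print(om_distance(toy[0], toy[1], rates))
```

Because frequently observed transitions lower the substitution cost, swapping two states that often follow each other is penalized less than swapping states that rarely do.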
A 3,411-by-3,411 dissimilarity matrix is generated based on OM edit distance, where each cell represents the pairwise dissimilarity between two activity-travel sequences in our sample. To identify a small number of groups of sequences that represent similar time-of-day activities and travel patterns in our sample, we use the agglomerative nesting (AGNES) clustering method [15]. Starting with individual sequences, we group them into pairs based on the dissimilarity scores. Then, Ward distance [16] is used to merge sub-clusters with smaller dissimilarity scores. We proceed until all observations are in one cluster. This process can be thought of as a tree (dendrogram) that starts with every sequence as an individual "leaf" and ends with one cluster as the "trunk." The optimal number of clusters is determined by the "elbow" method applied to the within-cluster sum of squares (i.e., the point beyond which increasing the number of clusters does not improve within-cluster homogeneity much).
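The bottom-up merging can be illustrated with a small pure-Python sketch. It applies the Lance-Williams update for Ward's criterion to squared dissimilarities, which is one common way to run Ward clustering on a precomputed matrix; whether the software used in the study does exactly this is an assumption, and `ward_clusters` is a hypothetical name. The sketch expects at least one requested cluster and is far too slow for 3,411 sequences.

```python
def ward_clusters(dist, k):
    """Agglomerate items until k clusters remain.

    dist: symmetric matrix (list of lists) of pairwise dissimilarities.
    Returns the clusters as sorted lists of item indices.
    """
    n = len(dist)
    members = {i: [i] for i in range(n)}
    # Ward's update is applied to squared dissimilarities (assumption).
    d2 = {(i, j): dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(members) > k:
        # Merge the pair of clusters with the smallest current dissimilarity.
        (a, b), d_ab = min(d2.items(), key=lambda kv: kv[1])
        n_a, n_b = len(members[a]), len(members[b])
        updated = {}
        for c in members:
            if c in (a, b):
                continue
            n_c = len(members[c])
            d_ac = d2[(min(a, c), max(a, c))]
            d_bc = d2[(min(b, c), max(b, c))]
            # Lance-Williams formula for Ward's minimum-variance criterion.
            updated[c] = ((n_a + n_c) * d_ac + (n_b + n_c) * d_bc
                          - n_c * d_ab) / (n_a + n_b + n_c)
        # Drop distances involving the merged clusters, then add the new one.
        d2 = {pair: v for pair, v in d2.items() if a not in pair and b not in pair}
        for c, value in updated.items():
            d2[(c, next_id)] = value  # new ids are always the largest
        members[next_id] = members.pop(a) + members.pop(b)
        next_id += 1
    return sorted(sorted(group) for group in members.values())


points = [0, 1, 10, 11, 30]  # toy one-dimensional data
toy_dist = [[abs(p - q) for q in points] for p in points]
print(ward_clusters(toy_dist, 3))
```

Running the function for a range of cluster counts and recording within-cluster dispersion at each step would give the curve on which the "elbow" is read.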
While the clusters capture the general daily activity-travel patterns, summary quantitative measures can be used to summarize the complexity of an activity-travel sequence, travel time budget in the daily activities, and within each sequence the variation of trip modes selected by each respondent. We first introduce Shannon Entropy as follows.
$$h(x) = -\sum_{i=1}^{s} \pi_i \log \pi_i \qquad (1)$$
where $x$ is a sequence, $s$ is the number of possible states, and $\pi_i$ is the proportion of occurrences of the $i$th state in the considered sequence. The proportion of minutes allocated to each state over the course of the entire day and the number of distinct states drive the value of Entropy. For this measure, the number of state changes and the contiguity of states do not matter. It simply uses the proportion of total time spent in each state, regardless of the number of different episodes that time is spread over.
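A minimal sketch of the Entropy measure, assuming the natural logarithm and treating states that never occur as contributing zero; the function name is hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Entropy of the time shares of each state in one sequence."""
    n = len(seq)
    # Only observed states appear in the Counter, so log(0) never occurs.
    return -sum((c / n) * math.log(c / n) for c in Counter(seq).values())


# Two sequences with the same time shares but different ordering.
blocked = ["Home"] * 4 + ["Work"] * 4
alternating = ["Home", "Work"] * 4
print(shannon_entropy(blocked), shannon_entropy(alternating))
```

Both toy sequences give the same value, which illustrates that Entropy depends only on the share of time in each state and not on how the episodes are ordered.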
Complexity of a sequence is defined in Equation 2 [17]:
$$C(x) = \sqrt{\frac{\ell_d(x) - 1}{\ell(x) - 1} \cdot \frac{h(x)}{h_{max}}} \qquad (2)$$
It is a function of the Entropy $h(x)$ of sequence $x$ and the number of transitions in the sequence, $\ell_d(x) - 1$ (where $\ell_d(x)$ is the length of the sequence of distinct successive states), normalized by the maximum theoretical entropy ($h_{max}$) and the maximal number of transitions, which is the length of the sequence minus one ($\ell(x) - 1$).
Complexity always has a value between 0 and 1, with zero corresponding to Entropy zero and no transitions (e.g., staying at a single place for the entire day of the observation). We use it to handle very long sequences, and it is based on the concept of Entropy and transitions between distinct states. The explanation follows McBride et al. [14,18] closely. High complexity indicates more states and frequent changes of state. Complexity reaches the maximum of 1 only when a sequence has all possible states and changes its states in every time period. Therefore, people who do different activities will have more complex sequences. The sequence of Person 2 in Table 1 has the highest complexity since this person has more activities in terms of diversity and transitions.
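The Complexity indicator can be sketched as follows. The square-root (geometric mean) combination of the normalized transition count and normalized entropy is an assumption based on the usual definition of this index in sequence-analysis software, and the natural logarithm and function names are likewise assumed.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Entropy of the time shares of each state in one sequence."""
    n = len(seq)
    return -sum((c / n) * math.log(c / n) for c in Counter(seq).values())


def complexity(seq, n_states):
    """Complexity in [0, 1] from transitions and entropy (assumed form).

    n_states: size of the state alphabet (eight in this study); needs >= 2.
    """
    if len(seq) < 2:
        return 0.0
    # A transition is any change of state between consecutive slots.
    transitions = sum(1 for a, b in zip(seq, seq[1:]) if a != b)
    max_transitions = len(seq) - 1
    h_max = math.log(n_states)  # entropy when time is split evenly over all states
    return math.sqrt((transitions / max_transitions) * (shannon_entropy(seq) / h_max))


stay_home = ["Home"] * 288
commuter = ["Home"] * 60 + ["Travel"] * 6 + ["Work"] * 102 + ["Travel"] * 6 + ["Home"] * 114
print(complexity(stay_home, 8), complexity(commuter, 8))
```

The stay-at-home day scores zero, while the toy commuter day scores a small positive value because it has only four transitions across 288 slots.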
Travel Time Ratio (TTR) [19] is an indicator of the trade-off people make between travel time and activity time. In this paper, TTR is defined as the total travel time in a day divided by the sum of the total time in activities outside home and the total travel time in a day. It should be noted that the daily patterns we derive here are for persons who made at least one trip on the weekday of the interview. A large TTR is sometimes undesirable because it implies that people spend more time travelling and less time on activities. It also suggests that the travel cost of the activities is high.
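The TTR definition above translates directly into a count over the five-minute states. In this sketch, which states count as travel is a parameter: treating only "Travel" as travel time (and Mode Transfer as out-of-home activity) is an assumption, and the function name is hypothetical.

```python
def travel_time_ratio(seq, home="Home", travel_states=("Travel",)):
    """Travel time divided by (out-of-home activity time + travel time)."""
    travel = sum(1 for s in seq if s in travel_states)
    # Every slot that is neither at home nor travelling is out-of-home activity.
    activity = sum(1 for s in seq if s != home and s not in travel_states)
    if travel + activity == 0:
        raise ValueError("sequence has no travel or out-of-home activity")
    return travel / (travel + activity)


commuter = ["Home"] * 60 + ["Travel"] * 6 + ["Work"] * 102 + ["Travel"] * 6 + ["Home"] * 114
loop_trip = ["Home"] * 100 + ["Travel"] * 8 + ["Home"] * 180  # e.g., a jog from home
print(travel_time_ratio(commuter), travel_time_ratio(loop_trip))
```

The loop trip has no out-of-home activity time, so its ratio equals 1, the upper bound of the indicator.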
In travel behavior, it is also important to study the frequency with which a person switches travel means (called mode). One way to measure this switching is to use the Gini index, which quantifies the daily variation of mode choices:
$$G(x) = 1 - \sum_{i=1}^{m} p_i^2 \qquad (3)$$
In Equation 3, $x$ is a sequence of daily trips, $m$ is the total number of modes used, and $p_i$ is the proportion of the $i$th mode in the considered sequence of mode choice.
Gini takes values between 0 and 1. It is zero when only one type of mode is used for all the trips. A greater Gini coefficient indicates that more types of modes are used in the daily trips. Person 5 in Table 1 has a Gini of 0, implying that this person uses only one mode to travel throughout the survey day. The three indicators depict daily activity-travel behavior from different angles. Thus, they are computed for all 3,411 sequences in our sample. Table 1 shows examples of activity-travel sequences and their corresponding Complexity, TTR, and Gini indicators.
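A minimal sketch of the mode-variation indicator, assuming the Gini index takes the impurity form (one minus the sum of squared mode shares), which matches the stated properties of being zero for a single mode and growing as more modes are used. The function name and mode labels are hypothetical.

```python
from collections import Counter


def gini_index(modes):
    """One minus the sum of squared shares of each mode over a day's trips."""
    n = len(modes)
    return 1.0 - sum((c / n) ** 2 for c in Counter(modes).values())


print(gini_index(["car", "car", "car"]))           # single mode
print(gini_index(["car", "bus", "walk", "bus"]))   # mixed modes
```

With this form the index can never reach 1 exactly; its maximum for m equally used modes is 1 - 1/m.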

Extract Individual's Attitudes on AVs
Twelve questions about AVs were asked in the 2017 PSRC Household Travel Survey, including seven questions on interest in AV uses and five questions on perceived risks of AV uses. These questions are preceded by a statement on AVs: "Autonomous cars, also known as 'self-driving' or 'driverless' cars, are capable of responding to the environment and navigating without a driver controlling the vehicle. Advantages of autonomous car usage include the potential for reduced congestion, increases in parking capacity, and faster travel times." [12]
What is your level of interest (AVinterest herein) in the following uses of autonomous cars? (with levels being very interested, somewhat interested, neutral, somewhat uninterested, not at all interested, and don't know)
1. Taking a taxi ride in an autonomous car with no driver present
2. Taking a taxi ride in an autonomous car with a back-up driver present
3. (If commutes) Commuting alone using an autonomous vehicle
4. (If commutes) Commuting with others (carpool) using a shared autonomous vehicle
5. Owning an autonomous car
6. Participating in an autonomous car-share system for daily travel
7. Riding in an autonomous car for a short trip to get to a vehicle (e.g., from airport terminal to parking lot)
How concerned (AVconcern herein) are you about the following potential issues related to autonomous cars? (with levels being very concerned, somewhat concerned, neutral, somewhat unconcerned, not concerned at all, and don't know)
8. Equipment and system safety
9. Legal liability for drivers or owners
10. System and vehicle security
11. Capability to react to the environment (other cars, bicyclists, pedestrians, etc.)
12. Performance in poor weather or other unexpected conditions
The overall survey results for all twelve questions from the 3,411 respondents are shown in Fig. 1. As can be seen, fewer than one-fifth of participants are very interested in the many uses of AVs. However, the degree of interest varies by type of use. Riding in an AV for a short trip is the most favored use among the seven, followed by commuting alone using an AV. It is possible that interest in using AVs comes from people whose daily schedule would be better served by an AV than by existing options. As for perception of risks, more than two-thirds of the respondents show concerns. The capability to react to the environment concerns the most people.
Before proceeding with the clustering of AV responses, we need to check internal consistency of the AV interest and concerns using Cronbach's alpha and McDonald's omega that account for the strength of association between items [20]. The AVinterest items yield alpha = 0.95 and omega = 0.96 and the AVconcern items yield alpha=0.95 and omega=0.96. The high values of alpha and omega suggest substantial internal consistency and reveal homogeneity, meaning that a person that is positive towards an AV taxi is also positive towards ownership of an AV. However, no strong correlations between AVinterest items and AVconcern items are found, which means the two aspects of responses are close to orthogonal and capture different dimensions of attitudes. To extract the overall attitudes on AVs, we continue as follows. We first treat an individual's responses as a vector with length of twelve, since the twelve items in the questionnaire were developed as a group to discern people's views about autonomous vehicles from different perspectives, and therefore should be considered jointly. Each of the item responses is treated as a categorical variable that can draw values from the seven categories: very, somewhat, neutral, somewhat, not at all, don't know, and no answers (for the commuting variables not applicable for people that do not commute). Although the item response scale is a Likert-like scale, it includes the "don't know" and "no answer" categories, violating the original Likert design. Therefore, treating the answers as categorical variables with no order could avoid imposition of structure among "don't know" and "no answer". In this way, we also avoid imposing a rank order and making assumptions about the interval between answers. 
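The internal-consistency check mentioned at the start of this subsection can be illustrated for Cronbach's alpha. The sketch assumes the item responses have been coded numerically (for example 1 to 5) with "don't know" and "no answer" handled beforehand; that coding, the toy data, and the function name are assumptions, since the text does not state how the items were scored for this check.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' numeric item scores.

    rows: one list of k item scores per respondent (assumed numeric coding).
    """
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Toy data: four respondents answering three items on a 1-5 scale.
toy_scores = [[1, 2, 1], [2, 2, 3], [4, 3, 4], [5, 5, 4]]
print(cronbach_alpha(toy_scores))
```

Alpha approaches 1 when respondents who score high on one item also score high on the others, which is the homogeneity the reported values point to.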
For instance, one person's responses to the twelve questions are: not at all interested, somewhat uninterested, no answer, no answer, not at all interested, not at all interested, neutral, somewhat concerned, neutral, somewhat concerned, very concerned, somewhat concerned.
To group similar responses, we first create a dissimilarity matrix using Gower distance [21], which is designed for data coded as categories or as a mix of categorical and continuous variables. Then, we compute the within-cluster sum of squares using different numbers of clusters for the AGNES clustering method. The optimal number of clusters is selected based on the "elbow" method applied to the within-cluster sum of squares.
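When every variable is an unordered category, Gower distance reduces to the share of items on which two respondents give different answers. The sketch below illustrates that special case with equal item weights, which is an assumption; the function names and the second response vector are hypothetical.

```python
def gower_categorical(a, b):
    """Share of items on which two categorical response vectors differ."""
    if len(a) != len(b):
        raise ValueError("response vectors must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def dissimilarity_matrix(responses):
    """Full symmetric matrix of pairwise Gower distances."""
    return [[gower_categorical(r, q) for q in responses] for r in responses]


person_a = ["not at all interested", "somewhat uninterested", "no answer", "no answer",
            "not at all interested", "not at all interested", "neutral",
            "somewhat concerned", "neutral", "somewhat concerned",
            "very concerned", "somewhat concerned"]
# Hypothetical second respondent who differs on the last three concern items.
person_b = person_a[:9] + ["very concerned", "neutral", "very concerned"]
print(gower_categorical(person_a, person_b))
```

Because "don't know" and "no answer" are simply additional categories here, no ordering or spacing between answers is imposed.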

Cluster to Cluster Multinomial Logistic (MNL) Regression Model
We utilize a Multinomial Logistic regression model [22] to study the relationship between the patterns derived from daily activity-travel sequences and the clusters of attitudes to AVs in terms of interest and concerns.
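For readers less familiar with the model, the sketch below shows how a multinomial logit turns explanatory variables into category probabilities, with the reference category's coefficients fixed at zero. The coefficient values, category labels, variable layout, and function name are purely illustrative and are not estimates from this study.

```python
import math


def mnl_probabilities(x, coefficients):
    """Category probabilities from a multinomial logit.

    x: list of explanatory variable values for one person.
    coefficients: dict mapping each category to its coefficient list;
    the reference category carries all zeros.
    """
    utilities = {cat: sum(b * v for b, v in zip(beta, x))
                 for cat, beta in coefficients.items()}
    # Subtract the maximum utility before exponentiating for numerical stability.
    top = max(utilities.values())
    exps = {cat: math.exp(u - top) for cat, u in utilities.items()}
    total = sum(exps.values())
    return {cat: e / total for cat, e in exps.items()}


# Illustrative only: x = [Late Work Day dummy, Complexity]
illustrative_coefs = {
    "reference": [0.0, 0.0],
    "category_A": [1.0, 0.5],
    "category_B": [0.2, -0.3],
}
probs = mnl_probabilities([1, 0.1], illustrative_coefs)
print(probs)
```

The ratio of any category's probability to the reference category's probability equals the exponential of that category's linear predictor, which is why exponentiated coefficients are read as odds ratios.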

Results and Findings

Five Spatiotemporal Daily Activity-Travel Patterns
Five clusters are identified in the travel diaries from the 3,411 respondents. Fig. 2 shows these daily patterns with cluster names based on the daily travel pattern each cluster exhibits. The descriptive statistics of the Complexity, TTR, and Gini indicators for each cluster are shown in Table 2.
Run Errands Day Cluster contains over one-third of the sample (n = 1,301; 38.1%). People with this daily pattern go out for some activities other than work and spend a substantial amount of time at home. The activities are also spread relatively evenly across the day. It is notable that some respondents also have school activities for a portion of their day. The cluster has a low average Complexity indicator of 0.0894, showing that people's activity-travel pattern is relatively simple. This pattern also has the highest mean, median, and maximum TTR, which is consistent with our observation of a simple daily pattern. The maximum TTR of 1 suggests that people in this group also make loop trips (i.e., trips that start and end at home, such as going out for a jog or walking a dog).
Typical Work Day Cluster contains 986 persons (28.9% of the sample). This is the typical commuting pattern, similar to that found in other analyses for California [18], where people travel in the early morning to work, take a lunch break, return to work in the afternoon, and visit some other locations, usually before going back home. High Complexity and low TTR are observed in this group due to the diverse activities throughout the day. The median Gini of zero implies that more than half of the people in this cluster use only one mode (usually cars) to travel.
Late Work Day Cluster shows the daily pattern of 898 people (26.3%). Compared to people with a typical work day pattern, people in this pattern start working later and also finish later. It is worth noting that people in this group also allocate more time to other activities than the Typical Work Day people. Another feature that differentiates them is Gini: not only do they have more activities, but they also travel using combinations of more modes. The mean, median, and maximum Complexity of this cluster are consistently higher than those of all other groups, in line with our observation of more variation in activities.

Very Late Work Day Cluster is the least populous cluster, with only 106 people (3.1%). These people start work very late and have irregular schedules. Travel accounts for a small portion of their daily time use, which is also reflected in the low Complexity and Gini index.
Mostly Out of Home Day Cluster includes people who spend considerable time in their second homes, hotels, camping grounds, and all other places that could not be assigned as the primary home location. Only 120 people (3.5%) belong to this group. Notably, the mean and median Gini in this group are much higher than in the other four clusters, which reflects traveling by combinations of modes. The overall low TTR suggests that they spend a large portion of their time on activities.

Individual Attitudes and Risk Perceptions on AVs
We extract five different attitude clusters from the answers to the twelve questions about AVs. The clusters are labeled Uninterested Concerned, Somewhat Interested Concerned, Neutral Neutral, Interested Unconcerned, and Uninterested Unconcerned. This labeling is based on the visualization of the individual responses from all 3,411 people using the heatmap in Fig. 3. The plotting order of the responses in the heatmap is not arbitrary but based on clusters; similar responses from the same cluster are plotted together. The responses forming the five aforementioned clusters are plotted from bottom to top. The colors (responses) within each cluster look homogeneous for each of the two aspects of the questions, showing that the clusters we identified indeed bring people with similar attitudes together. The clear distinction between clusters divides persons into "positive", "negative", and "uncertain" about autonomous cars for both interest and concerns. Approximately one third of people (n = 1,113, 32.6%) show a negative attitude to AVs, as they are uninterested in and concerned about AVs. Three groups of participants express some interest in AVs to varying degrees: the Somewhat Interested Concerned (n = 956, 28.0%), the Neutral Neutral (n = 679, 19.9%), and the Interested Unconcerned (n = 365, 10.7%). The remaining 8.7% (n = 298) are neither interested nor concerned.

MNL Modeling Results
The multinomial logit (MNL) regression model used to correlate people's AV dispositions includes, as explanatory variables, the five daily patterns coded as dummy variables (with the Run Errands Day cluster set as the contrast) together with the Complexity, TTR, and Gini index of each individual activity-travel sequence. Using the Uninterested Concerned cluster (the "negative" attitude category) as the reference category makes it easier to recognize variables that are associated with more "positive" attitudes in the MNL results. Results of the model are presented in Table 3, in which the coefficients take the form of odds ratios for ease of interpretation (all the variables are significantly associated with attitudes to AVs at the 0.05 significance level). An odds ratio shows how the odds of choosing one category (of the AV attitude variable) over another change with a change in the explanatory variable. If an odds ratio is greater than 1, the change in the explanatory variable increases the odds of choosing that category over the reference category. Conversely, the odds decrease when an odds ratio is less than 1. For instance, the odds ratio for the Late Work Day pattern in the Interested Unconcerned category is 2.788, meaning that, all else equal, the odds of a person with the Late Work Day activity-travel pattern being interested and unconcerned about AVs rather than uninterested and concerned are 2.788 times the corresponding odds for a person with the Run Errands Day pattern.
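A short worked example may help with reading an odds ratio such as 2.788. Odds ratios multiply odds rather than add to probabilities. The baseline odds of 0.30 used here is a made-up value for illustration, not a quantity reported in Table 3, and the function name is hypothetical.

```python
def apply_odds_ratio(baseline_odds, odds_ratio):
    """Scale the odds of a category versus the reference category.

    Returns the new odds and the implied share of the category when only
    that category and the reference category are compared.
    """
    new_odds = baseline_odds * odds_ratio
    share_within_pair = new_odds / (1.0 + new_odds)
    return new_odds, share_within_pair


# Hypothetical baseline: Run Errands Day odds of Interested Unconcerned
# versus Uninterested Concerned assumed to be 0.30.
new_odds, share = apply_odds_ratio(0.30, 2.788)
print(new_odds, share)
```

Under this made-up baseline the odds rise from 0.30 to about 0.84, so within that pair of categories the positive category is still the less likely one, but far less so than before.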
With other factors controlled for, daily activity-travel patterns play a statistically significant role in people's attitudes to AVs. Compared to the Run Errands Day cluster, all other daily patterns have higher odds ratios of being more positive towards AVs. Specifically, high odds ratios for the Typical Work Day and Late Work Day patterns are observed in the "positive" attitude categories (compared to the reference category Uninterested Concerned), i.e., the Neutral Neutral and Interested Unconcerned. Both these groups exhibit a high Complexity index, indicating that people in these groups have more variety in their daily activities. Hence, the positive AV inclination may reflect people's strong demand for travel arising from the high number of activities throughout their day. In particular, the odds ratios for the Late Work Day cluster are consistently the highest in all three "positive" categories. This is also consistent with the high Gini index in this group; that is to say, they have more variation in their mode choices (not just cars) compared to, for example, the Typical Work Day cluster. Possibly the Late Work Day people are actively looking for travel alternatives other than cars or public transit to avoid congestion and/or to complement the less frequent public transportation services after the regular peak periods.
It is also interesting to see that the odds ratio for people in the Very Late Work Day cluster is highest, at 2.393, in the Uninterested Unconcerned category. Their low activity frequency (reflected in the low Complexity and Gini) and thus low demand for travelling might be a strong contributor to this attitude. We note that the odds ratio for the Mostly Out of Home Day people of being very positive (Interested Unconcerned category) is especially high, which could partially be explained by their high Gini (the diversity in travel modes used for their daily activities and travel).
In summary, using daily activity-travel patterns to explain the negative or positive predispositions towards AVs helps us identify at least two market segments that are likely to be early adopters of AV technology. People with late work schedules are most likely to favor AVs. People in the Mostly Out of Home Day group form the second market segment. Measures such as Complexity and Gini capture the individual variation within each daily group. The key finding is that AVs are preferred by people who have complex schedules and who use different modes to travel.

Conclusions
In this study we analyze the 2017 PSRC Household Travel Survey data to study the association between people's daily activity-travel patterns and their attitudes to the use of AVs. In particular, we identify five distinct daily activity-travel patterns using the travel diaries of 3,411 survey participants: the Typical Work Day, Late Work Day, Very Late Work Day, Run Errands Day, and Mostly Out of Home Day patterns. Daily activity-travel summary measures including Complexity, TTR, and Gini are also computed to characterize each individual's activity-travel sequence. We also extract five clusters of people who hold different attitudes to AVs, i.e., Uninterested Concerned, Somewhat Interested Concerned, Neutral Neutral, Interested Unconcerned, and Uninterested Unconcerned. A multinomial logistic regression model is built to examine the correlation between people's daily activity-travel patterns and their attitudes towards AVs. We find systematic differences in the positive and negative attitudes towards AVs that depend on the timing of travel decisions in a day and the variety of modes used. This means that a more detailed pin-pointing of the barriers people face in their daily scheduling choices will help AV developers design solutions for niche markets.
Our study is the first of its kind in correlating daily patterns with positive and negative predispositions towards AVs. In the next steps we plan to analyze the composition of each cluster (daily patterns and AV interest/concerns) in terms of the social and demographic characteristics of respondents. We also plan to do this over time using repeated cross-sectional data from this region. One limitation of this analysis is that it does not correlate AV predispositions with respondents' use of other technologies (e.g., ownership of electric cars or advanced computational technologies at home and work). In addition, car ownership and use decisions are often made at the household level via within-household negotiations and task allocation. Studying the correlation of AV dispositions within households is also left as a future task.