The bike sharing system (BSS), also known as the public bicycle system, has been introduced as a part of urban transport systems to provide the missing link between public transport facilities and the desired destinations[1-4]. More than 1 000 cities around the world are estimated to have started bike sharing programs[5]. In China, the dock-based BSS had been implemented in hundreds of cities by 2016, when Mobike, a bike sharing company, launched its dockless BSS. In a dockless BSS there are no bike stations: instead of picking up or returning bikes at fixed stations, users can park a bike anywhere and lock it to finish the trip. Owing to this flexibility, the dockless BSS has become a prevailing transfer mode for connecting to public transport, especially urban rail transit[6]. The huge demand and high turnover rate of dockless shared bikes give the bikes around urban rail transit stations regular temporal and spatial characteristics. Analyzing the temporal and spatial usage patterns of the dockless BSS can therefore promote the cooperation between shared bikes and urban rail transit.
Due to its relatively short history, very few studies have focused on the dockless BSS. Pal and Zhang[7] made the first effort to solve the static rebalancing problem of the dockless BSS using single and multiple vehicles. Wang and Ouyang[8] indicated that dockless shared bikes presented disequilibrium in rail transit station areas and explored the influencing factors of the disequilibrium using 5-weekday dockless bike-sharing OD data in Beijing. However, few studies have analyzed the spatiotemporal characteristics of dockless shared bikes. Fortunately, earlier studies on modelling the usage patterns of the traditional dock-based BSS provide inspiration for analyzing the usage patterns of the dockless BSS. Froehlich et al.[9] developed the normalized available bicycles (NAB) criterion to characterize bike stations, and applied the K-means clustering method to identify usage patterns of dock-based shared bikes. The NAB criterion was further used by Lathia et al.[10] to analyze the impact of a change in the user-access policy on the shared bike scheme in London. In addition, O’Neil and Caulfield[11] examined how bike sharing users integrate their trips with public transit and classified bike stations into three types: go-from stations, go-to stations and self-sustainable stations. Chabchoub and Fricker[12] pointed out that NAB has the drawback that rebalancing operations affect the clustering of BSS stations. They instead used cumulative arrivals and departures to characterize stations and normalized the criterion by the dock station capacity. A similar criterion, rebalanced effective usage (REU), was put forward by de Chardon et al.[13].
The studies mentioned above analyzed the usage patterns of dock-based shared bikes based on station-level data, and the capacity of a bike station plays a critical role in the clustering process. However, as there are no bike stations in the dockless BSS, the temporal and spatial distribution of the bikes is more irregular. Given this context, this study focuses on the areas surrounding urban rail transit stations, where the demand for shared bike trips is huge and the bike turnover rate is high. Temporal and spatial decomposition methods are proposed to explore the usage patterns of the dockless BSS. The results can provide helpful information for planners and managers to improve the service quality and transfer efficiency of both systems.
By the end of June 2017, six urban rail transit lines had been built and put into operation in Nanjing, China, namely line 1, line 2, line 3, line 10, line S1 and line S8. The total length of the rail transit network was 225 km and the number of rail transit stations reached 113. The urban rail transit station data used in this paper include the name of the station, the corresponding rail transit line, the station number, and the longitude and latitude coordinates, as shown in Tab.1.
Tab.1 The urban rail transit station data
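| Station number | Name of the station | Rail transit line | Longitude coordinate | Latitude coordinate |
|---|---|---|---|---|
| 1 | Olympic Centre | 10 | 118.718 306 | 32.009 078 |
| 2 | Yuantong | 10 | 118.721 454 | 31.995 474 |
| 3 | Zhongsheng | 10 | 118.727 722 | 31.990 112 |
| 4 | Xiaohang | 10 | 118.738 281 | 31.984 497 |
| 5 | Andemen | 1 | 118.762 113 | 31.990 863 |
| 6 | Zhonghuamen | 1 | 118.774 470 | 32.006 751 |
| ︙ | ︙ | ︙ | ︙ | ︙ |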
The shared bike data used in our research are the automatically collected transaction records of Mobike, the major dockless BSS in Nanjing, for the five weekdays from Sept 18 to Sept 22, 2017. The dataset includes the user id, bike id, trip start time, trip end time, and the origin and destination of each trip with latitude and longitude coordinates, as shown in Tab.2. On average, about 300 000 users and 150 000 bikes were active per day. On each weekday, 8:00 to 9:00 and 18:00 to 19:00 are the peak periods for picking up and returning bikes. This strong link between bike usage and weekday peak hours suggests that the BSS is mainly used for work-related trips.
Tab.2 The dockless shared bike trip data
| User id | Bike id | Trip start time | Trip end time | Origin longitude | Origin latitude | Destination longitude | Destination latitude |
|---|---|---|---|---|---|---|---|
| 45e98…88bd2 | 862…68 | 18:25:35 | 18:28:39 | 118.808 254 | 32.052 393 | 118.815 083 | 32.231 037 |
| c2d73…2e849 | 862…97 | 18:25:47 | 18:34:38 | 118.733 422 | 31.987 824 | 118.730 417 | 32.226 853 |
| 0e114…57b96 | 025…40 | 18:25:52 | 18:49:07 | 118.743 041 | 32.093 384 | 118.742 262 | 32.039 587 |
| d5059…71feb | 025…34 | 18:25:34 | 18:58:34 | 118.750 067 | 32.064 541 | 118.748 185 | 31.999 629 |
To describe the geographic distribution of dockless shared bike use in Nanjing intuitively, we plotted the kernel density map of dockless shared bike use on Sept 22, as shown in Fig.1. Dockless shared bike use is mainly distributed around the central urban area of Nanjing, and a considerable proportion is also found around urban rail transit stations. In this paper, the rail transit station area (RTSA) is defined as a circular area centered on the rail transit station with a radius of 150 m. In Ref.[11], a radius of 200 m is used to count public transport facilities around dock-based BSS stations. As dockless shared bike users can park the bike as close as possible to the destination, we can infer that the appropriate radius for the dockless BSS is less than 200 m. Moreover, we explored the distribution of shared bike trips with a trip end falling within 100, 150 and 200 m of rail transit stations in Nanjing, respectively. We found that the distribution of bike trips is the most intensive within the 150 m area, and becomes uneven and sparse beyond 150 m. Therefore, a radius of 150 m is adopted in this study.
Fig.1 Kernel density map of dockless shared bike use
The RTSA is used for extracting those shared bike trips with trip ends falling into this area. The shared bike trip database contained 1 830 733 bike trips in Nanjing on five weekdays. Moreover, the bike trips in the RTSA contained 448 530 records, which accounted for 24.5% of the total trips in Nanjing. Each trip in the RTSA can then be identified as an arrival or departure state based on which trip end is located in the RTSA. For example, when the trip end of origin falls into the RTSA, the trip is identified as a departure state. Similarly, the trip end of the destination corresponds to the arrival state.
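The extraction step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the haversine distance, the Earth-radius constant, and the dictionary keys (`o_lon`, `o_lat`, `d_lon`, `d_lat`, `lon`, `lat`) are assumptions introduced here; only the 150 m radius and the departure/arrival rule come from the text.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius (assumption of this sketch)
RTSA_RADIUS_M = 150.0         # RTSA radius adopted in the text


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two (lon, lat) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def classify_trip(trip, station, radius_m=RTSA_RADIUS_M):
    """Return the states a trip takes for one station.

    'departure' if the origin lies inside the RTSA, 'arrival' if the
    destination does; the set is empty when neither trip end is inside.
    """
    states = set()
    if haversine_m(trip["o_lon"], trip["o_lat"], station["lon"], station["lat"]) <= radius_m:
        states.add("departure")
    if haversine_m(trip["d_lon"], trip["d_lat"], station["lon"], station["lat"]) <= radius_m:
        states.add("arrival")
    return states
```

Applying `classify_trip` to every trip and station pair is quadratic in the worst case; for a dataset of the size reported here a spatial index would likely be preferable, but that choice is outside what the text specifies.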
The general spatiotemporal characteristics of shared bike trips (trip time and trip distance) in the RTSA are summarized in Tab.3. The average trip time ranges from 6 to 15.1 min with a mean of 8.6 min, and the average trip distance ranges from 640 to 1 978 m with a mean of 1 024 m. We also calculated the average parking time of shared bikes in the RTSA, which ranges from 6 to 8 h, indicating that the turnover rate is approximately 3 to 4 times/day.
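As a rough check of the turnover figure, one can assume that a bike alternates between parking periods and trips over 24 h, and that the trip time (about 8.6 min on average) is negligible next to the parking time. Under that assumption, which is ours rather than a statement from the data:

$$\text{turnover rate} \approx \frac{24\ \text{h}}{\text{average parking time}}, \qquad \frac{24}{8} = 3, \qquad \frac{24}{6} = 4\ \text{times/day}$$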
Tab.3 Description of trip time and trip distance
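| Parameter | Minimum | 1st quartile | Median | Mean | 3rd quartile | Maximum | Standard deviation |
|---|---|---|---|---|---|---|---|
| Trip time/min | 6 | 7.7 | 8.3 | 8.6 | 9.2 | 15.1 | 1.3 |
| Trip distance/m | 640 | 922 | 989 | 1 024 | 1 106 | 1 978 | 193 |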
To demonstrate the temporal dynamics of shared bike usage, we discretized the 24 h of one day into 15-minute bins, resulting in 96 bins per day. The criterion defined in this paper, the dynamic variation of bikes (DVB), is the cumulative sum of bike arrivals minus departures over the time intervals. The average DVB for a weekday is calculated by averaging the DVB at the same time bins of each day. In Chabchoub and Fricker’s research[12], the capacity of the dock-based shared bike station was used for normalization to balance the differences among bike stations. Since there are no bike stations in a dockless BSS, the maximum value of the DVB is used as the denominator to compare different rail transit stations on the same scale. For each station, we divided the average DVB series by its maximum value for normalization. The result, termed the normalized dynamic variation of bikes (NDVB), lies within [-1, 1]. For each station $i$, $\lambda_{i,t}$ and $\mu_{i,t}$ are the average numbers of arrivals and departures in time interval $t$ (between $t$ and $t+1$) on weekdays, and $\lambda_{i,t}-\mu_{i,t}$ is the corresponding dynamic variation of bikes. Taking the maximum over the absolute values of the cumulative series, which is what keeps the result within [-1, 1], the NDVB can be given as

$$\mathrm{NDVB}_{i,t} = \frac{\sum_{\tau=1}^{t}\left(\lambda_{i,\tau}-\mu_{i,\tau}\right)}{\max_{1\le t'\le 96}\left|\sum_{\tau=1}^{t'}\left(\lambda_{i,\tau}-\mu_{i,\tau}\right)\right|}$$
Fig.2(a) illustrates the distribution of the numbers of arrival trips and departure trips. Some features can be found by examining the NDVB curve shown in Fig.2(b). For instance, a positive NDVB value (NDVB∈[0, 1]) means more arrivals than departures, so the shared bikes in the RTSA are accumulating. Similarly, a negative NDVB value (NDVB∈[-1, 0]) indicates that the shared bikes in the corresponding RTSA are dispersing. The slope of the NDVB curve is the rate of accumulation or dispersion, and the areas s1 and s2 are the total arrivals and departures, respectively.
(a)
(b)
Fig.2 Using arrivals and departures to generate the distribution of NDVB.(a) Distribution of arrivals and departures; (b) Distribution of NDVB
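The temporal decomposition can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: trip times are given as `HH:MM:SS` strings as in Tab.2, the function names are hypothetical, and the normalization divides by the maximum absolute value of the cumulative series so that the result stays within [-1, 1].

```python
from datetime import datetime

BINS_PER_DAY = 96  # 24 h split into 15-minute bins, as in the text


def time_bin(hms):
    """Map an 'HH:MM:SS' string to its 15-minute bin index (0..95)."""
    t = datetime.strptime(hms, "%H:%M:%S")
    return (t.hour * 60 + t.minute) // 15


def ndvb(arrival_times, departure_times, n_days):
    """NDVB series of one station from trip times pooled over n_days weekdays."""
    arrivals = [0.0] * BINS_PER_DAY    # average arrivals per bin (lambda)
    departures = [0.0] * BINS_PER_DAY  # average departures per bin (mu)
    for hms in arrival_times:
        arrivals[time_bin(hms)] += 1.0 / n_days
    for hms in departure_times:
        departures[time_bin(hms)] += 1.0 / n_days

    # DVB: cumulative sum of (arrivals - departures) over the bins.
    dvb, running = [], 0.0
    for lam, mu in zip(arrivals, departures):
        running += lam - mu
        dvb.append(running)

    # Normalize by the largest absolute value (assumption, see lead-in).
    peak = max(abs(v) for v in dvb)
    if peak == 0:
        return dvb  # no net variation at all: the series is all zeros
    return [v / peak for v in dvb]
```

Because averaging and cumulative summation are both linear, accumulating the averaged per-bin differences gives the same series as averaging the daily DVB curves bin by bin.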
The criterion for the spatial characteristics of shared bikes is further defined as the spatial distribution of trips (SDT). It indicates the number of shared bike trips within space partitions. For each station $i$, the whole space around the station is divided evenly into eight parts, i.e., $p_1, p_2, \ldots, p_8$, as shown in Figs.3(a) and (b). $s_{i,p}$ denotes the number of trips falling into part $p$. We took the average over the 5 weekdays of the number of trips with a trip end (the starting point or the end point) falling into part $p$ as $s_{i,p}$. As the number of trips with the starting and end point coordinates in the same dimension was very small, this type of trip was not considered in this paper. Then, an eight-dimensional vector was used to represent the distribution of shared bike trips around the rail transit station $i$. In order to compare different stations on the same scale, the vector was reordered according to the values of $s_{i,p}$ to eliminate the direction effects, generating $s_{i,o}$, where $o$ is the ordered index 1, 2, …, 8. Next, each vector was divided by the maximum value of $s_{i,o}$ to eliminate the scale effects, and the normalized spatial distribution of trips (NSDT) was obtained. Fig.3(c) presents the calculation process.
Therefore, for each station $i$, $s_{i,o}$ is the number of shared bike trips with one of the trip ends falling into the space partition $o$ ($o \in \{1, 2, \ldots, 8\}$), and $\max_{o} s_{i,o}$ is the maximum value of $s_{i,o}$. The normalized spatial distribution of trips can be given as

$$\mathrm{NSDT}_{i,o} = \frac{s_{i,o}}{\max_{o'} s_{i,o'}}$$
(a)
(b)
(c)
Fig.3 A numerical example of the spatial decomposition method.(a) Space partitions of Station A; (b) Space partitions of Station B; (c) Calculation process of the NSDT value
By construction, NSDT is non-negative and its values lie within [0, 1]. NSDT values closer to 1 indicate that the distribution of shared bike trips in the RTSA is uniform. In contrast, NSDT values closer to 0 indicate that the distribution of shared bike trips in the RTSA is extremely non-uniform and that shared bike trips follow a few major directions.
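The spatial decomposition can be sketched as below. The helper `sector_index` is hypothetical: the text does not state where the eight partitions start or how they are numbered, so the sketch assumes 45-degree sectors starting at due east and counted counter-clockwise, with a simple planar approximation of the local geometry. The `nsdt` function follows the description above, assuming the reordering is in descending order.

```python
import math

N_PARTS = 8  # eight equal space partitions around a station, as in the text


def sector_index(station_lon, station_lat, lon, lat):
    """Index (0..7) of the 45-degree sector containing (lon, lat).

    Assumption: sector 0 starts at due east and indices grow
    counter-clockwise; longitudes are scaled by cos(latitude).
    """
    dx = (lon - station_lon) * math.cos(math.radians(station_lat))
    dy = lat - station_lat
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    return int(angle // 45.0) % N_PARTS


def nsdt(counts):
    """Reorder partition counts in descending order and divide by the maximum."""
    ordered = sorted(counts, reverse=True)  # removes the direction effect
    top = ordered[0]
    if top == 0:
        return [0.0] * len(ordered)         # station without any trips
    return [c / top for c in ordered]       # removes the scale effect
```

For example, `nsdt([10, 40, 20, 5, 40, 0, 8, 2])` starts with two values of 1.0 followed by 0.5, which would correspond to a station with two dominant directions.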
The K-means clustering algorithm is applied in this paper to classify the RTSAs into groups, each of which shares similar bike usage patterns. Two criteria, the Silhouette coefficient (SC) and the Calinski-Harabasz index (CHI), are used to obtain the optimal number k of clusters. For both criteria, the optimal number of clusters is the one at which the value of the criterion reaches its maximum.
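How the two criteria score a candidate cluster number can be sketched with the standard library alone. This is an illustration, not the R code used in the paper: the function names are hypothetical, the K-means variant is plain Lloyd's algorithm seeded with the first k points (a simplification chosen for reproducibility), and both indices follow their textbook definitions with Euclidean distance.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm; the first k points seed the centres (a simplification)."""
    centres = [list(p) for p in points[:k]]
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centres[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an emptied cluster keeps its previous centre
                centres[c] = mean_point(members)
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; needs at least two non-empty clusters."""
    clusters = set(labels)
    total = 0.0
    for i, p in enumerate(points):
        mean_d = {}
        for c in clusters:
            ds = [math.sqrt(sq_dist(p, q))
                  for j, q in enumerate(points) if labels[j] == c and j != i]
            mean_d[c] = sum(ds) / len(ds) if ds else None
        a = mean_d[labels[i]]
        if a is None:
            continue  # a singleton cluster contributes 0
        b = min(d for c, d in mean_d.items() if c != labels[i])
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def calinski_harabasz(points, labels):
    """Ratio of between- to within-cluster dispersion, each scaled by its degrees of freedom.

    Assumes at least two clusters and a non-zero within-cluster dispersion.
    """
    clusters = set(labels)
    n, k = len(points), len(clusters)
    overall = mean_point(points)
    between = within = 0.0
    for c in clusters:
        members = [p for p, lab in zip(points, labels) if lab == c]
        centre = mean_point(members)
        between += len(members) * sq_dist(centre, overall)
        within += sum(sq_dist(p, centre) for p in members)
    return (between / (k - 1)) / (within / (n - k))


def score_cluster_numbers(points, ks):
    """Map each candidate k to its (SC, CHI) pair."""
    scores = {}
    for k in ks:
        labels = kmeans(points, k)
        scores[k] = (silhouette(points, labels), calinski_harabasz(points, labels))
    return scores
```

In this setting each point would be one station's 96-value NDVB series or 8-value NSDT vector. A practical run would normally use several random initializations, since a single deterministic seeding can leave a cluster empty or settle in a poor local optimum.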
3.1.1 Number of clusters
The Silhouette coefficient and Calinski-Harabasz index are calculated for different cluster numbers k in R, and the results are shown in Figs.4(a) and (b). The clustering finally covered 90 rail transit stations; some stations in the exurban area were excluded since very few dockless shared bike trips were distributed in the corresponding RTSA. The optimal cluster number is not the same for the two criteria: the value of SC tends to be stable from k = 4, while the value of CHI reaches its maximum at k = 3. This indicates that the optimal cluster number should be at least 3. We therefore compared the cluster numbers of 3, 4 and 5 by visualizing the distribution of NDVB. Finally, four clusters were chosen in this paper to represent the temporal usage patterns of dockless bike sharing, since each cluster has relatively unique characteristics of accumulation and dispersion in the RTSA. The numbers of rail transit stations in the four clusters are 15, 33, 16 and 26, respectively.
(a)
(b)
Fig.4 Results of cluster number for temporal usage. (a) SC in NDVB; (b) CHI in NDVB
3.1.2 Temporal usage pattern analysis
The visualization results of NDVB with the cluster number of 4 are shown in Figs.5(a) to (d). The temporal characteristics of the four clusters are described as follows:
Cluster 4-Ⅰ Shared bikes accumulate from the early morning, and the trend holds until the beginning of the evening peak hours; the bikes then disperse until late at night. Therefore, daytime on a weekday presents an accumulating pattern of shared bikes, and the accumulation rate is the highest during morning peak hours. In the evening (after 16:00), the situation is just the opposite.
Cluster 4-Ⅱ It is shown as a mirror of Cluster 4-Ⅰ in a vertical direction. During morning peak hours, the shared bikes around the rail transit station disperse after a small accumulation at the beginning, while the case is the opposite in the evening time. Similar to Cluster 4-Ⅰ, the highest accumulation/dispersion rate appears during peak hours.
Cluster 4-Ⅲ In the morning hours (6:00 to 10:00), shared bikes first accumulate and then disperse on nearly the same scale, indicating a balanced state during this period. The same trend, on the same scale, can be found in the evening hours (16:00 to 20:00). In other words, the RTSA for Cluster 4-Ⅲ shows both accumulation and dispersion in the morning as well as in the evening.
(a)
(b)
(c)
(d)
Fig.5 NDVB visualization of 4-cluster cases. (a) Cluster 4-Ⅰ; (b) Cluster 4-Ⅱ; (c) Cluster 4-Ⅲ; (d) Cluster 4-Ⅳ
Cluster 4-Ⅳ It follows the same trend as Cluster 4-Ⅱ before 16:00. After 16:00, however, the NDVB values fluctuate slightly around 0. Therefore, the temporal characteristics of accumulation and dispersion only emerge in the morning, while the distribution of NDVB becomes irregular with abnormal fluctuation in the evening.
It is worth noting that the completely opposed trends in Cluster 4-Ⅰ and Cluster 4-Ⅱ are probably due to different land uses and job-housing relationships in Nanjing. For Cluster 4-Ⅰ, rail transit stations with temporal characteristics of morning accumulation and evening dispersion are likely to be located in the workplace area. Similarly, for Cluster 4-Ⅱ, those stations with features of morning dispersion and evening accumulation may be located in the residence area. Unlike Cluster 4-Ⅰ and Cluster 4-Ⅱ, there are no significant land use characteristics for Cluster 4-Ⅲ and Cluster 4-Ⅳ. One possible reason is that residents frequently make some business trips or shopping trips in the corresponding RTSA, since the bike trips first accumulate then disperse and there is a balanced state during the whole day.
3.1.3 Geographic distribution of 4-cluster temporal usage patterns
Based on the cluster results, the distribution of NDVB for the four clusters shows some geographical agglomeration in the temporal usage patterns of the dockless BSS. The geographical distribution of the identified rail transit stations in Nanjing is shown in Fig.6. The stations of Cluster 4-Ⅰ, shown as yellow dots, are mainly located around the business-oriented urban center. The stations of Cluster 4-Ⅱ are mainly located in the residence-oriented area. This is in accordance with our previous inference, as Cluster 4-Ⅰ and Cluster 4-Ⅱ characterize a commuting-related tidal pattern on weekdays. However, four exceptional stations in Cluster 4-Ⅱ are located in the urban center area. The reason is that the urban center in Nanjing, as in other large Chinese cities, has highly mixed land use, resulting in a large residential population in the urban center. Cluster 4-Ⅲ and Cluster 4-Ⅳ, shown as red dots and green dots, are mainly located in the mixed land use area around the urban center and in the exurban area.
Fig.6 Geographic distribution of 4-cluster temporal usage patterns
3.2.1 Number of clusters
For the spatial usage patterns, the results of the Silhouette coefficient and Calinski-Harabasz index are presented in Figs.7(a) and (b), respectively. Both criteria reach their maximum value at a cluster number of 2. Therefore, two clusters are used in this paper to characterize the spatial usage patterns of dockless bike sharing. The numbers of rail transit stations in the two clusters are 39 and 51.
(a)
(b)
Fig.7 Results of cluster number for spatial usage. (a) SC in NSDT; (b) CHI in NSDT
3.2.2 Spatial pattern analysis
The distributions of NSDT with the cluster number of 2 are shown in Figs.8(a) and (b). For Cluster 2-Ⅰ, the values of NSDT in nearly all spatial parts are higher than 0.5. Therefore, Cluster 2-Ⅰ presents a relatively balanced spatial distribution of shared bike trips. For Cluster 2-Ⅱ, Fig.8(b) shows a sharp decrease in NSDT after the first two ordered spatial parts, and the NSDT values are less than 0.5 in more than half of all the directions. Moreover, the variances of the two clusters were calculated: the variance of Cluster 2-Ⅰ is much lower than that of Cluster 2-Ⅱ. The low NSDT values and high variance for Cluster 2-Ⅱ indicate that the bike trips are more concentrated in limited directions.
The spatial characteristics of 2 clusters are described as follows:
Cluster 2-Ⅰ There is a relatively even distribution of shared bike trips around the rail transit station, with NSDT values over 0.5 in nearly all spatial parts. This indicates a homogeneous service in the corresponding RTSA.
Cluster 2-Ⅱ The values of the NSDT are less than 0.5 in more than half of all the directions. Therefore, the RTSA for Cluster 2-Ⅱ may provide a heterogeneous service and most shared bike users travel in limited directions.
(a)
(b)
Fig.8 NSDT visualization of 2-cluster cases. (a) Cluster 2-Ⅰ; (b) Cluster 2-Ⅱ
3.2.3 Geographic distribution of 2-cluster spatial usage patterns
Fig.9 shows the geographical distribution of the spatial usage patterns. The stations of the two clusters are relatively scattered, since both clusters appear in the urban center as well as in the suburban area.
Fig.9 Geographical distribution of 2-cluster spatial usage patterns
Two reasons can be given to explain this scattered distribution. The first is that other nearby stations may act as competitors and attract more bike trips. The other is that some stations have too few roads connected to them because of the relatively low street density around the station.
1) In this paper, 5-weekday bike-sharing trip data are used to explore the temporal and spatial usage patterns of the dockless BSS around rail transit stations in Nanjing, China. The rail transit station area is defined, and the shared bike trips with trip ends falling into the area are extracted. To characterize the arrival and departure activities of the dockless BSS, temporal and spatial decomposition methods are developed by calculating two criteria, namely, the normalized dynamic variation of bikes (NDVB) and the normalized spatial distribution of trips (NSDT). Furthermore, the K-means clustering algorithm, combined with visualization, is applied to explore the spatiotemporal usage patterns and geographical distributions of dockless shared bikes. Four temporal usage patterns and two spatial usage patterns are identified. Temporal usage patterns show a strong relationship with area type, i.e., urban center and suburb, while spatial usage patterns are geographically irregular and depend on the limited directions available around each station.
2) The results can provide helpful information for both bike sharing operators and local governments to initiate the relevant measures or policies. For instance, from the temporal usage patterns perspective, the rebalancing strategies can be taken by observing the accumulation and dispersion characteristics of each station. For the two spatial usage patterns, the problem of limited directions inspires us to reconsider the arrangement of bike facilities based on the main directions that the bikes arrive/depart. Some practical improvements such as optimizing street networks and widening bike lanes, can be introduced to improve the service of both the rail transit and shared bikes.
3) This paper mainly focuses on the new methods of analyzing temporal and spatial usage patterns of shared bikes, while the analysis of the influencing factors of these patterns has not been carried out thoroughly. Land use, street network and other factors such as built environment factors, also impact the spatiotemporal distribution of shared bikes. A questionnaire should be designed in further study to explore the influencing factors of the spatiotemporal usage patterns.
[1]Parkes S D, Marsden G, Shaheen S A, et al. Understanding the diffusion of public bikesharing systems: Evidence from Europe and North America[J]. Journal of Transport Geography, 2013, 31: 94-103. DOI:10.1016/j.jtrangeo.2013.06.003.
[2]Ji Y J, Fan Y L, Ermagun A, et al. Public bicycle as a feeder mode to rail transit in China: The role of gender, age, income, trip purpose, and bicycle theft experience[J]. International Journal of Sustainable Transportation, 2017, 11(4): 308-317. DOI:10.1080/15568318.2016.1253802.
[3]Kaltenbrunner A, Meza R, Grivolla J, et al. Urban cycles and mobility patterns: Exploring and predicting trends in a bicycle-based public transport system[J]. Pervasive and Mobile Computing, 2010, 6(4): 455-466. DOI:10.1016/j.pmcj.2010.07.002.
[4]Lin J R, Yang T H. Strategic design of public bicycle sharing systems with service level constraints[J]. Transportation Research Part E: Logistics and Transportation Review, 2011, 47(2): 284-294. DOI:10.1016/j.tre.2010.09.004.
[5]Jia Z L, Xie G, Gao J, et al. Bike-sharing system: A big-data perspective[C]//First International Conference on Smart Computing and Communication. Shenzhen, China, 2016: 548-557.
[6]Ji Y J, Ma X W, Yang M Y, et al. Exploring spatially varying influences on metro-bikeshare transfer: A geographically weighted poisson regression approach[J]. Sustainability, 2018, 10(5): 1526. DOI:10.3390/su10051526.
[7]Pal A, Zhang Y. Free-floating bike sharing: Solving real-life large-scale static rebalancing problems[J]. Transportation Research Part C: Emerging Technologies, 2017, 80: 92-116. DOI:10.1016/j.trc.2017.03.016.
[8]Wang J C, Ouyang S S. Disequilibrium of bicycle-sharing in rail transit station areas in Beijing[J]. Journal of Transportation Systems Engineering and Information Technology, 2019, 19(1): 214-221. DOI:10.16097/j.cnki.1009-6744.2019.01.032. (in Chinese)
[9]Froehlich J, Neumann J, Oliver N. Sensing and predicting the pulse of the city through shared bicycling[C]//Proceedings of the 21st International Joint Conference on Artificial Intelligence. Pasadena, CA, USA, 2009: 1420-1426.
[10]Lathia N, Ahmed S, Capra L. Measuring the impact of opening the London shared bicycle scheme to casual users[J]. Transportation Research Part C: Emerging Technologies, 2012, 22: 88-102. DOI:10.1016/j.trc.2011.12.004.
[11]O’Neil P C, Caulfield B. Examining user behaviour on a shared bike scheme: The case of Dublin bikes[C]//The 13th International Conference on Travel Behaviour Research. Toronto, Canada, 2012.
[12]Chabchoub Y, Fricker C. Classification of the vélib stations using Kmeans, dynamic time wraping and DBA averaging method[C]//International Workshop on Computational Intelligence for Multimedia Understanding (IWCIM). Paris, France, 2014: 14863476. DOI:10.1109/IWCIM.2014.7008802.
[13]de Chardon C M, Caruso G, Thomas I. Bike-share rebalancing strategies, patterns, and purpose[J]. Journal of Transport Geography, 2016, 55: 22-39. DOI:10.1016/j.jtrangeo.2016.07.003.