Data on successful bike pickups and drop-offs censor the demand from riders who were unable to pick up or drop off a bike because no bike or dock was available (i.e., balks). The objective of this paper is twofold: (1) to provide a formal comparison between the distribution of satisfied bike/dock demand and the true (latent) demand in bike-sharing systems, using simulation experiments and nonparametric bootstrap tests to show when and how the two may differ; and (2) to propose a novel methodology combining simulation, bootstrapping, and subset selection that harnesses the partial information in every pickup/drop-off observation, even one subject to censoring, to estimate the true demand in situations where the data filtering/cleaning approaches commonly used in the bike-sharing literature fail for lack of valid data. The results reveal that the distribution of inter-pickup/drop-off times may differ statistically from the distribution of the actual inter-arrival times of customers/bikes, primarily at higher percentiles. This can happen even when the demand rate is lower than the supply rate, and especially when customer/bike inter-arrival times follow a heavy-tailed distribution. The statistical power of the proposed demand estimation approach in identifying an appropriate model for the underlying demand distribution is tested through simulation experiments as well as a real-world application. The paper has academic and practical impact: it provides additional means to obtain and use statistically valid demand estimates, which can improve decisions about the design and operation of bike-sharing systems.
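The censoring effect described above can be illustrated with a minimal sketch. This is not the paper's simulation model or estimation method; it assumes a single station with exponential inter-arrival times for both customers and returned bikes, a fixed dock capacity, and hypothetical names and parameter values (`simulate_station`, `demand_rate`, `supply_rate`, `capacity`, `initial_bikes`) chosen only for illustration. Customers who find no bike balk and leave no record, so the observed inter-pickup times are a censored view of the true inter-arrival times.

```python
import random


def simulate_station(n_customers=5000, demand_rate=1.0, supply_rate=1.2,
                     capacity=5, initial_bikes=2, seed=0):
    """Return (true customer arrival times, observed pickup times).

    Illustrative assumptions: exponential inter-arrival times, one station,
    and returned bikes that find the station full are simply turned away.
    """
    rng = random.Random(seed)

    # True (latent) demand: every customer arrival, served or not.
    t, customer_times = 0.0, []
    for _ in range(n_customers):
        t += rng.expovariate(demand_rate)
        customer_times.append(t)
    horizon = customer_times[-1]

    # Supply: bikes arriving at the station to be docked.
    bike_times = []
    t = rng.expovariate(supply_rate)
    while t <= horizon:
        bike_times.append(t)
        t += rng.expovariate(supply_rate)

    # Merge both streams and process events in time order.
    events = [(x, "customer") for x in customer_times]
    events += [(x, "bike") for x in bike_times]
    events.sort()

    bikes, pickup_times = initial_bikes, []
    for time, kind in events:
        if kind == "bike":
            if bikes < capacity:      # a full station turns the bike away
                bikes += 1
        elif bikes > 0:               # customer finds a bike: recorded pickup
            bikes -= 1
            pickup_times.append(time)
        # otherwise the customer balks and nothing is recorded

    return customer_times, pickup_times


def gaps(times):
    """Differences between consecutive event times."""
    return [b - a for a, b in zip(times, times[1:])]


def percentile(values, q):
    """Simple empirical percentile (no interpolation), q in [0, 1]."""
    ordered = sorted(values)
    return ordered[int(q * (len(ordered) - 1))]


if __name__ == "__main__":
    customers, pickups = simulate_station()
    true_gaps, observed_gaps = gaps(customers), gaps(pickups)
    print("customers:", len(customers), "recorded pickups:", len(pickups))
    for q in (0.50, 0.95, 0.99):
        print(f"q={q:.2f}  true={percentile(true_gaps, q):.3f}  "
              f"observed={percentile(observed_gaps, q):.3f}")
```

In this sketch the supply rate exceeds the demand rate, yet the finite dock capacity still leaves the station empty part of the time, so some customers balk and the recorded pickups understate demand. Comparing the printed percentiles gives an informal view of where the two distributions diverge; a formal comparison would require a statistical test such as the bootstrap tests the paper describes.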
All Science Journal Classification (ASJC) codes
- General Computer Science
- Modeling and Simulation
- Management Science and Operations Research
- Information Systems and Management