Many online social networks feature restrictive web interfaces that only allow the query of a user's local neighborhood. To enable analytics over such an online social network through its web interface, many recent efforts use Markov Chain Monte Carlo (MCMC) methods such as random walks to sample users in the social network and thereby support analytics based on the samples. The problem with such an approach, however, is the large amount of queries often required for a random walk to converge to a desired (stationary) sampling distribution. In this article, we consider a novel problem of enabling a faster random walk over online social networks by "rewiring" the social network on-the-fly. Specifically, we develop a Modified TOpology Sampling (MTO-Sampling) scheme that, by using only information exposed by the restrictive web interface, constructs a "virtual" random-walk-friendly overlay topology of the social network while performing a random walk and ensures that the random walk follows the modified overlay topology rather than the original one. We describe in this article instantiations of MTO-Sampling for various types of random walks, such as Simple Random Walk (MTO-SRW), Metropolis-Hastings Random Walk (MTO-MHRW), and General Random Walk (MTO-GRW). We not only rigidly prove that MTO-Sampling improves the efficiency of sampling, but we also demonstrate the significance of such improvement through experiments on real-world online social networks such as Google Plus, Epinion, Facebook, etc.
All Science Journal Classification (ASJC) codes
- Information Systems