We propose a novel approach to multi-camera autocalibration from multiview surveillance video of pedestrians walking through the scene. Unlike existing methods, we require neither tracking nor explicit correspondences of the same person across time or views. Instead, we take noisy foreground blobs as the only input and rely on a joint optimization framework with robust statistics to achieve accurate calibration under challenging conditions. First, each camera is roughly calibrated in its local World Coordinate System (lWCS) by analyzing the relative 3D pedestrian height distribution. Then, all lWCSs are iteratively registered to a shared global World Coordinate System (gWCS) via robust matching with a partial Direct Linear Transform (pDLT). Extensive evaluation demonstrates that our algorithm achieves satisfactory results across various camera settings, at up to moderate crowd densities, and with a large proportion of foreground outliers.