Protecting citizens from crime is one of the core responsibilities of government. However, as crimes grow in complexity, law enforcement resources become insufficient. Crime analytics therefore calls for the optimal allocation of limited resources to achieve faster responses, reduced costs, and more efficient operations. A variety of analytical methods have been used to investigate crime data, but very little has been done to develop data-driven methods that optimally allocate law enforcement resources for coverage control of city crimes. In this paper, we develop a new optimal learning algorithm that characterizes multi-scale distributions of crimes and then determines an optimal policy for coverage control of city crimes. First, we categorize crimes into low, medium, and high severity levels and model the crime distribution at each severity level. Second, we develop an optimal coverage control policy to allocate limited law enforcement resources in areas of interest. Third, we measure model performance by the response time of an agent to reach crime scenes. Experimental results demonstrate that the proposed algorithm effectively and efficiently optimizes law enforcement allocation and achieves better performance in terms of average response time to crime scenes.
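
To make the three steps concrete, the following is a minimal sketch of severity-weighted coverage control. It is not the algorithm proposed in the paper, whose details are not given here; it substitutes a standard weighted Lloyd iteration (weighted k-means) as a stand-in. The severity weights, the synthetic crime locations, the number of agents, and all function names are assumptions made for illustration only. Distance to the nearest agent is used as a proxy for response time, assuming agents travel at constant speed.

```python
import numpy as np

# Hypothetical severity weights: the text names three levels but gives no weights.
SEVERITY_WEIGHT = {"low": 1.0, "medium": 2.0, "high": 4.0}


def nearest_agent(points, agents):
    """For each crime location, return the index of and distance to its nearest agent."""
    d = np.linalg.norm(points[:, None, :] - agents[None, :, :], axis=2)
    idx = d.argmin(axis=1)
    return idx, d[np.arange(len(points)), idx]


def weighted_lloyd(points, weights, agents, n_iter=50):
    """Repeatedly move each agent to the severity-weighted centroid of its nearest crimes."""
    agents = agents.astype(float).copy()
    for _ in range(n_iter):
        idx, _ = nearest_agent(points, agents)
        for k in range(len(agents)):
            mask = idx == k
            if mask.any():  # an agent with no assigned crimes stays where it is
                agents[k] = np.average(points[mask], axis=0, weights=weights[mask])
    return agents


def coverage_cost(points, weights, agents):
    """Severity-weighted mean squared distance to the nearest agent."""
    _, dist = nearest_agent(points, agents)
    return float(np.average(dist ** 2, weights=weights))


def mean_response_distance(points, weights, agents):
    """Severity-weighted mean distance to the nearest agent (response-time proxy)."""
    _, dist = nearest_agent(points, agents)
    return float(np.average(dist, weights=weights))


# Synthetic crime locations around three made-up hotspots (not real data).
rng = np.random.default_rng(0)
hotspots = np.array([[2.0, 2.0], [8.0, 3.0], [5.0, 8.0]])
points = np.vstack([c + rng.normal(scale=0.8, size=(100, 2)) for c in hotspots])

# Step 1: assign each crime a severity level and the corresponding weight.
levels = rng.choice(["low", "medium", "high"], size=len(points), p=[0.6, 0.3, 0.1])
weights = np.array([SEVERITY_WEIGHT[s] for s in levels])

# Step 2: allocate three agents, starting from random positions.
initial_agents = rng.uniform(0.0, 10.0, size=(3, 2))
final_agents = weighted_lloyd(points, weights, initial_agents)

# Step 3: compare the cost and the response-time proxy before and after allocation.
cost_before = coverage_cost(points, weights, initial_agents)
cost_after = coverage_cost(points, weights, final_agents)
print("coverage cost:", cost_before, "->", cost_after)
print("mean response distance:",
      mean_response_distance(points, weights, initial_agents), "->",
      mean_response_distance(points, weights, final_agents))
```

The Lloyd iteration never increases the squared-distance coverage cost, which is why that quantity is used as the objective in this sketch; the mean distance is reported alongside it only as an interpretable proxy for response time.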