Writing efficient software is difficult. Design and implementation defects can cause severe performance degradation. Unfortunately, existing performance diagnosis techniques like profilers are still preliminary. They can locate code regions that consume resources, but not the ones that waste resources. In this paper, we first design a root-cause and fix-strategy taxonomy for inefficient loops, one of the most common performance problems in the field. We then design a static-dynamic hybrid analysis tool, LDoctor, to provide accurate performance diagnosis for loops. We further use sampling techniques to lower the run-time overhead without degrading the accuracy or latency of LDoctor diagnosis. Evaluation using real-world performance problems shows that LDoctor can provide better coverage and accuracy than existing techniques, with low overhead.