Finding the strongly connected components (SCCs) of a directed graph is a fundamental graph-theoretic problem. Tarjan's algorithm is an efficient serial algorithm to find SCCs, but relies on the hard-to-parallelize depth-first search (DFS). We observe that implementations of several parallel SCC detection algorithms show poor parallel performance on modern multicore platforms and large-scale networks. This paper introduces the Multistep method, a new approach that avoids work inefficiencies seen in prior SCC approaches. It does not rely on DFS, but instead uses a combination of breadth-first search (BFS) and a parallel graph coloring routine. We show that the Multistep method scales well on several real-world graphs, with performance fairly independent of topological properties such as the size of the largest SCC and the total number of SCCs. On a 16-core Intel Xeon platform, our algorithm achieves a 20X speedup over the serial approach on a 2 billion edge graph, fully decomposing it in under two seconds. For our collection of test networks, we observe that the Multistep method is 1.92X faster (mean speedup) than the state-of-the-art Hong et al. SCC method. In addition, we modify the Multistep method to find connected and weakly connected components, as well as introduce a novel algorithm for determining articulation vertices of biconnected components. These approaches all utilize the same underlying BFS and coloring routines.