Direct numerical simulations of large numbers of planetary systems are key to the study extrasolar planetary systems. We investigate the potential of Graphics Processing Units (GPUs) to dramatically accelerate such calculations and enable new types of analyses. Here we report on a CUDA implementation of two different integrators, Verlet and Bulirsch-Stoer; that is, we are contrasting the relative efficiency of a low-memory-footprint, small-time-steps integrator with a more complex but highly stable integrator for best use of GPU resources. We verify that GPUs can serve as highly efficient dedicated parallel computing co-processors of the CPU host: the wall-clock time of integrating a large number of systems is reduced by two orders of magnitude compared to a comparable implementation we used so far on a modern CPU. We verify further that it is not only possible to port a complex integration algorithm like Bulirsch-Stoer to the restrictive GPU environment but that it can outperform the simpler Verlet integrator.