The quantum approximate optimization algorithm (QAOA) is a promising hybrid quantum-classical algorithm for solving hard combinatorial optimization problems. The two-qubit gates used in a QAOA circuit commute, i.e., their order can be changed without changing the logical output. Re-ordering them allows more gates to execute in parallel and reduces the number of additional gates needed to compile the circuit, which lowers both circuit depth and gate count and thereby benefits circuit run-time and noise. Fewer gates mean less accumulated gate error, and a shallower circuit gives the qubits less time to decohere (lose their state). However, finding the best re-ordered circuit is a difficult problem that does not scale well with circuit size. This paper presents a compilation flow with three approaches for finding a re-ordered circuit with reduced depth and gate count. Our approaches reduce gate count by up to 23.21% and circuit depth by up to 53.65%. They are compiler-agnostic, scalable, and can be integrated with existing compilers.
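To make the effect of re-ordering concrete, the following minimal sketch compares the depth of a list of commuting two-qubit gates scheduled in their given order against a simple greedy re-ordering that packs qubit-disjoint gates into the same layer. It is an illustration only and is not one of the three approaches presented in this paper: the function names `depth_in_given_order` and `greedy_layers` are hypothetical, every gate is assumed to take one time step, and hardware connectivity (and hence the additional gates inserted during compilation) is ignored.

```python
from typing import Dict, List, Tuple

Gate = Tuple[int, int]  # a two-qubit gate acting on qubits (a, b)


def depth_in_given_order(gates: List[Gate]) -> int:
    """Depth when gates sharing a qubit must keep their listed order."""
    busy_until: Dict[int, int] = {}  # qubit -> last layer that uses it
    depth = 0
    for a, b in gates:
        # A gate starts one layer after both of its qubits become free.
        layer = max(busy_until.get(a, 0), busy_until.get(b, 0)) + 1
        busy_until[a] = busy_until[b] = layer
        depth = max(depth, layer)
    return depth


def greedy_layers(gates: List[Gate]) -> List[List[Gate]]:
    """Greedily pack mutually commuting gates into qubit-disjoint layers.

    Assumption: all gates commute, so any gate may be moved to any layer.
    """
    remaining = list(gates)
    layers: List[List[Gate]] = []
    while remaining:
        used = set()  # qubits already occupied in the current layer
        layer: List[Gate] = []
        deferred: List[Gate] = []
        for a, b in remaining:
            if a in used or b in used:
                deferred.append((a, b))  # conflicts; try a later layer
            else:
                layer.append((a, b))
                used.update((a, b))
        layers.append(layer)
        remaining = deferred
    return layers


# Example: the four edges of a 4-node ring, listed around the ring.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
original_depth = depth_in_given_order(ring)
reordered = greedy_layers(ring)
print(original_depth, len(reordered), reordered)
```

In this toy instance the listed order forces the gates to run one after another, while the greedy re-ordering places two gates in each layer. A greedy packing of this kind is only a heuristic and does not in general give the minimum depth, which is consistent with the observation above that finding the best re-ordering is difficult.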