Age of information (AoI) has recently been adopted as a performance metric for measuring the freshness of information in time-critical wireless communication applications. In this paper, we consider AoI minimization in a wireless ad hoc network in which nodes exchange status updates with one another over a shared spectrum. The network must be formed dynamically: in each slot, every node either broadcasts or receives updates, and it attempts to keep the updates in both directions fresh. We aim to minimize the average AoI of each node through a joint broadcast scheduling and power control policy. Each node decides its mode (transmitting or receiving) and its transmission power based on its local observation of the system state. We formulate the problem as a Markov game and develop a multi-agent deep reinforcement learning algorithm based on the deep recurrent Q-network (DRQN). Simulation results show that the proposed approach significantly outperforms the baselines.
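
To make the AoI metric concrete, the following is a minimal sketch of how the age of one information stream could evolve over slots. It assumes a common slotted convention that is not stated in the abstract: the age grows by one in every slot without a successful reception and resets to one when a fresh update is received. The function names `aoi_trajectory` and `average_aoi` and the parameter `initial_age` are hypothetical and serve only as illustration.

```python
from typing import List, Sequence


def aoi_trajectory(received: Sequence[bool], initial_age: int = 1) -> List[int]:
    """Return the age at the end of each slot for a single update stream.

    Assumed convention (not taken from the paper): a successful reception
    resets the age to 1; otherwise the age increases by 1.
    """
    ages = []
    age = initial_age
    for ok in received:
        age = 1 if ok else age + 1
        ages.append(age)
    return ages


def average_aoi(ages: Sequence[int]) -> float:
    """Time-average AoI over the observed slots."""
    return sum(ages) / len(ages)


# Example: updates are received successfully in slots 3 and 5 only.
trace = aoi_trajectory([False, False, True, False, True])
print(trace)               # [2, 3, 1, 2, 1]
print(average_aoi(trace))  # 1.8
```

Under this convention, a scheduling and power control policy lowers the average AoI by making successful receptions more frequent in both directions, which is the quantity the objective above targets.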