In this chapter, we leverage reinforcement learning as a unified framework to design effective adaptive cyber defenses against zero-day attacks. Reinforcement learning is an integration of control theory and machine learning. A salient feature of reinforcement learning is that it does not require the defender to know critical information of zero-day attacks (e.g., their attack targets, and the locations of the vulnerabilities). This information is difficult, if not impossible, for the defender to gather in advance. The reinforcement learning based schemes are applied to defeat three classes of attacks: strategic attacks where the interactions between an attacker and a defender are modeled as a non-cooperative game; non-strategic random attacks where the attacker chooses its actions by following a predetermined probability distribution; and attacks depicted by Bayesian attack graphs where the attacker exploits combinations of multiple known or zero-day vulnerabilities to compromise machines in a network.