Abstract: Training a reinforcement learning agent to learn network penetration testing is challenging due to the partially-observable, non-deterministic environment. The large action space leads to ...