Evolution Strategies as a Scalable Alternative to Reinforcement Learning