Learning to Optimize with Reinforcement Learning