Universal reinforcement learning