Dawn of the Cuatro

A clip from the final episode makes another reference to the theory behind reinforcement learning. The mention of Solomonoff induction over programs probably refers to Marcus Hutter's AIXI, a theoretically optimal reinforcement learning agent that always learns from the rewards it's given as efficiently as possible, but is totally impossible to compute in practice.

There are lots of intelligent systems capable of learning. Some are very limited: a chess-playing program is incredibly 'narrow' because it can only learn to play chess and nothing else. Drop the best chess-playing program in the world into a game of poker and it would be lost. You, a human, are a lot more 'general': if we dumped you into a game of chess, a game of poker, a game of football or the Hunger Games, you would learn how to play all of them eventually.

AIXI is a design for a learner that is as general as possible - put it in any environment and it will observe the rewards it gets, learn the patterns that lead to more reward, and pursue that reward as efficiently as possible. The only problem is that it requires an infinitely powerful computer to run. The prediction method at its core is called 'Solomonoff induction'.
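To make the "learn the patterns" part a bit more concrete, here's a toy sketch in Python. Big caveat: this is not real Solomonoff induction and definitely not AIXI. The real thing weighs every program for a universal computer, which is why it can't be computed. As a stand-in, this sketch assumes a tiny made-up hypothesis class (every "program" just repeats a short bit pattern forever) and gives shorter patterns more weight, 2^-length each. It only covers prediction, not rewards or actions. The names `hypotheses` and `predict_next` are mine, purely for illustration.

```python
from itertools import product


def hypotheses(max_len):
    """Yield every binary pattern of 1..max_len bits.

    Each pattern stands for a toy 'program' that repeats it forever.
    (Assumption: this finite class replaces 'all programs'.)
    """
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)


def predict_next(observed, max_len=6):
    """Return the probability that the next bit is '1'.

    Every pattern whose repeated output matches the observed bits votes
    for its own next bit, with weight 2^-length (shorter = more weight).
    """
    weight = {"0": 0.0, "1": 0.0}
    for pattern in hypotheses(max_len):
        # Repeat the pattern enough times to cover one bit past the data.
        repeats = len(observed) // len(pattern) + 2
        output = pattern * repeats
        if output.startswith(observed):
            weight[output[len(observed)]] += 2.0 ** -len(pattern)
    total = weight["0"] + weight["1"]
    if total == 0:
        return 0.5  # no pattern in the toy class explains the data
    return weight["1"] / total
```

Feed it `"0101"` and it puts most of its weight on "0" coming next, because the short pattern `01` explains the data and short explanations dominate the vote. That preference for simple explanations is the part this toy shares with the real idea; everything else is massively scaled down.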

So, AIXI is the ideal learning agent. You could never implement it exactly; the best you could do is approximate it with the help of some magically effective compression... which I assume is what that whiteboard meant by 'solomonoff middle-out'. We really are in the realms of fantasy there, but maybe a useful approximation of AIXI could be done if you used middle-out compression on a significant fraction of all the world's devices to link them into one giant processor...


We're not getting a normal ending. We are, I think, headed for either a catastrophe, a death so rapid your neurons don't have time to register it, or for cuatro commas.
