I am Jürgen Schmidhuber, AMA!

Thanks a lot - I greatly appreciate it!

Let me enumerate my answers to your 4 questions.

  1. Evolutionary computation is one of the most useful practical methods for direct search in policy space, especially when there is no teacher who knows which output actions the system should produce at which time. In partially observable environments, where memories of previous events are needed to disambiguate states, it often works much better than other reinforcement learning techniques based on dynamic programming (see the first sketch after this list).

  2. Evolutionary techniques are not "necessary." Other black-box optimizers such as policy gradients sometimes work fine. In fact, evolutionary algorithms and policy gradients are closely related - see the papers on Natural Evolution Strategies (and the second sketch after this list). I am also a big fan of asymptotically optimal program search, which in many ways goes beyond evolution.

  3. In the case of teacher-given desired output actions or labels, gradient descent such as backpropagation (also through time) usually works much better, especially for NNs with many weights. This is not always true, though. For example, evolution-trained hidden LSTM units combined with an optimal linear mapping (e.g., an SVM) from hidden to output units outperformed a purely gradient-based LSTM even on certain supervised sequence learning tasks - see the EVOLINO papers since 2005 (and the third sketch after this list).

  4. Your last question is about searching for shorter programs in program space. The recent Compressed Network Search combines evolution and compression to achieve something along the lines you may have in mind (see the fourth sketch after this list).
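To make the first point concrete, here is a minimal sketch of direct policy search: a tiny recurrent policy is evolved with a (1+1) evolution strategy purely from episode returns, with no teacher and no backpropagation. The toy task, network sizes, and step sizes are illustrative choices of mine, not taken from any of the papers above.

```python
import numpy as np

# Toy partially observable task: the agent sees a cue only at t=0 and must
# reproduce its sign at the final step; a memoryless policy cannot solve this.
def episode_return(params, rng):
    w_in, w_rec, w_out = params
    cue = rng.choice([-1.0, 1.0])
    h = np.zeros(w_rec.shape[0])
    obs = cue
    for t in range(10):
        h = np.tanh(w_in * obs + w_rec @ h)  # recurrent state carries the memory
        obs = 0.0                            # cue is only visible at t = 0
    action = np.tanh(w_out @ h)
    return -abs(action - cue)                # reward: match the remembered cue

rng = np.random.default_rng(0)
n = 8
parent = [rng.normal(0, 0.5, n), rng.normal(0, 0.5, (n, n)), rng.normal(0, 0.5, n)]
best = np.mean([episode_return(parent, rng) for _ in range(20)])

# (1+1)-ES: mutate all weights; keep the child only if its average return is better.
for gen in range(300):
    child = [p + 0.1 * rng.normal(size=p.shape) for p in parent]
    fit = np.mean([episode_return(child, rng) for _ in range(20)])
    if fit > best:
        parent, best = child, fit

print("best average return:", best)  # approaches 0 as the task is solved
```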
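Second, the connection between evolution and policy gradients can be shown in a few lines: NES ascends a search gradient on the parameters of a mutation distribution, estimated with the same log-likelihood trick used by policy gradients. The sketch below is a plain (non-natural) search-gradient version on a toy fitness function; the function and hyperparameters are illustrative assumptions.

```python
import numpy as np

# Toy fitness: maximize f(x) = -||x - target||^2 via a search gradient.
target = np.array([3.0, -2.0])
def fitness(x):
    return -np.sum((x - target) ** 2)

rng = np.random.default_rng(1)
mu, sigma = np.zeros(2), 1.0  # parameters of the Gaussian search distribution

for step in range(200):
    # Sample a population from N(mu, sigma^2 I) -- exactly like exploration
    # noise in a policy-gradient method, but applied in parameter space.
    eps = rng.normal(size=(50, 2))
    xs = mu + sigma * eps
    f = np.array([fitness(x) for x in xs])
    f = (f - f.mean()) / (f.std() + 1e-8)  # baseline / fitness shaping
    # REINFORCE-style estimate of grad_mu E[f]: grad log N(x; mu) = eps / sigma.
    # NES additionally rescales this with the inverse Fisher matrix
    # (the "natural" gradient).
    grad_mu = (f[:, None] * eps).mean(axis=0) / sigma
    mu += 0.1 * grad_mu

print("mu ->", mu)  # converges toward target
```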
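Third, the EVOLINO recipe: evolve only the recurrent (hidden) weights, and for each candidate compute the optimal linear readout in closed form, scoring the candidate by the resulting error. For brevity this sketch uses a plain tanh recurrent net instead of LSTM cells and ridge regression instead of an SVM readout; both substitutions, and all sizes, are my own simplifications.

```python
import numpy as np

rng = np.random.default_rng(2)
T, n_hidden = 200, 20
t = np.arange(T)
inputs = np.sin(0.1 * t)         # toy 1-D input sequence
targets = np.sin(0.1 * (t + 5))  # predict the sequence 5 steps ahead

def hidden_states(w_in, w_rec):
    h, H = np.zeros(n_hidden), []
    for x in inputs:
        h = np.tanh(w_in * x + w_rec @ h)
        H.append(h)
    return np.array(H)           # shape (T, n_hidden)

def evaluate(genome):
    w_in, w_rec = genome
    H = hidden_states(w_in, w_rec)
    # Optimal linear mapping from hidden to output via ridge regression,
    # computed in closed form -- no gradient descent on the readout.
    w_out = np.linalg.solve(H.T @ H + 1e-3 * np.eye(n_hidden), H.T @ targets)
    mse = np.mean((H @ w_out - targets) ** 2)
    return -mse

parent = [rng.normal(0, 0.3, n_hidden), rng.normal(0, 0.3, (n_hidden, n_hidden))]
best = evaluate(parent)
for gen in range(200):           # simple (1+1)-ES over the hidden weights only
    child = [g + 0.05 * rng.normal(size=g.shape) for g in parent]
    f = evaluate(child)
    if f > best:
        parent, best = child, f

print("negative MSE of best genome:", best)
```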
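Fourth, Compressed Network Search encodes a network's weight matrix indirectly as a small number of frequency-domain coefficients (DCT coefficients in the papers) and evolves those; short genomes correspond to low-complexity, smooth weight matrices. A minimal decoder sketch with a hand-rolled 2-D inverse DCT and illustrative sizes:

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II basis; row k is the k-th cosine basis vector.
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    B = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    B[0] *= 1 / np.sqrt(2)
    return B * np.sqrt(2 / n)

def decode(genome, shape, n_coeffs):
    # The genome holds only the n_coeffs x n_coeffs low-frequency coefficients;
    # the full weight matrix is their 2-D inverse DCT (a smooth matrix).
    rows, cols = shape
    C = np.zeros(shape)
    C[:n_coeffs, :n_coeffs] = genome.reshape(n_coeffs, n_coeffs)
    return dct_matrix(rows).T @ C @ dct_matrix(cols)

# A 16-number genome specifies a 32x64 weight matrix: evolution searches a
# 16-dimensional space instead of a 2048-dimensional one.
rng = np.random.default_rng(3)
genome = rng.normal(size=16)
W = decode(genome, (32, 64), n_coeffs=4)
print(W.shape)  # (32, 64) -- fitness is computed with W, mutation acts on genome
```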

More on all of this in Sec. 6 of the survey.
