PhD thesis: Reasoning about Software Security via Synthesized Behavioral Substitutes [PDF]

so I tried to parse this and I wrote papers too...

3.4.5 Backpropagation

After obtaining a score by random playout, we do the following for the selected node and all its parents, up to the root: (1) We update the node’s average reward. This reward is averaged based on the node’s and its successors’ total number of random playouts. (2) If the node is fully expanded and its children are all inactive, we set the node to inactive. Finally, we set the current node to its parent node.
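For anyone else parsing the quoted step: it reads like the backpropagation phase of a Monte Carlo tree search, not gradient backpropagation in a neural network. Here is a minimal Python sketch of how I understand it. The `Node` class, its field names, and `backpropagate` are my own hypothetical names, not taken from the thesis, and treating a fully expanded node without children as inactive is my assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """Hypothetical search-tree node; field names are assumptions."""
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list)
    playouts: int = 0            # playouts through this node and its successors
    total_reward: float = 0.0    # sum of all scores propagated through this node
    average_reward: float = 0.0
    fully_expanded: bool = False
    active: bool = True

    def add_child(self) -> "Node":
        child = Node(parent=self)
        self.children.append(child)
        return child


def backpropagate(selected: Node, score: float) -> None:
    """Propagate a playout score from the selected node up to the root."""
    current: Optional[Node] = selected
    while current is not None:
        # (1) Update the average reward over all playouts seen at this node.
        current.playouts += 1
        current.total_reward += score
        current.average_reward = current.total_reward / current.playouts

        # (2) Deactivate a fully expanded node whose children are all inactive.
        # Assumption: a fully expanded node with no children also counts.
        if current.fully_expanded and all(not c.active for c in current.children):
            current.active = False

        # Finally, move on to the parent node.
        current = current.parent


if __name__ == "__main__":
    root = Node(fully_expanded=True)
    a, b = root.add_child(), root.add_child()
    a.active = False
    backpropagate(b, 1.0)
    backpropagate(b, 0.0)
    print(root.playouts, root.average_reward, root.active)
```

So the only thing being "propagated" is a running average of playout scores plus an active/inactive flag that prunes exhausted subtrees. No gradients or learned weights are involved.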

machine learning?!

I assume this thesis isn't about reverse engineering in particular, as I don't see metadata extraction, but more about transforming obfuscated code into debuggable code... or what is the end goal here? Or better, what is the output? This sounds like machine learning with backpropagation, so it's an optimization problem? For a start, what exactly can I conclude from your example with the VM obfuscations? Is the result a representation of the VM that I can use to emulate it, or what? How would this be used in a real-world example, and to achieve what?

/r/ReverseEngineering Thread Link - synthesis.to