This comment was posted to reddit on Aug 08, 2018 at 12:26 pm and was deleted within 1 day, 3 hours and 7 minutes.

> First question, why sample beta from a Normal with non-zero mean?

Why not? Shrinkage "works" regardless of whether you are shrinking toward zero or toward some other value. If he has good reason to believe the coefficients are more likely to be positive than negative, I don't see a problem with the chosen prior: as long as you have enough data and the prior puts enough density over the whole feasible region of parameter space, the specific prior chosen is not that critical.

Yes, it seems like an odd choice, but there's nothing actually wrong with it, is there?
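To make "shrinking toward a non-zero mean" concrete, here's a minimal numpy sketch (the data, prior mean `m`, and strength `lam` are all made up for illustration): the MAP estimate under a N(m, (1/lam) I) Gaussian prior is pulled from the OLS fit toward `m` rather than toward zero.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 100, 2
X = rng.normal(size=(n, d))
w_true = np.array([2.0, 3.0])
y = X @ w_true + rng.normal(size=n)

m = np.array([1.0, 1.0])   # prior mean (positive, per the prior belief above)
lam = 50.0                 # prior precision, i.e. regularization strength

# MAP under a N(m, (1/lam) I) prior: minimize ||y - Xw||^2 + lam * ||w - m||^2
# Closed form: w = (X'X + lam I)^{-1} (X'y + lam m)  -- shrinks toward m, not 0
w_map = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y + lam * m)
w_ols = np.linalg.solve(X.T @ X, X.T @ y)

# w_map is strictly closer to the prior mean m than the unregularized fit is
print(np.linalg.norm(w_map - m), "<", np.linalg.norm(w_ols - m))
```

Setting `m = 0` recovers ordinary ridge regression, which is why shrinking toward zero is just the special case, not the definition.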

Trying to answer OP's question...

Adding Gaussian noise to your inputs is (at least for linear models with squared loss) equivalent to L2 regularization, where the weight of the regularization term is proportional to the variance of the injected noise; that in turn is equivalent to imposing a N(0, sigma^2) Gaussian prior on your coefficients. Adding Gaussian noise to your data WHILE also imposing some sort of prior on your coefficients... I don't know how to interpret that, but perhaps it simply results in the same L2 regularization / Gaussian prior, just with a higher effective regularization strength.
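To illustrate the noise-injection/L2 equivalence in the linear-regression case: OLS on many noise-augmented copies of the data converges to the ridge solution with penalty lam = n * sigma^2 (for the unaveraged sum-of-squares loss). The sketch below is illustrative only; all names and numbers are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 3
X = rng.normal(size=(n, d))
w_true = np.array([1.5, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=n)

sigma = 0.3          # std of the noise injected into the inputs
lam = n * sigma**2   # matching ridge penalty for the summed squared loss

# Ridge closed form: w = (X'X + lam I)^{-1} X'y
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# OLS on K noise-augmented copies of the data; as K grows, E[X'X] picks up
# an extra n*sigma^2*I term, which is exactly the ridge penalty
K = 2000
X_aug = np.vstack([X + sigma * rng.normal(size=X.shape) for _ in range(K)])
y_aug = np.tile(y, K)
w_noisy, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)

print(np.max(np.abs(w_ridge - w_noisy)))  # small, and shrinks as K grows
```

The same augmentation trick makes the "noise + explicit prior" case easy to reason about: the injected noise contributes its own implicit N(0, ...) prior on top of whatever prior you impose explicitly, so the two shrinkage effects compound.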