I know the feeling regarding titles/explanations. Pretty sure I've collected a few downvotes because of shitty titles, haha.
So, uh, sorry about this text wall. I may have been too thorough, ha.
TL;DR: Long-time code dabbler/hobbyist, stumbled into the concept of CA and got hooked, got bored with slow programs & small-scale CA worlds, and found GPUs to be the solution. Not much formal education; self-taught.
In my early twenties now, but I've been dabbling with coding since I was 14-ish. Never made anything of consequence, and didn't really upskill much or learn any frameworks in that time. I was making test games (never 'finished' a single one) with Game Maker and ActionScript, which was really the only reason I learned to code at all. Did a year of programming in high school with VB6.
Didn't progress with education, and worked in a call center for three years where I did some coding-ish things for them, but didn't do many personal coding projects in that time.
1-2 years ago, I decided to do some 'real' programming after a long hiatus, but couldn't figure out what to make. Decided to do something entirely useless - shuffle numbers in an array and see what happens. I was astounded by the enigmatic complexity and self-organization. A few months later I found out I was making these things called 'Cellular Automata'.
Made 3-4 generator programs with varying degrees of crappiness in that time, starting in Excel VBA (1) and then moving on to Java (2-3). These were all brute-forcers, and techniques to speed them up involved silly things like hash tables and selective cell execution. I didn't like that, as it was complicated and seemed to reduce rules' capabilities.
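For context, by "brute-forcing" I just mean recomputing every cell, every generation, with no cleverness. A toy Python sketch of that kind of step for Conway's Life (names and the 5x5 grid are just for illustration, not from any of my actual programs):

```python
# Toy brute-force Life step: recompute every cell, every generation.
# Grid is a list of 0/1 rows; the edges wrap around (toroidal world).

def life_step(grid):
    h, w = len(grid), len(grid[0])
    new = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Count the 8 neighbors, wrapping at the edges.
            n = sum(grid[(y + dy) % h][(x + dx) % w]
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                    if (dy, dx) != (0, 0))
            # Conway's B3/S23: born with 3 neighbors, survives with 2 or 3.
            new[y][x] = 1 if n == 3 or (n == 2 and grid[y][x]) else 0
    return new

# A blinker on a 5x5 grid flips between horizontal and vertical.
blinker = [[0] * 5 for _ in range(5)]
blinker[2][1] = blinker[2][2] = blinker[2][3] = 1
```

Tricks like hash tables (as in Golly's HashLife) or only executing "active" cells skip most of that work, but they lean on assumptions about the rule that a plain brute-forcer doesn't need.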
Eventually I hit diminishing returns in my ability to make interesting CAs, and the interesting ones I did have were destroying my CPU as they took days to generate HD videos.
I investigated the viability of GPUs and watched videos on GPU architecture over and over until it made a little bit of sense. Then I grabbed some GitHub source code for Conway's GoL that used OpenCL and wrote PyCl-Convergence. I learned Python and OpenCL (through PyOpenCL), struggled for a couple of weeks to get it off the ground, and eventually did. Getting the kernels (GPU functions) working was hard, unfamiliar territory.
At this point I'm working on my second GPU project in JS/WebGL (OpenCL is not portable, and slow for what I want to do) and am slowly gaining skills & knowledge with GPU acceleration and shader construction. I wouldn't say I know how to program a GPU at all. What I can do is super-interesting to me though.
As for the benefits vs trade-offs of GPU acceleration, I've found that CPU implementations are more flexible, especially if you want to do randomized rule modifications mid-execution. GPU implementations require much more premeditation and uniformity in many areas from what I've seen. That applies not just to rules, but to the surrounding program infrastructure too.
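To make the flexibility point concrete, here's a toy sketch (plain Python, names invented for illustration) of the kind of thing that's trivial on the CPU but awkward on the GPU: a 1D elementary CA whose rule table gets randomly mutated mid-run. On the CPU the rule is just data you can poke at; in a shader it usually has to be baked into the kernel or uploaded as a uniform/texture, so changing it mid-execution takes real plumbing.

```python
import random

# 1D elementary CA with a rule table that mutates while the CA runs.
# The rule is ordinary mutable data here -- the CPU-side flexibility
# I'm talking about. All names/parameters are just for this sketch.

def run(width=64, steps=100, rule=110, mutate_every=25, seed=0):
    rng = random.Random(seed)
    table = [(rule >> i) & 1 for i in range(8)]   # Wolfram rule as 8 bits
    row = [rng.randint(0, 1) for _ in range(width)]
    for t in range(steps):
        if t and t % mutate_every == 0:
            table[rng.randrange(8)] ^= 1          # flip one rule entry mid-run
        # Each cell looks at (left, self, right), wrapping at the edges.
        row = [table[(row[(i - 1) % width] << 2)
                     | (row[i] << 1)
                     | row[(i + 1) % width]]
               for i in range(width)]
    return row
```

With a seed it's still reproducible, but nothing stops you from mutating the rule based on anything you like at any step.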
GPU acceleration is faster by several orders of magnitude when brute-forcing rules, and/or enables very complex rules with large neighborhoods. I'm not sure if you can do infinite world-spaces like you see in Golly or various CPU automata though. As far as I know, you're stuck with a fixed world size, usually a power of two somewhere between 512x512 and 4096x4096 pixels/cells.
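Part of why those power-of-two sizes show up everywhere (I believe WebGL 1 even requires power-of-two textures for REPEAT wrapping) is that toroidal wrap-around gets cheap: a bitwise AND instead of a modulo. Illustrative Python below; the same trick works in a kernel or shader. SIZE is just an example value.

```python
# With a power-of-two world size, wrap-around indexing is a bit-mask.
SIZE = 512        # must be a power of two for the mask trick
MASK = SIZE - 1   # 511 == 0b111111111

def wrap(i):
    # Equivalent to i % SIZE when SIZE is a power of two.
    return i & MASK
```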
I don't do any writing or have any blogs or anything like that. Just GitHub/YouTube accounts, and I guess the posts in this subreddit :)
There surely are trade-offs, but I cannot imagine ever making another CPU-only CA again.