CPUs can't be scaled in the same way (and you wouldn't want to). The idea is that a GPU is a thing that can do the same thing to a whole fuckload of similar things, all in parallel. Think about deforming a mesh in a game (or moving every vertex to animate it based on some bones) and you can see how the work becomes big clusters of the same task.
When you program a "shader" (a program that runs on a GPU), it's literally like this: you write what it does for one pixel on the screen or one point on a model, and the GPU runs that for every pixel or every point.
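To make that concrete, here's a rough sketch in plain Python. This is not real GPU code, just the shape of the idea: `shade`, `render`, `WIDTH` and `HEIGHT` are all made-up names for illustration. The "shader" is a function that only knows about one pixel, and something else calls it for every pixel.

```python
# A CPU-side toy, not real GPU code: a "shader" is just a function of one pixel.
WIDTH, HEIGHT = 4, 3  # hypothetical tiny "screen"


def shade(x, y):
    """Return an (r, g, b) color for one pixel. It knows nothing about its neighbors."""
    r = x / (WIDTH - 1)   # red ramps up from left to right
    g = y / (HEIGHT - 1)  # green ramps up from top to bottom
    return (r, g, 0.5)    # constant blue


def render():
    # A GPU would run shade() for many pixels at the same time.
    # Here we just loop, one pixel after another.
    return [[shade(x, y) for x in range(WIDTH)] for y in range(HEIGHT)]


frame = render()
```

Since no call to `shade` depends on any other call, the order doesn't matter, and that's exactly what lets a GPU do them all in parallel.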
GPUs have just been limited by what manufacturers are capable of making; it's not even based on need. That's why you always have your huge silly $800 desktop GPU that no game demands: it's basically the best they can do. These get used for things like weather simulations and now image/pattern recognition (for self-driving cars) too. The entire time, there have been GPUs faster and larger than any game needs, marketed separately for $1,000-2,000. "Faster" GPUs would be limited by the same constraints CPUs are.