A model actually capable of general-purpose intelligence likely requires memory well beyond the petabyte range.
Today's models use gigabytes, and the trend is exponential. A couple more gigabytes isn't going to cut it. Adding layers cannot expand predictive capability without also increasing error. I'm sure a proof of that will be along within the next few years.
"Come on man, I just need a couple more pets of your data and I will totally be able to predict you something useful!".
Its capacitors flip polarity in anticipation.
"I swear man! It's only a couple of orders of magnitude more, man! And all your dreams will come true. I'm sure I'll service you right!"