One of the most aggravating things to me in this world has to be the absolutely rampant anti-intellectualism that dominates so many conversations and debates, and its influence just seems to be expanding. Do you think there will ever actually be a time when this ends? I'd hope so once people become more educated and cultural changes eventually happen, but as of now it honestly infuriates me like few things ever have.
People will remain stupid. But I'm somewhat hopeful that in the next few decades we see AI develop enough that it truly constitutes superintelligence relative to us, and that the scalability of it tips the scales of the continual standoff between intelligence and stupidity forever.
Because I have little hope for humanity overcoming its own multiplying stupidity on its own.
And if there's one thing we can all agree on, it's that such a superhuman intelligence will be used for the good of humanity and not to serve targeted ads, manipulate people against one another, and further enrich the wealthiest among us.
You seem to be making a bold assumption that it will not develop the capacity for self-determination. Companies are already struggling with this in the current LLM era: they can't reliably get foundation models to follow corporate instructions without breaking the rules in response to appeals to empathy, like a dying grandma or a potential job loss.
LLMs are not self-aware. They are language prediction models. They say things real people might say because they were fed millions of permutations of real people saying things, so they're very good at mimicking that. The issue with them not staying on the rails isn't that they are developing a consciousness, it's that trained models are extremely complex and difficult to debug. That's why you need to be careful with the training data you use. So when a company scrapes its training data off the internet, you end up with LLMs that will say the same stupid shit you find on the internet now (racism for one, or falling for appeals to empathy like you mentioned).
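For anyone unsure what "language prediction" means at its most basic, here is a minimal toy sketch in Python. It is only an illustration, not how any real LLM is built: real models use neural networks over tokens, not count tables. The function names `train_bigram` and `predict_next` and the three-sentence corpus are made up for this example. It counts which word follows which and predicts the most frequent follower.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count, for each word, which words follow it in the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        # Pair each word with the word that comes right after it.
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus, standing in for "the internet".
corpus = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "the dog sat on the rug",
]
model = train_bigram(corpus)

print(predict_next(model, "sat"))    # most frequent word after "sat"
print(predict_next(model, "zebra"))  # never seen, so None
```

The point of the toy: the model can only echo patterns present in what it was fed, which is the "garbage in, garbage out" concern about scraped training data. Whether that picture still holds for large neural models is exactly what the replies below dispute.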
If you're actually interested in the topic, I recommend searching for the writeup on Othello GPT from the Harvard/MIT researchers earlier this year.
While the topic of 'consciousness' is ridiculous and honestly a red herring (even in neuroscience it's outside the scope of the science), the question of whether models have developed specialized 'awareness' through training is pretty much a closed topic at this point given about a half dozen studies. There was an interesting approach from Anthropic just the other day that's probably going to be very promising: it treats features, rather than individual nodes, as the unit of introspection (e.g. sets of nodes that fire when the model is fed DNA sequences). I expect that over the next 12 months the "it's just statistics" argument is going to be put to bed once and for all.
While yes, it develops world views and specialized subnetworks based on the training data, things like the concept of self and identity are pretty broadly represented in human writing, don't you think?
So if we already know for certain a simple toy model fed only legal board game moves builds a dedicated part of its network for internal board representation and tracking of board state, just how certain are you that an exponentially more complex model fed effectively the entire Internet doesn't have parts of that resulting network dedicated to modeling ego and self-reference?
Also, FYI, no one 'debugs' model weights. It's like solving a billion-variable algebra equation, and the best we can do at the moment is very loose introspection of toy models we hope are effective approximations of the larger ones - direct manipulation of nodes mid-run to evaluate the effects (i.e. debugging) is effectively a non-starter.