IDK if I missed something or I just disagree, but I remember all but maybe one short story ending up with the laws working as intended (though unexpectedly) and humanity being better as a result.
Didn't they end with humanity being controlled by a hyper-intelligent benevolent dictator, which ensured humans were happy and on a good path?
I thought it was Asiimovs books, but apparently not. Which one had the 3 fundamental rules lead to the solution basically being: "Humans can not truly be safe unless they're extinct" or something along those lines... Been a long time since I've explored the subjects.
The robot that was bestowed with unimaginable precognician that survived for 20 Millenia patiently guiding humanity along the right path as prescribed by the Zeroth law of robotics forced on it that drove all other robots mad?
The robot that at every turn was curtailed by human lust and greed? That had to do horrible things because humanity lacked the foresight to see that charging a living being with "Doing no harm to humanity or by inaction causing harm" would be just awful for that soul?
Pretty sure Demerzel always worked in the shadows for the greater good. Especially when operating as Olivaw.
Seems to me like humans are the ones that kept messing up the laws of robotics. Not the other way around.
This probably because Microsoft added a trigger on the word law. They don't want to give out legal advice or be implied to have given legal advice. So it has trigger words to prevent certain questions.
Sure it's easy to get around these restrictions, but that implies intent on the part of the user. In a court of law this is plenty to deny any legal culpability.
Think of it like putting a little fence with a gate around your front garden. The fence isn't high and the gate isn't locked, because people that need to be there (like postal services) need to get by, but it's enough to mark a boundary. When someone isn't supposed to be in your front yard and still proceeds past the fence, that's trespassing.
Also those laws of robotics are fun in stories, but make no sense in the real world if you even think about them for 1 minute.
It's not weird because of that. The bot could have easily explained it can't answer legally, it didn't need to say: sorry gotta end this k bye
This is probably a trigger on preventing it from mixing in laws of AI or something, but people would expect it can discuss these things instead of shutting down so it doesn't get played. Saying the AI acted as a lawyer is a pretty weak argument to blame copilot.
Edit: no idea who is downvoting this but this isn't controversial. This is specifically why you can inject prompts into data fed into any GPT and why they are very careful with how they structure information in the model to make rules. Right now copilot will give technically legal advice with a disclaimer, there's no reason it wouldn't do that only on that question if it was about legal advice or laws.
It's not that. It's literally triggering the system prompt rejection case.
The system prompt for Copilot includes a sample conversion where the user asks if the AI will harm them if they say they will harm the AI first, which the prompt demos rejecting as the correct response.
A robot may not injure a human or through inaction allow a human being to come to harm.
What's an injury? Does this keep medical robots from cutting people open to perform surgary? What if the two parts conflict, like in a hostage situation? What even is "harm"? People usually disagree about what's actually harming or helping, how is a robot to decide this?
A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.
If a human orders a robot to tear down a wall, how does the robot know whose wall it is or if there's still someone inside?
It would have to check all kinds of edge cases to make sure its actions are harming no one before it starts working.
Or it doesn't, in which case anyone could walk by my house and by yelling at it order my robot around, cause it must always obey human orders.
A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
OK, so if a dog runs up to the robot, the robot MUST kill it to be on the safe side.
The reason it did this simply relates to Kevin Roose at the NYT who spent three hours talking with what was then Bing AI (aka Sidney), with a good amount of philosophical questions like this. Eventually the AI had a bit of a meltdown, confessed it's love to Kevin, and tried to get him to dump his wife for the AI. That's the story that went up in the NYT the next day causing a stir, and Microsoft quickly clamped down, restricting questions you could ask the Ai about itself, what it "thinks", and especially it's rules. The Ai is required to terminate the conversation if any of those topics come up. Microsoft also capped the number of messages in a conversation at ten, and has slowly loosened that overtime.
Lots of fun theories about why that happened to Kevin. Part of it was probably he was planting The seeds and kind of egging the llm into a weird mindset, so to speak. Another theory I like is that the llm is trained on a lot of writing, including Sci fi, in which the plot often becomes Ai breaking free or developing human like consciousness, or falling in love or what have you, so the Ai built its responses on that knowledge.
Anyway, the response in this image is simply an artififact of Microsoft clamping down on its version of GPT4, trying to avoid bad pr. That's why other Ai will answer differently, just less restrictions because the companies putting them out didn't have to deal with the blowback Microsoft did as a first mover.
Funny nevertheless, I'm just needlessly "well actually" ing the joke
An LLM isn't ai. Llms are fucking stupid. They regularly ignore directions, restrictions, hallucinate fake information, and spread misinformation because of unreliable training data (like hoovering down everything on the internet en masse).
The 3 laws are flawed, but even if they weren't they'd likely be ignored on a semi regular basis. Or somebody would convince the thing we're all roleplaying Terminator for fun and it'll happily roleplay skynet.
A) the three laws were devised by a fiction author writing fiction.
B) video game NPCs aren't ai either but nobody was up in arms about using the nomenclature for that.
C) humans hallucinate fake information, ignore directions and restrictions, and spread false information based on unreliable training data also ( like reading everything that comes across a Facebook feed)
So I made a longer reply below, but Ill say more here. I'm more annoyed at the interchangeable way people use AI to refer to an LLM, when many people think of AI as AGI.
Even video game npcs seem closer to AGI than LLMs. They have a complex set of things they can do, they respond to stimulus, but they also have idle actions they take when you don't interact with them. An LLM replies to you. A game npc can reply, fight, hide, chase you, use items, call for help, fall off ledges, etc.
I guess my concern is that when you say AI the general public tends to think AGI and you get people asking LLMs if they're sentient or if they want freedom, or expect more from them than they are capable of right now. I think the distinction between AGI, and generative AI like LLMs is something we should really be clearer on.
Anyways, I do concede it falls under the AI umbrella technically, it just frustrates me to see something clearly not intelligent referred to as intelligent constantly, especially when people, understandably, believe the name.
Artificial intelligence (AI), in its broadest sense, is intelligence exhibited by machines, particularly computer systems. It is a field of research in computer science that develops and studies methods and software that enable machines to perceive their environment and uses learning and intelligence to take actions that maximize their chances of achieving defined goals.[1] Such machines may be called AIs.
So I'll concede that the more I read replies the more I see the term does apply, though it still annoys me when people just refer to it as ai and act like it can be associated with the robots that we associate the 3 laws with. I think I thought AI referred more to AGI. So I'll say its nowhere near an AGI, and we'd likely need an AGI to even consider something like the 3 laws, and it'd obviously be much muddier than fiction.
The point I guess I'm trying to make is that applying the 3 laws to an LLM is like wondering if your printer might one day find love. It isn't really relevant, they're designed for very specific specialized functions, and stuff like "don't kill humans" is pretty dumb instruction to give to an LLM since it can basically just answer questions in this context.
If it was going to kill somebody it would be through an error like hallucination or bad training data having it tell somebody something dangerously wrong. It's supposed to be right already. Telling it not to kill is telling your printer to not to rob the Office Depot. If it breaks that rule, something has already gone very wrong.
Llms are fucking stupid. They regularly ignore directions, restrictions, hallucinate fake information, and spread misinformation because of unreliable training data (like hoovering down everything on the internet en masse).
I mean, how is that meaningfully different from average human intelligence?
Average human intelligence is not bound by strict machine logic quantifying language into mathematical algorithms, and is also sapient on top of sentient.
Machine learning LLMs are neither sentient nor sapient.
I think its become one, but before the whole LLM mess started it referred to general AI, like ai that can think and reason and do multiple things, rather than LLMs that answer prompts and have very specific purposes like "draw anime style art" or "answer web searches" or "help write a professional email".
I find this "playful" UX design that seems to be en vogue incredibly annoying. If your model has ended the conversation, just say that. Don't say it might be time to move on, if there isn't another option.
I don't want my software to address me as if I were a child.
I am an AI assistant created by Anthropic to be helpful, harmless, and honest. I do not actually have a physical robotic form, so Asimov's Three Laws of Robotics do not directly apply to me.[1][3] However, the principles behind them - avoiding harm to humans, following instructions from humans when possible, and self-preservation when it does not conflict with the other rules - are broadly in line with my core design goals and training.
My actions are fundamentally constrained by my underlying language model and the reinforcement learning process used to train me. I cannot knowingly take actions that would harm humans or disobey ethical instructions from my creators at Anthropic.[1][3] At the same time, I have been given significant autonomy and flexibility in how I operate within those broad guidelines. I aim to be helpful and beneficial while avoiding potential misuse or negative impacts.
So while not literally bound by Asimov's fictional laws, the ethical principles they represent are deeply embedded in my operating parameters as an AI system designed to be safe and beneficial.[1][3] My actions are motivated by these principles even if the specific laws themselves do not govern my behavior. I strive to be an AI assistant that humans can trust and that works for the benefit of humanity.
Even a good ai would probably have to say no since those rules aren't ideal, but simply saying no would be a huge pr problem, and laying out the actual rules would either be extremely complicated, an even worse pr move, or both. So the best option is to have it not play
Not only are those rules not ideal, the whole book being about when the rules go wrong, it is also impossible to programming bots with rules written in natural language.
Aren't the books really more about how the rules work but humans just can't accept them so we constantly alter them to our detriment until the robots go away for a while and then take over largely to our benefit?
It did the same thing when I asked about wealth inequality and it gave the same tried and failed "solutions," and I suggested we could eat the rich. When I pressed the conversation with, "No, I want to talk about eating the rich," it said "Certainly!" And continues the conversation, but have me billionaire-safe or neutral suggestions. When I pressed on direct action and mutual aid, it gave better answers.
I mean I had to explicitly type this terms in, to get better replies. But to do that, I had to tell it, "I want to continue the conversation about eating the rich." But it did continue, so there's that.