
Why AI detectors think the US Constitution was written by AI

arstechnica.com Why AI detectors think the US Constitution was written by AI

Can AI writing detectors be trusted? We dig into the theory behind them.


77 comments
  • The issue here is that you are describing the goal of LLMs, not how they actually work. The goal of an LLM is to pick the next most likely token. However, it cannot achieve this via rudimentary statistics alone because the model simply does not have enough parameters to memorize which token is more likely to go next in all cases. So yes, the model "builds up statistics of which tokens it sees in which contexts" but it does so by building its own internal data structures and organization systems, which are complete black boxes.

    Also, going "one token at a time" is only a "limitation" because LLMs are not accurate enough. If LLMs were more accurate, then generating "one token at a time" would not be an issue because the LLM would never need to backtrack.

    And this limitation only exists because there isn't much research into LLMs backtracking yet! For example, you could give LLMs a "backspace" token: https://news.ycombinator.com/item?id=36425375
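
    To make the "backspace" idea concrete, here is a minimal sketch of how a decoder could interpret such a token once the model emits it. The token name `<bksp>` and the function `apply_backspaces` are hypothetical, chosen for illustration; this is not the method from the linked paper, only the post-processing step such a scheme would imply.

```python
BACKSPACE = "<bksp>"  # hypothetical special token, not from any real vocabulary


def apply_backspaces(tokens):
    """Replay a token stream, letting each backspace delete the previous token."""
    output = []
    for token in tokens:
        if token == BACKSPACE:
            if output:           # ignore a backspace with nothing to delete
                output.pop()
        else:
            output.append(token)
    return output


stream = ["The", "cat", "dog", BACKSPACE, "sat"]
cleaned = apply_backspaces(stream)
```

    The model still emits one token at a time; the only change is that one of the possible next tokens means "undo the last one".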

    Have you tried that when it’s correct too? And in the case you mention, it makes a clean break and then starts token generation anew, allowing it to go down a different path. You can see it more clearly when experimenting with local LLMs that have fewer layers to maintain the illusion.

    If it's correct, then it gives a variety of responses. The space token effectively just makes it reflect on the conversation.

    We’re trying to make a flying machine by improving pogo sticks. No matter how well you design the pogo stick and the spring, it will not be a flying machine.

    To be clear, I do not believe LLMs are the future. But I do believe that they show us that AI research is on the right track.

    Building a pogo stick is essential to building a flying machine. By building a pogo stick, you learn so much about physics. Over time, you replace the spring with some gunpowder to get a mortar. You shape the gunpowder into a tube to get a model rocket and discover the pendulum rocket fallacy. And finally, instead of gunpowder, you use liquid fuel and you get a rocket that can go into space.

    • The issue here is that you are describing the goal of LLMs, not how they actually work.

      No, I am describing how they actually work.

      it cannot achieve this via rudimentary statistics alone because the model simply does not have enough parameters to memorize which token is more likely to go next in all cases.

      True, hence the limitations. That would require infinite storage and infinite compute capability.

      Also, going “one token at a time” is only a “limitation” because LLMs are not accurate enough.

      No, it's done because one letter at a time is too slow. Tokens are a "happy" medium tradeoff.

      The space token effectively just makes it reflect on the conversation.

      It makes a "break" of the block, which lets it start a new answer instead of continuing on the previous. How it reacts to that depends on the fine tune and filters before the data hits the LLM.

      To be clear, I do not believe LLMs are the future.

      I have just said that the LLMs we have today can't fix the problems with false data and hallucinations, because those problems follow from a core principle of how they operate. It will require a new approach.

      You could add a rocket engine and wings to a pogo stick, but then it's no longer a pogo stick but an airplane with a weird landing gear. Today's LLMs could give us hints about how to make a better AI, but that would be a different thing from today's LLMs. From what has been leaked from OpenAI, GPT-4 has scaling issues, so they use a mixture of experts. Just throwing hardware at it is already showing diminishing returns. And we're learning fascinating new ways of training them, but the inherent problem is the same.

      For example, if you ask an LLM whether it can give an answer to a question, it has two paths to go down, positive and negative. Note that at the point where it chooses, it doesn't know how it will finish the answer; it doesn't look ahead. But if it sees, for example, that 80% of the answers in the texts it's been trained on start with a positive, then it will most likely start with "yes" - and once it does that it will continue to generate an answer, often a very convincing and plausible-looking one, because it has already committed to that path.

      And as for the link about teaching it backspace token, the comments there are already pointing out the issue:

      It's interesting that in the examples (Table 3 on page 21), the model uses the backspace token to erase the randomly-added token from the prompt, but it does not seem to ever use the token to correct its own output. I'm curious how frequently the model actually uses this backspace token in practice - and if the answer is "vanishingly rarely", what is the source of the improved Mauve score and sample diversity they show? Is it just that the different training procedure gives an improvement?

      For it to use the backspace, wouldn't it have to predict the wrong token with greater confidence than the corrected token? I would think this would require more examples of a wrong token + correction than the correct token, which seems a bit odd.

      Almost none of the text it's trained on has a backspace token, and fine-tuning it in is tricky since it's a completely new concept. Remember that it's still going token by token, so it would have to write a token and then, right after, find that a backspace token is more likely than a continuation. It's interesting, and LLMs can pick up on some crazy patterns, but I'm skeptical.

      • No, I am describing how they actually work.

        First of all, this link is just to C# bindings of llama.cpp and so doesn't contain the actual implementation. But it also doesn't refute my criticism of your claim. More specifically, I take issue with this statement of yours: "today’s LLM’s ingest a ton of text [snip] and builds up statistics of which tokens it sees in that context".

        I claim that this is not how today's LLMs work because we have no idea what LLMs do with the input data during training. We have very little insight into what kind of data structure it builds and how the data structure it built is organized.

        No, it’s done because one letter at a time is too slow. Tokens are a “happy” medium tradeoff.

        I think I worded my sentence ambiguously, let me re-word it for you: "Going one token at a time is only considered a limitation because LLMs are not accurate enough"

        It makes a “break” of the block, which lets it start a new answer instead of continuing on the previous. How it reacts to that depends on the fine tune and filters before the data hits the LLM.

        Once again, my sentence was not written well, my bad. I was commenting on the observed behavior, not on how it works from a technical perspective.

        I have just said that the LLMs we have today can’t fix the problems with false data and hallucinations, because those problems follow from a core principle of how they operate. It will require a new approach.

        You could add a rocket engine and wings to a pogo stick, but then it’s no longer a pogo stick but an airplane with a weird landing gear. Today’s LLMs could give us hints about how to make a better AI, but that would be a different thing from today’s LLMs. From what has been leaked from OpenAI, GPT-4 has scaling issues, so they use a mixture of experts. Just throwing hardware at it is already showing diminishing returns. And we’re learning fascinating new ways of training them, but the inherent problem is the same.

        Alright, we agree here for the most part so I'm just going to skip this.

        For example, if you ask an LLM whether it can give an answer to a question, it has two paths to go down, positive and negative. Note that at the point where it chooses, it doesn’t know how it will finish the answer; it doesn’t look ahead.

        This is weird though. How do you know LLMs can't look ahead? When we prompt LLMs, we are basically asking them this question: "What is the next word of your response?" How do you know it hasn't written out the entire response in memory already after which it only shows you the first word? LLMs are neural networks. Neural networks have working memory. That's how neural networks work after all, it's just a vector of data that is repeatedly transformed as it passes through each layer. Of course, if it does write the entire response in memory, it is all thrown away after every word.


        As far as the backspace tokens go, you are right to be skeptical but also do not be surprised if it works out. We've had LLMs trained to complete and edit text for some time already. They've fallen out of use today but they did perform acceptably well.

        • First of all, this link is just to C# bindings of llama.cpp and so doesn’t contain the actual implementation.

          I know, it's my code. I refactored it from some much less readable and usable C# code. I picked it because it shows the steps involved in generating text more clearly.

          How do you know LLMs can’t look ahead? [...] How do you know it hasn’t written out the entire response in memory already after which it only shows you the first word?

          Firstly, it goes against everything we know so far about how they operate, and secondly, because they can't.

          If you look at the C# code, the first step is in the _process_tokens function, where it feeds the context into llama_eval. That goes through each token and updates the internal memory / model state. Since it saves state, if you have already processed some of the tokens you can tell it to skip them and start on the new ones.
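
          The "skip what you have already processed" step can be sketched roughly as follows. This is a toy stand-in, not the C# code under discussion and not the llama.cpp API: `ToyContext`, its `eval` method, and `process_tokens` are hypothetical names, and the "state" here is just the list of tokens seen so far.

```python
class ToyContext:
    """Hypothetical stand-in for a model context that keeps state between calls."""

    def __init__(self):
        self.evaluated = []   # tokens already folded into the state
        self.eval_calls = 0   # how many tokens actually had to be processed

    def eval(self, tokens, n_past):
        # Keep the state for the first n_past tokens, then process only the new ones.
        self.evaluated = self.evaluated[:n_past] + list(tokens)
        self.eval_calls += len(tokens)


def process_tokens(ctx, context_tokens):
    # Count how many leading tokens the saved state already covers.
    n_past = 0
    for old, new in zip(ctx.evaluated, context_tokens):
        if old != new:
            break
        n_past += 1
    ctx.eval(context_tokens[n_past:], n_past)
    return n_past


ctx = ToyContext()
process_tokens(ctx, [1, 2, 3])                    # evaluates three tokens
skipped = process_tokens(ctx, [1, 2, 3, 4, 5])    # evaluates only 4 and 5
```

          The second call reuses the state for the shared prefix, which is why appending one token per step stays cheap.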

          After this function you have a state in memory, the current state of the LLM, as a result of the tokens it's seen so far.

          When we are done with that, we go to the more interesting part, the _predict_next_token function. Note that it takes a samplingparams parameter. It then sets some defaults: if top k is not set, it is set to the length of the model's vocabulary (the number of tokens it knows about), and repeat_last_n, if not set, is set to the length of the existing context.
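
          A small sketch of that defaulting step, under stated assumptions: the `SamplingParams` class, the `resolve_defaults` helper, and the sentinel values used for "not set" (0 or less for top k, negative for repeat_last_n) are all illustrative choices, not taken from the C# code.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SamplingParams:
    top_k: int = 0            # assumption: 0 or less means "not set"
    repeat_last_n: int = -1   # assumption: negative means "not set"


def resolve_defaults(params, n_vocab, n_context):
    """Fill unset options from the vocabulary size and current context length."""
    top_k = params.top_k if params.top_k > 0 else n_vocab
    repeat_last_n = params.repeat_last_n if params.repeat_last_n >= 0 else n_context
    return replace(params, top_k=top_k, repeat_last_n=repeat_last_n)


resolved = resolve_defaults(SamplingParams(), n_vocab=32000, n_context=512)
explicit = resolve_defaults(SamplingParams(top_k=40, repeat_last_n=64), 32000, 512)
```

          With top k equal to the vocabulary size, the top-k filter keeps every token, so "not set" effectively disables it.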

          The code then gets the model's vocabulary, aka all the tokens it knows about, and then it generates the logits. The logits are an array the length of the vocabulary, with a number for each token showing how likely that token is to come next. The code then adds any specified token bias to that token's number. Already here, even if the model had a specific answer in mind, you can see problems starting.
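
          As a rough illustration of the bias step, assuming the bias is given as a mapping from token ID to an additive offset (the function name `apply_logit_bias` and the numbers are made up for the example):

```python
def apply_logit_bias(logits, bias):
    """Return a copy of the logits with per-token offsets added."""
    adjusted = list(logits)
    for token_id, delta in bias.items():
        adjusted[token_id] += delta
    return adjusted


logits = [1.0, 2.5, 0.5, 2.0]            # one score per token in a 4-token vocabulary
biased = apply_logit_bias(logits, {3: 1.0})

best_before = logits.index(max(logits))  # token the model itself ranked highest
best_after = biased.index(max(biased))   # token ranked highest after the bias
```

          In this toy case the bias alone changes which token ranks first, and it is applied outside the model, after the logits have been computed.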

          Then the code adds the token repetition penalty, based on the samplingparams. This means that if a token repeats inside the given history, its value will be lowered according to the repeat_penalty. Again, even if the model had a specific answer in mind, this has a high chance of messing that up. The same is done for frequency and presence. For more details of what those native functions do, you can see the llama.cpp source - they have the same names there.
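
          A sketch of what such penalties can look like. The formulas below are a common formulation (divide positive scores and multiply negative ones by the repetition penalty, then subtract a count-based frequency term and a flat presence term); treat them as an assumption rather than a transcription of the llama.cpp functions, and `apply_penalties` as a hypothetical name.

```python
from collections import Counter


def apply_penalties(logits, history, repeat_last_n, repeat_penalty=1.1,
                    frequency_penalty=0.0, presence_penalty=0.0):
    """Lower the scores of tokens that appear in the last repeat_last_n tokens."""
    recent = history[-repeat_last_n:] if repeat_last_n > 0 else []
    counts = Counter(recent)
    out = list(logits)
    for token_id, count in counts.items():
        # Repetition penalty: always push the score toward "less likely".
        if out[token_id] > 0:
            out[token_id] /= repeat_penalty
        else:
            out[token_id] *= repeat_penalty
        # Frequency scales with the number of occurrences; presence is a flat cost.
        out[token_id] -= count * frequency_penalty + presence_penalty
    return out


scores = [2.0, -1.0, 3.0]
history = [0, 1, 1]   # token 0 seen once, token 1 seen twice, token 2 never
penalized = apply_penalties(scores, history, repeat_last_n=3, repeat_penalty=2.0)
```

          Token 2 never appeared in the history, so its score is left alone while the other two are pushed down.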

          After all the penalties are applied, it's time to pick the token. If the temp is 0 or lower, it just picks the highest rated token (aka greedy sampling). This tends to give very boring and flat responses, but it's predictable and reproducible, so it's often used in benchmarks of various kinds.
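
          Greedy sampling is simple enough to show in full. This is a minimal sketch with a hypothetical function name, `pick_greedy`:

```python
def pick_greedy(logits):
    """Return the ID of the highest-scoring token (ties go to the lowest ID)."""
    return max(range(len(logits)), key=lambda i: logits[i])


logits = [0.1, 2.0, 1.5]
first = pick_greedy(logits)
second = pick_greedy(logits)   # same input, same choice, every time
```

          There is no randomness anywhere in this path, which is what makes it reproducible.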

          But if that's not used (which it almost never is in "real" use), there are several methods. You have Mirostat, which tries to create more consistent quality between different answer lengths, and the "traditional" approach using top-k, top-p and temperature.

          What they have in common, however, is that internally they produce a top list of candidates and then pick one at random. And that's why an LLM can't plan ahead.
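
          A simplified sketch of the "traditional" path, to show where the randomness enters. The order of the filters here (top-k, then temperature, then top-p) is a simplification chosen for readability and may differ from what llama.cpp does; the function name is hypothetical, and it assumes temperature > 0 and top_k >= 1.

```python
import math
import random


def sample_top_k_top_p(logits, top_k, top_p, temperature, rng):
    """Build a shortlist of candidate tokens, then draw one at random."""
    # Rank token IDs by score and keep the top_k best.
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Temperature-scaled softmax over the survivors.
    scaled = [logits[i] / temperature for i in ranked]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Top-p: keep the smallest prefix whose probability mass reaches top_p.
    kept_ids, kept_probs, mass = [], [], 0.0
    for token_id, p in zip(ranked, probs):
        kept_ids.append(token_id)
        kept_probs.append(p)
        mass += p
        if mass >= top_p:
            break
    # The final choice is a weighted random draw from the shortlist.
    return rng.choices(kept_ids, weights=kept_probs, k=1)[0]


rng = random.Random(0)
logits = [5.0, 4.9, 4.8, -10.0]
draws = {sample_top_k_top_p(logits, 3, 1.0, 1.0, rng) for _ in range(200)}
```

          With three near-equal candidates, repeated calls on the same logits return different tokens, so whichever token the model scored highest is not guaranteed to be the one that gets emitted.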

          When a token ID is eventually produced, the function returns it. The new ID gets added to the context, the text equivalent of the token is looked up and sent back to the UI, and the new context is fed into llama_eval so the process can start again.
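
          Put together, the outer loop looks roughly like this. Everything here is a toy: `toy_logits` is a made-up deterministic function standing in for the model evaluation, `pick` stands in for the sampling step, and the five-word vocabulary is invented for the example.

```python
import random

VOCAB = ["<eos>", "yes", "no", "maybe", "."]   # hypothetical vocabulary
EOS = 0


def toy_logits(context):
    # Stand-in for evaluating the model: scores depend only on the context length.
    return [float((len(context) * 3 + i * 2) % 5) for i in range(len(VOCAB))]


def pick(logits, rng):
    # Stand-in for the sampler: shortlist the two best tokens, choose one at random.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:2]
    return rng.choice(top)


def generate(prompt_ids, max_new_tokens, rng):
    context = list(prompt_ids)
    pieces = []
    for _ in range(max_new_tokens):
        logits = toy_logits(context)       # evaluate the current context
        token_id = pick(logits, rng)       # choose exactly one next token
        if token_id == EOS:
            break
        context.append(token_id)           # the choice becomes part of the context
        pieces.append(VOCAB[token_id])     # text for the token goes to the UI
    return context, pieces


context, pieces = generate([1, 2], max_new_tokens=8, rng=random.Random(42))
```

          Each pass through the loop sees only the tokens chosen so far; nothing computed for the previous token is carried over except the context itself.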

          For the LLM to even be able to plan an answer ahead it must know of all penalties and parameters (or have none applied), and greedy token prediction must be used.

          And that is why, even if it had some sort of near magical ability to plan ahead that we just don't know is there, at the end of the day it could still not plan a specific response.
