Think Before You Speak: Training Language Models with Pause Tokens
HN Discussion