Does my chatbot suffer from mood swings? 🤔 Why does it produce different output text for the same input prompt?

How is it possible that a computer algorithm can generate a different output for the same input? One plus one should always be two, however many times you run the calculation. A neural network is just a calculation. A big one for sure, but still, in the end it is just a large number of additions and multiplications, that’s it, no magic. 🚫✨

🔍 To understand where the variation comes from, we need to take a little peek under the hood. When we chat with a language model, we see it generate its answer one word at a time. Behind the scenes, the model works with a fixed list of tokens (words or parts of words). GPT-4, for example, has a vocabulary of around 100,000 tokens.

When it generates the next token for our chat, GPT-4 calculates around 100,000 scores, one for each token in its vocabulary. 🧮

It now has to choose just one token to send to us. It turns out the output becomes really repetitive and predictable if you always choose the token with the highest score. That is why programmers have added a bit of randomness to the choice, so that tokens with a lower score still have a chance of being chosen. 🎲
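For the curious, here is a minimal sketch of that choice in Python. The four-word vocabulary and its scores are made up for illustration, and the `temperature` knob is an assumption about how the randomness is usually tuned, not a description of GPT-4's actual code.

```python
import math
import random

# Hypothetical toy vocabulary and scores; a real model scores ~100,000 tokens.
VOCAB = ["cat", "dog", "bird", "fish"]
SCORES = [2.0, 1.5, 0.5, 0.1]


def softmax(scores, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pick_greedy(scores):
    """Always pick the highest-scoring token: same input, same output."""
    return VOCAB[scores.index(max(scores))]


def pick_sampled(scores, temperature=1.0, rng=random):
    """Pick a token at random, weighted by its probability."""
    probs = softmax(scores, temperature)
    return rng.choices(VOCAB, weights=probs, k=1)[0]


if __name__ == "__main__":
    print("greedy :", [pick_greedy(SCORES) for _ in range(5)])
    print("sampled:", [pick_sampled(SCORES) for _ in range(5)])
```

The greedy line prints the same word five times, while the sampled line usually mixes in lower-scoring words. That is the "mood swing" in miniature.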

💡 So, tip of the day: ask the same question several times. When brainstorming, you get more diverse ideas, and across several answers the correct one is more likely to show up.
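One way to put a number on that tip, as a rough sketch: assume each attempt is independent and has the same chance of being correct. Both are simplifying assumptions, and the 60% figure below is purely hypothetical.

```python
def chance_of_at_least_one_correct(p_correct, attempts):
    """Probability that at least one of several independent attempts is correct."""
    return 1 - (1 - p_correct) ** attempts


if __name__ == "__main__":
    # Hypothetical per-attempt success rate of 60%.
    for attempts in (1, 3, 5):
        chance = chance_of_at_least_one_correct(0.6, attempts)
        print(f"{attempts} attempt(s): {chance:.1%}")
```

You still have to recognise which of the answers is the right one, of course.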