Saturday, 3 May 2025

Why Aren’t LLMs Fully Deterministic?

LLMs (Large Language Models) like ChatGPT are considered stochastic because their outputs can vary even when given the same input. This happens because of the probabilistic way they generate responses.

1. What Does Stochastic Mean?

• Stochastic means "random" or "involving chance."

• A system is stochastic if it can produce different outcomes under the same conditions, influenced by probabilities rather than fixed rules.

 

2. Why Aren’t LLMs Fully Deterministic?

• Deterministic means you get the same result every time for the same input. For example, 2 + 2 always equals 4 in a deterministic system.

• LLMs, however, use probabilities to decide the next word or token in a response. These probabilities are based on patterns learned from massive amounts of text during training.

 

3. How Do LLMs Generate Text?

LLMs predict the next word or token in a sentence based on probabilities.

For example:

 

• Given the input "The sky is", the model calculates probabilities for the next word:

  ◦ "blue" (80% likely)

  ◦ "clear" (15% likely)

  ◦ "cloudy" (5% likely)

• The model then chooses one option randomly, weighted by these probabilities (see the sketch below).
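A minimal sketch of this weighted sampling, using Python's random module and the made-up probabilities above (this illustrates the idea, not an actual LLM):

```python
import random

# Hypothetical next-token distribution for the prompt "The sky is"
candidates = ["blue", "clear", "cloudy"]
probabilities = [0.80, 0.15, 0.05]

# random.choices draws one candidate, weighted by its probability,
# so "blue" is picked most often but not every time.
next_token = random.choices(candidates, weights=probabilities, k=1)[0]
print("The sky is", next_token)
```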

 

4. Why Does This Lead to Variation?

• Even with the same input, the model might pick a different probable word each time, leading to slightly different outputs.

• For example:

  ◦ Input: "What is the capital of France?"

  ◦ Response 1: "The capital of France is Paris."

  ◦ Response 2: "Paris is the capital of France."

  ◦ Both are correct but phrased differently because of stochasticity.
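Running the toy sampler from section 3 several times makes this concrete: the distribution never changes, yet the chosen word can.

```python
import random

candidates = ["blue", "clear", "cloudy"]
probabilities = [0.80, 0.15, 0.05]

# Five runs with identical input: the output is usually "blue",
# but "clear" or "cloudy" can appear on any given run.
for run in range(1, 6):
    word = random.choices(candidates, weights=probabilities, k=1)[0]
    print(f"Run {run}: The sky is {word}")
```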

5. Temperature and Randomness

LLMs use a setting called temperature to control randomness:

• Low temperature (e.g., 0): Makes the model more deterministic by favoring the most probable word/token.

• High temperature (e.g., 1): Increases randomness, allowing less probable words to be chosen.
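Concretely, temperature typically divides the model's raw scores (logits) before they are converted to probabilities. A sketch with made-up logits shows how it reshapes the distribution (temperature 0 is a special case, usually implemented as a plain argmax rather than a division):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    scaled = [score / temperature for score in logits]
    exps = [math.exp(s) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for "blue", "clear", "cloudy"
logits = [2.0, 0.5, -1.0]

print(softmax_with_temperature(logits, 0.5))  # ~[0.95, 0.05, 0.00] -- sharper
print(softmax_with_temperature(logits, 1.0))  # ~[0.79, 0.18, 0.04]
print(softmax_with_temperature(logits, 2.0))  # ~[0.59, 0.28, 0.13] -- flatter
```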

 

6. Why Is Stochasticity Useful?

·       Creativity: Helps generate varied and creative responses rather than repetitive, predictable ones.

·       Adaptability: Makes the model more versatile in different contexts.

 

7. Can LLMs Be Made Deterministic?

Setting the temperature to 0 generally leads to more deterministic behavior, as it encourages the model to select the most probable output at each step. However, it's not guaranteed to be completely deterministic for all models and inputs.
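At temperature 0, sampling effectively collapses into greedy decoding: the model simply takes the highest-probability token at every step. Continuing the toy example:

```python
# Greedy decoding: always pick the argmax, so the same (hypothetical)
# distribution yields the same token on every run.
candidates = ["blue", "clear", "cloudy"]
probabilities = [0.80, 0.15, 0.05]

best = max(range(len(candidates)), key=lambda i: probabilities[i])
print("The sky is", candidates[best])  # always "blue"
```

Some inference APIs also expose a random seed so that the sampling itself becomes repeatable, but even then bit-exact repeatability isn't guaranteed, for the reasons below.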

 

Here are some factors that can influence the level of determinism:

  • Model Architecture: The specific architecture of the model can affect how temperature influences the output. Some models may be more susceptible to randomness even at low temperatures.
  • Numerical Precision: Floating-point arithmetic can introduce slight variations in calculations, which might lead to different results across runs, even with the same input and temperature setting (see the example below).
  • Tokenization: Tokenization is usually deterministic, but ambiguous inputs, rare words, or differences between tokenizer versions can change the tokens the model actually sees.
  • Hardware and Software Variations: Differences in hardware and software environments (for example, GPUs reordering floating-point operations across parallel threads) can also contribute to slight variations in output.
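The numerical-precision point is easy to demonstrate: floating-point addition is not associative, so summing the same numbers in a different order, as parallel hardware routinely does, can give slightly different results.

```python
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # 0.6000000000000001
right = a + (b + c)  # 0.6

# The same three numbers, added in a different order,
# do not compare equal in floating point.
print(left == right)  # False
```

On GPUs, the order of such additions can vary from run to run, which is one reason even temperature-0 decoding may not be perfectly repeatable.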

 

While setting the temperature to 0 is a good way to increase determinism, it's important to be aware of these factors and potentially experiment with different settings to achieve the desired level of consistency.

 

In summary, LLMs aren’t fully deterministic because they rely on probabilities to generate outputs. This stochastic behaviour makes them flexible and creative but also means they won’t always respond the same way.
