Ok, you have a moderately complex math problem you needed to solve. You gave the problem to 6 LLMS all paid versions. All 6 get the same numbers. Would you trust the answer?

  • OwlPaste@lemmy.world
    link
    fedilink
    English
    arrow-up
    1
    ·
    23 days ago

    my use case was, i expect easier and simpler. so i was able to write automated tests to validate logic of incrementing specific parts of a binary number and found that expected test values llm produced were wrong.

    so if its possible to use some kind of automation to verify llm results for your problem, you will be confident in your answer. but generally llms tend to make up shit and sound confident about it