Is there currently an accurate way to say how much power per prompt LLMs use?

SnausagesinaBlanket@lemmy.world · 7 hours ago

Is there currently an accurate way to say how much power per prompt LLMs use?

Scrubbles@poptalk.scrubbles.tech · 6 hours ago

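You’re looking for tokens. Prompts are broken down into tokens, which are then used to generate tokens in response. Each token is represented by an integer ID. The common metric is tokens/second, and if utilized correctly the GPU should pin at 100% usage while generating tokens. Work out how many tokens per second it’s generating and how many tokens you’re using: tokens divided by tokens/second gives you the time in seconds. Multiply that by the GPU’s power draw in watts (a watt is already one joule per second, so there’s no “wattage per second”) and you get the energy in joules. Divide by 3,600 for watt-hours and you’re good.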
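To make the arithmetic concrete, here is a rough Python sketch of that estimate. The function name and the example figures (500 tokens, 50 tokens/second, 300 W) are made up for illustration and are not measurements. It also assumes a single average throughput and a constant power draw for the whole prompt, and it only counts the GPU, not the rest of the machine or the datacenter overhead.

```python
def energy_per_prompt_wh(total_tokens, tokens_per_second, avg_power_watts):
    """Estimate GPU energy for one prompt, in watt-hours.

    Assumptions (illustrative only): one average throughput covers the
    whole prompt, and the GPU draws a constant avg_power_watts throughout.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")

    # Time spent on the prompt: tokens / (tokens per second) = seconds.
    seconds = total_tokens / tokens_per_second

    # Energy = power * time. Watts are joules per second, so this is joules.
    joules = avg_power_watts * seconds

    # 1 watt-hour = 3600 joules.
    return joules / 3600.0


# Hypothetical figures: 500 tokens at 50 tokens/second on a GPU drawing 300 W.
estimate = energy_per_prompt_wh(500, 50, 300)
print(f"{estimate:.3f} Wh")
```

For real numbers you would swap in your own measured tokens/second and a power reading taken while the model is generating, rather than the card's rated maximum.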