/u/vast_ai on A cost-effective and convenient way to run LLMs on Vast.ai machines

/u/vast_ai · Friday at 11:42 PM

Our philosophy is to disclose as much information about the machine as possible. We are looking at better ways to benchmark and to publish all numbers beforehand.

A drop from 100 tps to around 20 tps could mean the bottleneck is somewhere other than the GPU. Notably, we see a lot of unoptimized Python code that makes the CPU the bottleneck on an instance. Adding filters for CPU, system RAM and PCIe to your search query could help screen out offers from machines with lower-quality system components.
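
As a rough illustration of that kind of filtering, here is a minimal Python sketch. It assumes offers are available as plain dictionaries; the field names (`cpu_cores`, `cpu_ram_gb`, `pcie_bw_gbps`), the sample records and the thresholds are all hypothetical and are not taken from the actual Vast.ai search schema.

```python
# Hypothetical offer records. Field names and values are illustrative
# assumptions, not the real Vast.ai offer schema.
OFFERS = [
    {"id": 1, "gpu": "RTX 4090", "cpu_cores": 4, "cpu_ram_gb": 16, "pcie_bw_gbps": 3.0},
    {"id": 2, "gpu": "RTX 4090", "cpu_cores": 32, "cpu_ram_gb": 128, "pcie_bw_gbps": 24.0},
    {"id": 3, "gpu": "RTX 4090", "cpu_cores": 16, "cpu_ram_gb": 64, "pcie_bw_gbps": 6.0},
]


def filter_offers(offers, min_cpu_cores=16, min_cpu_ram_gb=64, min_pcie_bw_gbps=10.0):
    """Keep only offers whose CPU, system RAM and PCIe figures meet the minimums."""
    return [
        offer
        for offer in offers
        if offer["cpu_cores"] >= min_cpu_cores
        and offer["cpu_ram_gb"] >= min_cpu_ram_gb
        and offer["pcie_bw_gbps"] >= min_pcie_bw_gbps
    ]


if __name__ == "__main__":
    # The same GPU appears in every record; only the host components differ.
    for offer in filter_offers(OFFERS):
        print(offer["id"], offer["cpu_cores"], offer["cpu_ram_gb"], offer["pcie_bw_gbps"])
```

All three sample offers list the same GPU, so a search on GPU model alone would not tell them apart. The thresholds are placeholders and should be adjusted to what your workload needs.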

Thanks for your feedback and for using our service.
