Show HN: TurboPrefill – Multi-GPU prefill acceleration for llama.cpp

TurboPrefill is an attempt to make layer-split multi-GPU configurations spend less time waiting and more time computing during prefill.
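
To give a rough sense of why layer-split setups wait, here is a back-of-the-envelope sketch. It is not TurboPrefill's actual method and not llama.cpp code. It assumes an idealized pipeline where each GPU holds one contiguous block of layers, every stage takes the same time per micro-batch, and transfers are free. The function name and the numbers are purely illustrative.

```python
def pipeline_utilization(num_stages: int, num_microbatches: int) -> float:
    """Fraction of GPU time slots doing useful work in an idealized pipeline.

    Assumption (hypothetical model): every stage takes one time slot per
    micro-batch, and a stage can only start a micro-batch after the previous
    stage has finished it.
    """
    if num_stages < 1 or num_microbatches < 1:
        raise ValueError("num_stages and num_microbatches must be >= 1")
    # The last micro-batch leaves the last stage after M + S - 1 slots.
    total_slots = num_stages * (num_microbatches + num_stages - 1)
    # Each stage is busy for exactly M of those slots.
    busy_slots = num_stages * num_microbatches
    return busy_slots / total_slots


if __name__ == "__main__":
    # With 4 GPUs, a prompt processed as one batch keeps only one GPU busy
    # at a time; splitting it into more micro-batches shrinks the idle share.
    for m in (1, 4, 16):
        print(f"4 stages, {m:2d} micro-batches: {pipeline_utilization(4, m):.1%} busy")
```

Under these assumptions, a 4-GPU layer split that pushes the whole prompt through as a single batch is busy only a quarter of the time. Real utilization also depends on transfer cost and uneven layer sizes, which this model ignores.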


Comments URL: https://news.ycombinator.com/item?id=48390116
