vllama is a hybrid server that combines Ollama's versatile model management with vLLM's high-speed GPU inference. The result is an OpenAI-compatible API backed by Ollama's model workflow and vLLM's throughput.
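Because the API is OpenAI-compatible, any standard OpenAI client should be able to talk to it. A minimal sketch, assuming the server listens on `http://localhost:8000/v1` and serves a model pulled as `llama3` (the port and model name are placeholders, not confirmed defaults):

```python
# Minimal sketch: querying a vllama server through the OpenAI Python client.
# Assumptions (check your own setup): the server listens on
# http://localhost:8000/v1 and a model named "llama3" is available.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # point the client at the local server
    api_key="unused",                     # local servers typically ignore the key
)

response = client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "Hello from vllama!"}],
)
print(response.choices[0].message.content)
```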
\* evalplus: scores are calculated based on test cases from both HumanEval and EvalPlus.
\*\* basic: scores are calculated based on test cases from HumanEval only.
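For context, scores like these are typically produced by generating one solution per HumanEval task and then scoring the resulting JSONL file with the `evalplus` package; the "basic" figure uses only the original HumanEval tests, while "evalplus" adds the extra EvalPlus test cases. A rough sketch of the generation step (the `generate` function is a hypothetical stand-in for a real model call, e.g. against the vllama endpoint above):

```python
# Sketch: producing a samples file that evalplus can score.
# get_human_eval_plus() returns HumanEval problems augmented with the
# additional EvalPlus test cases; the "basic" score uses the originals only.
from evalplus.data import get_human_eval_plus, write_jsonl

def generate(prompt: str) -> str:
    # Placeholder for a real model call; returns a (trivial) function body.
    return "    return None\n"

samples = [
    {"task_id": task_id, "solution": generate(problem["prompt"])}
    for task_id, problem in get_human_eval_plus().items()
]
write_jsonl("samples.jsonl", samples)
```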