llama.cpp

Overview

Nitro is an inference server built on top of llama.cpp. It provides an OpenAI-compatible API, request queueing, and scaling. Nitro is the default AI engine downloaded with Jan, so no additional setup is needed.
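Because the API is OpenAI-compatible, requests use the same JSON shape as OpenAI's chat completions endpoint. A minimal sketch is below; the host, port, path, and model name are assumptions for illustration and may differ in your local setup.

```python
import json

# Hypothetical local endpoint; check your Jan/Nitro settings for the actual
# host, port, and path.
NITRO_URL = "http://localhost:3928/v1/chat/completions"

# OpenAI-style chat-completions payload; "llama2-7b" is a placeholder model name.
payload = {
    "model": "llama2-7b",
    "messages": [
        {"role": "user", "content": "Hello!"},
    ],
    "stream": False,
}

# Sending the request is sketched only, since it needs a running Nitro server:
#
# import urllib.request
# req = urllib.request.Request(
#     NITRO_URL,
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))

print(json.dumps(payload, indent=2))
```

Any OpenAI client library can also be pointed at the local server by overriding its base URL, which is the usual way to reuse existing tooling with a local engine.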