Inference Providers

📄️ llama.cpp
Overview

📄️ TensorRT-LLM
Users with Nvidia GPUs can get 20-40% faster* token speeds on their laptops or desktops by using TensorRT-LLM. The broader benefit is that you are running FP16, which is also more accurate than quantized models.