llama.cpp — Ultimate Local LLM Inference Engine

One of the most popular open-source projects for running large language models locally on a wide range of hardware. Pure C/C++ implementation with no required external dependencies. Runs Llama, Mistral, Phi, Gemma, and hundreds of other models on CPU, GPU, or Apple Silicon. 75,000+ GitHub stars. Supports GGUF format for efficient offlin
