LFM-2B (GGUF / CUDA)

Process

by mm-team · 5/18/2026

Liquid Foundation Model 2B in GGUF format via llama-server. Ultra-fast on any CUDA GPU. Only ~2 GB VRAM needed. OpenAI-compatible API on port 8080.

↓ 18 downloads ♥ 5 likes ⚡ 14 boots cuda

#gguf #lfm #lightweight #llama.cpp

config.json

{
  "server_type": "process",
  "hardware_type": "cuda",
  "script": "scripts/process/cuda/setup.sh",
  "env": {
    "MODEL_PATH": "models/lfm-2b-q8_0.gguf",
    "HOST": "127.0.0.1",
    "PORT": "8080",
    "GPU_LAYERS": "99",
    "CTX_SIZE": "8192"
  },
  "health_check": "http://127.0.0.1:8080/health",
  "startup_timeout_secs": 30
}