

Qualifire is Now Available on LiteLLM
We are thrilled to announce that Qualifire is now a native guardrails provider for LiteLLM, the leading open-source LLM gateway.
While our partnership with Portkey focuses on enterprise infrastructure and governance, our work with LiteLLM is all about the builders—giving developers the flexibility to implement robust "Guardrails as Code" directly within their existing open-source workflows.
LiteLLM has become the industry standard for standardizing I/O across 100+ LLMs. With this integration, we are bringing that same level of standardization to AI Safety. You can now switch models, providers, and parameters instantly, while keeping a consistent, high-performance safety layer active across everything.
Guardrails as Code
The power of LiteLLM lies in its simplicity: a single config.yaml file controls your entire generative AI footprint. We designed the Qualifire integration to fit perfectly into this philosophy.
Instead of managing complex webhooks or separate proxy servers, you can now define your safety logic declaratively alongside your model definitions. This means your guardrails can be version-controlled, code-reviewed, and deployed just like any other part of your stack.
Here is how simple it is to add production-grade protection to any model:
```yaml
guardrails:
  - guardrail_name: "qualifire-guard"
    litellm_params:
      guardrail: qualifire
      mode: "during_call"
      api_key: os.environ/QUALIFIRE_API_KEY
      prompt_injections: true
      hallucinations_check: true
```
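To show what "alongside your model definitions" looks like in practice, here is a sketch of a complete minimal config.yaml pairing a model entry with the guardrail above. The model name, route, and environment variables are illustrative placeholders; substitute whichever provider you actually use:

```yaml
# Example config.yaml: model definitions and guardrails live side by side.
# The model entry is illustrative; any LiteLLM-supported model works here.
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY

guardrails:
  - guardrail_name: "qualifire-guard"
    litellm_params:
      guardrail: qualifire
      mode: "during_call"
      api_key: os.environ/QUALIFIRE_API_KEY
      prompt_injections: true
      hallucinations_check: true
```

Because the whole file is plain YAML, it can sit in your repo and go through the same pull-request review and deployment pipeline as the rest of your stack.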
Engineered for Latency and Flexibility
One size rarely fits all in production. Some checks need to happen before the model even sees the prompt; others need to verify the output before the user sees it.
Through LiteLLM’s flexible event hooks, Qualifire supports three distinct modes of operation, giving you granular control over the latency-safety trade-off:
- pre_call: Runs on input before the LLM is hit. Perfect for blocking Prompt Injections or detecting PII to save costs and prevent malicious context from ever reaching the model.
- post_call: Runs on the output. Essential for Hallucination Detection, Grounding, and Content Moderation to ensure the final response is safe and accurate.
- during_call: The speed demon. Runs checks on the input in parallel with the LLM call. If a violation is found, the response is blocked before it reaches the user, but you don't pay a latency penalty for the check itself.
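To make the trade-off concrete, here is a sketch of a config.yaml that combines the modes: the prompt is screened for injections before the call, and the response is checked for hallucinations before it is returned. The guardrail names are arbitrary, and the check parameters are the same ones shown earlier:

```yaml
guardrails:
  # pre_call: screen the input before it ever reaches the model
  - guardrail_name: "qualifire-input"
    litellm_params:
      guardrail: qualifire
      mode: "pre_call"
      api_key: os.environ/QUALIFIRE_API_KEY
      prompt_injections: true

  # post_call: verify the finished response before it reaches the user
  - guardrail_name: "qualifire-output"
    litellm_params:
      guardrail: qualifire
      mode: "post_call"
      api_key: os.environ/QUALIFIRE_API_KEY
      hallucinations_check: true
```

Changing the first entry's mode to "during_call" keeps the same input check but runs it in parallel with the model call instead of blocking in front of it.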
Comprehensive Coverage, Zero Lock-in
The beauty of using LiteLLM is the freedom to swap models, moving from GPT-5.2 to Claude 4.5 or a local model without changing your application code. Qualifire ensures your safety standards travel with you.
Whether you are using OpenAI, Anthropic, or a self-hosted vLLM instance, Qualifire provides a unified layer of protection including:
- Prompt Injection Detection: Stop jailbreaks cold.
- Hallucination & Grounding: Verify that answers are actually supported by your context.
- Tool Use Quality: Validate that your agents are calling tools correctly and with safe arguments.
- Custom Assertions: Enforce business-specific logic (e.g., "Never offer any financial advice" or "No competitor mentions").
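As a rough sketch of what that portability looks like, the model_list below could point at a hosted provider and a self-hosted vLLM endpoint at the same time, while the guardrails section stays untouched. The model identifiers and the api_base URL are illustrative; use whatever you actually deploy:

```yaml
model_list:
  # Hosted provider
  - model_name: claude
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20240620
      api_key: os.environ/ANTHROPIC_API_KEY

  # Self-hosted vLLM server exposing an OpenAI-compatible endpoint
  - model_name: local-llama
    litellm_params:
      model: hosted_vllm/meta-llama/Llama-3.1-8B-Instruct
      api_base: http://localhost:8000/v1

# The guardrails block is unchanged: the same Qualifire checks
# apply to every model listed above.
guardrails:
  - guardrail_name: "qualifire-guard"
    litellm_params:
      guardrail: qualifire
      mode: "during_call"
      api_key: os.environ/QUALIFIRE_API_KEY
      prompt_injections: true
      hallucinations_check: true
```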
Getting Started
This integration is available today. You can start securing your LiteLLM gateway with Qualifire in minutes.
- Install the SDK: pip install qualifire
- Update your Config: Add the qualifire guardrail block to your config.yaml.
- Run Securely: Launch the gateway with litellm --config config.yaml and build with confidence.
For full implementation details, check out our Documentation or the LiteLLM Guardrails Docs.