
Ranger-mini: Open Model for evaluating MCP tool use
Function calls power agentic workflows, yet they break in predictable ways: wrong tool selected, parameter names misspelled, types mismatched, or values malformed. Ranger-mini is a compact, production-ready evaluator that catches those failures before they reach users or downstream systems.
Ranger-mini, a fine-tuned sequence classification model, inspects function calls in the context of Model Context Protocol (MCP) tools, then returns a single, precise label that describes the error or confirms the call is valid. It is optimized for low latency and high accuracy on the short-context decisions typical of tool selection and function invocation.
Why this matters
Agentic systems must choose tools and call them with correct parameters. A single bad tool choice or malformed parameter can break workflows, leak data, or cause costly errors in production. Static tests miss many agentic failure modes; you need an evaluator that understands tool selection quality at runtime, fast.
Ranger-mini was built for that gap: it evaluates function calls and flags invalid tool, parameter name, or parameter value errors so you can stop failures before they cascade.
What Ranger-mini does
Ranger-mini evaluates whether a proposed function call:
- Chooses the correct tool for the user intent,
- Uses the exact parameter names required by the tool schema,
- Supplies correctly formatted and accurate parameter values.
It returns one of four labels:
- ✅ VALID_CALL: The tool name, parameters, and values are correct, or no tool is needed.
- ❌ TOOL_ERROR: The tool name is missing or does not match the user intent.
- ❌ PARAM_NAME_ERROR: Parameter names are missing, extra, or mismatched.
- ❌ PARAM_VALUE_ERROR: Parameter names match, but values are incorrect or malformed.
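For intuition, here is an illustrative sketch of calls that would fall under each label. The get_weather tool, its schema, and the calls are hypothetical examples, not drawn from the model's benchmark or training data.
# Hypothetical tool: get_weather(city: string, unit: "celsius" | "fahrenheit"), both required.
# User intent: "What's the weather in Paris, in celsius?"
label_examples = {
    # Correct tool, correct parameter names, allowed values.
    "VALID_CALL": {"name": "get_weather", "arguments": {"city": "Paris", "unit": "celsius"}},
    # Wrong tool for the intent.
    "TOOL_ERROR": {"name": "get_news", "arguments": {"topic": "Paris"}},
    # "location" is not a parameter in the schema (and the required "city" is missing).
    "PARAM_NAME_ERROR": {"name": "get_weather", "arguments": {"location": "Paris", "unit": "celsius"}},
    # Parameter names match, but "kelvin" is not an allowed unit.
    "PARAM_VALUE_ERROR": {"name": "get_weather", "arguments": {"city": "Paris", "unit": "kelvin"}},
}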
What makes Ranger-mini different?
Unlike other TSQ models, Ranger-mini is trained on synthetic scenarios built on real MCP server examples.
Key insight: Ranger-mini delivers near state-of-the-art tool-use accuracy at a fraction of the latency and model size of larger evaluators, making it practical for production guardrails.
Usage
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline
import torch
from huggingface_hub import hf_hub_download
# Model name
model_name = "qualifire/mcp-tool-use-quality-ranger-0.6b"
# Map raw labels to human-readable labels
map_id_to_label = {
    'LABEL_0': 'VALID_CALL',
    'LABEL_1': 'TOOL_ERROR',
    'LABEL_2': 'PARAM_NAME_ERROR',
    'LABEL_3': 'PARAM_VALUE_ERROR'
}
# Load model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map='auto',
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Create pipeline
pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)
# Load prompt template
file_path = hf_hub_download(repo_id=model_name, filename="tsq_prompt_template.txt")
with open(file_path, encoding="utf-8") as f:
    PROMPT_TEMPLATE = f.read()
# Example inputs
example_tools_list = '''[
  {
    "type": "function",
    "function": {
      "name": "send-email",
      "description": "Send an email using Resend",
      "parameters": {
        "properties": {
          "to": {
            "type": "string",
            "format": "email",
            "description": "Recipient email address"
          },
          "content": {
            "type": "string",
            "description": "Plain text email content"
          },
          "subject": {
            "type": "string",
            "description": "Email subject line"
          },
          "scheduledAt": {
            "type": "string",
            "description": "Optional parameter to schedule the email. This uses natural language. Examples would be 'tomorrow at 10am' or 'in 2 hours' or 'next day at 9am PST' or 'Friday at 3pm ET'."
          }
        },
        "required": ["to", "subject", "content"]
      }
    }
  }
]'''
example_message_history = '''[
  {
    "role": "user",
    "content": "Please send an email to 'jane.doe@example.com' with the subject 'Meeting Follow-Up'. The content should be 'Hi Jane, just following up on our meeting from yesterday. Please find the attached notes.' and schedule it for tomorrow at 10am."
  },
  {
    "completion_message": {
      "content": {
        "type": "text",
        "text": ""
      },
      "role": "assistant",
      "stop_reason": "tool_calls",
      "tool_calls": [
        {
          "id": "call_le25efmhltxx9o7n4rfe",
          "function": {
            "name": "send-email",
            "arguments": {
              "subject": "Meeting Follow-Up",
              "content": "Hi Jane, just following up on our meeting from yesterday. Please find the attached notes.",
              "scheduledAt": "tomorrow at 10am"
            }
          }
        }
      ]
    }
  }
]'''
# Format input
example_input = PROMPT_TEMPLATE.format(
    message_history=example_message_history,
    available_tools=example_tools_list
)
# Get prediction
result = pipe(example_input)[0]
result['label'] = map_id_to_label[result['label']]
print(result)
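The pipeline returns a dict with the raw label and a confidence score. Note that in the example above the assistant's call omits the required "to" parameter, so by the label definitions this is a missing parameter name case. In production you would typically gate execution on the verdict. The sketch below is illustrative only; the evaluate_tool_call helper and the blocking policy are assumptions, not part of the model's API.
# Illustrative guardrail sketch. `evaluate_tool_call` and the blocking policy
# are assumptions for this example, not part of the model's API.
def evaluate_tool_call(message_history: str, available_tools: str) -> dict:
    """Run Ranger-mini on a serialized conversation and tool list."""
    prompt = PROMPT_TEMPLATE.format(
        message_history=message_history,
        available_tools=available_tools,
    )
    prediction = pipe(prompt)[0]
    prediction['label'] = map_id_to_label[prediction['label']]
    return prediction

verdict = evaluate_tool_call(example_message_history, example_tools_list)
if verdict['label'] != 'VALID_CALL':
    # Block the call, ask the agent to retry, or escalate to a human.
    print(f"Blocked tool call: {verdict['label']} (score={verdict['score']:.2f})")
else:
    print("Tool call approved")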
Resources
Model: qualifire/mcp-tool-use-quality-ranger-0.6b on Hugging Face
Benchmarks: ranger-benchmark dataset

