# Fine-Tuning Pipeline

Developers who need models specialized for their domain can use the fine-tuning pipeline. The process:

1. **Select a base model** from the Cluster model catalog (e.g., Llama 3.1 70B)
2. **Provide training data** — either upload new data or reference an existing tokenized dataset from the marketplace
3. **Configure training parameters** — learning rate, epochs, batch size, evaluation criteria
4. **Submit the job** — Cluster provisions GPU compute and runs the fine-tuning workflow
5. **Deploy the result** — the fine-tuned model is automatically hosted on Cluster's inference layer, accessible via the same <mark style="color:$warning;">`/v1/chat/completions`</mark> endpoint

The output is not a file download - it is a live, production-ready inference endpoint served through the same API gateway as every other model on the platform.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://cluster-protocol.gitbook.io/whitepaper/core-infrastructure/overview/ai-services/fine-tuning-pipeline.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.