You trained a model. Now what? Most platforms stop there — you get a model file and a bill. Aquin goes further. Every trained model gets a live inference endpoint and an API key system, so you can actually use what you built.
install in seconds
The SDK is available on both PyPI and npm. Install whichever fits your stack:
pip install aquin
npm install aquin

generate your api key
Once you have a completed training run, go to the API Keys tab in your Aquin dashboard. Select the model you want to expose, hit Generate, and copy the key immediately — it's shown exactly once and never stored in plaintext.
Keys look like this: aq-m_xxxxxxxxxxxxxxxx
You can generate multiple keys per model, revoke any of them at any time, and track usage per key — calls made and credits spent. Each key has its own independent TPM and RPM limits you can configure from the dashboard.
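The per-key bookkeeping can be pictured as a small record (an illustrative sketch only; the field names and types are assumptions, not Aquin's actual schema — the defaults match the documented rate limits):

```python
from dataclasses import dataclass

@dataclass
class ApiKeyRecord:
    """Hypothetical per-key record mirroring what the dashboard tracks."""
    model_id: str
    rpm_limit: int = 60         # requests per minute (dashboard default)
    tpm_limit: int = 30_000     # tokens per minute (dashboard default)
    calls_made: int = 0         # usage: total calls on this key
    credits_spent: float = 0.0  # usage: credits charged to the owner
    revoked: bool = False       # revocation takes effect immediately

key = ApiKeyRecord(model_id="my-fine-tuned-model")
```

Because limits and usage live on the key, not the model, each key can be tuned and revoked independently.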
call your model — python
Three lines. Initialize the client, call complete, get a response:
from aquin import AquinClient
client = AquinClient("aq-m_your_key_here")
res = client.complete("What is LoRA fine-tuning?")
print(res.text)
print(f"tokens in: {res.tokens_in}, out: {res.tokens_out}")

Use system_prompt to give your model a persona or set of instructions that applies to every message:
from aquin import AquinClient
client = AquinClient("aq-m_your_key_here")
res = client.complete(
    "What countries is this approved in?",
    system_prompt="You are a knowledgeable medical information specialist. Always recommend consulting a healthcare provider for personal medical decisions."
)
print(res.text)

Need async? There's a native async method too:
import asyncio
from aquin import AquinClient
async def main():
    client = AquinClient("aq-m_your_key_here")
    res = await client.acomplete(
        "Summarize this contract:",
        system_prompt="You are a legal assistant. Be precise and concise."
    )
    print(res.text)

asyncio.run(main())

call your model — javascript / typescript
import { AquinClient } from "aquin";
const client = new AquinClient("aq-m_your_key_here");
const res = await client.complete("What is LoRA fine-tuning?");
console.log(res.text);
console.log(`tokens in: ${res.tokens_in}, out: ${res.tokens_out}`);

Pass a system_prompt in the options object:
import { AquinClient } from "aquin";
const client = new AquinClient("aq-m_your_key_here");
const res = await client.complete("What countries is this approved in?", {
  system_prompt: "You are a knowledgeable medical information specialist. Always recommend consulting a healthcare provider for personal medical decisions.",
});
console.log(res.text);

Works in Node.js, Next.js, and any modern JS runtime. The SDK uses the native fetch API — no extra dependencies.
or just use curl
No SDK needed. Every model endpoint is a plain HTTPS POST:
curl -X POST https://www.aquin.app/api/infer \
-H "Authorization: Bearer aq-m_your_key_here" \
-H "Content-Type: application/json" \
-d '{"prompt": "What is LoRA?", "max_tokens": 256}'

With a system prompt:
curl -X POST https://www.aquin.app/api/infer \
-H "Authorization: Bearer aq-m_your_key_here" \
-H "Content-Type: application/json" \
-d '{
  "prompt": "What is LoRA?",
  "messages": [
    { "role": "system", "content": "You are an expert ML researcher. Be concise." },
    { "role": "user", "content": "What is LoRA?" }
  ],
  "max_tokens": 256
}'

Response:
{
  "text": "LoRA (Low-Rank Adaptation) is a fine-tuning technique...",
  "tokens_in": 6,
  "tokens_out": 42
}

parameters
Every completion call accepts these parameters:
- prompt — the input text. required.
- system_prompt — an optional instruction prepended as the first message. sets the model's persona or behavior for the entire conversation.
- max_tokens — maximum tokens to generate. default 512.
- temperature — controls randomness, 0.0–1.0. default 0.7.
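Putting the table together, a request body with the documented defaults filled in could be assembled like this (a sketch — the exact payload shape the SDK sends is an assumption, and build_payload is a hypothetical helper, not part of the SDK):

```python
def build_payload(prompt, system_prompt=None, max_tokens=512, temperature=0.7):
    """Assemble an inference request body using the documented defaults."""
    if not prompt:
        raise ValueError("prompt is required")
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")
    payload = {
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    # system_prompt is optional; omit it entirely when unset
    if system_prompt is not None:
        payload["system_prompt"] = system_prompt
    return payload
```

Calling build_payload("What is LoRA?") yields a body with max_tokens 512 and temperature 0.7, matching the defaults above.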
rate limits
Each API key has two independent rate limits, both configurable from the dashboard after key creation:
- RPM — requests per minute. default 60, range 10–600. limits how many calls can be made in a rolling 60-second window.
- TPM — tokens per minute. default 30K, range 5K–60K. limits total token throughput (input + output) in a rolling 60-second window.
When either limit is exceeded the API returns a 429 with a Retry-After: 60 header and a retry_after_ms field in the body. Both limits reset on a rolling 60-second basis — not on the clock minute.
// 429 response body
{
  "error": "Rate limit exceeded",
  "retry_after_ms": 60000
}

error handling
The SDK raises typed exceptions so you can handle specific failures without inspecting raw HTTP status codes:
from aquin import AquinClient
from aquin import InvalidKeyError, InsufficientCreditsError, RateLimitError, InferenceError
client = AquinClient("aq-m_your_key_here")
try:
    res = client.complete("Hello")
except InvalidKeyError:
    print("Key is invalid or has been revoked")
except InsufficientCreditsError:
    print("Top up your credits at aquin.app")
except RateLimitError as e:
    print(f"Rate limited — retry in {e.retry_after_ms}ms")
except InferenceError:
    print("Model inference failed — try again")

The full list of exceptions:
- AquinError — base class for all errors
- InvalidKeyError — key is wrong or revoked
- InsufficientCreditsError — owner account is out of credits
- RateLimitError — RPM or TPM limit hit; check e.retry_after_ms
- ModelNotFoundError — model hasn't finished training
- InferenceError — GPU server error
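Since RateLimitError carries the server-suggested delay, a retry wrapper falls out naturally. Here's a sketch; a stand-in exception class is defined locally so the snippet is self-contained — in real code you would import RateLimitError from aquin instead:

```python
import time

class RateLimitError(Exception):
    """Stand-in for aquin's RateLimitError, which carries retry_after_ms."""
    def __init__(self, retry_after_ms=60_000):
        super().__init__("Rate limit exceeded")
        self.retry_after_ms = retry_after_ms

def complete_with_retry(call, max_retries=3):
    """Run call(), sleeping for the server-suggested delay on each 429."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError as e:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error to the caller
            time.sleep(e.retry_after_ms / 1000)
```

Usage would be something like complete_with_retry(lambda: client.complete("Hello")).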
billing
API calls are charged to the model owner's credit balance based on inference time — the same credit system used for training. One credit equals one minute of compute. A typical API call takes 1–5 seconds, so costs are fractions of a credit per call.
You can monitor exactly how many calls each key has made and how many credits have been spent from the API Keys tab in your dashboard.
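Since one credit equals one minute of compute, a call's cost is just its inference time divided by 60 (assuming billing is strictly proportional to inference seconds):

```python
def call_cost_in_credits(inference_seconds: float) -> float:
    """One credit per minute of compute, billed proportionally."""
    return inference_seconds / 60.0

# a typical 3-second call:
# call_cost_in_credits(3.0) → 0.05 credits
```

At that rate, one credit covers roughly 12 to 60 typical calls in the 1–5 second range.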
security
API keys are hashed with SHA-256 before storage. The raw key is never saved — not in our database, not in logs, not anywhere. If you lose it, generate a new one. Revoke keys instantly from the dashboard.
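The store-the-hash pattern is easy to picture (an illustrative sketch of the general technique, not Aquin's actual implementation):

```python
import hashlib
import hmac

def hash_key(raw_key: str) -> str:
    """What gets stored: the SHA-256 digest, never the raw key."""
    return hashlib.sha256(raw_key.encode()).hexdigest()

def verify_key(presented_key: str, stored_hash: str) -> bool:
    """On each request, hash the presented key and compare digests."""
    # compare_digest runs in constant time, avoiding timing leaks
    return hmac.compare_digest(hash_key(presented_key), stored_hash)
```

Because only the digest is stored, a database leak exposes no usable keys, and losing a raw key is unrecoverable by design.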
The inference VM is never directly exposed. Every request routes through Aquin's API layer where authentication, billing, and rate limiting are enforced before the model is ever touched.
what's next
Streaming responses, per-token billing, and a RAG API key for querying your training datasets directly are all coming. The SDK will stay in sync — version 1.x of the SDK will always talk to the v1 API.
pip install aquin. npm install aquin. build something.
