You trained a model. Now what? Most platforms stop there — you get a model file and a bill. Aquin goes further. Every trained model gets a live inference endpoint and an API key system, so you can actually use what you built.
install in seconds
The SDK is available on both PyPI and npm. Install whichever fits your stack:
pip install aquin
npm install aquin
generate your api key
Once you have a completed training run, go to the API Keys tab in your Aquin dashboard. Select the model you want to expose, hit Generate, and copy the key immediately — it's shown exactly once and never stored in plaintext.
Keys look like this: aq-m_xxxxxxxxxxxxxxxx
You can generate multiple keys per model, revoke any of them at any time, and track usage per key — calls made and credits spent.
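Because the key is displayed only once, a common pattern is to keep it in an environment variable rather than in source code. A minimal sketch, assuming a hypothetical variable name AQUIN_API_KEY (the SDK does not define it) and using the aq-m_ prefix shown above as a basic sanity check:

```python
import os

KEY_PREFIX = "aq-m_"  # prefix taken from the key format shown above


def load_api_key(env=os.environ, var_name="AQUIN_API_KEY"):
    """Read the key from the environment and check its prefix.

    AQUIN_API_KEY is a hypothetical variable name chosen for this sketch;
    the SDK does not define it.
    """
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set")
    if not key.startswith(KEY_PREFIX):
        raise ValueError(f"{var_name} does not look like a model key")
    return key
```

The returned string can then be passed to AquinClient in place of a hard-coded key.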
call your model — python
Three lines. Initialize the client, call complete, get a response:
from aquin import AquinClient
client = AquinClient("aq-m_your_key_here")
res = client.complete("What is LoRA fine-tuning?")
print(res.text)
print(f"tokens in: {res.tokens_in}, out: {res.tokens_out}")
Need async? There's a native async method too:
import asyncio
from aquin import AquinClient
async def main():
    client = AquinClient("aq-m_your_key_here")
    res = await client.acomplete("Summarize this contract:")
    print(res.text)

asyncio.run(main())
call your model — javascript / typescript
import { AquinClient } from "aquin";
const client = new AquinClient("aq-m_your_key_here");
const res = await client.complete("What is LoRA fine-tuning?");
console.log(res.text);
console.log(`tokens in: ${res.tokens_in}, out: ${res.tokens_out}`);
Works in Node.js, Next.js, and any modern JS runtime. The SDK uses the native fetch API, with no extra dependencies.
or just use curl
No SDK needed. Every model endpoint is a plain HTTPS POST:
curl -X POST https://www.aquin.app/api/v1/model \
  -H "Authorization: Bearer aq-m_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is LoRA?", "max_tokens": 256}'
Response:
{
  "text": "LoRA (Low-Rank Adaptation) is a fine-tuning technique...",
  "tokens_in": 6,
  "tokens_out": 42
}
parameters
Every completion call accepts three parameters:
- prompt: the input text. required.
- max_tokens: maximum tokens to generate. default 512.
- temperature: controls randomness, from 0.0 to 1.0. default 0.7.
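For reference, the curl call and the three parameters above map onto a small standard-library helper. This is a sketch, not part of the SDK: build_request is a hypothetical name, sending temperature as a JSON field alongside prompt and max_tokens is an assumption based on the parameter list, and the client-side range check is illustrative. The request is only constructed here, not sent.

```python
import json
import urllib.request

ENDPOINT = "https://www.aquin.app/api/v1/model"  # URL from the curl example


def build_request(api_key, prompt, max_tokens=512, temperature=0.7):
    """Build (but do not send) the POST request for one completion call."""
    if not prompt:
        raise ValueError("prompt is required")
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")
    payload = {
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,  # assumed field name, see lead-in
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request("aq-m_your_key_here", "What is LoRA?", max_tokens=256)
body = json.loads(req.data)  # the JSON payload that would be sent
```

Sending it takes one more call, urllib.request.urlopen(req), followed by json.load on the response to read text, tokens_in, and tokens_out.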
error handling
The SDK raises typed exceptions so you can handle specific failures without inspecting raw HTTP status codes:
from aquin import AquinClient
from aquin import InvalidKeyError, InsufficientCreditsError, InferenceError
client = AquinClient("aq-m_your_key_here")
try:
    res = client.complete("Hello")
except InvalidKeyError:
    print("Key is invalid or has been revoked")
except InsufficientCreditsError:
    print("Top up your credits at aquin.app")
except InferenceError:
    print("Model inference failed — try again")
The full list of exceptions:
- AquinError: base class for all errors
- InvalidKeyError: key is wrong or revoked
- InsufficientCreditsError: owner account is out of credits
- RateLimitError: too many requests
- ModelNotFoundError: model hasn't finished training
- InferenceError: GPU server error
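Of these, RateLimitError and InferenceError are the ones where trying again can help, so a retry with backoff is a natural fit. A sketch under assumptions: the classes below are local stand-ins that mirror the names in the list so the example runs without the SDK, with_retries is a hypothetical helper, and the delays are illustrative rather than recommended values. In real code you would import the exceptions from aquin instead.

```python
import time


class AquinError(Exception):
    """Stand-in for the SDK's base class (assumption for this sketch)."""


class RateLimitError(AquinError):
    """Stand-in for the SDK's rate-limit error."""


class InferenceError(AquinError):
    """Stand-in for the SDK's inference error."""


def with_retries(call, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run call(), retrying transient errors with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except (RateLimitError, InferenceError):
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the error
            sleep(base_delay * 2 ** attempt)  # 0.5s, then 1s, then 2s, ...
```

With the real client this would look like with_retries(lambda: client.complete("Hello")). Errors such as InvalidKeyError are deliberately not retried, since repeating the call cannot fix them.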
billing
API calls are charged to the model owner's credit balance based on inference time — the same credit system used for training. One credit equals one minute of compute. A typical API call takes 1–5 seconds, so costs are fractions of a credit per call.
You can monitor exactly how many calls each key has made and how many credits have been spent from the API Keys tab in your dashboard.
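To make the arithmetic concrete: at one credit per minute, a call costs its inference time in seconds divided by 60. The sketch below assumes the cost is prorated linearly with no minimum charge or rounding, which the text above does not specify; credits_for is a hypothetical helper.

```python
SECONDS_PER_CREDIT = 60  # one credit equals one minute of compute


def credits_for(seconds):
    """Credits consumed by `seconds` of inference, prorated linearly."""
    return seconds / SECONDS_PER_CREDIT


for seconds in (1, 5):
    print(f"{seconds}s call = {credits_for(seconds):.4f} credits")

# How many 5-second calls fit in one credit
calls_per_credit = SECONDS_PER_CREDIT // 5
```

Under that assumption a typical call costs between about 0.017 and 0.083 credits, so one credit covers 12 five-second calls or 60 one-second calls.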
security
API keys are hashed with SHA-256 before storage. The raw key is never saved: not in our database, not in logs, not anywhere. If you lose it, generate a new one. You can revoke keys instantly from the dashboard.
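The store-only-the-hash pattern described here can be sketched with Python's standard library. This illustrates the general technique, not Aquin's actual server code: the function names, the key length, and the in-memory dict standing in for a database are all assumptions.

```python
import hashlib
import hmac
import secrets


def generate_key():
    """Create a random key with the aq-m_ prefix (length is arbitrary here)."""
    return "aq-m_" + secrets.token_hex(16)


def hash_key(raw_key):
    """Return the SHA-256 hex digest that is stored instead of the key."""
    return hashlib.sha256(raw_key.encode("utf-8")).hexdigest()


def verify_key(presented_key, stored_hash):
    """Hash the presented key and compare digests in constant time."""
    return hmac.compare_digest(hash_key(presented_key), stored_hash)


stored = {}                                   # stand-in for the key table
raw = generate_key()                          # shown to the user once
stored[hash_key(raw)] = {"revoked": False}    # only the digest is kept
```

Because only the digest is kept, a lost key cannot be recovered from storage, which is why the remedy is to generate a new one.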
The inference VM is never directly exposed. Every request routes through Aquin's API layer where authentication, billing, and rate limiting are enforced before the model is ever touched.
what's next
Streaming responses, per-token billing, and a RAG API key for querying your training datasets directly are all coming. The SDK will stay in sync — version 1.x of the SDK will always talk to the v1 API.
pip install aquin. npm install aquin. build something.
