BERTA Embeddings API

768-dimensional text embeddings for Russian and English via sergeyzh/BERTA. Vectors are L2-normalized, so the dot product of two embeddings is their cosine similarity.
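Because the returned vectors are unit-length, no division by norms is needed when comparing them. A minimal sketch with stand-in 2-d vectors (real responses are 768-dim and arrive already normalized):

```python
import numpy as np

# Stand-in vectors, normalized the same way the service normalizes its output.
a = np.array([3.0, 4.0]); a /= np.linalg.norm(a)
b = np.array([4.0, 3.0]); b /= np.linalg.norm(b)

# For unit vectors, the dot product equals full cosine similarity.
cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
assert abs(np.dot(a, b) - cos) < 1e-12
print(np.dot(a, b))  # ~0.96
```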

Endpoints

GET /health

POST /embed

Request — POST /embed

texts    required   str | str[]   One text or a list of up to 64 texts. Max 10 000 chars each.
prompt   optional   str           Prefix prepended to each text. See prompt prefixes below.
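Since texts accepts at most 64 items per call, larger corpora need client-side batching. A hedged sketch (the chunks and embed_all helpers are illustrative, not part of the API):

```python
import requests

API_URL = "https://rustemgareev-berta.hf.space/embed"

def chunks(items, size=64):
    """Yield successive slices of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def embed_all(texts, prompt=None, size=64):
    """Embed an arbitrarily long list by issuing requests of <= 64 texts each."""
    vectors = []
    for batch in chunks(texts, size):
        payload = {"texts": batch}
        if prompt is not None:
            payload["prompt"] = prompt
        r = requests.post(API_URL, json=payload, timeout=60)
        r.raise_for_status()
        vectors.extend(r.json()["embeddings"])
    return vectors
```

Order is preserved: vectors come back in the same order the texts were sent, batch by batch.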

Response

embeddings   float[][]   One 768-dim vector per input text.
model        str         Model identifier, e.g. "sergeyzh/BERTA".

Example

POST /embed
{
  "texts": ["Первый текст", "Второй текст"],
  "prompt": "paraphrase: "
}
{
  "embeddings": [[0.023, -0.105, ...], [0.041, -0.098, ...]],
  "model": "sergeyzh/BERTA"
}

Python

import requests, numpy as np

r = requests.post("https://rustemgareev-berta.hf.space/embed", json={
    "texts": ["Первый текст", "Второй текст"],
    "prompt": "paraphrase: ",
})
r.raise_for_status()
emb = r.json()["embeddings"]
print(np.dot(emb[0], emb[1]))  # cosine similarity

JavaScript

const { embeddings } = await fetch("https://rustemgareev-berta.hf.space/embed", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ texts: ["Первый текст", "Второй текст"], prompt: "paraphrase: " }),
}).then(r => r.json());

const dot = (a, b) => a.reduce((s, v, i) => s + v * b[i], 0);
console.log(dot(embeddings[0], embeddings[1]));

Prompt prefixes

"search_query: "      Search query
"search_document: "   Document to retrieve
"paraphrase: "        Paraphrase / deduplication
"categorize: "        Classification
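For asymmetric search, queries and documents take different prefixes, and since the server prepends one prompt to every text in a request, the query and the corpus go in separate calls. A sketch of the two payloads (payload construction only; field names match the request schema above, and the example texts are illustrative):

```python
import json

query = "как настроить эмбеддинги"  # "how to set up embeddings"
documents = ["BERTA выдаёт 768-мерные векторы.", "Сервис принимает до 64 текстов."]

# One call per role: the prompt applies to every text in its request.
query_payload = {"texts": [query], "prompt": "search_query: "}
docs_payload = {"texts": documents, "prompt": "search_document: "}

print(json.dumps(query_payload, ensure_ascii=False))
```

POST each payload to /embed, then rank documents by the dot product of their vectors with the query vector.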