BERTA Embeddings API

768-dimensional text embeddings for Russian and English via sergeyzh/BERTA. Vectors are L2-normalized, so the dot product of two embeddings is their cosine similarity.
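Because the returned vectors are unit-length, no division by norms is needed when comparing them. A minimal sketch with stand-in 2-d vectors (real responses are 768-dim and arrive already normalized):

```python
import numpy as np

# Stand-in vectors, normalized the same way the service normalizes its output.
a = np.array([3.0, 4.0]); a /= np.linalg.norm(a)
b = np.array([4.0, 3.0]); b /= np.linalg.norm(b)

# For unit vectors, the dot product equals full cosine similarity.
cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
assert abs(np.dot(a, b) - cos) < 1e-12
print(np.dot(a, b))  # ~0.96
```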

Endpoints

GET /health

POST /embed

Request — POST /embed

texts    required   str | str[]   One text or a list of up to 64 texts. Max 10 000 chars each.
prompt   optional   str           Prefix prepended to each text. See prompt prefixes below.
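Since texts accepts at most 64 items per call, larger corpora need client-side batching. A hedged sketch (the chunks and embed_all helpers are illustrative, not part of the API):

```python
import requests

API_URL = "https://rustemgareev-berta.hf.space/embed"

def chunks(items, size=64):
    """Yield successive slices of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def embed_all(texts, prompt=None, size=64):
    """Embed an arbitrarily long list by issuing requests of <= 64 texts each."""
    vectors = []
    for batch in chunks(texts, size):
        payload = {"texts": batch}
        if prompt is not None:
            payload["prompt"] = prompt
        r = requests.post(API_URL, json=payload, timeout=60)
        r.raise_for_status()
        vectors.extend(r.json()["embeddings"])
    return vectors
```

Order is preserved: vectors come back in the same order the texts were sent, batch by batch.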

Response

embeddings   float[][]   One 768-dim vector per input text.
model        str         Model identifier, e.g. "sergeyzh/BERTA".

Example

POST /embed
{
  "texts": ["Первый текст", "Второй текст"],
  "prompt": "paraphrase: "
}
{
  "embeddings": [[0.023, -0.105, ...], [0.041, -0.098, ...]],
  "model": "sergeyzh/BERTA"
}

Python

import requests, numpy as np

r = requests.post("https://rustemgareev-berta.hf.space/embed", json={
    "texts": ["Первый текст", "Второй текст"],
    "prompt": "paraphrase: ",
})
r.raise_for_status()
emb = r.json()["embeddings"]
print(np.dot(emb[0], emb[1]))  # cosine similarity

JavaScript

const { embeddings } = await fetch("https://rustemgareev-berta.hf.space/embed", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ texts: ["Первый текст", "Второй текст"], prompt: "paraphrase: " }),
}).then(r => r.json());

const dot = (a, b) => a.reduce((s, v, i) => s + v * b[i], 0);
console.log(dot(embeddings[0], embeddings[1]));

Prompt prefixes

"search_query: "      Search query
"search_document: "   Document to retrieve
"paraphrase: "        Paraphrase / deduplication
"categorize: "        Classification
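For asymmetric search, queries and documents take different prefixes, and since the server prepends one prompt to every text in a request, the query and the corpus go in separate calls. A sketch of the two payloads (payload construction only; field names match the request schema above, and the example texts are illustrative):

```python
import json

query = "как настроить эмбеддинги"  # "how to set up embeddings"
documents = ["BERTA выдаёт 768-мерные векторы.", "Сервис принимает до 64 текстов."]

# One call per role: the prompt applies to every text in its request.
query_payload = {"texts": [query], "prompt": "search_query: "}
docs_payload = {"texts": documents, "prompt": "search_document: "}

print(json.dumps(query_payload, ensure_ascii=False))
```

POST each payload to /embed, then rank documents by the dot product of their vectors with the query vector.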