Build a complete retrieval-augmented generation (RAG) pipeline using AIWave's embeddings and chat models, a cost-effective alternative to OpenAI embeddings.
Documents → Chunk → Embed → Vector DB
Query → Embed → Retrieve → LLM → Answer
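
The Chunk step in the diagram can be as simple as fixed-size character windows with a small overlap. A minimal sketch under that assumption; the name `chunk_text` and the default sizes are illustrative choices, not AIWave recommendations:

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping character windows (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk.strip():  # skip windows that are only whitespace
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the price of embedding a little more text.
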
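from openai import OpenAI

# Point the OpenAI client at AIWave's endpoint.
client = OpenAI(
    base_url="https://aiwave.live/v1",
    api_key="sk-YOUR_KEY",
)

def embed(text):
    """Return the embedding vector for a single string."""
    resp = client.embeddings.create(
        model="text-embedding-3-small",
        input=text,
    )
    return resp.data[0].embedding

def rag_query(question, vector_db):
    """Answer a question using the top matching chunks as context."""
    # Embed the query with the same model used for the documents.
    q_embed = embed(question)
    # vector_db.search is expected to return the chunk texts as strings.
    docs = vector_db.search(q_embed, top_k=3)
    context = "\n".join(docs)
    resp = client.chat.completions.create(
        model="deepseek-v4-pro",
        messages=[{
            "role": "user",
            "content": f"Context:\n{context}\n\nQuestion: {question}",
        }],
    )
    return resp.choices[0].message.content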
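
`rag_query` only assumes that `vector_db` offers a `search(q_embed, top_k)` method returning chunk texts. For local experiments, that interface could be satisfied by a small in-memory store using cosine similarity. `InMemoryVectorDB` below is a hypothetical stand-in written for illustration, not a Pinecone, Weaviate, or Chroma API:

```python
import math

class InMemoryVectorDB:
    """Toy vector store (illustrative assumption) with brute-force cosine search."""

    def __init__(self):
        self._items = []  # list of (vector, text) pairs

    def add(self, vector, text):
        """Store one embedding together with the chunk text it represents."""
        self._items.append((list(vector), text))

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        if norm_a == 0 or norm_b == 0:
            return 0.0  # treat zero vectors as matching nothing
        return dot / (norm_a * norm_b)

    def search(self, query_vector, top_k=3):
        """Return the texts of the top_k stored vectors most similar to the query."""
        scored = sorted(
            self._items,
            key=lambda item: self._cosine(query_vector, item[0]),
            reverse=True,
        )
        return [text for _, text in scored[:top_k]]
```

Indexing would then amount to calling `db.add(embed(chunk), chunk)` for each chunk, after which `rag_query(question, db)` works as written. A brute-force scan like this is only suitable for small collections; a dedicated vector DB is the better fit beyond that.
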
| Component | AIWave Model | Cost |
|---|---|---|
| Embeddings | text-embedding-3-small | Low |
| Chat/Generation | deepseek-v4-pro | $0.14/M input |
| Vector DB | Pinecone / Weaviate / Chroma | Free tier available |
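
To make the chat pricing concrete: input cost scales linearly with prompt tokens at the table's $0.14 per million rate. A rough sketch follows; the function name is hypothetical, the token counts are made-up example values, and output-token and embedding costs are left out because the table gives no figures for them:

```python
def chat_input_cost_usd(input_tokens, usd_per_million=0.14):
    """Estimate chat input cost only, using the per-million rate from the table."""
    return input_tokens / 1_000_000 * usd_per_million

# Example: 1,000 queries, each with about 2,000 prompt tokens (context + question).
total = chat_input_cost_usd(1_000 * 2_000)
print(f"${total:.2f}")  # prints $0.28
```
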