Trondheim-built open-source tensor-native search and vector database, spun out of Yahoo
Vespa.ai is a Trondheim-based open-source tensor-native search and vector database operated by VESPA.AI AS (org.nr. 931605569). The technology originated at FAST Search & Transfer in 2001, became a core piece of Yahoo's serving infrastructure, and was spun out as an independent Norwegian AS in October 2023. Released under Apache 2.0, Vespa combines dense vector search, BM25 keyword scoring, structured filtering, and ML-driven ranking in a single engine that serves Yahoo, LinkedIn, and Spotify-scale workloads at sub-100ms latencies. The company raised USD 31 million in Series A funding from Blossom Capital in November 2023, with Yahoo retaining a minority stake and board seat.
Headquarters: Trondheim, Norway
Founded: 2023
EU Data Hosting: Yes
Employees: 11-50
Open Source: Yes
Pricing: Free; Contact Sales (14-day free trial available; billing monthly or annual)
The technology now branded Vespa.ai started in 2001 as the serving engine inside FAST Search & Transfer, a Norwegian search company founded out of NTNU in Trondheim. Microsoft acquired FAST in 2008. Yahoo separately acquired the AllTheWeb assets and built the engine into Yahoo's production serving infrastructure, where it ran search and advertising workloads at internet scale for over fifteen years. Generations of Yahoo engineers extended and hardened the codebase before the project was open-sourced under Apache 2.0 in 2017. In October 2023, Yahoo spun the team out as an independent company.
The legal entity is VESPA.AI AS, Norwegian organisation number 931605569, registered in Trondheim. The Series A in November 2023 raised USD 31 million from Blossom Capital, advised by DLA Piper's Norway office — a structural detail consistent with the Norwegian AS being the receiving entity rather than a Delaware holding company. Yahoo retains a minority stake and a board seat, but the company is independently operated. CEO Jon Bratseth was the original architect of Vespa at Yahoo and now leads the spin-out.
What Vespa actually is matters more than the corporate history. It is a tensor-native search engine that combines dense vector search, BM25 keyword scoring, sparse vector retrieval, structured filtering, and learned ranking models in a single query operation. Pure vector databases like Weaviate and Qdrant focus on vector similarity as the primary operation; Vespa treats vector search as one input into a richer ranking computation that can include any combination of signals.
Vespa's defining technical decision is the tensor as the first-class data type. Documents are not just objects with vector fields — they are tensors that the engine can compute on directly. Queries combine tensor operations, BM25 scoring, structured filters, and learned ranking expressions into a single evaluation. The practical consequence is that hybrid retrieval, multi-vector models like ColBERT, and learned ranking can all be expressed natively without external orchestration.
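To make the multi-vector case concrete, here is a minimal sketch of a ColBERT-style late-interaction score (often called MaxSim) in plain Python: for each query token vector, take the best dot product against any document token vector, then sum those maxima. It illustrates the kind of tensor reduction described above and is not Vespa's ranking expression syntax; the function names and toy vectors are assumptions made for the example.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def max_sim(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    """Sum over query tokens of the max dot product against any doc token."""
    if not doc_tokens:
        return 0.0
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


# Toy 2-dimensional token embeddings (hypothetical values).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[0.9, 0.1], [0.2, 0.8]]  # two token vectors
doc_b = [[0.5, 0.5]]              # one token vector

score_a = max_sim(query, doc_a)
score_b = max_sim(query, doc_b)
```

In an engine that treats tensors as first-class values, this reduction would run next to the stored document tensors rather than in application code after retrieval.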
For a RAG application, this means a single Vespa query can fetch documents matching a dense embedding similarity, filter by structured metadata (date range, user permissions, language), boost by BM25 keyword overlap, and rerank with an ONNX or XGBoost model — all in one call. Doing the same in a pure vector database typically requires two or three external services stitched together.
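A rough sketch of what such a single call might look like is shown below as a Python function that builds the JSON body for Vespa's HTTP query API. The schema name `doc`, the fields `embedding`, `language`, and `timestamp`, and the rank profile `hybrid_rerank` are all hypothetical. The request parameters (`yql`, `query`, `input.query(q)`, `ranking.profile`, `hits`) and the `nearestNeighbor` / `userQuery()` operators follow the query API as commonly documented, but should be checked against the current Vespa reference before use.

```python
import json
from typing import Any, Dict, List


def build_hybrid_query(
    user_text: str,
    query_embedding: List[float],
    language: str,
    min_timestamp: int,
    hits: int = 10,
) -> Dict[str, Any]:
    """Build a JSON request body combining vector, keyword, and filter clauses."""
    # Escape backslashes and double quotes before embedding the value in YQL.
    safe_language = language.replace("\\", "\\\\").replace('"', '\\"')
    yql = (
        "select * from doc where "
        # Dense retrieval OR keyword match on the user's text.
        "({targetHits: 100}nearestNeighbor(embedding, q) or userQuery()) "
        # Structured metadata filters (hypothetical fields).
        f'and language contains "{safe_language}" '
        f"and timestamp >= {int(min_timestamp)}"
    )
    return {
        "yql": yql,
        "query": user_text,                  # consumed by userQuery()
        "input.query(q)": query_embedding,   # query tensor for nearestNeighbor
        "ranking.profile": "hybrid_rerank",  # hypothetical rank profile name
        "hits": hits,
    }


body = build_hybrid_query("vespa ranking", [0.1, 0.2, 0.3], "en", 1700000000)
payload = json.dumps(body)  # would be sent as a POST to the /search/ endpoint
```

Under these assumptions, the reranking step (ONNX or XGBoost) lives in the rank profile that the request names, so the client sends one request and receives the final ordering.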
The combination of BM25 and dense vector search is increasingly recognised as the right default for RAG retrieval. Pure semantic search misses exact-match cases (product SKUs, function names, unusual terminology); pure keyword search misses paraphrases. Vespa's hybrid model handles both natively and lets engineers tune the weighting per query type. The reciprocal rank fusion and weighted combination methods are documented patterns, not workarounds.
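The two fusion methods named above can be sketched in a few lines of plain Python. This is an illustration of the general techniques, not Vespa's implementation: the constant `k = 60` is a commonly used default for reciprocal rank fusion, the min-max normalisation in the weighted variant is one of several possible choices, and all document IDs and scores are made up.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> Dict[str, float]:
    """Score each document as the sum of 1 / (k + rank) over all rankings."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return dict(scores)


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, map them all to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_combination(
    bm25: Dict[str, float], dense: Dict[str, float], alpha: float = 0.5
) -> Dict[str, float]:
    """alpha * normalised dense score + (1 - alpha) * normalised BM25 score."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    bm25_n, dense_n = min_max(bm25), min_max(dense)
    doc_ids = set(bm25_n) | set(dense_n)
    # A document missing from one result list contributes 0 for that signal.
    return {
        d: alpha * dense_n.get(d, 0.0) + (1.0 - alpha) * bm25_n.get(d, 0.0)
        for d in doc_ids
    }


# Keyword search finds the exact SKU first; dense search prefers a paraphrase.
bm25_ranking = ["sku-123", "doc-a", "doc-b"]
dense_ranking = ["doc-a", "doc-c", "sku-123"]

fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
fused_order = sorted(fused, key=fused.get, reverse=True)
```

Rank fusion needs only positions, which makes it robust when BM25 and similarity scores live on different scales; the weighted form gives explicit control through `alpha`, which is the knob one would tune per query type.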
For European search teams replacing legacy Elasticsearch deployments with modern AI-augmented retrieval, Vespa offers a credible upgrade path: the engine handles the legacy keyword search alongside new dense retrieval in the same cluster.
Vespa runs ML ranking models — TensorFlow, ONNX, XGBoost — directly inside the serving path. A query can compute scores from a learned model using document features and query features as inputs, then return the top-K documents ranked by that model's output. This is the architecture used at LinkedIn and Spotify for personalised search and recommendations.
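As a toy illustration of that flow, the sketch below scores candidate documents with a small logistic model over per-document features and keeps the top K. The weights, feature names, and candidate values are hypothetical stand-ins for a trained TensorFlow, ONNX, or XGBoost model; in Vespa the equivalent computation would be declared in a rank profile and executed on the content nodes rather than in client code.

```python
import heapq
import math
from typing import Dict, List, Tuple

# Hypothetical weights standing in for a trained ranking model.
WEIGHTS = {"bm25": 0.8, "closeness": 1.5, "freshness": 0.4}
BIAS = -1.0


def model_score(features: Dict[str, float]) -> float:
    """Logistic model: sigmoid of a weighted sum of the named features."""
    z = BIAS + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def rank_top_k(candidates: Dict[str, Dict[str, float]], k: int) -> List[Tuple[str, float]]:
    """Score every candidate and return the k highest-scoring (id, score) pairs."""
    scored = ((doc_id, model_score(f)) for doc_id, f in candidates.items())
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


candidates = {
    "doc-1": {"bm25": 2.0, "closeness": 0.9, "freshness": 0.1},
    "doc-2": {"bm25": 0.5, "closeness": 0.3, "freshness": 0.9},
    "doc-3": {"bm25": 0.0, "closeness": 0.95, "freshness": 0.5},
}
top_two = rank_top_k(candidates, k=2)
```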
For teams whose ranking is genuinely ML-driven rather than rules-based, executing the model in the serving path eliminates a network hop and reduces latency meaningfully. The application package model lets teams version the ranking expression alongside the schema and deploy atomically.
The Yahoo deployment exercised Vespa at billions of documents and tens of thousands of queries per second for over a decade. The engine is engineered for horizontal scale: partition by document ID across content nodes, replicate for availability, scale stateless query nodes for throughput. Sub-100ms p99 latency at very large document collections is documented in case studies from Wayfair, Spotify, and Vinted.
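The partition, replicate, and fan-out pattern can be sketched as follows. This is a deliberately simplified model, assuming hash-based placement and a naive merge; Vespa's actual bucket distribution and dispatch logic are more sophisticated, and every name below is hypothetical.

```python
import hashlib
import heapq
from typing import Dict, List, Tuple


def nodes_for_document(doc_id: str, num_nodes: int, redundancy: int = 2) -> List[int]:
    """Pick `redundancy` distinct content nodes for a document by hashing its ID."""
    if not 1 <= redundancy <= num_nodes:
        raise ValueError("redundancy must be between 1 and num_nodes")
    digest = hashlib.sha256(doc_id.encode("utf-8")).digest()
    primary = int.from_bytes(digest[:8], "big") % num_nodes
    # Replicas go to the following nodes, wrapping around the node list.
    return [(primary + i) % num_nodes for i in range(redundancy)]


def scatter_gather(per_node_hits: List[List[Tuple[str, float]]], k: int) -> List[Tuple[str, float]]:
    """Merge each node's local hits into a global top-k, keeping one copy per document."""
    best: Dict[str, float] = {}
    for hits in per_node_hits:
        for doc_id, score in hits:
            if doc_id not in best or score > best[doc_id]:
                best[doc_id] = score
    return heapq.nlargest(k, best.items(), key=lambda pair: pair[1])


placement = nodes_for_document("doc-42", num_nodes=4, redundancy=2)

# Local results as they might come back from three content nodes.
node_results = [
    [("doc-1", 0.91), ("doc-7", 0.40)],
    [("doc-3", 0.77), ("doc-1", 0.91)],  # replica of doc-1 on a second node
    [("doc-9", 0.85)],
]
global_top = scatter_gather(node_results, k=3)
```

Because each content node ranks only its own partition, adding nodes shrinks the per-node work, while the stateless query layer only has to merge short result lists.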
For teams building search infrastructure that has to handle European traffic loads on EU infrastructure, this proven scale is meaningful — it is one of the few open-source search engines with production references at this magnitude.
Vespa schemas, ranking expressions, query profiles, and component configurations are declared in an application package that the cluster deploys atomically. This is closer to a Kubernetes manifest model than to a typical database admin UI. The trade-off is real: it takes time to learn, and it does not give you a drag-and-drop schema editor. The upside is that everything is version controlled, reviewable, and reproducible across environments.
The open-source build under Apache 2.0 is free at any scale. Teams running their own infrastructure can deploy Vespa on Kubernetes, EC2, or bare metal with no licensing cost. For organisations that already operate at search-engine scale, this is the dominant economic choice.
Vespa Cloud is the managed offering. Pricing is custom and based on the compute, memory, and storage provisioned for the customer's cluster. There is no published per-vector or per-query pricing table — the model is closer to a managed Kubernetes service than to a serverless database. A free trial environment is available without contract negotiation.
For teams that want predictable per-vector pricing at small scale, Pinecone or Weaviate Cloud are easier to model financially. For teams running at scale, where cost is driven by the underlying compute rather than by packaged pricing tiers, Vespa Cloud's resource-based model can be more economical and is easier to audit against actual usage.
VESPA.AI AS is registered in Trondheim, Norway, which places the company under EEA jurisdiction. GDPR applies in Norway through the EEA Agreement, and a Norwegian company with no US parent is not directly subject to US disclosure regimes such as the CLOUD Act. The independence from a US parent company is structural rather than contractual.
Vespa Cloud offers EU data residency and holds ISO 27001 certification. A Data Processing Agreement is available for managed customers. Self-hosted deployments under Apache 2.0 give complete control over data location — common deployment targets include OVHcloud, Hetzner, on-prem Kubernetes, and EU regions of the hyperscalers.
For European search and AI infrastructure teams whose compliance requirements include both technical capability and corporate domicile, the combination of Apache 2.0 licensing, Norwegian AS structure, and EU-native deployment options is unusually well aligned.
If you are replacing a legacy Elasticsearch deployment with modern hybrid retrieval, Vespa's combination of BM25, dense vectors, and learned ranking in a single engine is one of the cleanest upgrade paths available.
If you are building search infrastructure at billions-of-documents scale and need sub-100ms serving latency, Vespa's production track record at Yahoo, LinkedIn, and Spotify scale is unmatched among open-source options.
If you are building a straightforward RAG pipeline with a few hundred thousand vectors and want the simplest possible setup, Weaviate or Qdrant have better ergonomics for that use case. Vespa is technically capable but operationally heavier than the situation requires.
If your ranking is genuinely model-driven rather than rules-based, running TensorFlow or ONNX inference inside Vespa's serving path eliminates orchestration complexity that is otherwise a recurring source of latency and bugs.
Vespa is the most capable open-source search engine in the European software directory and arguably one of the most capable anywhere. The tensor-native architecture, hybrid retrieval, learned ranking, and production scale at Yahoo, LinkedIn, and Spotify are all genuine technical advantages over both pure vector databases and traditional search engines.
The trade-off is complexity. Vespa is not a serverless vector store you stand up in an afternoon. The application package model, ranking expression language, and operational footprint require real engineering investment to use well. For teams whose search problem justifies that investment, the return is substantial. For teams whose problem fits a simpler tool, the simpler tool is the right answer.
The Norwegian AS structure, Apache 2.0 licensing, and EU data residency options make Vespa one of the strongest European-headquartered infrastructure components available for AI and search workloads.
Yes. Vespa is released under the Apache 2.0 licence with the full engine, admin tooling, SDKs, and ranking framework on GitHub at github.com/vespa-engine/vespa. The open-source build is the same engine that powers Vespa Cloud and the historical Yahoo production deployment. There is no proprietary feature gating.
VESPA.AI AS is a Norwegian AS (org.nr. 931605569) registered in Trondheim. The company spun out of Yahoo in October 2023 and raised USD 31 million in Series A funding from Blossom Capital in November 2023. Yahoo retains a minority stake and board seat; the operating entity is Norwegian. The Series A was advised by DLA Piper Norway, consistent with a Norwegian receiving entity.
Weaviate and Qdrant are pure vector databases with strong RAG ergonomics. Vespa is a full tensor-native search engine that includes vector search as one capability — it also handles BM25, multi-vector retrieval, structured filters, and learned ranking in a single query. Vespa has more capability and a steeper learning curve. For pure vector search RAG, Weaviate or Qdrant are simpler; for search and ranking systems at scale, Vespa is more capable.
Yes. VESPA.AI AS is a Norwegian AS, and GDPR applies in Norway through the EEA Agreement. Vespa Cloud offers EU data residency and holds ISO 27001 certification. Self-hosted deployments give complete control over data location. A Data Processing Agreement is available for Vespa Cloud customers.
Yes. Vespa was Yahoo's production search engine before the spinout and is currently used at LinkedIn, Spotify, Wayfair, and Vinted scale. The hybrid retrieval model — combining dense vector search, sparse vectors, BM25, structured filters, and learned ranking in a single query — is particularly well suited to RAG pipelines where pure semantic search returns insufficient precision.