European open-source feature store and ML platform for the data-for-AI lifecycle
Hopsworks is a Stockholm-based ML platform and open-source feature store founded in 2016 as a spinout of KTH Royal Institute of Technology and RISE SICS AB by professor Jim Dowling. It provides a unified platform for the data-for-AI lifecycle — real-time and batch feature engineering, online feature serving via RonDB, model registry, model serving, and MLOps pipelines. Available as open-source community, managed serverless, and enterprise editions.
Headquarters
Stockholm, Sweden
Founded
2016
Pricing
Free / Pay-as-you-go / Contact Sales
EU Data Hosting
Yes
Employees
51-200
Open Source
Yes
The infrastructure layer for production machine learning has been built almost entirely by US companies. Databricks (San Francisco), Tecton (San Francisco), AWS SageMaker (Seattle), Google Vertex AI (Mountain View) — the organisations that set the standards for how enterprises manage ML features, train models, and serve predictions are, with a few exceptions, American. European ML teams frequently build on American infrastructure, accept US data residency by default, and navigate GDPR compliance as an afterthought.
Hopsworks is the significant exception. Founded in 2016 as a spinout of KTH Royal Institute of Technology and RISE SICS AB — Sweden's leading research institutes — by Professor Jim Dowling, the platform predates AWS SageMaker Feature Store (launched 2020) and can reasonably claim to have invented the open-source feature store concept. The company is headquartered in Stockholm (operating as Logical Clocks AB), subject to Swedish and EU law, and offers EU-hosted managed deployments on AWS Europe, Azure Europe, and GCP Europe regions.
The ML feature store problem Hopsworks solves is fundamental: without a dedicated platform, ML teams duplicate feature computation logic between training pipelines and serving code, creating "training-serving skew" where models behave differently in production than during training. Hopsworks unifies feature engineering, storage, versioning, and serving — so the same feature definitions used to train models are the same ones served at inference time.
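The skew problem can be made concrete with a minimal sketch in plain pandas (column names and the fraud-style features are hypothetical, not Hopsworks' API): a single feature function, plus the training-time statistics, is reused by both paths, so there is no second hand-ported copy of the logic to drift out of sync.

```python
import pandas as pd

def txn_features(df: pd.DataFrame, amount_mean: float, amount_std: float) -> pd.DataFrame:
    """One feature definition shared by training AND serving.

    Illustrative columns only; the point is that the logic lives in
    exactly one place.
    """
    out = pd.DataFrame({"account_id": df["account_id"]})
    out["amount_zscore"] = (df["amount"] - amount_mean) / amount_std
    out["is_foreign"] = (df["country"] != "SE").astype(int)
    return out

# Training path: batch-compute features over historical transactions.
history = pd.DataFrame({
    "account_id": [1, 2, 3],
    "amount": [50.0, 60.0, 4000.0],
    "country": ["SE", "SE", "US"],
})
mean, std = history["amount"].mean(), history["amount"].std()
train_X = txn_features(history, mean, std)

# Serving path: the SAME function (and the training-time statistics)
# run on the incoming transaction, so training and serving cannot
# silently compute different feature values.
incoming = pd.DataFrame({"account_id": [4], "amount": [55.0], "country": ["DE"]})
serve_X = txn_features(incoming, mean, std)
```

A feature store institutionalises this pattern: the definition is registered once, and both the training dataset and the online lookup are derived from it.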
Hopsworks provides both an offline feature store (for batch training pipelines) and an online feature store (for real-time inference). This distinction matters in production ML systems: offline features are computed in batch — typically using Spark or Flink on historical data — and stored for model training. Online features need to be served at low latency during inference — milliseconds, not seconds — because they are called at prediction time.
Most feature store implementations have weaker online serving than offline storage. Hopsworks addresses this with RonDB, a distributed in-memory key-value store derived from NDB Cluster technology, which backs the online feature store and delivers sub-millisecond feature lookups. This matters for ML applications like fraud detection, recommendation systems, and personalisation, where prediction latency is a product constraint.
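The access pattern RonDB serves is a primary-key read on the critical path of every prediction. A toy in-process stand-in (a dict; entity keys and feature names are hypothetical, and a dict says nothing about distributed performance) shows the shape of that read:

```python
import time

# Toy stand-in for the online feature store: features keyed by entity.
# In Hopsworks this lookup is backed by RonDB, distributed and in-memory;
# the dict only illustrates the access pattern, not the performance class.
online_store = {
    ("card", 42): {"txn_count_1h": 7, "declined_24h": 1},
    ("card", 43): {"txn_count_1h": 0, "declined_24h": 0},
}

t0 = time.perf_counter()
features = online_store[("card", 42)]   # single primary-key read at inference
elapsed_ms = (time.perf_counter() - t0) * 1000.0

# These features feed the model immediately, so this read sits inside
# the per-prediction latency budget -- hence the sub-millisecond target.
```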
Beyond the feature store, Hopsworks includes a model registry for versioning and managing trained ML models, and model serving infrastructure for deploying and serving predictions. The model registry tracks lineage from feature group versions through training data to the deployed model, which is critical for auditing and reproducing ML system behaviour — a growing regulatory requirement in EU contexts under the AI Act.
More recent versions of Hopsworks integrate vector similarity search, supporting embedding-based ML applications such as semantic search, RAG (retrieval-augmented generation), and recommendation systems. Vector search runs alongside the feature store, meaning embedding features and structured features can be served together in a unified low-latency lookup.
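The payoff of co-locating embeddings with structured features is that one lookup can answer both "what is similar?" and "what are its attributes?". A self-contained NumPy sketch (item ids, prices, and 3-dimensional embeddings are all illustrative) shows the idea:

```python
import numpy as np

# Toy unified row: a structured feature plus an embedding, keyed by item id.
store = {
    101: {"price": 9.99, "embedding": np.array([0.1, 0.9, 0.0])},
    102: {"price": 4.50, "embedding": np.array([0.8, 0.1, 0.1])},
}

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def similar_items(query_vec: np.ndarray, k: int = 1) -> list:
    # Rank items by embedding similarity, then return structured
    # features from the very same rows -- no second store to query.
    scored = sorted(store.items(),
                    key=lambda kv: cosine(query_vec, kv[1]["embedding"]),
                    reverse=True)
    return [(item_id, row["price"]) for item_id, row in scored[:k]]

result = similar_items(np.array([0.0, 1.0, 0.0]))
```

In production the brute-force scan would be replaced by an approximate nearest-neighbour index, but the unified read pattern is the same.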
The community edition is fully open source under AGPLv3 and available on GitHub. Unlike many "open core" platforms that withhold commercially useful features to drive enterprise sales, Hopsworks' community edition includes the full feature store, model registry, and pipeline infrastructure. Teams can self-host on any cloud or on-premise infrastructure, inspect the code, and contribute to development.
The primary interaction model for Hopsworks is the Python HSFS (Hopsworks Feature Store) client library. ML engineers define feature groups, feature views, and feature pipelines in Python, and the platform handles storage, versioning, and serving underneath. The HSFS client integrates with Pandas, Polars, and Spark DataFrames, which means minimal changes to existing Python ML workflows when adopting Hopsworks.
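The workflow above can be sketched end to end. The pandas part runs anywhere; the client calls need a live Hopsworks project, so they are left as comments. Their names follow the public `hopsworks`/HSFS Python client, but the exact signatures should be checked against the current documentation, and the schema here is hypothetical.

```python
import pandas as pd

# Features computed in plain pandas -- the integration point HSFS targets.
df = pd.DataFrame({
    "account_id": [1, 2],
    "txn_count_7d": [14, 3],
    "avg_amount_7d": [52.0, 810.5],
})

# Requires a running Hopsworks cluster, hence shown as comments:
#
#   import hopsworks
#   project = hopsworks.login()                     # authenticate
#   fs = project.get_feature_store()
#   fg = fs.get_or_create_feature_group(
#       name="transactions_7d", version=1,
#       primary_key=["account_id"], online_enabled=True)
#   fg.insert(df)                                   # offline + online stores
#   fv = fs.get_or_create_feature_view(
#       name="fraud_detection", version=1, query=fg.select_all())
#   row = fv.get_feature_vector({"account_id": 1})  # low-latency online read
```

Because the write path takes an ordinary DataFrame, an existing pandas or Spark pipeline typically only gains the registration and insert calls.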
Hopsworks operates a three-tier model. The community edition is free: open source, self-hosted, full-featured, community-supported via GitHub and Slack. The serverless managed tier starts free (with usage limits) and scales to usage-based pricing beyond the free tier limits — designed for individual ML engineers and small teams who want managed infrastructure without an enterprise sales process.
Enterprise pricing is custom and includes dedicated managed clusters, private VPC networking, SAML/SSO, RBAC, dedicated customer success, and custom data residency arrangements. Enterprise is the tier for organisations with production ML systems, data governance requirements, and SLA needs.
For teams evaluating whether to self-host the community edition versus use the managed serverless tier, the calculation is typically: self-hosting costs engineering time for infrastructure management; the serverless tier costs money to buy that time back. Most teams starting with Hopsworks begin on serverless and migrate to enterprise when production scale and governance requirements mature.
Hopsworks' EU positioning is among the strongest in the ML platform space, and it is structural rather than cosmetic.
The company is headquartered in Stockholm, Sweden, an EU member state with one of the world's strongest data protection regimes (Datainspektionen, now the Swedish Authority for Privacy Protection, or IMY). Logical Clocks AB is a Swedish entity subject to Swedish law and EU GDPR. The managed cloud platform offers EU-region deployments on AWS Europe (Frankfurt, Ireland), Azure Europe (Netherlands, Ireland), and GCP Europe (Belgium, Frankfurt), meaning data can remain within EU borders throughout the ML lifecycle.
The managed cloud platform is SOC 2 Type II certified. The open source community edition allows organisations to self-host entirely within their own infrastructure — on-premise, in a private cloud, or in a specific EU data centre — with zero data leaving the organisation's control.
For European enterprises navigating GDPR compliance in ML systems — where training data and model outputs can contain personal data — Hopsworks' EU headquarters, EU hosting options, data lineage capabilities, and feature group versioning provide compliance tooling that US-headquartered ML platforms cannot offer with equivalent simplicity.
European enterprises building production ML systems on sensitive data (financial services, healthcare, telecommunications) where GDPR data residency requirements apply to training data and model outputs. Hopsworks' EU headquarters and EU hosting options provide structural GDPR alignment.
ML engineering teams replacing ad hoc feature pipelines who are experiencing training-serving skew, duplicated feature computation, or loss of model lineage. These are the canonical feature store problems Hopsworks was built to solve.
Research organisations and academia where the KTH/RISE spinout origin resonates, the open-source AGPLv3 licence is preferable to commercial tools, and the academic publication record (Professor Dowling's MLOps research) provides confidence in the platform's design decisions.
Teams evaluating Databricks or AWS SageMaker Feature Store who want a cloud-agnostic, EU-headquartered alternative with an open-source option.
If the priority is a cloud-agnostic, EU-headquartered feature store with open-source transparency, choose Hopsworks. If the priority is a low-code, notebook-first experience for early-stage data science teams, choose Deepnote or stay in plain Jupyter instead. If production ML is centred entirely on AWS and AWS-only dependencies are acceptable, SageMaker Feature Store will integrate more cleanly.
Hopsworks occupies a distinctive position: the original open-source feature store, founded at one of Europe's leading technical universities, with production-grade real-time serving, model registry, and vector search capabilities — all from a Stockholm-headquartered team subject to EU law. For European ML engineering teams building production systems, the combination of genuine technical depth, EU data sovereignty, and open-source transparency is not replicated elsewhere. The learning curve is real and the community is smaller than Databricks', but for teams who prioritise EU alignment and open-source auditability in their ML infrastructure, Hopsworks is the serious choice.
A feature store manages the computed attributes (features) that ML models use for training and predictions. Teams typically need one when they have more than one or two production ML models, when multiple teams need to share feature definitions, or when they are experiencing training-serving skew (models behaving differently in production than in training). Without a feature store, feature computation logic gets duplicated across training and serving pipelines with no guarantee of consistency.
The open-source community edition and free serverless tier are specifically designed for individual ML engineers and small teams. A data scientist can start with Hopsworks on the serverless tier with no infrastructure management and no upfront cost. Enterprise contracts are for organisations with production SLAs, private networking requirements, and dedicated support needs. The licensing model does not artificially restrict features to push small teams toward enterprise.
Hopsworks predates SageMaker Feature Store by several years and offers a more mature feature engineering workflow, including native Spark and Flink pipeline support, deeper online serving capabilities via RonDB, and a cloud-agnostic architecture (vs. SageMaker's AWS-only dependency). For European teams, Hopsworks' Swedish headquarters and EU-region managed deployments provide data sovereignty that AWS — as a US company under CLOUD Act jurisdiction — cannot structurally replicate.
Hopsworks was founded by Professor Jim Dowling, who holds a position at KTH Royal Institute of Technology in Stockholm. The company spun out of KTH and RISE SICS AB (Research Institutes of Sweden) in 2016. Dowling has published extensively on distributed ML systems, MLOps, and feature stores. This academic origin gives Hopsworks an unusually rigorous theoretical foundation for its design decisions, and active KTH and RISE research connections continue to influence the platform's development.
Hopsworks' Swedish headquarters makes it subject to EU GDPR directly. Managed deployments in EU cloud regions ensure data does not leave EU borders. Feature group versioning and data lineage tracking provide the audit trails needed to respond to GDPR data subject requests (right to erasure, right of access) in ML systems — a capability that generic ML platforms rarely address explicitly. Enterprise customers can additionally configure private VPC networking and custom data residency arrangements for specific compliance requirements.