DocumentDB goes cloud-native: Introducing the DocumentDB Kubernetes Operator

WRITTEN BY

/en-us/opensource/blog/author/abhishek-gupta

Today, we're excited to announce the DocumentDB Kubernetes Operator, an open-source, cloud-native solution to deploy, manage, and scale DocumentDB instances on Kubernetes. DocumentDB is a MongoDB-compatible, open-source document database built on PostgreSQL. The DocumentDB Kubernetes Operator is a natural evolution of the DocumentDB ecosystem, following our open-source announcement and the project's recent move to the Linux Foundation.

When it comes to distributed databases, there is no one-size-fits-all solution. Database-as-a-Service (DBaaS) options may not always meet customers' data sovereignty or portability needs. On the other hand, managing database clusters manually is complex and resource intensive.

What’s needed is a balanced approach: one that automates routine tasks like updates and backups, while simplifying operations such as scaling, failover, and recovery. This is precisely where Kubernetes excels—bridging automation with operational simplicity.

However, unlike stateless applications, which can be easily scaled and replaced, stateful workloads have always posed unique challenges on Kubernetes. The DocumentDB Kubernetes Operator addresses these by using the operator pattern to extend Kubernetes, making it possible to manage DocumentDB clusters as native Kubernetes resources.

This approach creates a clear separation of responsibilities:

  • The database platform team can focus solely on system health.
  • App developers enjoy a DBaaS-like experience, without the need to build custom automation between container orchestration and database operations.
  • The operator handles the complexity of PostgreSQL cluster orchestration, MongoDB protocol translation, and other critical operations.
  • Application development teams can integrate services using MongoDB-compatible drivers and tools, thereby simplifying the process of migrating existing workloads to DocumentDB, or building new cloud-native applications.

DocumentDB operator architecture overview

To understand how this works, let’s look under the hood at the key components and architecture that make this Kubernetes integration possible.

A DocumentDB cluster deployed on Kubernetes consists of multiple DocumentDB instances that are orchestrated by the operator. A DocumentDB instance consists of the following core components that run inside a Kubernetes Pod:

  • PostgreSQL with DocumentDB Extension: This is the core database engine, enhanced with document storage and querying capabilities. It is deployed in customer application namespaces on Kubernetes worker nodes.
  • Gateway Container: A protocol translator that runs as a sidecar container, converting MongoDB wire protocol requests into PostgreSQL DocumentDB extension calls.

By default, the DocumentDB instance is accessible within the cluster. If configured, the operator creates a Kubernetes Service for external client applications to connect to the DocumentDB cluster (via the Gateway) using any MongoDB-compatible client or tooling.

CloudNative-PG operator for PostgreSQL orchestration

The DocumentDB operator uses the CloudNative-PG (CNPG) operator for PostgreSQL cluster management. CNPG is a Cloud Native Computing Foundation (CNCF) Sandbox project that provides an open-source Kubernetes operator for managing PostgreSQL workloads. The CNPG operator runs in the cnpg-system namespace on Kubernetes worker nodes. Behind the scenes, the DocumentDB operator creates the required CNPG resources to manage the lifecycle of PostgreSQL instances with the DocumentDB extension.

Figure 1: High-level overview of DocumentDB cluster deployment on Kubernetes

The operator also includes a CNPG Sidecar Injector component, which is an admission webhook that automatically injects the DocumentDB Gateway container into PostgreSQL pods during deployment. Thanks to the extensibility of CNPG, the DocumentDB gateway container is implemented as a CloudNativePG Interface (CNPG-I) plugin.

DocumentDB is addressing a real need as an open-source, document-oriented NoSQL database built on PostgreSQL. By offering MongoDB API compatibility without vendor lock-in, it tackles a long-standing challenge for developers. We are thrilled to see the DocumentDB Kubernetes Operator joining the Linux Foundation, and proud that under the hood, it's powered by CloudNativePG, a CNCF Sandbox project. The future of PostgreSQL on Kubernetes just got even brighter!

Gabriele Bartolini, Vice President, EDB

Getting started with DocumentDB Kubernetes Operator

Ready to try it out? Getting started with the operator is straightforward. You can use a local Kubernetes cluster such as minikube or kind and use Helm for installation.

First, execute the commands below to install cert-manager to manage TLS certificates for the DocumentDB cluster:

helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager --namespace cert-manager --create-namespace --set installCRDs=true

Next, install the DocumentDB operator using the Helm chart:

helm install documentdb-operator oci://ghcr.io/microsoft/documentdb-operator --namespace documentdb-operator --create-namespace

This will install the latest version of the operator. To install a specific version, pass the --version flag.
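
As a rough sketch of version pinning, the helper below wraps the install command shown above and appends Helm's `--version` flag only when a chart version is passed. The function name `install_documentdb_operator` is hypothetical and not part of the operator; the chart reference and namespace are the ones used in this post.

```shell
# Hypothetical helper (not shipped with the operator).
# Installs the chart, pinned to a version when one is given as $1.
install_documentdb_operator() {
  chart_version="${1:-}"   # empty means "install the latest chart"

  # Base arguments, identical to the install command in this post.
  set -- install documentdb-operator oci://ghcr.io/microsoft/documentdb-operator \
    --namespace documentdb-operator --create-namespace

  # Append --version only when the caller asked for a specific release.
  if [ -n "$chart_version" ]; then
    set -- "$@" --version "$chart_version"
  fi

  helm "$@"
}
```

Calling `install_documentdb_operator` with no argument behaves like the plain command above; passing a chart version as the first argument pins the install to that release.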

Wait for the operator to start. Run this command to verify its status:

kubectl get pods -n documentdb-operator

You should see an output like this:

NAME                                   READY   STATUS    RESTARTS   AGE
documentdb-operator-65d6b97878-ns5wk   1/1     Running   0          1m

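If you would rather block until the pods are ready than re-run `kubectl get pods`, one option is `kubectl wait`. The sketch below wraps it in a small helper; the function name `wait_for_pods_ready` and the 180-second default timeout are assumptions chosen for illustration. Keep in mind that `kubectl wait` reports an error if the namespace has no pods yet, so it works best a few seconds after the install.

```shell
# Hypothetical helper: block until every pod in a namespace reports Ready.
#   $1 = namespace, $2 = timeout (optional, assumed default of 180s)
wait_for_pods_ready() {
  namespace="$1"
  timeout="${2:-180s}"

  kubectl wait --for=condition=Ready pods --all \
    --namespace "$namespace" --timeout="$timeout"
}
```

For the operator installed above, that would be `wait_for_pods_ready documentdb-operator`.
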
Now, create a Kubernetes Secret to store the DocumentDB credentials. This should have your desired administrator username and password (make sure to note them down):

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: documentdb-preview-ns
---
apiVersion: v1
kind: Secret
metadata:
  name: documentdb-credentials
  namespace: documentdb-preview-ns
type: Opaque
stringData:
  username: k8s_secret_user
  password: DemoPwd
EOF

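The manifest above hard-codes a demo password. As an alternative sketch, `kubectl create secret generic` can build an equivalent Secret from values supplied at the command line, which keeps the password out of manifest files. The helper name `create_documentdb_secret` is hypothetical, and it assumes the `documentdb-preview-ns` namespace already exists (for example, created with `kubectl create namespace documentdb-preview-ns`).

```shell
# Hypothetical helper: create the credentials Secret without a manifest.
#   $1 = administrator username, $2 = administrator password
# Assumes the documentdb-preview-ns namespace has already been created.
create_documentdb_secret() {
  username="$1"
  password="$2"

  # Same Secret name, namespace, and keys as the manifest in this post.
  kubectl create secret generic documentdb-credentials \
    --namespace documentdb-preview-ns \
    --from-literal=username="$username" \
    --from-literal=password="$password"
}
```

A call such as `create_documentdb_secret k8s_secret_user "$DOCUMENTDB_PASSWORD"` would read the password from an environment variable; `DOCUMENTDB_PASSWORD` is an illustrative name, not something the operator defines.
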
With the credentials in place, create a single-node DocumentDB cluster:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: documentdb-preview-ns
---
apiVersion: db.microsoft.com/preview
kind: DocumentDB
metadata:
  name: documentdb-preview
  namespace: documentdb-preview-ns
spec:
  nodeCount: 1
  instancesPerNode: 1
  documentDbCredentialSecret: documentdb-credentials
  resource:
    storage:
      pvcSize: 10Gi
  exposeViaService:
    serviceType: ClusterIP
EOF

Wait for the DocumentDB cluster to initialize fully, then run this command to verify that it is running:

kubectl get pods -n documentdb-preview-ns

You should see an output like this:

NAME                   READY   STATUS    RESTARTS   AGE
documentdb-preview-1   2/2     Running   0          1m

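The `2/2` in the READY column most likely corresponds to the two components described earlier: the PostgreSQL container and the Gateway sidecar. To see which containers the pod actually runs, you can query its spec with a JSONPath expression. The helper name `list_pod_containers` is hypothetical, and the container names it prints depend on the operator version. If the Gateway is injected as a native sidecar, it may be listed under `.spec.initContainers` rather than `.spec.containers`.

```shell
# Hypothetical helper: print the names of the containers in a pod.
#   $1 = pod name, $2 = namespace
list_pod_containers() {
  pod="$1"
  namespace="$2"

  # JSONPath selects the name of every entry in the pod's container list.
  kubectl get pod "$pod" --namespace "$namespace" \
    -o jsonpath='{.spec.containers[*].name}'
}
```

For the pod shown above, the call would be `list_pod_containers documentdb-preview-1 documentdb-preview-ns`.
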
Once the cluster is running, you can connect to the DocumentDB instance directly through the Gateway port 10260. For both minikube and kind, this can be easily done using port forwarding:

kubectl port-forward pod/documentdb-preview-1 10260:10260 -n documentdb-preview-ns

With port forwarding active, you can now connect using any MongoDB client or tool. For example, from a different terminal, try connecting with mongosh (MongoDB shell):

mongosh 127.0.0.1:10260 -u k8s_secret_user -p DemoPwd --authenticationMechanism SCRAM-SHA-256 --tls --tlsAllowInvalidCertificates
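
Drivers and other tools usually take a connection string rather than individual flags. As a minimal sketch, the helper below assembles a MongoDB URI that mirrors the mongosh flags above, using the standard `authMechanism`, `tls`, and `tlsAllowInvalidCertificates` URI options. The function name `documentdb_uri` is hypothetical, and the code assumes the username and password contain no characters that need percent-encoding.

```shell
# Hypothetical helper: print a MongoDB connection string for the Gateway.
#   $1 = username, $2 = password
#   $3 = host (optional, default 127.0.0.1), $4 = port (optional, default 10260)
# Assumes credentials need no percent-encoding.
documentdb_uri() {
  user="$1"
  password="$2"
  host="${3:-127.0.0.1}"
  port="${4:-10260}"

  # Options mirror the mongosh flags: SCRAM-SHA-256 auth, TLS enabled,
  # and invalid (for example self-signed) certificates allowed.
  printf 'mongodb://%s:%s@%s:%s/?authMechanism=SCRAM-SHA-256&tls=true&tlsAllowInvalidCertificates=true\n' \
    "$user" "$password" "$host" "$port"
}
```

With port forwarding still active, `mongosh "$(documentdb_uri k8s_secret_user DemoPwd)"` should open the same session as the command above. Allowing invalid certificates is reasonable for a local demo, but not something to carry into production.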

Join us in our mission to advance the open-source document database ecosystem

The DocumentDB Kubernetes Operator represents an important milestone in our broader mission and our commitment to vendor-neutral, community-driven development that puts developer needs first.

We invite you to join the community and help shape the future of cloud-native document databases.

Get started by exploring the GitHub repository and documentation, or join the discussion on our Discord community. As the project continues to evolve under Linux Foundation governance, you can expect to see contributions that expand functionality and integrate with other Kubernetes and CNCF projects.
