DevOps/GitOps v1.0.0
| CNCF Landscape
Overview
DevOps is about automating the lifecycle of an application. GitOps extends that with a disciplined method: Git as the single Source of Truth (SoT) across all layers of all components, from infra to services, with the goal of repeatable, verifiable deployment states.
GitOps is an operational framework that takes DevOps best practices used for application development, such as version control, collaboration, and compliance, and applies them to infrastructure automation. GitOps comprises Infrastructure as Code (IaC), configuration management (CM) through Git, Platform Engineering, and Continuous Integration and Continuous Delivery (CI/CD).
Why
Q:
How many possible configurations are there for 3 hosts, each having 6 services, each having 6 parameters, each having only two possible settings?
This scenario is an artificially simple infrastructure, chosen to steelman the argument against DevOps/GitOps/IaC. So, let's see what we may see …
A:
Parameters per service: Each service has 6 parameters, and each parameter has 2 settings. So, the number of configurations for one service is:
2^6 = 64
Services per host: Each host has 6 services, so the number of configurations for one host is:
64^6 = (2^6)^6 = 2^36 = 68,719,476,736
Total hosts: There are 3 hosts, so the total number of configurations is:
(68,719,476,736)^3 = (2^36)^3 = 2^108
So, the number of possible configurations is:
324,518,553,658,426,726,783,156,020,576,256 (~ 3.2 × 10^32)
That's many more than a trillion trillion possible configurations.
More than the estimated number of stars in the Universe.
Not the galaxy. The entire Universe.
And only one of those is the one you want. All the others are some kind of misconfiguration.
Do you like those odds?
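As a quick shell check of the arithmetic (assumes `bc` is installed):

```bash
# 2 settings per parameter; 6 parameters × 6 services × 3 hosts = 108 binary choices
echo '2^(6*6*3)' | bc
# 324518553658426726783156020576256
```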
DevOps/GitOps/IaC is an upfront cost that pays dividends each time it is applied. And the more your infra builds out, the larger those per-build dividends grow.
Conversely, absent DevOps/GitOps/IaC, every stage of the build out is levied a tax dwarfing that of the prior stage. The non-linear explosion of misconfigurations is merciless. It grinds down morale along with productivity.
Principles
- Declarative
  A system managed by GitOps must have its desired state expressed declaratively.
- Versioned and Immutable
  Desired state is stored in a way that enforces immutability and versioning, and retains a complete version history.
- Pulled Automatically
  Software agents automatically pull the desired state declarations from the source.
- Continuously Reconciled
  Software agents continuously observe actual system state and attempt to apply the desired state.
Results
- A standard workflow for application development.
- Increased security, with application requirements set upfront.
- Improved reliability with visibility and version control through Git.
- Consistency across clusters and their environments.
Methods
- Declarative Configuration:
  Use declarative configurations (YAML files) for all resources and store them in a Git repository. This ensures that the desired state of your cluster is version-controlled and auditable (see the sketch after this list).
- Branching Strategies:
  Trunk-based rather than Gitflow, to manage different environments (development, staging, production) or to handle feature development and releases.
- Pull Request Workflow:
  Use pull (merge) requests (PR/MR) to manage changes to the Kubernetes configuration. This allows for code review, approval processes, and automated testing before changes are merged and applied.
- Automated Deployment:
  Implement CI/CD pipelines that automatically apply changes from Git to your Kubernetes cluster. This could involve testing changes in a staging environment before promoting them to production.
- Disaster Recovery:
  Regularly back up your Git repository and Kubernetes cluster state. Ensure you have a process in place for restoring from backups in case of a disaster.
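A minimal sketch of the declarative-configuration flow named above; the repo URL and directory layout are hypothetical:

```bash
# Clone the config repo that holds the declared desired state
git clone https://example.com/acme/k8s-config.git && cd k8s-config
# envs/production/ contains the YAML manifests for that environment
kubectl apply -f envs/production/   # reconcile the cluster toward Git
```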
Tools | CNCF Landscape
- Videos
- Cloud Wrappers
- LocalStack : Mocks cloud-vendor services locally. Develop and test your AWS applications locally to reduce development time and increase product velocity. Reduce unnecessary AWS spend and remove the complexity and risk of maintaining AWS dev accounts.
- CloudCraft : 3D graphic and resource/cost model of a cloud infra
- Service Catalog : UI of IDP : built/maintained
by GitOps/DevOps vendor/admin, not by end users.
- Port : SaaS only
- Backstage.io : Build Developers' Portals (IDP)
- Crossplane.io @ GitHub
- Programmable Control Plane, Controllers, APIs
- Embeds IaC tooling such as Terraform, Helm, and Ansible, converting IaC into cloud vendors' API requests.
- IaC : Service Management : Provision/Configure:
  - Kubernetes :
    Cluster API, Crossplane, …
    K8s is a universal Control Plane
  - Pulumi :
    IaC in any language, but a pay-per-play platform interfacing to cloud vendors.
    - sst : Pulumi wrapper : Deploy everything your app needs with a single config.
  - Terraform :
    Declarative provisioning of cloud infrastructure and policies (per-vendor modules), and managing Kubernetes resources (see the loop sketch after this list).
  - Ansible :
    Provision and configure infrastructure, OS/packages, and application software in any environment. A comprehensive, versatile automation tool allowing for both declarative and imperative methods.
  - SaltStack | ChatGPT : Infrastructure automation and CM.
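For illustration, the standard Terraform loop referenced above (assumes a provider and state backend are already configured):

```bash
terraform init              # fetch providers/modules, wire up the state backend
terraform plan -out=tfplan  # preview the diff between declared and actual state
terraform apply tfplan      # reconcile the cloud toward the declared state
```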
- IaC : Workloads
- Application Management (K8s Manifests)
- K8s Operator Pattern
The goal of an Operator is to put operational knowledge into software. Operators implement and automate common Day-1 (installation, configuration, etc.) and Day-2 (re-configuration, update, backup, failover, restore, etc.) activities in a piece of software running in K8s, by integrating natively with K8s concepts and APIs. We call this a K8s-native application. Instead of treating an app as a collection of primitives (Pods, Deployments, Services, ConfigMaps, …), it's treated as a single object that only exposes the knobs that make sense for the application, extending the core K8s API with CRDs as needed to do so.
- Operator Framework
- Mast : Ansible runner to build simple and lightweight K8s Operators
- Kopf (Kubernetes Operator Pythonic Framework) : Framework and library for building K8s Operators
- kube.rs : Rust client for K8s
- kubebuilder : Project for learning and building K8s API extensions
- Helm:
  K8s package manager; version controlled and deployable using GitOps tools like Argo CD or Flux.
- Kustomize:
  Generate, customize, and/or otherwise manage Kubernetes objects using files (YAML) stored in a Git repo. It's integrated into `kubectl` and can be used with other GitOps tools to manage deployments. Use it to modify a Helm chart per environment (see the sketch after this list).
- Timoni.sh (uses CUE) :
  Distribution and lifecycle management for cloud-native applications.
- KCL @ GitHub : An open-source constraint-based record & functional language mainly used in configuration and policy scenarios. Written in Rust, Golang, Python.
- CUE
- Pkl :
  Configuration that is Programmable, Scalable, and Safe
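A minimal sketch of the Kustomize flow referenced above; the directory names are hypothetical:

```bash
# overlays/prod/kustomization.yaml patches the shared base/ manifests
kubectl kustomize overlays/prod   # render the patched manifests to stdout
kubectl apply -k overlays/prod    # or build and apply in one step
```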
- CI : Glorified Cron Job
- Dagger Functions : Pipeline-agnostic functions that run in the CI/CD pipeline of any vendor.
- Tekton
- Argo Workflows
- Jenkins
- GitHub Actions
- GitLab CI
- CD : Application Lifecycle
- Flux CD
- A tool to automatically sync K8s clusters/applications with their configuration sources (Git) across their lifecycles.
- Supports automated deployments, where changes to the Git repo trigger updates in the Kubernetes cluster.
- Handles secret management and multi-tenancy.
- Flagger integration: Flux can be used together with Flagger for progressive delivery; advanced deployment strategies.
- Flagger:
- Automates the release process by gradually shifting traffic to the new version while measuring metrics and running conformance tests. If anomalies are detected, Flagger can automatically rollback.
- Designed for progressive delivery techniques like canary releases, A/B testing, and blue/green deployments.
- Service-Mesh Integration: Used with service meshes like Istio, Linkerd, and others, leveraging their features for traffic shifting and monitoring.
- Argo CD:
- A declarative, GitOps continuous delivery tool for K8s. Visualize (Web UI) and manage the lifecycle of K8s applications; supports automated or manual syncing of changes (see the Application sketch after this list).
- For CD, also need Argo Workflows & Argo Events (preferred over Tekton Events)
- Application Definitions, Configurations, and Environments: All these are declaratively managed and versioned in Git.
- Automated Deployment: Argo CD automatically applies changes made in the Git repository to the designated Kubernetes clusters.
- Visualizations and UI: Argo CD provides a rich UI and CLI for viewing the state and history of applications, aiding in troubleshooting and management.
- Rollbacks and Manual Syncs: Supports rollbacks and manual interventions for syncing with Git repositories.
- Argo Rollouts: Advanced deployment strategies like canary and blue/green. Similar to Flagger
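A minimal sketch of an Argo CD Application that keeps a cluster synced to a Git path; the repo URL, path, and names are hypothetical:

```bash
kubectl apply -f - <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: demo-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://example.com/acme/k8s-config.git  # hypothetical repo
    targetRevision: main
    path: envs/production
  destination:
    server: https://kubernetes.default.svc
    namespace: demo
  syncPolicy:
    automated:
      prune: true     # delete resources removed from Git
      selfHeal: true  # revert out-of-band changes
EOF
```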
- Multi-tenancy
- vCluster Virtual clusters for better isolation than Namespace offers. OSS and Enterprise editions.
- Logging : Cluster-level logging, AKA Log Aggregation, AKA Unified Logging, so that logs survive their (ephemeral) generators, be they host or container processes.
- Elastic stack : to collect, store, query, and visualize log data.
- Composed of:
- Backend : Elasticsearch : A search & analytics engine, with an integral storage scheme. Elasticsearch uses a distributed document-oriented database model where it stores data in indices. These indices are persisted to disk in a data directory, typically managed by Elasticsearch nodes. The storage and retrieval of data are handled internally by Elasticsearch using its own mechanisms, such as the Lucene library for indexing and searching.
- Frontend : Kibana : Web UI optimized for querying and viewing (Explore, Visualize, Discover) logs from Elasticsearch.
- Agent : Collector/Forwarder of container logs : This is the data-processing pipeline that ingests logs from applications, then transforms (normalizes) and forwards them to provide Unified Logging. This is the stack's workhorse, yet oddly external to the stack namesake and core (Elasticsearch/Kibana). Solutions are provided by various projects, many entirely separate from Elastic (the company); see the Helm sketch at the end of this Logging section:
- Logstash : Elastic's native solution
- Fluentd : Data collector (not limited to logs; also metrics and traces).
- Fluent Bit : Lightweight forwarder for Fluentd, for environments with limited resources
- Fluent Operator, formerly "FluentBit Operator" : Manage Fluent Bit and Fluentd the Kubernetes way.
- Stacks
- ECK Operator (Elastic Cloud on K8s) Contains only Elasticsearch and Kibana. Does not include any Collector/Forwarder (Fluentd, Logstash, …)
- Deploy in air-gap environment :
- EFK Stack | HowTo | Helm
- ELK stack : Logstash instead of Fluentd for log processing and aggregation. Logstash is more resource-intensive but offers more complex processing capabilities.
- OpenSearch : FOSS fork of Elastic stack (Elasticsearch/Kibana)
- Data Prepper : Data collector designed specifically for OpenSearch; focus is on observability data, particularly logs, metrics, and traces.
- Grafana Loki | grafana/loki (Install) : "Prometheus, but for logs". A lightweight alternative to the Elastic stack.
  - Does not provide full-text indexing of logs; indexes only the logs' metadata (labels).
  - No viable installation method is available (2024-08), contrary to project claims.
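A sketch of standing up the collector/forwarder layer with Helm; the release and namespace names are hypothetical, the chart repo is the Fluent project's:

```bash
helm repo add fluent https://fluent.github.io/helm-charts
helm repo update
# DaemonSet that tails container logs on every node and forwards them
helm install fluent-bit fluent/fluent-bit -n logging --create-namespace
```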
- Observability : Distributed Tracing and Metrics
- Prometheus : TSDB and monitoring system optimized for metrics telemetry.
  The de facto standard, but it does not scale, and its alerting (Alertmanager) is horrible. So popular that projects provide workarounds to manage scaling. Provision using the Prometheus Operator:
  - prometheus-operator/prometheus-operator :
    The bare operator (`bundle.yaml`)
  - kube-prometheus :
    A collection of Kubernetes manifests, Grafana dashboards, and Prometheus rules combined with documentation and scripts to provide … end-to-end Kubernetes cluster monitoring with Prometheus using the Prometheus Operator. Includes:
    - The Prometheus Operator
    - Grafana
    - Highly available Prometheus
    - Highly available Alertmanager
    - Prometheus `node-exporter`
    - Prometheus `blackbox-exporter`
    - Prometheus Adapter for Kubernetes Metrics APIs
    - `kube-state-metrics` ; replacement for `metrics-server`
  - Install using one of two very similar projects (see the Helm sketch below):
    - `kube-prometheus` : Manifest method : prometheus-operator/kube-prometheus
    - `kube-prometheus-stack` : Helm method : prometheus-community/kube-prometheus-stack
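For the Helm method, a minimal sketch (the release and namespace names are hypothetical):

```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
# Installs the operator, Prometheus, Alertmanager, Grafana, and the exporters
helm install kps prometheus-community/kube-prometheus-stack \
  -n monitoring --create-namespace
```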
- Thanos @ GitHub : Prometheus HA + long-term storage (MinIO) : CNCF project; can "seamlessly upgrade" on top of an existing Prometheus deployment.
- Grafana : Web UI : Dashboards
- Grafana Tempo : Tracing backend; scales and integrates with OpenTelemetry, Zipkin, and Jaeger; fixes Jaeger shortcomings.
- Jaeger : Tracing collector that integrates with OpenTelemetry
- Jaeger Operator @ GitHub
  - Requires `cert-manager`
- OpenTelemetry (OTEL)
  Vendor-agnostic tracing library for generating traces. Its app library covers almost all languages.
  - OpenTelemetry Operator @ GitHub : K8s Operator to manage collectors (OpenTelemetry Collector) and auto-instrumentation of workloads using OTEL libraries.
- VictoriaMetrics : TSDB & Monitoring Solution (as a Service); compatible with Prometheus.
- Inspektor-Gadget.io : eBPF-based CLIs (gadgets)

```bash
kubectl gadget deploy
# Monitor all network traffic of a namespace
kubectl gadget advise network-policy monitor -n $ns -o network.$ns.log.json
# Processes in containers of Pods
kubectl gadget snapshot process -n $ns
# Top files (by I/O) of a namespace
kubectl gadget top file -n $ns
# Trace TCP connections into services of a namespace
kubectl gadget trace tcp -n $ns
```
- Expands eBPF usage from single nodes to the entire cluster
- Maps low-level Linux resources to high-level Kubernetes concepts
- Use stand-alone or integrate into your own tooling, e.g., Prometheus metrics.
- Several tools already use it, e.g., Kubescape
- Robusta : Alerting
- Komodor : Troubleshooting
- Pixie : All in one
- Groundcover : All in one
- Streaming/Messaging : Run on dedicated nodes
- RabbitMQ : A widely used open-source message broker that supports multiple messaging protocols, including AMQP, MQTT, and STOMP. It's known for its simplicity, ease of setup, and support for various messaging patterns like work queues, publish-subscribe, and routing. RabbitMQ is a good choice for IoT and other simpler, high-throughput messaging scenarios.
- Strimzi : Kafka on K8s : `strimzi-kafka-operator` : For production features such as rack awareness to spread brokers across availability zones, and K8s taints and tolerations to run Kafka on dedicated nodes. Expose Kafka outside K8s using NodePort, Load balancer, Ingress, and OpenShift Routes. Easily secured using TLS over TCP. The Kube-native management of Kafka can also manage Kafka topics, users, Kafka MirrorMaker, and Kafka Connect using Custom Resources, allowing K8s processes and tooling to manage complete Kafka applications (see the bootstrap sketch after this list).
  - Kafka operators to deploy and configure an Apache Kafka cluster on K8s.
  - Kafka Bridge provides a RESTful interface for your HTTP clients.
- NATS : A lightweight, high-performance messaging system designed for microservices, IoT, and cloud-native systems. It supports various messaging models including pub-sub, request-reply, and queueing. NATS is known for its simplicity and performance.
- Redpanda : A newer, Kafka-compatible streaming platform designed to offer better performance and easier operation. It is API-compatible with Kafka, which means existing Kafka clients and ecosystem tools work with Redpanda without modification. Redpanda is designed to be simpler to deploy and manage, with a focus on reducing operational overhead.
  > Enabling SELinux can result in latency issues. If you wish to avoid such latency issues, do not use this mechanism.
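A sketch of bootstrapping Strimzi (the namespace name is arbitrary; the install URL is the project's published manifest bundle):

```bash
kubectl create namespace kafka
# Installs the Strimzi cluster operator and its CRDs
kubectl create -f 'https://strimzi.io/install/latest?namespace=kafka' -n kafka
```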
- Networking
- External Load Balancer
- CNI
- Ingress
- Istio
- Traefik : Automatically wires routes via service discovery
- Ingress-Nginx
- Cilium : eBPF
- Consul
- K8s Gateway API
- Service Mesh
- Service Discovery
- etcd : K8s cluster
- Consul : Multi-cluster
IA/Security
- AuthN/AuthZ
- AuthN (Authentication)
- K8s
- Two types of subject : K8s provides for binding either type to roles for cluster access:
  - `ServiceAccount` : K8s object declaring a non-human entity (subject), e.g., a Pod.
  - A user and/or group : K8s concept of a human entity (subject).
    Though K8s has neither user nor group objects, it searches for these subjects in certificates and tokens, and provides for binding them to roles.
- Two scenarios:
- Clients authenticating against the K8s API server
- The two most common methods:
- X.509 certificate issued by K8s CA
- Token (JWT) generated by an OIDC provider, e.g., Dex or Keycloak, which may proxy an upstream Identity Provider (IdP) such as AD. K8s recognizes the subject, e.g., by token claims of user/group, or by a `ServiceAccount` having K8s `cluster.user`.
- Regardless of method, identities that match a (`Cluster`)`RoleBinding` are authorized for access according to the associated (`Cluster`)`Role`.
- Users authenticating at a web UI against an application running on the cluster.
  - Token (JWT) generated by an OIDC provider, which may be the same as in the other scenario, enabling Single Sign-On (SSO), since OIDC is just an extension of OAuth2.
- Authentication Plugins
- Static Token file
- Bearer token
- Service Account token
- X.509 certificates (TLS)
- Open ID Connect (OIDC) token
- Authentication proxy
- Webhook
- AuthZ (Authorization) | Modules/Modes
  Regardless of authentication method, K8s can implement the Role-Based Access Control (RBAC) model against subjects (known by request attributes) using a pair of K8s objects for each of the two scopes of K8s API resources (`api-resources`):
):- K8s
  - Namespaced (`Deployment`, `Pod`, `Service`, …)
    - `Role` : Rules declaring the allowed actions (`verbs`) upon `resources` scoped to APIs (`apiGroup`).
    - `RoleBinding` : Binds a subject (authenticated user or ServiceAccount) to a role (see the sketch after this list).
  - Cluster-wide (`PersistentVolume`, `StorageClass`, …)
    - `ClusterRole`
    - `ClusterRoleBinding`
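A minimal sketch of the namespaced pair; all names are illustrative:

```bash
kubectl apply -f - <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: demo
rules:
- apiGroups: [""]        # "" is the core API group
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: demo
subjects:
- kind: ServiceAccount
  name: demo-sa
  namespace: demo
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
EOF
```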
- Distributed-Workload Identities : AuthN providing IDs having AuthZ primitives.
- SPIFFE/SPIRE : Successor to RBAC for defining (SPIFFE) and implementing (SPIRE) a workload identity platform and access controls rooted in Zero Trust (versus Perimeter Security) principles to mitigate attack risk. SPIFFE/SPIRE provides a uniform identity layer across distributed systems. The core idea is to issue SVIDs (SPIFFE Verifiable Identity Documents), of either X.509 or JWT, to workloads based on strong attestation (e.g., node identity, container metadata). Tools like OPA, Istio, Linkerd, or Envoy with RBAC consume SPIFFE IDs for authorization policies (AuthZ). So the same artifact (SVID) is used for both AuthN (proof of identity) and as a handle for AuthZ rules.
- Secure Production Identity Framework for Everyone (SPIFFE) : An OSS framework specification to provide attested, cryptographic identities to distributed workloads; capable of bootstrapping and issuing identity to services; defines short-lived cryptographic identity documents (SVIDs) via a simple API. Workloads use these SVIDs when authenticating to other workloads, for example by establishing a TLS connection or by signing and verifying a JWT token.
- SPIFFE Runtime Environment (SPIRE) : A production-ready implementation of the SPIFFE APIs (pluggable multi-factor attestation and SPIFFE federation) that performs node and workload attestation in order to securely issue SVIDs to workloads, and verify the SVIDs of other workloads, based on a predefined set of conditions. E.g., `spiffe://cluster/ns/foo/sa/bar` is a cryptographically verifiable identity.
- Threat Detection / Remediation : CVEs (Common Vulnerabilities and Exposures)
- K8s Admission Controllers
- Trivy : Scans OCI container images, OS folders, Kubernetes clusters, Git repos, virtual machines, and more; can create SBOMs and CVE-vulnerability audits of them (see the CLI sketch after this list).
  - `trivy-operator` (installed by Helm chart) : Recurringly scans all container images; generates `VulnerabilityReport`s per Pod, DaemonSet, … across all/declared Namespaces.
- Kubescape (10K) : Runtime Detection
- Falco by Sysdig (7K) : Threat detection/reporting : Runtime security across hosts, containers, Kubernetes, and cloud environments. It leverages custom rules on Linux kernel events and other data sources through plugins, enriching event data with contextual metadata to deliver real-time alerts. Falco enables the detection of abnormal behavior, potential security threats, and compliance violations.
- `cve-bin-tool` : Python tool for finding known vulnerabilities in software, using data from the NVD's list of CVEs as well as known vulnerability data from Red Hat, the Open Source Vulnerability Database (OSV), the GitLab Advisory Database (GAD), and Curl.
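For illustration, scanning with the Trivy CLI (the image tag is arbitrary):

```bash
# Report known CVEs in an image
trivy image nginx:1.27
# Emit a CycloneDX SBOM instead of a report
trivy image --format cyclonedx --output sbom.json nginx:1.27
```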
- Policy Enforcement
- K8s Admission Controllers
- Kyverno Policy as Code (6K)
- The Kyverno CLI can be used to apply and test policies off-cluster, e.g., as part of IaC and CI/CD pipelines (see the policy sketch after this list).
- Kyverno Policy Reporter : a sub-project of Kyverno that provides in-cluster management of policy reports with a web-based graphical user interface.
- Kyverno JSON : a sub-project of Kyverno that allows applying Kyverno policies to off-cluster workload. It works on any JSON payload.
- Kyverno Chainsaw : a sub-project of Kyverno that provides declarative end-to-end testing for Kubernetes controllers.
- OPA/Gatekeeper (3.7K): Automated policy enforcement
- Kubearmor (1.6K): Runtime Security Enforcement : Policy-based controls : a runtime Kubernetes security engine that uses eBPF and Linux Security Modules (LSM) for fortifying workloads based on Cloud Containers, IoT/Edge, and 5G networks.
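A minimal sketch of a Kyverno validation policy; the policy name, label, and message are illustrative:

```bash
kubectl apply -f - <<'EOF'
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-team-label
spec:
  validationFailureAction: Enforce   # reject non-compliant resources
  rules:
  - name: check-team
    match:
      any:
      - resources:
          kinds: ["Pod"]
    validate:
      message: "The label 'team' is required."
      pattern:
        metadata:
          labels:
            team: "?*"   # any non-empty value
EOF
```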
Secrets
- Vault : `vault-helm` : `vault-secrets-operator` provides for various implementations:
  - Vault Agent Sidecar Injection : Most secure (rest/transit) : Transparent encrypt/decrypt of K8s `Secret` objects (`data`)
  - Vault CSI Driver : Fetch secrets on container init and mount as files in the container.
  - External Secrets Operator : Integrates with external secret stores (Vault, AWS Secrets Manager, Google Cloud Secret Manager); syncs secrets from Vault into K8s `Secret` objects.
  - Direct API Calls to Vault : Configured per app.
- External Secrets Operator : Pull/Push secrets; sync K8s `Secret` objects (decrypted) with an external store (encrypted).
- Teller : Like External Secrets Operator; a local CLI secrets manager for developers.
- Bitnami `SealedSecret`s :
  - Asymmetric crypto allows developers to encrypt secrets as a "`SealedSecret`" K8s object (CRD), stored outside the cluster in a (public) Git repo or other untrusted environment.
  - The in-cluster controller automatically decrypts it and creates a regular Kubernetes `Secret` object, accessible to your applications.
  - Once in the cluster, it is stored unencrypted in a K8s `Secret` object.
  - Components
    - A cluster-side `controller` / operator
    - A client-side utility, `kubeseal` : encrypts secrets that only the controller can decrypt (see the round-trip sketch below).
- Transit Engine : Works entirely within Vault and does not require a sidecar or agent within your Pods. It is used by the K8s control plane (e.g., API server) to perform encryption operations, ensuring data is encrypted when stored (e.g., in etcd) and decrypted when needed.
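A sketch of the SealedSecrets round trip referenced above; all names are illustrative:

```bash
# Draft a Secret locally without touching the cluster
kubectl create secret generic db-creds -n demo \
  --from-literal=password=s3cr3t --dry-run=client -o yaml > secret.yaml
# Encrypt it with the controller's public key; safe to commit to Git
kubeseal --format yaml < secret.yaml > sealed-secret.yaml
# In-cluster, the controller decrypts it into a regular Secret
kubectl apply -f sealed-secret.yaml
```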
- SOPS (Secrets OPerationS) : Secrets management; an editor that interfaces with Vault et al.
- `age` (Actually Good Encryption) : A simple CLI providing AEAD encrypt/decrypt. Rejoice over a replacement for the obnoxiously complicated PGP (Pretty Good Privacy) project.

```bash
# Install : bad binary
sudo curl -sSLo /usr/local/bin/age https://dl.filippo.io/age/latest?for=linux/amd64
# Install : okay
go install filippo.io/age/cmd/...@latest
sudo ln -s /home/u1/go/bin/age /usr/local/bin
# Generate public-private key pair
key=age.key
pub="$(age-keygen -o $key 2>&1 |cut -d':' -f2)"
# Encrypt a source (archive)
tar cvz ~/$src |age -r $pub > $src.tgz.age
# Decrypt a source
age -d -i $key $src.tgz.age > $src.tgz
```
Signing
- Sigstore Cosign
- Notary
TLS
- cert-manager : …obtain certificates from … public … as well as private Issuers …, and ensure the certificates are valid and up-to-date, and … renew certificates at a configured time before expiry.
Storage
- MinIO : A Kubernetes-native, high-performance object store with an S3-compatible API; supports deploying MinIO Tenants onto private and public cloud infrastructures, AKA Hybrid Cloud.
  - MinIO Operator : `minio/operator` | Docs
- Rook : Open-source cloud-native storage orchestrator, providing the platform, framework, and support for Ceph storage to natively integrate with cloud-native environments. Provides S3/Swift API.
- Ceph : Distributed storage system that provides file, block and object storage and is deployed in large scale production clusters on commodity hardware.
- JuiceFS : Distributed POSIX file system built on top of Redis and S3 (MinIO).
- Gluster
- Longhorn
- KubeFS
- Database
- Managed
- Aiven
- KubeBlocks : K8s Operator : Supports many databases
- TiKV : distributed, and transactional key-value database. FOSS. CNCF Graduated project.
- Cassandra : NoSQL distributed database. Apache/CNCF project.
- NiFi @ ChatGPT : A system to ingest, process, and distribute data (from anywhere); automated and managed flow of information between systems; suited for complex data integration, ETL processes, real-time data flows, and scenarios requiring detailed data lineage and tracking. Apache/CNCF project.
- CloudNativePG (CNPG) : K8s Operator covering the full lifecycle of a highly available PostgreSQL database cluster with a primary/standby architecture, using native streaming replication. A CNCF project (see the minimal sketch after this list).
- Atlas Operator : Schema management
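A minimal sketch of a CNPG cluster, as referenced above; the name and sizes are illustrative:

```bash
kubectl apply -f - <<'EOF'
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-demo
spec:
  instances: 3   # one primary, two streaming-replication standbys
  storage:
    size: 10Gi
EOF
```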
Misc
- Charm : Library and Tools
- Velero : Backup K8s cluster and `PersistentVolume`s
Environments
Upon what infrastructure does the app AKA workload AKA service run?
- Cloud : 3rd-party vendor, typically virtual; SDNs, VMs, …
- On-prem : Self managed; physical and/or virtual
- Bare-metal : OS and app on physical machine, sans hypervisor/virtualization, regardless of whether on-prem or in cloud.
- Edge : More than just a reference to gateway router(s); an environment and topology.
Distributed architectures and practices for processing data closer to where it is generated or consumed.
- Computing:
- Proximal to Data : Located close to the source of data, such as IoT devices, sensors, or users. This proximity allows for faster data processing and reduced latency.
- Distributed Architecture : Deploying smaller, localized data centers or computing resources that work together with centralized cloud services. This creates a distributed architecture where certain tasks are handled at the edge, while others are processed in the cloud or a central data center.
- Real-Time Processing : For applications that require real-time processing and quick decision-making, such as autonomous vehicles, industrial automation, and smart cities.
- Reduced Bandwidth Usage : Only the relevant or processed data needs to be sent to the central data center/cloud, reducing the amount of data egress.
- Environment:
- Edge Devices : Sensors, IoT devices, smart appliances, …
- To generate or consume data.
- Edge Servers or Mini Data Centers : small-scale computing resources in retail stores, factories, telecom towers, vehicles, … deployed close to edge devices
- To process and analyze data locally.
- Edge Gateways : Routers and other devices.
- To aggregate data from various edge devices and sometimes perform initial processing before forwarding data to central servers or the cloud.
Configuration
- Repository Structure:
  Organize your Git repository in a way that reflects your deployment environments and application structure. This could involve separate directories for each environment and application (see the layout sketch after this list).
- Secrets Management:
  Use tools like Bitnami Sealed Secrets, SOPS, or Vault for encrypting secrets that are stored in Git. This ensures that sensitive information is securely managed.
- Monitoring and Alerting:
  Integrate monitoring and alerting tools to track the health of your deployments and the Kubernetes cluster. Prometheus and Grafana are commonly used tools that can be managed via GitOps.
- Policy Enforcement:
  Use policy-as-code tools like Open Policy Agent (OPA) or Kyverno to enforce policies on your Kubernetes clusters. Store policies in Git to apply them consistently across your environments.
- Security Scanning:
  Implement security scanning of your Docker images and Kubernetes configurations as part of your CI/CD pipelines. Tools like Trivy, Clair, and KubeLinter can be integrated into your workflows.
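A sketch of one such layout; the directory names are hypothetical:

```bash
# One directory per environment; apps and policies versioned together:
#   k8s-config/
#   ├── apps/web/          # base manifests per application
#   ├── envs/staging/      # per-environment overlays
#   ├── envs/production/
#   └── policies/          # Kyverno/OPA policies
mkdir -p k8s-config/{apps/web,envs/{staging,production},policies}
```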
Schemes for Unique Identifier
- UUID (Universally Unique Identifier):
  A widely used 128-bit number with a negligible collision probability across different systems, often used in databases and distributed systems; considered the standard for generating unique identifiers. [1, 2, 3]
- NanoID:
  A compact, URL-friendly unique string generator with a similar collision probability to UUID, designed for JavaScript environments. [1, 4, 5]
- ULID (Universally Unique Lexicographically Sortable Identifier): A unique identifier that is also sortable, combining timestamp and randomness for efficient database indexing. [1, 6]
- CUID (Collision-resistant Unique Identifier): A unique identifier that incorporates a timestamp, counter, and random characters, designed for collision resistance in distributed systems. [1, 5]
- Snowflake ID: A unique identifier scheme often used in distributed databases, incorporating timestamps and sequence numbers to generate unique IDs. [6, 7]
Key points to consider when choosing a unique identifier scheme:
- Collision probability: How likely it is for two different identifiers to be the same.
- Length and readability: How long the identifier is, and how easy it is to read and interpret.
- Sorting capabilities: Whether the identifier can be easily sorted in a database.
- Generation method: How the identifier is generated, whether it uses randomness or timestamps.
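For quick experiments, two standard ways to mint a random (v4) UUID from a shell:

```bash
uuidgen                            # from util-linux
cat /proc/sys/kernel/random/uuid   # kernel-provided, Linux only
```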