DevOps/GitOps v1.0.0
| CNCF Landscape
Overview
DevOps is about automating the lifecycle of an application. GitOps extends that with a disciplined method: Git as the single Source of Truth (SoT) across all layers of all components, from infra to services, with the goal of repeatable, verifiable deployment states.
GitOps is an operational framework that takes DevOps best practices used for application development, such as version control, collaboration, and compliance, and applies them to infrastructure automation. GitOps comprises Infrastructure as Code (IaC), configuration management (CM) through Git, Platform Engineering, and Continuous Integration and Continuous Delivery (CI/CD).
Why
Q:
How many possible configurations are there for 3 hosts, each having 6 services, each having 6 parameters, each having only two possible settings?
This scenario is an artificially simple infrastructure, chosen to steelman the argument against DevOps/GitOps/IaC. So, let's see what we may see …
A:
Parameters per service: Each service has 6 parameters, and each parameter has 2 settings. So, the number of configurations for one service is:
2^6 = 64
Services per host: Each host has 6 services, so the number of configurations for one host is:
64^6 = (2^6)^6 = 2^36 = 68,719,476,736
Total hosts: There are 3 hosts, so the total number of configurations is:
(68,719,476,736)^3 = (2^36)^3 = 2^108
So, the number of possible configurations is:
324,518,553,658,426,726,783,156,020,576,256 (~ 3.2 × 10^32)
That's many more than a trillion trillion possible configurations.
More than the estimated number of stars in the Universe.
Not the galaxy. The entire Universe.
And only one of those is the one you want. All the others are some kind of misconfiguration.
Do you like those odds?
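As a quick shell check of the arithmetic (assumes `bc` is installed):

```bash
# 2 settings per parameter; 6 parameters × 6 services × 3 hosts = 108 binary choices
echo '2^(6*6*3)' | bc
# 324518553658426726783156020576256
```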
DevOps/GitOps/IaC is an upfront cost that pays dividends each time it is applied. And the more your infra builds out, the larger those per-build dividends grow.
Conversely, absent DevOps/GitOps/IaC, every stage of the build out is levied a tax dwarfing that of the prior stage. The non-linear explosion of misconfigurations is merciless. It grinds down morale along with productivity.
Principles
- Declarative
  A system managed by GitOps must have its desired state expressed declaratively.
- Versioned and Immutable
  Desired state is stored in a way that enforces immutability and versioning, and retains a complete version history.
- Pulled Automatically
  Software agents automatically pull the desired state declarations from the source.
- Continuously Reconciled
  Software agents continuously observe actual system state and attempt to apply the desired state.
Results
- A standard workflow for application development.
- Increased security, with application requirements set upfront.
- Improved reliability with visibility and version control through Git.
- Consistency across clusters and their environments.
Methods
- Declarative Configuration:
  Use declarative configurations (YAML files) for all resources and store them in a Git repository. This ensures that the desired state of your cluster is version-controlled and auditable (see the sketch after this list).
- Branching Strategies:
  Trunk-based rather than Gitflow, to manage different environments (development, staging, production) or to handle feature development and releases.
- Pull Request Workflow:
  Use pull (merge) requests (PR/MR) to manage changes to the Kubernetes configuration. This allows for code review, approval processes, and automated testing before changes are merged and applied.
- Automated Deployment:
  Implement CI/CD pipelines that automatically apply changes from Git to your Kubernetes cluster. This could involve testing changes in a staging environment before promoting them to production.
- Disaster Recovery:
  Regularly back up your Git repository and Kubernetes cluster state. Ensure you have a process in place for restoring from backups in case of a disaster.
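A minimal sketch of the declarative-configuration flow named above; the repo URL and directory layout are hypothetical:

```bash
# Clone the config repo that holds the declared desired state
git clone https://example.com/acme/k8s-config.git && cd k8s-config
# envs/production/ contains the YAML manifests for that environment
kubectl apply -f envs/production/   # reconcile the cluster toward Git
```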
Tools | CNCF Landscape
- Videos
- Cloud Wrappers
- LocalStack : Mocks cloud-vendor services locally. Develop and test your AWS applications locally to reduce development time and increase product velocity. Reduce unnecessary AWS spend and remove the complexity and risk of maintaining AWS dev accounts.
- CloudCraft : 3D graphic and resource/cost model of a cloud infra
- Service Catalog : UI of IDP : built/maintained
by GitOps/DevOps vendor/admin, not by end users.
- Port : SaaS only
- Backstage.io : Build Developers' Portals (IDP)
- Crossplane.io @ GitHub
- Programmable Control Plane, Controllers, APIs
- Embeds IaC tooling such as Terraform, Helm, and Ansible, converting IaC into cloud vendors' API requests.
- IaC : Service Management : Provision/Configure:
  - Kubernetes :
    Cluster API, Crossplane, …
    K8s is a universal Control Plane
  - Pulumi :
    IaC in any language, but a pay-per-play platform interfacing to cloud vendors.
    - sst : Pulumi wrapper : Deploy everything your app needs with a single config.
  - Terraform :
    Declarative provisioning of cloud infrastructure and policies (per-vendor modules), and managing Kubernetes resources (see the loop sketch after this list).
  - Ansible :
    Provision and configure infrastructure, OS/packages, and application software in any environment. A comprehensive, versatile automation tool allowing for both declarative and imperative methods.
  - SaltStack | ChatGPT : Infrastructure automation and CM.
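For illustration, the standard Terraform loop referenced above (assumes a provider and state backend are already configured):

```bash
terraform init              # fetch providers/modules, wire up the state backend
terraform plan -out=tfplan  # preview the diff between declared and actual state
terraform apply tfplan      # reconcile the cloud toward the declared state
```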
- IaC : Workloads
- Application Management (K8s Manifests)
- K8s Operator Pattern
The goal of an Operator is to put operational knowledge into software. Operators implement and automate common Day-1 (installation, configuration, etc.) and Day-2 (re-configuration, update, backup, failover, restore, etc.) activities in a piece of software running in K8s, by integrating natively with K8s concepts and APIs. We call this a K8s-native application. Instead of treating an app as a collection of primitives (Pods, Deployments, Services, ConfigMaps, …), it's treated as a single object that only exposes the knobs that make sense for the application, extending the core K8s API with CRDs as needed to do so.
- Operator Framework
- Mast : Ansible runner to build simple and lightweight K8s Operators
- Kopf (Kubernetes Operator Pythonic Framework) : Framework and library for building K8s Operators
- kube.rs : Rust client for K8s
- kubebuilder : Project for learning and building K8s API extensions
- Helm:
  K8s package manager; version controlled and deployable using GitOps tools like Argo CD or Flux.
- Kustomize:
  Generate, customize, and/or otherwise manage Kubernetes objects using files (YAML) stored in a Git repo. It's integrated into `kubectl` and can be used with other GitOps tools to manage deployments. Use it to modify a Helm chart per environment (see the sketch after this list).
- Timoni.sh (uses CUE) :
  Distribution and lifecycle management for cloud-native applications.
- KCL @ GitHub : An open-source constraint-based record & functional language mainly used in configuration and policy scenarios. Written in Rust, Golang, Python.
- CUE
- Pkl :
  Configuration that is Programmable, Scalable, and Safe
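A minimal sketch of the Kustomize flow referenced above; the directory names are hypothetical:

```bash
# overlays/prod/kustomization.yaml patches the shared base/ manifests
kubectl kustomize overlays/prod   # render the patched manifests to stdout
kubectl apply -k overlays/prod    # or build and apply in one step
```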
- CI : Glorified Cron Job
- Dagger Functions : Pipeline-agnostic functions that run in the CI/CD pipeline of any vendor.
- Tekton
- Argo Workflows
- Jenkins
- GitHub Actions
- GitLab CI
- CD : Application Lifecycle
- Flux CD
- A tool to automatically sync K8s clusters/applications with their configuration sources (Git) across their lifecycles.
- Supports automated deployments, where changes to the Git repo trigger updates in the Kubernetes cluster.
- Handles secret management and multi-tenancy.
- Flagger integration: Flux can be used together with Flagger for progressive delivery; advanced deployment strategies.
- Flagger:
- Automates the release process by gradually shifting traffic to the new version while measuring metrics and running conformance tests. If anomalies are detected, Flagger can automatically rollback.
- Designed for progressive delivery techniques like canary releases, A/B testing, and blue/green deployments.
- Service-Mesh Integration: Used with service meshes like Istio, Linkerd, and others, leveraging their features for traffic shifting and monitoring.
- Argo CD:
- A declarative, GitOps continuous delivery tool for K8s. Visualize (Web UI) and manage the lifecycle of K8s applications; supports automated or manual syncing of changes (see the Application sketch after this list).
- For CD, also need Argo Workflows & Argo Events (preferred over Tekton Events)
- Application Definitions, Configurations, and Environments: All these are declaratively managed and versioned in Git.
- Automated Deployment: Argo CD automatically applies changes made in the Git repository to the designated Kubernetes clusters.
- Visualizations and UI: Argo CD provides a rich UI and CLI for viewing the state and history of applications, aiding in troubleshooting and management.
- Rollbacks and Manual Syncs: Supports rollbacks and manual interventions for syncing with Git repositories.
- Argo Rollouts: Advanced deployment strategies like canary and blue/green. Similar to Flagger
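A minimal sketch of an Argo CD Application that keeps a cluster synced to a Git path; the repo URL, path, and names are hypothetical:

```bash
kubectl apply -f - <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: demo-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://example.com/acme/k8s-config.git  # hypothetical repo
    targetRevision: main
    path: envs/production
  destination:
    server: https://kubernetes.default.svc
    namespace: demo
  syncPolicy:
    automated:
      prune: true     # delete resources removed from Git
      selfHeal: true  # revert out-of-band changes
EOF
```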
- Multi-tenancy
- vCluster Virtual clusters for better isolation than Namespace offers. OSS and Enterprise editions.
- Logging : Cluster-level logging, AKA Log Aggregation, AKA Unified Logging, so that logs survive their (ephemeral) generators, be they host or container processes.
- Elastic stack : to collect, store, query, and visualize log data.
- Composed of:
- Backend : Elasticsearch : A search & analytics engine, with an integral storage scheme. Elasticsearch uses a distributed document-oriented database model where it stores data in indices. These indices are persisted to disk in a data directory, typically managed by Elasticsearch nodes. The storage and retrieval of data are handled internally by Elasticsearch using its own mechanisms, such as the Lucene library for indexing and searching.
- Frontend : Kibana : Web UI optimized for querying and viewing (Explore, Visualize, Discover) logs from Elasticsearch.
- Agent : Collector/Forwarder of container logs : This is the data-processing pipeline that ingests logs from applications, then transforms (normalizes) and forwards them to provide Unified Logging. This is the stack's workhorse, yet oddly external to the stack namesake and core (Elasticsearch/Kibana). Solutions are provided by various projects, many entirely separate from Elastic (the company); see the Helm sketch at the end of this Logging section:
- Logstash : Elastic's native solution
- Fluentd : Data collector (not limited to logs; also metrics and traces).
- Fluent Bit : Lightweight forwarder for Fluentd, for environments with limited resources
- Fluent Operator, formerly "FluentBit Operator" : Manage Fluent Bit and Fluentd the Kubernetes way.
- Stacks
- ECK Operator (Elastic Cloud on K8s) Contains only Elasticsearch and Kibana. Does not include any Collector/Forwarder (Fluentd, Logstash, …)
- Deploy in air-gap environment :
- EFK Stack | HowTo | Helm
- ELK stack : Logstash instead of Fluentd for log processing and aggregation. Logstash is more resource-intensive but offers more complex processing capabilities.
- OpenSearch : FOSS fork of Elastic stack (Elasticsearch/Kibana)
- Data Prepper : Data collector designed specifically for OpenSearch; focus is on observability data, particularly logs, metrics, and traces.
- Grafana Loki | grafana/loki (Install) : "Prometheus, but for logs". A lightweight alternative to the Elastic stack.
  - Does not provide full-text indexing of logs; indexes only the logs' metadata (labels).
  - No viable installation method is available (2024-08), contrary to project claims.
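A sketch of standing up the collector/forwarder layer with Helm; the release and namespace names are hypothetical, the chart repo is the Fluent project's:

```bash
helm repo add fluent https://fluent.github.io/helm-charts
helm repo update
# DaemonSet that tails container logs on every node and forwards them
helm install fluent-bit fluent/fluent-bit -n logging --create-namespace
```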
- Observability : Distributed Tracing and Metrics
- Prometheus : TSDB and monitoring system optimized for metrics telemetry.
  The de facto standard, but it does not scale, and its alerting (Alertmanager) is horrible. So popular that projects provide workarounds to manage scaling. Provision using the Prometheus Operator:
  - prometheus-operator/prometheus-operator :
    The bare operator (`bundle.yaml`)
  - kube-prometheus :
    A collection of Kubernetes manifests, Grafana dashboards, and Prometheus rules combined with documentation and scripts to provide … end-to-end Kubernetes cluster monitoring with Prometheus using the Prometheus Operator. Includes:
    - The Prometheus Operator
    - Grafana
    - Highly available Prometheus
    - Highly available Alertmanager
    - Prometheus `node-exporter`
    - Prometheus `blackbox-exporter`
    - Prometheus Adapter for Kubernetes Metrics APIs
    - `kube-state-metrics` ; replacement for `metrics-server`
  - Install using one of two very similar projects (see the Helm sketch below):
    - `kube-prometheus` : Manifest method : prometheus-operator/kube-prometheus
    - `kube-prometheus-stack` : Helm method : prometheus-community/kube-prometheus-stack
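For the Helm method, a minimal sketch (the release and namespace names are hypothetical):

```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
# Installs the operator, Prometheus, Alertmanager, Grafana, and the exporters
helm install kps prometheus-community/kube-prometheus-stack \
  -n monitoring --create-namespace
```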
- Thanos @ GitHub : Prometheus HA + long-term storage (MinIO) : CNCF project; can "seamlessly upgrade" on top of an existing Prometheus deployment.
- Grafana : Web UI : Dashboards
- Grafana Tempo : Tracing backend; scales and integrates with OpenTelemetry, Zipkin, and Jaeger; fixes Jaeger shortcomings.
- Jaeger : Tracing collector that integrates with OpenTelemetry
- Jaeger Operator @ GitHub
  - Requires `cert-manager`
- OpenTelemetry (OTEL)
  Vendor-agnostic tracing library for generating traces. Its app library covers almost all languages.
  - OpenTelemetry Operator @ GitHub : K8s Operator to manage collectors (OpenTelemetry Collector) and auto-instrumentation of workloads using OTEL libraries.
- VictoriaMetrics : TSDB & Monitoring Solution (as a Service); compatible with Prometheus.
- Inspektor-Gadget.io : eBPF-based CLIs (gadgets)

```bash
kubectl gadget deploy
# Monitor all network traffic of a namespace
kubectl gadget advise network-policy monitor -n $ns -o network.$ns.log.json
# Processes in containers of Pods
kubectl gadget snapshot process -n $ns
# Top files (by I/O) of a namespace
kubectl gadget top file -n $ns
# Trace TCP connections into services of a namespace
kubectl gadget trace tcp -n $ns
```
- Expands eBPF usage from single nodes to the entire cluster
- Maps low-level Linux resources to high-level Kubernetes concepts
- Use stand-alone or integrate into your own tooling, e.g., Prometheus metrics.
- Several tools already use it, e.g., Kubescape
- Robusta : Alerting
- Komodor : Troubleshooting
- Pixie : All in one
- Groundcover : All in one
- Streaming/Messaging : Run on dedicated nodes
- RabbitMQ : A widely used open-source message broker that supports multiple messaging protocols, including AMQP, MQTT, and STOMP. It's known for its simplicity, ease of setup, and support for various messaging patterns like work queues, publish-subscribe, and routing. RabbitMQ is a good choice for IoT and other simpler, high-throughput messaging scenarios.
- Strimzi : Kafka on K8s : `strimzi-kafka-operator` : For production features such as rack awareness to spread brokers across availability zones, and K8s taints and tolerations to run Kafka on dedicated nodes. Expose Kafka outside K8s using NodePort, Load balancer, Ingress, and OpenShift Routes. Easily secured using TLS over TCP. The Kube-native management of Kafka can also manage Kafka topics, users, Kafka MirrorMaker, and Kafka Connect using Custom Resources, allowing K8s processes and tooling to manage complete Kafka applications (see the bootstrap sketch after this list).
  - Kafka operators to deploy and configure an Apache Kafka cluster on K8s.
  - Kafka Bridge provides a RESTful interface for your HTTP clients.
- NATS : A lightweight, high-performance messaging system designed for microservices, IoT, and cloud-native systems. It supports various messaging models including pub-sub, request-reply, and queueing. NATS is known for its simplicity and performance.
- Redpanda : A newer, Kafka-compatible streaming platform designed to offer better performance and easier operation. It is API-compatible with Kafka, which means existing Kafka clients and ecosystem tools work with Redpanda without modification. Redpanda is designed to be simpler to deploy and manage, with a focus on reducing operational overhead.
  > Enabling SELinux can result in latency issues. If you wish to avoid such latency issues, do not use this mechanism.
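A sketch of bootstrapping Strimzi (the namespace name is arbitrary; the install URL is the project's published manifest bundle):

```bash
kubectl create namespace kafka
# Installs the Strimzi cluster operator and its CRDs
kubectl create -f 'https://strimzi.io/install/latest?namespace=kafka' -n kafka
```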
- Networking
- External Load Balancer
- CNI
- Ingress
- Istio
- Traefik : Automatically wires routes via service discovery
- Ingress-Nginx
- Cilium : eBPF
- Consul
- K8s Gateway API
- Service Mesh
- Service Discovery
- etcd : K8s cluster
- Consul : Multi-cluster
IA/Security
- AuthN/AuthZ
- AuthN (Authentication)
- K8s
- Two types of subject : K8s provides for binding either type to roles for cluster access:
  - `ServiceAccount` : K8s object declaring a non-human entity (subject), e.g., a Pod.
  - A user and/or group : K8s concept of a human entity (subject).
    Though K8s has neither user nor group objects, it searches for these subjects in certificates and tokens, and provides for binding them to roles.
- Two scenarios:
- Clients authenticating against the K8s API server
- The two most common methods:
- X.509 certificate issued by K8s CA
- Token (JWT) generated by an OIDC provider, e.g., Dex or Keycloak, which may proxy an upstream Identity Provider (IdP) such as AD. K8s recognizes the subject, e.g., by token claims of user/group, or by a `ServiceAccount` having K8s `cluster.user`.
- Regardless of method, identities that match a (`Cluster`)`RoleBinding` are authorized for access according to the associated (`Cluster`)`Role`.
- Users authenticating at a web UI against an application running on the cluster.
  - Token (JWT) generated by an OIDC provider, which may be the same as in the other scenario, enabling Single Sign-On (SSO), since OIDC is just an extension of OAuth2.
- Authentication Plugins
- Static Token file
- Bearer token
- Service Account token
- X.509 certificates (TLS)
- Open ID Connect (OIDC) token
- Authentication proxy
- Webhook
- AuthZ (Authorization) | Modules/Modes
  Regardless of authentication method, K8s can implement the Role-Based Access Control (RBAC) model against subjects (known by request attributes) using a pair of K8s objects for each of the two scopes of K8s API resources (`api-resources`):
):- K8s
  - Namespaced (`Deployment`, `Pod`, `Service`, …)
    - `Role` : Rules declaring the allowed actions (`verbs`) upon `resources` scoped to APIs (`apiGroup`).
    - `RoleBinding` : Binds a subject (authenticated user or ServiceAccount) to a role (see the sketch after this list).
  - Cluster-wide (`PersistentVolume`, `StorageClass`, …)
    - `ClusterRole`
    - `ClusterRoleBinding`
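A minimal sketch of the namespaced pair; all names are illustrative:

```bash
kubectl apply -f - <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: demo
rules:
- apiGroups: [""]        # "" is the core API group
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: demo
subjects:
- kind: ServiceAccount
  name: demo-sa
  namespace: demo
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
EOF
```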
- Distributed-Workload Identities : AuthN providing IDs having AuthZ primitives.
- SPIFFE/SPIRE : Successor to RBAC for defining (SPIFFE) and implementing (SPIRE) a workload identity platform and access controls rooted in Zero Trust (versus Perimeter Security) principles to mitigate attack risk. SPIFFE/SPIRE provides a uniform identity layer across distributed systems. The core idea is to issue SVIDs (SPIFFE Verifiable Identity Documents), of either X.509 or JWT, to workloads based on strong attestation (e.g., node identity, container metadata). Tools like OPA, Istio, Linkerd, or Envoy with RBAC consume SPIFFE IDs for authorization policies (AuthZ). So the same artifact (SVID) is used for both AuthN (proof of identity) and as a handle for AuthZ rules.
- Secure Production Identity Framework for Everyone (SPIFFE) : An OSS framework specification to provide attested, cryptographic identities to distributed workloads; capable of bootstrapping and issuing identity to services; defines short-lived cryptographic identity documents (SVIDs) via a simple API. Workloads use these SVIDs when authenticating to other workloads, for example by establishing a TLS connection or by signing and verifying a JWT token.
- SPIFFE Runtime Environment (SPIRE) : A production-ready implementation of the SPIFFE APIs (pluggable multi-factor attestation and SPIFFE federation) that performs node and workload attestation in order to securely issue SVIDs to workloads, and verify the SVIDs of other workloads, based on a predefined set of conditions. E.g., `spiffe://cluster/ns/foo/sa/bar` is a cryptographically verifiable identity.
- Threat Detection / Remediation : CVEs (Common Vulnerabilities and Exposures)
- K8s Admission Controllers
- Trivy : Scans OCI container images, OS folders, Kubernetes clusters, Git repos, virtual machines, and more; can create SBOMs and CVE-vulnerability audits of them (see the CLI sketch after this list).
  - `trivy-operator` (installed by Helm chart) : Recurringly scans all container images; generates `VulnerabilityReport`s per Pod, DaemonSet, … across all/declared Namespaces.
- Kubescape (10K) : Runtime Detection
- Falco by Sysdig (7K) : Threat detection/reporting : Runtime security across hosts, containers, Kubernetes, and cloud environments. It leverages custom rules on Linux kernel events and other data sources through plugins, enriching event data with contextual metadata to deliver real-time alerts. Falco enables the detection of abnormal behavior, potential security threats, and compliance violations.
- `cve-bin-tool` : Python tool for finding known vulnerabilities in software, using data from the NVD's list of CVEs as well as known vulnerability data from Red Hat, the Open Source Vulnerability Database (OSV), the GitLab Advisory Database (GAD), and Curl.
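For illustration, scanning with the Trivy CLI (the image tag is arbitrary):

```bash
# Report known CVEs in an image
trivy image nginx:1.27
# Emit a CycloneDX SBOM instead of a report
trivy image --format cyclonedx --output sbom.json nginx:1.27
```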
- Policy Enforcement
- K8s Admission Controllers
- Kyverno Policy as Code (6K)
- The Kyverno CLI can be used to apply and test policies off-cluster, e.g., as part of IaC and CI/CD pipelines (see the policy sketch after this list).
- Kyverno Policy Reporter : a sub-project of Kyverno that provides in-cluster management of policy reports with a web-based graphical user interface.
- Kyverno JSON : a sub-project of Kyverno that allows applying Kyverno policies to off-cluster workload. It works on any JSON payload.
- Kyverno Chainsaw : a sub-project of Kyverno that provides declarative end-to-end testing for Kubernetes controllers.
- OPA/Gatekeeper (3.7K): Automated policy enforcement
- Kubearmor (1.6K): Runtime Security Enforcement : Policy-based controls : a runtime Kubernetes security engine that uses eBPF and Linux Security Modules (LSM) for fortifying workloads based on Cloud Containers, IoT/Edge, and 5G networks.
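A minimal sketch of a Kyverno validation policy; the policy name, label, and message are illustrative:

```bash
kubectl apply -f - <<'EOF'
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-team-label
spec:
  validationFailureAction: Enforce   # reject non-compliant resources
  rules:
  - name: check-team
    match:
      any:
      - resources:
          kinds: ["Pod"]
    validate:
      message: "The label 'team' is required."
      pattern:
        metadata:
          labels:
            team: "?*"   # any non-empty value
EOF
```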
Secrets
- Vault : `vault-helm` : `vault-secrets-operator` provides for various implementations:
  - Vault Agent Sidecar Injection : Most secure (rest/transit) : Transparent encrypt/decrypt of K8s `Secret` objects (`data`)
  - Vault CSI Driver : Fetch secrets on container init and mount as files in the container.
  - External Secrets Operator : Integrates with external secret stores (Vault, AWS Secrets Manager, Google Cloud Secret Manager); syncs secrets from Vault into K8s `Secret` objects.
  - Direct API Calls to Vault : Configured per app.
- External Secrets Operator : Pull/Push secrets; sync K8s `Secret` objects (decrypted) with an external store (encrypted).
- Teller : Like External Secrets Operator; a local CLI secrets manager for developers.
- Bitnami `SealedSecret`s :
  - Asymmetric crypto allows developers to encrypt secrets as a "`SealedSecret`" K8s object (CRD), stored outside the cluster in a (public) Git repo or other untrusted environment.
  - The in-cluster controller automatically decrypts it and creates a regular Kubernetes `Secret` object, accessible to your applications.
  - Once in the cluster, it is stored unencrypted in a K8s `Secret` object.
  - Components
    - A cluster-side `controller` / operator
    - A client-side utility, `kubeseal` : encrypts secrets that only the controller can decrypt (see the round-trip sketch below).
- Transit Engine : Works entirely within Vault and does not require a sidecar or agent within your Pods. It is used by the K8s control plane (e.g., API server) to perform encryption operations, ensuring data is encrypted when stored (e.g., in etcd) and decrypted when needed.
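A sketch of the SealedSecrets round trip referenced above; all names are illustrative:

```bash
# Draft a Secret locally without touching the cluster
kubectl create secret generic db-creds -n demo \
  --from-literal=password=s3cr3t --dry-run=client -o yaml > secret.yaml
# Encrypt it with the controller's public key; safe to commit to Git
kubeseal --format yaml < secret.yaml > sealed-secret.yaml
# In-cluster, the controller decrypts it into a regular Secret
kubectl apply -f sealed-secret.yaml
```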
- SOPS (Secrets OPerationS) : Secrets management; an editor that interfaces with Vault et al.
- `age` (Actually Good Encryption) : A simple CLI providing AEAD encrypt/decrypt. Rejoice over a replacement for the obnoxiously complicated PGP (Pretty Good Privacy) project.

```bash
# Install : bad binary
sudo curl -sSLo /usr/local/bin/age https://dl.filippo.io/age/latest?for=linux/amd64
# Install : okay
go install filippo.io/age/cmd/...@latest
sudo ln -s /home/u1/go/bin/age /usr/local/bin
# Generate public-private key pair
key=age.key
pub="$(age-keygen -o $key 2>&1 |cut -d':' -f2)"
# Encrypt a source (archive)
tar cvz ~/$src |age -r $pub > $src.tgz.age
# Decrypt a source
age -d -i $key $src.tgz.age > $src.tgz
```
Signing
- Sigstore Cosign
- Notary
TLS
- cert-manager : …obtain certificates from … public … as well as private Issuers …, and ensure the certificates are valid and up-to-date, and … renew certificates at a configured time before expiry.
Storage
- MinIO : A Kubernetes-native, high-performance object store with an S3-compatible API; supports deploying MinIO Tenants onto private and public cloud infrastructures, AKA Hybrid Cloud.
  - MinIO Operator : `minio/operator` | Docs
- Rook : Open-source cloud-native storage orchestrator, providing the platform, framework, and support for Ceph storage to natively integrate with cloud-native environments. Provides S3/Swift API.
- Ceph : Distributed storage system that provides file, block and object storage and is deployed in large scale production clusters on commodity hardware.
- JuiceFS : Distributed POSIX file system built on top of Redis and S3 (MinIO).
- Gluster
- Longhorn
- KubeFS
- Database
- Managed
- Aiven
- KubeBlocks : K8s Operator : Supports many databases
- TiKV : distributed, and transactional key-value database. FOSS. CNCF Graduated project.
- Cassandra : NoSQL distributed database. Apache/CNCF project.
- NiFi @ ChatGPT : A system to ingest, process, and distribute data (from anywhere); automated and managed flow of information between systems; suited for complex data integration, ETL processes, real-time data flows, and scenarios requiring detailed data lineage and tracking. Apache/CNCF project.
- CloudNativePG (CNPG) : K8s Operator covering the full lifecycle of a highly available PostgreSQL database cluster with a primary/standby architecture, using native streaming replication. A CNCF project (see the minimal sketch after this list).
- Atlas Operator : Schema management
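A minimal sketch of a CNPG cluster, as referenced above; the name and sizes are illustrative:

```bash
kubectl apply -f - <<'EOF'
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-demo
spec:
  instances: 3   # one primary, two streaming-replication standbys
  storage:
    size: 10Gi
EOF
```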
Misc
- Charm : Library and Tools
- Velero : Backup K8s cluster and `PersistentVolume`s
Environments
Upon what infrastructure does the app AKA workload AKA service run?
- Cloud : 3rd-party vendor, typically virtual; SDNs, VMs, …
- On-prem : Self managed; physical and/or virtual
- Bare-metal : OS and app on physical machine, sans hypervisor/virtualization, regardless of whether on-prem or in cloud.
- Edge : More than just a reference to gateway router(s); an environment and topology.
Distributed architectures and practices for processing data closer to where it is generated or consumed.
- Computing:
- Proximal to Data : Located close to the source of data, such as IoT devices, sensors, or users. This proximity allows for faster data processing and reduced latency.
- Distributed Architecture : Deploying smaller, localized data centers or computing resources that work together with centralized cloud services. This creates a distributed architecture where certain tasks are handled at the edge, while others are processed in the cloud or a central data center.
- Real-Time Processing : For applications that require real-time processing and quick decision-making, such as autonomous vehicles, industrial automation, and smart cities.
- Reduced Bandwidth Usage : Only the relevant or processed data needs to be sent to the central data center/cloud, reducing the amount of data egress.
- Environment:
- Edge Devices : Sensors, IoT devices, smart appliances, …
- To generate or consume data.
- Edge Servers or Mini Data Centers : small-scale computing resources in retail stores, factories, telecom towers, vehicles, … deployed close to edge devices
- To process and analyze data locally.
- Edge Gateways : Routers and other devices.
- To aggregate data from various edge devices and sometimes perform initial processing before forwarding data to central servers or the cloud.
Configuration
- Repository Structure:
  Organize your Git repository in a way that reflects your deployment environments and application structure. This could involve separate directories for each environment and application (see the layout sketch after this list).
- Secrets Management:
  Use tools like Bitnami Sealed Secrets, SOPS, or Vault for encrypting secrets that are stored in Git. This ensures that sensitive information is securely managed.
- Monitoring and Alerting:
  Integrate monitoring and alerting tools to track the health of your deployments and the Kubernetes cluster. Prometheus and Grafana are commonly used tools that can be managed via GitOps.
- Policy Enforcement:
  Use policy-as-code tools like Open Policy Agent (OPA) or Kyverno to enforce policies on your Kubernetes clusters. Store policies in Git to apply them consistently across your environments.
- Security Scanning:
  Implement security scanning of your Docker images and Kubernetes configurations as part of your CI/CD pipelines. Tools like Trivy, Clair, and KubeLinter can be integrated into your workflows.
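A sketch of one such layout; the directory names are hypothetical:

```bash
# One directory per environment; apps and policies versioned together:
#   k8s-config/
#   ├── apps/web/          # base manifests per application
#   ├── envs/staging/      # per-environment overlays
#   ├── envs/production/
#   └── policies/          # Kyverno/OPA policies
mkdir -p k8s-config/{apps/web,envs/{staging,production},policies}
```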
Schemes for Unique Identifier
- UUID (Universally Unique Identifier):
  A widely used 128-bit number with a negligible collision probability across different systems, often used in databases and distributed systems; considered the standard for generating unique identifiers. [1, 2, 3]
- NanoID:
  A compact, URL-friendly unique string generator with a similar collision probability to UUID, designed for JavaScript environments. [1, 4, 5]
- ULID (Universally Unique Lexicographically Sortable Identifier): A unique identifier that is also sortable, combining timestamp and randomness for efficient database indexing. [1, 6]
- CUID (Collision-resistant Unique Identifier): A unique identifier that incorporates a timestamp, counter, and random characters, designed for collision resistance in distributed systems. [1, 5]
- Snowflake ID: A unique identifier scheme often used in distributed databases, incorporating timestamps and sequence numbers to generate unique IDs. [6, 7]
Key points to consider when choosing a unique identifier scheme:
- Collision probability: How likely it is for two different identifiers to be the same.
- Length and readability: How long the identifier is, and how easy it is to read and interpret.
- Sorting capabilities: Whether the identifier can be easily sorted in a database.
- Generation method: How the identifier is generated, whether it uses randomness or timestamps.
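For quick experiments, two standard ways to mint a random (v4) UUID from a shell:

```bash
uuidgen                            # from util-linux
cat /proc/sys/kernel/random/uuid   # kernel-provided, Linux only
```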