After reviewing 13 new CNCF Sandbox projects from 2023 H1, we’ll continue with 12 more additions from the same year. They include two batches of projects accepted to CNCF Sandbox on September 19th and December 19th, 2023. As before, we’ll list them by their formal categories, starting with the categories featuring more new projects.
Observability
1. Logging operator
- Website; GitHub
- ~1600 GH stars, ~130 contributors
- Initial commit: Jun 23, 2018
- License: Apache 2.0
- Original owner/creator: Cisco
- Languages: Go
- CNCF Sandbox: sandbox request; onboarding issue; DevStats
This operator streamlines the deployment and configuration of logging pipelines in Kubernetes. To do so, it leverages existing tooling, such as Fluent Bit for collecting logs and Fluentd and syslog-ng as log forwarders. (While Fluentd is a CNCF Graduated project, Fluent Bit is a part of the Fluentd ecosystem.)
When using the Logging operator, Fluent Bit will be deployed as a DaemonSet on all nodes to collect container and application logs and enrich them with Pod metadata. Then, a log forwarder (e.g., Fluentd) will filter, transform, and forward the processed logs to specified destinations.
Key features of the Logging operator include namespace isolation, native Kubernetes label selectors, secure (TLS) communication, and configuration validation. It supports multiple log streams for different transformations and multiple outputs for storing logs in various destinations, including S3, GCS, Elasticsearch, and Loki.
Additionally, the operator can handle various logging systems, including multiple Fluentd and Fluent Bit deployments within the same cluster.
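To illustrate the basic setup, here’s a minimal sketch based on the operator’s v1beta1 CRDs (the resource names and the S3 credentials secret are made up for this example):

```yaml
# The Logging resource deploys the Fluent Bit DaemonSet and the Fluentd forwarder:
apiVersion: logging.banzaicloud.io/v1beta1
kind: Logging
metadata:
  name: default-logging
spec:
  controlNamespace: logging
  fluentbit: {}
  fluentd: {}
---
# A namespaced Flow selects logs by Pod labels and routes them to an Output:
apiVersion: logging.banzaicloud.io/v1beta1
kind: Flow
metadata:
  name: app-flow
  namespace: my-app
spec:
  match:
    - select:
        labels:
          app: my-app
  localOutputRefs:
    - s3-output
---
# The Output describes the destination; here, an S3 bucket:
apiVersion: logging.banzaicloud.io/v1beta1
kind: Output
metadata:
  name: s3-output
  namespace: my-app
spec:
  s3:
    s3_bucket: my-logs
    s3_region: us-east-1
    aws_key_id:
      valueFrom:
        secretKeyRef:
          name: s3-secret
          key: awsAccessKeyId
    aws_sec_key:
      valueFrom:
        secretKeyRef:
          name: s3-secret
          key: awsSecretAccessKey
```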
2. K8sGPT
- Website; GitHub
- 6000+ GH stars, 90+ contributors
- Initial commit: Mar 21, 2023
- License: Apache 2.0
- Original owner/creator: Alex Jones
- Languages: Go
- CNCF Sandbox: sandbox request; onboarding issue; DevStats
K8sGPT is a leading AI assistant for Kubernetes administrators, as we highlighted in our “OpenAI-based Open Source tools for Kubernetes AIOps” article.
It aims to automate your SRE experience with special analysers (e.g., for Pods, StatefulSets, and Deployments) that diagnose existing issues. To provide you with recommendations regarding these issues, it integrates with various AI models, including OpenAI, Cohere, Amazon Bedrock and SageMaker, Azure OpenAI, Google Gemini and Vertex AI, Hugging Face, IBM watsonx.ai, Ollama, and other local models.
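For a quick taste, a typical local session looks something like this (a sketch assuming the OpenAI backend; the model name is just an example):

```sh
# Register an AI backend to generate explanations:
k8sgpt auth add --backend openai --model gpt-4o

# Run the built-in analysers and ask the backend to explain the findings:
k8sgpt analyze --explain --filter Pod
```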
You can install K8sGPT locally or as a cluster operator and integrate it with Trivy and Prometheus. For more information about its features and usage, please refer to the aforementioned article.
Scheduling & Orchestration
3. kcp
- Website; GitHub
- ~2400 GH stars, 120 contributors
- Initial commit: Mar 31, 2021
- License: Apache 2.0
- Original owner/creator: Red Hat
- Languages: Go
- CNCF Sandbox: sandbox request; onboarding issue; DevStats
kcp is a horizontally scalable control plane for Kubernetes-like APIs, featuring fully isolated “workspaces” that function as separate K8s clusters. Each such workspace has its own URL, API set (i.e. various CRDs), and RBAC, yet it is as fast and simple as a regular Kubernetes namespace.
kcp allocates workspaces to kcp instances, called shards, in a manner similar to how Pods are scheduled to Kubernetes nodes. It allows API service providers to offer APIs centrally through multi-tenant operators, and users can benefit from easy-to-use APIs within their workspaces. On top of that, kcp supports advanced deployment strategies, including affinity/anti-affinity, geographic replication, and cloud-to-cloud replication.
Called “a building block for SaaS service providers”, kcp is used in IBM’s KubeStellar (see the next project), in Kubermatic Kubernetes Platform, and at SAP.
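In practice, working with workspaces may look like the following sketch, assuming kcp’s kubectl ws plugin is installed (the widgets CRD is hypothetical):

```sh
# Create an isolated workspace and enter it:
kubectl ws create team-a --enter

# Inside the workspace, register and use your own APIs,
# just like in a standalone Kubernetes cluster:
kubectl apply -f widgets-crd.yaml
kubectl get widgets
```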
4. KubeStellar
- Website; GitHub
- ~300 GH stars, 40+ contributors
- Initial commit: Nov 4, 2022
- License: Apache 2.0
- Original owner/creator: IBM
- Languages: Go
- CNCF Sandbox: sandbox request; onboarding issue; DevStats
KubeStellar facilitates the deployment and configuration of applications across multiple Kubernetes clusters by hiding the intricacies of the actual multi-cluster reality from developers.
The project’s motto—“create once, deploy many”—refers to the idea that developers don’t need to modify their K8s apps created for one cluster to use them across multiple clusters. Users only have to define a binding policy between clusters and Kubernetes objects. Then, common tools for deploying apps to Kubernetes—such as kubectl, Helm, Kustomize, Argo CD, and Flux—can be used as usual.
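For illustration, a binding policy might look like this rough sketch based on KubeStellar’s v1alpha1 API (the labels and names are made up, and field names may differ between versions):

```yaml
apiVersion: control.kubestellar.io/v1alpha1
kind: BindingPolicy
metadata:
  name: nginx-edge
spec:
  # Which registered clusters to target, selected by their labels:
  clusterSelectors:
    - matchLabels:
        location-group: edge
  # Which Kubernetes objects to propagate to those clusters:
  downsync:
    - objectSelectors:
        - matchLabels:
            app.kubernetes.io/name: nginx
```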
To achieve this, KubeStellar has quite a sophisticated architecture involving the KubeStellar Controller Manager, Pluggable Transport Controller, KubeFlex for space management, OCM (Open Cluster Management) Cluster Manager, and other components. Fortunately for users, it all boils down to the following high-level design:
Security & Compliance
5. Copa
- Website; GitHub
- ~1100 GH stars, 20+ contributors
- Initial commit: Jan 12, 2023
- License: Apache 2.0
- Original owner/creator: Microsoft
- Languages: Go
- CNCF Sandbox: sandbox request; onboarding issue; DevStats
Copa (formerly known as Copacetic) is a command-line tool that enables direct container image patching using vulnerability reports created by scanners like Trivy.
Since you can quickly patch container images directly, there’s no need to wait for the upstream (the original image authors) to fully rebuild them, which might be crucial in some cases. Additionally, this approach is more efficient in terms of storage and network, as it involves a single patching layer rather than rebuilding the entire image. It also significantly reduces the time needed to update and patch images, bypassing the entire build pipeline.
The diagram below illustrates the way Copa works:
The process involves analyzing a vulnerability report created by Trivy, generating the necessary patch packages, and using package managers (e.g., apt) to process them. The updated files are then applied to the container image via buildkit. You can also extend the functionality of Copa with additional adapters for other vulnerability scanners and package managers.
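A typical run boils down to two commands, along the lines of the project’s quick start (the image tag and file names are arbitrary):

```sh
# 1. Scan the image with Trivy and save a report of OS-level vulnerabilities:
trivy image --vuln-type os --ignore-unfixed -f json -o nginx.json nginx:1.21.6

# 2. Patch the reported packages and tag the resulting image:
copa patch -i nginx:1.21.6 -r nginx.json -t 1.21.6-patched
```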
Cloud Native Storage
6. Kanister
- Website; GitHub
- ~800 GH stars, 90+ contributors
- Initial commit: Dec 5, 2017
- License: Apache 2.0
- Original owner/creator: Kasten (acquired by Veeam in 2020)
- Languages: Go
- CNCF Sandbox: sandbox request; onboarding issue; DevStats
Kanister is a tool for data protection management in Kubernetes. It comes with extensible APIs allowing you to define and manage data operations, abstracting the details of running them in K8s.
Kanister’s APIs are implemented as Custom Resources in Kubernetes. The tool supports S3-compliant object storage, asynchronous and synchronous job scheduling, and RBAC to prevent unauthorized access. Importantly, it comes with a set of ready-to-use functions, which you can use for data preparations (e.g., mounting PVCs and executing commands), data backups and pre-backup operations (e.g., scaling replicas), handling snapshots, etc. The project’s repo also features an examples directory with helpful configurations for handling data within popular services, such as PostgreSQL, AWS RDS, MySQL, Redis, etcd, Elasticsearch, etc.
Kanister leverages a Kubernetes operator pattern, defining and interacting with its own resources through three main components: the controller, ActionSet, and Blueprint custom resources. Here’s how it operates:
When an ActionSet is created, the Kanister controller detects it and checks the environment for the specified Blueprint and other required settings. If all the requirements are met, the controller performs the action (e.g., backup) based on the discovered Blueprint.
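For example, triggering a backup may look like this sketch of an ActionSet (the Blueprint, the Profile, and the MySQL StatefulSet it references are assumed to exist already):

```yaml
apiVersion: cr.kanister.io/v1alpha1
kind: ActionSet
metadata:
  name: mysql-backup
  namespace: kanister
spec:
  actions:
    - name: backup                 # the action to look up in the Blueprint
      blueprint: mysql-blueprint   # defines how the backup is performed
      object:                      # the workload the action applies to
        kind: StatefulSet
        name: mysql
        namespace: mysql
      profile:                     # where to store the backup data (e.g., S3)
        name: s3-profile
        namespace: kanister
```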
Automation & Configuration
7. KCL
- Website; GitHub
- 1700+ GH stars, 30+ contributors
- Initial commit: May 13, 2022
- License: Apache 2.0
- Original owner/creator: Ant Group
- Languages: Rust
- CNCF Sandbox: sandbox request; onboarding issue; DevStats
KCL is a versatile language designed for creating complex configurations, especially for cloud environments. It emphasizes modularity, scalability, and stability (e.g., strong immutability), making it easy to write logic, automate APIs, and integrate with various systems.
KCL draws on high-level languages such as Python and Go, while offering its own syntax, semantics, runtime environment, and module design. The language includes built-in modules and schema-oriented configuration types, with modular abstractions to create configurations using types, logic, and policies. The project recommends managing all your configurations and model libraries as a single configuration library that stores KCL definitions together with various types of configurations (application operations, maintenance configurations, policies, etc.).
It enables the generation of low-level static configurations in common formats like JSON and YAML. KCL provides schema modelling to define typical structures and constraints (attribute types, default values, etc.) used throughout your configuration data and to validate them. At the same time, it features SDKs for Rust, Go, Python, .NET, Java, and Node.js, as well as various integrations and plugins, including those for kubectl, Kustomize, Helm, KPT, and Crossplane.
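Here’s a small sketch of what that looks like in KCL itself (the App schema is invented for this example):

```kcl
# A schema defines the structure, types, defaults, and constraints:
schema App:
    name: str
    image: str
    replicas: int = 1

    check:
        replicas > 0, "replicas must be positive"

# A configuration instantiates the schema and is validated against it;
# it can then be rendered to YAML or JSON.
app: App = App {
    name = "nginx"
    image = "nginx:1.25"
    replicas = 3
}
```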
KCL comes with powerful automation capabilities to organize, simplify, and manage large configurations. You can integrate it with your CI/CD based on GitLab or GitHub Actions and implement the GitOps approach via ready-to-use instructions for both Argo CD and Flux CD. It also offers native support for API specifications such as OpenAPI, Kubernetes CRD, and Kubernetes Resource Model (KRM).
API Gateway
8. Easegress
- Website; GitHub
- ~5800 GH stars, 60+ contributors
- Initial commit: Mar 27, 2017
- License: Apache 2.0
- Original owner/creator: MegaEase
- Languages: Go
- CNCF Sandbox: sandbox request; onboarding issue; DevStats
Easegress (formerly known as Ease Gateway) is a Cloud Native traffic orchestration system that enhances service availability, reliability, and performance without requiring any changes to be made in your existing code.
It operates as an API and service mesh gateway, managing traffic and API requests and enabling numerous features for them. These include providing resilience (circuit breakers, rate limiting, retries) and security (IP filtering, managing Let’s Encrypt certificates, verifying JWT tokens, validating OAuth2, etc.), orchestrating the flow of APIs and aggregating requests across multiple APIs, and activating blue-green and canary deployments.
Easegress ensures high availability (up to 99.99%) with a built-in Raft consensus algorithm, implements load balancing based on various algorithms (including header hash and sticky sessions), and supports compression for body responses, cache for backend services, and hot updates. It comes with observability features, including built-in OpenTelemetry support for distributed tracing.
To extend Easegress features, you can create and execute WebAssembly code via AssemblyScript. You can integrate Easegress with other Cloud Native tooling, such as Knative for FaaS, Eureka/Consul/etcd/ZooKeeper for service discovery, and Kubernetes Ingress Controller.
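To give an idea of its configuration, here’s a minimal reverse-proxy sketch loosely based on the project’s quick start (the backend URL is arbitrary; the objects are applied with something like egctl create -f):

```yaml
# An HTTPServer listens on a port and routes requests to a pipeline:
kind: HTTPServer
name: server-demo
port: 10080
rules:
  - paths:
      - pathPrefix: /pipeline
        backend: pipeline-demo
---
# The Pipeline proxies matched requests to backend servers:
kind: Pipeline
name: pipeline-demo
flow:
  - filter: proxy
filters:
  - name: proxy
    kind: Proxy
    pools:
      - servers:
          - url: http://127.0.0.1:9095
```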
Container Runtime
9. Kuasar
- Website; GitHub
- ~1300 GH stars, ~30 contributors
- Initial commit: Apr 8, 2023
- License: Apache 2.0
- Original owner/creator: Huawei Cloud
- Languages: Rust
- CNCF Sandbox: sandbox request; onboarding issue; DevStats
Kuasar is a container runtime written in Rust for efficiency (i.e., fast startup and reduced overhead) that boasts multiple sandboxing options.
Technically, Kuasar consists of two primary modules:
- Kuasar-sandboxer provides the Sandbox API, handling the sandbox lifecycle and resource allocation;
- Kuasar-task implements the Task API, responds to high-level container runtime requests, and oversees container lifecycle and resource allocation.
Today, the project supports the following sandboxers:
- MicroVM, providing support for Cloud Hypervisor, QEMU, and StratoVirt; Firecracker is planned for the near future (it’s claimed as “planned for 2024,” but perhaps it will take more time to implement);
- Wasm with WasmEdge and Wasmtime supported (Wasmer is expected soon);
- App Kernel with Quark supported (gVisor is also expected);
- runC.
By supporting various sandbox types, Kuasar allows users to select the best option based on the application requirements. It also supports running multiple sandboxes on the same node, which ensures “higher security and efficiency at lower cost,” as its authors from Huawei Cloud state.
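From Kubernetes’ perspective, choosing a sandbox typically comes down to a RuntimeClass. Here’s a sketch assuming containerd has been configured with Kuasar’s MicroVM sandboxer under the handler name vmm (the handler name is an assumption):

```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kuasar-vmm
handler: vmm   # must match the runtime handler configured in containerd
---
apiVersion: v1
kind: Pod
metadata:
  name: demo
spec:
  runtimeClassName: kuasar-vmm   # run this Pod in a MicroVM sandbox
  containers:
    - name: app
      image: nginx:1.25
```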
Chaos Engineering
10. Krkn
- GitHub
- ~300 GH stars, 40+ contributors
- Initial commit: Apr 20, 2020
- License: Apache 2.0
- Original owner/creator: Red Hat
- Languages: Python
- CNCF Sandbox: sandbox request; onboarding issue; DevStats
Krkn (aka Kraken) is a resilience testing/chaos engineering tool for Kubernetes, helping you to understand how your app will behave in case of failure.
To start using Krkn, you can simply specify the target Kubernetes/OpenShift cluster using kubeconfig. The tool will then inject chaos scripts as defined in the configuration, interact with Cerberus (its own health watcher) for a cluster health report, retrieve metrics from Prometheus, generate a metrics profile with PromQL queries, and store results in Elasticsearch. Once that’s done, it will evaluate those metrics to assign a pass or fail status:
Simply put, Krkn checks whether target components recover from failures within a specified timeframe. It inspects the overall health of your Kubernetes-based system after this incident, evaluates performance metrics such as latency and resource utilization, and prints results based on severity levels. The recovery of specific components, overall cluster health, and metrics evaluation determine whether this chaos experiment was a success or a failure.
The project features a variety of chaos scenarios. For example:
- Pod and container scenarios disrupt Kubernetes environments to evaluate application availability, startup time, and recovery.
- Node scenarios, supported on AWS, Azure, GCP, OpenStack, and bare metal, trigger failures like node deletion, fork bombs, halts, and kubelet failures.
- Zone cutoffs simulate failures in availability zones, impacting all nodes within a specified time period.
- Application failures block traffic to test dependency responses, while power outages simulate cluster shutdowns and restarts.
- Resource-related scenarios stress CPU, memory, and I/O to test reserved resources and performance under load.
- Network chaos scenarios include latency, packet loss, unstable interfaces, DNS errors, packet corruption, and bandwidth limitation.
- …
If you want to stop Krkn executed in CI or as an external job at some point, you can send a specific signal to pause or stop the chaos experiment. Krkn also supports installing Grafana with dashboards that will help with performance monitoring, i.e., finding bottlenecks.
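An abridged configuration sketch may look as follows (the key names roughly follow the repo’s sample config but may differ between versions, and the scenario files are illustrative):

```yaml
kraken:
  distribution: kubernetes
  kubeconfig_path: ~/.kube/config
  chaos_scenarios:                 # scenarios to inject, in order
    - pod_disruption_scenarios:
        - scenarios/pod.yml
    - node_scenarios:
        - scenarios/node.yml
cerberus:
  cerberus_enabled: false          # optional health watcher integration
performance_monitoring:
  prometheus_url: ""               # where to scrape metrics from
```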
Continuous Integration & Delivery
11. kube-burner
- GitHub
- 500+ GH stars, ~50 contributors
- Initial commit: Aug 13, 2020
- License: Apache 2.0
- Original owner/creator: Red Hat
- Languages: Go
- CNCF Sandbox: sandbox request; onboarding issue; DevStats
Kube-burner is a tool to stress Kubernetes clusters by creating, deleting, and modifying resources at a given rate.
To initiate cluster testing, all you need to do is a) define (in a local or remote configuration file) the operations you want to be performed, and b) run the init command.
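A minimal configuration sketch (the Deployment template file is assumed to exist next to the config):

```yaml
# config.yml: 10 iterations, each creating 5 Deployments, at 20 requests/s
jobs:
  - name: cluster-density
    jobIterations: 10
    qps: 20
    burst: 20
    namespace: kube-burner-test
    objects:
      - objectTemplate: deployment.yml
        replicas: 5
```

Running kube-burner init -c config.yml then executes the job against the cluster from your current kubeconfig context.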
While performing these tests, you can collect and index the Prometheus metrics from a given time range, scrape metrics from multiple endpoints, collect more detailed metrics from additional data sources (e.g., using Kubernetes API) for a given set of Kubernetes resources, check for specified alerts, and perform an overall health check for the affected K8s cluster.
Cloud Native Network
12. Spiderpool
- GitHub
- 500+ GH stars, 30+ contributors
- Initial commit: Mar 7, 2022
- License: Apache 2.0
- Original owner/creator: DaoCloud
- Languages: Go
- CNCF Sandbox: sandbox request; onboarding issue; DevStats
Spiderpool is an RDMA (Remote Direct Memory Access) network solution for Kubernetes, built on top of macvlan, ipvlan, and SR-IOV CNI plugins to address diverse networking needs. It operates on bare metal, virtual machines, and public cloud environments, and aims to ensure exceptional network performance for I/O-intensive, low-latency applications like data warehousing, middleware, observability, and AI.
Spiderpool offers RDMA based on RoCE and InfiniBand, enabling Pods to use RDMA devices in shared or exclusive mode. While it leverages existing CNIs as underlay networks, it comes with several advantages and additional features. Unlike CNI solutions based on virtual interfaces, the underlay networks in Spiderpool eliminate L3 forwarding on the host and avoid tunnel encapsulation overhead, which contributes to high network throughput, low latency, and reduced CPU utilization. Spiderpool also enables seamless connectivity to underlay L2 VLAN networks, supporting both L2 and L3 communications, multicast and broadcast, and firewalls to control traffic.
Additionally, when using Spiderpool, data packets carry the actual IP addresses of Kubernetes Pods, facilitating direct communication based on those IPs. The underlay CNI can also create virtual interfaces using parent network interfaces on the host to provide isolated subnets for network-intensive applications.
Moreover, Spiderpool enables multiple underlay CNI interfaces or a combination of overlay and underlay CNI interfaces for K8s Pods as well as CRD-based dual-stack IPAM (e.g., assigning static IPs for stateful workloads).
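For instance, a static IP pool is declared as a CR, and Pods pick it via an annotation. A sketch based on the v2beta1 API (the subnet and names are made up):

```yaml
apiVersion: spiderpool.spidernet.io/v2beta1
kind: SpiderIPPool
metadata:
  name: demo-pool
spec:
  subnet: 172.18.0.0/16
  ips:
    - 172.18.41.40-172.18.41.50
  gateway: 172.18.0.1
---
apiVersion: v1
kind: Pod
metadata:
  name: demo
  annotations:
    # Ask Spiderpool's IPAM to allocate an address from the pool above:
    ipam.spidernet.io/ippool: '{"ipv4": ["demo-pool"]}'
spec:
  containers:
    - name: app
      image: nginx:1.25
```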
The project comes with a helpful roadmap depicting currently supported features for various CNIs:
Afterword
Having analyzed all these 12 projects and compared them with the previous 13 CNCF Sandbox additions in 2023, we can see that younger projects are being accepted now. The overall median age is close to 2 years, as opposed to 4 years for the previous batch. It will be interesting to see how this trend evolves in our further overviews of the projects added to CNCF Sandbox in 2024.
As for other common traits, the Go language is an absolute winner (leaving a bit of space for Rust when high performance is required), and Apache 2.0 is the de facto standard for licensing (it accounts for 100% of the projects in this batch). While most projects come from huge international companies (e.g., Red Hat and IBM initiated 25% of these projects), a noticeable fraction of them (another 25%) originated in China.
P.S. Other articles in this series
- Part 1: 13 arrivals of 2023 H1: Inspektor Gadget, Headlamp, Kepler, SlimToolkit, SOPS, Clusternet, Eraser, PipeCD, Microcks, kpt, Xline, HwameiStor, and KubeClipper.