The future of Kubernetes and cloud infrastructure

For the past decade, Kubernetes has been the dominant force in cloud-native computing and in enterprise software generally, as cloud providers and their customers have turned toward running their applications and services in clusters of containers instead of in tiers of virtual machines. And yet, the Cloud Native Computing Foundation’s 2023 annual survey (conducted from August through December 2023) found that 44% of organizations are not yet using Kubernetes in production. Findings like this indicate there is much room for growth in the mass enterprise market, where on-premises deployments are still common.

What’s holding Kubernetes back? As the CNCF survey finds year after year, the top challenges organizations face in using containers continue to be complexity, security, and monitoring, which were joined in the latest survey by lack of training and cultural changes in development teams. These challenges are hardly surprising in the dramatic, monolith-to-microservices journey that Kubernetes represents. But some expect the challenges to grow even larger, with Gartner estimating that more than 95% of new digital workloads will be deployed on cloud-native infrastructure by 2025.

Yet, help is on the way. From new software development approaches like internal developer platforms to innovations like eBPF, which promises to extend the cloud-native capabilities of the Linux kernel, exciting developments in cloud infrastructure are on the horizon. These fundamentally industry-altering design patterns, open-source tools, and architectures are set to address the complexity and scaling challenges of Kubernetes and evolve cloud infrastructure as we know it today.

Reducing cloud-native complexity

For Kubernetes to flourish in the mainstream market, usability improvements will be needed. “Kubernetes has been an incredible standard API for accessing infrastructure on any cloud, but it takes a lot of work to make it an enterprise-ready platform,” James Watters, director of research and development, VMware Tanzu Division at Broadcom, says.

The open-source world is tackling this challenge with internal developer platforms to decrease friction, whereas public clouds are offering solutions to ease the management of container infrastructure. Still, Watters sees a need for enterprise application platforms for containers that decrease the barrier to entry.

“Developers want access to self-service APIs, and these are not always the lowest level available—they’re not just VM as a service or a container as a service,” says Watters. “Developers need much more than an application runtime as a service to be productive.” Companies including VMware, Rafay, Mirantis, KubeSphere, and D2IQ, not to mention the leading cloud providers, are working to make enterprise container management more usable.

Others agree that a massive reduction in product complexity is necessary across the board. “The complexity of cloud-native open-source technology is too high for the common enterprise,” says Thomas Graf, one of the creators of Cilium, co-founder of Isovalent, and now VP of cloud networking and security of Isovalent at Cisco. Graf adds that compliance and security are common barriers to adopting cloud-native technology patterns within many on-prem brownfield situations.

Increasing visibility into cloud resource usage

Most enterprises are already using multiple clouds simultaneously. This will only become more commonplace, analysts say, requiring more cross-cloud management. “In a cross-cloud integration framework, data and workloads are integrated to operate collaboratively across clouds,” Sid Nag, VP Analyst at Gartner, says. This could enable any-to-any connectivity, adaptive security, and central management, he says.

Part of enhancing awareness around cloud behavior is having a vendor-agnostic logging mechanism. “We’re starting to see the same energy around OpenTelemetry that we saw around Kubernetes,” Ellen Chisa, partner at Boldstart Ventures, says. In mid-2023, OpenTelemetry was the second fastest-growing project hosted by the Cloud Native Computing Foundation, according to CNCF data.

A couple of factors are teeing up OpenTelemetry’s growing significance. First, organizations now have numerous logs and face increasing data costs. “As technical teams face real budget pressure from boards and CFOs, there’s more of the question around ‘how do we make our logging more useful to the business?’” Chisa says.

Secondly, OpenTelemetry can empower the production environment with greater context. “The same way we talk about wanting ease of deployment (code to cloud), we’ll want real information about what’s happening in the cloud as we write code (cloud to code),” she says.

Increasing platform abstraction and automation

IT infrastructure has never been easier to use than in today’s public and private clouds. But while developers have more control with self-service APIs and user-friendly internal platforms, platform engineering still involves considerable toil and is ripe for change.

As an industry, we need to get out of the weeds of YAML and “climb the ladder of abstraction,” Jonas Bonér, CTO of Lightbend, says. “The next generation of serverless is that you don’t see infrastructure at all.” Instead, Bonér foresees a future where the actual running of an internal developer platform is outsourced away from operations or site reliability engineering (SRE) teams. “We’re in the transition phase of developers and operators learning to let go.”

“Building enterprise-ready platforms remains labor-intensive, with significant effort required to ensure systems are secure and scalable,” Broadcom’s Watters says. “Platform teams are going to play a significant role in infrastructure innovation because they’re making it easier for developers to consume in a pre-secured, pre-optimized way.”

According to Guillermo Rauch, CEO of Vercel, modern frameworks can “completely automate infrastructure away.” As such, Rauch foresees more framework-defined infrastructure and increased investment in global front-end footprints. He says this will evolve cloud infrastructure away from bespoke and specialized infrastructure, which is provisioned (and usually overprovisioned) on a per-application basis, benefiting both developer productivity and business agility.

Whatever shape they eventually take, streamlined internal platforms are clearly a direction for cloud infrastructure. “In the same way that today’s developers no longer think about individual servers, data centers, or operating systems, we are moving to a time when they can stop being concerned about their application capabilities and dependencies,” says Liam Randall, CEO of Cosmonic. “Just as they expect today’s public clouds to maintain their data centers, developers want their common application dependencies maintained by their platforms as well.”

According to Randall, WebAssembly will usher in the next phase of software abstraction and a new era beyond containerization. “Componentized applications [based on the WebAssembly Component Model] are compatible with container ecosystem concepts like service mesh, Kubernetes, and even containers themselves, but they are not dependent upon them,” says Randall. Components solve the cold start problem, they’re smaller than containers, they’re more secure, and they’re composable across language and language framework boundaries, he says.

Bringing virtualization to Kubernetes clusters

Another evolving area is inner-Kubernetes virtualization. “The same paradigm that drove hardware virtualization for Linux servers is now being applied to Kubernetes,” says Lukas Gentele, CEO and co-founder of Loft Labs. One reason is to address cloud computing costs, which continue to escalate with AI and machine learning workloads. In these scenarios, “sharing and dynamic allocation of computing resources is more important than ever,” he says.

A second reason is to address cluster sprawl. As of 2022, half of Kubernetes users surveyed by the Cloud Native Computing Foundation were operating 10 or more clusters. However, the number of clusters in use can vary dramatically. For instance, Mercedes-Benz runs on 900 clusters. “Many organizations end up managing hundreds of Kubernetes clusters because they don’t have a secure and straightforward way to achieve multi-tenancy within their Kubernetes architecture,” Gentele says.

According to Gentele, virtual clusters can reduce the number of physical clusters needed while maintaining the security and isolation required for different workloads, thus significantly lowering resource overhead while easing the operational burden.
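
To make the consolidation argument concrete, here is a rough back-of-the-envelope sketch in Python. Every figure in it is an illustrative assumption rather than data from Gentele or the CNCF survey: 200 tenants, three control-plane nodes per physical cluster, and 25 tenants sharing each host cluster. It also ignores the resources that each virtual cluster's own control plane would consume inside the host cluster, so it overstates the savings somewhat.

```python
import math


def control_plane_nodes(num_clusters: int, nodes_per_control_plane: int = 3) -> int:
    """Total control-plane nodes across all physical clusters.

    The default of 3 nodes per control plane is an assumption
    (a common high-availability layout), not a figure from the article.
    """
    return num_clusters * nodes_per_control_plane


def host_clusters_needed(num_tenants: int, tenants_per_host_cluster: int) -> int:
    """Physical clusters required when tenants share host clusters."""
    return math.ceil(num_tenants / tenants_per_host_cluster)


# Hypothetical scenario: 200 tenants (teams or workloads needing isolation).
tenants = 200

# One dedicated physical cluster per tenant.
dedicated = control_plane_nodes(tenants)

# Tenants packed 25 to a host cluster, each isolated in a virtual cluster.
shared = control_plane_nodes(host_clusters_needed(tenants, 25))

print(f"dedicated clusters: {dedicated} control-plane nodes")
print(f"virtual clusters:   {shared} control-plane nodes")
```

Under these assumed numbers the control-plane footprint drops from 600 nodes to 24. Real savings depend on how densely tenants can be packed and on the overhead of the virtual control planes themselves.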

Orchestrating the AI and data layers

With the rise of AI, cloud-based infrastructure is anticipated to grow and evolve to meet new use cases. “The nexus of generative AI and cloud is going to be the next game-changing inflection point for cloud infrastructure,” says Gartner analyst Nag.

“Incorporating specialized silicon, like GPUs, TPUs, and DPUs, in the infrastructure substrate will be key,” Nag says. He adds that the capability to do this across varying cloud estates based on unique AI needs, like training, inferencing, and fine-tuning, will have to be addressed.

Orchestration of AI workloads is an area where Kubernetes seems primed to excel. “Kubernetes will continue to play a mainstream role in the orchestration for generative AI infrastructure,” says Rajiv Thakkar, director of product marketing, Portworx by Pure Storage. Thakkar views Kubernetes as an efficient way to enable data science teams to access GPU computing. Still, due to the mammoth amount of data these models require, this will hinge on continuous access to persistent storage, he says.

Of course, managing stateful deployments on Kubernetes has been, for years, a tricky problem to solve. Yet, many feel the technology is now mature enough to surmount this issue. “It’s finally time for data on Kubernetes to hit the mainstream,” says Liz Warner, CTO of Percona.

“There’s still a sense of ‘Kubernetes was designed to be ephemeral, you should steer clear,’” says Warner. “But with today’s operators, it’s possible to run open-source databases, like MySQL, PostgreSQL, or MongoDB, reliably on Kubernetes.” She adds that doing so will likely result in cost benefits, better multi-cloud and hybrid solutions, and better synergy in the development environment.

Kubernetes on-prem and at the edge

Kubernetes and cloud-native technology are beginning to find new homes… far from the cloud.

“Kubernetes’ unknown magic sauce is that it looks and behaves very modern, but like a CPU, it has backward compatibility 40 to 50 years,” says Isovalent at Cisco’s Thomas Graf. The language-agnosticism of cloud-native technology allows it to handle legacy code, making Kubernetes a prime destination for more massive adoptions, he says. “Most enterprises are betting on it for the next 10 years as this is what they’ll standardize on.”

“Containers, on-premises, in data centers. That’s relatively new. That’s where I see things moving forward,” says Graf. If this is truly where the industry is heading, it will require a modern, universal security mechanism for both cloud and traditional data centers to avoid duplicative efforts. He views eBPF, a doorway to safely and dynamically programming the Linux kernel, made more accessible by the open-source Cilium project, as a key foundation for a common networking layer and a platform-agnostic firewall.

The same undercurrents are driving a new infrastructure paradigm at the edge. “Many of the innovations in the last few years all point toward decentralization,” says Lightbend’s Jonas Bonér, who notes the trend toward smaller Amazon Relational Database Service instances and more powerful infrastructure to help meet users where they are: at the edge.

“It’s extremely wasteful to constantly ship data to the cloud and back,” says Bonér. “We need platforms where data and compute are physically co-located with the end users.” Bonér says this would deliver a “holy trinity” of high throughput, low latency, and high resilience. This sort of local-first development is not wholly reliant upon the cloud but rather treats the cloud as a luxury for data redundancy. As a result, “Cloud and edge are really becoming one,” he says.

A data fabric will be necessary to enable this future of decentralized hybrid architecture, Bonér says. At the same time, he views WebAssembly as a helpful alternative building block to containers, due to its isolated environment—an important consideration for moving data to edge devices. Lightweight alternatives to vanilla Kubernetes, like K3s or KubeEdge, which enable you to run cloud-native capabilities anywhere, will also be key, Bonér says.

Realizing the future of cloud infrastructure

As the flagship for cloud-native infrastructure, Kubernetes is primed for even more mainstream enterprise usage in the coming years. The same can be said for the numerous innovations across persistent data, cluster virtualization, platform engineering, logging, monitoring, and multi-cloud management tools that continue to push the envelope on what the cloud-native ecosystem can offer.

Interestingly, as local computing improves and data ingress and egress fees escalate, there’s an evident shift toward local-first development and toward deploying cloud-native tools at the edge and in classical data centers. “This brings a whole new level of complexity that is mostly unknown to the cloud-native world,” says Isovalent at Cisco’s Graf.

Generative AI is equally set to bring impressive capabilities to this field, automating more and more cloud engineering practices. “With Kubernetes specifically, I believe the system will continue to improve gradually, but it will get a huge boost from AI,” says Omer Hamerman, principal engineer at Zesty. Hamerman believes AI will deliver “a quantum leap” in the automation of Kubernetes and container-based application deployments.

Other technological innovations are poised to reinvent much of what we take for granted across software development at large. For instance, Cosmonic’s Randall notes the use of WebAssembly by edge providers to achieve higher levels of abstraction in their developer platforms. “WebAssembly-native orchestrators like wasmCloud can autoscale the common pluggable capabilities across diverse physical architectures and existing Kubernetes-based platforms,” he says. “With WebAssembly Components, the future is already here—it’s just not evenly distributed.”

That seems an appropriate summation of cloud infrastructure at large. The future is here, much of it founded on progressive open-source technology. Now, this future just has to be realized.
