Second Brain: Crafted, Curated, Connected, Compounded, October 2, 21:13
Kubernetes: The Cornerstone of Container Orchestration

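Kubernetes has become the de facto standard for cloud-native applications. It automates the scaling and deployment of containerized applications and supports multi-cloud environments, avoiding vendor lock-in. Resources are defined in YAML files, which simplifies developing and deploying applications across environments, and costs can be lowered by scaling resources down (even to zero). Kubernetes' modular, abstracted management, combined with container technology, makes centralized monitoring possible. For beginners, Docker Desktop with its integrated Kubernetes is a convenient way to get started. Services such as Managed Data Stacks also simplify setting up and managing Kubernetes, and Kubernetes namespaces provide good isolation for security.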

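🔑 **Kubernetes' core value lies in its powerful container orchestration.** As the cornerstone of cloud-native applications, it delivers high availability, automated deployment, and scaling. With a “desired state” defined in YAML files, Kubernetes continuously drives the actual state toward the desired state, which greatly simplifies managing complex distributed systems. It also works across clouds, offering a deployment option without lock-in.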

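⚙️ **Kubernetes' resource model and architecture are key to how efficiently it operates.** The smallest schedulable unit is the Pod, which wraps one or more tightly coupled containers. YAML files (manifests) declare the desired state of Kubernetes resources such as Pods, Deployments, and StatefulSets. The core architecture consists of etcd (stores cluster state), kube-api (the API entry point), the scheduler (schedules Pods), and controllers (maintain resource state), which work together to automate management.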

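🌐 **Kubernetes offers several Service types for exposing and managing network traffic flexibly.** From ClusterIP, reachable only inside the cluster, to NodePort, exposed on node IPs, to LoadBalancer, which uses a cloud provider's load balancer, and ExternalName, which maps to an external DNS name, each type targets a different scenario. The Ingress resource adds higher-level HTTP/HTTPS routing, SSL/TLS termination, and similar features, making it a good choice for exposing complex applications externally.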

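🧩 **Running multiple cooperating containers inside one Pod is a common deployment pattern.** The design is particularly suited to the “Sidecar” pattern (e.g. log collection, data synchronization), the “Adapter” pattern (data format conversion), and the “Ambassador” pattern (network proxying). Init containers run initialization tasks such as data preparation or configuration loading before the main containers start, further increasing a Pod's flexibility and functionality.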

Kubernetes is a platform that allows you to run and orchestrate container workloads. It has become the de facto standard for scaling out cloud-native apps (automatically, if desired) and for deploying your open-source zoo fast and independently of any cloud provider. No lock-in here. You could also use OpenShift or OKD. With the latest version, they added the OperatorHub, from which you can install 182 items (as of today) with just a few clicks. Also, check out Managed Data Stacks, which were created to mitigate exactly that.

Another reason for Kubernetes is the move from infrastructure as code towards infrastructure as data, specifically as YAML. All resources in Kubernetes, including Pods, Configurations, Deployments, Volumes, etc., can simply be expressed in a YAML file. Developers can quickly write applications that run across multiple operating environments. Costs can be reduced by scaling down (even to zero with, e.g., [Knative][63]) and also by using plain Python or other programming languages instead of paying for a service on Azure, AWS, or Google Cloud. Its modularity and abstraction make management easier, and with containers (Docker or [Rocket][65]) you can monitor all your applications in one place.

To get hands-on with Kubernetes, you can install Docker Desktop with Kubernetes included. All of my examples are built on top of it and run on any cloud as well as locally. For a more sophisticated setup with Apache Spark, I suggest reading the blog post from Data Mechanics, Setting up, Managing & Monitoring Spark on Kubernetes. If you prefer video, An introduction to Apache Spark on Kubernetes covers the same content and adds more on top.

As said above, if setting up Kubernetes is too hard, there are Managed Data Stacks, where you can pick from existing open-source tools.

Security: separation of concerns, for example through different namespaces.
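
As a minimal sketch of namespace-based separation, the manifest below creates two namespaces and places a Pod in one of them. The names `team-a`, `team-b`, and `demo`, as well as the `nginx:1.27` image, are hypothetical and chosen only for illustration.

```yaml
# Two namespaces that keep the resources of two teams apart (hypothetical names).
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
---
apiVersion: v1
kind: Namespace
metadata:
  name: team-b
---
# This Pod lives only in team-a; a Pod with the same name could exist in team-b.
apiVersion: v1
kind: Pod
metadata:
  name: demo
  namespace: team-a
spec:
  containers:
  - name: web
    image: nginx:1.27
```

Namespaces scope names and are the unit that RBAC rules, quotas, and network policies usually attach to. On their own they do not block network traffic between Pods, so they are typically combined with those mechanisms.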

# Kubernetes Orchestration

Continuously working towards a desired state.

- Everything is represented as a “Kubernetes Resource”
- A Pod is the smallest “schedulable” resource (~= container)
- A Manifest (YAML) defines the desired state of a resource
- Kubernetes drives “reality” to the desired state
- The current state is updated based on “reality”
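
The desired-state idea can be sketched with a small Deployment manifest. The name `hello-web`, the label `app: hello-web`, and the `nginx:1.27` image are assumptions made for this example, not part of the original notes.

```yaml
# Desired state: three replicas of one Pod template (hypothetical names).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-web
spec:
  replicas: 3                # the desired number of Pods
  selector:
    matchLabels:
      app: hello-web         # must match the labels in the Pod template
  template:
    metadata:
      labels:
        app: hello-web
    spec:
      containers:
      - name: web
        image: nginx:1.27
        ports:
        - containerPort: 80
```

After `kubectl apply -f` on this file, deleting one of the three Pods should cause the controllers to create a replacement, because “reality” (two Pods) no longer matches the declared state (three).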

# Kubernetes Architecture

- etcd: defines and documents:
  - current known state
  - desired state

    graph LR
      subgraph node
        kubelet["kubelet & kube-proxy"]
        containerd
        container
      end
      subgraph control_plane
        subgraph etcd
          kubernetes_resource
        end
        controllers
        kube-api
        scheduler[Default Scheduler]
      end
      subgraph yaml_file
        resource_configurations
      end
      resource_configurations --> kubectl
      kubectl --> kube-api
      controllers -->|adapts| kube-api
      scheduler -->|adapts| kube-api
      kube-api -->|informs| scheduler
      kube-api -->|informs| controllers
      kube-api -->|manages| kubernetes_resource["kubernetes resource:<br/>- current known state<br/>- desired state"]
      kube-api -->|informs| kubelet
      kubelet -->|updates state| kube-api
      kubelet -->|manages| containerd
      containerd --> container

Kubernetes Architecture (image)

# Workload Resources

    graph TD
      subgraph Workload Resources
        deployment-->replicaset-->pod
        statefulset-->pod
        daemonset-->pod
        cronjob-->job-->pod
        pod[Pod]-.->container
        container[Container]
        style container stroke-dasharray: 5 5
      end

- Pods - smallest schedulable unit ~= container
- Deployment - declarative updates for Pods
- StatefulSet - manages stateful applications
- DaemonSet - ensures a Pod on each node
- CronJob - runs Jobs on a schedule
  - Job - runs Pods to completion
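
The CronJob → Job → Pod chain from the diagram shows up directly in the nesting of a manifest. The following is a sketch only; the name `nightly-report`, the schedule, and the `busybox:1.36` command are hypothetical.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report        # hypothetical name
spec:
  schedule: "0 2 * * *"       # every day at 02:00 (cron syntax)
  jobTemplate:                # the Job created on each run
    spec:
      template:               # the Pod that the Job runs to completion
        spec:
          restartPolicy: OnFailure
          containers:
          - name: report
            image: busybox:1.36
            command: ["sh", "-c", "echo generating report"]
```

A Deployment nests in a similar way: its `template` is the Pod, and the ReplicaSet in between is created by the Deployment controller rather than written by hand.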

# Deployment Patterns

# Container deployments

When to use multiple containers inside a deployment?

In Kubernetes, it’s common to run multiple containers within a single Pod when the containers are tightly coupled application components that need to operate together. Outside of the patterns below (Sidecar, Adapter, Ambassador, etc.), running multiple containers in the same Pod is an anti-pattern. A database or another important service would usually get its own Pod or Deployment.

- Shared Storage: Containers in the same Pod share the same storage volumes. This can be beneficial for situations where one container writes to a shared volume and another reads from it.
- Inter-process Communication: Since containers in the same Pod share the same network namespace, they can easily communicate with each other using localhost and share the same port space.
- Sidecar Pattern: A common use case is the sidecar pattern, where the main application might need an auxiliary helper that pushes logs or data elsewhere. For example, one container might serve a web application while a sidecar container pushes logs or data to an external source.
- Adapter Pattern: You can use a second container to modify or adapt the data output of the main container in some way. For example, transforming output formats or adapting legacy systems to more modern requirements.
- Ambassador Pattern: A container can proxy or shuttle network connections for the main container. This can be used for sharding or partitioning in distributed systems.
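
The shared-storage and sidecar points can be sketched in one Pod. Everything named here is an assumption for illustration: the Pod name, the `logs` volume, the log path, and the two `busybox:1.36` containers, where a simple loop stands in for a real application.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-log-sidecar   # hypothetical name
spec:
  volumes:
  - name: logs
    emptyDir: {}               # scratch volume shared by both containers
  containers:
  - name: app                  # stand-in for the main application
    image: busybox:1.36
    command: ["sh", "-c", "while true; do date >> /var/log/app/app.log; sleep 5; done"]
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  - name: log-tailer           # sidecar: reads what the main container writes
    image: busybox:1.36
    command: ["sh", "-c", "tail -F /var/log/app/app.log"]
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
      readOnly: true
```

In a real setup the sidecar would forward the lines to an external system instead of printing them. Here `kubectl logs app-with-log-sidecar -c log-tailer` would show the lines written by the `app` container.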

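Init containers are additional containers that run to completion before the main containers start. They are specified in a separate part of the deployment, the `initContainers` section of the Pod spec.

Here is an example: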

    ...
      initContainers:
      - name: copy-airflow-dag-to-airflow-bd
        image: my-image:0.1.0-a.2
        command: ["/bin/sh", "-c"]
        args:
        - |
          # Compute the timestamp once so mkdir and cp target the same directory.
          BACKUP_DIR="/storage/backup/dags-$(date +%Y%m%d-%H%M%S)"
          # Back up the current DAGs, then replace them with the DAGs shipped in the image.
          mkdir -p "$BACKUP_DIR" && cp -a /storage/dags/. "$BACKUP_DIR"/ &&
          rm -rf /storage/dags/* && cp -a /opt/airflow/airflow_home/dags/. /storage/dags/
        volumeMounts:
        - name: storage
          mountPath: /storage

# Services (Network)

Kubernetes provides several types of Services to expose your application inside or outside of a cluster. Let’s break them down:

- ClusterIP: This is the default service type.
  - Scope: Internal to the cluster.
  - Purpose: Provides a single IP address and port pair which routes traffic to the underlying Pods.
  - Use-case: When you want to expose your service only within the Kubernetes cluster, for example, a backend service that should not be exposed to external traffic.
- NodePort: Exposes the service on each Node’s IP at a static port.
  - Scope: External, using <NodeIP>:<NodePort> combination.
  - Purpose: Allocates a port from a specified range (default: 30000-32767) on each node and forwards traffic on that port to the service.
  - Use-case: Useful for development and debugging, but typically not used directly for production workloads exposed externally.
- LoadBalancer: Provisions an external load balancer in a cloud provider’s infrastructure and directs external traffic to the Kubernetes service.
  - Scope: External.
  - Purpose: Integrates with cloud providers to automatically provision an external load balancer pointing to the NodePort and ClusterIP services.
  - Use-case: When running Kubernetes in a cloud provider that supports automatic load balancer provisioning (like AWS, GCP, Azure), this is a straightforward way to expose services to external traffic.
- ExternalName: Maps a service to a DNS name, rather than an IP.
  - Scope: External.
  - Purpose: Returns a CNAME record pointing to the specified external name.
  - Use-case: Useful when you want to point a service to an external service outside the cluster without proxying traffic through Kubernetes.
- Headless Service: Service without a ClusterIP.
  - Scope: Internal to the cluster.
  - Purpose: Allows direct pod-to-pod communication without a virtual IP in the middle.
  - Use-case: Useful for stateful applications like databases where direct pod addressing is preferable.
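
The differences between these types mostly come down to a few fields in the Service spec. The manifests below are a sketch; the Service names, the `app: backend` and `app: db` labels, the port numbers, and `db.example.com` are hypothetical.

```yaml
# ClusterIP (default): a stable virtual IP, reachable only inside the cluster.
apiVersion: v1
kind: Service
metadata:
  name: backend
spec:
  type: ClusterIP
  selector:
    app: backend             # routes to Pods carrying this label
  ports:
  - port: 80                 # port exposed by the Service
    targetPort: 8080         # port the Pods listen on
---
# NodePort: additionally opens the same port on every node.
apiVersion: v1
kind: Service
metadata:
  name: backend-nodeport
spec:
  type: NodePort
  selector:
    app: backend
  ports:
  - port: 80
    targetPort: 8080
    nodePort: 30080          # must fall inside the node port range
---
# Headless: no virtual IP; DNS returns the Pod IPs directly.
apiVersion: v1
kind: Service
metadata:
  name: db-headless
spec:
  clusterIP: None
  selector:
    app: db
  ports:
  - port: 5432
---
# ExternalName: a DNS alias (CNAME) for something outside the cluster.
apiVersion: v1
kind: Service
metadata:
  name: external-db
spec:
  type: ExternalName
  externalName: db.example.com
```

Changing `type: NodePort` to `type: LoadBalancer` in the second manifest is usually all that is needed on a cloud provider that provisions load balancers automatically.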

Ingress: Ingress is not a service type, but a separate Kubernetes resource designed for HTTP and HTTPS routing to services.

- Scope: External.
- Purpose: Allows you to define HTTP and HTTPS routes, host-based routing, path-based routing, SSL/TLS termination, and other advanced routing features. Ingress requires an Ingress Controller (like nginx, traefik, or others) to function.
- Use-case: When you want to expose multiple services under the same IP address with path- or host-based routing, and especially when you need SSL/TLS termination.
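
A minimal sketch of host- and path-based routing with TLS termination could look as follows. It assumes an Ingress Controller registered under the class name `nginx` is installed; the host `app.example.com`, the Secret `app-example-tls`, and the Services `api` and `frontend` are hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  ingressClassName: nginx        # assumes a controller serving this class
  tls:
  - hosts:
    - app.example.com
    secretName: app-example-tls  # TLS certificate and key stored as a Secret
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /api               # requests under /api go to the api Service
        pathType: Prefix
        backend:
          service:
            name: api
            port:
              number: 80
      - path: /                  # everything else goes to the frontend Service
        pathType: Prefix
        backend:
          service:
            name: frontend
            port:
              number: 80
```

Both backends can stay plain ClusterIP Services, since only the Ingress Controller itself needs to be reachable from outside the cluster.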

Decision Points:

- If you need simple internal communication: Use ClusterIP.
- For quick external exposure, especially during development: Use NodePort.
- If you’re using a cloud provider that supports it and need simple external exposure: Use LoadBalancer.
- To map a service to a DNS name: Use ExternalName.
- For direct pod-to-pod communication: Use a Headless Service.
- To expose HTTP/HTTPS applications with routing, SSL, etc.: Use Ingress.

As Kubernetes continues to evolve, there might be additional service types or routing mechanisms in the future. Always refer to the official Kubernetes documentation for the most up-to-date information.

# Pod Types

# Evicted

Evicted pods in Kubernetes are pods that have been terminated and removed from nodes due to various reasons, such as:

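- Node pressure: When a node is under resource pressure (e.g., low on memory or disk space), Kubernetes may evict pods to free up resources.
- Quality of Service (QoS) and priority: Under node pressure, pods that use more than they requested (BestEffort pods and Burstable pods above their requests) are evicted first. Lower priority pods might also be removed to make room for higher priority pods.
- Node maintenance: Pods may be evicted when a node is being drained for maintenance.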

Evicted pods remain as objects in the cluster’s API server but are no longer running on any node. They stay in the `Failed` phase with the reason “Evicted” until they are manually deleted or automatically cleaned up by the cluster (depending on your cluster’s configuration).
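
One way to make a Pod less likely to be evicted under node pressure is to give it the Guaranteed QoS class, which requires every container to set CPU and memory requests equal to its limits. The sketch below uses a hypothetical Pod name, image, and resource values.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-demo      # hypothetical name
spec:
  containers:
  - name: app
    image: nginx:1.27
    resources:
      requests:              # requests equal to limits for CPU and memory
        cpu: "500m"
        memory: "256Mi"
      limits:
        cpu: "500m"
        memory: "256Mi"
```

The assigned class can be read back from the Pod status, for example with `kubectl get pod guaranteed-demo -o jsonpath='{.status.qosClass}'`.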

To delete all evicted pods in a specific namespace, you can use the following kubectl command:

    kubectl get pods -n <namespace> | grep Evicted | awk '{print $1}' | xargs kubectl delete pod -n <namespace>

# Kinds

# DaemonSets

The `desiredNumberScheduled` value of a DaemonSet cannot be set directly: it is a status field that Kubernetes computes from the number of nodes in your cluster that match the DaemonSet’s node selection criteria. This is why you don’t see an option to set this number in the Helm chart.

Here’s how it works:

- By default, a DaemonSet will try to schedule a pod on every node in the cluster.
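
To influence the desired number, you narrow the set of eligible nodes instead. The sketch below restricts a DaemonSet to nodes carrying a label; the name `node-agent`, the label `disktype: ssd`, and the `busybox:1.36` placeholder container are hypothetical.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-agent             # hypothetical name
spec:
  selector:
    matchLabels:
      app: node-agent
  template:
    metadata:
      labels:
        app: node-agent
    spec:
      nodeSelector:
        disktype: ssd          # only nodes with this label get a Pod
      containers:
      - name: agent            # placeholder workload that just sleeps
        image: busybox:1.36
        command: ["sh", "-c", "while true; do sleep 3600; done"]
```

With this selector, the DESIRED column of `kubectl get daemonset node-agent` should equal the number of nodes labeled `disktype=ssd`. Removing the `nodeSelector` brings it back to all schedulable nodes.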

# Alternatives

Kubernetes Alternatives


References: YAML, DevOps engine – Kubernetes
