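Automatic Pod scaling in k8s

About HPA

Starting with version 1.1, Kubernetes added a controller called the Horizontal Pod Autoscaler (HPA), which scales the number of Pods automatically based on CPU utilization. The HPA controller runs inside the kube-controller-manager service on the master, and its polling interval is set by the --horizontal-pod-autoscaler-sync-period startup flag (default 15s). On each cycle it reads the resource metrics of the target Pods, compares them with the scaling conditions in the HPA resource object, and adjusts the replica count when the conditions are met. Early versions of Kubernetes could only autoscale on Pod CPU utilization, and the CPU data came from the Heapster component. Starting with version 1.6, Kubernetes introduced HPA based on application-defined custom metrics, and the mechanism matured gradually after version 1.9.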

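How HPA works

A metrics server in the cluster (Heapster or a custom Metrics Server) continuously collects metric data from all Pod replicas. The HPA controller fetches this data through the metrics server's API (the Heapster API or the aggregated API), applies the user-defined scaling rules, and computes the target number of Pod replicas. When the target differs from the current replica count, the HPA controller issues a scale operation against the Pods' replica controller (Deployment, RC, or ReplicaSet) to adjust the replica count, which completes the scaling operation.

[Figure: HPA workflow]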
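The calculation step described above can be sketched in a few lines. This is only an illustration of the documented rule desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric), with the default tolerance of 0.1. The function name and the sample numbers are made up for this example, and the real controller also handles missing metrics, Pods that are not ready, and rate limits, none of which are modelled here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas, max_replicas, tolerance=0.1):
    """Approximate the replica count an HPA would ask for (simplified)."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: leave the replica count alone.
        desired = current_replicas
    else:
        # Multiply before dividing to keep the arithmetic exact for integers.
        desired = math.ceil(current_replicas * current_metric / target_metric)
    # The result is always clamped to the HPA's min/max bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Hypothetical readings: 1 replica at 25% CPU against a 10% target.
    print(desired_replicas(1, 25, 10, min_replicas=1, max_replicas=5))
    # 4 replicas at 30% against a 10% target would want 12, capped at 5.
    print(desired_replicas(4, 30, 10, min_replicas=1, max_replicas=5))
```

The clamp at the end is what keeps the replica count within the minReplicas and maxReplicas bounds of the HPA object, however high the load goes.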

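Metric types

Pod resource utilization, for example CPU utilization
Pod custom metrics, for example the number of requests received
Object custom metrics or external metrics, for example metrics exposed at the HTTP URL "/metrics", or a metrics endpoint URL provided by an external service

Starting with version 1.11, Kubernetes deprecated Heapster as the source of Pod CPU utilization data and moved to Metrics Server for collection. The HPA controller queries metrics through aggregated APIs: metrics.k8s.io (resource metrics, served by Metrics Server), custom.metrics.k8s.io, and external.metrics.k8s.io (the latter two are normally served by a separate metrics adapter rather than by Metrics Server itself).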
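To make the three aggregated API groups more concrete, here is a small sketch that builds the request paths under each group. It assumes the v1beta1 versions are the ones served in the cluster, and the helper names and the metric names (http_requests, queue_depth) are hypothetical. The resulting paths can be passed to kubectl get --raw to inspect what the HPA controller would see.

```python
def resource_metrics_path(namespace):
    """Path for Pod CPU/memory metrics served by Metrics Server."""
    return f"/apis/metrics.k8s.io/v1beta1/namespaces/{namespace}/pods"


def custom_metrics_path(namespace, metric):
    """Path for a per-Pod custom metric; '*' selects all Pods."""
    return (f"/apis/custom.metrics.k8s.io/v1beta1"
            f"/namespaces/{namespace}/pods/*/{metric}")


def external_metrics_path(namespace, metric):
    """Path for a metric that comes from outside the cluster."""
    return (f"/apis/external.metrics.k8s.io/v1beta1"
            f"/namespaces/{namespace}/{metric}")


if __name__ == "__main__":
    # The metric names below are placeholders, not real metrics.
    print(resource_metrics_path("default"))
    print(custom_metrics_path("default", "http_requests"))
    print(external_metrics_path("default", "queue_depth"))
```

If the custom or external path returns a "not found" error, the most likely reason is that no metrics adapter is registered for that API group.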

Example

CPU-based HPA

First, create a Deployment:

apiVersion: apps/v1  # extensions/v1beta1 Deployments were removed in Kubernetes 1.16
kind: Deployment
metadata:
  name: mty-production-api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mty-production-api
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: mty-production-api
    spec:
      containers:
      - image: harbor.ysmty.com:19999/onair/mty-production-api:202007151447-3.5.2-b9a7f09
        imagePullPolicy: IfNotPresent
        name: mty-production-api
        resources:
          limits:
            cpu: 4
            memory: 4Gi
          requests:
            cpu: 100m
            memory: 128Mi
        volumeMounts:
        - mountPath: /usr/local/mty-production-api/logs
          name: log-pv
          subPath: mty-production-api
      imagePullSecrets:
      - name: mima
      restartPolicy: Always
      volumes:
      - name: log-pv
        persistentVolumeClaim:
          claimName: log-pv

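Apply this manifest (for example with kubectl apply -f) and the Deployment's Pod starts. At this point only one Pod should be running. Next, create an HPA that scales the Deployment on CPU utilization: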

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-demo
  namespace: default
spec:
  maxReplicas: 5
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mty-production-api
  targetCPUUtilizationPercentage: 10
# status is filled in by the HPA controller and does not need to be written by hand

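Once the file is ready, apply it and list the HPA resources. (The sample output below appears to have been captured after the load test, which is why REPLICAS already shows 5.)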

# kubectl get hpa
NAME       REFERENCE                       TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
hpa-demo   Deployment/mty-production-api   8%/10%    1         5         5          28m

Generate some load with a simple benchmarking tool:

ab -n 10000 -c 10 http://172.17.58.255:8080/api/healthy/check

Then check the number of Pods again:

# kubectl get pod | grep mty-production-api
mty-production-api-596dfc85c4-599xj               1/1     Running       0          28m
mty-production-api-596dfc85c4-922p4               1/1     Running       0          27m
mty-production-api-596dfc85c4-b6zcx               1/1     Running       0          27m
mty-production-api-596dfc85c4-cqdz2               1/1     Running       0          12d
mty-production-api-596dfc85c4-fmk5w               1/1     Running       0          27m

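Four new Pods have started alongside the original one, for five in total, which shows that the HPA has taken effect. Inspect the HPA for more detail: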

# kubectl describe hpa hpa-demo 
Name:                                                  hpa-demo
Namespace:                                             default
Labels:                                                <none>
Annotations:                                           kubectl.kubernetes.io/last-applied-configuration:
                                                         {"apiVersion":"autoscaling/v1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"hpa-demo","namespace":"default"},"spe...
CreationTimestamp:                                     Mon, 03 Aug 2020 23:20:50 +0800
Reference:                                             Deployment/mty-production-api
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  8% (8m) / 10%
Min replicas:                                          1
Max replicas:                                          5
Deployment pods:                                       5 current / 5 desired
Conditions:
  Type            Status  Reason               Message
  ----            ------  ------               -------
  AbleToScale     True    ScaleDownStabilized  recent recommendations were higher than current one, applying the highest recent recommendation
  ScalingActive   True    ValidMetricFound     the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited  True    TooManyReplicas      the desired replica count is more than the maximum replica count
Events:
  Type    Reason             Age   From                       Message
  ----    ------             ----  ----                       -------
  Normal  SuccessfulRescale  29m   horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
  Normal  SuccessfulRescale  28m   horizontal-pod-autoscaler  New size: 4; reason: cpu resource utilization (percentage of request) above target
  Normal  SuccessfulRescale  28m   horizontal-pod-autoscaler  New size: 5; reason: cpu resource utilization (percentage of request) above target
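The "8% (8m) / 10%" line in the output above is CPU usage expressed as a percentage of the CPU request (100m in the Deployment). A minimal sketch of that arithmetic follows. The per-Pod usage figures are assumed for illustration (the output only shows the 8m average), and the function name is hypothetical.

```python
def cpu_utilization_percent(usage_millicores, request_millicores):
    """Total CPU usage as an integer percentage of total CPU requests."""
    total_usage = sum(usage_millicores)
    total_request = sum(request_millicores)
    # Integer division: the reported value is a whole percentage.
    return (total_usage * 100) // total_request


if __name__ == "__main__":
    # Assumed: five Pods, each using about 8m against a 100m request.
    usage = [8, 8, 8, 8, 8]
    requests = [100, 100, 100, 100, 100]
    print(cpu_utilization_percent(usage, requests))  # 8
```

Because the percentage is relative to the request and not the limit, a 10% target with a 100m request means scaling starts once average usage exceeds roughly 10m per Pod, which is why a light ab run is enough to reach the maximum of five replicas.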

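Stop the load test. After a while the number of Pods should drop back to one. The scale-down is not immediate: by default the controller applies a 5-minute downscale stabilization window, which is what the ScaleDownStabilized condition above refers to.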
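The delay before scaling down comes from the "highest recent recommendation" behaviour reported in the AbleToScale condition. A rough sketch of the idea, assuming the default 5-minute window: the controller remembers recent replica recommendations and applies the largest one still inside the window. The class, the function, and the timeline below are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class Recommendation:
    timestamp: float  # seconds since some reference point
    replicas: int


def stabilized_replicas(history, current_recommendation, now,
                        window_seconds=300.0):
    """Pick the highest recommendation seen within the window."""
    recent = [r.replicas for r in history
              if now - r.timestamp <= window_seconds]
    # The current recommendation always takes part in the comparison.
    return max([current_recommendation] + recent)


if __name__ == "__main__":
    # Made-up timeline: load stops shortly after t=0.
    history = [
        Recommendation(timestamp=0, replicas=5),
        Recommendation(timestamp=60, replicas=4),
        Recommendation(timestamp=200, replicas=1),
    ]
    # At t=250 the recommendation of 5 is still inside the window.
    print(stabilized_replicas(history, 1, now=250))
    # At t=400 only the t=200 entry remains, so scale-down proceeds.
    print(stabilized_replicas(history, 1, now=400))
```

This is why the replica count stays at its peak for several minutes after the load stops, even though the metric has already fallen below the target.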