Kubernetes Pod Autoscaling (Part 2): Autoscaling Based on Custom Metrics

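In real production environments, CPU and memory metrics alone often fail to reflect an application's actual state, so we need application-specific custom metrics to drive autoscaling. A web application, for example, can scale on its current QPS. The Kubernetes HPA supports custom metrics for exactly this purpose.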

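How custom metrics are collected:

The Pod exposes metrics itself through a built-in metrics endpoint, or through a sidecar acting as an exporter.
Prometheus scrapes those metrics.
Prometheus-adapter periodically queries Prometheus, filters and transforms the scraped metrics, and exposes them through custom-metrics-apiserver.
The HPA controller reads the data from custom-metrics-apiserver.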
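
The first step above, a Pod exposing its own metrics, can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not how podinfo itself is implemented; the port 9898, the handler class, and the in-memory counter are assumptions made for the example. It serves one counter, http_requests_total, in the Prometheus text exposition format.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process counter; a real application would track this per request.
REQUEST_COUNT = {"value": 0}


def render_metrics(total: int) -> str:
    """Render a single counter in the Prometheus text exposition format."""
    return (
        "# HELP http_requests_total Total number of HTTP requests handled.\n"
        "# TYPE http_requests_total counter\n"
        f"http_requests_total {total}\n"
    )


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/metrics":
            # Scrape endpoint: report the counter without incrementing it.
            body = render_metrics(REQUEST_COUNT["value"]).encode()
            content_type = "text/plain; version=0.0.4"
        else:
            # Any other path counts as one handled request.
            REQUEST_COUNT["value"] += 1
            body = b"ok\n"
            content_type = "text/plain"
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 9898) -> None:
    """Start the example server; the port is an assumption, not taken from the article."""
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

Calling serve() starts the endpoint, and a Prometheus scrape of /metrics would then see the counter grow as other paths are requested.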

Deploy Prometheus

Clone the code

git clone https://github.com/stefanprodan/k8s-prom-hpa

Switch to the k8s-prom-hpa directory

kubectl create namespace monitoring
kubectl create -f ./prometheus

Check Prometheus

kubectl get pod -n monitoring
NAME                                        READY   STATUS    RESTARTS   AGE
prometheus-64bc56989f-qcs4p                 1/1     Running   1          22h

kubectl get svc -n monitoring
NAME                       TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
prometheus                 NodePort    10.104.215.207   <none>        9090:31190/TCP   22h

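Access Prometheus
http://node_ip:31190

The prometheus-cfg.yaml file used by Prometheus configures auto-discovery rules, so components are registered automatically.

Metrics are scraped by Prometheus --> Prometheus adapter converts the metric format --> custom metrics apiserver --> k8s HPA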

Deploy kubernetes-prometheus-adapter

From the k8s-prom-hpa directory, generate the TLS certificates that Prometheus-adapter needs:

make certs

custom-metrics-api/custom-metrics-apiservice.yaml sets
insecureSkipTLSVerify: true, so it does not matter whether the generated certificates are trusted.

The output folder should contain the following files:

ls output/
apiserver.csr  apiserver-key.pem  apiserver.pem

Deploy k8s-prometheus-adapter

kubectl create -f ./custom-metrics-api

Reviewing the prometheus-adapter configuration
The custom-metrics-apiserver-deployment.yaml file

args:
       - /adapter
       - --secure-port=6443
       - --tls-cert-file=/var/run/serving-cert/serving.crt # serving certificate for custom-metrics-apiserver
       - --tls-private-key-file=/var/run/serving-cert/serving.key
       - --logtostderr=true # write logs to stderr
       - --prometheus-url=http://prometheus.monitoring.svc:9090/ # Prometheus address; Prometheus runs inside the cluster, so the Service address can be used directly
       - --metrics-relist-interval=30s
       - --v=10 # log verbosity; higher values are more detailed, so this can be lowered
       - --config=/etc/adapter/config.yaml # rules prometheus-adapter applies to the metrics it reads from Prometheus

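The custom-metrics-config-map.yaml file

custom-metrics-config-map.yaml mainly defines the rules prometheus-adapter uses to fetch metrics from Prometheus. Prometheus metrics cannot be consumed as they are; prometheus-adapter acts as an intermediate layer that converts them and then exposes the reassembled metrics through its own interface to custom-metrics-apiserver.

The values in this file are therefore the query rules run against Prometheus.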

- seriesQuery: '{namespace!="",__name__!~"^container_.*"}'
  seriesFilters:
  - isNot: .*_seconds_total
  resources:
    template: <<.Resource>>
  name:
    matches: ^(.*)_total$
    as: ""
  metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)

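The file has two parts: rules, used for custom metrics, and resourceRules, used for resource metrics (CPU and memory).
With Prometheus adapter, any metric in Prometheus can be used for HPA, as long as a query that selects it is defined in the prometheus-adapter configuration. If only one metric is needed for HPA, a single query is enough; the multiple queries used here are not required.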

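Field reference:

seriesQuery: the Prometheus query used to discover series.
seriesFilters: filters applied to the discovered series.
is: a pattern for series to keep.
isNot: a pattern for series to drop.
resources: maps labels on the metric to Kubernetes resources. There are two ways to do this: overrides, or template as used here.
name: renames the metric. Counters are collected under a raw name such as http_requests_total; after the rate is computed, the _total suffix is stripped so the metric is exposed as http_requests.
matches: a regular expression matched against the metric name; it may contain capture groups.
as: defaults to $1, the first capture group. An empty as means use the default, that is, the part matched by (.*).
metricsQuery: a Go template that defines how the Prometheus series is processed when it is queried.
Series: the metric name.
LabelMatchers: the list of label matchers.
GroupBy: the labels the result is grouped by.
[1m]: the time range.

In short, the adapter reads the http_requests_total counter, computes the per-second request rate over the past minute, and returns the result as the http_requests metric.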
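
To make the rule above concrete, here is a rough Python sketch of what it does to a series name and how the query template gets filled in. It only mimics the behavior described in this section and is not the adapter's code; the helper names, the sample label matchers, and the choice of unanchored matching for isNot are assumptions.

```python
import re
from typing import Optional

IS_NOT = re.compile(r".*_seconds_total")   # seriesFilters.isNot from the rule
MATCHES = re.compile(r"^(.*)_total$")      # name.matches from the rule
METRICS_QUERY = "sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)"


def exposed_name(series: str) -> Optional[str]:
    """Return the name the rule would expose, or None if the rule skips the series."""
    if IS_NOT.search(series):        # dropped by the isNot filter (assumed unanchored)
        return None
    match = MATCHES.match(series)
    if match is None:                # name does not end in _total
        return None
    return match.group(1)            # as: "" falls back to $1, the first group


def render_query(series: str, label_matchers: str, group_by: str) -> str:
    """Fill the metricsQuery template with concrete values."""
    return (
        METRICS_QUERY
        .replace("<<.Series>>", series)
        .replace("<<.LabelMatchers>>", label_matchers)
        .replace("<<.GroupBy>>", group_by)
    )


print(exposed_name("http_requests_total"))
print(render_query("http_requests_total", 'namespace="default",pod="podinfo-0"', "pod"))
```

The second call produces the kind of PromQL the adapter sends to Prometheus: a per-second rate over one minute, summed per Pod. The Pod name podinfo-0 is a placeholder.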

Check the k8s-prometheus-adapter deployment

kubectl get pod -n monitoring
NAME                                        READY   STATUS    RESTARTS   AGE
custom-metrics-apiserver-6c6c7f67d8-9vdkt   1/1     Running   0          4h58m
prometheus-64bc56989f-qcs4p                 1/1     Running   0          4h59m

List the created API groups

kubectl api-versions | grep metrics
custom.metrics.k8s.io/v1beta1
metrics.k8s.io/v1beta1

Fetch the custom metrics

yum install jq -y

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .

{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "namespaces/go_gc_duration_seconds_count",
      "singularName": "",
      "namespaced": false,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    },
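
The resource list is long, and each entry name has the form resource/metric. As a small sketch of how to read it, the snippet below groups metric names by resource. The SAMPLE document is a trimmed, partly hypothetical response: the first entry comes from the output above, while the pods/http_requests entry is assumed for illustration.

```python
import json

# Trimmed sample; the second entry is an assumption, not copied from real output.
SAMPLE = """
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": [
    {"name": "namespaces/go_gc_duration_seconds_count", "namespaced": false,
     "kind": "MetricValueList", "verbs": ["get"]},
    {"name": "pods/http_requests", "namespaced": true,
     "kind": "MetricValueList", "verbs": ["get"]}
  ]
}
"""


def metrics_by_resource(doc: dict) -> dict:
    """Group metric names by the resource type in front of the slash."""
    grouped = {}
    for entry in doc.get("resources", []):
        resource, _, metric = entry["name"].partition("/")
        grouped.setdefault(resource, []).append(metric)
    return grouped


print(metrics_by_resource(json.loads(SAMPLE)))
```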

Deploy the Podinfo application to test custom-metric autoscaling

kubectl create -f ./podinfo/podinfo-svc.yaml,./podinfo/podinfo-dep.yaml

Prometheus is configured with auto-discovery rules, and podinfo-dep.yaml carries the matching annotation:

annotations:
        prometheus.io/scrape: 'true'

So the application shows up under the Prometheus targets as soon as it starts.

Fetch the custom metric

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/http_requests" | jq .
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/%2A/http_requests"
  },
  "items": [
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "default",
        "name": "podinfo-58b68656c9-b9cmg",
        "apiVersion": "/v1"
      },
      "metricName": "http_requests",
      "timestamp": "2019-11-09T09:00:09Z",
      "value": "888m"
    },
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "default",
        "name": "podinfo-58b68656c9-mr265",
        "apiVersion": "/v1"
      },
      "metricName": "http_requests",
      "timestamp": "2019-11-09T09:00:09Z",
      "value": "911m"
    }
  ]
}

Configure the podinfo custom HPA

kubectl create -f ./podinfo/podinfo-hpa-custom.yaml

```
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: podinfo
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: podinfo
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metricName: http_requests
      targetAverageValue: 10
```

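Here 10 means an average of 10 requests per second per Pod. Because the metricsQuery rule uses a 1-minute range, the HPA scales out when the per-Pod request rate, measured over the past minute, exceeds 10 requests per second.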
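
How the target turns into a replica count can be sketched with the formula from the Kubernetes HPA documentation: desired = ceil(current replicas × current value / target value), clamped to the min and max. The function below is a simplified model under that assumption; it applies the default 10% tolerance but ignores the controller's scale-up rate limits and stabilization, and the function name is made up for this example.

```python
import math


def desired_replicas(current_replicas: int, current_avg: float, target_avg: float,
                     min_replicas: int, max_replicas: int,
                     tolerance: float = 0.1) -> int:
    """Simplified HPA calculation for a Pods metric with targetAverageValue."""
    ratio = current_avg / target_avg
    # Inside the tolerance band the controller leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Never go below minReplicas or above maxReplicas.
    return max(min_replicas, min(max_replicas, desired))


# Target of 10 requests/s per Pod, minReplicas 2, maxReplicas 10 as in the HPA above.
print(desired_replicas(2, 15.0, 10.0, 2, 10))
print(desired_replicas(2, 0.5, 10.0, 2, 10))
```

With 2 replicas averaging 15 requests per second, the model asks for 3 replicas; at 0.5 requests per second it stays at the minimum of 2.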

kubectl get hpa
NAME      REFERENCE            TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
podinfo   Deployment/podinfo   713m/10   2         10        8          18h

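What does 713m mean?
When the custom API server receives a request, it queries Prometheus for http_requests_total and converts the value into a per-second request rate. The m in 713m stands for milli-requests; with the 1-minute range defined in the metricsQuery rule, it corresponds to roughly 0.71 requests per second.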
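
The conversion is just a division by 1000. As a quick sketch, the helper below turns such a value into requests per second; it handles only plain numbers and the m (milli) suffix, not the full Kubernetes quantity syntax, and the function name is an assumption.

```python
from decimal import Decimal


def parse_quantity(text: str) -> Decimal:
    """Convert a value such as '713m' or '10' into a plain number."""
    if text.endswith("m"):
        # 'm' is the milli suffix: one thousandth of a unit.
        return Decimal(text[:-1]) / Decimal(1000)
    return Decimal(text)


print(parse_quantity("713m"))   # requests per second
print(parse_quantity("10"))     # the HPA target
```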

Load test with webbench

webbench -c 100  http://172.31.48.86:31198/

-c 100 runs 100 concurrent clients.

Check the HPA again; the request rate has surged

kubectl get hpa
NAME      REFERENCE            TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
podinfo   Deployment/podinfo   57610m/10   2         10        8          12h

Check the HPA events

kubectl describe hpa/podinfo
  Type     Reason                        Age                From                       Message
  ----     ------                        ----               ----                       -------
  Normal   SuccessfulRescale             33s                horizontal-pod-autoscaler  New size: 4; reason: pods metric http_requests above target
  Normal   SuccessfulRescale             17s                horizontal-pod-autoscaler  New size: 8; reason: pods metric http_requests above target
  Normal   SuccessfulRescale             2s                 horizontal-pod-autoscaler  New size: 10; reason: pods metric http_requests above target

我爱西红柿

Solution Architect
