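After deployment, I noticed that Kubernetes metrics collection was failing. The categraf logs show that input.kubernetes and the other k8s collectors are configured with a kubelet address of 127.0.0.1, which is unreachable: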
2024/12/17 02:52:55 instances.go:227: E! failed to query url: https://127.0.0.1:10250/metrics/cadvisor error: Get "https://127.0.0.1:10250/metrics/cadvisor": dial tcp 127.0.0.1:10250: connect: connection refused
2024/12/17 02:52:55 prometheus.go:227: E! failed to query url: https://127.0.0.1:10250/metrics error: Get "https://127.0.0.1:10250/metrics": dial tcp 127.0.0.1:10250: connect: connection refused
2024/12/17 02:53:10 kubernetes.go:113: E! failed to load https://127.0.0.1:10250/stats/summary error: error making HTTP request to https://127.0.0.1:10250/stats/summary: dial tcp 127.0.0.1:10250: connect: connection refused
The kubelet in this cluster does not accept connections on 127.0.0.1; it listens only on the node IP:
[root@xcmgt01 n9e-helm]# netstat -tunlp |grep 10250
tcp 0 0 172.30.31.15:10250 0.0.0.0:* LISTEN 1809/kubelet
[root@xcmgt01 n9e-helm]# curl https://127.0.0.1:10250
curl: (7) Failed to connect to 127.0.0.1 port 10250: Connection refused
[root@xcmgt01 n9e-helm]# curl https://172.30.31.15:10250
404 page not found
[root@xcmgt01 n9e-helm]# curl https://127.0.0.1:10250/metrics/cadvisor
curl: (7) Failed to connect to 127.0.0.1 port 10250: Connection refused
[root@xcmgt01 n9e-helm]# curl https://172.30.31.15:10250/metrics/cadvisor
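For anyone who wants to run the same check on their own nodes, here is a minimal sketch that pulls the kubelet's bind address out of `netstat -tunlp` output. The function name `kubelet_listen_addr` is my own and hypothetical, and the sketch assumes the kubelet port is 10250 and that the netstat columns look like the output above. If it prints a specific node IP rather than 0.0.0.0 or ::, the kubelet is bound to that one address and loopback will be refused.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `netstat -tunlp` output on stdin and prints
# the local address the kubelet is listening on for port 10250.
kubelet_listen_addr() {
  awk '$4 ~ /:10250$/ && $NF ~ /kubelet/ {
    sub(/:10250$/, "", $4)   # strip the port, keep only the address
    print $4
  }'
}

# Example with the line captured on my node:
echo 'tcp 0 0 172.30.31.15:10250 0.0.0.0:* LISTEN 1809/kubelet' | kubelet_listen_addr
```

On a live node this would be used as `netstat -tunlp | kubelet_listen_addr`.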
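I suggest adding a comment to the default configuration, or changing it to the following, so that anyone who hits the same problem has a solution.
${HOSTIP} is an environment variable that is already set in the Categraf pod, and using it solved the problem for me: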
# URL for the kubelet
url = "https://${HOSTIP}:10250"
[root@xcmgt01 n9e-helm]# kubectl exec -it nightingale-categraf-v6-wkgj9 -n n9e -- printenv | grep HOSTIP
HOSTIP=172.30.31.46
[root@xcmgt01 n9e-helm]# kubectl exec -it nightingale-categraf-v6-wkgj9 -n n9e -- printenv | grep 172.30
HOSTIP=172.30.31.46
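Since the log shows three different inputs all pointing at 127.0.0.1:10250, the same substitution has to be made in more than one place. Below is a rough sketch of a filter that rewrites the kubelet URL in whatever config text is piped through it. The function name `use_hostip` is hypothetical, and which files need to be passed through it depends on how your chart lays out the Categraf config, which I am not assuming here.

```shell
#!/usr/bin/env bash
# Hypothetical filter: replaces the loopback kubelet URL with the
# literal placeholder ${HOSTIP}, which is resolved inside the pod.
use_hostip() {
  # Single quotes keep ${HOSTIP} literal so the shell does not expand it here.
  sed 's#https://127\.0\.0\.1:10250#https://${HOSTIP}:10250#g'
}

# Example on a single config line:
printf 'url = "https://127.0.0.1:10250"\n' | use_hostip
```

Paths after the port, such as /metrics/cadvisor or /stats/summary, are left untouched because only the scheme, host, and port are matched.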