2017-10-04

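Prometheus unable to scrape external etcd

I have a Kubernetes cluster running on AWS instances, with Prometheus running inside Kubernetes for monitoring. Three etcd servers run outside Kubernetes, and I am trying to use Prometheus to monitor the health of etcd.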

Prometheus is deployed as a StatefulSet and already collects metrics from the kubelets, the node exporter, and itself. However, I cannot get any metrics from etcd.

Here is the relevant part of the Prometheus configuration:

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus
  namespace: monitoring
data:
  prometheus.yml: |-
    global:
      scrape_interval: 30s
      evaluation_interval: 30s

    rule_files:
      - /etc/alertmanager/*.rules

    scrape_configs:

      - job_name: etcd
        scheme: https
        static_configs:
          - targets: ['x.x.x.x:2379']
        tls_config:
          ca_file: /etc/etcd/ssl/ca.pem
          cert_file: /etc/etcd/ssl/client.pem
          key_file: /etc/etcd/ssl/client-key.pem
          insecure_skip_verify: true

      - job_name: kubelets
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          insecure_skip_verify: true
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

This is the error I get in the Prometheus dashboard:

Get https://x.x.x.x.:2379/metrics: x509: cannot validate certificate for x.x.x.x because it doesn't contain any IP SANs 

The certificate is self-signed, but shouldn't "insecure_skip_verify" take care of that?

Answer

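To rule out a problem on the etcd side: if you are using etcd3, you can pass the following flags to the etcd client, etcdctl, and follow the steps in https://github.com/coreos/etcd/blob/master/Documentation/dev-guide/interacting_v3.md to interact with the etcd server. If that works without errors, I would say this is a Prometheus problem, because Prometheus is not honoring the insecure_skip_verify: true setting.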

--insecure-skip-tls-verify=true skip server certificate verification 
--insecure-transport=true   disable transport security for client connections
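
The same isolation test can also be run against the /metrics endpoint itself, from outside Prometheus. The following is only a minimal sketch in Python using the standard library. The helper names build_tls_context and fetch_metrics are hypothetical, and the certificate paths and the x.x.x.x placeholder are taken from the question. It assumes etcd serves metrics on the client port and accepts the client certificate shown in the configuration.

```python
import ssl
import urllib.request


def build_tls_context(ca_file=None, cert_file=None, key_file=None,
                      insecure_skip_verify=False):
    """Build a client TLS context loosely mirroring the tls_config fields.

    Hypothetical helper for illustration; not part of Prometheus or etcd.
    """
    # With ca_file=None the system default CA bundle is used.
    ctx = ssl.create_default_context(cafile=ca_file)
    if cert_file:
        # Present a client certificate in case etcd requires client auth.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if insecure_skip_verify:
        # check_hostname must be disabled before verify_mode can be CERT_NONE.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


def fetch_metrics(url, ctx, timeout=10):
    """Return the first few lines of the metrics page; TLS errors propagate."""
    with urllib.request.urlopen(url, context=ctx, timeout=timeout) as resp:
        body = resp.read().decode("utf-8", errors="replace")
    return body.splitlines()[:5]


# Example usage (paths and address are the placeholders from the question):
#
# ctx = build_tls_context(
#     ca_file="/etc/etcd/ssl/ca.pem",
#     cert_file="/etc/etcd/ssl/client.pem",
#     key_file="/etc/etcd/ssl/client-key.pem",
#     insecure_skip_verify=True,
# )
# for line in fetch_metrics("https://x.x.x.x:2379/metrics", ctx):
#     print(line)
```

If this request succeeds with insecure_skip_verify=True but fails with it set to False, the certificate itself is the likely cause (for example, missing IP SANs), and the remaining question is why Prometheus does not apply the skip-verify setting.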