2017-09-26

Pod stuck in Pending state

I have a Kubernetes deployment in which I am trying to run 5 Docker containers inside a single pod on a single node. The containers sit in the Pending state and are never scheduled. I don't mind running more than one pod, but I want to keep the node count down. I assumed that one node with 1 CPU and 1.7 GB of RAM would be enough for 5 containers, and I am trying to spread the workload.

Initially I concluded that I was short on resources. I enabled node autoscaling, which produced the following (see the kubectl describe pod output):

pod didn't trigger scale-up (it wouldn't fit if a new node is added)

In any case, each Docker container runs a simple command for a fairly simple application. Ideally I would rather not deal with setting CPU and RAM resource allocations at all, but even with CPU/memory limits set low enough that they don't add up to more than 1, I still get this (see kubectl describe po/test-529945953-gh6cl below):

No nodes are available that match all of the following predicates:: Insufficient cpu (1), Insufficient memory (1).

Below is the output of the various commands showing the current state. Any help with what I'm doing wrong would be much appreciated.

kubectl get all

[email protected]:~/gce$ kubectl get all 
NAME       READY  STATUS RESTARTS AGE 
po/test-529945953-gh6cl 0/5  Pending 0   34m 

NAME    CLUSTER-IP EXTERNAL-IP PORT(S) AGE 
svc/kubernetes 10.7.240.1 <none>  443/TCP 19d 

NAME    DESIRED CURRENT UP-TO-DATE AVAILABLE AGE 
deploy/test 1   1   1   0   34m 

NAME     DESIRED CURRENT READY  AGE 
rs/test-529945953 1   1   0   34m 
[email protected]:~/gce$ 

kubectl describe po/test-529945953-gh6cl

[email protected]:~/gce$ kubectl describe po/test-529945953-gh6cl 
Name:   test-529945953-gh6cl 
Namespace:  default 
Node:   <none> 
Labels:   app=test 
       pod-template-hash=529945953 
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"default","name":"test-529945953","uid":"c6e889cb-a2a0-11e7-ac18-42010a9a001a"... 
Status:   Pending 
IP: 
Created By:  ReplicaSet/test-529945953 
Controlled By: ReplicaSet/test-529945953 
Containers: 
    container-test2-tickers: 
    Image:  gcr.io/testing-11111/testology:latest 
    Port:  <none> 
    Command: 
     process_cmd 
     arg1 
     test2 
    Limits: 
     cpu:  150m 
     memory: 375Mi 
    Requests: 
     cpu:  100m 
     memory: 375Mi 
    Environment: 
     DB_HOST:   127.0.0.1:5432 
     DB_PASSWORD:  <set to the key 'password' in secret 'cloudsql-db-credentials'> Optional: false 
     DB_USER:   <set to the key 'username' in secret 'cloudsql-db-credentials'> Optional: false 
    Mounts: 
     /var/run/secrets/kubernetes.io/serviceaccount from default-token-b2mxc (ro) 
    container-kraken-tickers: 
    Image:  gcr.io/testing-11111/testology:latest 
    Port:  <none> 
    Command: 
     process_cmd 
     arg1 
     arg2 
    Limits: 
     cpu:  150m 
     memory: 375Mi 
    Requests: 
     cpu:  100m 
     memory: 375Mi 
    Environment: 
     DB_HOST:   127.0.0.1:5432 
     DB_PASSWORD:  <set to the key 'password' in secret 'cloudsql-db-credentials'> Optional: false 
     DB_USER:   <set to the key 'username' in secret 'cloudsql-db-credentials'> Optional: false 
    Mounts: 
     /var/run/secrets/kubernetes.io/serviceaccount from default-token-b2mxc (ro) 
    container-gdax-tickers: 
    Image:  gcr.io/testing-11111/testology:latest 
    Port:  <none> 
    Command: 
     process_cmd 
     arg1 
     arg2 
    Limits: 
     cpu:  150m 
     memory: 375Mi 
    Requests: 
     cpu:  100m 
     memory: 375Mi 
    Environment: 
     DB_HOST:   127.0.0.1:5432 
     DB_PASSWORD:  <set to the key 'password' in secret 'cloudsql-db-credentials'> Optional: false 
     DB_USER:   <set to the key 'username' in secret 'cloudsql-db-credentials'> Optional: false 
    Mounts: 
     /var/run/secrets/kubernetes.io/serviceaccount from default-token-b2mxc (ro) 
    container-bittrex-tickers: 
    Image:  gcr.io/testing-11111/testology:latest 
    Port:  <none> 
    Command: 
     process_cmd 
     arg1 
     arg2 
    Limits: 
     cpu:  150m 
     memory: 375Mi 
    Requests: 
     cpu:  100m 
     memory: 375Mi 
    Environment: 
     DB_HOST:   127.0.0.1:5432 
     DB_PASSWORD:  <set to the key 'password' in secret 'cloudsql-db-credentials'> Optional: false 
     DB_USER:   <set to the key 'username' in secret 'cloudsql-db-credentials'> Optional: false 
    Mounts: 
     /var/run/secrets/kubernetes.io/serviceaccount from default-token-b2mxc (ro) 
    cloudsql-proxy: 
    Image:  gcr.io/cloudsql-docker/gce-proxy:1.09 
    Port:  <none> 
    Command: 
     /cloud_sql_proxy 
     --dir=/cloudsql 
     -instances=testing-11111:europe-west2:testology=tcp:5432 
     -credential_file=/secrets/cloudsql/credentials.json 
    Limits: 
     cpu:  150m 
     memory: 375Mi 
    Requests: 
     cpu:    100m 
     memory:   375Mi 
    Environment:  <none> 
    Mounts: 
     /cloudsql from cloudsql (rw) 
     /etc/ssl/certs from ssl-certs (rw) 
     /secrets/cloudsql from cloudsql-instance-credentials (ro) 
     /var/run/secrets/kubernetes.io/serviceaccount from default-token-b2mxc (ro) 
Conditions: 
    Type   Status 
    PodScheduled False 
Volumes: 
    cloudsql-instance-credentials: 
    Type:  Secret (a volume populated by a Secret) 
    SecretName: cloudsql-instance-credentials 
    Optional: false 
    ssl-certs: 
    Type:  HostPath (bare host directory volume) 
    Path:  /etc/ssl/certs 
    cloudsql: 
    Type:  EmptyDir (a temporary directory that shares a pod's lifetime) 
    Medium: 
    default-token-b2mxc: 
    Type:  Secret (a volume populated by a Secret) 
    SecretName: default-token-b2mxc 
    Optional: false 
QoS Class:  Burstable 
Node-Selectors: <none> 
Tolerations: node.alpha.kubernetes.io/notReady:NoExecute for 300s 
       node.alpha.kubernetes.io/unreachable:NoExecute for 300s 
Events: 
    FirstSeen  LastSeen  Count From     SubObjectPath Type   Reason     Message 
    ---------  --------  ----- ----     ------------- --------  ------     ------- 
    27m   17m    44  default-scheduler      Warning   FailedScheduling  No nodes are available that match all of the following predicates:: Insufficient cpu (1), Insufficient memory (2). 
    26m   8s    150  cluster-autoscaler      Normal   NotTriggerScaleUp  pod didn't trigger scale-up (it wouldn't fit if a new node is added) 
    16m   2s    63  default-scheduler      Warning   FailedScheduling  No nodes are available that match all of the following predicates:: Insufficient cpu (1), Insufficient memory (1). 
[email protected]:~/gce$ 


kubectl get nodes

[email protected]:~/gce$ kubectl get nodes 
NAME          STATUS AGE  VERSION 
gke-test-default-pool-abdf83f7-p4zw Ready  9h  v1.6.7 

kubectl get pods

[email protected]:~/gce$ kubectl get pods 
NAME      READY  STATUS RESTARTS AGE 
test-529945953-gh6cl 0/5  Pending 0   38m 

kubectl describe nodes

[email protected]:~/gce$ kubectl describe nodes 
Name:     gke-test-default-pool-abdf83f7-p4zw 
Role: 
Labels:     beta.kubernetes.io/arch=amd64 
         beta.kubernetes.io/fluentd-ds-ready=true 
         beta.kubernetes.io/instance-type=g1-small 
         beta.kubernetes.io/os=linux 
         cloud.google.com/gke-nodepool=default-pool 
         failure-domain.beta.kubernetes.io/region=europe-west2 
         failure-domain.beta.kubernetes.io/zone=europe-west2-c 
         kubernetes.io/hostname=gke-test-default-pool-abdf83f7-p4zw 
Annotations:   node.alpha.kubernetes.io/ttl=0 
         volumes.kubernetes.io/controller-managed-attach-detach=true 
Taints:     <none> 
CreationTimestamp:  Tue, 26 Sep 2017 02:05:45 +0100 
Conditions: 
    Type     Status LastHeartbeatTime      LastTransitionTime      Reason       Message 
    ----     ------ -----------------      ------------------      ------       ------- 
    NetworkUnavailable False Tue, 26 Sep 2017 02:06:05 +0100   Tue, 26 Sep 2017 02:06:05 +0100   RouteCreated     RouteController created a route 
    OutOfDisk    False Tue, 26 Sep 2017 11:33:57 +0100   Tue, 26 Sep 2017 02:05:45 +0100   KubeletHasSufficientDisk  kubelet has sufficient disk space available 
    MemoryPressure  False Tue, 26 Sep 2017 11:33:57 +0100   Tue, 26 Sep 2017 02:05:45 +0100   KubeletHasSufficientMemory  kubelet has sufficient memory available 
    DiskPressure   False Tue, 26 Sep 2017 11:33:57 +0100   Tue, 26 Sep 2017 02:05:45 +0100   KubeletHasNoDiskPressure  kubelet has no disk pressure 
    Ready     True Tue, 26 Sep 2017 11:33:57 +0100   Tue, 26 Sep 2017 02:06:05 +0100   KubeletReady     kubelet is posting ready status. AppArmor enabled 
    KernelDeadlock  False Tue, 26 Sep 2017 11:33:12 +0100   Tue, 26 Sep 2017 02:05:45 +0100   KernelHasNoDeadlock    kernel has no deadlock 
Addresses: 
    InternalIP: 10.154.0.2 
    ExternalIP: 35.197.217.1 
    Hostname:  gke-test-default-pool-abdf83f7-p4zw 
Capacity: 
cpu:   1 
memory:  1742968Ki 
pods:   110 
Allocatable: 
cpu:   1 
memory:  1742968Ki 
pods:   110 
System Info: 
Machine ID:     e6119abf844c564193495c64fd9bd341 
System UUID:     E6119ABF-844C-5641-9349-5C64FD9BD341 
Boot ID:      1c2f2ea0-1f5b-4c90-9e14-d1d9d7b75221 
Kernel Version:    4.4.52+ 
OS Image:      Container-Optimized OS from Google 
Operating System:    linux 
Architecture:     amd64 
Container Runtime Version:  docker://1.11.2 
Kubelet Version:    v1.6.7 
Kube-Proxy Version:   v1.6.7 
PodCIDR:      10.4.1.0/24 
ExternalID:      6073438913956157854 
Non-terminated Pods:   (7 in total) 
    Namespace      Name               CPU Requests CPU Limits  Memory Requests Memory Limits 
    ---------      ----               ------------ ----------  --------------- ------------- 
    kube-system     fluentd-gcp-v2.0-k565g           100m (10%)  0 (0%)   200Mi (11%)  300Mi (17%) 
    kube-system     heapster-v1.3.0-3440173064-1ztvw        138m (13%)  138m (13%)  301456Ki (17%) 301456Ki (17%) 
    kube-system     kube-dns-1829567597-gdz52          260m (26%)  0 (0%)   110Mi (6%)  170Mi (9%) 
    kube-system     kube-dns-autoscaler-2501648610-7q9dd       20m (2%)  0 (0%)   10Mi (0%)  0 (0%) 
    kube-system     kube-proxy-gke-test-default-pool-abdf83f7-p4zw    100m (10%)  0 (0%)   0 (0%)   0 (0%) 
    kube-system     kubernetes-dashboard-490794276-25hmn       100m (10%)  100m (10%)  50Mi (2%)  50Mi (2%) 
    kube-system     l7-default-backend-3574702981-flqck        10m (1%)  10m (1%)  20Mi (1%)  20Mi (1%) 
Allocated resources: 
    (Total limits may be over 100 percent, i.e., overcommitted.) 
    CPU Requests CPU Limits  Memory Requests Memory Limits 
    ------------ ----------  --------------- ------------- 
    728m (72%) 248m (24%)  700816Ki (40%) 854416Ki (49%) 
Events:   <none> 

Answer


As you can see under Allocated resources: in the output of your kubectl describe nodes command, the pods already running in the kube-system namespace have requested 728m (72%) of the node's CPU and 700816Ki (40%) of its memory. The resource requests of your test pod's containers add up to more than the CPU and memory remaining on the node, which is exactly what the Events section of your kubectl describe po/[…] output is telling you.
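The arithmetic behind those scheduling failures can be checked directly. A quick sketch, using only the numbers from the kubectl describe output above:

```shell
# Node allocatable (from "kubectl describe nodes"): 1 CPU = 1000m, memory = 1742968Ki.
# kube-system pods already request 728m CPU and 700816Ki memory.
# The test pod adds 5 containers, each requesting 100m CPU and 375Mi memory.

NODE_CPU_M=1000
KUBE_SYSTEM_CPU_M=728
POD_CPU_M=$((5 * 100))
TOTAL_CPU_M=$((KUBE_SYSTEM_CPU_M + POD_CPU_M))
echo "CPU requested: ${TOTAL_CPU_M}m of ${NODE_CPU_M}m"        # 1228m > 1000m -> Insufficient cpu

NODE_MEM_KI=1742968
KUBE_SYSTEM_MEM_KI=700816
POD_MEM_KI=$((5 * 375 * 1024))                                 # 375Mi per container, in Ki
TOTAL_MEM_KI=$((KUBE_SYSTEM_MEM_KI + POD_MEM_KI))
echo "Memory requested: ${TOTAL_MEM_KI}Ki of ${NODE_MEM_KI}Ki" # 2620816Ki > 1742968Ki -> Insufficient memory
```

So even though the per-container requests look small, the pod as a whole needs 500m CPU and about 1875Mi of memory on one node, on top of what the system pods already claim.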

If you want to keep all the containers in a single pod, you need to either reduce their resource requests or run them on a node with more CPU and memory. The better solution, though, is to split the application into several pods, which can then be spread across multiple nodes.
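As a sketch of that split, each worker could get its own single-container Deployment so the scheduler (and the autoscaler) can place the pods independently. The image, command, and resource figures below are taken from the pod description above; the Deployment name and label are illustrative, and the apiVersion is a guess appropriate for the v1.6 cluster shown. Apply one such manifest per worker with kubectl apply -f:

```yaml
# Hypothetical sketch: one Deployment per worker process instead of one 5-container pod.
# Repeat (or template) this for each of the five workloads.
apiVersion: apps/v1beta1          # assumed; matches a v1.6-era cluster
kind: Deployment
metadata:
  name: test2-tickers             # illustrative name
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: test2-tickers
    spec:
      containers:
      - name: container-test2-tickers
        image: gcr.io/testing-11111/testology:latest
        command: ["process_cmd", "arg1", "test2"]
        resources:
          requests:
            cpu: 100m             # per-pod request; each node now only needs
            memory: 375Mi         # room for one worker at a time
          limits:
            cpu: 150m
            memory: 375Mi
```

Note that each pod would then also need its own cloudsql-proxy sidecar (or a shared proxy exposed as a Service), since the containers reach the database via DB_HOST: 127.0.0.1:5432.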


Would running each container in its own pod allow the separate pods to be scheduled onto additional nodes? Is that the alternative solution you mean? – s5s


Yes, that would be the way to go. –