From 9481fe4541a4f0abab6a4e76931a3d6bd9a5b8b0 Mon Sep 17 00:00:00 2001
From: "Suren A. Chilingaryan"
Date: Fri, 2 Aug 2019 02:11:00 +0200
Subject: New WAVe project

---
 docs/consistency.txt     | 6 ++++++
 docs/problems.txt        | 4 +++-
 docs/troubleshooting.txt | 2 +-
 3 files changed, 10 insertions(+), 2 deletions(-)
 (limited to 'docs')

diff --git a/docs/consistency.txt b/docs/consistency.txt
index 082a734..91a0ee7 100644
--- a/docs/consistency.txt
+++ b/docs/consistency.txt
@@ -10,6 +10,12 @@ General overview
  - API health check
     curl -k https://apiserver.kube-service-catalog.svc/healthz
 
+Nodes
+=====
+ - All systemd services are running
+    * Communication with docker daemon is actually working
+ - Replicas of mandatory pods (GlusterFS, Router) are running on all nodes
+
 Storage
 =======
  - Heketi status
diff --git a/docs/problems.txt b/docs/problems.txt
index fa88afe..1d729cd 100644
--- a/docs/problems.txt
+++ b/docs/problems.txt
@@ -20,6 +20,7 @@ Rogue network interfaces on OpenVSwitch bridge
   * With time, the new rogue interfaces are created faster and faster. At some point, it really slow downs system
     and causes pod failures (if many pods are re-scheduled in paralllel) even if not so many rogue interfaces
     still present
+  * Even if not failed, it takes several minutes to schedule the pod on the affected nodes.
 
  Cause:
   * Unclear, but it seems periodic ADEI cron jobs causes the issue.
@@ -28,7 +29,8 @@ Rogue network interfaces on OpenVSwitch bridge
 
 
  Solutions:
-  * According to RedHat the temporal solution is to reboot affected node (not helping in my case). The problem
+  * According to RedHat the temporal solution is to reboot affected node (just temporarily reduces the rate how
+    often the new spurious interfaces appear, but not preventing the problem completely in my case). The problem
     should go away, but may re-apper after a while.
   * The simplest work-around is to just remove rogue interface.
     They will be re-created, but performance problems only starts after hundreds accumulate.
diff --git a/docs/troubleshooting.txt b/docs/troubleshooting.txt
index 9fa6f91..ea987b5 100644
--- a/docs/troubleshooting.txt
+++ b/docs/troubleshooting.txt
@@ -14,7 +14,7 @@ The services has to be running
  Required Services:
   - lvm2-lvmetad.socket
   - lvm2-lvmetad.service
-   docker
+   docker - it may happen that service is alive according to systemd, but does not respond ('docker ps' timeouts)
   - NetworkManager
   - firewalld
   - dnsmasq
--
cgit v1.2.1
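
Note on the docker check added to docs/troubleshooting.txt and docs/consistency.txt: the patch says the service can be alive according to systemd while 'docker ps' times out. A minimal sketch of how that distinction could be tested on a node is given below. The function name check_docker and the default 15-second limit are assumptions made for illustration; they are not part of the patch or of the documented procedure.

```shell
# Hypothetical helper (not from the patch): report the state of the docker service.
#   down         - systemd does not consider the unit active
#   ok           - unit is active and 'docker ps' answered within the limit
#   unresponsive - unit is active, but 'docker ps' failed or timed out
# The time limit in seconds is the first argument; 15 is an assumed default.
check_docker() {
    if ! systemctl is-active --quiet docker; then
        echo "down"
    elif timeout "${1:-15}" docker ps >/dev/null 2>&1; then
        echo "ok"
    else
        echo "unresponsive"
    fi
}
```

Running `check_docker 30` on each node would then give a one-word status. Note that "unresponsive" also covers the case where 'docker ps' fails for an unrelated reason (for example, missing permissions), so a non-ok result still needs a manual look.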