Define the bind mapping for the Prometheus HTTP server

Changed API response time to milliseconds, updated Grafana dashboards, and updated README to reflect that.
Added logic to capture instance deploy failures
2026-03-25 09:54:49 +00:00 · 2021-03-10 13:36:27 -05:00 · 2020-12-25 09:57:18 -05:00 · 2020-12-25 09:57:18 -05:00 · 2020-12-11 15:37:04 -05:00 · 2020-12-11 15:11:51 -05:00
8 changed files with 1042 additions and 155 deletions
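As a rough illustration of the seconds-to-milliseconds change named in the commit message, the sketch below times a call and reports the elapsed time in milliseconds. The helper name `time_call_ms` and the stand-in workload are hypothetical and are not taken from the exporter's code; only the Python standard library is used.

```python
import time
from typing import Any, Callable, Tuple


def time_call_ms(func: Callable[[], Any]) -> Tuple[Any, float]:
    """Run func and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = func()
    elapsed_seconds = time.perf_counter() - start
    # Convert seconds to milliseconds so the value matches a
    # *_milliseconds metric name.
    return result, elapsed_seconds * 1000.0


if __name__ == "__main__":
    # Stand-in for an API call; a real exporter would query the cloud here.
    _, elapsed_ms = time_call_ms(lambda: time.sleep(0.05))
    print(f"elapsed: {elapsed_ms:.1f} ms")
```

Keeping the conversion in one place means the metric name and its unit cannot drift apart when more API checks are added.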
--- a/README.md
+++ b/README.md
@@ -35,23 +35,23 @@ openstack_api_status{api_name="horizon",cloud_name="CLOUD_NAME"}

 # Information
 ### Standard Metrics Provided
-| Metric | Description|
-|--------------------------|--------------------------|
-| `openstack_api_response_seconds {api_name="API_NAME",cloud_name="CLOUD_NAME"}` | Seconds for the api to respond via openstack sdk. nova, neutron, and cinder are currently recorded. |
-| `openstack_api_status {api_name="API_NAME",cloud_name="CLOUD_NAME"}` | Status of the openstack api. 1 = up 0 = down. nova, neutron, and cinder are currently recorded. |
-| `openstack_hypervisor_running_vms {hypervisor_hostname="HYPERVISOR_NAME",cloud_name="CLOUD_NAME"}` | Number of running VMs on every hypervisor in the region. |
-| `openstack_hypervisor_used_ram_mb {hypervisor_hostname="HYPERVISOR_NAME",cloud_name="CLOUD_NAME"}` | Amount of RAM in MB used (as reported by nova-compute) for every hypervisor in the region. |
-| `openstack_hypervisor_total_ram_mb {hypervisor_hostname="HYPERVISOR_NAME",cloud_name="CLOUD_NAME"}` | Amount of RAM in MB in total (as reported by nova-compute) for every hypervisor in the region. |
-| `openstack_hypervisor_used_cpus {hypervisor_hostname="HYPERVISOR_NAME",cloud_name="CLOUD_NAME"}` | Number of vcpus used (as reported by nova-compute) for every hypervisor in the region. |
-| `openstack_hypervisor_total_cpus {hypervisor_hostname="HYPERVISOR_NAME",cloud_name="CLOUD_NAME"}` | Number of vcpus in total (as reported by nova-compute) for every hypervisor in the region. |
-| `openstack_hypervisor_enabled {hypervisor_hostname="HYPERVISOR_NAME",cloud_name="CLOUD_NAME"}` | nova-compute status for every hypervisor in the region. 1 = enabled 0 = disabled|
-| `openstack_hypervisor_up {hypervisor_hostname="HYPERVISOR_NAME",cloud_name="CLOUD_NAME"}` | nova-compute state for every hypervisor in the region. 1 = up 0 = down |
-| `openstack_hypervisor_local_gb_total {hypervisor_hostname="HYPERVISOR_NAME",cloud_name="CLOUD_NAME"}`| Total local disk in GB (as reported by nova-compute) for every hypervisor in the region. |
-| `openstack_hypervisor_local_gb_used {hypervisor_hostname="HYPERVISOR_NAME",cloud_name="CLOUD_NAME"}` | Total local disk used in GB (as reported by nova-compute) for every hypervisor in the region. |
+| Metric | Metric Labels | Description|
+| :--- | :--- | :--- |
+| `openstack_api_response_milliseconds` | `{api_name="API_NAME",cloud_name="CLOUD_NAME"}` | Milliseconds for the api to respond via openstack sdk. nova, neutron, and cinder are currently recorded. |
+| `openstack_api_status` | `{api_name="API_NAME",cloud_name="CLOUD_NAME"}` | Status of the openstack api. 1 = up 0 = down. nova, neutron, and cinder are currently recorded. |
+| `openstack_hypervisor_running_vms` | `{hypervisor_hostname="HYPERVISOR_NAME",cloud_name="CLOUD_NAME",aggregate="AGGREGATE_NAME"}` | Number of running VMs on every hypervisor in the region. |
+| `openstack_hypervisor_used_ram_mb` | `{hypervisor_hostname="HYPERVISOR_NAME",cloud_name="CLOUD_NAME",aggregate="AGGREGATE_NAME"}` | Amount of RAM in MB used (as reported by nova-compute) for every hypervisor in the region. |
+| `openstack_hypervisor_total_ram_mb` | `{hypervisor_hostname="HYPERVISOR_NAME",cloud_name="CLOUD_NAME",aggregate="AGGREGATE_NAME"}` | Amount of RAM in MB in total (as reported by nova-compute) for every hypervisor in the region. |
+| `openstack_hypervisor_used_cpus` | `{hypervisor_hostname="HYPERVISOR_NAME",cloud_name="CLOUD_NAME",aggregate="AGGREGATE_NAME"}` | Number of vcpus used (as reported by nova-compute) for every hypervisor in the region. |
+| `openstack_hypervisor_total_cpus` | `{hypervisor_hostname="HYPERVISOR_NAME",cloud_name="CLOUD_NAME",aggregate="AGGREGATE_NAME"}` | Number of vcpus in total (as reported by nova-compute) for every hypervisor in the region. |
+| `openstack_hypervisor_enabled` | `{hypervisor_hostname="HYPERVISOR_NAME",cloud_name="CLOUD_NAME",aggregate="AGGREGATE_NAME"}` | nova-compute status for every hypervisor in the region. 1 = enabled 0 = disabled|
+| `openstack_hypervisor_up` | `{hypervisor_hostname="HYPERVISOR_NAME",cloud_name="CLOUD_NAME",aggregate="AGGREGATE_NAME"}` | nova-compute state for every hypervisor in the region. 1 = up 0 = down |
+| `openstack_hypervisor_local_gb_total` | `{hypervisor_hostname="HYPERVISOR_NAME",cloud_name="CLOUD_NAME",aggregate="AGGREGATE_NAME"}`| Total local disk in GB (as reported by nova-compute) for every hypervisor in the region. |
+| `openstack_hypervisor_local_gb_used` | `{hypervisor_hostname="HYPERVISOR_NAME",cloud_name="CLOUD_NAME",aggregate="AGGREGATE_NAME"}` | Total local disk used in GB (as reported by nova-compute) for every hypervisor in the region. |

 ### Optional Metrics (use flags when running)
-| Metric | Description |
-|-----|-----|
-|`openstack_instance_deploy_seconds_to_ping {hypervisor_hostname="HYPERVISOR_NAME",cloud_name="CLOUD_NAME"}` | Seconds from deploy command to ping when creating an instance for every hypervisor in the region. Requires --flavor, --image, --network, and --instance_deploy flags. The network used needs to have TCP port 22 (uses TCP instead of ICMP to ping) open in the default security group. |
-|`openstack_horizon_response_seconds {cloud_name="CLOUD_NAME"}` | Seconds it takes for Chromium to log into Horizon. Requires --horizon_url flag. |
-|`openstack_horizon_status {cloud_name="CLOUD_NAME"}` | Horizon status. 1 = up 0 = down. Requires --horizon_url flag. |
+| Metric | Metric Labels | Description |
+| :--- | :--- | :--- |
+|`openstack_instance_deploy_seconds_to_ping` | `{hypervisor_hostname="HYPERVISOR_NAME",cloud_name="CLOUD_NAME"}` | Seconds from deploy command to ping when creating an instance for every hypervisor in the region. Requires --flavor, --image, --network, and --instance_deploy flags. The network used needs to have TCP port 22 (uses TCP instead of ICMP to ping) open in the default security group. |
+|`openstack_horizon_response_seconds` | `{cloud_name="CLOUD_NAME"}` | Seconds it takes for Chromium to log into Horizon. Requires --horizon_url flag. |
+|`openstack_horizon_status` | `{cloud_name="CLOUD_NAME"}` | Horizon status. 1 = up 0 = down. Requires --horizon_url flag. |
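To show how a metric name and its label set from the tables above combine into a single sample, here is a minimal sketch that renders one line in the Prometheus text exposition format. The function `format_sample` is a hypothetical helper written for illustration, not part of the exporter, and the sample value is made up.

```python
from typing import Mapping


def format_sample(name: str, labels: Mapping[str, str], value: float) -> str:
    """Render one sample in the Prometheus text exposition format."""

    def escape(text: str) -> str:
        # Label values escape backslash, double quote, and newline.
        return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

    label_text = ",".join(f'{key}="{escape(val)}"' for key, val in labels.items())
    if label_text:
        return f"{name}{{{label_text}}} {value}"
    return f"{name} {value}"


if __name__ == "__main__":
    # The value 123.5 is an illustrative number, not a measured result.
    print(
        format_sample(
            "openstack_api_response_milliseconds",
            {"api_name": "nova", "cloud_name": "CLOUD_NAME"},
            123.5,
        )
    )
```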
--- a/grafana-dashboards/Openstack
+++ b/grafana-dashboards/Openstack
@@ -64,7 +64,7 @@
  "gnetId": null,
  "graphTooltip": 0,
  "id": null,
-  "iteration": 1606267613619,
+  "iteration": 1608835196471,
  "links": [],
  "panels": [
    {
@@ -361,7 +361,7 @@
        "#d44a3a"
      ],
      "datasource": "${DS_PROMETHEUS}",
-      "format": "none",
+      "format": "ms",
      "gauge": {
        "maxValue": 100,
        "minValue": 0,
@@ -393,7 +393,7 @@
      "nullPointMode": "connected",
      "nullText": null,
      "options": {},
-      "postfix": " Seconds",
+      "postfix": "",
      "postfixFontSize": "50%",
      "prefix": "",
      "prefixFontSize": "50%",
@@ -409,12 +409,12 @@
      "tableColumn": "",
      "targets": [
        {
-          "expr": "openstack_api_response_seconds{api_name=\"nova\", cloud_name=\"$cloud\"}",
+          "expr": "openstack_api_response_milliseconds{api_name=\"nova\", cloud_name=\"$cloud\"}",
          "instant": false,
          "refId": "A"
        }
      ],
-      "thresholds": "5,10",
+      "thresholds": "5000,10000",
      "timeFrom": null,
      "timeShift": null,
      "title": "Nova API Avg Response Time",
@@ -439,7 +439,8 @@
        "#d44a3a"
      ],
      "datasource": "${DS_PROMETHEUS}",
-      "format": "none",
+      "decimals": null,
+      "format": "ms",
      "gauge": {
        "maxValue": 100,
        "minValue": 0,
@@ -471,7 +472,7 @@
      "nullPointMode": "connected",
      "nullText": null,
      "options": {},
-      "postfix": " Seconds",
+      "postfix": "",
      "postfixFontSize": "50%",
      "prefix": "",
      "prefixFontSize": "50%",
@@ -487,12 +488,12 @@
      "tableColumn": "",
      "targets": [
        {
-          "expr": "openstack_api_response_seconds{api_name=\"neutron\", cloud_name=\"$cloud\"}",
+          "expr": "openstack_api_response_milliseconds{api_name=\"neutron\", cloud_name=\"$cloud\"}",
          "instant": false,
          "refId": "A"
        }
      ],
-      "thresholds": "5,10",
+      "thresholds": "5000,10000",
      "timeFrom": null,
      "timeShift": null,
      "title": "Neutron API Avg Response Time",
@@ -517,7 +518,7 @@
        "#d44a3a"
      ],
      "datasource": "${DS_PROMETHEUS}",
-      "format": "none",
+      "format": "ms",
      "gauge": {
        "maxValue": 100,
        "minValue": 0,
@@ -549,7 +550,7 @@
      "nullPointMode": "connected",
      "nullText": null,
      "options": {},
-      "postfix": " Seconds",
+      "postfix": "",
      "postfixFontSize": "50%",
      "prefix": "",
      "prefixFontSize": "50%",
@@ -565,12 +566,12 @@
      "tableColumn": "",
      "targets": [
        {
-          "expr": "openstack_api_response_seconds{api_name=\"cinder\", cloud_name=\"$cloud\"}",
+          "expr": "openstack_api_response_milliseconds{api_name=\"cinder\", cloud_name=\"$cloud\"}",
          "instant": false,
          "refId": "A"
        }
      ],
-      "thresholds": "5,10",
+      "thresholds": "5000,10000",
      "timeFrom": null,
      "timeShift": null,
      "title": "Cinder API Avg Response Time",
@@ -595,7 +596,7 @@
        "#d44a3a"
      ],
      "datasource": "${DS_PROMETHEUS}",
-      "format": "none",
+      "format": "s",
      "gauge": {
        "maxValue": 100,
        "minValue": 0,
@@ -627,7 +628,7 @@
      "nullPointMode": "connected",
      "nullText": null,
      "options": {},
-      "postfix": " Seconds",
+      "postfix": "",
      "postfixFontSize": "50%",
      "prefix": "",
      "prefixFontSize": "50%",
@@ -648,7 +649,7 @@
          "refId": "A"
        }
      ],
-      "thresholds": "5,10",
+      "thresholds": "8,12",
      "timeFrom": null,
      "timeShift": null,
      "title": "Horizon API Avg Response Time",
@@ -678,10 +679,10 @@
      "type": "row"
    },
    {
-      "content": "\n# Current Status\n\nAPIs marked as \"Good\" and in green have a current 10 minute average response time of below 5 seconds. \n\nAPIs marked as \"Degraded\" and in orange have a current 10 minute average response time of above 5 seconds and below 8 seconds.\n\nAPIs marked as \"Unusable\" and in red have a current 10 minute average response time above 8 seconds.\n\n\n",
+      "content": "\n# API Current Status\n\nAPIs marked as \"Good\" and in green have a current 10 minute average response time of below 5000 milliseconds. \n\nAPIs marked as \"Degraded\" and in orange have a current 10 minute average response time of above 5000 milliseconds and below 10000 milliseconds.\n\nAPIs marked as \"Unusable\" and in red have a current 10 minute average response time above 10000 milliseconds.\n\n# Horizon Current Status\n\nHorizon marked as \"Good\" and in green has a current 10 minute average response time of below 8 seconds. \n\nHorizon marked as \"Degraded\" and in orange has a current 10 minute average response time of above 8 seconds and below 12 seconds.\n\nHorizon marked as \"Unusable\" and in red has a current 10 minute average response time above 12 seconds.\n",
      "datasource": "${DS_PROMETHEUS}",
      "gridPos": {
-        "h": 4,
+        "h": 8,
        "w": 12,
        "x": 0,
        "y": 21
@@ -718,7 +719,7 @@
        "h": 3,
        "w": 3,
        "x": 0,
-        "y": 25
+        "y": 29
      },
      "id": 11,
      "interval": null,
@@ -746,15 +747,15 @@
        {
          "from": "0",
          "text": "Good",
-          "to": "4.99"
-        },
-        {
-          "from": "5",
-          "text": "Degraded",
          "to": "7.99"
        },
        {
          "from": "8",
+          "text": "Degraded",
+          "to": "11.99"
+        },
+        {
+          "from": "12",
          "text": "Unusable",
          "to": "600"
        }
@@ -775,7 +776,7 @@
          "refId": "A"
        }
      ],
-      "thresholds": "5,8",
+      "thresholds": "8,12",
      "timeFrom": null,
      "timeShift": null,
      "title": "Horizon Response Time",
@@ -800,7 +801,7 @@
        "#d44a3a"
      ],
      "datasource": "${DS_PROMETHEUS}",
-      "format": "none",
+      "format": "ms",
      "gauge": {
        "maxValue": 100,
        "minValue": 0,
@@ -812,7 +813,7 @@
        "h": 3,
        "w": 3,
        "x": 3,
-        "y": 25
+        "y": 29
      },
      "id": 10,
      "interval": null,
@@ -840,17 +841,17 @@
        {
          "from": "0",
          "text": "Good",
-          "to": "4.99"
+          "to": "4999.99"
        },
        {
-          "from": "5",
+          "from": "5000",
          "text": "Degraded",
-          "to": "9.99"
+          "to": "9999.99"
        },
        {
-          "from": "10",
+          "from": "10000",
          "text": "Unusable",
-          "to": "600"
+          "to": "600000"
        }
      ],
      "sparkline": {
@@ -864,12 +865,12 @@
      "tableColumn": "",
      "targets": [
        {
-          "expr": "avg_over_time(openstack_api_response_seconds{api_name=\"nova\", cloud_name=\"$cloud\"}[10m])",
+          "expr": "avg_over_time(openstack_api_response_milliseconds{api_name=\"nova\", cloud_name=\"$cloud\"}[10m])",
          "instant": true,
          "refId": "A"
        }
      ],
-      "thresholds": "5,10",
+      "thresholds": "5000,10000",
      "timeFrom": null,
      "timeShift": null,
      "title": "Nova API Response Time",
@@ -906,7 +907,7 @@
        "h": 3,
        "w": 3,
        "x": 6,
-        "y": 25
+        "y": 29
      },
      "id": 14,
      "interval": null,
@@ -934,17 +935,17 @@
        {
          "from": "0",
          "text": "Good",
-          "to": "4.99"
+          "to": "4999.99"
        },
        {
-          "from": "5",
+          "from": "5000",
          "text": "Degraded",
-          "to": "9.99"
+          "to": "9999.99"
        },
        {
-          "from": "10",
+          "from": "10000",
          "text": "Unusable",
-          "to": "600"
+          "to": "600000"
        }
      ],
      "sparkline": {
@@ -958,12 +959,12 @@
      "tableColumn": "",
      "targets": [
        {
-          "expr": "avg_over_time(openstack_api_response_seconds{api_name=\"neutron\", cloud_name=\"$cloud\"}[10m])",
+          "expr": "avg_over_time(openstack_api_response_milliseconds{api_name=\"neutron\", cloud_name=\"$cloud\"}[10m])",
          "instant": true,
          "refId": "A"
        }
      ],
-      "thresholds": "5,10",
+      "thresholds": "5000,10000",
      "timeFrom": null,
      "timeShift": null,
      "title": "Neutron API Response Time",
@@ -1000,7 +1001,7 @@
        "h": 3,
        "w": 3,
        "x": 9,
-        "y": 25
+        "y": 29
      },
      "id": 18,
      "interval": null,
@@ -1028,17 +1029,17 @@
        {
          "from": "0",
          "text": "Good",
-          "to": "4.99"
+          "to": "4999.99"
        },
        {
-          "from": "5",
+          "from": "5000",
          "text": "Degraded",
-          "to": "9.99"
+          "to": "9999.99"
        },
        {
-          "from": "10",
+          "from": "10000",
          "text": "Unusable",
-          "to": "600"
+          "to": "600000"
        }
      ],
      "sparkline": {
@@ -1052,12 +1053,12 @@
      "tableColumn": "",
      "targets": [
        {
-          "expr": "avg_over_time(openstack_api_response_seconds{api_name=\"cinder\", cloud_name=\"$cloud\"}[10m])",
+          "expr": "avg_over_time(openstack_api_response_milliseconds{api_name=\"cinder\", cloud_name=\"$cloud\"}[10m])",
          "instant": true,
          "refId": "A"
        }
      ],
-      "thresholds": "5,10",
+      "thresholds": "5000,10000",
      "timeFrom": null,
      "timeShift": null,
      "title": "Cinder API Response Time",
@@ -1094,7 +1095,7 @@
        "h": 3,
        "w": 3,
        "x": 0,
-        "y": 28
+        "y": 32
      },
      "id": 12,
      "interval": null,
@@ -1188,7 +1189,7 @@
        "h": 3,
        "w": 3,
        "x": 3,
-        "y": 28
+        "y": 32
      },
      "id": 13,
      "interval": null,
@@ -1282,7 +1283,7 @@
        "h": 3,
        "w": 3,
        "x": 6,
-        "y": 28
+        "y": 32
      },
      "id": 15,
      "interval": null,
@@ -1376,7 +1377,7 @@
        "h": 3,
        "w": 3,
        "x": 9,
-        "y": 28
+        "y": 32
      },
      "id": 19,
      "interval": null,
@@ -1455,7 +1456,7 @@
        "h": 1,
        "w": 24,
        "x": 0,
-        "y": 31
+        "y": 35
      },
      "id": 28,
      "panels": [],
@@ -1474,7 +1475,7 @@
        "h": 8,
        "w": 12,
        "x": 0,
-        "y": 32
+        "y": 36
      },
      "id": 30,
      "legend": {
@@ -1560,7 +1561,7 @@
        "h": 8,
        "w": 12,
        "x": 12,
-        "y": 32
+        "y": 36
      },
      "id": 31,
      "legend": {
@@ -1646,7 +1647,7 @@
        "h": 8,
        "w": 12,
        "x": 0,
-        "y": 40
+        "y": 44
      },
      "id": 32,
      "legend": {
@@ -1732,7 +1733,7 @@
        "h": 8,
        "w": 12,
        "x": 12,
-        "y": 40
+        "y": 44
      },
      "id": 33,
      "legend": {
@@ -1813,7 +1814,7 @@
        "h": 1,
        "w": 24,
        "x": 0,
-        "y": 48
+        "y": 52
      },
      "id": 6,
      "panels": [],
@@ -1832,7 +1833,7 @@
        "h": 9,
        "w": 12,
        "x": 0,
-        "y": 49
+        "y": 53
      },
      "id": 2,
      "legend": {
@@ -1860,7 +1861,7 @@
      "steppedLine": false,
      "targets": [
        {
-          "expr": "openstack_api_response_seconds{api_name=\"nova\", cloud_name=\"$cloud\"}",
+          "expr": "openstack_api_response_milliseconds{api_name=\"nova\", cloud_name=\"$cloud\"}",
          "legendFormat": "{{ app }}",
          "refId": "A"
        }
@@ -1885,7 +1886,8 @@
      },
      "yaxes": [
        {
-          "format": "short",
+          "decimals": 1,
+          "format": "ms",
          "label": null,
          "logBase": 1,
          "max": null,
@@ -1918,7 +1920,7 @@
        "h": 9,
        "w": 12,
        "x": 12,
-        "y": 49
+        "y": 53
      },
      "id": 16,
      "legend": {
@@ -1946,7 +1948,7 @@
      "steppedLine": false,
      "targets": [
        {
-          "expr": "openstack_api_response_seconds{api_name=\"neutron\", cloud_name=\"$cloud\"}",
+          "expr": "openstack_api_response_milliseconds{api_name=\"neutron\", cloud_name=\"$cloud\"}",
          "legendFormat": "{{ app }}",
          "refId": "A"
        }
@@ -1971,7 +1973,7 @@
      },
      "yaxes": [
        {
-          "format": "short",
+          "format": "ms",
          "label": null,
          "logBase": 1,
          "max": null,
@@ -1979,7 +1981,7 @@
          "show": true
        },
        {
-          "format": "short",
+          "format": "ms",
          "label": null,
          "logBase": 1,
          "max": null,
@@ -2004,7 +2006,7 @@
        "h": 9,
        "w": 12,
        "x": 0,
-        "y": 58
+        "y": 62
      },
      "id": 17,
      "legend": {
@@ -2032,7 +2034,7 @@
      "steppedLine": false,
      "targets": [
        {
-          "expr": "openstack_api_response_seconds{api_name=\"cinder\", cloud_name=\"$cloud\"}",
+          "expr": "openstack_api_response_milliseconds{api_name=\"cinder\", cloud_name=\"$cloud\"}",
          "legendFormat": "{{ app }}",
          "refId": "A"
        }
@@ -2057,7 +2059,7 @@
      },
      "yaxes": [
        {
-          "format": "short",
+          "format": "ms",
          "label": null,
          "logBase": 1,
          "max": null,
@@ -2065,7 +2067,7 @@
          "show": true
        },
        {
-          "format": "short",
+          "format": "ms",
          "label": null,
          "logBase": 1,
          "max": null,
@@ -2090,7 +2092,7 @@
        "h": 9,
        "w": 12,
        "x": 12,
-        "y": 58
+        "y": 62
      },
      "id": 4,
      "legend": {
@@ -2143,7 +2145,7 @@
      },
      "yaxes": [
        {
-          "format": "short",
+          "format": "s",
          "label": null,
          "logBase": 1,
          "max": null,
@@ -2151,7 +2153,7 @@
          "show": true
        },
        {
-          "format": "short",
+          "format": "s",
          "label": null,
          "logBase": 1,
          "max": null,
@@ -2171,7 +2173,7 @@
        "h": 1,
        "w": 24,
        "x": 0,
-        "y": 67
+        "y": 71
      },
      "id": 60,
      "panels": [],
@@ -2190,7 +2192,7 @@
        "h": 8,
        "w": 12,
        "x": 0,
-        "y": 68
+        "y": 72
      },
      "id": 62,
      "legend": {
@@ -2244,7 +2246,7 @@
      },
      "yaxes": [
        {
-          "format": "short",
+          "format": "s",
          "label": null,
          "logBase": 1,
          "max": null,
@@ -2252,7 +2254,7 @@
          "show": true
        },
        {
-          "format": "short",
+          "format": "s",
          "label": null,
          "logBase": 1,
          "max": null,
@@ -2276,14 +2278,14 @@
        "allValue": null,
        "current": {},
        "datasource": "${DS_PROMETHEUS}",
-        "definition": "label_values(openstack_api_response_seconds, cloud_name)",
+        "definition": "label_values(openstack_api_status, cloud_name)",
        "hide": 0,
        "includeAll": false,
        "label": "Cloud",
        "multi": false,
        "name": "cloud",
        "options": [],
-        "query": "label_values(openstack_api_response_seconds, cloud_name)",
+        "query": "label_values(openstack_api_status, cloud_name)",
        "refresh": 1,
        "regex": "",
        "skipUrlSync": false,
@@ -2317,5 +2319,5 @@
  "timezone": "",
  "title": "Openstack Health",
  "uid": "so14pR0Mz",
-  "version": 7
+  "version": 11
 }
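The status panels in the dashboard diff above map a 10 minute average onto "Good", "Degraded", or "Unusable" using two thresholds: 5000 and 10000 milliseconds for the nova, neutron, and cinder APIs, and 8 and 12 seconds for Horizon. The sketch below restates that mapping in Python, assuming each band starts at its threshold as the `from` values in the range maps suggest. The function `classify` is a hypothetical helper for illustration and does not exist in the repository.

```python
def classify(value: float, degraded_at: float, unusable_at: float) -> str:
    """Map an average response time onto the dashboard's status text.

    value, degraded_at, and unusable_at must share one unit
    (milliseconds for the APIs, seconds for Horizon).
    """
    if value < degraded_at:
        return "Good"
    if value < unusable_at:
        return "Degraded"
    return "Unusable"


if __name__ == "__main__":
    # API panels: thresholds in milliseconds.
    print(classify(6200.0, 5000.0, 10000.0))
    # Horizon panel: thresholds in seconds.
    print(classify(6.2, 8.0, 12.0))
```

Passing the thresholds as arguments keeps the unit difference between the API panels and the Horizon panel explicit at each call site.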
--- a/grafana-dashboards/Openstack
+++ b/grafana-dashboards/Openstack
@@ -58,9 +58,23 @@
  "gnetId": null,
  "graphTooltip": 0,
  "id": null,
-  "iteration": 1606268920451,
+  "iteration": 1606916848356,
  "links": [],
  "panels": [
+    {
+      "collapsed": false,
+      "datasource": "${DS_PROMETHEUS}",
+      "gridPos": {
+        "h": 1,
+        "w": 24,
+        "x": 0,
+        "y": 0
+      },
+      "id": 10,
+      "panels": [],
+      "title": "Hypervisor: $hypervisor",
+      "type": "row"
+    },
    {
      "cacheTimeout": null,
      "datasource": "${DS_PROMETHEUS}",
@@ -68,7 +82,7 @@
        "h": 5,
        "w": 3,
        "x": 0,
-        "y": 0
+        "y": 1
      },
      "id": 2,
      "links": [],
@@ -132,7 +146,7 @@
        "h": 5,
        "w": 3,
        "x": 3,
-        "y": 0
+        "y": 1
      },
      "id": 3,
      "links": [],
@@ -196,7 +210,7 @@
        "h": 5,
        "w": 3,
        "x": 6,
-        "y": 0
+        "y": 1
      },
      "id": 4,
      "links": [],
@@ -259,7 +273,7 @@
      "colorValue": false,
      "colors": [
        "#C4162A",
-        "#C4162A",
+        "#73BF69",
        "#299c46"
      ],
      "datasource": "${DS_PROMETHEUS}",
@@ -275,7 +289,7 @@
        "h": 5,
        "w": 3,
        "x": 9,
-        "y": 0
+        "y": 1
      },
      "id": 7,
      "interval": null,
@@ -322,7 +336,7 @@
          "refId": "A"
        }
      ],
-      "thresholds": "1,0",
+      "thresholds": "1,1",
      "timeFrom": null,
      "timeShift": null,
      "title": "Status",
@@ -342,13 +356,266 @@
      ],
      "valueName": "current"
    },
+    {
+      "cacheTimeout": null,
+      "colorBackground": false,
+      "colorValue": false,
+      "colors": [
+        "#299c46",
+        "rgba(237, 129, 40, 0.89)",
+        "#d44a3a"
+      ],
+      "datasource": "${DS_PROMETHEUS}",
+      "format": "decmbytes",
+      "gauge": {
+        "maxValue": 100,
+        "minValue": 0,
+        "show": false,
+        "thresholdLabels": false,
+        "thresholdMarkers": true
+      },
+      "gridPos": {
+        "h": 5,
+        "w": 3,
+        "x": 0,
+        "y": 6
+      },
+      "id": 17,
+      "interval": null,
+      "links": [],
+      "mappingType": 1,
+      "mappingTypes": [
+        {
+          "name": "value to text",
+          "value": 1
+        },
+        {
+          "name": "range to text",
+          "value": 2
+        }
+      ],
+      "maxDataPoints": 100,
+      "nullPointMode": "connected",
+      "nullText": null,
+      "options": {},
+      "pluginVersion": "6.4.1",
+      "postfix": "",
+      "postfixFontSize": "50%",
+      "prefix": "",
+      "prefixFontSize": "50%",
+      "rangeMaps": [
+        {
+          "from": "null",
+          "text": "N/A",
+          "to": "null"
+        }
+      ],
+      "sparkline": {
+        "fillColor": "rgba(31, 118, 189, 0.18)",
+        "full": false,
+        "lineColor": "rgb(31, 120, 193)",
+        "show": false,
+        "ymax": null,
+        "ymin": null
+      },
+      "tableColumn": "",
+      "targets": [
+        {
+          "expr": "(openstack_hypervisor_total_ram_mb{cloud_name=\"$cloud\",hypervisor_hostname=\"$hypervisor\"} - openstack_hypervisor_used_ram_mb{cloud_name=\"$cloud\",hypervisor_hostname=\"$hypervisor\"})",
+          "refId": "A"
+        }
+      ],
+      "thresholds": "",
+      "timeFrom": null,
+      "timeShift": null,
+      "title": "RAM Available",
+      "type": "singlestat",
+      "valueFontSize": "80%",
+      "valueMaps": [
+        {
+          "op": "=",
+          "text": "N/A",
+          "value": "null"
+        }
+      ],
+      "valueName": "avg"
+    },
+    {
+      "cacheTimeout": null,
+      "colorBackground": false,
+      "colorValue": false,
+      "colors": [
+        "#299c46",
+        "rgba(237, 129, 40, 0.89)",
+        "#d44a3a"
+      ],
+      "datasource": "${DS_PROMETHEUS}",
+      "format": "none",
+      "gauge": {
+        "maxValue": 100,
+        "minValue": 0,
+        "show": false,
+        "thresholdLabels": false,
+        "thresholdMarkers": true
+      },
+      "gridPos": {
+        "h": 5,
+        "w": 3,
+        "x": 3,
+        "y": 6
+      },
+      "id": 18,
+      "interval": null,
+      "links": [],
+      "mappingType": 1,
+      "mappingTypes": [
+        {
+          "name": "value to text",
+          "value": 1
+        },
+        {
+          "name": "range to text",
+          "value": 2
+        }
+      ],
+      "maxDataPoints": 100,
+      "nullPointMode": "connected",
+      "nullText": null,
+      "options": {},
+      "pluginVersion": "6.4.1",
+      "postfix": "",
+      "postfixFontSize": "50%",
+      "prefix": "",
+      "prefixFontSize": "50%",
+      "rangeMaps": [
+        {
+          "from": "null",
+          "text": "N/A",
+          "to": "null"
+        }
+      ],
+      "sparkline": {
+        "fillColor": "rgba(31, 118, 189, 0.18)",
+        "full": false,
+        "lineColor": "rgb(31, 120, 193)",
+        "show": false,
+        "ymax": null,
+        "ymin": null
+      },
+      "tableColumn": "",
+      "targets": [
+        {
+          "expr": "openstack_hypervisor_total_cpus{cloud_name=\"$cloud\",hypervisor_hostname=\"$hypervisor\"} - openstack_hypervisor_used_cpus{cloud_name=\"$cloud\",hypervisor_hostname=\"$hypervisor\"}",
+          "refId": "A"
+        }
+      ],
+      "thresholds": "",
+      "timeFrom": null,
+      "timeShift": null,
+      "title": "VCPU Available",
+      "type": "singlestat",
+      "valueFontSize": "80%",
+      "valueMaps": [
+        {
+          "op": "=",
+          "text": "N/A",
+          "value": "null"
+        }
+      ],
+      "valueName": "avg"
+    },
+    {
+      "cacheTimeout": null,
+      "colorBackground": false,
+      "colorValue": false,
+      "colors": [
+        "#299c46",
+        "rgba(237, 129, 40, 0.89)",
+        "#d44a3a"
+      ],
+      "datasource": "${DS_PROMETHEUS}",
+      "decimals": null,
+      "format": "decgbytes",
+      "gauge": {
+        "maxValue": 100,
+        "minValue": 0,
+        "show": false,
+        "thresholdLabels": false,
+        "thresholdMarkers": true
+      },
+      "gridPos": {
+        "h": 5,
+        "w": 3,
+        "x": 6,
+        "y": 6
+      },
+      "id": 19,
+      "interval": null,
+      "links": [],
+      "mappingType": 1,
+      "mappingTypes": [
+        {
+          "name": "value to text",
+          "value": 1
+        },
+        {
+          "name": "range to text",
+          "value": 2
+        }
+      ],
+      "maxDataPoints": 100,
+      "nullPointMode": "connected",
+      "nullText": null,
+      "options": {},
+      "pluginVersion": "6.4.1",
+      "postfix": "",
+      "postfixFontSize": "50%",
+      "prefix": "",
+      "prefixFontSize": "50%",
+      "rangeMaps": [
+        {
+          "from": "null",
+          "text": "N/A",
+          "to": "null"
+        }
+      ],
+      "sparkline": {
+        "fillColor": "rgba(31, 118, 189, 0.18)",
+        "full": false,
+        "lineColor": "rgb(31, 120, 193)",
+        "show": false,
+        "ymax": null,
+        "ymin": null
+      },
+      "tableColumn": "",
+      "targets": [
+        {
+          "expr": "openstack_hypervisor_local_gb_total{cloud_name=\"$cloud\",hypervisor_hostname=\"$hypervisor\"} - openstack_hypervisor_local_gb_used{cloud_name=\"$cloud\",hypervisor_hostname=\"$hypervisor\"}",
+          "refId": "A"
+        }
+      ],
+      "thresholds": "",
+      "timeFrom": null,
+      "timeShift": null,
+      "title": "Local Disk Available",
+      "type": "singlestat",
+      "valueFontSize": "80%",
+      "valueMaps": [
+        {
+          "op": "=",
+          "text": "N/A",
+          "value": "null"
+        }
+      ],
+      "valueName": "avg"
+    },
    {
      "cacheTimeout": null,
      "colorBackground": true,
      "colorValue": false,
      "colors": [
        "#C4162A",
-        "#C4162A",
+        "#37872D",
        "#299c46"
      ],
      "datasource": "${DS_PROMETHEUS}",
@@ -363,8 +630,8 @@
      "gridPos": {
        "h": 5,
        "w": 3,
-        "x": 12,
-        "y": 0
+        "x": 9,
+        "y": 6
      },
      "id": 8,
      "interval": null,
@@ -411,7 +678,7 @@
          "refId": "A"
        }
      ],
-      "thresholds": "1,0",
+      "thresholds": "1,1",
      "timeFrom": null,
      "timeShift": null,
      "title": "State",
@@ -444,7 +711,7 @@
        "h": 5,
        "w": 15,
        "x": 0,
-        "y": 5
+        "y": 11
      },
      "id": 5,
      "legend": {
@@ -519,6 +786,553 @@
        "align": false,
        "alignLevel": null
      }
+    },
+    {
+      "collapsed": false,
+      "datasource": "${DS_PROMETHEUS}",
+      "gridPos": {
+        "h": 1,
+        "w": 24,
+        "x": 0,
+        "y": 16
+      },
+      "id": 12,
+      "panels": [],
+      "title": "Aggregate: $aggregate",
+      "type": "row"
+    },
+    {
+      "cacheTimeout": null,
+      "datasource": "${DS_PROMETHEUS}",
+      "gridPos": {
+        "h": 5,
+        "w": 3,
+        "x": 0,
+        "y": 17
+      },
+      "id": 14,
+      "links": [],
+      "options": {
+        "fieldOptions": {
+          "calcs": [
+            "lastNotNull"
+          ],
+          "defaults": {
+            "mappings": [
+              {
+                "id": 0,
+                "op": "=",
+                "text": "N/A",
+                "type": 1,
+                "value": "null"
+              }
+            ],
+            "max": 1,
+            "min": 0,
+            "nullValueMode": "connected",
+            "thresholds": [
+              {
+                "color": "#299c46",
+                "value": null
+              },
+              {
+                "color": "rgba(237, 129, 40, 0.89)",
+                "value": 0.8
+              },
+              {
+                "color": "#d44a3a",
+                "value": 0.9
+              }
+            ],
+            "unit": "percentunit"
+          },
+          "override": {},
+          "values": false
+        },
+        "orientation": "horizontal",
+        "showThresholdLabels": false,
+        "showThresholdMarkers": true
+      },
+      "pluginVersion": "6.4.1",
+      "targets": [
+        {
+          "expr": "sum(openstack_hypervisor_used_ram_mb{cloud_name=\"$cloud\",aggregate=\"$aggregate\"}) / sum(openstack_hypervisor_total_ram_mb{cloud_name=\"$cloud\",aggregate=\"$aggregate\"})",
+          "refId": "A"
+        }
+      ],
+      "timeFrom": null,
+      "timeShift": null,
+      "title": "Aggregate RAM Usage",
+      "type": "gauge"
+    },
+    {
+      "cacheTimeout": null,
+      "datasource": "${DS_PROMETHEUS}",
+      "gridPos": {
+        "h": 5,
+        "w": 3,
+        "x": 3,
+        "y": 17
+      },
+      "id": 15,
+      "links": [],
+      "options": {
+        "fieldOptions": {
+          "calcs": [
+            "lastNotNull"
+          ],
+          "defaults": {
+            "mappings": [
+              {
+                "id": 0,
+                "op": "=",
+                "text": "N/A",
+                "type": 1,
+                "value": "null"
+              }
+            ],
+            "max": 1,
+            "min": 0,
+            "nullValueMode": "connected",
+            "thresholds": [
+              {
+                "color": "#299c46",
+                "value": null
+              },
+              {
+                "color": "rgba(237, 129, 40, 0.89)",
+                "value": 0.8
+              },
+              {
+                "color": "#d44a3a",
+                "value": 0.9
+              }
+            ],
+            "unit": "percentunit"
+          },
+          "override": {},
+          "values": false
+        },
+        "orientation": "horizontal",
+        "showThresholdLabels": false,
+        "showThresholdMarkers": true
+      },
+      "pluginVersion": "6.4.1",
+      "targets": [
+        {
+          "expr": "sum(openstack_hypervisor_used_cpus{cloud_name=\"$cloud\",aggregate=\"$aggregate\"}) / sum(openstack_hypervisor_total_cpus{cloud_name=\"$cloud\",aggregate=\"$aggregate\"})",
+          "refId": "A"
+        }
+      ],
+      "timeFrom": null,
+      "timeShift": null,
+      "title": "Aggregate VCPU Usage",
+      "type": "gauge"
+    },
+    {
+      "cacheTimeout": null,
+      "datasource": "${DS_PROMETHEUS}",
+      "gridPos": {
+        "h": 5,
+        "w": 3,
+        "x": 6,
+        "y": 17
+      },
+      "id": 16,
+      "links": [],
+      "options": {
+        "fieldOptions": {
+          "calcs": [
+            "lastNotNull"
+          ],
+          "defaults": {
+            "mappings": [
+              {
+                "id": 0,
+                "op": "=",
+                "text": "N/A",
+                "type": 1,
+                "value": "null"
+              }
+            ],
+            "max": 1,
+            "min": 0,
+            "nullValueMode": "connected",
+            "thresholds": [
+              {
+                "color": "#299c46",
+                "value": null
+              },
+              {
+                "color": "rgba(237, 129, 40, 0.89)",
+                "value": 0.8
+              },
+              {
+                "color": "#d44a3a",
+                "value": 0.9
+              }
+            ],
+            "unit": "percentunit"
+          },
+          "override": {},
+          "values": false
+        },
+        "orientation": "horizontal",
+        "showThresholdLabels": false,
+        "showThresholdMarkers": true
+      },
+      "pluginVersion": "6.4.1",
+      "targets": [
+        {
+          "expr": "sum(openstack_hypervisor_local_gb_used{cloud_name=\"$cloud\",aggregate=\"$aggregate\"}) / sum(openstack_hypervisor_local_gb_total{cloud_name=\"$cloud\",aggregate=\"$aggregate\"})",
+          "refId": "A"
+        }
+      ],
+      "timeFrom": null,
+      "timeShift": null,
+      "title": "Aggregate Local Disk Usage",
+      "type": "gauge"
+    },
+    {
+      "cacheTimeout": null,
+      "colorBackground": false,
+      "colorValue": false,
+      "colors": [
+        "#299c46",
+        "rgba(237, 129, 40, 0.89)",
+        "#d44a3a"
+      ],
+      "datasource": "${DS_PROMETHEUS}",
+      "format": "decmbytes",
+      "gauge": {
+        "maxValue": 100,
+        "minValue": 0,
+        "show": false,
+        "thresholdLabels": false,
+        "thresholdMarkers": true
+      },
+      "gridPos": {
+        "h": 5,
+        "w": 3,
+        "x": 0,
+        "y": 22
+      },
+      "id": 20,
+      "interval": null,
+      "links": [],
+      "mappingType": 1,
+      "mappingTypes": [
+        {
+          "name": "value to text",
+          "value": 1
+        },
+        {
+          "name": "range to text",
+          "value": 2
+        }
+      ],
+      "maxDataPoints": 100,
+      "nullPointMode": "connected",
+      "nullText": null,
+      "options": {},
+      "pluginVersion": "6.4.1",
+      "postfix": "",
+      "postfixFontSize": "50%",
+      "prefix": "",
+      "prefixFontSize": "50%",
+      "rangeMaps": [
+        {
+          "from": "null",
+          "text": "N/A",
+          "to": "null"
+        }
+      ],
+      "sparkline": {
+        "fillColor": "rgba(31, 118, 189, 0.18)",
+        "full": false,
+        "lineColor": "rgb(31, 120, 193)",
+        "show": false,
+        "ymax": null,
+        "ymin": null
+      },
+      "tableColumn": "",
+      "targets": [
+        {
+          "expr": "sum(openstack_hypervisor_total_ram_mb{cloud_name=\"$cloud\",aggregate=\"$aggregate\"}) - sum(openstack_hypervisor_used_ram_mb{cloud_name=\"$cloud\",aggregate=\"$aggregate\"})",
+          "refId": "A"
+        }
+      ],
+      "thresholds": "",
+      "timeFrom": null,
+      "timeShift": null,
+      "title": "Aggregate RAM Available",
+      "type": "singlestat",
+      "valueFontSize": "80%",
+      "valueMaps": [
+        {
+          "op": "=",
+          "text": "N/A",
+          "value": "null"
+        }
+      ],
+      "valueName": "avg"
+    },
+    {
+      "cacheTimeout": null,
+      "colorBackground": false,
+      "colorValue": false,
+      "colors": [
+        "#299c46",
+        "rgba(237, 129, 40, 0.89)",
+        "#d44a3a"
+      ],
+      "datasource": "${DS_PROMETHEUS}",
+      "format": "none",
+      "gauge": {
+        "maxValue": 100,
+        "minValue": 0,
+        "show": false,
+        "thresholdLabels": false,
+        "thresholdMarkers": true
+      },
+      "gridPos": {
+        "h": 5,
+        "w": 3,
+        "x": 3,
+        "y": 22
+      },
+      "id": 21,
+      "interval": null,
+      "links": [],
+      "mappingType": 1,
+      "mappingTypes": [
+        {
+          "name": "value to text",
+          "value": 1
+        },
+        {
+          "name": "range to text",
+          "value": 2
+        }
+      ],
+      "maxDataPoints": 100,
+      "nullPointMode": "connected",
+      "nullText": null,
+      "options": {},
+      "pluginVersion": "6.4.1",
+      "postfix": "",
+      "postfixFontSize": "50%",
+      "prefix": "",
+      "prefixFontSize": "50%",
+      "rangeMaps": [
+        {
+          "from": "null",
+          "text": "N/A",
+          "to": "null"
+        }
+      ],
+      "sparkline": {
+        "fillColor": "rgba(31, 118, 189, 0.18)",
+        "full": false,
+        "lineColor": "rgb(31, 120, 193)",
+        "show": false,
+        "ymax": null,
+        "ymin": null
+      },
+      "tableColumn": "",
+      "targets": [
+        {
+          "expr": "sum(openstack_hypervisor_total_cpus{cloud_name=\"$cloud\",aggregate=\"$aggregate\"}) - sum(openstack_hypervisor_used_cpus{cloud_name=\"$cloud\",aggregate=\"$aggregate\"})",
+          "refId": "A"
+        }
+      ],
+      "thresholds": "",
+      "timeFrom": null,
+      "timeShift": null,
+      "title": "Aggregate VCPU Available",
+      "type": "singlestat",
+      "valueFontSize": "80%",
+      "valueMaps": [
+        {
+          "op": "=",
+          "text": "N/A",
+          "value": "null"
+        }
+      ],
+      "valueName": "avg"
+    },
+    {
+      "cacheTimeout": null,
+      "colorBackground": false,
+      "colorValue": false,
+      "colors": [
+        "#299c46",
+        "rgba(237, 129, 40, 0.89)",
+        "#d44a3a"
+      ],
+      "datasource": "${DS_PROMETHEUS}",
+      "format": "decgbytes",
+      "gauge": {
+        "maxValue": 100,
+        "minValue": 0,
+        "show": false,
+        "thresholdLabels": false,
+        "thresholdMarkers": true
+      },
+      "gridPos": {
+        "h": 5,
+        "w": 3,
+        "x": 6,
+        "y": 22
+      },
+      "id": 22,
+      "interval": null,
+      "links": [],
+      "mappingType": 1,
+      "mappingTypes": [
+        {
+          "name": "value to text",
+          "value": 1
+        },
+        {
+          "name": "range to text",
+          "value": 2
+        }
+      ],
+      "maxDataPoints": 100,
+      "nullPointMode": "connected",
+      "nullText": null,
+      "options": {},
+      "pluginVersion": "6.4.1",
+      "postfix": "",
+      "postfixFontSize": "50%",
+      "prefix": "",
+      "prefixFontSize": "50%",
+      "rangeMaps": [
+        {
+          "from": "null",
+          "text": "N/A",
+          "to": "null"
+        }
+      ],
+      "sparkline": {
+        "fillColor": "rgba(31, 118, 189, 0.18)",
+        "full": false,
+        "lineColor": "rgb(31, 120, 193)",
+        "show": false,
+        "ymax": null,
+        "ymin": null
+      },
+      "tableColumn": "",
+      "targets": [
+        {
+          "expr": "sum(openstack_hypervisor_local_gb_total{cloud_name=\"$cloud\",aggregate=\"$aggregate\"}) - sum(openstack_hypervisor_local_gb_used{cloud_name=\"$cloud\",aggregate=\"$aggregate\"})",
+          "refId": "A"
+        }
+      ],
+      "thresholds": "",
+      "timeFrom": null,
+      "timeShift": null,
+      "title": "Aggregate Local Disk Available",
+      "type": "singlestat",
+      "valueFontSize": "80%",
+      "valueMaps": [
+        {
+          "op": "=",
+          "text": "N/A",
+          "value": "null"
+        }
+      ],
+      "valueName": "avg"
+    },
+    {
+      "aliasColors": {},
+      "bars": false,
+      "cacheTimeout": null,
+      "dashLength": 10,
+      "dashes": false,
+      "datasource": "${DS_PROMETHEUS}",
+      "fill": 1,
+      "fillGradient": 0,
+      "gridPos": {
+        "h": 5,
+        "w": 15,
+        "x": 0,
+        "y": 27
+      },
+      "id": 13,
+      "legend": {
+        "avg": true,
+        "current": true,
+        "max": true,
+        "min": true,
+        "show": true,
+        "total": false,
+        "values": true
+      },
+      "lines": true,
+      "linewidth": 1,
+      "links": [],
+      "nullPointMode": "null",
+      "options": {
+        "dataLinks": []
+      },
+      "percentage": false,
+      "pluginVersion": "6.4.1",
+      "pointradius": 2,
+      "points": false,
+      "renderer": "flot",
+      "seriesOverrides": [],
+      "spaceLength": 10,
+      "stack": false,
+      "steppedLine": false,
+      "targets": [
+        {
+          "expr": "sum(openstack_hypervisor_running_vms{cloud_name=\"$cloud\",aggregate=\"$aggregate\"})",
+          "legendFormat": "VMs",
+          "refId": "A"
+        }
+      ],
+      "thresholds": [],
+      "timeFrom": null,
+      "timeRegions": [],
+      "timeShift": null,
+      "title": "Running VMs",
+      "tooltip": {
+        "shared": true,
+        "sort": 0,
+        "value_type": "individual"
+      },
+      "type": "graph",
+      "xaxis": {
+        "buckets": null,
+        "mode": "time",
+        "name": null,
+        "show": true,
+        "values": []
+      },
+      "yaxes": [
+        {
+          "format": "short",
+          "label": null,
+          "logBase": 1,
+          "max": null,
+          "min": null,
+          "show": true
+        },
+        {
+          "format": "short",
+          "label": null,
+          "logBase": 1,
+          "max": null,
+          "min": null,
+          "show": true
+        }
+      ],
+      "yaxis": {
+        "align": false,
+        "alignLevel": null
+      }
    }
  ],
  "schemaVersion": 20,
@@ -552,14 +1366,36 @@
        "allValue": null,
        "current": {},
        "datasource": "${DS_PROMETHEUS}",
-        "definition": "label_values(openstack_hypervisor_used_ram_mb{cloud_name=~\"$cloud\"}, hypervisor_hostname)",
+        "definition": "label_values(openstack_hypervisor_used_ram_mb{cloud_name=~\"$cloud\"}, aggregate)",
+        "hide": 0,
+        "includeAll": false,
+        "label": "Aggregate",
+        "multi": false,
+        "name": "aggregate",
+        "options": [],
+        "query": "label_values(openstack_hypervisor_used_ram_mb{cloud_name=~\"$cloud\"}, aggregate)",
+        "refresh": 1,
+        "regex": "",
+        "skipUrlSync": false,
+        "sort": 0,
+        "tagValuesQuery": "",
+        "tags": [],
+        "tagsQuery": "",
+        "type": "query",
+        "useTags": false
+      },
+      {
+        "allValue": null,
+        "current": {},
+        "datasource": "${DS_PROMETHEUS}",
+        "definition": "label_values(openstack_hypervisor_used_ram_mb{cloud_name=~\"$cloud\", aggregate=~\"$aggregate\"}, hypervisor_hostname)",
        "hide": 0,
        "includeAll": false,
        "label": "Hypervisor",
        "multi": false,
        "name": "hypervisor",
        "options": [],
-        "query": "label_values(openstack_hypervisor_used_ram_mb{cloud_name=~\"$cloud\"}, hypervisor_hostname)",
+        "query": "label_values(openstack_hypervisor_used_ram_mb{cloud_name=~\"$cloud\", aggregate=~\"$aggregate\"}, hypervisor_hostname)",
        "refresh": 1,
        "regex": "",
        "skipUrlSync": false,
@@ -573,7 +1409,7 @@
    ]
  },
  "time": {
-    "from": "now-24h",
+    "from": "now-1h",
    "to": "now"
  },
  "timepicker": {
@@ -593,5 +1429,5 @@
  "timezone": "",
  "title": "Openstack Hypervisors",
  "uid": "JdR8Hk0Mk",
-  "version": 5
+  "version": 10
 }
--- a/lib/api_metrics.py
+++ b/lib/api_metrics.py
@@ -5,19 +5,20 @@ import datetime
 import traceback
 import prometheus_client as prom

-api_metrics = prom.Gauge('openstack_api_response_seconds', 'Time for openstack api to execute.', ['api_name','cloud_name'])
+api_metrics = prom.Gauge('openstack_api_response_milliseconds', 'Time for openstack api to execute in milliseconds.', ['api_name','cloud_name'])
 api_status = prom.Gauge('openstack_api_status', 'API current status. 1 = up 0 = down.',['api_name','cloud_name'])

 def generate_nova_metrics(connection,cloud_name):
    try:
        start_time = datetime.datetime.now()
        for server in connection.compute.servers():
-            name = server
+            name = server.name
+            break
        end_time = datetime.datetime.now()
        time_took = end_time - start_time
-        seconds_took = time_took.seconds
-        print(f'Nova took {seconds_took} seconds')
-        api_metrics.labels('nova',cloud_name).set(seconds_took)
+        milliseconds_took = time_took.total_seconds() * 1000
+        print(f'Nova took {milliseconds_took} milliseconds')
+        api_metrics.labels('nova',cloud_name).set(milliseconds_took)
        api_status.labels('nova',cloud_name).set(1)
    except:
        print(traceback.print_exc())
@@ -29,12 +30,13 @@ def generate_neutron_metrics(connection,cloud_name):
        project = connection.current_project
        start_time = datetime.datetime.now()
        for network in connection.network.networks(project_id=project.id):
-            name = network
+            name = network.name
+            break
        end_time = datetime.datetime.now()
        time_took = end_time - start_time
-        seconds_took = time_took.seconds
-        print(f'Neutron took {seconds_took} seconds')
-        api_metrics.labels('neutron',cloud_name).set(seconds_took)
+        milliseconds_took = time_took.total_seconds() * 1000
+        print(f'Neutron took {milliseconds_took} milliseconds')
+        api_metrics.labels('neutron',cloud_name).set(milliseconds_took)
        api_status.labels('neutron',cloud_name).set(1)
    except:
        print(traceback.print_exc())
@@ -45,12 +47,13 @@ def generate_cinder_metrics(connection,cloud_name):
    try:
        start_time = datetime.datetime.now()
        for volume in  connection.volume.volumes():
-            name = volume
+            name = volume.name
+            break
        end_time = datetime.datetime.now()
        time_took = end_time - start_time
-        seconds_took = time_took.seconds
-        print(f'Cinder took {seconds_took} seconds')
-        api_metrics.labels('cinder',cloud_name).set(seconds_took)
+        milliseconds_took = time_took.total_seconds() * 1000
+        print(f'Cinder took {milliseconds_took} milliseconds')
+        api_metrics.labels('cinder',cloud_name).set(milliseconds_took)
        api_status.labels('cinder',cloud_name).set(1)
    except:
        print(traceback.print_exc())
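
Review note on the response-time gauge: on a `datetime.timedelta`, `.microseconds` holds only the sub-second component, while `total_seconds()` covers the whole interval. The sketch below shows the difference on a fixed 2.5 second interval. `elapsed_ms` is a hypothetical helper name for illustration and is not part of this change.

```python
import datetime


def elapsed_ms(start, end):
    """Return the full interval between two datetimes in milliseconds."""
    return (end - start).total_seconds() * 1000


# Fixed timestamps so the result does not depend on the machine running it.
start = datetime.datetime(2021, 1, 1, 12, 0, 0)
end = start + datetime.timedelta(seconds=2, milliseconds=500)
delta = end - start

print(delta.microseconds / 1000)  # sub-second component only
print(elapsed_ms(start, end))     # whole interval
```

For an API call that takes longer than one second, only the second form reports the full duration.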
--- a/lib/horizon.py
+++ b/lib/horizon.py
@@ -14,6 +14,31 @@ openstack_password = os.getenv('OS_PASSWORD')
 api_metrics = prom.Gauge('openstack_horizon_response_seconds', 'Time for horizon login via Chrome.', ['cloud_name'])
 api_status = prom.Gauge('openstack_horizon_status', 'Horizon current status. 1 = up 0 = down.',['cloud_name'])

+def quit_driver_and_reap_children(driver):
+    # inside docker, driver.quit() leaves behind defunct pids
+    # this can cause your system to run out of processes
+    # docker exec -it container_name ps -ef
+    # will show many defunct Chromium processes
+    print('Quitting session: %s' % driver.session_id)
+    driver.quit()
+    try:
+        pid = True
+        while pid:
+            pid = os.waitpid(-1, os.WNOHANG)
+            print("Reaped child: %s" % str(pid))
+
+            # Avoid an infinite loop: waitpid returns (0, 0) when children remain but none has exited yet
+            try:
+                if pid[0] == 0:
+                    pid = False
+            except:
+                pass
+            #---- ----
+
+    except ChildProcessError:
+        pass
+
+
 def get_metrics(horizon_url,cloud_name):
    chrome_options = webdriver.ChromeOptions()
    chrome_options.binary_location = "/usr/bin/chromium"
@@ -70,7 +95,8 @@ def get_metrics(horizon_url,cloud_name):
        api_status.labels(cloud_name).set(1)
    except:
        print("Timed out waiting for login to load")
-        api_status.labels(cloud_name).set(1)
+        api_status.labels(cloud_name).set(0)

    finally:
-        driver.quit()
+        print("Closing chrome")
+        quit_driver_and_reap_children(driver)
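
Review note on the reaping loop: `os.waitpid(-1, os.WNOHANG)` returns a `(pid, status)` tuple, returns `(0, 0)` when children exist but none has exited yet, and raises `ChildProcessError` when there are no children at all. A minimal sketch of the same idea with the tuple unpacked, assuming a POSIX system; `reap_children` is a hypothetical name and not the function added in this diff.

```python
import os


def reap_children():
    """Collect already-exited child processes without blocking.

    Returns the list of reaped pids.
    """
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no child processes left at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped
```

Unpacking the tuple removes the need for the inner `try`/`except` around `pid[0]`, and the loop exits explicitly in both terminal cases.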
--- a/lib/hypervisor_metrics.py
+++ b/lib/hypervisor_metrics.py
@@ -4,33 +4,41 @@ import openstack
 import datetime
 import prometheus_client as prom

-hypervisor_running_vms = prom.Gauge('openstack_hypervisor_running_vms', 'Number of VMs running on this hypervisor.',['hypervisor_hostname','cloud_name'])
-hypervisor_used_ram_mb = prom.Gauge('openstack_hypervisor_used_ram_mb', 'Total MB of used RAM on the hypervisor.',['hypervisor_hostname','cloud_name'])
-hypervisor_total_ram_mb = prom.Gauge('openstack_hypervisor_total_ram_mb', 'Total MB of RAM on the hypervisor.',['hypervisor_hostname','cloud_name'])
-hypervisor_used_cpus = prom.Gauge('openstack_hypervisor_used_cpus', 'Total VCPUs used on the hypervisor.',['hypervisor_hostname','cloud_name'])
-hypervisor_total_cpus = prom.Gauge('openstack_hypervisor_total_cpus', 'Total VCPUs on the hypervisor.',['hypervisor_hostname','cloud_name'])
-hypervisor_enabled = prom.Gauge('openstack_hypervisor_enabled', 'nova-compute service status on hypervisor. 1 is enabled 0 is disabled.',['hypervisor_hostname','cloud_name'])
-hypervisor_up = prom.Gauge('openstack_hypervisor_up', 'nova-compute service state on hypervisor. 1 is up 0 is down.',['hypervisor_hostname','cloud_name'])
-hypervisor_local_gb_total = prom.Gauge('openstack_hypervisor_local_gb_total', 'Total local disk in GB.',['hypervisor_hostname','cloud_name'])
-hypervisor_local_gb_used = prom.Gauge('openstack_hypervisor_local_gb_used', 'Used local disk in GB.',['hypervisor_hostname','cloud_name'])
+hypervisor_running_vms = prom.Gauge('openstack_hypervisor_running_vms', 'Number of VMs running on this hypervisor.',['hypervisor_hostname','cloud_name','aggregate'])
+hypervisor_used_ram_mb = prom.Gauge('openstack_hypervisor_used_ram_mb', 'Total MB of used RAM on the hypervisor.',['hypervisor_hostname','cloud_name','aggregate'])
+hypervisor_total_ram_mb = prom.Gauge('openstack_hypervisor_total_ram_mb', 'Total MB of RAM on the hypervisor.',['hypervisor_hostname','cloud_name','aggregate'])
+hypervisor_used_cpus = prom.Gauge('openstack_hypervisor_used_cpus', 'Total VCPUs used on the hypervisor.',['hypervisor_hostname','cloud_name','aggregate'])
+hypervisor_total_cpus = prom.Gauge('openstack_hypervisor_total_cpus', 'Total VCPUs on the hypervisor.',['hypervisor_hostname','cloud_name','aggregate'])
+hypervisor_enabled = prom.Gauge('openstack_hypervisor_enabled', 'nova-compute service status on hypervisor. 1 is enabled 0 is disabled.',['hypervisor_hostname','cloud_name','aggregate'])
+hypervisor_up = prom.Gauge('openstack_hypervisor_up', 'nova-compute service state on hypervisor. 1 is up 0 is down.',['hypervisor_hostname','cloud_name','aggregate'])
+hypervisor_local_gb_total = prom.Gauge('openstack_hypervisor_local_gb_total', 'Total local disk in GB.',['hypervisor_hostname','cloud_name','aggregate'])
+hypervisor_local_gb_used = prom.Gauge('openstack_hypervisor_local_gb_used', 'Used local disk in GB.',['hypervisor_hostname','cloud_name','aggregate'])

 def generate_hypervisor_metrics(connection,cloud_name):
    for hypervisor in connection.list_hypervisors():
        print(f'Getting hypervisor {hypervisor.name} metrics.')
+        aggregate_member = ""
+        for aggregate in connection.list_aggregates():
+            for host in aggregate.hosts:
+                if host == hypervisor.service_details['host']:
+                    aggregate_member = aggregate.name
+        if aggregate_member == "":
+            aggregate_member = "None"
+        print(f"Hypervisor {hypervisor.name} is a member of aggregate {aggregate_member}")
        # See: https://opendev.org/openstack/openstacksdk/src/branch/master/openstack/compute/v2/hypervisor.py
-        hypervisor_running_vms.labels(hypervisor.name,cloud_name).set(hypervisor.running_vms)
-        hypervisor_used_ram_mb.labels(hypervisor.name,cloud_name).set(hypervisor.memory_used)
-        hypervisor_total_ram_mb.labels(hypervisor.name,cloud_name).set(hypervisor.memory_size)
-        hypervisor_used_cpus.labels(hypervisor.name,cloud_name).set(hypervisor.vcpus_used)
-        hypervisor_total_cpus.labels(hypervisor.name,cloud_name).set(hypervisor.vcpus)
-        hypervisor_local_gb_total.labels(hypervisor.name,cloud_name).set(hypervisor.local_disk_size)
-        hypervisor_local_gb_used.labels(hypervisor.name,cloud_name).set(hypervisor.local_disk_used)
+        hypervisor_running_vms.labels(hypervisor.name,cloud_name,aggregate_member).set(hypervisor.running_vms)
+        hypervisor_used_ram_mb.labels(hypervisor.name,cloud_name,aggregate_member).set(hypervisor.memory_used)
+        hypervisor_total_ram_mb.labels(hypervisor.name,cloud_name,aggregate_member).set(hypervisor.memory_size)
+        hypervisor_used_cpus.labels(hypervisor.name,cloud_name,aggregate_member).set(hypervisor.vcpus_used)
+        hypervisor_total_cpus.labels(hypervisor.name,cloud_name,aggregate_member).set(hypervisor.vcpus)
+        hypervisor_local_gb_total.labels(hypervisor.name,cloud_name,aggregate_member).set(hypervisor.local_disk_size)
+        hypervisor_local_gb_used.labels(hypervisor.name,cloud_name,aggregate_member).set(hypervisor.local_disk_used)

        if hypervisor.status == "enabled":
-            hypervisor_enabled.labels(hypervisor.name,cloud_name).set(1)
+            hypervisor_enabled.labels(hypervisor.name,cloud_name,aggregate_member).set(1)
        else:
-            hypervisor_enabled.labels(hypervisor.name,cloud_name).set(0)
+            hypervisor_enabled.labels(hypervisor.name,cloud_name,aggregate_member).set(0)
        if hypervisor.state == "up":
-            hypervisor_up.labels(hypervisor.name,cloud_name).set(1)
+            hypervisor_up.labels(hypervisor.name,cloud_name,aggregate_member).set(1)
        else:
-            hypervisor_up.labels(hypervisor.name,cloud_name).set(0)
+            hypervisor_up.labels(hypervisor.name,cloud_name,aggregate_member).set(0)
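
Review note on the aggregate lookup: the loop above calls `connection.list_aggregates()` once per hypervisor. One possible alternative is to build a host-to-aggregate map once and look each hypervisor up in it. The sketch below uses `SimpleNamespace` objects as stand-ins for the SDK's aggregate objects (an assumption: only the `name` and `hosts` attributes are modelled), and `build_host_aggregate_map` is a hypothetical helper name.

```python
from types import SimpleNamespace


def build_host_aggregate_map(aggregates):
    """Map each host name to the name of an aggregate that contains it."""
    host_to_aggregate = {}
    for aggregate in aggregates:
        for host in aggregate.hosts or []:
            # A host in several aggregates keeps the last one seen,
            # which matches the behaviour of the nested loop in the diff.
            host_to_aggregate[host] = aggregate.name
    return host_to_aggregate


# Stand-in data; real objects would come from connection.list_aggregates().
aggregates = [
    SimpleNamespace(name="gpu", hosts=["compute-1", "compute-2"]),
    SimpleNamespace(name="general", hosts=["compute-3"]),
]
lookup = build_host_aggregate_map(aggregates)

print(lookup.get("compute-1", "None"))  # host that belongs to an aggregate
print(lookup.get("compute-9", "None"))  # host in no aggregate falls back to "None"
```

The `"None"` fallback mirrors the label value the diff assigns to hypervisors that are not in any aggregate.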
--- a/lib/instance_deploy.py
+++ b/lib/instance_deploy.py
@@ -60,10 +60,10 @@ def get_network(connection, network):
        return None

 def cleanup(connection, instance_name):
-    print(f"Cleaning up {instance_name} instance.")
-    server = connection.compute.find_server(instance_name)
+    print(f"Cleaning up all instances with the name {instance_name}.")
+    servers = connection.compute.servers(all_projects=True,name=instance_name)

-    if server:
+    for server in servers:
        try:
            connection.compute.delete_server(server.id)
        except:
@@ -83,7 +83,17 @@ def create_instance(connection, flavor, image, network, hypervisor):
            name=f"{instance_name}",
            availability_zone=availability_zone,
        )
-        server = connection.compute.wait_for_server(server, status="ACTIVE", wait=600)
+        server = connection.compute.wait_for_server(
+            server,
+            status="ACTIVE",
+            wait=600,
+            failures=[
+                "ERROR",
+                "PAUSED",
+                "SUSPENDED",
+                "UNKNOWN",
+            ],
+        )
        ip_address = server.addresses[network.name][0]['addr']
        if wait_for_ping(ip_address) is True:
            return True
--- a/openstack_exporter.py
+++ b/openstack_exporter.py
@@ -81,7 +81,7 @@ def parse_cli_arguments():

 if __name__ == '__main__':
    print("Starting server on port 8000")
-    prom.start_http_server(8000)
+    prom.start_http_server(port=8000, addr='0.0.0.0')
    args = parse_cli_arguments()
    while True:
        try:
@@ -95,9 +95,11 @@ if __name__ == '__main__':
                instance_deploy.get_metrics(connection, args.flavor, args.image, args.network,args.cloud_name)
            if args.horizon_url is not None:
                horizon.get_metrics(args.horizon_url, args.cloud_name)
+            connection.close()
            print("Waiting 30 seconds to gather more metrics.")
            time.sleep(30)
        except Exception:
+            connection.close()
            print(traceback.print_exc())
            print("Waiting 30 seconds to gather more metrics.")
            time.sleep(30)
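
Review note on closing the connection: the hunk above calls `connection.close()` in both the success path and the `except` path. A `try`/`finally` block is one way to express that in a single place, and it also covers the case where the connection was never created. This is a sketch under stated assumptions: `collect_once`, `FakeConnection`, and `failing_collector` are hypothetical names, and the fake models only `close()`.

```python
def collect_once(connect, collectors):
    """Run every collector against one connection and always close it."""
    connection = None
    try:
        connection = connect()
        for collect in collectors:
            collect(connection)
    finally:
        # If connect() itself failed there is nothing to close.
        if connection is not None:
            connection.close()


class FakeConnection:
    """Stand-in for an SDK connection; only close() is modelled."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def failing_collector(connection):
    raise RuntimeError("simulated collector failure")


conn = FakeConnection()
try:
    collect_once(lambda: conn, [failing_collector])
except RuntimeError as exc:
    print(f"collector failed: {exc}")
print(conn.closed)  # the connection is closed even though a collector raised
```

The caller keeps its own `except` and sleep logic; only the cleanup moves into `finally`.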
Author	SHA1	Message	Date
Jacob Cody Wimer	992c80c8bc	Define the bind mapping for prometheus http server	2021-03-10 13:36:27 -05:00
Jacob Cody Wimer	c6b1b9f211	Changed api response time to milliseconds, updated grafan dashbaords, and updated README to reflect that.	2020-12-25 09:57:18 -05:00
Jacob Cody Wimer	c9c7925e85	Added logic to capture instance deploy failures	2020-12-25 09:57:18 -05:00
Jacob Cody Wimer	106f69083b	Fix defunct chromium processes in docker	2020-12-11 15:37:04 -05:00
Jacob Cody Wimer	49e7ba3031	Fix bug with horizon status	2020-12-11 15:11:51 -05:00
Jacob Cody Wimer	f34abc1a6d	Close openstack connections to prevent pid growth	2020-12-11 15:11:35 -05:00
Jacob Cody Wimer	211df849c2	Better table view in README	2020-12-03 20:01:06 -05:00
Jacob Cody Wimer	16f45519e9	Added aggregate metrics	2020-12-03 19:56:40 -05:00
Jacob Cody Wimer	fb04acd27d	Clean up all instances with that name	2020-12-03 19:55:33 -05:00