azuredatastudio/extensions/mssql/notebooks/TSG/cluster-status.ipynb

{
    "metadata": {
        "kernelspec": {
            "name": "python3",
            "display_name": "Python 3"
        },
        "language_info": {
            "name": "python",
            "version": "3.6.6",
            "mimetype": "text/x-python",
            "codemirror_mode": {
                "name": "ipython",
                "version": 3
            },
            "pygments_lexer": "ipython3",
            "nbconvert_exporter": "python",
            "file_extension": ".py"
        }
    },
    "nbformat_minor": 2,
    "nbformat": 4,
    "cells": [
        {
            "cell_type": "markdown",
            "source": [
                "![11811317_10153406249401648_2787740058697948111_n](https://raw.githubusercontent.com/Microsoft/sqlworkshops/master/graphics/solutions-microsoft-logo-small.png)\n",
                "\n",
                "# View the status of your SQL Server Big Data Cluster\n",
                "This notebook allows you to see the status of the controller, master instance, and pools in your SQL Server Big Data Cluster.\n",
                "\n",
                "> ## **Important Instructions**\n",
                "> ### **Before you begin, you will need:**\n",
                ">* Big Data Cluster name\n",
                ">* Controller username\n",
                ">* Controller password\n",
                ">* Controller endpoint \n",
                "\n",
                "You can find the controller endpoint from the SQL Big Data Cluster dashboard in the Service Endpoints table. The endpoint is listed as **Cluster Management Service.**\n",
                "\n",
                "If you do not know the credentials, ask the admin who deployed your cluster.\n",
                "\n",
                "### **Prerequisites**\n",
                "Ensure the following tools are installed and added to PATH before proceeding.\n",
                "\n",
                "|Tools|Description|Installation|\n",
                "|---|---|---|\n",
                "|kubectl | Command-line tool for monitoring the underlying Kubernetes cluster | [Installation](https://kubernetes.io/docs/tasks/tools/install-kubectl/#install-kubectl-binary-using-native-package-management) |\n",
                "|azdata | Command-line tool for installing and managing a big data cluster |[Installation](https://docs.microsoft.com/en-us/sql/big-data-cluster/deploy-install-azdata?view=sqlallproducts-allversions) |\n",
                "|Pandas Package | Python package for data manipulation | Will be installed by the notebook if not present |\n",
                "\n",
                "\n",
                "### **Instructions**\n",
                "* For the best experience, click **Run Cells** on the toolbar above. This will automatically execute all code cells below and show the cluster status in each table.\n",
                "* When you click **Run Cells** for this Notebook, you will be prompted at the *Log in to your Big Data Cluster* code cell to provide your login credentials. Follow the prompts and press enter to proceed.\n",
                "* **You won't need to modify any of the code cell contents** in this Notebook. If you accidentally made a change, you can reopen this Notebook from the cluster dashboard.\n",
                "\n",
                "\n",
                ""
            ],
            "metadata": {}
        },
        {
            "cell_type": "markdown",
            "source": "### **Check azdata version**",
            "metadata": {}
        },
        {
            "cell_type": "code",
            "source": [
                "import sys, os\r\n",
                "cmd = f'azdata --version'\r\n",
                "cmdOutput = !{cmd}\r\n",
                "azdataStr = '\\'azdata\\''\r\n",
                "if len(cmdOutput) > 0 and ('command not found' in cmdOutput[1] or f'{azdataStr} is not recognized as an internal or external command' in cmdOutput[0]):\r\n",
                "    raise SystemExit('azdata not found! Please make sure azdata is installed and added to path' + '.\\n')\r\n",
                "if '15.0' in cmdOutput[0]:\r\n",
                "    print('azdata version: ' + cmdOutput[0])\r\n",
                ""
            ],
            "metadata": {},
            "outputs": [],
            "execution_count": 0
        },
        {
            "cell_type": "markdown",
            "source": "### **Install latest version of pandas**",
            "metadata": {}
        },
        {
            "cell_type": "code",
            "source": [
                "#install pandas\r\n",
                "import pandas\r\n",
                "pandas_version = pandas.__version__.split('.')\r\n",
                "pandas_major = int(pandas_version[0])\r\n",
                "pandas_minor = int(pandas_version[1])\r\n",
                "pandas_patch = int(pandas_version[2])\r\n",
                "if not (pandas_major > 0 or (pandas_major == 0 and pandas_minor > 24) or (pandas_major == 0 and pandas_minor == 24 and pandas_patch >= 2)):\r\n",
                "    pandasVersion = 'pandas==0.24.2'\r\n",
                "    cmd = f'{sys.executable} -m pip install {pandasVersion}'\r\n",
                "    cmdOutput = !{cmd}\r\n",
                "    print(f'\\nSuccess: Upgraded pandas to 0.24.2.')\r\n",
                "else:\r\n",
                "    print('Pandas required version is already installed!')    "
            ],
            "metadata": {},
            "outputs": [],
            "execution_count": 0
        },
        {
            "cell_type": "markdown",
            "source": [
                "## **Log in to your Big Data Cluster**\r\n",
                "To view cluster status, you will need to connect to your Big Data Cluster through azdata. \r\n",
                "\r\n",
                "When you run this code cell, you will be prompted for:\r\n",
                "- Cluster name\r\n",
                "- Controller username\r\n",
                "- Controller password\r\n",
                "\r\n",
                "To proceed:\r\n",
                "- **Click** on the input box\r\n",
                "- **Type** the login info\r\n",
                "- **Press** enter.\r\n",
                "\r\n",
                "If your cluster is missing a configuration file, you will be asked to provide your controller endpoint. (Format: **https://00.00.00.000:00000**) You can find the controller endpoint from the big data cluster dashboard in the Service Endpoints table. The endpoint is listed as **Cluster Management Service.**\r\n",
                ""
            ],
            "metadata": {}
        },
        {
            "cell_type": "code",
            "source": [
                "import os, getpass, json\n",
                "import pandas as pd\n",
                "import numpy as np\n",
                "from IPython.display import *\n",
                "\n",
                "def PromptForInfo(promptMsg, isPassword, errorMsg):\n",
                "    if isPassword:\n",
                "        promptResponse = getpass.getpass(prompt=promptMsg)\n",
                "    else:\n",
                "        promptResponse = input(promptMsg)\n",
                "    if promptResponse == \"\":\n",
                "        raise SystemExit(errorMsg + '\\n')\n",
                "    return promptResponse\n",
                "\n",
                "# Prompt user inputs:\n",
                "cluster_name = PromptForInfo('Please provide your Cluster Name: ', False, 'Cluster Name is required!')\n",
                "\n",
                "controller_username = PromptForInfo('Please provide your Controller Username for login: ', False, 'Controller Username is required!')\n",
                "\n",
                "controller_password = PromptForInfo('Controller Password: ', True, 'Password is required!')\n",
                "print('***********')\n",
                "\n",
                "!azdata logout\n",
                "# Login in to your big data cluster \n",
                "cmd = f'azdata login -n {cluster_name} -u {controller_username} -a yes'\n",
                "print(\"Start \" + cmd)\n",
                "os.environ['CONTROLLER_USERNAME'] = controller_username\n",
                "os.environ['CONTROLLER_PASSWORD'] = controller_password\n",
                "os.environ['ACCEPT_EULA'] = 'yes'\n",
                "\n",
                "loginResult = !{cmd}\n",
                "if 'ERROR: Please check your kube config or specify the correct controller endpoint with: --controller-endpoint https://<ip>:<port>.' in loginResult[0] or 'ERROR' in loginResult[0]:\n",
                "    controller_ip = input('Please provide your Controller endpoint: ')\n",
                "    if controller_ip == \"\":\n",
                "        raise SystemExit(f'Controller IP is required!' + '\\n')\n",
                "    else:\n",
                "        cmd = f'azdata login -n {cluster_name} -e {controller_ip} -u {controller_username} -a yes'\n",
                "        loginResult = !{cmd}\n",
                "print(loginResult)"
            ],
            "metadata": {},
            "outputs": [],
            "execution_count": 0
        },
        {
            "cell_type": "markdown",
            "source": [
                "## **Status of Big Data Cluster**\r\n",
                "After you successfully login to your bdc, you can view the overall status of each container before drilling down into each component."
            ],
            "metadata": {}
        },
        {
            "cell_type": "code",
            "source": [
                "# Helper methods for formatting\n",
                "def formatColumnNames(column):\n",
                "    return ' '.join(word[0].upper() + word[1:] for word in column.split())\n",
                "\n",
                "pd.set_option('display.max_colwidth', -1)\n",
                "def show_results(results):\n",
                "    strResult = ''.join(results)\n",
                "    jsonResults = json.loads(strResult)\n",
                "    results = jsonResults['result']\n",
                "    if isinstance(results, list):\n",
                "        for result in results:\n",
                "            if isinstance(result, list):\n",
                "                show_formattedArray(result)\n",
                "            else:\n",
                "                show_keys(result)\n",
                "    else:\n",
                "        show_keys(results)\n",
                "\n",
                "def show_keys(results):\n",
                "    listKeys = []\n",
                "    if isinstance(results, dict):\n",
                "        for key in results.keys():\n",
                "            if results[key] and not isinstance(results[key], list):\n",
                "                print('\\033[1m' + formatColumnNames(key) + ': \\033[0m' + results[key])\n",
                "            if results[key] and isinstance(results[key], list):\n",
                "                listKeys.append(key)\n",
                "        for key in listKeys:\n",
                "            show_formattedArray(results[key])\n",
                "    if isinstance(results, str):\n",
                "        print('\\033[1m' + results + ': \\033[0m')\n",
                "\n",
                "def show_formattedArray(results):\n",
                "    fomattedRow = []\n",
                "    if not isinstance(results, list):\n",
                "        show_formattedResults(results)\n",
                "    else:\n",
                "        for row in results:\n",
                "            if isinstance(row, str):\n",
                "                show_keys(row)\n",
                "            else:\n",
                "                fomattedRow.append({ k : v for k,v in row.items() if isinstance(v, str) or v is None})\n",
                "        df = pd.DataFrame(fomattedRow)\n",
                "        df.columns = [formatColumnNames(n) for n in fomattedRow[0].keys()]\n",
                "        mydata = HTML(df.to_html(render_links=True))\n",
                "        display(mydata)\n",
                "        nameKeys = [k for k in fomattedRow[0].keys() if 'Name' in k]\n",
                "        for key in results[0].keys():\n",
                "            if key not in fomattedRow[0].keys():\n",
                "                for result in results:\n",
                "                    print('\\033[1m' + formatColumnNames(nameKeys[0]) + ': \\033[0m' + result[nameKeys[0]])\n",
                "                    show_formattedArray(result[key])\n",
                "\n",
                "def show_formattedResults(input):\n",
                "    df = pd.DataFrame([input])\n",
                "    df.columns = [formatColumnNames(n) for n in [input][0].keys()]\n",
                "    mydata = HTML(df.to_html(render_links=True))\n",
                "    display(mydata)\n",
                "    \n",
                "# Display status of big data cluster\n",
                "results = !azdata bdc status show -o json\n",
                "show_results(results)"
            ],
            "metadata": {},
            "outputs": [],
            "execution_count": 0
        },
        {
            "cell_type": "markdown",
            "source": [
                "## **Cluster Status**\r\n",
                "For each cluster component below, running each code cell will generate a table. This table will include:\r\n",
                "\r\n",
                "|Column Name|Description|\r\n",
                "|---|---|\r\n",
                "|**Kind** | Identifies if component is a pod or a set. |\r\n",
                "|**LogsURL** | Link to [Kibana](https://www.elastic.co/guide/en/kibana/current/introduction.html) logs which is used for troubleshooting. |\r\n",
                "|**Name** | Provides the specific name of the pod or set. |\r\n",
                "|**NodeMetricsURL** | Link to [Grafana](https://grafana.com/docs/guides/basic_concepts/) dashboard to view key metrics of the node. |\r\n",
                "|**SQLMetricsURL** | Link to [Grafana](https://grafana.com/docs/guides/basic_concepts/) dashboard to view key metrics of the SQL instance. |\r\n",
                "|**State** | Indicates state of the pod or set. |\r\n",
                "\r\n",
                "----------------------------------------------------------------"
            ],
            "metadata": {}
        },
        {
            "cell_type": "markdown",
            "source": [
                "### **Controller status**\n",
                "To learn more about the controller, [read here.](https://docs.microsoft.com/sql/big-data-cluster/concept-controller?view=sql-server-ver15)"
            ],
            "metadata": {}
        },
        {
            "cell_type": "code",
            "source": [
                "# Display status of controller\n",
                "results = !azdata bdc control status show --all -o json\n",
                "show_results(results)"
            ],
            "metadata": {},
            "outputs": [],
            "execution_count": 0
        },
        {
            "cell_type": "markdown",
            "source": [
                "### **Master Instance status**\n",
                "To learn more about the master instance, [read here.](https://docs.microsoft.com/sql/big-data-cluster/concept-master-instance?view=sqlallproducts-allversions)"
            ],
            "metadata": {}
        },
        {
            "cell_type": "code",
            "source": [
                "results = !azdata bdc sql status show --resource master --all -o json\n",
                "show_results(results)"
            ],
            "metadata": {},
            "outputs": [],
            "execution_count": 0
        },
        {
            "cell_type": "markdown",
            "source": [
                "### **Compute Pool status**\n",
                "To learn more about compute pool, [read here.](https://docs.microsoft.com/sql/big-data-cluster/concept-compute-pool?view=sqlallproducts-allversions)"
            ],
            "metadata": {}
        },
        {
            "cell_type": "code",
            "source": [
                "# Display status of compute pool\n",
                "results = !azdata bdc sql status show --resource compute-0 --all -o json\n",
                "show_results(results)"
            ],
            "metadata": {},
            "outputs": [],
            "execution_count": 0
        },
        {
            "cell_type": "markdown",
            "source": [
                "### **Storage Pool status**\n",
                "To learn more about storage pool, [read here.](https://docs.microsoft.com/sql/big-data-cluster/concept-storage-pool?view=sqlallproducts-allversions)"
            ],
            "metadata": {}
        },
        {
            "cell_type": "code",
            "source": [
                "# Display status of storage pools\n",
                "results = !azdata bdc sql status show --resource storage-0 --all -o json\n",
                "show_results(results)"
            ],
            "metadata": {},
            "outputs": [],
            "execution_count": 0
        },
        {
            "cell_type": "markdown",
            "source": [
                "### **Data Pool status**\n",
                "To learn more about data pool, [read here.](https://docs.microsoft.com/sql/big-data-cluster/concept-data-pool?view=sqlallproducts-allversions)"
            ],
            "metadata": {}
        },
        {
            "cell_type": "code",
            "source": [
                "# Display status of data pools\n",
                "results = !azdata bdc sql status show --resource data-0 --all -o json\n",
                "show_results(results)"
            ],
            "metadata": {},
            "outputs": [],
            "execution_count": 0
        },
        {
            "cell_type": "markdown",
            "source": [
                "### **Spark Pool status**\n",
                "Displays status of spark pool if it exists. Otherwise, will show as \"No spark pool.\""
            ],
            "metadata": {}
        },
        {
            "cell_type": "code",
            "source": [
                "# Display status of spark pool\n",
                "results = !azdata bdc spark status show --all -o json\n",
                "show_results(results)\n",
                ""
            ],
            "metadata": {},
            "outputs": [],
            "execution_count": 0
        }
    ]
}