Files
azuredatastudio/extensions/mssql/notebooks/TSG/cluster-status.ipynb
Maddy 32235b0cb6 Cluster management/newdashboard task (#6060)
* initial commit: added cluster status notebook and dashboard task

* following the previous naming conventions

* endpoint widget changes to accomodatw naming changes

* management-proxy/mgmtproxy chnages

* updates to address the comments and added the new copy image with hover text.

* added user select for making the table selectable

* localize changes

* added the final documented notebook

* reset execution_count to 0 for all cells

* style changes

* updated the url to point to private repo
2019-06-25 22:46:11 -07:00

159 lines
13 KiB
Plaintext

{
"metadata": {
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python",
"version": "3.6.6",
"mimetype": "text/x-python",
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"pygments_lexer": "ipython3",
"nbconvert_exporter": "python",
"file_extension": ".py"
}
},
"nbformat_minor": 2,
"nbformat": 4,
"cells": [
{
"cell_type": "markdown",
"source": "![11811317_10153406249401648_2787740058697948111_n](https://raw.githubusercontent.com/Microsoft/sqlworkshops/master/graphics/solutions-microsoft-logo-small.png)\n\n# View the status of your big data cluster\nThis notebook allows you to see the status of the controller, master instance, and pools in your SQL Server big data cluster.\n\n## <span style=\"color:red\">Important Instructions</span>\n### **Before you begin, you will need:**\n* Big data cluster name\n* Controller username\n* Controller password\n* Controller endpoint \n\nYou can find the controller endpoint from the big data cluster dashboard in the Service Endpoints table. The endpoint is listed as **Cluster Management Service.**\n\nIf you do not know the credentials, ask the admin who deployed your cluster.\n\n### **Instructions**\n* For the best experience, click **Run Cells** on the toolbar above. This will automatically execute all code cells below and show the cluster status in each table.\n* When you click **Run Cells** for this Notebook, you will be prompted at the *Log in to your big data cluster* code cell to provide your login credentials. Follow the prompts and press enter to proceed.\n* **You won't need to modify any of the code cell contents** in this Notebook. If you accidentally made a change, you can reopen this Notebook from the bdc dashboard.\n\n\n",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "## **Dependencies**\r\n\r\n> This Notebook will try to install these dependencies for you.\r\n\r\n----------------------------------------------------------------\r\n<table>\r\n<colgroup>\r\n<col style=\"width: 10%\" />\r\n<col style=\"width: 10%\" />\r\n<col style=\"width: 10%\" />\r\n<col style=\"width: 10%\" />\r\n</colgroup>\r\n<thead>\r\n<tr class=\"header\">\r\n<th>Tool</th>\r\n<th>Required</th>\r\n<th>Description</th>\r\n</tr>\r\n</thead>\r\n<tbody>\r\n<tr class=\"odd\">\r\n<td><strong>mssqlctl</strong></td>\r\n<td>Yes</td>\r\n<td>Command-line tool for installing and managing a big data cluster.</td>\r\n</tr>\r\n<tr class=\"even\">\r\n<td><strong>pandas</strong></td>\r\n<td>Yes</td>\r\n<td>Python Library for formatting data (<a href=\"https://www.learnpython.org/en/Pandas_Basics\">More info</a>).</td>\r\n</tr>\r\n</tbody>\r\n</table>\r\n<p>",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "### **Install latest version of mssqlctl**",
"metadata": {}
},
{
"cell_type": "code",
"source": "import sys, platform\r\n\r\nif platform.system()==\"Windows\":\r\n user = ' --user'\r\nelse:\r\n user = ''\r\n\r\ndef executeCommand(cmd, successMsgs, printMsg):\r\n print(printMsg)\r\n cmdOutput = !{cmd}\r\n cmdOutput = ''.join(cmdOutput)\r\n if any(msg in cmdOutput for msg in successMsgs):\r\n print(f\"\\nSuccess >> \" + cmd)\r\n else:\r\n raise SystemExit(f'\\nFailed during:\\n\\n\\t{cmd}\\n\\nreturned: \\n' + ''.join(cmdOutput) + '.\\n')\r\n\r\ninstallPath = 'https://private-repo.microsoft.com/python/ctp3.1/mssqlctl/requirements.txt'\r\ncmd = f'{sys.executable} -m pip uninstall --yes mssqlctl-cli-storage'\r\ncmdOutput = !{cmd}\r\n\r\ncmd = f'{sys.executable} -m pip uninstall -r {installPath} --yes'\r\nexecuteCommand(cmd, ['is not installed', 'Successfully uninstalled mssqlctl'], 'Uninstalling mssqlctl:')\r\n\r\ncmd = f'{sys.executable} -m pip install -r {installPath}{user} --trusted-host helsinki'\r\ncmdOutput = !{cmd}\r\nexecuteCommand(cmd, ['Requirement already satisfied', 'Successfully installed mssqlctl'], 'Installing the latest version of mssqlctl:')",
"metadata": {},
"outputs": [],
"execution_count": 0
},
{
"cell_type": "markdown",
"source": "### **Install latest version of pandas**",
"metadata": {}
},
{
"cell_type": "code",
"source": "#install pandas\r\ncmd = f'{sys.executable} -m pip show pandas'\r\ncmdOutput = !{cmd}\r\nif len(cmdOutput) > 0 and '0.24' in cmdOutput[1]:\r\n print('Pandas required version is already installed!')\r\nelse:\r\n pandasVersion = 'pandas==0.24.2'\r\n cmd = f'{sys.executable} -m pip install {pandasVersion}'\r\n cmdOutput = !{cmd}\r\n print(f'\\nSuccess: Upgraded pandas.')",
"metadata": {},
"outputs": [],
"execution_count": 0
},
{
"cell_type": "markdown",
"source": "## **Log in to your big data cluster**\r\nTo view cluster status, you will need to connect to your big data cluster through mssqlctl. \r\n\r\nWhen you run this code cell, you will be prompted for:\r\n- Cluster name\r\n- Controller username\r\n- Controller password\r\n\r\nTo proceed:\r\n- **Click** on the input box\r\n- **Type** the login info\r\n- **Press** enter.\r\n\r\nIf your cluster is missing a configuration file, you will be asked to provide your controller endpoint. (Format: **https://00.00.00.000:00000**)",
"metadata": {}
},
{
"cell_type": "code",
"source": "import os, getpass, json\nimport pandas as pd\nimport numpy as np\nfrom IPython.display import *\n\ndef PromptForInfo(promptMsg, isPassword, errorMsg):\n if isPassword:\n promptResponse = getpass.getpass(prompt=promptMsg)\n else:\n promptResponse = input(promptMsg)\n if promptResponse == \"\":\n raise SystemExit(errorMsg + '\\n')\n return promptResponse\n\n# Prompt user inputs:\ncluster_name = PromptForInfo('Please provide your Cluster Name: ', False, 'Cluster Name is required!')\n\ncontroller_username = PromptForInfo('Please provide your Controller Username for login: ', False, 'Controller Username is required!')\n\ncontroller_password = PromptForInfo('Controller Password: ', True, 'Password is required!')\nprint('***********')\n\n!mssqlctl logout\n# Login in to your big data cluster \ncmd = f'mssqlctl login -n {cluster_name} -u {controller_username} -a yes'\nprint(\"Start \" + cmd)\nos.environ['CONTROLLER_USERNAME'] = controller_username\nos.environ['CONTROLLER_PASSWORD'] = controller_password\nos.environ['ACCEPT_EULA'] = 'yes'\n\nloginResult = !{cmd}\nif 'ERROR: Please check your kube config or specify the correct controller endpoint with: --controller-endpoint https://<ip>:<port>.' in loginResult[0] or 'ERROR' in loginResult[0]:\n controller_ip = input('Please provide your Controller endpoint: ')\n if controller_ip == \"\":\n raise SystemExit(f'Controller IP is required!' + '\\n')\n else:\n cmd = f'mssqlctl login -n {cluster_name} -e {controller_ip} -u {controller_username} -a yes'\n loginResult = !{cmd}\nprint(loginResult)\n",
"metadata": {},
"outputs": [],
"execution_count": 0
},
{
"cell_type": "markdown",
"source": "## **Status of big data cluster**\r\nAfter you successfully login to your bdc, you can view the overall status of each container before drilling down into each component.",
"metadata": {}
},
{
"cell_type": "code",
"source": "# Display status of big data cluster\ndef formatColumnNames(column):\n return ' '.join(word[0].upper() + word[1:] for word in column.split())\n\npd.set_option('display.max_colwidth', -1)\ndef show_results(input):\n input = ''.join(input)\n results = json.loads(input)\n df = pd.DataFrame(results)\n df.columns = [formatColumnNames(n) for n in results[0].keys()]\n mydata = HTML(df.to_html(render_links=True))\n display(mydata)\n\nresults = !mssqlctl bdc status show\nstrRes = ''.join(results)\njsonRes = json.loads(strRes)\ndtypes = '{'\nspark = [x for x in jsonRes if x['kind'] == 'Spark']\nif spark:\n spark_exists = True\nelse:\n spark_exists = False\nshow_results(results)",
"metadata": {},
"outputs": [],
"execution_count": 0
},
{
"cell_type": "markdown",
"source": "## **Cluster Status**\r\nFor each cluster component below, running each code cell will generate a table. This table will include:\r\n\r\n----------------------------------------------------------------\r\n<table>\r\n<colgroup>\r\n<col style=\"width: 10%\" />\r\n<col style=\"width: 10%\" />\r\n</colgroup>\r\n<thead>\r\n<tr class=\"header\">\r\n<th>Column Name</th>\r\n<th>Description</th>\r\n</tr>\r\n</thead>\r\n<tbody>\r\n<tr class=\"odd\">\r\n<td><strong>Kind</strong></td>\r\n<td>Identifies if component is a pod or a set.</td>\r\n</tr>\r\n<tr class=\"even\">\r\n<td><strong>LogsURL</strong></td>\r\n<td>Link to <a href=\"https://www.elastic.co/guide/en/kibana/current/introduction.html\">Kibana</a> logs which is used for troubleshooting.</td>\r\n</tr>\r\n<tr class=\"odd\">\r\n<td><strong>Name</strong></td>\r\n<td>Provides the specific name of the pod or set.</td>\r\n</tr>\r\n<tr class=\"even\">\r\n<td><strong>NodeMetricsURL</strong></td>\r\n<td>Link to <a href=\"https://grafana.com/docs/guides/basic_concepts/\">Grafana</a> dashboard to view key metrics of the node.</td>\r\n</tr>\r\n<tr class=\"odd\">\r\n<td><strong>SQLMetricsURL</strong></td>\r\n<td>Link to <a href=\"https://grafana.com/docs/guides/basic_concepts/\">Grafana</a> dashboard to view key metrics of the SQL instance</td>\r\n</tr>\r\n<tr class=\"even\">\r\n<td><strong>State</strong></td>\r\n<td>Indicates state of the pod or set..</td>\r\n</tr>\r\n</tbody>\r\n</table>\r\n<p>",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "### **Controller status**\nTo learn more about the controller, [read here.](https://docs.microsoft.com/sql/big-data-cluster/concept-controller?view=sql-server-ver15)",
"metadata": {}
},
{
"cell_type": "code",
"source": "# Display status of controller\nresults = !mssqlctl bdc control status show\nshow_results(results)",
"metadata": {},
"outputs": [],
"execution_count": 0
},
{
"cell_type": "markdown",
"source": "### **Master Instance status**\nTo learn more about the master instance, [read here.](https://docs.microsoft.com/sql/big-data-cluster/concept-master-instance?view=sqlallproducts-allversions)",
"metadata": {}
},
{
"cell_type": "code",
"source": "# Display status of master instance\nresults = !mssqlctl bdc pool status show -k master -n default\nshow_results(results)",
"metadata": {},
"outputs": [],
"execution_count": 0
},
{
"cell_type": "markdown",
"source": "### **Compute Pool status**\nTo learn more about compute pool, [read here.](https://docs.microsoft.com/sql/big-data-cluster/concept-compute-pool?view=sqlallproducts-allversions)",
"metadata": {}
},
{
"cell_type": "code",
"source": "# Display status of compute pool\nresults = !mssqlctl bdc pool status show -k compute -n default\nshow_results(results)",
"metadata": {},
"outputs": [],
"execution_count": 0
},
{
"cell_type": "markdown",
"source": "### **Storage Pool status**\nTo learn more about storage pool, [read here.](https://docs.microsoft.com/sql/big-data-cluster/concept-storage-pool?view=sqlallproducts-allversions)",
"metadata": {}
},
{
"cell_type": "code",
"source": "# Display status of storage pools\nresults = !mssqlctl bdc pool status show -k storage -n default\nshow_results(results)",
"metadata": {},
"outputs": [],
"execution_count": 0
},
{
"cell_type": "markdown",
"source": "### **Data Pool status**\nTo learn more about data pool, [read here.](https://docs.microsoft.com/sql/big-data-cluster/concept-data-pool?view=sqlallproducts-allversions)",
"metadata": {}
},
{
"cell_type": "code",
"source": "# Display status of data pools\nresults = !mssqlctl bdc pool status show -k data -n default\nshow_results(results)",
"metadata": {},
"outputs": [],
"execution_count": 0
},
{
"cell_type": "markdown",
"source": "### **Spark Pool status**\nDisplays status of spark pool if it exists. Otherwise, will show as \"No spark pool.\"",
"metadata": {}
},
{
"cell_type": "code",
"source": "# Display status of spark pool\nif spark_exists:\n results = !mssqlctl bdc pool status show -k spark -n default\n show_results(results)\nelse:\n print('No spark pool.')",
"metadata": {},
"outputs": [],
"execution_count": 0
}
]
}