Troubleshoot cloud monitoring with Performance Analysis
The Performance Analysis dashboard (also called PerfStack) helps you troubleshoot multi-faceted issues in your applications and infrastructure. Create consolidated data views as charts and graphs to collect and compare metrics, data, and logs for end-to-end hybrid troubleshooting of monitored and managed nodes.
When something goes wrong in your applications and infrastructure, you typically need to review numerous views and resources in the Orion Web Console depending on Orion products, features, metrics, and nodes. The Performance Analysis dashboard provides visualizations of correlated data to analyze and sift through metrics, relationships, and noise to focus on the true issue and related data.
- Compile all of the metrics into a single dashboard to analyze and find the key issues
- Merge metrics into the same charts and graphs to see gaps and spikes
- Navigate trending issues and triggered alerts by walking back through time across all metrics
- Explore the data to see usage trends and issues, and walk through historical performance
- Determine when performance starts to degrade or fluctuate across resources and applications
- Troubleshoot cloud instances encountering problems and take action through cloud management pages
- Continue monitoring resources after resolving issues to verify performance
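The idea of merging metrics onto one timeline to expose gaps and spikes can be illustrated outside the product. The following is a minimal Python sketch, not PerfStack's implementation: the function names (`align_series`, `find_spikes`), the sample values, and the spike threshold are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def align_series(series_by_name):
    """Merge several {timestamp: value} series onto one sorted timeline.

    Missing samples become None so that polling gaps stay visible.
    """
    timeline = sorted(set().union(*(s.keys() for s in series_by_name.values())))
    return [
        (ts, {name: s.get(ts) for name, s in series_by_name.items()})
        for ts in timeline
    ]


def find_spikes(series, threshold=2.0):
    """Return timestamps whose value exceeds mean + threshold * std dev."""
    values = list(series.values())
    if len(values) < 2:
        return []
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return sorted(ts for ts, v in series.items() if v > mu + threshold * sigma)


# Hypothetical samples keyed by seconds since the start of the window.
latency_ms = {0: 5, 60: 6, 120: 5, 180: 48, 240: 6, 300: 5}
iops = {0: 400, 60: 410, 180: 395, 240: 405, 300: 400}  # sample at 120 is missing

rows = align_series({"latency_ms": latency_ms, "iops": iops})
gaps = [ts for ts, row in rows if row["iops"] is None]
spikes = find_spikes(latency_ms)

print("gaps in iops:", gaps)
print("latency spikes:", spikes)
```

Placing both series on the same timeline is what makes it possible to see that a latency spike and a missing sample fall close together, which is the kind of correlation a shared chart surfaces visually.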
Application investigation with SAM
For example, a Windows Server 2003 application running on cloud instances shows degraded performance. Triggered alerts notify your application owner, who escalates the issue to system and network administrators. Rather than digging into the alerts through the Node Details pages for the monitored application, server, and network, create a new project to investigate the issue by placing the alert metrics into the same dashboard.
In the Orion Web Console, select My Dashboards > Home > Performance Analysis.
This opens the Performance Analysis (PerfStack) dashboard, where you build charts and graphs using metrics pulled from monitored applications and servers in the Metric Palette.
In the New Analysis Project, click Add Entities.
Search for the Windows 2003 application in distress and select its checkbox. Entering Windows returns a list of all monitored nodes, component monitors, and more with Windows in the name or type. Expand and select Types or Status to further filter the list.
The selected application node displays in the Metric Palette.
Select the Windows 2003 node and find metrics to drag and drop onto the dashboard.
The metrics list for the selected node displays. For the Windows 2003 application, expand the list and drag the following metrics into charts: Logical Disk Average: Disk Queuing, Average IOPS Read, Maximum IOPS Write, Maximum IOPS Read, Average IOPS Write, IO Latency Write, IO Latency Read, and IOPS Total.
The data begins to show where issues occurred. To investigate further, add the following metrics to another chart: Throughput Total: IO data from the virtual layer and IO data from the storage layer.
Click Save.
Based on the data, the issue appears to be a noisy neighbor. This gives your network and system administrators a direction for further investigation and for resolving the latency issues. Share the link with all system and network administrators to help resolve the issue and inform the application owner. Everyone can look over the data, add it to tickets, and verify changes as SAM polls after updates.
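The reasoning behind a noisy neighbor diagnosis can be sketched in a few lines of Python. This is an illustrative heuristic only, not how SAM or PerfStack evaluates data: it assumes a noisy neighbor is likely when IO latency is high while the application's own virtual-layer throughput is a small share of the storage-layer throughput. The field names (`vm_mbps`, `storage_mbps`), the thresholds, and the sample values are hypothetical.

```python
def noisy_neighbor_intervals(samples, latency_limit_ms=20.0, max_own_share=0.5):
    """Flag timestamps where latency is high but this workload is not the main IO source.

    Each sample is a dict with:
      ts            timestamp (seconds)
      latency_ms    IO latency seen by the application
      vm_mbps       throughput generated at the virtual layer by this workload
      storage_mbps  total throughput observed at the storage layer
    """
    suspects = []
    for s in samples:
        storage = s["storage_mbps"]
        if storage <= 0:
            continue  # no storage activity recorded, nothing to compare
        own_share = s["vm_mbps"] / storage
        if s["latency_ms"] > latency_limit_ms and own_share < max_own_share:
            suspects.append(s["ts"])
    return suspects


# Hypothetical polled values for one troubleshooting window.
samples = [
    {"ts": 0, "latency_ms": 5, "vm_mbps": 40, "storage_mbps": 60},
    {"ts": 60, "latency_ms": 35, "vm_mbps": 42, "storage_mbps": 300},
    {"ts": 120, "latency_ms": 40, "vm_mbps": 38, "storage_mbps": 320},
    {"ts": 180, "latency_ms": 30, "vm_mbps": 200, "storage_mbps": 220},
    {"ts": 240, "latency_ms": 6, "vm_mbps": 41, "storage_mbps": 65},
]

suspect_ts = noisy_neighbor_intervals(samples)
print("possible noisy neighbor at:", suspect_ts)
```

In this made-up data, the sample at 180 seconds also has high latency, but the workload itself accounts for most of the storage throughput, so it is not flagged. That distinction is what comparing the virtual-layer and storage-layer charts side by side is meant to reveal.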