Authors: Silvano Coriani, Jasraj Dange, Ewan Fairweather, Xin Jin, Alexei Khalyako, Sanjay Mishra, Selcin Turkarslan Technical Reviewers

Factors that govern performance for SQL Server in Azure Virtual Machines

Yüklə 268,61 Kb.

səhifə	6/8
tarix	16.08.2018
ölçüsü	268,61 Kb.
	#63132

1 2 3 4 5 6 7 8

Performance monitoring methodology for SQL Server in Azure Virtual Machine

Factors that govern performance for SQL Server in Azure Virtual Machines

Although the SQL Server binaries running in a traditional on-premises and Microsoft Azure virtual machine environments are identical, there are some infrastructure differences that affect the way that applications perform in SQL Server in Azure Virtual Machines compared to an on-premises dedicated enterprise server. An analogy that can be used to compare the experience of running SQL Server in Azure virtual machine is an on-premises virtualized SQL Server environment with dedicated virtualized resources (such as processor and memory) but no Hyper-V over-commit of memory or processors. In addition, the I/O performance varies in Microsoft Azure storage as it is a managed shared disk environment and performance at a point in time depends on other tenants’ activity and overall system load; this can happen in shared on-premises storage environments as well.

All the traditional SQL Server performance factors described in the previous section are still applicable when running SQL Server in Azure virtual machine environment. The following table summarizes most common resource bottleneck issues and provides a list of actions that you can take to resolve them.

Issue	Key performance indicators (KPIs) to monitor	Actions to consider
SQL Server CPU at or near 80%	% Processor Time (_Total) SOS_SCHEDULER_YIELD waits	Increase number of CPUs by increasing your SQL Server virtual machine instance size (if possible). Identify top consuming queries and tune. Split out workload (for example, move a database off the SQL Server instance).
Near I/O capacity limits or I/O latency increases	Average disk reads per second Average disk writes per second Disk reads per second Disk writes per second io_virtual_file_stats PAGEIOLATCH waits SQL Server: Buffer Manager\Page Life Expectancy	Check Page Life Expectancy counter value is low (<300 seconds). This can indicate memory pressure on the system is causing increased disk IO. Consider increasing instance size (if possible). Identify which database and log files have I/O bottleneck. Add more data disks and separate data files if you are at or near IOPS limits per disk. Note: This can apply to any user created or tempdb databases. Tune queries to reduce reads and writes. Consider enabling row or page compression to reduce the number of I/Os.
Memory resource pressure	Memory: Available Bytes Memory: Pages per second SQL Server: Buffer Manager\Page Life Expectancy Process: Working Set (for SQL Server) RESOURCE_SEMAPHORE waits	Check max server memory setting for SQL Server. Increase memory by Increasing instance size or use high memory instance if possible. Check which component of SQL Server utilizes memory (such as, CLR, high memory grants for application queries, and so on) and tune appropriately.

As described in the table, you can resolve performance issues by following different approaches and actions. In traditional on-premises environment, you might prefer adding or purchasing more hardware resources to alleviate performance e problems. In Azure environment, the resources that are available per machine are smaller in number and less powerful than the typical on-premises enterprise servers. For example, while adding a data disk may increase disk throughput by 25 percent; tuning query workloads might reduce the overall I/O requirement by 90 percent in some cases. Therefore we recommend that you always follow a systematic approach that involves analyzing, tuning and redesigning to achieve better performance results.

Optimizing your application for Azure Virtual Machine environment will provide valuable cost benefits because you can achieve a higher density per unit of compute. Unlike the traditional on-premises environment, Azure allows you reduce the number and size virtual machines immediately to reduce the operational costs. In addition, you can dynamically re-balance the size of machines based on seasonal usage peaks.

It is important that you develop a systematic monitoring and troubleshooting methodology to effectively run your SQL Server in Azure Virtual Machine environment.

Performance monitoring methodology for SQL Server in Azure Virtual Machine

This section explains the approaches that can be used to identify and isolate performance issues while running SQL Server in Azure Virtual Machine environment:

Define key performance indicators (KPIs) to monitor resource utilization: We recommend that you define KPIs for SQL Server and each important application tier. These should include Windows Performance Monitor (Performance Monitor) counters for your application’s important tiers and components as well as SQL Server. In addition, you should monitor SQL Server’s performance related dynamic management views (DMVs) to identify the underlying causes of performance problems. We recommend that you define five to eight KPIs for each major application entity or component, such as SQL Server, application-related counters, and caching component counters. For more information, see the Guidance on defining key performance indicators section in the Appendix.
Monitor your KPIs to track resource utilization: We recommend that you track and monitor KPIs by using tools such as Performance Monitor and SQL Server Management Studio. If application requests increase by a certain percent capacity, the proportional increase in the underlying system resources depends on how uniform the workload is. Make sure you track a latency metric, particularly for web applications running on SQL Server in Azure virtual machine.
Examine trends and patterns to identify issues as the workload increases: As the number of users that are using your application increases, some system resources (such as processor, I/O, memory, and network) might become under pressure. For example, if the SQL Server workload reaches to its sustainable I/O limits, it becomes I/O bound. When this happens, the latency of the I/O subsystem increases first. Therefore, you may notice a corresponding increase on individual query execution times, which increase latency for end users. Secondly, throughput, such as the number of concurrent user queries that SQL Server can support, begins to level-off or decrease. Finally, if the pressure on a specific resource increases, you may notice unavailability issues, such as query and application timeouts. By monitoring Performance Monitor and the sys.dm_os_wait_stats DMV, you can identify the potential performance problems before end users notice them. For example, this scenario would initially cause an increase in disk response time as measured by the logical disk performance counters, and an increase in the number of PAGEIOLATCH or log related waits that SQL Server provides.
Monitor DMVs to determine what resources your application is competing and waiting for:
- Monitor and identify instance level waits by using sys.dm_os_wait_stats. Make sure to review the total wait time as a percentage of the total. The following table describes the wait profile that are indicative of Page Latch contention.

Other common profiles depend on the predominant wait type and include: SOS_SCHEDULER_YIELD, which indicates that the operation is waiting on the CPU to become free; long PAGEIOLATCH waits, which indicate that SQL Server is generating I/O requests faster than the I/O system can process them; and RESOURCE_SEMAPHORE, which indicates memory pressure on memory. See the Snapshot wait stats script provided in the Appendix. It demonstrates how to automate the calculation of waits during a time interval. This is useful for identifying the dominant underlying wait in SQL Server when symptoms appear with your workload.

Monitor query resource consumers by using sys.dm_exec_query_stats to identify top resource consumers and monitor query execution and efficiency
Monitor I/O consumptions and characteristics by using sys.dm_io_virtual_file_stats and the logical disk Performance Monitor counters. The following table summarizes the key performance counters to monitor.

Logical disk counter	Typical storage term	Suggested actions in Microsoft Azure virtual machine environment
Disk reads / second Disk writes / second	IOPS	Measure the number of I/O’s per second. Consider adding more data disks in line with your IOPS requirements.
Average disk sec / read Average disk sec / write	Latency	Measure disk latency. Note: Numbers might vary; look at averages over time.
Average disk bytes / read Average disk bytes / write	Block size	Measure the size of I/O’s being issued. Note: Larger I/O’s tend to have higher latency, such as those associated with BACKUP/RESTORE.
Average / current disk queue length	Outstanding or waiting IOPS	Provides insight into the applications I/O pattern.
Disk read bytes/sec Disk write bytes/sec	Throughput or aggregate throughput	Measure of total disk throughput. Note: Ideally, larger block scans should be able to heavily utilize connection bandwidth (for example, your throughput can be higher with a smaller number of larger IOPS).

Take snapshots of currently executing SQL Server requests by using sys.dm_exec_requests to check for locking, blocking, latching and other performance issues caused by resource contention related performance issues.
Monitor the application and SQL Server event logs to identify errors.

Use a delta approach to monitor beyond DMVs: Some DMVs provide cumulative information from the last time that the SQL Server process was started, such as sys.dm_os_wait_stats. Others contain a snapshot from a point in time, such as sys.dm_exec_requests. The performance of a query is affected by factors including plan selection and resource availability. An effective approach that works in on-premises and Azure Virtual Machine environments is to combine the usage of sys.dm_os_wait_stats and sys.dm_exec_query_stats. This helps you to understand the query performance and resource constraints, which can be inferred from the system-wait information. To identify locking and blocking issues, you should routinely monitor active request information using sys.dm_exec_requests. In summary, we recommend that you:
- Take a regular periodic snapshots of query stats, wait stats, and exec requests.
- Calculate the delta between two snapshots to understand what happened during that period. You can use the sample script Snapshot wait stats in the Appendix for this purpose.

Monitor Spinlock and Backoff events: Spinlocks are used to synchronize access to key memory regions that SQL Server uses for internal constructs. Use sys.dm_os_spinlock_stats to monitor the number of spins. For more information, see Diagnosing and Resolving Spinlock Contention on SQL Server article.

Yüklə 268,61 Kb.

Dostları ilə paylaş:

1 2 3 4 5 6 7 8