Authors: Silvano Coriani, Jasraj Dange, Ewan Fairweather, Xin Jin, Alexei Khalyako, Sanjay Mishra, Selcin Turkarslan



Best practices and recommendations for optimizing SQL Server performance in Azure VMs


Many of the same techniques used to optimize SQL Server performance in your on-premises environment can be used to tune SQL Server workloads in Azure Infrastructure Services. That said, running a database workload in a hosted, multi-tenant cloud service such as Azure Infrastructure Services is fundamentally different, and to be successful you will need to adopt some new best practices. This section provides those best practices and recommendations for optimizing SQL Server performance in Azure Infrastructure Services.

Virtual machine sizes


In general, smaller VM sizes are best suited for lightweight development and test systems or applications. For production deployments and intensive workloads, bigger virtual machine sizes (such as A3 or high-memory instances) are often a better choice because they provide more capacity in terms of virtual cores, memory, and data disks.

For SQL Server production workloads, we recommend a minimum of the Standard tier A2 VM size, or bigger instances. In May 2014, new VM sizes (A8 and A9) were introduced, sporting faster Intel Xeon processors and larger memory sizes. Based on various performance tests, these VMs provide important benefits in terms of CPU performance, I/O throughput, and bandwidth. If you plan to run very demanding SQL Server workloads in Azure Virtual Machines, we recommend that you consider these new VM sizes.
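As a quick way to compare the capacity of the available sizes, the classic Azure PowerShell module can list each size's cores, memory, and maximum data disk count. The following is a minimal sketch; it assumes a version of the module recent enough to include the Get-AzureRoleSize cmdlet and an active subscription:

# List VM sizes with their core count, memory, and maximum number of data disks
Get-AzureRoleSize |
    Select-Object InstanceSize, Cores, MemoryInMb, MaxDataDiskCount |
    Sort-Object Cores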


Azure virtual machine disks and cache settings


Azure Virtual Machines provide three types of disks: operating system (OS) disk, temporary disk, and data disks. For a description of each disk type, see section Azure Infrastructure services fundamentals in this article.

Operating system disk vs. data disk


When placing your data and log files you should consider disk cache settings in addition to size limits. For a description of cache settings, see section Azure Infrastructure services fundamentals in this article.

The “Read Write” cache (the default setting) on the operating system disk helps improve overall operating system performance, boot times, and read latency for the I/O patterns that the OS typically generates. Even so, we recommend that you do not use the OS disk to host system or user database files; use data disks instead. When the workload demands a high rate of random I/O (such as a SQL Server OLTP workload) and throughput is important to you, the general guideline is to keep the data disk cache set to its default value of “None” (disabled). Because Azure Storage is capable of more IOPS than a direct-attached storage disk, this setting bypasses the physical host's local disks and therefore provides the highest I/O rate.
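As an illustration, the host caching setting of an attached data disk can be changed with the classic Azure PowerShell cmdlets. The following is a minimal sketch; the cloud service name, VM name, and LUN are placeholders:

# Disable host caching on the data disk attached at LUN 1
Get-AzureVM -ServiceName "CloudServiceName" -Name "VMName" |
    Set-AzureDataDisk -LUN 1 -HostCaching None |
    Update-AzureVM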


Temporary disk


Unlike Azure disks (operating system and data disks), which are essentially VHDs stored as page blobs in Azure Storage, the temporary disk (labeled D:) is not persistent and is not implemented using Azure Storage. It is reserved by the operating system for the page file, and its performance is not guaranteed to be predictable. Any data stored on it may be lost after your virtual machine is restarted or resized. Hence, we do not recommend using the D: drive to store any user or system database files, including tempdb.

Data disks performance options and considerations


This section discusses best practices and recommendations for data disk performance options, based on testing performed by Microsoft. To interpret the test results reported in this section, you should be familiar with how SQL Server I/O operations work. For more information, see Pages and Extents Architecture.

It is important to note that the results we provide in this section were achieved without SQL Server high availability and disaster recovery solutions enabled (such as AlwaysOn Availability Groups, database mirroring, or log shipping). We recommend that you deploy one of these features to maintain multiple redundant copies of your databases across at least two virtual machines in an availability set, in order to be covered by the Azure Cloud Services, Virtual Machines, and Virtual Network Service Level Agreement. Enabling any of these features affects performance, so you should incorporate one of them in your own performance testing to get more accurate results.



As a general rule, for throughput-sensitive applications we recommend that you attach the maximum number of data disks allowed by the VM size (for example, 16 data disks for an A7 VM). Adding more data disks may not improve latency when your workload is already within the per-disk IOPS limit, but the additional IOPS and bandwidth from the extra disks help you stay below the 500 IOPS limit of a single disk. Exceeding that limit can trigger throttling events, which increase disk response times and disk latency.

Single data disk configuration


In our performance tests, we executed several SQL Server I/O measurements to understand how data disks respond to the typical I/O patterns that SQL Server generates under different kinds of workloads. The results for a single-disk configuration on an A7 VM instance are summarized here:




                 Random I/O (8 KB pages)      Sequential I/O (64 KB extents)
                 Reads        Writes          Reads         Writes
IOPS             500          500             500           300
Bandwidth        4 MB/s       4 MB/s          30 MB/s       20 MB/s

Note: Because Azure Infrastructure Services is a multi-tenant environment, performance results may vary. You should consider these results as an indication of what you can achieve, but not a guarantee. We suggest you repeat these tests and measurements based on your specific workload.

Multiple data disk configuration


If your workload exceeds, or is close to, the I/O performance numbers mentioned in the previous section, we recommend that you add multiple disks (depending on your virtual machine size) and stripe them into volumes. By combining multiple data disks, this configuration lets you create volumes with specific throughput and bandwidth, based on your data and log performance needs.

Adding multiple data disks to Azure virtual machine


After you create a virtual machine in Azure, you can attach a data disk to it using either the Azure Management Portal or the Add-AzureDataDisk Azure PowerShell cmdlet. Both techniques allow you to select an existing data disk from a storage account, or create a new blank data disk.

If you choose to create a new blank data disk in the Management Portal, it is placed in the storage account in which your virtual machine was created; you cannot choose a different storage account.

To place your existing data disk (.vhd file) in a specific storage account, you need to use the Azure PowerShell cmdlets. The following example demonstrates how to update a virtual machine by using the Get-AzureVM and Add-AzureDataDisk cmdlets. The Get-AzureVM cmdlet retrieves the configuration of a specific virtual machine. The Add-AzureDataDisk cmdlet creates a new data disk with the specified size and label in a previously created storage account (replace the cloud service name, VM name, and the <storageaccount> placeholder with your own values):

Get-AzureVM -ServiceName "CloudServiceName" -Name "VMName" | Add-AzureDataDisk -CreateNew -DiskSizeInGB 100 -MediaLocation "https://<storageaccount>.blob.core.windows.net/vmdisk/Disk1.vhd" -DiskLabel "disk1" -LUN 1 | Update-AzureVM
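To build a multiple-disk configuration, the same cmdlet can be repeated for each LUN. The following is a minimal sketch that attaches four blank disks; the cloud service name, VM name, and storage account are again placeholders:

# Attach four blank 100 GB data disks at LUNs 0 through 3
$vm = Get-AzureVM -ServiceName "CloudServiceName" -Name "VMName"
0..3 | ForEach-Object {
    $vm = $vm | Add-AzureDataDisk -CreateNew -DiskSizeInGB 100 `
        -MediaLocation "https://<storageaccount>.blob.core.windows.net/vmdisk/Disk$_.vhd" `
        -DiskLabel "disk$_" -LUN $_
}
$vm | Update-AzureVM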


To create a new storage account, use the New-AzureStorageAccount cmdlet as follows:

New-AzureStorageAccount -StorageAccountName "StorageAccountX" -Label "StorageAccountX" -Location "North Central US"

For more information about Azure PowerShell cmdlets, see Azure PowerShell on MSDN and Azure command line tools.

Disk striping options for Azure Virtual Machines


For Azure VMs running Windows Server 2008 R2 and earlier releases, the only striping technology available is striped volumes on dynamic disks. You can use this option to stripe multiple data disks into volumes that provide more throughput and bandwidth than a single disk can.

Starting with Windows Server 2012, Storage Pools were introduced and the operating system's software RAID capabilities (dynamic disks) were deprecated. Storage Pools enable you to virtualize storage by grouping industry-standard disks into “pools”, and then to create virtual disks, called Storage Spaces, from the available capacity in those pools. You can configure these virtual disks to stripe across all disks in a pool, combining their performance characteristics, and you can add and remove disk space based on your needs.

During our tests, after adding a number of data disks (4, 8 and 16) as shown in the previous section, we created a new storage pool by using the following Windows PowerShell command:

New-StoragePool -FriendlyName StoragePool1 -StorageSubsystemFriendlyName "Storage Spaces*" -PhysicalDisks (Get-PhysicalDisk -CanPool $True)

Next, we created a virtual disk on top of the new storage pool and specified resiliency setting and virtual disk size.

$disks = Get-StoragePool -FriendlyName StoragePool1 -IsPrimordial $false | Get-PhysicalDisk

New-VirtualDisk -StoragePoolFriendlyName StoragePool1 -FriendlyName VirtualDisk1 -ResiliencySettingName Simple -NumberOfColumns $disks.Count -UseMaximumSize -Interleave 256KB

Important Note: For performance, it is very important that the -NumberOfColumns parameter be set to the number of disks used to create the underlying storage pool. Otherwise, I/O requests cannot be evenly distributed across all data disks in the pool, and you will get suboptimal performance.

The -Interleave parameter enables you to specify the number of bytes written to each underlying data disk in a virtual disk. We recommend that you use 256 KB for all workloads.

Lastly, we created and formatted the volume to make it usable to the operating system and applications by using the following Windows PowerShell commands:

Get-VirtualDisk -FriendlyName VirtualDisk1 | Get-Disk | Initialize-Disk -PassThru | New-Partition -AssignDriveLetter -UseMaximumSize | Format-Volume -AllocationUnitSize 64KB

Once the volume is created, you can dynamically increase its capacity by attaching new data disks. To achieve optimal capacity utilization, consider the number of columns your storage space has and add disks in multiples of that number. For more information, see Windows Server Storage Spaces Frequently Asked Questions.
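For example, newly attached data disks can be added to the existing pool with a single command. The following is a minimal sketch, assuming the pool created earlier in this section:

# Add any newly attached, poolable disks to the existing storage pool
Add-PhysicalDisk -StoragePoolFriendlyName StoragePool1 -PhysicalDisks (Get-PhysicalDisk -CanPool $True)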

Using Storage Pools instead of traditional Windows operating system striping with dynamic disks brings several advantages in terms of performance and manageability. We recommend that you use Storage Pools for disk striping in Azure Virtual Machines.
During our internal testing, we implemented the following two scenarios with configurations of 4, 8, and 16 data disks, and, as expected, we observed increased IOPS for each data disk added:


  • We arranged multiple data disks as simple volumes and leveraged the Database Files and Filegroups feature of SQL Server to stripe database files across multiple volumes.

  • We used Windows Server Storage Pools to create larger volumes, each containing multiple data disks, and placed database and log files inside these volumes.

It's important to note that using multiple data disks provides performance benefits, but it also creates more management overhead. In addition, partial unavailability of one of the striped disks can result in unavailability of a database. Therefore, for such configurations, we recommend that you consider enhancing the availability of your databases by using the high availability and disaster recovery capabilities of SQL Server, as described in High Availability and Disaster Recovery for SQL Server in Azure Virtual Machines.


The following tables summarize the results of the tests that we performed at Microsoft using multiple-data-disk configurations.
Aggregated throughput and bandwidth across 4 data disks





                 Random I/O (8 KB pages)      Sequential I/O (64 KB extents)
                 Reads        Writes          Reads         Writes
IOPS             2000         2000            1600          1200
Bandwidth        16 MB/s      16 MB/s         100 MB/s      75 MB/s


Aggregated throughput and bandwidth across 8 data disks





                 Random I/O (8 KB pages)      Sequential I/O (64 KB extents)
                 Reads        Writes          Reads         Writes
IOPS             4000         4000            2400          2400
Bandwidth        30 MB/s      30 MB/s         150 MB/s      150 MB/s


Aggregated throughput and bandwidth across 16 data disks





                 Random I/O (8 KB pages)      Sequential I/O (64 KB extents)
                 Reads        Writes          Reads         Writes
IOPS             8000         8000            2400          4000
Bandwidth        60 MB/s      60 MB/s         150 MB/s      250 MB/s

Note: Because Azure Infrastructure Services is a shared, multi-tenant environment, performance results may vary. You should consider these results as an indication of what you can achieve, but not a guarantee. We recommend that you repeat these tests and measurements based on your specific workload.

We repeated our I/O performance tests on the newly introduced Intel-based A8 and A9 VM sizes and noticed a significant increase in bandwidth and throughput for larger sequential I/O requests: 64 KB (and bigger) read and write operations perform measurably better on these sizes. If your workload is I/O intensive, the new A8 and A9 VM sizes can provide more linear scalability compared to smaller VM sizes, although always within the 500 IOPS per-disk limit. For more information, see About the A8 and A9 Compute Intensive Instances.



Based on our tests, we have made the following observations about the Azure Virtual Machine environment:

  • Spreading your I/O workload across a number of data disks benefits smaller random operations (more common in OLTP scenarios), where IOPS and bandwidth scale in a nearly linear fashion.

  • For read operations, as the I/O block size increases, adding more data disks does not result in correspondingly higher IOPS or bandwidth. This means that if your workload is read intensive, with more analytical queries, adding more disks will not necessarily help.

  • For write-intensive workloads, adding more data disks can increase performance in a nearly linear fashion. This means you can benefit from placing the transaction log of each database on a separate data disk.

  • For large sequential I/O block sizes (such as 64 KB or greater), writes generally perform better than reads.

  • A8 and A9 VM sizes provide increased throughput for I/O-sensitive workloads.


Placement of database files


Depending on how you configure your storage, you should place the data and log files for user and system databases accordingly to achieve your performance goals. This section provides guidance on how to place database files when using SQL Server in Azure Virtual Machines:

  • Option 1: You can create a single striped volume with Windows Server Storage Spaces that leverages multiple data disks, and place all database and log files in this volume. In this scenario, all of your database workloads share the aggregated I/O throughput and bandwidth provided by those disks, and the placement of database files is simplified. Individual database workloads are load balanced across all available disks, and you do not need to worry about single-database spikes or workload distribution.



  • Option 2: You can create multiple striped volumes, each composed of the number of data disks required to achieve a specific level of I/O performance, and then carefully place user and system database files on those volumes. For example, if one important production database has a high-priority, write-intensive workload, you may want to maximize its data and log file throughput by segregating them on two separate 4-disk volumes (each volume providing around 2,000 IOPS and 100 MB/s). You could then use:



  • A 4-disk volume for hosting tempdb data and log files.

  • A 4-disk volume for hosting the other, less critical databases.

This option gives you precise control over file placement, making optimal use of the available I/O performance.

You can still create single-disk volumes and leverage SQL Server files and filegroups placement for your databases, as in the sketch below. While this approach can still offer some benefits in terms of flexible storage layout, it introduces additional complexity and limits the I/O performance of any single file (data or log) to what one Azure data disk can provide: 500 IOPS and roughly 60 MB/s.
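As an illustration of the files-and-filegroups approach, one filegroup can be striped across several single-disk volumes by creating one data file per volume; SQL Server's proportional fill then spreads I/O across the files. The following is a hedged sketch with placeholder drive letters and database name, assuming the Invoke-Sqlcmd cmdlet from the SQL Server PowerShell module is available:

# Create a database whose primary filegroup spans four single-disk volumes
Invoke-Sqlcmd -ServerInstance "." -Query @"
CREATE DATABASE SalesDb ON PRIMARY
    (NAME = SalesDb1, FILENAME = 'E:\SQLData\SalesDb1.mdf'),
    (NAME = SalesDb2, FILENAME = 'F:\SQLData\SalesDb2.ndf'),
    (NAME = SalesDb3, FILENAME = 'G:\SQLData\SalesDb3.ndf'),
    (NAME = SalesDb4, FILENAME = 'H:\SQLData\SalesDb4.ndf')
LOG ON
    (NAME = SalesDbLog, FILENAME = 'L:\SQLLog\SalesDbLog.ldf');
"@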

Although Azure data disks behave differently from traditional rotating spindles (where competing random and sequential operations on the same disk can impact performance), we still recommend that you keep data and log files on different paths so that each gets dedicated IOPS and bandwidth.

To understand your I/O requirements and performance while running SQL Server workloads on Azure Virtual Machines, you need to analyze the following three tools and carefully combine their results:



  • SQL Server I/O statistics: These reflect the database management system's view of the I/O subsystem (exposed, for example, through the sys.dm_io_virtual_file_stats dynamic management function).

  • Windows Server Logical Disk performance counters: These show how the operating system services I/O requests (a sampling sketch follows this list).

  • Azure Storage Analytics: Azure hosts data disks' VHD files in Azure Storage. You can turn on logging and metrics for the storage account that hosts your data disks to get useful information, such as the number of successful and failed requests, plus timeout, throttling, network, authorization, and other errors. You can configure these metrics and retrieve the data in the Azure Portal, or via PowerShell, the REST APIs, and the .NET storage client library.
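The sketch below illustrates the second of these tools: it samples the logical disk counters from inside the VM so that operating-system-level latency and throughput can be compared with SQL Server's I/O statistics. The counter paths are standard; treat the sampling interval and sample count as placeholders:

# Sample logical disk latency and throughput every 5 seconds, 12 times
$counters = '\LogicalDisk(*)\Avg. Disk sec/Read',
            '\LogicalDisk(*)\Avg. Disk sec/Write',
            '\LogicalDisk(*)\Disk Transfers/sec'
Get-Counter -Counter $counters -SampleInterval 5 -MaxSamples 12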

By combining all of this information, you can understand:

  • Whether I/O-related stalls or wait types in SQL Server (manifesting as increased disk response times in the operating system performance counters) are related to throttling events in Azure Storage.

  • Whether rebalancing your data and log files across different volumes (and their underlying disks) can help keep throughput and bandwidth within the storage performance limits.

TempDB


As mentioned in section Azure virtual machine disks and cache settings, we recommend that you place tempdb on data disks instead of on the temporary disk (D:); a relocation sketch follows the list below. The following are the three primary reasons for this recommendation, based on our internal testing with SQL Server test workloads.

  • Performance variance: In our testing, we observed that a single data disk can give you the same level of performance as the D: drive, if not more IOPS. However, the performance of the D: drive is not guaranteed to be as predictable as that of the operating system or data disks. This is because the size of the D: drive, and the performance you get from it, depend on the size of the virtual machine you use and on the underlying physical disks shared among all the VMs hosted on the same server.

  • Configuration after VM downtime: If the virtual machine is shut down (for planned or unplanned reasons), SQL Server must re-create tempdb on the D: drive, which requires the SQL Server service account to have local administrator privileges. In addition, the common practice in on-premises SQL Server deployments is to keep database and log files (including tempdb) in a separate folder, in which case that folder must be created before SQL Server starts. For most customers, this extra reconfiguration overhead is not worth the return.

  • Performance bottleneck: If you place tempdb on the D: drive and your application workloads use tempdb heavily, this can cause a performance bottleneck because the D: drive can introduce constraints in terms of IOPS and throughput. Instead, place tempdb on data disks to gain more flexibility. For more information on configuration best practices for optimizing tempdb, see Compilation of SQL Server TempDB IO Best Practices.
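Relocating tempdb to a data-disk volume only requires updating its file paths and restarting the SQL Server service. The following is a minimal sketch with a placeholder drive letter, again assuming Invoke-Sqlcmd is available; the target folders must exist before the restart:

# Point tempdb's files at a data-disk volume (takes effect at the next restart;
# tempdev and templog are SQL Server's default logical file names)
Invoke-Sqlcmd -ServerInstance "." -Query @"
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = 'F:\SQLData\tempdb.mdf');
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = 'F:\SQLLog\templog.ldf');
"@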

We strongly recommend that you perform your own workload testing before implementing a desired SQL Server file layout strategy.

Effects of warm-up on data disks


With Azure disks, we have observed a “warm-up effect” that can result in a reduced rate of throughput and bandwidth for a short period of time. When a data disk is not accessed for a period of time (approximately 20 minutes), Azure's adaptive partitioning and load balancing mechanisms kick in. If the disk is accessed while these algorithms are active, you may notice some degradation in throughput and bandwidth for a short period of time (approximately 10 minutes), after which they return to their normal levels. This warm-up effect occurs because Azure dynamically adjusts to workload changes in a multi-tenant storage environment; you may observe similar effects in other widely known cloud storage systems as well. For more information, see Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency.

This warm-up effect is unlikely to be noticed for systems that are in continuous use. But we recommend you consider it during performance testing or when accessing systems that have been inactive for a while.


Single vs. multiple storage accounts for data disks attached to a single VM


To simplify management and reduce potential consistency risks in case of failures, we recommend that you keep all the data disks attached to a single virtual machine in the same storage account. Storage accounts are implemented as a unit of recovery in case of failures, so keeping all the disks in the same account makes recovery operations simple. There is no performance improvement from storing the data disks attached to a single VM in multiple storage accounts. If you have multiple VMs, however, consider the storage account limits for throughput and bandwidth during capacity planning, and distribute VMs and their data disks across multiple storage accounts if the aggregated throughput or bandwidth is higher than what a single storage account can provide. For information on storage account limits, see Azure Storage Scalability and Performance Targets. For information on the maximum IOPS per disk, see Virtual Machine and Cloud Service Sizes for Azure.

NTFS allocation unit size


NTFS volumes use a default cluster (allocation unit) size of 4 KB. Based on our performance tests, we recommend changing the allocation unit size to 64 KB during volume creation, for both single-disk volumes and multiple-disk (Storage Spaces) volumes.
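For a single-disk volume, the allocation unit size is set when the volume is formatted, as in the following minimal sketch; the disk number is a placeholder, so confirm it with Get-Disk first:

# Initialize and format a newly attached data disk with a 64 KB allocation unit
Initialize-Disk -Number 2 -PartitionStyle GPT -PassThru |
    New-Partition -AssignDriveLetter -UseMaximumSize |
    Format-Volume -FileSystem NTFS -AllocationUnitSize 64KB -Confirm:$false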
