Microsoft SQL Server I/O Basics: Chapter 2


Maintenance and Configuration





This section outlines maintenance and configuration issues that should be fully understood before deploying SQL Server.
Pull-the-plug power outage testing

Power outage testing is a fundamental and critical part of the administrator’s data safety and integrity requirements. Because the number of caching, defragmentation, and other products that enhance I/O performance has greatly increased, it is important to perform pull-the-plug power outage tests, covering SQL Server and all other system components, before production deployment. Many products successfully meet pull-the-plug requirements when they are configured correctly; incorrect configurations can lead to data loss.

Pull-the-plug testing should be done by using either the production environment or a replica of this environment. This includes third-party products that will be used in production.

The following is a list of pull-the-plug situations. To guarantee equipment safety, make sure that the correct electrical standards are followed.


  • Remove the power cord that supplies power to the motherboard. Most tests implement this by using a switched outlet.

  • Remove the power cord that supplies power to storage devices. This may be the same cord that supplies the motherboard, or a separate cord for external storage devices.

  • Disconnect the data cabling of external devices, wait, and then reconnect it.

As with all power outage testing, ensure sufficient delay during the outage tests.

We strongly encourage you to establish the appropriate pull-the-plug scenario testing for database log and data files in addition to backup files. Test actual power outages in a variety of scenarios to guarantee data integrity. Some pull-the-plug scenario examples follow.

Establish a wide recovery interval during these tests. While this is likely not the production configuration, it helps expose misconfigured I/O subsystem components. We suggest setting the sp_configure recovery interval to a high value, such as 5000.
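The following is a minimal sketch of how that setting might be applied. The recovery interval is an advanced sp_configure option, so advanced options are displayed first, and values above 60 minutes require RECONFIGURE WITH OVERRIDE.

  EXEC sp_configure 'show advanced options', 1
  RECONFIGURE

  -- Recovery interval is specified in minutes; a deliberately large value
  -- lets dirty pages accumulate and helps expose misconfigured I/O caching.
  EXEC sp_configure 'recovery interval (min)', 5000
  RECONFIGURE WITH OVERRIDE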


Scenario: Transaction Safety (Commits)

Basic testing steps:

  1. Insert a known set of data and commit the transaction(s).

  2. Pull the plug.

  3. Restart the server.

  4. Validate database(s) with DBCC CHECKDB.

  5. Validate all committed data exists.

Scenario: Transaction Safety (Aborts)

Basic testing steps:

  1. Insert a known set of data and commit the transaction(s).

  2. Create several active transactions that modify data and leave the transactions open.

  3. Issue a CHECKPOINT in each database.

  4. Pull the plug.

  5. Restart the server.

  6. Validate database(s) with DBCC CHECKDB.

  7. Validate all committed data exists and all uncommitted modifications were rolled back.




Scenario: Backup Safety

Basic testing steps:

  1. Take a full database backup.

  2. Take a transaction log backup.

  3. Start a series of known data insert, update, and delete transactions.

  4. While data modification is occurring, continue to take a series of log backups.

  5. Immediately after a backup is finished, pull the plug.

  6. Restart the server.

  7. Validate database(s) with DBCC CHECKDB.

  8. Validate the state of the data.

  9. Restore the backups and execute appropriate validation steps.

Note: Always do these tests with the checksum option enabled on the databases.
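As a rough sketch of how the checksum requirement and the validation steps might be scripted (the database and table names here are hypothetical examples, not part of the original paper):

  -- Enable page checksums (SQL Server 2005 and later) so damaged pages are detectable.
  ALTER DATABASE PullPlugTest SET PAGE_VERIFY CHECKSUM

  -- Insert a known set of data and commit it before pulling the plug.
  INSERT INTO dbo.tblKnownData (TestValue) VALUES ('committed row')

  -- After the restart, validate the database and confirm the committed data exists.
  DBCC CHECKDB ('PullPlugTest') WITH NO_INFOMSGS
  SELECT COUNT(*) FROM dbo.tblKnownData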

Restore backups on secondary systems frequently to make sure that your complete backup strategy is functioning correctly and that you can recover from a failure condition.
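A sketch of such a secondary-system check, assuming a hypothetical backup file and database name:

  -- On the secondary system: restore the most recent full backup and validate it.
  RESTORE DATABASE PullPlugTest
    FROM DISK = N'R:\Backups\PullPlugTest_Full.bak'
    WITH CHECKSUM, RECOVERY

  DBCC CHECKDB ('PullPlugTest') WITH NO_INFOMSGS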

For more information about I/O caching requirements and SQL Server, see the following white papers on MSDN.

Microsoft SQL Server I/O subsystem requirements for the tempdb database
(http://support.microsoft.com/kb/917047/en-us)

Key factors to consider when evaluating third-party file cache systems with SQL Server
(http://support.microsoft.com/kb/917043/en-us)

Atomic writes

The I/O subsystem must support atomic writes. Controllers and I/O subsystem components should not allow physical writes to occur before all data is safely transferred to stable media. If writes can start before the complete I/O request has been transferred, a power outage leads to a partial write and data is torn and damaged.
Defragmenting disk drives

Because physical data access is the most expensive part of an I/O request, defragmentation can provide performance gains for SQL Server and other applications. Positioning related data close to each other reduces I/O operation requirements.

Various defragmentation utilities are available on the market today. Some utilities enable defragmentation on open files, whereas others require closed-file defragmentation or perform better when used under closed-file conditions. Additionally, some utilities have transactional capabilities, whereas others do not.

When you evaluate a defragmentation utility for use with SQL Server, make sure that it provides the following transactional data capabilities.


  • The original sector should not be considered moved until the new sector has been successfully established and the data successfully copied.

  • The utility should protect against a system failure, such as a power outage, in a safe way that enables the files to remain logically and physically intact. To guarantee data integrity, a pull-the-plug test is highly recommended when a defragmentation utility is running on a SQL Server-based file.

  • The Write-Ahead Logging (WAL) protocol requires the prevention of sector rewrites to avoid data loss. The utility must maintain the physical integrity of the file whenever it moves data. In fact, it should work on sector boundaries in a transactional way to keep the SQL Server files correctly intact.

  • The utility should provide appropriate locking mechanisms to guarantee that the file retains a consistent image for any modifications. For example, the original sector must not be modified while it is being copied to a new location; if such modifications were allowed, the defragmentation utility could lose the write.

Make sure that you understand any write-caching strategies that the utility uses. Caching by such a utility might involve a non-battery-backed cache and could violate WAL protocol requirements.

Open-file defragmenting raises several possible issues that closed-file defragmenting typically does not.



  • Open-file defragmenting affects performance. Defragmentation utilities may lock sections of the file, preventing SQL Server from completing a read or write operation. This can affect the concurrency of the server that is running SQL Server. Contact the defragmentation utility manufacturer to learn how files are locked and how this could affect SQL Server concurrency.

  • Open-file defragmenting can affect write caching and ordering. Open-file-based utilities require I/O path components; these components must not change the ordering or intended nature of the write operation. If the write-through or WAL protocol tenets are broken, database damage is likely to occur. The database and all associated files are considered to be a single entity. (This is covered in many Knowledge Base articles, SQL Server Books Online, various white papers, and in Remote Mirroring later in this paper.) All writes must retain the original write-ordering sequences and write-through capabilities.

We always recommend performing a full backup before you defragment those locations that contain SQL Server database and backup files.
Backup hardening

The hardening of backup streams has existed in all versions of Microsoft SQL Server, but the topic has not been formally documented. To harden a backup means that all data written to backup media has been successfully stored on stable media; sections of the backup are not just held in the cache. If hardening of the backup is not correctly achieved, the subsequent truncation of the database transaction log could be performed prematurely.

For an example related to hardening of the backup stream, consider a power outage. The sequence occurs in the following steps.



  1. The backup stream I/Os have finished, but the tail of the backup stream data is held in a cache.

  2. SQL Server backup processing expects that the data was stored on stable media (hardened), so it continues processing and truncates the database transaction log.

  3. A pull-the-plug event occurs, so the tail of the backup stream is lost.

  4. Upon restart, SQL Server crash recovery successfully recovers the database without error.

  5. However, the backup stream is missing data because of the power outage. No specific error about this damage is produced.

The backup strategy is now damaged and the administrator does not know about the damage until a restore attempt of the backup sequence finds the damaged stream. All log backups following this kind of event are unusable.

Note: SQL Server 2005 improves the restore verification capabilities and extends checksum capabilities to backups, but only a full restore can be used to fully validate the backup strategy. It is not safe to assume that a successful RESTORE VERIFYONLY immediately following a backup operation indicates all the data is correctly stored on stable media. A caching system could provide the data from cache and the backup stream could still be exposed to data loss. Always contact a cache manufacturer to understand the boundaries of the cache and its relationship to stable storage. Make sure that you have a way to read the backup stream data directly from stable media for verification.
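For illustration only, a checksum-enabled backup followed by RESTORE VERIFYONLY might look like the following (the file path and database name are hypothetical); as the note above explains, a successful verification does not by itself prove that the data reached stable media.

  BACKUP DATABASE ProductionDB
    TO DISK = N'R:\Backups\ProductionDB_Full.bak'
    WITH CHECKSUM

  -- Verifies readability and page checksums, but a caching system could
  -- still satisfy these reads from cache rather than stable media.
  RESTORE VERIFYONLY
    FROM DISK = N'R:\Backups\ProductionDB_Full.bak'
    WITH CHECKSUM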

The SQL Server backup process does not use FILE_FLAG_WRITE_THROUGH, as it does for the database data and log files; instead, it enables system-level I/O buffering. To guarantee data integrity, at the end of a backup operation SQL Server uses FlushFileBuffers to force all data written to the backup streams to stable media and then closes the file.

Notice that for a FlushFileBuffers API call, the NTFS file system issues a write for all dirty pages in the file cache and then invokes IRP_MJ_FLUSH_BUFFERS to make sure that all disk caches are flushed. All third-party caching systems should provide similar semantics to guarantee that data is correctly stored on stable media.

FlushFileBuffers is the mechanism that is used to harden the backup stream. Therefore, make sure that FlushFileBuffers achieves stable media storage, because when FlushFileBuffers completes successfully, the backup closes the streams and then truncates the database transaction log.

One other caveat to remember about backup hardening involves moving or compressing the backup. Many backup strategies include the ability to take a backup and then compress it or copy/move it. The compression or copy/move operation is an extension of the implied WAL contract. In-place compression must include transactional capabilities (as outlined in Defragmenting Disk Drives earlier in this paper) or else the compression action exposes data loss possibilities. Always make sure that the integrity of the backup stream is maintained and correctly secured on stable media before destroying the original file.

4-KB disk sector sizes

New manufacturing techniques and specifications will produce drives that support larger sectors than the current 512-byte format, in which one 8-KB SQL Server data page requires sixteen 512-byte sectors. Alongside the increased sector size are various manufacturer implementations of the transition from 512-byte to 4-KB sectors; for example, actual sector sizes might be 512 bytes, 1 KB, or 4 KB. The increased sector sizes and manufacturer implementation derivations affect the SQL Server Write-Ahead Logging protocol in several ways, which are described in this section.
Sector rewrite

Some of the current manufacturer implementations present a logical view of the on-disk sectors. This means the operating system sees a sector size of 512 bytes, but the drive performs the appropriate logical-to-physical mapping to the larger sector sizes as stored on physical disk. A logical sector write can include the reading of nearby data, known as a read-modify-write (RMW), to finish the complete sector-sized write.

For more information on logical to physical sector mapping, see Implementation Guidelines for 1K/4K Sector Sizes (http://www.t13.org/docs2005/e05122r3-WD_Comments_on_Long_Sectors.pdf).

The SQL Server transaction log files are always written with sector-aligned sizes and at sector-aligned boundaries. A small-logical to larger-physical mapping action causes the rewrite (RMW) of log sectors that were already saved to stable media. If the subsequent write fails, the transaction log is damaged and previously committed transactions can be lost. This is termed a tear or torn condition of the already hardened log.

“… SQL Server 6.x may not see the same performance impact from frequent and small transaction log writes. SQL Server 6.x rewrites the same 2-KB log page as transactions are committed. This can reduce the size of the log significantly compared to the 512-byte sector boundary flushes in SQL Server 7.0, SQL Server 2000, and SQL Server 2005. Reducing the size of the log directly relates to the amount of mechanical drive activity. However, as explained above, the SQL Server 6.x algorithm may expose committed transactions….”

For this and more information on how SQL Server 7.0 changed the transaction log design to prevent sector rewrites, see SQL Server 7.0, SQL Server 2000, and SQL Server 2005 logging and data storage algorithms extend data reliability on MSDN
(http://support.microsoft.com/kb/230785/en-us).

Logical-to-physical mapping implementations may be dangerous and could lead to complex data loss situations.


Align with physical sector boundary

Implementations that present logical sector sizes introduce various complexities for SQL Server and the operating system. For this discussion, we use an actual sector size of 4 KB and a presented (logical) sector size of 512 bytes.

SQL Server database log or data files should be created on a physical sector-aligned boundary. However, because the operating system is presented with sectors of size 512 bytes, it can align the start of the SQL Server file at a location that is not aligned with the physical sectors.

The following table shows how the logical sector presentation can cause SQL Server database log and data files to span physical sectors in an unaligned way. Sector read-modify-writes (RMWs) occur in order to handle the leading and trailing data boundaries.

As shown in the table, a write to Database File Page 0 must handle the sectors that hold File ABC and File XYZ, which are at the head of the I/O, in addition to Database File Page 1, which is at the tail of the I/O. I/O performance can decrease because of the extra misaligned activities.




Physical 4-KB Sectors    Logical 512-Byte Sectors
---------------------    -------------------------------------
Sector #1                File ABC
                         File XYZ
                         Database File Page 0 – Sector 1 of 16
                         Database File Page 0 – Sector 2 of 16
                         Database File Page 0 – Sector 3 of 16
                         Database File Page 0 – Sector 4 of 16
                         Database File Page 0 – Sector 5 of 16
                         Database File Page 0 – Sector 6 of 16

Sector #2                Database File Page 0 – Sector 7 of 16
                         Database File Page 0 – Sector 8 of 16
                         Database File Page 0 – Sector 9 of 16
                         Database File Page 0 – Sector 10 of 16
                         Database File Page 0 – Sector 11 of 16
                         Database File Page 0 – Sector 12 of 16
                         Database File Page 0 – Sector 13 of 16
                         Database File Page 0 – Sector 14 of 16

Sector #3                Database File Page 0 – Sector 15 of 16
                         Database File Page 0 – Sector 16 of 16
                         Database File Page 1 – Sector 1 of 16
                         Database File Page 1 – Sector 2 of 16
                         Database File Page 1 – Sector 3 of 16
                         Database File Page 1 – Sector 4 of 16
                         Database File Page 1 – Sector 5 of 16
                         Database File Page 1 – Sector 6 of 16


You might be tempted to work around this by creating a database on a freshly formatted drive so that an alignment issue would not occur. The first fault with this workaround is that there is no guarantee of physical sector alignment when the operating system is presented with a logical sector size. Also, it is generally impractical on a production system.

Another clever workaround would be to use a defragmentation utility. The fault here is that the defragmentation utility is likely to work from the logical sector size and not the physical sector size. Although better data alignment proximity might occur, it is just as likely that the movement using logical sector boundaries could lead to sector-spanning situations like those shown in the example.

Yet another workaround would be to back up and restore the database to achieve better physical alignment. To achieve new physical allocations, drop the original database and run the restore. The new file creation and data copy may align with the physical sectors better or even reduce logical fragmentation. However, the start of the file could be unaligned, and the restore attempt could cause all database pages to be unaligned as in the example shown earlier.

The peak of poor logical-to-physical fragmentation could be as bad as Max Sector Frags = Physical Sector Size / Logical Sector Size. In the example, this could create a 16:1 ratio of sectors to SQL Server 8-KB data page space, causing pages of the database to be stored on 16 unique sectors, intermixed with other data.

SQL Server lets you grow and shrink database log and data files, which acquires and releases disk space and can lead to further logical-to-physical fragmentation.

An example that uses automatic file growth is helpful. In this example, the original database was 10 GB with an auto-grow increment of 1 GB. The original 10-GB allocation occurred on a clean drive, which enabled all data pages to align correctly on the physical sector size. However, the same drive also stores backups (a poor practice) and other files. Data added to the database causes an auto-grow, but the growth acquires the 1-GB portion from the drive starting at an unaligned physical sector location. This leaves part of the database aligned and part of it unaligned to physical sector boundaries.

The performance implications may be such that data values stored in the original 10-GB portion can be read and written faster than those in the new 1-GB section of the database. This would be a very difficult issue to track down and correct.

Align with 8 KB boundary

Sector sizes should always allow for 8 KB alignment of SQL Server data pages. SQL Server writes data pages in 8 KB increments; the physical sector sizes and alignment should always enable SQL Server to write the 8 KB so that it is aligned on sector boundaries or as a multiple of a smaller sector size. This is to prevent torn I/O conditions, data page spanning, and sector rewrites.

Sector sizes of 4 KB enable the SQL Server data page to occupy two physical sectors and allow SQL Server data pages to be aligned on 8-KB boundaries efficiently.

If a sector size such as 1536 bytes is used, the alignment of data pages is broken: 8-KB pages share physical sectors with neighboring data, creating the risk of unwanted sector rewrites (RMWs) of data pages. This hinders database performance and could lead to unexpected data loss (torn) situations.

Larger transaction logs

The SQL Server transaction log files are always written with sector-aligned sizes and at sector-aligned boundaries. Most ordinary workloads do not require SQL Server to significantly increase log space usage due to larger physical sector sizes. For example, concurrent transactions share log-block space.

Although SQL Server tries to use the log space as efficiently as possible, certain application patterns cause the log-block fill percentages to remain small. The following extreme example points out the clear difference in the transaction log space requirements as the sector size varies.


  -- Declaration added for completeness; assumes tblTest already exists.
  DECLARE @I int
  SET @I = 0

  WHILE (@I < 10000)
  BEGIN
    BEGIN TRAN
    INSERT INTO tblTest VALUES ('A', @I)
    COMMIT TRAN
    SET @I = @I + 1
  END


- vs -

  -- Declaration added for completeness; assumes tblTest already exists.
  DECLARE @I int
  SET @I = 0

  BEGIN TRAN

  WHILE (@I < 10000)
  BEGIN
    INSERT INTO tblTest VALUES ('A', @I)
    SET @I = @I + 1
  END

  COMMIT TRAN


The first example requires SQL Server to write the log records 10,000 times, as each commit is processed. The second example enables SQL Server to pack log records and write all 10,000 inserts and the single commit at the same time.

Calculating the log usage for the examples results in the following approximations.



  • By using a 4 KB sector size for the first example, ~40 MB of disk space is used.

  • By using a 512 byte sector size for the first example, ~5 MB of disk space is used.

  • By using a 4 KB sector size for the second example, ~1 MB of disk space is used.

  • By using a 512-byte sector size for the second example, ~1 MB of disk space is used.

This means that for an I/O subsystem that reports sector sizes larger than 512 bytes, the SQL Server transaction log file could acquire physical space at a rate of up to n / 512 times that of a 512-byte sector system, where n is the new sector size. Application design can become critical to efficient log space usage.
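One way to observe this behavior is to compare transaction log space consumption before and after running each pattern; for example, DBCC SQLPERF reports the log size and percentage used for every database.

  -- Report transaction log size and percent used for all databases;
  -- run before and after each insert pattern and compare the results.
  DBCC SQLPERF (LOGSPACE)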

Note: Be cautious. Expanding the length of a transaction can have adverse effects on concurrency because locks continue to be held until the commit is finished.

Although tempdb is primarily non-logged, internal operations such as the allocation of a data page can be logged. Therefore, larger sector sizes can affect tempdb size.


Restore and attach

A SQL Server database can be restored or attached on a system that has a smaller sector size. To help guarantee data integrity, the larger sector size should be evenly divisible by the smaller sector size. For example, a 4-KB source restored or attached to a 1-KB or 512-byte destination is an evenly divisible size. Restoring or attaching to a smaller sector size such as 1536 bytes does not fulfill the evenly divisible requirement and immediately requires sector rewrite operations to occur.

SQL Server is not designed to dynamically upgrade the database to the larger sector sizes. SQL Server disallows restoring or attaching a database on a system with a larger sector size; it generates an error message and prevents the restore or attach operation. Enabling a database to function by using a smaller formatted sector size than the actual sector size violates the WAL protocol because the sector size variation guarantees the log records will not be correctly aligned with the physical sector size and log records will be rewritten.

At the time of publication, some current subsystems report sector sizes larger than 512 bytes but most do not. The larger sector sizes involve newer technology and require API changes at the operating system level. Future changes in Microsoft® Windows and SQL Server will correctly support the larger sector sizes and allow for dynamic adjustment of sector sizes.

Format for 4 KB sectors run on smaller sectors

SQL Server prevents restoring or attaching a database in an environment that has a larger physical sector size than the sector size the database was formatted with.

SQL Server can restore or attach a database formatted with a larger sector size onto smaller, evenly divisible physical sectors. This is possible because the log write uses the original formatted size (the larger size of the two). SQL Server may use a bit more log space in this configuration, but the writes to the log are larger than the physical sector size. This prevents rewrites as long as the smaller sector size is an even divisor. An irregular division of (original sector size / physical sector size) creates rewrites (RMWs) and should not be allowed.


System and sample databases

SQL Server 2005 ships all system (master, model, and msdb) and sample databases formatted with 4-KB sector sizes so that they can be installed on devices with sector sizes up to 4 KB. SQL Server 2000 system and sample databases are 512-byte based, causing Setup to fail on larger-sector drives.

The model database is used as the template when SQL Server creates tempdb and user databases. Only the contents of the model database are used, not the full physical format. tempdb and new database creations use the sector size reported by the operating system at the time of creation. This is performed on a per-file basis. Variance of sector sizes can then occur across the files throughout a database. Make sure that the sector sizes of a given I/O path are equal.


Determining the formatted sector size of a database

Run the DBCC fileheader('<database name>') command to output the formatted sector size for each file in the database. The SectorSize column shows the formatted sector size in bytes.
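A sketch of such a check follows. DBCC FILEHEADER is an undocumented command, so its parameters and output can vary by version; the database name here is a placeholder.

  -- Some undocumented DBCC commands route output through trace flag 3604.
  DBCC TRACEON (3604)

  -- Returns one row per file; examine the SectorSize column.
  DBCC FILEHEADER ('ProductionDB')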
What sector sizes does SQL Server support?

SQL Server currently supports the following sector sizes that are equal to or less than 4 KB.

  • Physical sector sizes that evenly divide into 4 KB

  • Physical sector sizes that are smaller than that of the database’s original formatted sector size as long as the smaller sector size is evenly divisible into the originally formatted sector size

  • 512

  • 1024

  • 2048

  • 4096
Remote mirroring

Several hardware vendors provide remote mirroring solutions. Remote mirroring captures writes and duplicates them remotely. Because of the distance between the principal and mirror and the reluctance to affect the performance on the principal by making it wait for remote writes, remote mirroring systems generally operate asynchronously. Very strict rules and protocols must be adhered to in order to safely use a remote mirroring technology with SQL Server data, log, and backup files.

For more information on SQL Server support for remote mirroring solutions, search MSDN for the Using Third Party Mirroring Solutions with SQL Server 2000 and 2005 User Databases white paper.

The following is a brief recap of those attributes beyond the local WAL protocol that are necessary for a successful remote mirroring deployment.

Write ordering: The order of the writes on the primary must be the exact order used to apply changes to the mirror. If the write order is not maintained, the WAL protocol is destroyed and the database will become corrupted.

Consistency groups: A consistency group makes all I/O actions to a set of volumes appear as a single I/O stream, keeping all write requests in exact order across the group. Without the ability to keep write ordering across a group of volumes intact, it is not safe to mirror SQL Server databases that span volumes. If each volume can keep a separate write order, the database log and data files on the mirror will not maintain the correct point-in-time semantics and the database will become corrupted.

Notice that consistency groups can apply to local mirror solutions as well.



Restore points: Following is an example of the restore points or forks in a remote mirroring system.

Primary                           Mirror
--------------------------------  --------------------------------
Full Database Backup
Log Backup #1 (LSN 1 to 100)
Log Backup #2 (LSN 101 to 200)
FAILOVER OCCURS
                                  Log Backup #3 (LSN 101 to 150)
                                  Log Backup #4 (LSN 150 to 200)

The restore point issue occurs at LSN 101. If the remote mirroring solution does not guarantee immediate delivery to the mirror, some write operations can be lost under a failure condition. In the example, the LSN 150 to 200 changes at the primary are assumed to have been in play at the time of the failover and never made it to the mirror. When the mirror is brought online, the data between LSN 150 and 200 is therefore lost. The database is transactionally consistent on the mirror after recovery, but some data loss has occurred.

The log backups start after failover, as shown in the example. At this point there are two outstanding issues that must be correctly resolved.



  1. The primary may have changes on disk that never made it to the mirror and therefore requires a full synchronization with the mirror. Your manufacturer may provide a more direct way to resynchronize the primary after failover, but until the primary is in synchronization with the mirror, it is not safe to use the original primary again.

  2. During restore, the primary ‘Log Backup #2’ must be considered invalid. The restore sequence would be the following (a T-SQL sketch follows the list):

    1. Restore the Full Database Backup.

    2. Restore Log Backup #1.

    3. Restore Log Backup #3. If Log Backup #2 were restored, log backups #3 and #4 could not be restored, because log records 150 to 201 on the mirror differ from those originally generated (and lost) on the primary.

    4. Restore Log Backup #4.
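A hedged T-SQL sketch of that restore sequence follows; the backup file names are hypothetical, and each step is restored WITH NORECOVERY until the final log backup.

  RESTORE DATABASE ProductionDB
    FROM DISK = N'R:\Backups\ProductionDB_Full.bak'
    WITH NORECOVERY

  RESTORE LOG ProductionDB
    FROM DISK = N'R:\Backups\LogBackup1.trn'   -- LSN 1 to 100
    WITH NORECOVERY

  -- Log Backup #2 (taken on the original primary) is skipped; it diverges
  -- from the mirror's log chain after the failover.

  RESTORE LOG ProductionDB
    FROM DISK = N'R:\Backups\LogBackup3.trn'   -- LSN 101 to 150
    WITH NORECOVERY

  RESTORE LOG ProductionDB
    FROM DISK = N'R:\Backups\LogBackup4.trn'   -- LSN 150 to 200
    WITH RECOVERY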
