Monday, 30 May 2011

SQL Server Physical Database Architecture

Topics

Pages and Extents
Types of Pages and Page Structure
Files and Filegroups
Extent Allocation and Free Space
Managing Space used by Objects
Tracking Modified Extents
Table and Index Structures

Pages and Extents

A page is a fundamental unit of data storage
A data file (.mdf or .ndf) is logically divided into pages numbered contiguously from 0 to n
Disk I/O operations are performed at the page level
The page size is 8 KB(8192 Bytes)
Extent is  a basic unit in which space is managed
is eight physically contiguous pages, or 64 KB
two types of extent
Uniform extents – all the 8 pages are owned by a single object
Mixed extents  - the pages are shared among objects(up to eight objects)


A new table or index is generally allocated pages from mixed extents. When the table or index grows to the point that it has eight pages, it then switches to use uniform extents for subsequent allocations. If you create an index on an existing table that has enough rows to generate eight pages in the index, all allocations to the index are in uniform extents.





 
What if the Row exceeds the max limit of 8060 bytes?


When the total row size of all fixed and variable columns in a table exceeds the 8,060 byte limitation, SQL Server dynamically moves one or more variable length columns to pages in the ROW_OVERFLOW_DATA allocation unit, starting with the column with the largest width. This is done whenever an insert or update operation increases the total size of the row beyond the 8060 byte limit.
When a column is moved to a page in the ROW_OVERFLOW_DATA allocation unit, a 24-byte pointer on the original page in the IN_ROW_DATA allocation unit is maintained.
If a subsequent operation reduces the row size, SQL Server dynamically moves the columns back to the original data page
 
Files and File Groups
 
SQL Server 2005 databases have three types of files:
Primary data file - is the starting point of the database and points to the other files in the database. Every database has one primary data file. The recommended file name extension for primary data files is .mdf.
Secondary data files - make up all the data files, other than the primary data file. Some databases may not have any secondary data files, while others have several secondary data files. The recommended file name extension for secondary data files is .ndf.
Log files - hold all the log information that is used to recover the database. There must be at least one log file for each database, although there can be more than one. The recommended file name extension for log files is .ldf.
 
When is the file location information in the primary file used? 
 
Database Engine uses the file location information from the primary file to initialize the file location entries in the master database in the following situations:
When attaching a database using the CREATE DATABASE statement with either the FOR ATTACH or FOR ATTACH_REBUILD_LOG options.
When upgrading from SQL Server version 2000 or version 7.0 to SQL Server 2005.
When restoring the master database.
  
Data Files Page
Pages in a data file are numbered sequentially, starting with zero (0)
  for the first page in the file.
Each file in a database has a unique file ID number.
To uniquely identify a page in a database, both the file ID and the
  page number are required.
 
The first page in each file is a
  file header page that contains
  information about attributes of the
  file.
Several of the other pages at the
  start of the file also contain system
  information, such as allocation maps.
One of the system pages stored in
  both the primary data file and the first
  log file is a database boot page that contains information about
  the attributes of the database.
 
Extent Allocation & Free space
The SQL Server extent allocations and track free space have a relatively simple structure with the following benefits:
free space information is densely packed, so relatively few pages contain this information.

 
This increases speed by reducing the amount of disk reads that are required to retrieve allocation information. This also increases the chance that the allocation pages will remain in memory and not require more reads.
the allocation information is not chained together

 
This simplifies the maintenance of the allocation information. Each page allocation or deallocation can be performed quickly. This decreases the contention between concurrent tasks having to allocate or deallocate pages.
 
Managing Extent Allocation
 
SQL Server uses two types of allocation maps to record the allocation of extents:
Global Allocation Map (GAM)
qGAM pages record what extents have been allocated.
qEach GAM covers 64,000 extents
qThe GAM has one bit for each extent in the interval it covers.
qIf the bit is 1, the extent is free; if the bit is 0, the extent is allocated.

Shared Global Allocation Map (SGAM)
q   SGAM pages record which extents are currently being used as  mixed extents and also have at least one unused page.
q   covers 64,000 extents.
q   The SGAM has one bit for each extent in the interval it covers.
q   If the bit is 1, the extent is being used as a mixed extent and has a free page. If the bit is 0, the extent is not used as a mixed extent, or it is a mixed extent and all its pages are being used. 
Current use of extent
GAM bit setting
SGAM bit setting
Free, not being used
1
0
Uniform extent, or full mixed extent
0
0
Mixed extent with free pages
0
1
 
Tracking Free Space
  Page Free Space (PFS) pages record the allocation status of each page, whether an individual page has been allocated, and the amount of free space on each page.
  The PFS has one byte for each page, recording whether the page is allocated, and if so, whether it is empty, 1 to 50 percent full, 51 to 80 percent full, 81 to 95 percent full, or 96 to 100 percent full.
  There is a PFS page approximately 8,000 pages in size after the first PFS page. 
 
A PFS page is the first page after the file header page in a data file (page number 1). This is followed by a GAM page (page number 2), and then an SGAM page (page 3).
   
 
Index Allocation Map (IAM)
qMaps the extents of a database file used by an allocation unit.
qCovers 64,000 extents
qMultiple IAM pages linked in an IAM chain and are located randomly in the file.
qAllocation Unit Types:
IN_ROW_DATA
LOB_DATA
ROW_OVERFLOW_DATA
qHas a header that indicates the
starting extent of the range of
extents mapped by the IAM page.
qHas a large bitmap in which
each bit represents one extent
qIf a bit is 0, the extent it represents is not allocated to the allocation unit owning the IAM.
qIf the bit is 1, the extent it represents is allocated to the allocation unit owning the IAM page
 
Managing Space using IAMs
SQL Server Database Engine has to insert a new row and no space is available in the current page, it uses the IAM and PFS pages to find a page to allocate, a page with sufficient space to hold the row.
   The Database Engine uses the IAM pages to find the extents allocated to the allocation unit.
   For each extent, the Database Engine searches the PFS pages to see if there is a page that can be used.
Note: This means that the IAM and PFS pages are generally in memory in the SQL Server buffer pool, so they can be searched quickly.
   The Database Engine allocates a new extent to an allocation unit only when it cannot quickly find a page in an existing extent with sufficient space to hold the row being inserted.
Note: The Database Engine allocates extents from those available in the filegroup using a proportional allocation algorithm. If a filegroup has two files and one has two times the free space as the other, two pages will be allocated from the file with the available space for every one page allocated from the other file. This means that every file in a filegroup should have a similar percentage of space used.
Tracking Modified Extents
Differential Changed Map (DCM)
   Tracks the extents that have changed since the last BACKUP DATABASE statement.
   If the bit for an extent is 1, the extent has been modified since the last BACKUP DATABASE statement.
   If the bit is 0, the extent has not been modified.
   Differential backups read just the DCM pages to determine which extents have been modified. This greatly reduces the number of pages that a differential backup must scan.
Bulk Changed Map (BCM)
   Tracks the extents that have been modified by bulk logged operations since the last BACKUP LOG statement.
   If the bit for an extent is 1, the extent has been modified by a bulk logged operation after the last BACKUP LOG statement.
   If the bit is 0, the extent has not been modified by bulk logged operations. 
 
Clustered Index Structure
   
Non-clustered Index Structure
 
 


 



 

No comments:

Post a Comment