HOW TO IMPLEMENT PARTITIONING:

By: Albert

Tables might get large. Or even very large, like for example having 8000 Million rows.
As a consequence, the associated indexes are large too.

We already know that we can "partition", or separate, (whole) tables and indexes on there own "filegroups".
In this chapter, we go one step further, by "partitioning" distinguisable subsets of one table on their own filegroups.

There are two main reasons to partition a Table or Index.

You might enhance performance because of parallel IO on multiple FileGroups, and
You might have faster and smaller backup, because new and active records are in a certain partition,
while historical records live in other partitions (thus filegroups), which you do not need to backup that often.

In this section, we show the "partitioning" of subsets of a table, on their own filegroups.

Ideally, you would have a certain column in your table, that lends itself easily to "get partioned". In other words,
that particular column would have values that are easily divided into subsets, like for example "years" or "months".
For example, you might have a table with some date column, and you could group records belonging to
to the years "2000", "2001, "2002" etc..

But even if it doesn't go in such a "natural way", you can always force some form of subsetting. For example, you have
a nummeric column, and you just are going to distinquish the following subsets:

1 - 10000000
10000001 - 20000000
20000001 - 30000000
etc..

Creating a partitioned Table, or Index, is a 3 step process.

Don't forget that the whole idea behind partitioning a table is this: create Ranges of values, where the rows of the table will
fall into, and make sure that you can store those different record subsets (the different Ranges), onto separate Filegroups.

- We start by defining a Partition Function. This is function that defines the boundaries, or Partition Ranges, that the subsets of rows will use.
- Secondly, we create a Partitioning Scheme, that defines the mappings of the Partition Ranges to FileGroups (individual storage structures).
- Thirdly, we create a Table, using the definitions above.

That's all.

So, a simple example will illustrate this.

1. Suppose we have a certain database, which uses the filegroups FG1, FG2, FG3 and FG4.

Suppose we have a table PARTSAMPLE that we want to partition. It uses the columns ID (datatype INT) and NAME (varchar(20)).
The values that ID can take, are from 1 to higher (for example 1,1000,1500,7000,20000,30000,89000 etc..

2. Now let's define the "Partition Ranges", or boundaries, that the subsets of rows can take:
We do that by creating a Partition Function, whereas later we are going to "bind" it somehow to the table.

CREATE PARTITION FUNCTION samplefunction (INT)
AS
RANGE LEFT FOR VALUES (10000, 20000, 30000)
GO

This function, is an independent object in the database. Later we will use it in the PARTSAMPLE table definition.

The "LEFT" or "RIGHT" in the function definition means if you want the interval to "leftsided" or "rightsided" as in:
10001 - 20000 (right)
10000 - 19999 (left)

3. Next, we will define the "Partition Scheme", that will relate the Ranges to the Filegroups:

CREATE PARTITON SCHEME samplescheme
AS
PARTITION samplefunction TO
([FG1], [FG2], [FG3], [FG4])
GO

You see that in the Partition Scheme definition, we relate the FileGroups to the "samplefunction" function, thus thereby
relating the FileGroups to the Ranges (or boundaries).

4. As the last step, we will define our PARTSAMPLE table, using the Partition Scheme defined above.

CREATE TABLE PARTSAMPLE
( ID INT NOT NULL,
NAME VARCHAR(20) NOT NULL
)
ON samplescheme(ID)
GO

So, if a record with an ID of 15000 would be inserted, it would be stored on FileGroup "FG2".
Likewise, if a record with an ID of 25000 would be inserted, it would be stored on FileGroup "FG3".

For an existing table, you could create an empty partitioned table with another name.
Next, INSERT... SELECT... the rows from the original non partitioned table to the new partitioned table.
When done, you can drop the original, and rename the new table.

Often, it's not that easy, because of various reasons like:

- take into account the PK-FK relations with other tables.
- it could take a long time, and costs in performance if the table is really large.
- the original table should not have any other transactions while inserting to the new table.
- take into account rebuilding and/or partitioning indexes as well.

So, partitioning a really large table, could be considered to be something like a "project"...