The postings on this site are my own and do not represent my Employer's positions, advice or strategies.

LifeAsBob - Blog

 

Home

No Ads ever, except search!
Friday, May 10, 2024 Login
Public

Moving Blobs 9/12/2008 3:29:35 PM

Moving blobs to a dedicated file group, maybe with several data files, is a very good idea !  Of course, if you wait to do this until your database grows quite large, you've got a problem.  I define large as 50+ gb.

Currently I have an issue with a 120gb database.  100gb of the data is in one table, with an image data type.   I'd like to isolate this tables blob data to a dedicated file group, made up of 4 data files.  This is not difficult, what is difficult is making it happen. 

I've experimented with doing this 3 ways:

1.  SQL Management Studio  - 28 hours.

2.  SSIS - 18 hours.

3.  Custom TSQL Scripts, 4 spids, dividng the table up, 12 hours.

I have found all of them to be very painfully slow.  So slow in fact that I'm not able to do it, as I currently can't take a 12+ hour outage. 

Knowing your volumetrics and design up front is very important for this exact reason.

SQL Server 2005 can do this quicker with parrallel loads, if the table is partitioned, but this is a 3rd party database from a vendor and can't support partioned tables.

SQL Server 2008 provides some new options for storing the blobs outside the database, which is always my preference if possible.

Below is the script for option 3, handy sometimes:

 

use emailxaminer_run

go

/*

select min(fileid),max(fileid) from rt_filecache

-- calculations to figure out each spid to run.

1. 980,066 to 7,553,806 start = min(fileid)

2. 7,553,807 to 14,127,547

3. 14,127,548 to 20,701,288

4. 20,701,289 to 27,275,029 end = max(fileid)

select 27275029 - 980066

select 26294963 / 4 --6573740

-- thread 1

select 980066 + 6573740 -- 7553806

select 14127548 + 6573740

*/

declare @min int

declare @max int

Declare @startPosition int

declare @endPosition int

declare @currentPosition int

set @startPosition = 980066

set @endposition = 30000000

set @currentPosition = @startPosition

declare @increment int

set @increment = 1000

While @currentPosition < @EndPosition

Begin

begin tran

INSERT INTO dbo.Tmp_RT_FileCache (FileID, MessageId, FileType, FileSize, datelastmod, FileName, FileBLOB)

SELECT FileID, MessageId, FileType, FileSize, datelastmod,

FileName, FileBLOB

FROM dbo.RT_FileCache with (nolock)

Where FileId Between @currentPosition AND

(Case WHEN @currentPosition + @increment > @endPosition then @endPosition

Else @currentPosition + @increment END)

commit tran

Print 'Where Clause = ' + convert(varchar(10),@currentPosition) + ' AND ' +

Convert(varchar(10), (Case WHEN @currentPosition + @increment > @endPosition then @endPosition

Else @currentPosition + @increment END))

Set @currentPosition =

Case WHEN @currentPosition + @increment > @endPosition then @endPosition

Else @currentPosition + @increment + 1 END

--Print @currentPosition

End


Blog Home