SQLBits 8 Session Notes #2

Tips for working with large data volumes

Simon Sabin
simon@sqlknowhow.com
http://www.sqlblogcasts.com/blogs/simons/
@simon_sabin

big deletes are a problem because:
    deleting rows requires 2x log space – 1x change transaction, 1x rollback information
    not a minimally logged operation
    deletes are required from index pages as well

deleting rows uses the clustered index, BUT entries in the non-clustered indexes are deleted as well (cf. the Clustered Index Delete operator's properties in the execution plan)

A wide delete will *explicitly* delete from each non-clustered index, rather than implicitly as above.

A wide delete is the default for a 'large' table (typically over 1M rows), and is a sequential operation – this could cause concurrency issues if someone takes an exclusive lock on a row that's a delete candidate.

pseudo-image:
    table ->
        rows to delete ->
            Clustered index
            NC index 1
            NC index 2
            …

Deletes are done by PK / RID, so there is possible random IO
    -solve: index on the predicate (good), or a covering index on the predicate plus the primary key (better) – see the sketch below
    -why: avoids the SORT operation in the execution plan
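
A minimal sketch of that, assuming a hypothetical dbo.Orders table with a clustered primary key on OrderID and deletes driven by an OrderDate predicate (none of these names are from the session):

    -- good: index on the predicate column
    CREATE INDEX IX_Orders_OrderDate
        ON dbo.Orders (OrderDate);

    -- better: predicate plus the clustered primary key as key columns,
    -- so the rows to delete come back already ordered by the key
    -- and the SORT disappears from the plan
    CREATE INDEX IX_Orders_OrderDate_OrderID
        ON dbo.Orders (OrderDate, OrderID);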

Consider batching up the delete – DELETE TOP (1000)… (SQL 2008 only – and there's no ORDER BY if using DELETE TOP); see the loop sketch after this list
    but try to keep the predicate highly selective, with no subselects or joins to find the rows to delete
    -note: batch ranges should not overlap, or you get blocking and/or deadlocks
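
A minimal sketch of that loop, using the same hypothetical dbo.Orders table and an illustrative @cutoff variable (both are assumptions, not from the session):

    DECLARE @cutoff date = '2010-01-01';  -- illustrative value

    WHILE 1 = 1
    BEGIN
        -- each DELETE runs as its own autocommit transaction,
        -- so log space can be reused between batches
        DELETE TOP (1000)
        FROM dbo.Orders
        WHERE OrderDate < @cutoff;

        IF @@ROWCOUNT = 0
            BREAK;  -- nothing left to delete
    END;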

You could alternatively put all the key values from the clustered index into a temporary table (or table variable), then join back to it for the delete; see the sketch after this list
    -why: avoids clustered index scans (check for them in the estimated execution plan)
    -also: batch it up with a while loop, but be very careful about how you construct this
        -why: it has an impact on the transaction log (beware implicit transactions)
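
One way that might look as a sketch, again with the hypothetical dbo.Orders / OrderID / @cutoff names; the staging table, batch size and OUTPUT-into-a-table-variable step are all assumptions about how you might wire it up:

    DECLARE @cutoff date = '2010-01-01';  -- illustrative value
    DECLARE @batch  int  = 1000;

    -- stage the clustered key values of the rows to delete
    SELECT OrderID
    INTO   #ToDelete
    FROM   dbo.Orders
    WHERE  OrderDate < @cutoff;

    CREATE UNIQUE CLUSTERED INDEX IX_ToDelete ON #ToDelete (OrderID);

    DECLARE @keys TABLE (OrderID int PRIMARY KEY);

    WHILE 1 = 1
    BEGIN
        DELETE FROM @keys;

        -- take the next batch of keys off the staging table;
        -- batches can never overlap because each key is removed as it is taken
        DELETE TOP (@batch)
        FROM #ToDelete
        OUTPUT deleted.OrderID INTO @keys;

        IF @@ROWCOUNT = 0
            BREAK;

        -- delete those rows from the base table, aligned with the clustered key
        BEGIN TRAN;

        DELETE o
        FROM dbo.Orders AS o
        JOIN @keys AS k
          ON o.OrderID = k.OrderID;

        COMMIT;
    END;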

NOTE: 'set rowcount' is deprecated. It's technically a session-level setting, not a per-query option.
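
For contrast, the older deprecated pattern looked roughly like this (a sketch only, using the same hypothetical names):

    DECLARE @cutoff date = '2010-01-01';  -- illustrative value

    SET ROWCOUNT 1000;  -- caps rows affected for the whole session, not just one query

    WHILE 1 = 1
    BEGIN
        DELETE FROM dbo.Orders
        WHERE OrderDate < @cutoff;

        IF @@ROWCOUNT = 0
            BREAK;
    END;

    SET ROWCOUNT 0;  -- always reset, otherwise later statements are silently capped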

No index on the predicate will result in a table scan if you're using any TOP construct.

If you batch up to run in parallel (multiple DELETE threads), that's fine for the clustered index, but you could get overlaps when deleting from the non-clustered indexes.

TakeAway

Make sure finding the rows is batchable,
or
read all the rows and batch that.
Align all deletes with the primary key / clustered index.

