SQLBits 8 Session Notes #5

SQL2008 Database Internals

Klaus Aschenbrenner
@aschenbrenner
http://www.csharp.at

Everything is about the data page !

SQLServer does IO at the page level = 8KB – reads, writes page-by-page
    even if you only change 1 bit, the whole page is written back, not just the bit.

2 kinds of extent (an extent = 8 contiguous pages = 64KB):
    Uniform – belongs to just one database object
    Mixed – up to 8 different objects share the same extent

New tables and indexes are always created in mixed extents

Existing tables, once they occupy 8 pages, are allocated from uniform extents

Extent management

    done by GAM pages and SGAM pages

GAM Page
    tells you if an extent is used or not
    each extent is represented with 1 bit
    bit not set = extent is used
    one GAM per 4GB of data (the bitmap covers ~64,000 extents, and 64,000 extents × 64KB ≈ 4GB)

SGAM
    tells you if an extent is a mixed extent with at least 1 free page available
    each extent is represented with 1 bit
    bit not set = uniform extent, or a mixed extent with no free pages

Order of pages in the file

File header page = 1st page in a database file
PFS page = 2nd
GAM page = 3rd
SGAM = 4th
5th and 6th pages not used
DCM page = 7th (Differential Changed Map) – extents that have changed since last full database backup
    -ahaMoment = THAT's how differential backups work !
BCM page = 8th (Bulk Changed Map) – which extents have been involved in a minimally logged operation since the last log backup

Indexes

sys.indexes shows heaps as column name = NULL & index_id = 0

Non-clustered index_ids 251-255 are internally reserved by SQLServer, but you can have 999 non-clustered indexes on a single table
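A quick way to see this in the catalog (my query, not from the session):

SELECT OBJECT_NAME(object_id) AS table_name, name, index_id, type_desc
FROM sys.indexes                -- heaps: index_id = 0, name = NULL
ORDER BY object_id, index_id;   -- 1 = clustered, >1 = non-clustered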

Data partitions

    IN_ROW_DATA     – fixed and variable length columns
    ROW_OVERFLOW_DATA    – overflows from IN_ROW_DATA, sql_variant (if required)
    LOB_DATA    –    text, ntext, xml, varchar(max), nvarchar(max), varbinary(max), stored CLR types <– interesting, no ?

Heading for stuff I missed

IAM page stores which extents belong to an allocation unit
    1 per 4GB of data

due to internal requirements, you actually get only 8060 bytes per page 🙂 presume null bitmaps and overheads for varchar etc.
     pageHeader (96 bytes)
        payload (the data rows)
     rowOffsetArray (grows backwards from the end of the page)

/*** look these up ***
DBCC TRACEON (3604)
DBCC IND ()
DBCC PAGE ()
*** Thank you ***/
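A sketch of how the three fit together (MyDatabase / MyTable are placeholder names; DBCC IND and DBCC PAGE are undocumented):

DBCC TRACEON (3604);                      -- route DBCC output to the client instead of the error log
DBCC IND ('MyDatabase', 'MyTable', -1);   -- list every page belonging to the table (-1 = all indexes)
DBCC PAGE ('MyDatabase', 1, 80, 3);       -- dump file 1, page 80; print option 3 = rows plus details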

This is really clever –>
page 1:80
= data file 1, page 80
a page's byte offset = page number × 8192, so you can read page 1:80 by seeking 80 × 8192 = 655,360 bytes into data file 1 and reading the next 8192 bytes (when your database file is detached, of course)

play with DBCC PAGE, it's very cool
OFFSET TABLE – you can find the start of any row by its offset from 0

NOTE: get the slide deck, I'm missing stuff…

Row Offset Array
    2 bytes for every record on the page
    stores the offset of where the record begins
    defines the order of data in the page

great slide on Record Storage !!!

There's a performance impact on adding a column or changing nullability, since each record has to be modified.

Please note: it doesn't matter which order you define your columns in. On the page, it'll be fixed-length columns, then the variable-length columns

SQLBits 8 Session Notes #6

10 do's and don'ts for SQL CLR

Matt Whitfield

CLR can do user-defined aggregates.

Do

1.     Know your transactions
    TransactionScope() class.
    single connection = local transaction
    different connection = MSDTC = two-phase commit
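A minimal sketch of the TransactionScope pattern (plain .NET, my example – the promotion behaviour is the point):

using System.Transactions;

using (TransactionScope scope = new TransactionScope())
{
    // work on a single connection here = stays a local transaction
    // open a second connection inside the scope = promotion to MSDTC / two-phase commit
    scope.Complete();   // commit; without this, Dispose() rolls everything back
}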

2.    Know structures v. classes
        Reference types (classes) are passed around via their reference – pass a list to F and the changes F makes come back to the caller
        Value types (structures) are passed in their entirety – pass one to F and you end up with two copies, with only F's copy seeing the change
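Rough illustration (my own example, not the speaker's):

struct PointStruct { public int X; }   // value type: copied when passed
class  PointClass  { public int X; }   // reference type: the reference is passed

static void Bump(PointStruct p) { p.X++; }   // mutates a private copy
static void Bump(PointClass p)  { p.X++; }   // mutates the caller's object

var s = new PointStruct { X = 1 };
var c = new PointClass  { X = 1 };
Bump(s);   // s.X is still 1 – two copies existed, only Bump's copy changed
Bump(c);   // c.X is now 2 – both sides saw the same object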

3.    Dispose your IDisposables
        if an object implements IDisposable, you should call Dispose() on it explicitly, releasing resources before GC
        Guarantee disposal by using () {} blocks <– this will always dispose on exit of the block, even on exceptions
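The pattern, sketched:

using System.IO;

using (MemoryStream ms = new MemoryStream())
using (StreamWriter w = new StreamWriter(ms))
{
    w.WriteLine("...");
}   // both Dispose() calls have run by here, exception or not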

4.    Aim to write SAFE code
        SAFE is a restrictive permission set – it can't affect the SQLServer process or anything on the outside.
        EXTERNAL_ACCESS allows access to resources outside of SQLServer (ie a web service call in a trigger) <– not good 😦
        UNSAFE – just don't. OK ?

5.    Understand the GAC-lack
        'blessed' assemblies can be loaded from the GAC. And there aren't many of them in the GAC
        Other assemblies need to be loaded using CREATE ASSEMBLY

6.    Understand managed memory
        comes from the MemToLeave address space (kind of goes away)
        multiple pages can be requested.
        Native memory is single-page requests only

7.    Use the context connection
        use 'context connection = true' in the connection string
        a standard connection back to the server requires elevation to EXTERNAL_ACCESS
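Minimal sketch of a CLR proc on the context connection (the proc and its query are made up):

using System.Data.SqlClient;
using Microsoft.SqlServer.Server;

[SqlProcedure]
public static void ObjectCount()
{
    // runs in-process on the caller's own connection – no elevation needed
    using (SqlConnection conn = new SqlConnection("context connection=true"))
    using (SqlCommand cmd = new SqlCommand("SELECT COUNT(*) FROM sys.objects", conn))
    {
        conn.Open();
        SqlContext.Pipe.Send(cmd.ExecuteScalar().ToString());
    }
}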

8.    Write efficient code
        It makes your own life easier.
        If it's complex, put it into a profiler and see what it does first…

9.    Understand 'boxing'
        take a value type, and send it to something that expects a reference type – it's put into a 'box' on the heap, and is then treated as a reference type
        but remember that 1 == 1, while (object)1 != (object)1 – two boxes are two different references
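You can see it directly:

int i = 1;
object a = i;   // boxed: the value is copied into a new heap object
object b = i;   // boxed again: a second, distinct heap object
bool v = (i == 1);                 // true  – value comparison
bool r = ReferenceEquals(a, b);    // false – two different boxes
bool e = a.Equals(b);              // true  – Equals unboxes and compares the values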

10.    Use dictionaries
        finding an object by key rather than looping is the same as an index seek rather than an index scan.
        need to override the GetHashCode and Equals methods of Object
        -note: this makes it unsuitable for fuzzy matching
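A sketch with a made-up key type showing the overrides:

using System.Collections.Generic;

public sealed class CustomerKey
{
    public readonly int Region, Number;
    public CustomerKey(int region, int number) { Region = region; Number = number; }
    public override int GetHashCode() { return (Region * 397) ^ Number; }
    public override bool Equals(object obj)
    {
        CustomerKey k = obj as CustomerKey;
        return k != null && k.Region == Region && k.Number == Number;
    }
}

// the 'index seek': a hash lookup instead of walking every element
var names = new Dictionary<CustomerKey, string>();
names[new CustomerKey(1, 42)] = "Bob";
string hit = names[new CustomerKey(1, 42)];   // only works because of the overrides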

Don't

1.    Add strings together
        strings are immutable (can't change)
        when you add 2 strings together, you actually create a 3rd string
        use StringBuilder
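i.e. (my sketch):

using System.Text;

// 1000 concatenations would allocate ~1000 throwaway intermediate strings;
// StringBuilder appends into one growable buffer instead
StringBuilder sb = new StringBuilder();
for (int i = 0; i < 1000; i++)
    sb.Append(i).Append(',');
string csv = sb.ToString();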

2.    Go overboard
        You can do good stuff and bad stuff.
        Don't wrap T-SQL in CLR – doesn't add value

3.     Forget that data scales
        don't implement anything that requires a fixed data size
        don't assume that the fastest access method on small data sets is still as fast on large ones.

4.    Treat null as an afterthought
        Always assume you can receive NULLs, either as a parameter or in returned data
        Nullable types represent NULL with the HasValue property – the SqlTypes use the IsNull property
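Sketch of a NULL-safe scalar function (the function itself is made up):

using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

[SqlFunction]
public static SqlString FirstWord(SqlString input)
{
    if (input.IsNull)                 // check IsNull before touching .Value
        return SqlString.Null;
    string s = input.Value;
    int space = s.IndexOf(' ');
    return new SqlString(space < 0 ? s : s.Substring(0, space));
}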

5.    Use finalisers
        the GC calls finalisers (destructors) for you, so you don't have to.
        a finaliser makes the object survive into Generation 1 before it can be collected (look this up)

6.    Re-invent the wheel
        take time to research what you want to do. The function may actually already exist.

7.    Access data in scalar functions
        String manipulation is OK
        Accessing data in scalar functions is NOT cool. Trust me.

8.    Over optimise
        simple, well-structured code often has suitable performance characteristics

9.    Move the middle tier in
        it will make it far more difficult to scale out when you need to.

10.    Call Environment.Exit
        this just bins you out of SQLServer, sometimes VERY ungracefully – won't roll back, won't finish your sp call…..

SQLBits 8 Session Notes #4

SSIS Dataflow performance tuning

Jamie Thomson

Buffer Architecture

A buffer is:
    an area of memory
    created by an asynchronous component
    doesn't move or change shape
    data in buffer can be changed by components
    is what you see in a data viewer

you know what ? this isn't really a good time to be typing. I need to listen. Get the slide deck.

SQLBits 8 Session Notes #3

Optimising MS BI

General

Scalability != Performance

Cool tools:

Xperf
xEvent
Coreinfo.exe

    Consider symbols from SQLCAT to get internal SQL function names and their calls.

    Really deep analysis of what's going on in the OS

Data Loading

'Minimally Logged' logs allocations, not all rows / pages, but can be wrapped in a transaction and rolled back. TRUNCATE TABLE is minimally logged (cf Kalen Delaney)

Bulk Load is less chatty than singleton inserts – less round tripping.

INSERT myHeap WITH (TABLOCK) SELECT….

PLEASE NOTE: HEAPS ONLY

Clustered index insert takes an exclusive RANGE lock that will expand as more data is inserted. Can lead to concurrency problems and range overlaps.

SQLServer Destination – IS must run on the same server as SQLServer <– THAT's why I don't like it ! (uses Shared Memory connections)

/** Look up parallel load design patterns **/
/** Look up brute force hash partitioning design patterns **/
/** Look up large update design patterns **/

"inserting into a partitioned table requires that the input is sorted by the partitioning key, even if you only hit one partition." <– hmm. Worried much ? Not sure… if we're not bulk loading.

Switching partitions in and out requires an exclusive lock on the table. SCH-M lock required (but only for a few ms)

Tuning SQL Server

sys.dm_os_wait_stats – PAGELATCH_UP (resource id) can indicate problems at the PFS page.

Wait type RESOURCE_SEMAPHORE = waiting for memory to be allocated to process.
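Quick look at the top waits (my query, not from the session):

SELECT TOP (20) wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC;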

Fixing SOS_SCHEDULER_YIELD:
    Terminate and reconnect – don't use the same connection for the entire package.
    Soft NUMA (qv)
    Or give the network a kick…
    Or decrease parallelism…

BULK INSERT problem waits that indicate IO problems on source:
    IMPPROV_IOWAIT
    OLEDB
    ASYNC_NETWORK_IO

Tuning network

Affinitise NICs to CPU cores – Interrupt-Affinity Policy tool

Anecdotal evidence that 1 core in 8 is needed just to service the network on busy systems.

/**Look up Jumbo Frames**/

Discrepancy between reads and writes on NIC properties ? Could be small packet sizes (the default is 4096 bytes) – consider the max of 32767 (set on both the connection string & sql server – NETWORK PACKET SIZE)
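Sketch of the two halves (32767 bytes is the maximum):

-- server side:
EXEC sp_configure 'network packet size (B)', 32767;
RECONFIGURE;
-- client side: add  Packet Size=32767  to the connection string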

Cubes

SSAS is not meant to be 'infinitely scalable' 🙂

There's an overhead for each column that is extracted, but it varies by the DATATYPE, not the size of the actual values in the column.
    MONEY is *sometimes* more efficient than DECIMAL – it's dependent on the characteristics of how the specific datatype is stored.

Loose definitions:
    Large = won't fit in memory
    Fast Extract = saturates more than 1 NIC

Hyper-threading appropriate for SPINLOCK performance.

Partitioning a cube:
    Roughly equal size partitions
    'Matrix' partitions (qv)
    Recommendation for max of ~1000 – 2000 partitions – higher would impact exploring / processing the cube

Distinct count – better performance if your partitions don\’t intersect:
    1000 – 2000, 2001 – 3000, 3001 – 4000 instead of 1000 – 2500, 2000 – 8000, 1000 – 12000

DSVs can be pointed at a specific NIC (by IPAddress), so you can improve NetworkIO performance from the data source (provided you've got multiple NICs)

Process Index / Aggregate:
    If MSOLAP:Memory / Quota Blocked > 0 = not enough memory for aggregations.
        Cure: Reduce AggregationMemoryMin and AggregationMemoryMax from their default values of 10% and 80%
    Increasing CoordinatorBuildMaxThreads can increase parallelism – default is 16 per partition

ProcessAdd dimension processing – don't assume that it's fast just because it only adds new rows. Size of the dimension makes a difference – the bigger, the slower.

ProcessFull vs. Incremental
    Update can take a long time.
    Always a good idea to throw in a ProcessFull every once in a while
        –  ProcessUpdate once an hour, but do a ProcessFull on the newest partitions every day.

Scaling Out
    by creating copies of the cube (cubeSync or RoboCopy) and load balance access from users to those copies.
    Note: cubeSync not really fast, better performance from RoboCopy.
    -caveat: flushing the cube from memory to disk (to RoboCopy) can make something that fits in memory very large on disk.

Parent / Child Dimensions
    Aggregates only target the [all] and leaf levels
    Performance can suffer with as few as 10K members

MOLAP dimensions + ROLAP facts are OK, but NEVER use referenced dimensions. Ever.

Natural hierarchies are very efficient.

Junk dimension – use a dimension key built from the keys of all its attributes using binary addition. <– probably not good. Avoid if possible.

It's possible to avoid *some* types of slowly changing dimension:
    CustomerDim and CustomerDemographicDim, splitting out some of the attributes of the customer into another dimension entirely
    BUT complex when used too often
    Must be done at the relational level.

/**Look up subspace calculation / block computing**/

String Store

Every string has an overhead of 12 bytes for management
    a 1-byte string takes up more space than an 8-byte integer
    String store has a maximum size of 4GB
    Cure: on larger dimensions, prefer integer columns over strings (but I do anyway)

4GB limit applies to any store
    *.ksstore
    *.asstore
    *.bsstore

This limit is removed in SQL Server Denali

Magic Settings

For use when you just can't change the design (I didn't catch most of the setting names – get the slide deck):

    ?    –    speed up cubes
    Threadpool Tuning    –    concurrency
    ?    –    multiuser, predictability
    ?    –    multiuser, predictability
    ?    –    multiuser, speed
    ?    –    multiuser, speed

Real Time Cubes

Need to agree on latency
    "as fast as possible" – define 'possible'
    how do you measure it ?

Identify data classes:
    'real' time – must reside in memory. Locking/Concurrency requirement
    stale, but recent – perhaps in a different table / partition / database
    archive – perhaps in a different table / partition / database / server

Real time: SNAPSHOT or RCSI isolation for low latency
BULK load or singleton INSERTs, etc, etc,…

If "realTimeOLAP=True" is in the connection string, this turns off ALL caching. ROLAP data will be consistent, but any MOLAP will be affected.

DON'T:
    Parent / Child
    Many-Many
    distinct count

Arbitrary shapes will cause massive round-tripping (Hierarchies with holes)

Resources

"The Data Loading Performance Guide"
Codeplex – 'Parent/Child Naturalizer' (sic)

SQLBits 8 Session Notes #2

Tips for working with large data volumes

Simon Sabin
simon@sqlknowhow.com
http://www.sqlblogcasts.com/blogs/simons/
@simon_sabin

big deletes are a problem because:
    deleting rows requires 2x log space – 1x for the change itself, 1x for the rollback information
    not a minimally logged operation
    deletes are required from index pages as well

deleting rows uses the clustered index, BUT will delete entries in non-clustered indexes as well (cf ClusteredIndexDelete properties in the execution plan)

A wide delete will *explicitly* delete from a non-clustered index, rather than implicitly as above.

Wide delete is the default for a 'large' table (typically over 1M rows), and is a sequential operation <– could cause concurrency issues if someone takes an exclusive lock on a row that's a delete candidate.

pseudo-image:
    table –>
        rows to delete –>
            Clustered index
            NC index 1
            NC index 2
            ……

Deletes done by PK / RID. Possible random IO
    -solve: index on predicate (good), covering index on predicate and primary key (better)
    -why: avoids the SORT operation in execution plan

Consider batching up the delete – DELETE TOP (1000)….. (SQL2008 only – and there's no ORDER BY if using DELETE TOP) – see the sketch below
    but try and keep high selectivity, with no subselects or joins to get the delete rows
    -note: batch ranges should not overlap, or you get blocks and / or deadlocks
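Sketch of the batching loop (table and predicate are made up):

WHILE 1 = 1
BEGIN
    DELETE TOP (1000)               -- no ORDER BY allowed here
    FROM dbo.BigTable
    WHERE ExpiryDate < '20100101';  -- keep the predicate selective

    IF @@ROWCOUNT = 0 BREAK;        -- nothing left to delete
END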

You could alternatively put all the key values from the clustered index into a temporary table (or table variable), then join back to it for the delete – sketched below
    -why: avoids clustered index scans if seen in the estimated execution plan
    -also: batch it up with a while loop, but be very careful about how you construct this
        -why: it has an impact on the transaction log (beware implicit transactions)
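Sketch of the key-capture variant (names made up):

-- grab the clustered key values once...
SELECT OrderId
INTO #doomed
FROM dbo.Orders
WHERE Status = 'Archived';

-- ...then delete by joining back on the clustered index
DELETE o
FROM dbo.Orders AS o
JOIN #doomed AS d ON d.OrderId = o.OrderId;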

NOTE: 'set rowcount' is deprecated. It's technically a server option, not a query option.

no index on predicate will result in a table scan if you're using any TOP construct.

if you batch up to run in parallel (multiple DELETE threads), that's fine for the clustered index, but you could get overlaps when deleting from the non-clustered indexes.

TakeAway

make sure finding rows is batchable
or
read all rows and batch that
align all deletes with primary key / clustered index.


SQLBits 8 Session Notes #1

Advanced Analysis Services Best Practices

Marco Russo
marco@sqlbi.com

Relational schema

    Snowflake schemas not always great for performance, because the JOIN in A/S is not very smart

    Star schemas don't do these types of JOINs
        use views to generate the star schema
        eliminates ambiguity

Data source decoupling
    use views to decouple layers
        views at the database layer so they're visible to the DBA
        one view for each dimension
        columns in the view should be the same as the attributes in the dimension
        avoid calculated columns in the view
        use LEFT JOIN over INNER JOIN (?) – see the sketch below
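Sketch of such a dimension view (tables made up) – the snowflake gets flattened into a star shape, with LEFT JOIN so a missing lookup row can't silently drop members:

CREATE VIEW dbo.DimProduct AS
SELECT p.ProductKey,
       p.ProductName,
       c.CategoryName
FROM dbo.Product AS p
LEFT JOIN dbo.ProductCategory AS c
    ON c.CategoryId = p.CategoryId;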
       

Dimensional Patterns
    surrogate keys independent from application keys
        -but: they don't have semantic value, so they should not be visible to the end-user: AttributeHierarchyVisible = False

    attribute keys
        use unique keys, again hidden from the end-user


NOTE: Get the slide deck and presentation off the website, there's more there than I can write down now.

Grouping
    automatic or manual ?
    manual gives more control, and can be built into a view for the dimension
    use business logic in the view

Banding
    categorise a measure according to a range of values
    Direct relationship or indirect relationship – e.g. banding built directly into the dimension view, as sketched below
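e.g. direct banding done in the view (ranges made up):

CREATE VIEW dbo.DimOrderSizeBand AS
SELECT OrderId,
       CASE WHEN OrderValue < 100  THEN 'Small'
            WHEN OrderValue < 1000 THEN 'Medium'
            ELSE 'Large'
       END AS OrderSizeBand
FROM dbo.Orders;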

SCD
    type1
        process update doesn't detect duplicate key errors
            process full would fail if there were duplicates
        flexible aggregations need to be regenerated

    type2
        process add instead of process update
        attribute relationships always Rigid

Junk Dimensions
    OK I missed that….

Parent – Child hierarchies
    if performance is good enough, stick with them
    -but: one PCH per dimension
        must use attribute key
        unique name for a member includes surrogate keys (which we don't want to show to the end user)
    -cf: Parent – Child naturalizer from Thursday

Role Dimensions
    look this up, he's very fast…
    -alternative: create a view for each dimension (duplicate dimension)

Drill-through
    BIDS won't let you change the column order
        -but: you can edit the XML directly to reorder the columns (but don't ever open it in a GUI again)
    not supported on calculated members

Calculation Dimension
    select [name] from [delegate] where [willToLive] > 0
        (0 rows affected)
    Time Intelligence Wizard
        there are limits in BIDS. Perhaps it's not a great tool 🙂
    remove [All] member (IsAggregatable = false)

Conclusion
    Avoid MDX like the plague
    Don't tell the end-user anything they don't need to know
    Calculate values in the ETL layer