MS-SQL Internals
modified: 2015-06-02 23:10:36

    ADO.NET - .NET
    OLE DB - COM [being phased out]
    ODBC - open standard
    JDBC - Java

    [DB SERVER] <---> [MC-SMP] <---> [CLIENT]

The MC-SMP (Session Multiplex Protocol), which is used to multiplex database communication sessions over a single reliable transport connection.

The TDS (Tabular Data Stream), specified in MS-TDS and MS

MC-SQLR (Microsoft Server Resolution Protocol)

MS-CEPM (Microsoft Complex Event Processing engine Manageability Protocol)


    SQL Server Database Engine
    SQL Server Reporting Services
    SQL Full-text Filter Daemon Launcher
    SQL Server Browser

MARS (Multiple Active Result Sets)

    Power Query
        Query Folding
            The transformations take place in the server database engine. These reduce network usage. Use
            SQL Server Profile to see that folding is actually happening.
    SQL Server Profiler

5/22/2015 Notes:

    MSDTC (Microsoft Distributed Transaction Coordinator)
        A process that handles dual-phase commits for transations that span multiple SQL Servers.
        discipline and tools that enable management of data for the purpose of analysis, exploration,
        reporting, mining, and visualization. Development for all these services can be done using the
        new _SQL Server Data Tools_.
            Integration Services (IS/SSIS)
                Extract-Transform-Load (ETL) tool; uses GUI
                may gather data to be used for Analysis Services
                tries to replace the usage of custom programming of T-SQL
            Reporting Services (RS)

            Analysis Services (AS/SSAS)
                hosts multidimensional databases where data is stored in cubes, whereas
                data mining provides methods to analyze datasets for nonobvious patterns
                in the data
            OLAP (Online Analytical Processing)
                massive data visualization
                uses MDX (Multidimensional Expressions) to query the database
            MDX (Multidimensional Expressions)
                set-based query language tailored to retrieving multidimensional data
            Data Mining
                mining algorithms
                finding associations, forecasting, classifying cases into groups
                predicting new cases to occur; most profitable cusomters; estimating expected component failure rates
            Reporting Services (RS/SSRS)
                managed reporting solution
                reports can be scheduled
                most DBA's place it on its own dedicated server for better performance
            SQL Server Management Studio
                uses Visual Studio
                integrated environment thats used by DBAs and developers
                    - visual object explorer
                    - query editor
            SQL Server Configuration Manager
                - start and stop any server
                - set start-up options
                - configure connectivity
                - show services and servers running on a particular server
            SQL Profiler/Trace/Extended Events
                - profiler being deprecated in favor of Extended Events
                    = will enable deeper level of tracing with decreased load on server
            Performance Monitor
                - found in Window's administrative tools
                - server installation installs a lot of useful performance counters
            Database Engine Tuning Advisor
                - analyzes a batch of queries (from Profiler) and recomments
                  index and partition modifications for performance
            Command-Line Utilities
                These tools are rendered somewhat obsolete by Integration Services
                and SQL Server Agent, but they are still included.
                - sqlcmd = execute SQL code
                    Management Studio has a mode that allows you to use the query
                    editor as if it is sqlcmd
                - bcp = bulk copy program
                - has AdventureWorks sample database
            Editions of SQL Server
                - Enterprise (mission critical applications)
                - Business Intelligence
                - Standard (basic database capabilities including reporting and analytics)
                - Datacenter (USE ENTERPRISE)
                - Workgroup (USE STANDARD)
                - Standard for small business (USE STANDARD)
            Big Data
            Transactional Data
                - data describing an event
                - always have a time dimension, numerical value, and refers to one or more objects
            Dimensional Data
                - data describing a fact
                - may have a time dimension
            Data Store
                architectural term for a database
            Set Based SQL
                The query is written so that the data store engine can optimize it.
                = use existing language features to perform operations
            Procedural Based SQL
                How to produce the results is controlled by procedure code.
                = writting your own logic which could be done by built-in logic
            Correlated Subquery
                - Where the outer query has to be executed first.
                - Hence inner query is correlated with outer query
            Table Variable Versus Cursor Versus Temp Table
                = table variable uses more RAM
                = cursor uses more CPU
                = temp table uses more I/O
                = table variable resides in memory 100% of the time
                = cursor
                = temp table usually resides in the tempdb and requires inter-db communication which is slow
                = temp table requires lot of disk I/O and resource usage because:
                    - temp table has to be created
                    - data has to be inserted on the temporary table
                    - often has to be joined with a physical table to obtain result
                    - lock has to be established while updating data on temp table
                    - temporary table has to be dropped
                = table variable has automatic resource release
                DECLARE @Elements TABLE
                    Number INT IDENTITY(1,1),
                    ProductName VARCHAR(300),
                INSERT INTO @Elements SELECT Name from dbo.Products
                WHILE @N < @Count
                    SELECT @CurrentValue = ProductName FROM @Elements WHERE Number = @N
                    print @CurrentValue
            Information Architecture Principle
                - enterprise extends basic ideas of designing a single database to include:
                    - which types of databases serve which needs within the organization
                    - how those databases share resources
                    - how they communicate with one another and other software
                    - community planning or zonining
                    - concerned with applying the best database meta-patterns
                - compromises three main areas of information management
                    - database design and development
                    - enterprise data center management
                    - business intelligence analysis
                - six attributes by which every database can be measured
                    - function of design, development, and implementation
                        - usability (design)
                            ! completeness of meeting the organization's requirements
                            = suitability of the design for the purpose
                            = effectiveness of data format for applications
                            = robustness of the database
                            = east of extracting information (by programmers and power users)
                            * common reason why a database is less than usable is an overly complex
                              or in appropriate design
                            * thorough and well-documented understanding of the organizational requirements
                            * life-cycle planning of software features
                            * selecting the correct meta-pattern (transactional and dimensional) for the data store
                            * normalization and correct handling of optional data
                            * simplicity of design
                            * well defined abstraction layer
                        - extensibility (design)
                            ! information must be readily available today and in the future
                            ! easily adapts to meet new requirements
                            = normalization and correct handling of optional data
                            = generalization of entities when designing the schema
                            = data-driven designs that not only model the obvious data, but also
                              enable the organization to store the behavioral patterns, or process
                            = well defined abstraction layer that decouples the database from all
                              client access, including client apps, middle tiers, ETL, and reports
                            = complexity breeds complexity and inhibits adaptation; simple solution
                              is easy to understand and adopt, and ultimately, easy to adjust later
                        - data integrity (design)
                            ! with out data intergrity a query answer can not guarenteed to be correct
                            = entity integrity
                                involes the structure of the entity; the tables primary key enforces intergrity
                            = domain integrity
                                ensures that only valid data is permitted in the attribute
                                set of possible values for an attribute; nullability is also part of this
                                data type and nullability of the row enforce this
                            = referential integrity
                                domain integrity of foreign keys
                            = transactional integrity (atomicity)
                                ensures that every logical unit of work, such as inserting 100
                                rows or updating 1000 rows, is executed as a single instruction
                                ACID properties
                                    atomic - transaction is not blended with other transactions
                                    all or nothing - transaction either happens or it does not
                                    consistent - data store begins and ends in consistent state
                                    isolated - does not effect other transactions
                                    durable - once commited always commited
                        - performance (design)
                            = well designed schema with normalization and generalization and correct
                              handling of optional data
                            = set-based queries implemented within a well-defined abstraction layer
                            = sound indexing strategy, including careful selection of clustered and
                              non-clustered indexes
                            = tight, fast transactions that reduce locking and blocking
                            = paritioning, which is useful for advanced scalability
                        - availability (implementation)
                            information accessibility when required regarding uptime, location,
                            and the availability of the data for future analysis; disaster recovery,
                            redundancy, archiving, and network delivery
                            = quality, redundant hardware
                            = SQL Server's high availability features
                            = proper DBA procedures reggarding data backup and backup storage
                            = disaster recovery planning
                        - security (implementation)
                            for any organizations asset, the level of security must be secured
                            depending on its value and sensitivity
                            = physical security and restricted access of the data center
                            = defensively coding against SQL injection
                            = appropriate operating system security
                            = reducing surface area of SQL Server to only those services and
                              features required
                            = identifying and documenting ownership of the data
                            = granting access according to the principle of least privilege,
                              which is the concept that users should have only the minimum
                              access rights required to perform necessary functions within
                              the database
                            = cryptography - data encryption of live databases, backups, and
                              data warehouses
                            = metadata and data aubit trails documenting the source and veracity
                              of the data, including updates
            - data architect is responsible for informing the organization about
              these six objectives; cost associated with meeting them; risk of
              failing to meet the objective; and recommened level for each objective
            - operation store
                - tuned for a balance of data retrieval and updates
                    - indexes and locking are key concerns
                    - recieve first-generation data
                    - subject to data update anomalies and benefit from normalization
            - caching data store (reporting store)
                - optional read-only copies of all or part of an operational database
                - organization might have multiple caching data stores to deliver data throughout
                - might use SQL server replication or log shipping
            - Replication
                - log shipping
                    - automatically send transaction log backups from a primary database on a primary server
                      to one or more secondary databases on seperate secondary server instances; transaction
                      log backups are applied to each of the secondary databases individually
                      - a third server instance, known as the monitor server, records the history and state of
                        backups and restore operations and, optionally, raises alerts if these operations fail
                        to occur as scheduled
                - transactional replication
                    - typically used in server-to-server scenarios that require high throughput; improving
                      scalability and availability
                    - typically used for reliable and consistent communications
                    - lowest update latency
                - merge replications
                    - primarily designed when possible data conflicts can occur
                    - TODO: READ MORE ABOUT THIS
                    - typically used when subscriber may not always have communication
                - snapshot replication
                    - complete refresh of data (non-incremental)
                    - highest update latency
                - Sync Framework
                    - TODO: READ ABOUT THIS
            Database Index
                is any data structure that improves the performance of a lookup
                - non-clustered
                    - physical order of data rows is not the same as the index rows
                    - indexed columns are typically non-primary key columns used in JOIN, WHERE, and ORDER BY clauses
                    - multiple can exist on a single db table
                - clustered
                    - only one can exit on a db table at a time
                    - physical order of data rows is the same as index rows
                - cluster
                    - when multiple databases and multiple tables are joined, it's referred to as a cluster
                    - cluster key joins records together as one
                    - can be keyed with a B-tree or hash table
                    - a group of items
                - column order
                    [city] [last name] [first name] [phone number]
                    - fast to enumerate phone numbers for a specific city
                    - slow to enumerate phone numbers for a specific last name
                    - slowest to enumerate phone numbers for a specific first name
                - full table scan
                    - happens when no index is used
                    - happens when impossible to use index to find exact match
                        - SELECT email_address FROM customers WHERE email_address LIKE '%@wikipedia.org';
                        - can be prevented by creating an index targeting the query
                            - create index on reverse(email_address)
                            - SELECT email_address FROM customers WHERE reverse(email_address)
                                - LIKE reverse('%@wikipedia.org'); would not work with above
                                    - must create specific index for it
                - types of index
                    - bitmap index
                        - works well for variables that repeat
                    - dense index
                        - pairs of keys and pointers for every record in the data file
                        - pointers for EVERY RECORD
                        - in clustered indexes with duplicate keys, it points to the first record with that key
                        - data file may be sorted or unsorted
                    - sparse index
                        - pairs of keys and pointers for every block in the data file
                        - every key is associated with a particular pointer to the block in the sorted data file
                        - data file is sorted, creating blocks
                    - reverse index
                        - reverses the key value
                        - very useful where new key values monotonically increase (increase by one)
                        - 1234 becomes 4321
                - index implementations
                    - balanced trees, B+ trees, and hashes
                    - concurrency control
                        - in principle indexes can utilize the common database concurrency control methods,
                          however specialized concurrency control methods for indexes exist
                - covering index
                    in most cases an index is used to quickly locate the data records(s); not to return data
                    a covering index is a special case where the index itself contains the required data field(s) and can return data
                    - slows down data insertion and update
                    - can increase total size of table by a large amount
                    - some systems allow including non-key fields in the index.,,.l
            ISO SQL Standard
                A standard governing SQL.
            Data Cube
                - also called a hypercube
                - three or more dimensions
                - star and snowflake schema
                - part of a data mart schema
                - generally read-only
                - can be pre-computed, computed-on-demand, or partially-pre-computed
            Data Mart
                - improve end-user response time by allowing users to have access to the specific type of data they
                  need to view most often by providing the data in a way that suports the collective view of a group
                  of users
                - pre-aggregated data organized
            View Selection Problem
                - part of data aggregation
                - objection is typically to minimize average time to answer OLAP queries, although some minimize update time
                = NP-Complete; (greedy, randomized, genetic, A*)
            Rollup/Drill-down or summarization of a Data Cube
                = rollup can be done by traversing upwards through a concept hiearchy (summarizing/less-detail)
                    = concept hiarchy maps a set of low level concepts to a higher level, more general concepts
                = non-summarized cube is computed at the lowest level of detail
                = drill down can be done by traversing downloads through a concept hiearchy (more detail)
            Fact Table (Part Of A Cube?)
                - links to dimension tables (which hold data that does not change much)
                - a dimension is a link to another table instead of a fact value
            Data Quality Services
                - enables a business user, information workers, or IT professional who is neither a database expert nor a programmer
                  to create, maintain, and execute their organization's data quality operations with minimal setup or preperation time
                    - modifcation, removal, enrichment of data that is incorrect or incomplete
                    - matching; de-duplication
                    - reference data services; verification of quality of data
                    - profiling
                    - monitoring
                    - knowledge base
            Master Data Store (Master Data Management (MDM))
                - combines data from throughout the organization
                - primary purpose is to provide a single version of the truth for an organization
            Smart Database Design
                - premise that an elegant physical schema makes the data intuitively obvious
                  and enables writing great set-based queries and respond well to indexing
                    - this creates short, tight transactions, which improves concurrency and
                      scalability, while reducing the aggregate workload of the database
            Database System
                - made up of four major components that affect the overall performance
                    = server platform
                        - server environment is the physical hardware configuration (CPUs,
                          memory, disk spindles, and I/O bus), operating system, and the
                          SqL Server instance configuration
                    = maintenance jobs
                        - steps that keep the database running optimally
                        - index defragmentation
                        - DBCC integrity checks
                        - maintaining index statistics
                    = client application
                        - collection of data access layers
                        - middle tiers
                        - front-end applications
                        - ETL scripts
                        - report queries
                        - SSIS packages
                    = database
                        - everything within the data file
                            - physial schema
                            - T-SQL code (queries, stored procedures, UDF, views), indexes, data
            Data Definition Language (DDL)
                - SQL is comprised of a DDL since it uses such statements as:
                    = CREATE
                    = CREATE TABLE
                    = DROP
                    = ALTER
                    = RENAME
            Physical Schema Consideration
                - well designed physical schemes avoid over-complexity by generalizing similar types of objects
                - make the data obvious to the developer and easy to query
                - prime consideration when converting the logical database design into physical schemea
                  is how much work is required for query to navigate the data structures while maintaining
                  a correctly normalized design
                - poorly designed physical schema encourages developers to write iterative code
                    - code that uses temporary buckets to manipulate data
                    - code that will be difficult to debug or maintain
            Agile Development
                - gets the job done and often produces a better result than traditional methods
                - fits well with database design and development
            Traditional Waterfall Process
                - steps
                    - requirements gathering
                    - design development
                    - implementation
                - bad because users do not always know what they wanted but develop their needs
                  as they explore and evolve
                    - see Agile Development
            Declarative Language
                - SQL
                - describes the problem
            Iterative T-SQL
                - code that acts upon data one row at a time instead of as a set
                - typically implemented by cursors, temporary tables, memory tables
            - sound indexing strategy identifies a handful of queries that represent 90 percent of the
              workload, and with judicious use of clustered indexes and covering indexes, solves the
              queries without expensive lookup operations (full scan)
            Normalized And Denormalized Cons And Pros
                - normalized
                    - good for write intensive or small subset writes
                    - slow reads because of joins needed
                    - write intensive
                    - data deduplication (less storage, less writes, more disjoint reads)
                    - less efficient index usage
                    - COSTLY READS
                    - good initial development strategy
                - denormalized
                    - good for read intensive operations
                    - data duplication (more storage, more writes, less disjoint reads)
                    - more efficient index usage
                    - COSTLY WRITES
                    - performance oriented
                    - can be abstracted from a normalized view
            Active Record Object
            Resource Governor
                - part of SQL server which can restrict the resources available for different set of
                  queries, enabling the server to maintain the service-level agreement (SLA) for some
                  queries at the expense of other less critical queries
            Indexed Views
                - materialize the view as a clustered index and can enable queries to select from joined data
                  without hitting the joined tables, or to pre-aggregate data
                - is a custom covering index that can cover across multiple tables
            Partitioned Tables
                - can automatically segment data across multiple filegroups, which can serve as an auto-archive device
                - reducing the size of the active data partition, the requirements for maintaining the data, such as
                  defragging the indexes, are also reduced
                Vertical Partitioning (kinda like normalization)
                    - same rows, fewer columns
                Horizontal Partitioning (File Group Partitioning)
                    - same columns, fewer rows
                File Group Partitioning
                    - logical storage unit
                    - every database has a primary filegroup that contains the primary data file (.mdf)
                    - user-defined file groups can be created to contain secondary files (.ndf)
            Service Broker
                - manages queues, messages, and conversations by the database engine
                - loads/starts/stops procedures for processing messages
                - asynchronous, ordered, message send/receiver rollback
            Column Store
                - data is logically organized as a table with rows and columns
                - data is physically stored in column-wise data format
                - great compression with sorted columns (if contain same data and usually do)
                - does not inherently store columns sorted (but can be done)
            Row Store
                - data is logically organized as a table with rows and columns
                - data is physically stored in row-wise data format
                - traditional
            TCP/UDP Ports
                - Database Services
                    - default instance - TCP 1433
                    - dedicated admin connection - TCP 1434
                - Analysis Services - TCP 2383
                - Integration Services - TCP 135
                - Reporting Services - TCP 80 and TCP 443
                - Service Broker - TCP 4022
                - Browser Service - UDP 1434 and TCP 2382
            Data Access Tracing
                - outside scope of bible
                - ???? TODO: learn about this
            Server Cursor
                - ????
            Multiple Active Result Sets (MARS)
                - multiple active SQL statements on the same connection
                - result sets should be short-lived per SQL statement
                - if result set is long-lived or large, server cursors should be used
                * always read to the end of the results and use API to change connection properties
                * by default MARS functionality is not enabled
                    - turn it on using a connection string value
                        - MarsConn for OLE DB
                        - Mars_Connection or ODBC
            XML Data Type
                - TODO: learn more
                - variable declarations
                - parameter declarations
                - return type
                - conversions
            User-Defined Types
                - defined using .NET CLR
                - data exposed through fields and properties
                - behavior exposed through the class methods
            Large Value Types
                - 2^31-1 bytes long
                - allows text values in excess of 8K limit
                - varchar(max)
                - nvarchar(max)
                - SQL server based apps to store and manipulate unstructured data on the file system
            Expired Password Changing
                - can be changed in the connection string
                - a prompt via user interface
            Snapshot Isolation
                - uses the tempdb
                    - must have enough space
                - enhances concurrency and improves performance by avoiding reader-writer blocking
                - relies on the row versioning feature
                - transaction begins with BeginTransaction but not assigned a sequence transaction number
                  until the first T-SQL statement is executed
                - varbinary(max)
            Qualified Table Names
                - <server>.<database>.<schema>.<table>
                - commonly used is <schema>.<table>
                - it is standard practice to include the <schema>
            Equals Versus Not Equals
                - it is usually more performat to use = than <> ??
            IN, SOME, ANY
                - IN uses a list of values
                - SOME and ANY
                    - requires a sub-query
                    - used with =, >, <, =>
                - all function differently when used with a NOT condition
                - requires a true subquery and returns a true when the search condition is true for every value in the list
                - uses wildcard matching % _ [A-Z]
                - indexing only accounts for beginning of a column
                    - frequency use of LIKE can be a performance hit
                    - try to enumerate possible values
                    - enable full-text indexing
            Full Text Indexing
                - powerful indexing method that can even take into consideration weighted words
                  and variations of inflections and can return the result set in table form for
            Complex Union Queries
                - SELECT LastName + `, ` + FirstName as FullName FROM Person.Person ORDER BY 1;
            Default Collation
                - SELECT SERVERPROPERTY('Collation');
            Supported Collations
                - SELECT * FROM fn_helpcollations();
            Chen ER diagramming
                - popular
            E/R diagramming
                - simple and easy to explain
            1:1 Relationships
                - not common
                - usually used to partition the data for performance or security reasons
            Associative Table / Junction Table
                - creates one to many relationships between two entities
            Super/Sub Type
                - can be 1:1 or 1:many relationship from the view of the super type
                - may restrict to one sub-type to some degree
            Domain Integrity Lookup Pattern
                - serves to only limit the valid options for an attribute
                - only purpose is data consistency
            Recursive Pattern
                - 1:many to the same type (Person->Person)
            CROSS JOIN
                - http://explainextended.com/2009/07/16/inner-join-vs-cross-apply/
                - if the cross applied table yeilds no rows then that entry is skipped
            Database Design Layers
                [*] Domain Integrity
                    [!] often heavy read and rare writes
                    - look up tables (enforce specific values for a field); like an enum/enumeration
                [*] Business Entities
                    [!] often heavy read/write
                    - objects the user can describe
                [*] Supporting Entities
                    [!] often heavy read/write
                    - associative tables
            Normal Forms
                - First Normal Form (1NF)
                    - entity format
                    - every unit of data is represented within scalar attributes
                        - capable of being represented by a point on a scale
                    - every unit of data must be stored as one attribute and one attribute must contain only one unit of data
                    - all attributes must be unique
                        - no repeating attributes
                        - if entity is wide the object too broadly designed
                    - all data must be represented within unique tuples
                        - no duplicate tuples permitted
                        - entities must be represented by a single tuple (row / record)
                        - all tuples must represent a single item
                    - ? all entities must have the same fields
                - Second Normal Form (2NF)
                    - ensure that each attribute describes the entity
                        - does the attribute depend on the entity?
                    - problems arise with multiple primary keys
                        - attribute must depend on all primary keys as a composite
                          otherwise it is a partial dependency
                - Third Normal Form (3NF)
                    - check for transitive dependencies
                        - transitive dependency is similar to a partial dependency in that they
                          both refer to attributes that are not fully dependent on the primary
                        - a transitive dependency is when attribute1 is dependant on attribute2
                          and attribute2 is dependant on the primary key, OR attribute1 is dependant
                          on the primary key and dependant on attribute2
                        - thus attributes should ONLY be dependant on the primary key and not themselves
                - Boyce-Codd Normal Form (BCNF)
                    - every attribute must describe every candidate key
                - Forth Normal Form (4NF)
                    - if two independant attributes are brought together to form a primary key, but they
                      these two attributes dont uniquely identify the entity without the third attribute
                      then this violates the form (can be across different table)
                - Fifth Normal Form (5NF)
                    - when there exists a ternary relationship the individual entities must stand on
                      their own without any loss of data
                - Character Data Type
                    char(n)      - fixed length up to 8000 characters long using collation
                    nchar(n)     - fixed length uni-code
                    varchar(n)   - variable length up to 8000 character long using collation
                    nvarchar(n)  - variable length up to 8000 characters long using collation
                    varchar(max) - variable length up to 2GB in length
                    nvarchar(max) -
                - Numeric Data Types
                    bit          - 1 or 0       (boolean)
                    tinyint      - 0 to 255     (unsigned byte 8-bit)
                    smallint     - signed short (16-bit)
                    int          - signed int   (32-bit)
                    bigint       - signed int   (64-bit)
                    decimal/numeric         - fixed-precision numbers [length varys]
                    money                   - accuracy to one ten-thousandths (.0001) (64-bit)
                    smallmoney              - accuracy to one ten-thousandths (.0001) (32-bit)
                    float                   - (32 or 64 bit)
                    real                    - float with 24-bit precision
                - Date/Time Types
                    datetime        - accurate to 3 milliseconds from Jan 1, 1553 to Dec 31, 9999  (64-bit)
                    smalldatetime   - accurate to 1 minute from Jan 1, 1900 to June 6, 2079        (32-bit)
                    datetime2       - variable accuracy from .01 seconds to 100 nanoseconds        (6-8 bytes)
                    date            - Jan 1, 0001 to Dec 31, 9999                                  (24 bit)
                    time(2)         - variable accuracy from .01 to 100 nanoseconds                (3-5 bytes)
                    datetimeoffset  - includes embedded timezone; like date and time(2)            (8-10 bytes)
                - Other Types
                    timestamp/rowversion    - database unqiue random value; gened with updates   (64-bit)
                    uniqueidentifier        - system generated unique value                      (128-bit)
                    binary(n)               - fixed length of (n) bytes up to 8000 bytes
                    varbinary(n)            - variable length of (n) bytes to uo 8000 bytes
                    varbinary(max)          - variable length up to 8000 bytes???
                    image                   - variable length up to 2GB
                    sql_variant             - can store any data type up to 2GB
                - Joins
                    - inner join
                    - outer join
                        - left or right
                        - always try to use left outer joins
                        - try to never mix left outer and right outer joins in the same set/query
                            - find out why
                - Joins, On, Where
                    - A condition using ON for a join is more efficient than using
                      a WHERE for the entire query.
                - Self Joins
                    - the same table is referenced.. such as an Employee table which
                      links to another employee as a manager.. use aliases to do the
                - Cross Joins
                    - for every tuple in A the columns are combined with every tuple in B
                - Natural key
                    - a primary key made up of meaningful business data rather than
                      a arbitrary sequence number (IDENTITY)
                - Composite Key
                    - made up of multiple keys; common when natural keys are used
                - Exotic Joins
                    - Multiple-Condition
                    - Theta Join
                - Theta Joins
                    - uses operators other than equals
                -  Set Difference Queries
                    - aka find unmatched rows query
                    - full set difference query
                        - logical opposite of a join
                - UNION
                    - UNION ALL does not remove duplicates
                - INTERSECT (Type Of Union)
                    - same as INNER JOIN .. ON but stacks results vertically
                    ** sees NULL values as equal (instead of not equal)
                    - useful to test results to make sure they are the same
                - EXCEPT (Type Of Union)
                    - logical opposite of INNER JOIN .. ON but stacks result vertically
                - Subqueries
                    - Simple Subquery
                        - placed in parenthesis; can run independant of the outer query
                    - Common Table Expression
                        - independant of outer query; same as simple subquery
                        - form of code reuse in comparison to the Simple Subquery
                    - Correlated Subquery
                        - placed in parenthesis; dependant on outer query
                - Group By
                    - considers NULL values to be the same
                - Windowing and Ranking
                    - Over And Partition
                        - http://stackoverflow.com/questions/6218902/over-clause-when-and-why-is-it-useful
                        - Can break all the tuples into partitions and gain aggregations over the partitions
                - Views
                    - use them to denormalize data
                    - save complex aggregate queries as views
                    - be wary of using WHERE clause in a view for performance reasons
                    - views can easily become non-updatable due to restrictions
                        - not recommended to use views for write operations
                    - be wary of disappearing rows
                        - you can insert data into the view, but not see it through the view
                        - WITH CHECK OPTION
                            - only allows updates/inserts that comply with the WHERE clause