Software Performance Engineering

Query Execution Flow

Requests:

First, a request comes from the network in the form of TDS (Tabular Dta stream) Protocol. The request can be in any of the type such as a Batch Request (which contains T-SQL text) or Remote Procedure Call Request (which contains Procedure Identifier and Parameters to be passed to the Procedure) or a Bulk Load Request (This is different from other requests since because is the only request that starts execution before the request is complete in TDS Protocol).

We can find each request that is executing within SQL Server using DMV sys.dm_exec_requests. from this we can find inforamtion such as Open transactions count, wait type and wait time etc.

?? Get usefull data from DMV and take screenshot and place as a slide

Task:

After Complete TDS Requst reaches, Database Engine creates a Task to handle the request.

The Task created represents entire Batch request, even though it contains multiple statements. There are some cases where SQL Statements are parallely executed (referred to as DOP). In this case, task will spawn a new sub-task for execution and places in Tasks Queue.

We can see list of tasks in the server by querying sys.dm_os_tasks.

?? Get usefull data from DMV and take screenshot and place as a slide

As have told, when a new request reaches the server a task is created to handle the request. The Task has to start executing first, for this engine assigns a worker to it.

Workers:

workers are threads in threadpool.

As long as Task in not picked by Worker it is said to be in pending state.

A number of workers are created initially at server start up and more can be created on-demand up to the configured max worker threads in the server.

and this max worker threads count depends on no. of logical processors in the system and also whether the server is of 32-bit or 64-bit.

Max Workers Threads count can be found using DMV sys.dm_os_sys_info.

?? Get usefull data from DMV and take screenshot and place as a slide

Each worker takes exactly one task and executes it. The worker is said to be busy until the task finishes completely. For SQL batch request the worker that picks up that task will execute the entire SQL batch. Since, it is single threaded, execution of each statement must complete before execution of new statement starts. For statements that use parallelism internally i.e. where (DOP > 1), a sub-task is created and placed in Task Queue which is picked up by available worker.

The list of tasks

and state of workers can be fond using DMV sys.dm_os_workers.

?? Get usefull data from DMV and take screenshot and place as a slide

Worker picksup the task and gives to the execution phase.

Parsing:

In order to understand the content of the request, T-SQL text is parsed to build an Abstract Syntax Tree. If Parsing is failed task execution is terminated with compilation error. At this stage worker leaves up the task and is free to takeup other pending task.

Compilation

after compilation a primary data access plan or query plan is created. This plan describes the way to open the tables, indexes and access or manipulate the data rows. A data acess path like 'open index idx1 on table t, locate the row with the key 'k' and return the columns a and b' is generated.

Optimization:

is Choosing optimal data access path from all posible plan alternatives which has lowest data cost. Cost is calculated primarily by considering the size of data that would have to be read, CPU Consumption and memory required .

In order to come up with these costs SQL Server needs to know the size of each table and the distribution of column values, which are available from the statistics associated with the data.Other factors considered are the CPU consumption and the memory required for each plan alternative. each plan alternative.

Once a query plan is chosen by the Optimizer the request can start executing. The query plan gets translated into an actual execution tree.

Plan & Data Cache:

Evaluating Query performance:

Query Cost, Page Reads and Query Execution Time are key statistics to be considered while evaluation query performance

DMV'S

1. Compatibility Views.

2. Catalog Views.

3. Dynamic Management views.

Meta data starting with name sys.dm_ such as sys.dm_exec_cached_plans are called dynamic management objects. This contain either about dynamic management objects and dynamic management views.

DMV's allow us to see much of the internal behaviour of sql server. DMV's allow tracking of the detailed resource usage history. DMV's expose changing server state information that span multiple transactions, sessions, user requests.

---------------- Increasing Performance using DMV's ------

Agenda of this presentaion is to provide an overview of, what DMV's are and range of problems they can solve.

What dynamic management views are, kinds of data they can contain, types of problems they can solve

As query runs on SQL Server database, SQL Server automatically records the information into structures in memory. we can access this information via DMV's. In a nutshell, DMV's are views on SQL Server metadata. DMV information is stored, in per sql server instance level.

The major DMV groups which will help to troubleshoot Performance problems are :

1. Index

2. Execution

3. SQL Server Operating System

4. Transaction

5. Input/Output

6. Database

In addition to this, there are 10 more major DMV groups like Security, Extended events etc.

consider an example of query execution in the sql server,

When a query is run in sql server, an immense information is recorded in the sql server which includes,

1. The queries cached plan. (this describes at a low level how the query is executed)

2. What indexes are used

3. what indexes the query would like to use but can't use, because they're missing indexes.

4. How much i/o occured (both physical and logical)

5. How much time spent on executing the query.

6. how much time was spent waiting on other resources

7. what resources the query is waiting on

Being able to retrive and analyse this inforamtion will allow us to understand how our query works.

The data shown via DMV's is cummulative since the last SQL Server reboot or restart. If you are interested only in the actions of the given query run. you can determine the effect of query by taking snapshot of the DMV data, run your query, and then take another snapshot. getting the delta between 2 snapshots will provide you with the details of the effect of the query.

Typically, when you query the DMVs to extract important diagnostic information, this has minimal effect on the server and its resources. This is because the data is in memory and already calculated;you just need to retrieve it. To further reduce the impact we can use prefixed statements to ignore locks during DMV execution. There are some cases, where this DMV data is not readily available and need to be calculated. for example in case of sys.dm_index_physical_stats. In summary, compared to other methods of obtaining this information like sql profiler DMV's has little impact on the sql server.

SELECT * FROM sys.system_objects WHERE name LIKE 'dm_%'

These DMV's contain information related to various aspects of the Indexes including missing indexes, index usage (no. of seeks, scans, look ups), Operational statistics (I/O, locking, latches, and access method), and physical statistics(size and fragmentation information).

2. Why they are important

Software Performance Engineering

Saturday, 15 July 2017

No comments:

Post a Comment

Linux Metricbeat Counters Mapping

Report Abuse