Friday, 14 September 2018

MS SQL Server - Waits stats using extended events



Identify wait stats of given workload



Benchmark wait statistics

We use DMV: sys.dm_os_wait_stats to extract the wait stats in the sql server. This DMV provides wait times categorized by wait type as following result set:













The wait time retrieved here is the cumulative time from last services restart and last wait stats reset.
In order to extract the wait stats for the workload we have applied, We have to either

  1. Reset wait stats (or)
  2. Baseline wait stats before applying workload.

Reset wait stats

Wait stats can be reset using below DBCC command.

DBCC SQLPERF ('sys.dm_os_wait_stats', CLEAR)

If the database is production database, it might not be optimum solution to clear wait stats. Better, considering the benchmark rather than resetting.

Baseline wait statistics

Create baseline with existing statistics using benchmark wait stats query. This creates table WaitStatsBench as below which we use to extract wait times for given workload.

Perform Operation

Perform the actual business transaction or run workload.

Capture wait stats

Extract the wait stats for the workload using Get wait statistics query.
from the DMV sys.dm_os_wait_stats or use modified query to get aggregate wait stats. This results:

Wait statistics captured above are the aggregated result from the last services restart or clear wait stats. Run the difference from baseline query to get the differences from the base line.

Identify the top wait type from the resultant list

Identify the top wait type with highest wait time. Consider ASYNC_NETWORK_IO as the target wait type to be investigated.

Capture queries corresponding to wait type

Prepare extended event session for identified wait type

Create stored procedure with extended event using create resource waits query.
Create stored procedure to extract the captured wait stats using extract events query.

Start extended event session

Insert the corresponding wait type into the table to which query should be captured. following table covers most of the wait types:

wait_type in sys.dm_os_wait_stats
map_value in sys.dm_xe_map_values
ASYNC_NETWORK_IO NETWORK_IO
BROKER_TASK_STOP SSB_TASK_STOP
CLR_JOIN CLR_TASK_JOIN
CLR_MEMORY_SPY CLR_MEMORY_SPY_ACCESS
CREATE_DATINISERVICE GET_DATINISERVICE
DBCC_SCALE_OUT_EXPR_CACHE CHECK_EXPRESSION_CACHE
DBSTATE DB_STATE
DLL_LOADING_MUTEX DLL_LOAD
ERROR_REPORTING_MANAGER ERROR_REPORTING_MGR
EXECUTION_PIPE_EVENT_INTERNAL TWO_THREAD_PIPE_EVENT
FS_FC_RWLOCK FS_GC_RWLOCK
FT_IFTS_RWLOCK FT_RWLOCK
FT_IFTS_SCHEDULER_IDLE_WAIT FT_SCHEDULER_IDLE_WAIT
FULLTEXT GATHERER FULLTEXT_GATHERER
HADR_ARCONTROLLER_NOTIFICATIONS_SUBSCRIBER_LIST HADR_ARPROXY_NOTIFICATION_SUBSCRIBER_LIST
HADR_DATABASE_FLOW_CONTROL HADR_PARTNER_FLOW
HADR_DATABASE_VERSIONING_STATE HADR_VERSIONING_STATE
HADR_DATABASE_WAIT_FOR_RESTART __indexMUTEX_HADR_DATABASE_WAIT_FOR_RESTART
HADR_DATABASE_WAIT_FOR_TRANSITION_TO_VERSIONING HADR_WAIT_FOR_TRANSITION_TO_VERSIONING
HADR_FILESTREAM_BLOCK_FLUSH HADRFS_BLOCK_FLUSH
HADR_FILESTREAM_FILE_CLOSE HADRFS_FILE_CLOSE
HADR_FILESTREAM_FILE_REQUEST HADRFS_FILE_REQUEST
HADR_FILESTREAM_IOMGR HADRFS_IOMGR
HADR_FILESTREAM_IOMGR_IOCOMPLETION HADRFS_IOMGR_IOCOMPLETION
HADR_FILESTREAM_MANAGER HADRFS_MANAGER
HADR_RECOVERY_WAIT_FOR_CONNECTION __indexMUTEX_HADR_RECOVERY_WAIT_FOR_CONNECTION
HADR_RECOVERY_WAIT_FOR_UNDO __indexMUTEX_HADR_RECOVERY_WAIT_FOR_UNDO
HADR_TRANSPORT_FLOW_CONTROL HADR_TRANSPORT_FLOW
HTBUILD HASH_TABLE_BUILD
HTREPARTITION HASH_TABLE_REPARTITION
INTERNAL_TESTING
LAZYWRITER_SLEEP LZW_SLEEP
MD_AGENT_YIELD METADATA_AGENT_YIELD
MD_LAZYCACHE_RWLOCK METADATA_LAZYCACHE_RWLOCK
MISCELLANEOUS UNKNOWN
MSSEARCH MSSEARCH_COM
PREEMPTIVE_FSRECOVER_UNCONDITIONALUNDO PREEMPTIVE_FSRECOVER_CONDITIONALUNDO
PREEMPTIVE_OS_SQMLAUNCH PREEMPTIVE_SQMLAUNCH
PWAIT_ALL_COMPONENTS_INITIALIZED ALL_COMPONENTS_INITIALIZED
PWAIT_COOP_SCAN COOP_SCAN
PWAIT_EVENT_SESSION_INIT_MUTEX EVENT_SESSION_INIT_MUTEX
PWAIT_HADR_ACTION_COMPLETED HADR_ACTION_COMPLETED
PWAIT_HADR_CHANGE_NOTIFIER_TERMINATION_SYNC HADR_ARPROXY_NOTIFICATION_SUBSCRIBER_LIST
PWAIT_HADR_CLUSTER_INTEGRATION HADR_CHANGE_NOTIFIER_TERMINATION_SYNC
PWAIT_HADR_FAILOVER_COMPLETED HADR_CLUSTER_INTEGRATION
PWAIT_HADR_OFFLINE_COMPLETED HADR_FAILOVER_COMPLETED
PWAIT_HADR_ONLINE_COMPLETED HADR_OFFLINE_COMPLETED
PWAIT_HADR_POST_ONLINE_COMPLETED HADR_ONLINE_COMPLETED
PWAIT_HADR_SERVER_READY_CONNECTIONS HADR_SERVER_READY_CONNECTIONS
PWAIT_HADR_WORKITEM_COMPLETED HADR_WORKITEM_COMPLETED
PWAIT_MD_LOGIN_STATS MD_LOGIN_STATS
PWAIT_MD_RELATION_CACHE MD_RELATION_CACHE
PWAIT_MD_SERVER_CACHE MD_SERVER_CACHE
PWAIT_MD_UPGRADE_CONFIG MD_UPGRADE_CONFIG
PWAIT_PREEMPTIVE_AUDIT_ACCESS_WINDOWSLOG PREEMPTIVE_AUDIT_ACCESS_WINDOWSLOG
PWAIT_QRY_BPMEMORY QRY_BPMEMORY
PWAIT_REPLICA_ONLINE_INIT_MUTEX REPLICA_ONLINE_INIT_MUTEX
PWAIT_RESOURCE_SEMAPHORE_FT_PARALLEL_QUERY_SYNC RESOURCE_SEMAPHORE_FT_PARALLEL_QUERY_SYNC
PWAIT_SECURITY_CACHE_INVALIDATION SECURITY_CACHE_INVALIDATION
QUERY_EXECUTION_INDEX_SORT_EVENT_OPEN QUERY_EXEC_INDEXSORT_OPEN
REDO_THREAD_PENDING_WORK REDO_SIGNAL
REDO_THREAD_SYNC REDO_SYNC
RESOURCE_GOVERNOR_IDLE
SCAN_CHAR_HASH_ARRAY_INITIALIZATION SCAN_CHAR_HASH_ARRAY_INIT
SERVER_IDLE_CHECK SERVER_IDLE_LOCK
SNI_LISTENER_ACCESS LISTENER_UPDATE
SNI_TASK_COMPLETION SNI_WAIT_TASK_FINISH
SP_PREEMPTIVE_SERVER_DIAGNOSTICS_SLEEP PREEMPTIVE_SP_SERVER_DIAGNOSTICS_SLEEP
THREADPOOL SOS_WORKER
TRAN_MARKLATCH_DT TRANMARKLATCH_DT
TRAN_MARKLATCH_EX TRANMARKLATCH_EX
TRAN_MARKLATCH_KP TRANMARKLATCH_KP
TRAN_MARKLATCH_NL TRANMARKLATCH_NL
TRAN_MARKLATCH_SH TRANMARKLATCH_SH
TRAN_MARKLATCH_UP TRANMARKLATCH_UP
VIA_ACCEPT VIA_ACCEPT_DONE
WAIT_XTP_GUEST XTP_GUEST
WAIT_XTP_TASK_SHUTDOWN XTP_TASK_SHUTDOWN
WAIT_XTP_TRAN_COMMIT XTP_TRAN_COMMIT
WAITFOR_TASKSHUTDOWN TASKSHUTDOWN
XE_CALLBACK_LIST XE_CALLBACK
XTPPROC_PARTITIONED_STACK_CREATE __indexXTPPROC_PARTITIONED_STACK_CREATE

DECLARE @WaitTypeTVP AS WaitType;INSERT INTO @WaitTypeTVP (Name)VALUES ('NETWORK_IO'), ('XE_CALLBACK')EXEC Waits.CaptureStatementsCausingWaits @TVP = @WaitTypeTVP;GO

Perform operation

Wait until actual workload completes.

Extract results from the session

exec  [Waits].[GetStatementsCausingWaits]

Stop extended events session

exec Waits.CaptureStatementsCausingWaits @StopXevent = 1

Performance overhead on using extended events

Ringbuffer

File

References

https://blogs.msdn.microsoft.com/sqlserverstorageengine/2017/07/03/what-are-you-waiting-for-introducing-wait-stats-support-in-query-store/
https://www.sqlskills.com/blogs/jonathan/an-xevent-a-day-30-of-31-tracking-session-and-statement-level-waits/
https://www.sqlskills.com/blogs/jonathan/mapping-wait-types-in-dm_os_wait_stats-to-extended-events/
https://www.brentozar.com/extended-events/
https://johnsterrett.com/2013/10/08/benchmark-sql-server-wait-statistics/

















Saturday, 8 September 2018

PostgreSQL High Availability, Load Balancing and Replication

Introduction

  1. A secondary server taking over quickly while primary server fails to serve is called High Availability.
  2. Several database machines serving same data is called as Load Balancing.
Ideally, web servers serving static web pages and read-only databases can be combined easily by merely load balancing the requests to be served. Unfortunately, most of the database servers receive mix of read and write requests. This makes hard to get consistent data from one of the available servers at any given point of time. The data written to the server must be propagated to all other servers.

This synchronization problem is solved by many solutions:
One Server to modify data & other servers to perform read operations
Using one server (Master or Primary server) as write only and other servers (Standby or Secondary server) as read only servers which track changes in the primary server.
  • Primary or Master server
  • Secondary or Standby server
    • Warm standyby server (cannot be connected until it is promoted to a master server)
    • Hot standyby server (serves read only queries)
Data modification is not considered as committed until all the servers have committed the transaction 
This solution guarantees:
  • fail over (There is no loss of data)
  • all load balanced servers will retrieve consistent data no matter whatever the serve is queried 

Comparison of different solutions

Shared Disk Failover (sharing the hardware functionality)
Uses single disk array that is shared by multiple servers. As soon as primary server server fails, secondary server will be able to mount and start.
Is advantageous as there:
  1. is no overhead of synchronization of the primary server with the secondary server.
  2. and allows rapid failover with no data loss.
Also has some serious limitations since:
  1. If shared disk array fails or corrupts all the servers becomes non-functional.
  2. and standby server should never access the shared storage when primary server is accessing
File System Replication (file system replication Eg: DRBD)
has restriction that, writes to standby server must be done in the same order of the data changes in primary server to ensure consistency to the standby server.

Write-Ahead Log shipping
Synchronously or Asynchronously warm and hot standby servers are kept updated by reading stream of WAL records and is achieved by
either:
  1. File based log shipping
  2. Streaming replication
  3. or combination of both
This is done on entire database
Logical Replication
Logical replication constructs a stream of logical data modifications (at table level is also possible) from the WAL of the server and sends to the other servers (can be from master to standby or standby to master). 
Trigger-Based Master-Standyby Replication (Updates standby servers asynchronously in batches) Eg: Slony-I
This is 2 step process

  1. Master-Standby replication setup server asynchronously sends all the modification queries to the master server.
  2. Master server asynchronously sends the data to the standby server.

Asynchronously sends all the data modification queries to the standby servers as a batch with help of which standby servers can answer read-only queries while the primary server is running. This case may lead to possible data loss during failover.
Statement-Based Replication Middleware (Each server operated independently)
A program intercepts every SQL Query and broadcasts the same to all the servers to execute. This can cause data inconsistency if the queries contains functions like CURRENT_TIMESTAMP(), random(), sequences which can have different value on different servers etc. Care  must be taken that transaction either commits or rollbacks on all the servers, perhaps using the 2 phase commit (PREPARE TRANSACTION and COMMIT TRANSACTION)

  1. One solution for this would be that middleware should query server dependent functions from the one server and make to use same in all other servers.
Asynchronous Multimaster Replication
Each server works independently and periodically communicates each other and resolve the conflicting transactions. These conflicting transactions can be resolved by user or using some rules like Bucardo.
Synchronous Multimaster Replicatiom
Each server can accept the write requests and modified data is sent to all other servers. Since, data is sent to all other servers, we won't face any issue with non-deterministic functions like random() etc. Heavy write activity may lead to excessive locking leading to the poor performance. However, read operations take advantage since we need not partition the workload and the any server can serve the read request. The problem with write overhead is reduced by using shared disk.
This replication should be implemented in the application using 2 phase commit transactions since, PostgreSQL doesn't implicit;y provide us the functionality.

















References

https://www.postgresql.org/docs/current/static/high-availability.html