Difference
between Dirty reads, non-repeatable read and phantom read.
- Dirty Read – A Dirty read is a situation when a transaction reads data that has not yet been committed. For example, Let’s say transaction 1 updates a row and leaves it uncommitted, meanwhile, Transaction 2 reads the updated row. If transaction 1 rolls back the change, transaction 2 will have read data that is considered never to have existed.
- Non Repeatable read – Non Repeatable read occurs when a transaction reads the same row twice and gets a different value each time. For example, suppose transaction T1 reads data. Due to concurrency, another transaction T2 updates the same data and commit, Now if transaction T1 rereads the same data, it will retrieve a different value.
- Phantom Read
– Phantom Read
occurs when two same queries are executed, but the rows retrieved by the
two, are different. For example, suppose transaction T1 retrieves a set of
rows that satisfy some search criteria. Now, Transaction T2 generates some
new rows that match the search criteria for transaction T1. If transaction
T1 re-executes the statement that reads the rows, it gets a different set
of rows this time.
In the SQL world, a transaction is a unit of work that include
single or multiple operations that query or modify data of a database,
transforming the database from one consistent state to another.
To achieve this consistency and guarantee data integrity,
transactions have four properties with the acronym ACID.
A
Glance At ACID Properties
ACID properties are a set of characteristics that ensure
reliable and consistent transaction processing in databases. These properties
are essential for maintaining data integrity and reliability, particularly in
multi-user and concurrent environments. The acronym "ACID"
stands for:
- Atomicity (A): Either all
operations at the transaction take place successfully or none do.
- Consistency (C): The
transaction transforms the database from one consistent state to another
by controlling the level of isolation and adhering to the integrity rules
defined within the database (like primary keys, foreign keys, and unique
constraints).
- Isolation (I): Concurrent
transactions can access only consistent data, in other words, transactions
occur independently without interference, and that can be done by
controlling the level of isolation.
- Durability (D): Modifications
done by a successful transaction must be stored on permanent storage so
they are persisted even if a system failure occurs.
In this
article, we will focus on the Isolation property, how it
works, its levels, and the anomalies that come as a cost for each level.
Isolation
Models (Pessimistic Vs Optimistic Concurrency Control)
As we mentioned, database isolation enables transactions to
occur independently without interference, so the question here is, how this can
be achieved?
SQL Server has two models to achieve this isolation between
transactions:
- Locking (Pessimistic): If the current
state of data is inconsistent, readers are blocked (must wait)
until the data becomes consistent.
- Row versioning (Optimistic): If the current
state of data is inconsistent, readers get an older version of the
consistent data so readers aren’t blocked (must not wait).
Each approach has its own advantages and disadvantages:
1.
Pessimistic Concurrency Control is
safer but can lead to reduced concurrency and potential deadlocks.
2.
Optimistic Concurrency Control offers
higher concurrency but requires careful conflict detection and resolution
mechanisms.
Isolation Levels
SQL Server isolation levels are used to define the degree
to which one transaction must be isolated from resource or data modifications
made by other concurrent transactions.
The main goal of isolation levels is to control the
following effects:
- Whether a reader requests a lock or not, and if so, how long the
lock is held.
- If a transaction modifies (holds an exclusive lock) a
resource and another reader wants to read the same resource, what is the
reader’s behavior in this situation:
– Read the uncommitted data.
– Reads the last committed data.
– Blocked (wait) until the exclusive lock on the resource is released.
The different Isolation Levels are:
1.
Read Uncommitted
2.
Read Committed
3.
Repeatable Read
4.
Serializable
5.
Snapshot
Read Committed is the default isolation level. However, it
can be changed from Query Window as well as Management Studio Tools.
The syntax is:
SET
TRANSACTION ISOLATION LEVEL
{READ UNCOMMITTED
| READ COMMITTED
| REPEATABLE READ
| SNAPSHOT
|
SERIALIZABLE
}
In SSMS tools menu, Options should be selected. Under Query
Execution -> Advanced, then the drop down for Set Transaciton Isolation
Level can be modified.
1)
Read Uncommitted
This level is the lowest level of isolation in which a
reader doesn’t ask for a shared lock to read data.
– Dirty Writes: It
occurs in a situation where we have a transaction that updates a resource and
another transaction updates the same resource before the first completion.
This level can avoid this anomaly by the exclusive locks
conflict. In other words, if a writer holds an exclusive lock, any other writer
can’t obtain an exclusive lock simultaneously and must wait until the other
exclusive lock is released.
– Dirty Reads: These
occur in a situation where a transaction reads changes from another uncommitted
transaction.
This anomaly occurs because this level doesn’t ask for a
shared lock so there is no conflict with any writer having an exclusive lock
which means the reader can read uncommitted changes.
--Read
uncommitted example 1
--Session
1
begin tran
update emp set Salary=999 where ID=1
waitfor delay '00:00:15'
rollback
--Session
2
set
transaction isolation level read uncommitted
select Salary from Emp where ID=1
set transaction isolation
level read committed
select * from Emp with(nolock)
2)
Read Committed
With Read Committed, transactions issue exclusive locks at
the time of data modification, thus not allowing other transactions to read the
modified data that is not yet committed. The Read Committed isolation level
prevents the Dirty Read issue. However, data can be changed by other
transactions between individual statements within the current transaction,
resulting in a Non-repeatable Read or a Phantom Row.
Another important note you have to know is that a shared
lock is released as soon as a reader (transaction statement) is
done not the entire transaction.
This isolation level is the default level at SQL Server,
you can get the isolation
level option by running this
statement DBCC
useroptions;
--Read
committed example 1:
--Session
1
begin
tran
update emp set Salary=999 where ID=1
waitfor delay '00:00:15'
commit
--Session
2
set transaction isolation
level read committed
select Salary from Emp where ID=1
Read committed example
2
--Read committed
example 2:
--Session1
begin tran
select * from Emp
waitfor delay '00:00:15'
commit
--Session2
set transaction isolation
level read committed
select * from Emp
Read committed example
3
--Read
committed example 3:
--Session
1
begin tran
select * from
emp
waitfor delay '00:00:15'
update emp set Salary=999 where ID=1
commit
--Session
2
set transaction isolation
level read committed
select Salary from Emp where ID=1
3) Repeatable Read
At this level, a reader has to request a shared lock to
read data, and this reader holds this lock until the end of the entire
transaction (not only the statement like the read committed) which
means if a reader holds a shared lock, there is no writer can obtain an
exclusive lock to update this resource until releasing the shared lock by
completing the transaction.
--Repeatable
Read Example 1
--Session
1
set transaction isolation
level repeatable read
begin tran
select * from emp where ID in(1,2)
waitfor delay '00:00:15'
select * from Emp where ID in (1,2)
rollback
--Session
2
update emp set Salary=999 where ID=1
--Repeatable
Read Example 2
--Session
1
set transaction isolation
level repeatable read
begin tran
select * from emp
waitfor delay '00:00:15'
select * from Emp
rollback
--Session
2
insert into Emp(ID,Name,Salary)
values( 11,'Stewart',11000)
--Repeatable
Read Example 3
--Session
1
set transaction isolation
level repeatable read
begin tran
select * from emp where ID in(1,2)
waitfor delay '00:00:15'
select * from Emp where ID in (1,2)
rollback
--Session
2
update emp set Salary=999 where ID=3
4) Serializable
At this level, a reader has to request a shared lock to
read data, and this reader holds this lock until the end of the entire
transaction on top of that a reader locks any row that qualifies the query’s
filter which means that the reader not only locks the already existing rows but
also any rows may be inserted in the future satisfy the query’s filter by
another transaction.
--Serializable
Example 1
--Assume
table does not have index column.
--Session
1
set transaction isolation
level serializable
begin tran
select * from emp
waitfor delay '00:00:15'
select * from Emp
rollback
--Session
2
insert into Emp(ID,Name,Salary)
values( 11,'Stewart',11000)
--Serializable
Example 2
--Assume
table has primary key on column "ID". In our example script, primary
key is not added.
--Add
primary key on column Emp.ID before executing below examples.
--Session
1
set transaction isolation
level serializable
begin tran
select * from emp where ID between 1 and 3
waitfor delay '00:00:15'
select * from Emp where ID between 1 and 3
rollback
--Session
2
insert into Emp(ID,Name,Salary)
values( 11,'Stewart',11000)
DEMO:
Prerequisites
Scripts for sample table creation and data population
are detailed below. These tables will be required to demonstrate the working of
different isolation levels.
IF OBJECT_ID('Emp') is not
null
begin
DROP TABLE Emp
end
create table Emp(ID int,Name
Varchar(50),Salary Int)
insert into Emp(ID,Name,Salary)
values( 1,'David',1000)
insert into Emp(ID,Name,Salary)
values( 2,'Steve',2000)
insert into Emp(ID,Name,Salary)
values( 3,'Chris',3000)
No comments:
Post a Comment