Parity Block

Part 1:

Suppose we are given 3 data disks:

disk 1: 1111 0000
disk 2: 1010 1010
disk 3: 0011 1000

————————————————————–

To compute the redundant (parity) disk, we take the mod-2 sum (bitwise XOR) of the three data disks:

disk 4: 0110 0010
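
The mod-2 sum above can be checked with a short Python sketch. The function name `parity` and the choice to represent each disk block as a bit string are assumptions made for illustration only.

```python
from functools import reduce

def parity(blocks):
    """Return the mod-2 sum (bitwise XOR) of equal-length bit strings."""
    width = len(blocks[0])
    value = reduce(lambda a, b: a ^ b, (int(block, 2) for block in blocks))
    return format(value, f"0{width}b")

data_disks = ["11110000", "10101010", "00111000"]
disk4 = parity(data_disks)
print(disk4)  # 01100010

# The same operation rebuilds a lost disk from the survivors:
# XOR of disks 1, 3 and 4 gives back disk 2.
print(parity(["11110000", "00111000", disk4]))  # 10101010
```

Because XOR is its own inverse, the same function serves both to build the parity block and to reconstruct any single missing block.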

 

Part 2:

Suppose we updated the value for disk 2:

disk 2 old value: 1010 1010
disk 2 new value: 1100 1100

————————————————————–

Mod-2 Sum: 0110 0110
*The 1s mark the bit positions to flip in the parity disk (positions 2, 3, 6 and 7, counting from the left).

disk 4 original: 0110 0010
disk 4 updated: 0000 0100

Result:
disk 1: 1111 0000
disk 2: 1100 1100
disk 3: 0011 1000
disk 4: 0000 0100
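
The update in Part 2 can be sketched the same way: XOR the old and new values of disk 2 to get the change mask, then XOR that mask into the old parity. The helper names `xor_bits` and `update_parity` are hypothetical.

```python
def xor_bits(a, b):
    """Bitwise XOR of two equal-length bit strings."""
    return format(int(a, 2) ^ int(b, 2), f"0{len(a)}b")

def update_parity(old_parity, old_data, new_data):
    """Return (change mask, new parity) without reading the other data disks."""
    change_mask = xor_bits(old_data, new_data)
    return change_mask, xor_bits(old_parity, change_mask)

mask, new_parity = update_parity("01100010", "10101010", "11001100")
print(mask)        # 01100110
print(new_parity)  # 00000100
```

Only the modified data disk and the parity disk are touched; disks 1 and 3 are never read.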

 

Transaction Management Overview

ACID properties:

  • Atomicity: no partial transactions. Either all of a transaction’s actions take effect or none do.
  • Consistency: each transaction takes the database from one consistent state to another. For example, when a user transfers funds from one account to another, not a penny should be lost during the transfer.
  • Isolation: many transactions execute in parallel. This property ensures that the result of an interleaved execution is equivalent to running the transactions one at a time in some serial order (T1->T2 or T2->T1). In other words, each transaction behaves as if it were running alone.
  • Durability: once a transaction commits, its effects should survive an unexpected crash.

The recovery manager ensures atomicity and durability.

Consistency and isolation are handled by the DBMS’s scheduling of concurrent transactions (concurrency control).
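
As a rough illustration of atomicity and consistency, here is a minimal sketch of the funds-transfer example using Python's built-in `sqlite3` module. The `accounts` table, the account names and the `transfer` helper are all made up for this example.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; return False if the transfer was rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            (balance,) = conn.execute("SELECT balance FROM accounts WHERE name = ?",
                                      (src,)).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")  # undoes the debit above
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
        return True
    except ValueError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

print(transfer(conn, "alice", "bob", 30))   # True
print(transfer(conn, "alice", "bob", 500))  # False
print(dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name")))
# {'alice': 70, 'bob': 80}
```

The failed transfer debits alice first and is then rolled back, so no partial update survives and the total across both accounts stays at 150.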

 

Transactions and Schedules

Schedule: a list of actions (reading, writing, aborting or committing) from a set of transactions.

Complete schedule: contains either an abort or a commit action for each transaction whose actions are listed in it

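Serial schedule: a schedule in which the actions of different transactions are not interleaved, i.e. each transaction runs from start to finish before the next one begins.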
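
These definitions can be made concrete with a small sketch. The representation of a schedule as a list of `(transaction, operation, object)` tuples, and the function names, are assumptions chosen for illustration.

```python
def is_complete(schedule):
    """True if every transaction in the schedule ends with a commit or an abort."""
    transactions = {txn for txn, _, _ in schedule}
    finished = {txn for txn, op, _ in schedule if op in ("commit", "abort")}
    return transactions == finished

def is_serial(schedule):
    """True if no transaction resumes after another transaction has started."""
    done = set()
    current = None
    for txn, _, _ in schedule:
        if txn != current:
            if txn in done:
                return False  # txn was interrupted earlier and now continues
            if current is not None:
                done.add(current)
            current = txn
    return True

serial = [("T1", "read", "A"), ("T1", "write", "A"), ("T1", "commit", None),
          ("T2", "read", "A"), ("T2", "commit", None)]
interleaved = [("T1", "read", "A"), ("T2", "read", "A"),
               ("T1", "write", "A"), ("T1", "commit", None)]

print(is_serial(serial), is_complete(serial))            # True True
print(is_serial(interleaved), is_complete(interleaved))  # False False
```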

 

 

 

1.10 Points to Review

  • A database management system (DBMS) is software that supports management of large collections of data. A DBMS provides efficient data access, data independence, data integrity, security, quick application development, support for concurrent access, and recovery from system failure. (Section 1.1)
  • Storing data in a DBMS versus storing it in operating system files has many advantages. (Section 1.3)
  • Using a DBMS provides the user with data independence, efficient data access, automatic data integrity, and security. (Section 1.4)
  • The structure of the data is described in terms of a data model and the description is called a schema. The relational model is currently the most popular data model. A DBMS distinguishes between external, conceptual, and physical schemas. Physical and logical data independence, which are made possible by these three levels of abstraction, insulate the users of a DBMS from the way the data is structured and stored inside a DBMS. (Section 1.5)
  • A query language and a data manipulation language enable high-level access and modification of the data. (Section 1.6)
  • A transaction is a logical unit of access to a DBMS. The DBMS ensures that either all or none of a transaction’s changes are applied to the database. For performance reasons, the DBMS processes multiple transactions concurrently, but ensures that the result is equivalent to running the transactions one after the other in some order. The DBMS maintains a record of all changes to the data in the system log, in order to undo partial transactions and recover from system crashes. Checkpointing is a periodic operation that can reduce the time for recovery from a crash. (Section 1.7)
  • DBMS code is organized into several modules: the disk space manager, the buffer manager, a layer that supports the abstractions of files and index structures, a layer that implements relational operators, and a layer that optimizes queries and produces an execution plan in terms of relational operators. (Section 1.8)
  • A database administrator (DBA) manages a DBMS for an enterprise. The DBA designs schemas, provides security, restores the system after a failure, and periodically tunes the database to meet changing user needs. Application programmers develop applications that use DBMS functionality to access and manipulate data, and end users invoke these applications. (Section 1.9)

1.9 People who Deal with Databases

  • Database implementors build DBMS software; end users wish to store and use data in a DBMS.
  • Database application programmers develop packages that facilitate data access for end users, who are usually not computer professionals, using the host or data languages and software tools that DBMS vendors provide.
  • Database administrator is responsible for:
    • Design of the conceptual and physical schemas: interacting with the users of the system to understand what data is to be stored in the DBMS and how it is likely to be used
    • Security and authorization: ensuring that unauthorized data access is not permitted
    • Data availability and recovery from failures: ensuring that if the system fails, users can continue to access as much of the uncorrupted data as possible
    • Database tuning: modifying the database to ensure adequate performance as user requirements change.

       

1.8 Structure of a DBMS

  • When a user issues a query, the parsed query is presented to a query optimizer, which uses information about how the data is stored to produce an efficient execution plan for evaluating the query.
  • An execution plan is a blueprint for evaluating a query, and is usually represented as a tree of relational operators (with annotations that contain additional detailed information about which access methods to use, etc).
  • The Files and Access Methods layer includes a variety of software for supporting the concept of a file, which, in a DBMS, is a collection of pages or a collection of records. This layer typically supports a heap file, or file of unordered pages, as well as indexes.
  • Buffer manager brings pages in from disk to main memory as needed in response to read requests.
  • The lowest layer of the DBMS software deals with management of space on disk, where the data is stored. Higher layers allocate, deallocate, read, and write pages through the disk space manager.
  • DBMS components associated with concurrency control and recovery include:
    • the transaction manager, which ensures that transactions request and release locks according to a suitable locking protocol and schedules the execution of transactions;
    • the lock manager, which keeps track of requests for locks and grants locks on database objects when they become available;
    • the recovery manager, which is responsible for maintaining a log and restoring the system to a consistent state after a crash.
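
To make the buffer manager's role a little more concrete, here is a toy sketch. The notes only say that pages are brought into memory on demand; the least-recently-used replacement policy, the `BufferPool` class and the dictionary standing in for the disk space manager are all assumptions added for illustration. Pinning and dirty-page write-back are left out.

```python
from collections import OrderedDict

class BufferPool:
    """Toy buffer manager: caches pages in memory, evicting the least recently used."""

    def __init__(self, disk, capacity):
        self.disk = disk              # dict page_id -> contents, stands in for the disk
        self.capacity = capacity      # number of in-memory frames
        self.frames = OrderedDict()   # page_id -> contents, oldest use first
        self.disk_reads = 0

    def get_page(self, page_id):
        if page_id in self.frames:            # hit: no disk access needed
            self.frames.move_to_end(page_id)
            return self.frames[page_id]
        if len(self.frames) >= self.capacity:
            self.frames.popitem(last=False)   # evict the least recently used page
        self.disk_reads += 1                  # miss: fetch the page from "disk"
        self.frames[page_id] = self.disk[page_id]
        return self.frames[page_id]

pool = BufferPool({1: "page one", 2: "page two", 3: "page three"}, capacity=2)
for page_id in (1, 2, 1, 3, 1, 2):
    pool.get_page(page_id)
print(pool.disk_reads)  # 4
```

Of the six requests, the repeated reads of page 1 are served from memory; page 2 has to be fetched twice because it was evicted to make room for page 3.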

1.7 Transaction Management

  • A transaction is any one execution of a user program in a DBMS.
  • Concurrent Execution of Transactions
    • A locking protocol is a set of rules to be followed by each transaction and enforced by the DBMS to ensure that, even though the actions of several transactions might be interleaved, the net effect is identical to executing all transactions in some serial order.
    • A lock is a mechanism used to control access to database objects
      • shared locks on an object can be held by two different transactions at the same time
      • exclusive lock on an object ensures that no other transactions hold any lock on this object.
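
The shared/exclusive rule can be sketched as a tiny lock table. The `LockTable` class is hypothetical: it only decides whether a request can be granted immediately, and leaves out the wait queues and deadlock handling a real lock manager needs.

```python
class LockTable:
    """Toy lock table with shared ('S') and exclusive ('X') modes."""

    def __init__(self):
        self.holders = {}  # object -> (mode, set of transactions holding it)

    def request(self, txn, obj, mode):
        """Return True if the lock is granted, False if txn would have to wait."""
        held = self.holders.get(obj)
        if held is None:
            self.holders[obj] = (mode, {txn})
            return True
        held_mode, owners = held
        if owners == {txn}:
            # The sole holder may re-request the lock or upgrade S to X.
            new_mode = "X" if "X" in (mode, held_mode) else "S"
            self.holders[obj] = (new_mode, owners)
            return True
        if mode == "S" and held_mode == "S":
            owners.add(txn)  # shared locks are compatible with each other
            return True
        return False         # any combination involving X conflicts

    def release(self, txn, obj):
        held = self.holders.get(obj)
        if held is None:
            return
        _, owners = held
        owners.discard(txn)
        if not owners:
            del self.holders[obj]

locks = LockTable()
print(locks.request("T1", "A", "S"))  # True
print(locks.request("T2", "A", "S"))  # True
print(locks.request("T3", "A", "X"))  # False
```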

         

  • Incomplete Transactions and System Crashes
    • The DBMS maintains a log of all writes to the database
    • Write-Ahead Log (WAL): each write action must be recorded in the log (on disk) before the corresponding change is reflected in the database itself.
    • Checkpoint: The procedure of periodically forcing some information to disk to reduce the time required to recover from a crash.
  • Summary
    • Every object that is read or written by a transaction is first locked in shared or exclusive mode, respectively. Placing a lock on an object restricts its availability to other transactions and thereby affects performance.
    • For efficient log maintenance, the DBMS must be able to selectively force a collection of pages in main memory to disk. Operating system support for this operation is not always satisfactory.
    • Periodic checkpointing can reduce the time needed to recover from a crash. Of course, this must be balanced against the fact that checkpointing too often slows down normal execution.
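
The write-ahead rule and crash recovery can be illustrated with a toy in-memory model. Everything here is an assumption for the sake of the sketch: the `TinyDB` class, the use of a Python list as "the log on disk" and a dict as "the database", and the undo-only recovery (no redo, no checkpoints). `None` is used to mean "key did not exist", so `None` cannot be stored as a value.

```python
class TinyDB:
    """Toy write-ahead logging: every write is logged before it is applied."""

    def __init__(self):
        self.log = []    # stands in for the log on disk
        self.data = {}   # stands in for the database pages on disk

    def write(self, txn, key, value):
        # WAL rule: record the change (with the old value) before applying it.
        self.log.append(("write", txn, key, self.data.get(key), value))
        self.data[key] = value

    def commit(self, txn):
        self.log.append(("commit", txn))

    def recover(self):
        """After a crash, undo the writes of transactions that never committed."""
        committed = {record[1] for record in self.log if record[0] == "commit"}
        for record in reversed(self.log):
            if record[0] == "write" and record[1] not in committed:
                _, _, key, old_value, _ = record
                if old_value is None:
                    self.data.pop(key, None)
                else:
                    self.data[key] = old_value

db = TinyDB()
db.write("T1", "A", 10)
db.commit("T1")
db.write("T2", "A", 99)   # T2 never commits: simulate a crash here
db.write("T2", "B", 5)
db.recover()
print(db.data)  # {'A': 10}
```

Undoing in reverse log order matters: it restores each object to the value it had before the earliest uncommitted write.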

1.6 Queries in a DBMS

  • Questions involving the data stored in a DBMS are called queries.
  • Relational calculus is a formal query language based on mathematical logic, and queries in this language have an intuitive, precise meaning.
  • Relational algebra is another formal query language, based on a collection of operators for manipulating relations, which is equivalent in power to the calculus.
  • A DBMS enables users to create, modify, and query data through a data manipulation language (DML).
  • The DML and DDL are collectively referred to as the data sublanguage when embedded within a host language.
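
As a rough sketch of what "a collection of operators for manipulating relations" means, here are two relational algebra operators, selection and projection, written over relations represented as lists of dictionaries. The representation, the function names and the sample `students` data are assumptions made for illustration.

```python
def select(relation, predicate):
    """Selection: keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """Projection: keep only the named attributes and drop duplicate rows."""
    result = []
    for row in relation:
        reduced = {name: row[name] for name in attributes}
        if reduced not in result:
            result.append(reduced)
    return result

students = [
    {"sid": 1, "name": "Ann", "gpa": 3.8},
    {"sid": 2, "name": "Bob", "gpa": 3.1},
    {"sid": 3, "name": "Ann", "gpa": 3.6},
]

# Names of students with a GPA above 3.5: project(select(...)).
honor_names = project(select(students, lambda row: row["gpa"] > 3.5), ["name"])
print(honor_names)  # [{'name': 'Ann'}]
```

Each operator takes a relation and returns a relation, which is what lets operators be composed into larger queries.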