Sunday, November 30, 2008

Oracle Database Architecture

Oracle is an RDBMS (Relational Database Management System). The Oracle database architecture can be described in terms of logical and physical structures. The advantage of separating the logical and physical structure is that the physical storage structure can be changed without affecting the logical structure.

Logical Structure

The logical structure for Oracle RDBMS consists of the following elements:
· Tablespace
· Schema

Tablespace:
The Oracle database consists of one or more logical portions called tablespaces. A tablespace is a logical grouping of related data. A database administrator can use tablespaces to do the following:
· Control disk space allocation for database data.
· Assign specific space quotas for database users.
· Perform partial database backup or recovery operations.
· Allocate data storage across devices to improve performance.

Each database has at least one tablespace, called the SYSTEM tablespace. As part of the process of creating the database, Oracle automatically creates the SYSTEM tablespace. Although a small database can fit within the SYSTEM tablespace, it is recommended to create a separate tablespace for user data. Oracle uses the SYSTEM tablespace to store information such as the data dictionary. The data dictionary stores the metadata (the data about data). This includes information such as table access permissions and information about keys.
Data is stored in the database in files called datafiles. Each tablespace is a collection of one or more datafiles. The space in those datafiles is organized logically into data blocks, extents and segments.

Data Blocks:
At the finest level of granularity, an ORACLE database's data is stored in data blocks (also called logical blocks, ORACLE blocks, or pages). An ORACLE database uses and allocates free database space in ORACLE data blocks.

Extents:
The next level of logical database space is called an extent. An extent is a specific number of contiguous data blocks that are allocated for storing a specific type of information.

Segments:
The level of logical database storage above an extent is called a segment. A segment is a set of extents that have been allocated for a specific type of data structure, and all are stored in the same tablespace. For example, each table's data is stored in its own data segment, while each index's data is stored in its own index segment. ORACLE allocates space for segments in extents. Therefore, when the existing extents of a segment are full, ORACLE allocates another extent for that segment. Because extents are allocated as needed, the extents of a segment may or may not be contiguous on disk, and may or may not span files. An Oracle database can use four types of segments:

· Data segment--Stores user data within the database.
· Index segment--Stores indexes.
· Rollback segment--Stores rollback information. This information is used when data must be rolled back.
· Temporary segment--Created when a SQL statement needs a temporary work area; these segments are destroyed when the SQL statement is finished. These segments are used during various database operations, such as sorts.
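The way a segment grows by whole extents can be illustrated with a small model. This is only a sketch under stated assumptions: the class names, the fixed extent size of 8 blocks and the integer block addresses are all hypothetical, and Oracle's real space management is far more involved.

```python
from dataclasses import dataclass, field

BLOCKS_PER_EXTENT = 8  # assumed extent size in data blocks (hypothetical)


@dataclass
class Extent:
    first_block: int      # address of the first block in this extent
    used_blocks: int = 0  # number of blocks in the extent that hold data


@dataclass
class Segment:
    name: str
    extents: list = field(default_factory=list)

    def store_block(self, next_free_address: int) -> int:
        """Store one block of data, allocating a new extent when needed.

        Returns the next free block address in the toy tablespace.
        """
        # A new extent is allocated only when the existing ones are full.
        if not self.extents or self.extents[-1].used_blocks == BLOCKS_PER_EXTENT:
            self.extents.append(Extent(first_block=next_free_address))
            next_free_address += BLOCKS_PER_EXTENT
        self.extents[-1].used_blocks += 1
        return next_free_address


def demo():
    """Two segments growing in turn end up with interleaved extents."""
    table_seg = Segment("EMP_DATA")   # hypothetical data segment
    index_seg = Segment("EMP_IDX")    # hypothetical index segment
    free = 0
    for _ in range(BLOCKS_PER_EXTENT):       # fill the table's first extent
        free = table_seg.store_block(free)
    free = index_seg.store_block(free)       # index takes the next extent
    free = table_seg.store_block(free)       # table needs a second extent
    return [e.first_block for e in table_seg.extents]
```

In this model, demo() shows that the table segment's two extents are not adjacent, because the index segment was given the space in between. That is the sense in which the extents of a segment may or may not be contiguous.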

Schema:
The database schema is a collection of logical-structure objects, known as schema objects, that define how you see the database's data. A schema also defines a level of access for the users. All the logical objects in Oracle are grouped into schemas. A schema is a logical grouping of objects such as:

· Tables
· Clusters
· Indexes
· Views
· Stored procedures
· Triggers
· Sequences

Physical Structure:

The physical layer of the database consists of three types of files:
1. One or more Datafiles
2. Two or more redo log files
3. One or more control files



Datafiles (.dbf files):
Datafiles store the information contained in the database. A database can have as few as one datafile or as many as hundreds. The information for a single table can span many datafiles, and many tables can share a set of datafiles. Spreading tablespaces over many datafiles, particularly on separate disks, can have a significant positive effect on performance. The number of datafiles that can be configured is limited by the MAXDATAFILES clause specified when the database is created, and by the DB_FILES initialization parameter.

Some operating systems also put a limit on the number of datafiles that can be used, so efficient definition of tablespaces and datafiles is important.

Redo Log Files (.rdo & .arc):
Oracle maintains a log of all transactions against the database. These transactions are recorded in files called online redo log files (redo logs). The main purpose of the redo log files is to hold the information needed for recovery in the event of a system failure, so the redo log stores a record of all changes made to the database. The redo log files must perform well and be protected against hardware failures (through software or hardware fault tolerance). If redo log information is lost, it may not be possible to fully recover the system. When a transaction occurs in the database, it is entered in the redo log buffer, while the data blocks affected by the transaction are not immediately written to disk. An Oracle database has at least two redo log files. Oracle writes to the redo log files in cyclical order: after the first log file is filled, it writes to the second log file until that one is filled, and so on. When all the redo log files have been filled, it returns to the first log file and begins overwriting its contents with new transaction data. Note that if the database is running in ARCHIVELOG mode, the database makes a copy of each filled online redo log file before it is overwritten.
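The cyclical use of the redo log files can be sketched with a toy model. Everything here is an assumption made for illustration: the class name, the tiny file capacity counted in entries, and archiving at the moment of the log switch. It does not reflect Oracle's actual file format or timing.

```python
class OnlineRedoLog:
    """Toy model of cyclic online redo log files (not Oracle's real format)."""

    def __init__(self, file_count=2, entries_per_file=3, archivelog=False):
        if file_count < 2:
            raise ValueError("at least two redo log files are required")
        self.files = [[] for _ in range(file_count)]
        self.entries_per_file = entries_per_file
        self.archivelog = archivelog
        self.archived = []   # copies of filled files (ARCHIVELOG mode only)
        self.current = 0     # index of the file currently being written

    def write(self, entry):
        """Append one redo entry, switching files when the current one is full."""
        if len(self.files[self.current]) == self.entries_per_file:
            self._switch()
        self.files[self.current].append(entry)

    def _switch(self):
        # In ARCHIVELOG mode the filled file is copied before it can be reused.
        if self.archivelog:
            self.archived.append(list(self.files[self.current]))
        # Move to the next file in the cycle and overwrite its old contents.
        self.current = (self.current + 1) % len(self.files)
        self.files[self.current] = []
```

With two files of three entries each, writing eight entries wraps around to the first file again. In NOARCHIVELOG mode the earliest entries are simply lost at that point, while in ARCHIVELOG mode they survive in the archived copies.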

Control Files (.ctl):
Control files record control information about all of the files within the database. These files maintain internal consistency and guide recovery operations. Control files contain information that Oracle needs to start the database instance, such as the location of datafiles and redo log files. Control files must be protected. Oracle provides a mechanism for storing multiple copies of control files. These multiple copies are stored on separate disks to minimize the potential damage due to disk failure. The names of the database’s control files are specified via the CONTROL_FILES initialization parameter.


Oracle Instance and Database:
To access the data in the database, Oracle uses a set of background processes that are shared by all users. Along with these, there are memory structures that store the most recently queried data from the database.

A database instance (also known as a server) is a set of memory structures and background processes that access a set of database files. It is possible for a single database to be accessed by multiple instances, a configuration known as Real Application Clusters (RAC). The parameters that determine the size and composition of an Oracle instance are stored either in a text initialization parameter file (init.ora) or in a server parameter file (spfile). The initialization parameter file is read during instance startup and may be modified by the DBA. Modifications made to the file do not take effect until the next startup.

System Global Area (SGA):
Oracle uses an area of shared memory called the system global area (SGA) and a private memory area for each process called the program global area (PGA). The SGA is a shared memory region that contains data and control information for one Oracle instance. Oracle allocates the SGA when an instance starts and de-allocates it when the instance shuts down. Every instance has its own SGA. The SGA should be as large as the available physical memory reasonably allows, to increase system performance and reduce disk I/O.

The information stored in the SGA is divided into three main memory structures:

1) Database buffer cache
2) Redo log buffer
3) Shared pool

Database buffer cache:
The database buffers store the most recently used blocks of data. The set of database buffers in an instance is the database buffer cache. The buffer cache contains modified as well as unmodified blocks. Because the most recently and most frequently used data is kept in memory, the cache improves system performance by reducing disk I/O operations.
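The idea of keeping recently used blocks in memory can be illustrated with a plain least-recently-used (LRU) cache. This is a simplified sketch, not Oracle's actual replacement algorithm. The class name, the read_block_from_disk callback and the capacity counted in blocks are assumptions for illustration, and modified (dirty) blocks are not modeled.

```python
from collections import OrderedDict


class BufferCache:
    """Toy LRU cache of data blocks keyed by block id."""

    def __init__(self, capacity, read_block_from_disk):
        self.capacity = capacity
        self.read_block_from_disk = read_block_from_disk  # hypothetical callback
        self.buffers = OrderedDict()  # block_id -> data, least recent first
        self.disk_reads = 0

    def get(self, block_id):
        """Return a block, reading it from disk only on a cache miss."""
        if block_id in self.buffers:
            # Cache hit: mark the block as most recently used.
            self.buffers.move_to_end(block_id)
            return self.buffers[block_id]
        # Cache miss: a physical read is needed.
        self.disk_reads += 1
        data = self.read_block_from_disk(block_id)
        if len(self.buffers) >= self.capacity:
            # Evict the least recently used block to make room.
            self.buffers.popitem(last=False)
        self.buffers[block_id] = data
        return data
```

For example, with a capacity of two blocks, requesting blocks 1, 2, 1, 3, 1 costs three disk reads instead of five, because block 1 stays in the cache throughout.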

Redo log buffer:
The redo log buffer stores redo entries, which are a log of changes made to the database. The redo entries stored in the redo log buffer are written to an online redo log. An online redo log is a set of two or more files that record all the changes made to the Oracle datafiles and control files.

Shared pool:
The shared pool caches various constructs that can be shared among users, such as the shared SQL area. For example, SQL statements are cached so that they can be reused, and stored procedures can be cached for faster access. Note that in earlier versions an “out of memory” error could occur if the shared pool was full. Oracle 10g provides automatic shared memory tuning, which makes this much less likely.

Program global area:
The PGA is a memory buffer that contains data and control information for a server process. A server process is a process that services a client’s requests. The PGA is a non-shared area of memory created by Oracle when a server process is started, and the information in it depends on the Oracle configuration. The basic difference between the SGA and the PGA is that a PGA cannot be shared between processes, because it serves only the requirements of one particular process, whereas the SGA is shared and is used by the whole instance.

Processes:
The relationships between the database’s physical and memory structures are maintained and enforced by the background processes. These are the database’s own background processes, and their number may vary depending on your database configuration. Trace files for these processes are created only when a problem occurs.

Some of the Background Processes are:

SMON:
SMON stands for System Monitor. This background process performs instance recovery, if it is needed, when the database is started. In a multiple-instance system, the SMON of one instance can also perform instance recovery for other instances that have failed. SMON also cleans up temporary segments that are no longer in use and recovers dead transactions that were skipped during crash and instance recovery because of file-read or offline errors. It also coalesces free space, that is, it combines contiguous free extents into larger free extents.
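Coalescing contiguous free extents amounts to merging adjacent ranges. The sketch below illustrates only that idea. The function name and the representation of an extent as a (first_block, block_count) tuple are assumptions, not Oracle's internal structures.

```python
def coalesce_free_extents(free_extents):
    """Merge free extents that are adjacent on disk.

    Each extent is a (first_block, block_count) tuple. The result is a
    sorted list in which no two extents touch each other.
    """
    merged = []
    for start, count in sorted(free_extents):
        if merged and merged[-1][0] + merged[-1][1] == start:
            # This extent begins exactly where the previous one ends.
            prev_start, prev_count = merged[-1]
            merged[-1] = (prev_start, prev_count + count)
        else:
            merged.append((start, count))
    return merged
```

Given three adjacent free extents of 8 blocks starting at blocks 0, 8 and 16, plus a separate one at block 40, the function returns one 24-block extent and leaves the separate extent unchanged.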

PMON:
PMON stands for Process Monitor. This background process cleans up after failed user processes. PMON is responsible for releasing locks, cleaning up the cache and freeing the resources that the failed process was using. Its effect can be seen when a process holding a lock is killed.

DBWR:
The DBWR (Database Writer) background process is responsible for managing the contents of the database buffer cache by writing modified blocks to the datafiles. DBWR performs batch writes of changed blocks. Since Oracle uses write-ahead logging, DBWR does not need to write blocks when a transaction commits. In the most common case, DBWR writes only when more data needs to be read into the system global area and too few database buffers are free. The least recently used data is written to the datafiles first. Although there is only one SMON and one PMON process running per database instance, multiple DBWR processes can run at the same time. The number of DBWR processes is set via the DB_WRITER_PROCESSES initialization parameter.

LGWR:
The LGWR (Log Writer) background process manages the writing of the contents of the redo log buffer to the online redo log files. LGWR writes the log entries in batches. The redo log buffer entries always contain the most up-to-date status of the database. Note that LGWR is the only process that writes to the online redo log files and the only one that directly reads the redo log buffer during normal database operation.
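The division of work between LGWR and DBWR under write-ahead logging can be sketched as follows. This is a deliberately tiny model. The class and method names are hypothetical, "disk" is just a Python list and dictionary, and real commit processing involves much more than is shown here.

```python
class ToyInstance:
    """Minimal model of write-ahead logging with separate log and data writers."""

    def __init__(self):
        self.redo_buffer = []    # redo entries still in memory
        self.dirty_blocks = {}   # modified blocks in the buffer cache
        self.redo_log_file = []  # redo entries already on disk
        self.datafile = {}       # data blocks already on disk

    def update(self, block_id, value):
        """Change a block in memory and record the change as redo."""
        self.redo_buffer.append((block_id, value))
        self.dirty_blocks[block_id] = value

    def lgwr_flush(self):
        """Log writer: write buffered redo entries to disk in one batch."""
        self.redo_log_file.extend(self.redo_buffer)
        self.redo_buffer.clear()

    def commit(self):
        """A commit only forces redo to disk; data blocks stay in memory."""
        self.lgwr_flush()

    def dbwr_flush(self):
        """Database writer: write dirty blocks, but only after their redo."""
        self.lgwr_flush()  # write-ahead rule: redo reaches disk first
        self.datafile.update(self.dirty_blocks)
        self.dirty_blocks.clear()
```

After an update followed by commit(), the change is present in redo_log_file but not yet in datafile, which is why a commit does not require DBWR to write anything. The datafile is brought up to date later, when dbwr_flush() runs.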

Archiver (ARCH):
The Archiver process reads the redo log files once Oracle has filled them and writes a copy of the used redo log files to the specified archive log destination(s). For most databases, ARCH has no noticeable effect on overall system performance. On some large database sites, however, archiving can have an impact on system performance. Up to ten ARCn processes can be specified for each database instance. LGWR will start additional Archivers as needed, based on the load, up to the limit specified by the initialization parameter LOG_ARCHIVE_MAX_PROCESSES.

Checkpoint process (CKPT):
At a checkpoint, all modified information in the database buffer cache in the SGA is written to the datafiles by the database writer process (DBWR). The checkpoint process is responsible for signaling DBWR at checkpoints and for updating the headers of all the datafiles and the control files of the database to record the most recent checkpoint.

Recoverer (RECO):
The recoverer process automatically resolves failed or in-doubt distributed transactions.