Segmentation and Paging unit

This tutorial explains how memory is organized in Linux. Memory is one of the most important parts of a system: every application needs memory to store its code and data. In Linux, each application accesses memory through its own address space. The memory management unit (MMU) is the hardware responsible for implementing virtual memory.

Virtual Memory

Virtual memory is one of the most powerful concepts in memory management. Since the early days of computing, programs have needed more memory than was physically available. Over the years many solutions were tried, and the most successful of them has been virtual memory.

Virtual memory makes your system appear as if it has more memory than it actually has. This may sound surprising and prompt one to ask how this is possible, so let’s understand the concept.

To start, we must first understand that virtual memory is a layer of memory addresses that map to physical addresses.
In the virtual memory model, when a processor executes a program instruction, it fetches the instruction using a virtual address.
Before the memory access can take place, the virtual address is converted into a physical address.
This conversion is based on the virtual-to-physical mapping information contained in the page tables, which are maintained by the OS.

Both virtual and physical memory are divided into fixed-length chunks known as pages. In this paged model, a virtual address can be divided into two parts:

An offset (the lowest 12 bits, for 4 KB pages)
A virtual page frame number (the rest of the bits)

Whenever the processor encounters a virtual address, it extracts the virtual page frame number from it. It then translates this virtual page frame number into a physical page frame number, and the offset takes it to the exact address within the physical page. This translation of addresses is done through the page tables.
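The split and lookup described above can be sketched in a few lines of Python. This is only an illustration: the page table is modelled as a plain dictionary, and the mappings in it (as well as the names split and translate) are made up for the example rather than taken from any real system.

```python
PAGE_SHIFT = 12                 # offset width from the text: lowest 12 bits
PAGE_SIZE = 1 << PAGE_SHIFT     # 4096 bytes
OFFSET_MASK = PAGE_SIZE - 1

# Hypothetical page table: virtual page frame number -> physical page frame number.
page_table = {0x00000: 0x00123, 0x00001: 0x00456}


def split(vaddr):
    """Return (virtual page frame number, offset) for a virtual address."""
    return vaddr >> PAGE_SHIFT, vaddr & OFFSET_MASK


def translate(vaddr):
    """Translate a virtual address to a physical one using the toy page table."""
    vpfn, offset = split(vaddr)
    if vpfn not in page_table:
        # A real MMU would raise a page fault for the OS to handle.
        raise LookupError(f"no mapping for virtual page {vpfn:#x}")
    return (page_table[vpfn] << PAGE_SHIFT) | offset


print(hex(translate(0x00001ABC)))  # prints 0x456abc
```

The offset passes through unchanged; only the page frame number is replaced by the lookup.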

Segmentation unit

The segmentation unit converts a logical address into a linear address. A user program works only with logical (virtual) addresses and never with physical addresses directly; the memory management unit redirects logical addresses to physical memory. A logical address consists of a segment identifier (the segment selector) and an offset. Memory is divided into segments such as the code segment, data segment, stack segment and extra segment. A segment is accessed through one of the segment registers provided by the processor: CS, DS, SS, ES, FS and GS.

CS (Code Segment) – points to the segment that holds the program instructions

SS (Stack Segment) – points to the segment that holds the current program stack

DS (Data Segment) – points to the segment that holds the global and static data of the program

Each segment is described by an 8-byte segment descriptor, which specifies the characteristics of the segment.

Segment descriptors are stored in either the Global Descriptor Table (GDT) or a Local Descriptor Table (LDT).

Fig. 1: Overview of Segmentation Unit

This is the segmentation block, which converts a logical address into a linear address.

Segment Selector

The segment selector is a 16-bit field that identifies the segment descriptor, the table type (GDT or LDT) and the requested privilege level.

Fig. 2: Image showing Data Format of Segment Selector

The Index is the 13 most significant bits of the segment selector. It identifies the segment descriptor entry in the GDT or LDT.

TI stands for Table Indicator. The segment descriptor is in the GDT if the TI bit is zero and in the LDT if the TI bit is set.

RPL (Requested Privilege Level) is the 2-bit field in the least significant bits of the segment selector. When the selector is loaded into the code segment register, these bits specify the current privilege level of the CPU (i.e. user level or kernel level).
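As a rough illustration of the three fields, the following Python sketch pulls them out of a 16-bit selector value. The function name decode_selector and the sample selector 0x0023 are chosen for the example only.

```python
def decode_selector(selector):
    """Split a 16-bit segment selector into (index, table, rpl)."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("a segment selector is a 16-bit value")
    index = selector >> 3        # upper 13 bits: descriptor index
    ti = (selector >> 2) & 1     # bit 2: Table Indicator (0 = GDT, 1 = LDT)
    rpl = selector & 0b11        # lowest 2 bits: requested privilege level
    return index, "LDT" if ti else "GDT", rpl


# Sample value for illustration: 0x0023 is binary 0000000000100 0 11.
print(decode_selector(0x0023))  # prints (4, 'GDT', 3)
```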

Offset

The offset is added to the base address of the segment to reach the target address inside the segment. The offset is a 32-bit field of the logical address; it is not part of the segment selector.

A logical address is the combination of a segment selector and an offset.

Logical Address = Segment Selector (16 bit) : Offset (32 bit)

Gdtr or Ldtr

Each process can have its own LDT, but there is only a single GDT in the system. The address and size of the GDT are held in gdtr (Global Descriptor Table Register), and the address and size of the LDT currently in use are held in ldtr (Local Descriptor Table Register).

GDT or LDT

The GDT is the Global Descriptor Table, which describes shared memory and kernel memory. The system can create an LDT when a new process is created. Each LDT is itself a segment, and the descriptor of that segment is stored in the single GDT. An LDT stores the private memory information of a single process or task.

Global Descriptor Process

Fig. 3: Image showing Global Descriptor Process

The GDT holds segment descriptor entries. Each entry is 8 bytes long. The first entry in the GDT (index 0) is the null descriptor and does not describe a usable segment. The maximum number of entries depends on the index field of the selector. The index is 13 bits long, so a table can hold 2¹³ = 8192 entries, of which 2¹³ – 1 = 8191 are usable in the GDT.

The address of a segment descriptor inside the GDT or LDT can be found with the following formula:

Descriptor address = GDT or LDT base address + [13 bit index of selector x 8]

For example, suppose the GDT is at 0x00002300 (this address is stored in gdtr) and the index field of the segment selector is 4.

Descriptor address = 0x00002300 + [4 x 8]

= 0x00002300 + 32 (32 = 0x20 in hex)

= 0x00002320

The offset of the logical address is then added to the base address stored in the segment descriptor, which gives the linear address.
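The worked example above, followed by the final addition, can be reproduced with a short Python sketch. The GDT address 0x00002300 and index 4 come from the example; the segment base 0x00100000 and offset 0x1234 are hypothetical values added only to show the last step.

```python
DESCRIPTOR_SIZE = 8  # each segment descriptor is 8 bytes long


def descriptor_address(table_base, selector):
    """Address of the descriptor that a selector refers to."""
    index = selector >> 3                    # upper 13 bits of the selector
    return table_base + index * DESCRIPTOR_SIZE


def linear_address(segment_base, offset):
    """Linear address = segment base + offset, kept within 32 bits."""
    return (segment_base + offset) & 0xFFFFFFFF


gdt_base = 0x00002300    # value held in gdtr in the example above
selector = 4 << 3        # index 4, TI = 0 (GDT), RPL = 0
print(hex(descriptor_address(gdt_base, selector)))  # prints 0x2320

# Hypothetical base read from that descriptor, and a hypothetical offset.
print(hex(linear_address(0x00100000, 0x1234)))      # prints 0x101234
```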

Segment Descriptor

A segment descriptor provides the information about a segment: its base address, its size (limit), its type and its privilege level.
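To make the 8-byte descriptor more concrete, here is a minimal Python sketch that unpacks a few of its fields. The bit positions follow the IA-32 descriptor layout rather than anything stated in this tutorial. The function name decode_descriptor and the sample value (a flat 4 GB code segment) are chosen for illustration.

```python
def decode_descriptor(raw):
    """Unpack selected fields of an 8-byte descriptor given as a 64-bit integer."""
    limit = (raw & 0xFFFF) | (((raw >> 48) & 0xF) << 16)           # 20-bit limit
    base = ((raw >> 16) & 0xFFFFFF) | (((raw >> 56) & 0xFF) << 24)  # 32-bit base
    access = (raw >> 40) & 0xFF                                      # access byte
    return {
        "base": base,
        "limit": limit,
        "type": access & 0xF,                     # segment type bits
        "dpl": (access >> 5) & 0x3,               # descriptor privilege level
        "present": bool(access >> 7),             # P bit
        "granularity_4k": bool((raw >> 55) & 1),  # G bit: limit counted in 4 KB units
    }


# Sample descriptor for illustration: base 0, limit 0xFFFFF, 4 KB granularity.
print(decode_descriptor(0x00CF9A000000FFFF))
```

With the G bit set, a limit of 0xFFFFF covers (0xFFFFF + 1) x 4 KB = 4 GB.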

Paging Unit

The paging unit converts a linear address into a physical address. What is a page? A page is a fixed-length block of contiguous linear addresses, together with the data stored at those addresses. RAM is partitioned into fixed-length page frames, and each page frame typically holds 4 KB. Many Linux file systems use the same 4 KB unit as their block size, so even a small non-empty file typically occupies 4 KB on disk. Do not confuse a page with a page frame: a page frame is a physical area of RAM, while a page is the block of data that is stored in a page frame.

Before looking at the paging mechanism, let us clear up some confusion about the various units.

Page: A page is a fixed-size block of linear addresses (and the data they contain).

Page Frame: Physical memory consists of a number of page frames. Each page frame holds exactly one page.

Page Table: A page table contains a number of entries, each of which maps a page to a page frame.

Fig. 4: Overview of Paging Unit

The output of the segmentation unit is a 32-bit linear address, converted from the logical address. With regular paging, this linear address is divided into three fields: Directory, Table and Offset.

Either of the following paging mechanisms can be used to organize memory in the system:

Regular Paging
Extended Paging

Regular Paging

In regular paging, the 32-bit linear address is divided into three fields named Directory (the 10 most significant bits), Table (the middle 10 bits) and Offset (the 12 least significant bits).

Fig. 5: Image showing Format of Regular Paging

For 32-bit linear addresses, the Linux system uses a two-step translation from linear to physical address, based on translation tables. The first translation table is the page directory and the second is the page table.

CR3 contains the physical address of the current page directory. The Directory field of the linear address selects an entry in the page directory, and that entry gives the address of a page table. The Table field selects an entry in that page table, which gives the address of the page frame, and the Offset field selects the byte inside the page frame. The page directory and each page table hold 1024 entries, and each 4 KB page frame holds 2¹² = 4096 bytes. In total, 1024 x 1024 x 4096 = 2³² bytes (4 GB) of memory can be addressed.
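The two-step walk can be sketched in Python as follows. This is a toy model only: the page directory and page table are dictionaries with invented contents, and the variable page_directory stands in for the structure that CR3 points to.

```python
def split_linear(addr):
    """Split a 32-bit linear address into (directory, table, offset)."""
    directory = (addr >> 22) & 0x3FF   # 10 most significant bits
    table = (addr >> 12) & 0x3FF       # middle 10 bits
    offset = addr & 0xFFF              # 12 least significant bits
    return directory, table, offset


# Hypothetical tables: a page table maps a Table index to a page frame base,
# and the page directory maps a Directory index to a page table.
page_table_a = {0x002: 0x00ABC000}
page_directory = {0x001: page_table_a}   # stands in for what CR3 points to


def translate(addr):
    """Walk the page directory, then the page table, then add the offset."""
    directory, table, offset = split_linear(addr)
    page_table = page_directory[directory]   # first step: page directory
    frame_base = page_table[table]           # second step: page table
    return frame_base + offset


print(split_linear(0x00402345))    # prints (1, 2, 837)
print(hex(translate(0x00402345)))  # prints 0xabc345
```

A missing dictionary key here plays the role of a page fault; a real kernel would handle it instead of raising KeyError.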

Extended Paging

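Extended paging works like regular paging, but uses larger pages: the page size is 4 MB instead of 4 KB. The linear address is divided into only two fields, so the page directory entry points directly to a 4 MB page frame and no page table is needed. The Directory field is the 10 most significant bits, and the Offset field is the 22 least significant bits of the 32-bit address.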
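For comparison with regular paging, the following Python sketch splits an address for 4 MB pages. The page directory contents and the names split_extended and translate_extended are invented for the example.

```python
EXT_OFFSET_BITS = 22                    # offset width with extended paging
EXT_PAGE_SIZE = 1 << EXT_OFFSET_BITS    # 4 MB


def split_extended(addr):
    """Split a 32-bit linear address into (directory, offset) for 4 MB pages."""
    return (addr >> EXT_OFFSET_BITS) & 0x3FF, addr & (EXT_PAGE_SIZE - 1)


# Hypothetical page directory: Directory index -> base of a 4 MB page frame.
page_directory = {0x003: 0x08000000}


def translate_extended(addr):
    """Single-step translation: the directory entry points straight at the frame."""
    directory, offset = split_extended(addr)
    return page_directory[directory] + offset


print(EXT_PAGE_SIZE)                        # prints 4194304
print(split_extended(0x00C12345))           # prints (3, 74565)
print(hex(translate_extended(0x00C12345)))  # prints 0x8012345
```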