Memory Management

In order for programs to be executed by the processor, they must be loaded into main memory, which is several orders of magnitude faster than secondary storage in terms of its access time. Historically, main memory has always been significantly more expensive than secondary storage, and system designers have tried to optimise its use. Although costs have fallen steadily, this is still the case today, and modern operating systems and applications require large amounts of memory in order to run.

The memory management component of an operating system is concerned with the organisation and management of system memory. It determines how memory is allocated to processes, responds to constantly changing demands, and interacts with memory management hardware (if present) to maximise efficiency.


One of the main limitations imposed on programmers in the early days of computing was the size of the computer's memory. If the program was larger than the available memory, it could not be loaded, which placed severe restrictions on program size. The obvious solution would be to increase the amount of memory available, but this would significantly increase the cost of the computer system.

One way for a programmer to overcome these limitations was to use overlays. The programmer divided the program into a number of logical sections. A small portion of the program had to remain in memory at all times, but the remaining sections (or overlays) were loaded only when they were needed. The use of overlays allowed programmers to write programs that were much larger than physical memory, although responsibility for managing memory usage rested with the programmer rather than the operating system.

A program is written using overlays to overcome memory limitations

A program is written using overlays to overcome memory limitations

Partitions and protected memory

Another issue that must be dealt with in terms of memory management is that of memory protection. In a single-user, single-tasking system, this is relatively easy, since it is simply a matter of protecting the operating system's memory space from the user's program. Memory is effectively split into two partitions – one partition is reserved for the operating system, while the other is available for user programs.

Protection can be implemented using a boundary register built into the processor. The boundary register contains the base memory address of the user program, effectively defining the lower bounds of the memory space available to user processes. Each time a user process references a memory address, the system checks the referenced address against the value stored in the boundary register. If the user process has attempted to reference a memory location with a lower address, the process is terminated with an appropriate error message.

Operating system memory is protected

Operating system memory is protected

In a multi-programming environment, matters are inevitably going to be far more complicated. In early multi-programming systems, the memory management function had to protect the operating system memory space from all user processes loaded into memory as before, but it now also had to protect those user processes from each other.

These early multi-programming systems used fixed partitions to allocate memory to user processes, each of which could hold a single process. Each partition was defined by two boundary registers – a base register and a limit register for the for the lower and upper memory addresses respectively. Although more complex than systems in which only a single user process was allowed to run, memory management was still relatively simple because each process ran within a pre-defined partition.

A multi-programming system with fixed partitions

A multi-programming system with fixed partitions

Internal fragmentation

The main problem with the fixed-partition system was that each partition could accommodate only a single process. If the process did not use all of the memory within a partition, the remaining memory could not be used for any other process. The loss of available memory within a partition due to the fact that the process allocated to the partition may be significantly smaller than the partition in which it executes is known as internal fragmentation. Often, there was enough unused system memory to run another process, but there were no available partitions. This meant that the unused memory could not be utilised, and was plainly an inefficient use of memory resources.

Processes are queued awaiting an available partition

Processes are queued awaiting an available partition

Variable partitions

The main problem with fixed partitions is that they are rarely if ever exactly the right size for a particular process. If the partition is too small, a process cannot be loaded into it and must await the availability of a large partition. If the partition is too large, the process may be loaded but significant amounts of the available memory within the partition will be wasted.

The idea of a variable-size partition allows the operating system to create partitions of exactly the right size for a particular process wherever there is sufficient space available. Internal fragmentation is thus eliminated, but a new problem emerges. Processes are initially added to the system using contiguous memory space. As each process completes, however, it is removed from the system leaving a hole in memory. This hole can be used for a new process, providing it will fit into the space made available. Since it highly unlikely that any new processes will fit exactly into the gaps left by the old processes, the only processes able to use these gaps in memory will be smaller than their predecessors.

Processes in variable partitions can leave memory holes

Processes in variable partitions can leave memory holes

The operating system must also determine where to place incoming programs when there may be a number of holes large enough to accommodate the newly created process. There are three memory placement strategies that have been commonly used for this purpose:

Over time and regardless of the memory placement strategy used, a number of small holes will be created in memory that cannot be utilised by new processes because they are simply too small. This is referred to as external fragmentation. In order to counter this, operating systems evolved the ability to relocate partitions so that the holes were effectively moved to the top of memory and merged into one much larger unused memory space. This procedure is sometimes called compaction, and while it enables memory to be used far more efficiently, it increases yet again the complexity of the memory management function. The additional system overhead required to implement a compaction scheme is considerable.

Memory compaction

Memory compaction

In early single-tasking operating systems, only one process could reside in memory at any one time. This did not allow for particularly efficient use of the processor, since each time execution of the resident process was halted awaiting the completion of some I/O operation such as reading from or writing to disk, the processor was effectively idle.

As the amount of available memory increased it became possible to load multiple processes into memory simultaneously, enabling the processor to be quickly switched between them to provide the illusion that a number of processes were executing at the same time. It also enabled the processor to be immediately allocated to a waiting process in the event of the current process being halted to await the completion of an I/O request.

In multitasking systems, numerous processes now compete for the available memory, and one or more of the resident processes may have to give up memory to accommodate an incoming process with a higher priority. A process that relinquishes its memory before it has completed in order to make way for another process is temporarily stored on disk (in what is often referred to as virtual memory) until system memory is once more available (or until it achieves a sufficiently high priority level to enable it to replace another process currently stored in main memory).

The movement of processes back and forth between system memory and secondary storage in this way is called "swapping", and may occur several times during the execution of a single process. Inevitably, swapping incurs additional operating system overhead and adds complexity to the business of managing memory. It should also be fairly self-evident that the degree to which swapping must occur can be reduced by increasing the amount of memory available.

Paging and segmentation

The need to load an entire program into memory in one contiguous block places severe restrictions on memory usage. The use of overlays overcame the limitations imposed by the physical size of memory, but in order to make efficient use of the processor, several processes needed to be in memory at one time.

As physical memory became larger this was certainly possible, but the memory space required by each process could vary widely, and for the operating system, the main problem was trying to find the most suitable memory space for a particular process. Inevitably, some of the memory could not be used, and in many cases a process would be too big to fit into any of the available spaces.

One answer was to break each process down into a number of relatively small fixed-size blocks, called pages, which could be loaded into memory as and when required. The memory itself was also organised into blocks of the same size, called page-frames. Each page could be loaded into any available page frame, which meant that the entire available memory space could be used for user processes, and allowed far more processes to be in memory at the same time. This was a far more efficient use of computer memory, but also imposed a much greater burden on the operating system in terms of memory management overhead, since it must now keep track of multiple pages for each user process.

One of the benefits of a paging system is that only a relatively small subset of the pages belonging to a particular process need to be loaded at any one time. In fact theoretically, only the page that is currently executing needs to reside in memory. When a process references a page that is not in main memory, it generates a page fault, which prompts the operating system to load the missing page into memory from secondary storage.

The frequency with which this occurs will depend on the size of the page, and the total number of pages which make up a process. In practice, there will be a minimum number of pages that should reside in memory at any given time for a particular process. This number is known as the minimum working set, and will be determined to a large extent by the overall size of the user program.

Another method of utilising memory, called segmentation, is also based on the notion of splitting a process into blocks, but the size of the blocks (or segments) is variable. A scheme that allows the use of variable size segments can be useful from a programmer's point of view, since it lends itself to the creation of modular programs, but the operating system now not only has to keep track of the starting address of each segment, but since they are variable in size, must also calculate the offset to the end of each segment. Some systems combine paging and segmentation by implementing segments as variable-size blocks composed of fixed-size pages.

Virtual memory

The concept of virtual memory is closely associated with that of paging. As we have mentioned previously, virtual memory effectively extends the amount of memory available to applications by using some of the system's secondary storage space (i.e. the hard disk drive) as working memory. All or part of a process may be swapped between real memory and this virtual memory. This both allows more processes to co-exist on the system and eliminates the limitations on overall process size.

The only factor that limits process size, in fact, is the amount of secondary storage that is made available for virtual memory. In a sense, virtual memory systems create the illusion that there is more working memory available than is actually the case.

The use of a paging scheme, in conjunction with the use of virtual memory, leads to a situation where the individual pages of a process may be located in a number of different locations in both real memory and virtual memory. In fact, during the course of process execution, an individual page may be swapped between real and virtual memory a number of times, and is unlikely to occupy the same address twice.

This presents the operating system with a seemingly monumental task in terms of keeping track of the whereabouts of each page, for each process on the system. Furthermore, the process of swapping pages in and out of memory creates additional system overhead. As previously mentioned, it is generally accepted that the more random access memory (RAM) a system has the better it will perform, since less time will be spent swapping pages between real and virtual memory.

In a system that employs virtual memory, the size of the pages can have an effect on system efficiency. If the pages are small, the operating system will require more tracking information for each process because there will be more pages. If, on the other hand, the size of the pages is large, it will take longer to transfer each page between real memory and virtual memory, and vice versa, when swapping occurs.

Memory mapping

In a paged virtual memory system, there are actually two different kinds of address. A program instruction within a process, for example, will have an internal (or virtual) address that will be used to reference it from within the process. A virtual address will consist of a page number and an offset from the start of the page.

When the process is loaded into memory, however, the same program instruction will also have a physical address in memory. When a virtual address is referenced by a process, therefore, the operating system must translate this virtual address to a physical address. This is further complicated by the fact that the various pages of a process may be loaded into non-contiguous page frames, and may be swapped in and out of memory several times during the lifetime of the process.

A virtual address consists of the page number and offset

A virtual address consists of the page number and offset

The operating system creates a page table in memory for each process. The table contains a page table entry for each page of the process, and the entries are ordered sequentially by page number. When a process is assigned to the processor, the address of the page table in memory is loaded into a special purpose register called the page map origin register.

Each time the process references a virtual address, the system interrogates this register to obtain the base address of the page table, and adds the page number to give it the address of the page table entry. The entry will hold the starting address in memory of the page. Adding the offset will provide the required physical address.

The diagram below illustrates how a virtual address referenced by a process is translated into a physical address in memory. The mapping is performed dynamically by a high-speed, special-purpose hardware unit called the memory management unit (MMU).

Virtual address translation (memory mapping)

Virtual address translation (memory mapping)

Note that only some of the pages belonging to a particular process will be loaded into memory at any given time. The page table entry for each page must therefore indicate whether or not the page currently resides in memory. This is achieved using a 1-bit field in the entry called the resident bit. If the resident bit is set to 1, the entry gives the page frame number. If it is set to 0, the entry provides the location of the page in secondary storage.

Page replacement policies

When a new page needs to be brought into memory, it will often be necessary to remove one that is currently resident. The algorithms used to determine which pages will be replaced are called page replacement policies. Some of the policies that may be used include: