Operating System Summary#

I. Overview#

Concurrency and Parallelism#

  • Concurrency: A single processor handles multiple tasks "simultaneously"; here, "simultaneously" is meant at a macroscopic level, in that the tasks are actually executed alternately within a given time period.
  • Parallelism: Multiple processors or a multi-core processor handle multiple different tasks at the same time; here, "simultaneously" means truly executing multiple tasks at the same instant.

Sharing#

Sharing refers to the resources in the system being used by multiple concurrent processes. There are two types of sharing: mutual exclusion sharing and simultaneous sharing. Resources that are mutually exclusive are called critical resources, such as printers, where only one process is allowed to access them at the same time, requiring synchronization mechanisms to manage access to critical resources.

Virtualization#

Virtualization technology converts a physical entity into multiple logical entities, primarily through two types of virtualization technologies: time division multiplexing and space division multiplexing. For example, multiple processes can concurrently execute on the same processor using time division multiplexing, allowing each process to take turns occupying the processor, executing for a small time slice and quickly switching.

Synchronous and Asynchronous#

  • Synchronous means the entire processing sequence is executed in order, and results are returned only after all steps have completed. This is a linear execution model in which execution flows do not overlap. It is generally used for flows with strict ordering requirements, such as user login, where the system cannot log the user in until verification is complete.

  • Asynchronous means that only the call instruction is sent, and the caller does not need to wait for the called method to complete; instead, it continues executing the following processes. This is a parallel processing method where there is no need to wait for one program to finish executing before executing other tasks, such as in the process of loading page data, where the page does not need to wait for all data to be retrieved before displaying.

Kernel Mode and User Mode#

  • Kernel Mode: The CPU can access all data in memory, including peripheral devices such as hard drives and network cards. The CPU can also switch itself from one program to another.

  • User Mode: Access to memory is limited, and access to peripheral devices is not allowed. The program can be preempted from the CPU, so CPU time can be given to other programs.

  • There are three ways for user mode to switch to kernel mode: system calls, exceptions, and interrupts from peripheral devices.


II. Processes and Threads#

Process#

A process is an instance of a program in execution, with its own program counter and internal state, serving as an independent unit for resource allocation and scheduling in the system. The Process Control Block (PCB) describes the basic information and running state of the process. The creation and termination of a process refer to operations on the PCB.

Thread#

A thread is the basic unit of independent scheduling. A process can have multiple threads that share the process's resources. For example, in a browser, there are many threads within the browser process, such as HTTP request threads, event response threads, rendering threads, etc. The concurrent execution of threads allows the browser to respond to other user events while initiating an HTTP request when a new link is clicked.

Differences#

  • Resource Ownership
    A process is the basic unit of resource allocation, while a thread does not own resources; threads can access resources belonging to the process.

  • Scheduling
    Threads are the basic unit of independent scheduling. Within the same process, switching threads does not cause a process switch. Switching from a thread in one process to a thread in another process does cause a process switch.

  • System Overhead
    When creating or terminating a process, the system must allocate or reclaim resources such as memory space and I/O devices, resulting in overhead that is much greater than that incurred when creating or terminating threads. Similarly, during process switching, the current executing process's CPU environment must be saved, and the new scheduled process's CPU environment must be set up, while thread switching only requires saving and setting a small amount of register content, resulting in minimal overhead.

  • Communication
    Inter-process communication (IPC) requires synchronization and mutual exclusion mechanisms to ensure data consistency. Threads can communicate by directly reading/writing data segments (such as global variables) within the same process.

Process States#

  • Ready State: The process has acquired all resources outside the CPU and can execute immediately once CPU resources are allocated.

  • Running State: The process has acquired CPU resources and is executing.

  • Blocked State: When the program does not meet certain conditions, it must wait until those conditions are satisfied to execute, such as waiting for I/O operations; this state is called the blocked state.

Thread States#

  • New: The thread is derived from within a process; it can be spawned by either a process or another thread.

  • Blocked: If a thread needs to wait for an event during execution, it becomes blocked.

  • Unblocked: If the event that blocked the thread occurs, the thread is activated and enters the ready queue.

  • Scheduled: A ready thread is selected to enter the execution state.

  • Finished: If a thread finishes execution, its register context and stack contents will be released.

Thread Implementation Methods#

  • User-Space Thread Implementation
    User-level threads are implemented in user programs without kernel support. They do not depend on the operating system core, and application processes use thread library functions to create, synchronize, schedule, and manage threads. There is no need for user mode/kernel mode switching, making it fast. The operating system kernel is unaware of the existence of multiple threads, so if one thread blocks, the entire process (including all its threads) will block. Since processor time slice allocation is based on processes, the execution time for each thread is relatively reduced.

  • Kernel-Level Thread Implementation
    Kernel-level threads are created and terminated by the operating system kernel. The kernel maintains context information for processes and threads, as well as thread switching. A kernel thread blocked due to I/O operations does not affect the execution of other threads.

  • Advantages of User-Space Thread Implementation

    1. Can be implemented in operating systems that do not support threads.
    2. The cost of creating and destroying threads and the cost of thread switching are much lower than those of kernel threads.
    3. Allows each process to customize its own scheduling algorithm, making thread management more flexible.
    4. Threads can utilize more table space and stack space than kernel-level threads.
  • Disadvantages of User-Space Thread Implementation

    1. Only one thread can run at a time within the same process; if one thread makes a blocking system call, the entire process is suspended.
    2. Page faults can also cause the entire process to be suspended.

The advantages and disadvantages of kernel-level threads are the mirror image of those of user-level threads. In practice, an operating system can use a hybrid approach to implement threads.

Coroutines#

Coroutines are a lighter-weight alternative to threads. Just as a process can have multiple threads, a thread can have multiple coroutines. Importantly, coroutines are not managed by the operating system kernel but are controlled entirely by the program (they execute in user mode). This brings significant performance benefits, since switching between coroutines does not incur the cost of a thread switch. Essentially, coroutines are threads implemented in user space: there is no kernel context switch and no switching between user and kernel address spaces, because all of a thread's coroutines run within that single thread.
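As a minimal illustration of this idea, here is a sketch using Python's asyncio (also mentioned in the list below); the task names and sleep durations are invented for the example. Both coroutines are interleaved by the event loop within a single OS thread, with no kernel involvement in the switches.

```python
import asyncio

async def worker(name: str, delay: float) -> None:
    # Each await point is where the coroutine voluntarily yields control;
    # the switch happens in user space, not via a kernel thread switch.
    for i in range(3):
        print(f"{name}: step {i}")
        await asyncio.sleep(delay)

async def main() -> None:
    # Both coroutines run interleaved on the same OS thread.
    await asyncio.gather(worker("A", 0.1), worker("B", 0.1))

asyncio.run(main())
```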

Applications of Coroutines#

  • Lua has used coroutines since version 5.0, implemented through the coroutine extension library.
  • Python can implement coroutines using yield/send. After Python 3.5, async/await became a better alternative.
  • Go language has a powerful and concise implementation of coroutines, allowing the easy creation of hundreds or thousands of coroutines to execute concurrently.
  • Java does not have native support for coroutines, but some open-source frameworks simulate coroutine functionality, such as the Kilim framework source code.

Questions and Thoughts#

  • There is a lot of conflicting information online about the states of processes and threads, which makes their relationship hard to pin down. My current view is that processes, like threads, also go through new and finished states, but only running, ready, and blocked should be regarded as the basic states; new, finished, and awakened are better viewed as operations that drive state transitions.
  • In Java, threads have an additional waiting state, which is essentially a finer distinction of the blocked state into BLOCKED and WAITING. The difference is that waiting is active waiting, while blocking is passive blocking. In the waiting state, the thread can only enter the ready state if it is actively awakened, while in the blocked state, the thread is continuously contending for a lock and will switch to the ready state as soon as it acquires it without needing to be actively awakened by others.

III. Inter-Process Communication Methods#

Pipes#

A pipe is a shared file that connects a reading process and a writing process for data exchange.

  • It is half-duplex (i.e., data can only flow in one direction) and has fixed read and write ends; a pipe only has meaning when both read and write ends exist.
  • When the read end finds the pipe empty, it needs to sleep and wait until data is available to be awakened; correspondingly, the write end also waits when the pipe is full until it is awakened.
  • It can be viewed as a special file and read and written with the ordinary read and write functions. However, it is not an ordinary file, does not belong to any file system, and exists only in memory.
  • Mutual exclusion: Only one process can read or write to the pipe at any given time.
  • There are unnamed pipes and named pipes; the former is used for communication between parent and child processes, while the latter is used for communication between any two processes running on the same machine.
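To make the unnamed-pipe case concrete, here is a minimal sketch using Python's os.pipe and os.fork (POSIX only); the message contents are arbitrary.

```python
import os

# Create an unnamed pipe: r is the read end, w is the write end.
r, w = os.pipe()

pid = os.fork()
if pid == 0:
    # Child process: writes into the pipe, so it closes the unused read end.
    os.close(r)
    os.write(w, b"hello from child")
    os.close(w)
    os._exit(0)
else:
    # Parent process: reads from the pipe, so it closes the unused write end.
    os.close(w)
    data = os.read(r, 1024)  # blocks until the child writes or closes its end
    os.close(r)
    os.waitpid(pid, 0)
    print(data.decode())
```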

Message Queues#

A message queue is a linked list of messages stored in the kernel. A message queue is identified by an identifier (i.e., queue ID).

  • Message queues are record-oriented, with messages having specific formats and priorities.
  • Message queues are independent of the sending and receiving processes; when a process terminates, the message queue and its contents are not deleted.
  • Message queues can implement random querying of messages; messages do not necessarily have to be read in a first-in-first-out order and can also be read by message type.

Signals#

A more complex communication method used to notify the receiving process that a certain event has occurred; it is the only asynchronous communication mechanism in inter-process communication.

Semaphores#

Unlike the IPC structures already introduced, a semaphore is a counter. Semaphores are used to implement mutual exclusion and synchronization between processes, rather than to store inter-process communication data.

  • Semaphores are used for inter-process synchronization; if data needs to be passed between processes, they must be combined with a mechanism such as shared memory.
  • Semaphores are based on the operating system's PV operations, and operations on semaphores are atomic operations.
  • Each PV operation on a semaphore is not limited to adding or subtracting 1 from the semaphore value; it can add or subtract any positive integer.
  • Supports semaphore groups.
  • Note⚠️: Semaphores and signals are not the same concept; there are significant differences between the two.

Shared Memory#

Refers to two or more processes sharing a given storage area.

  • Shared memory is the fastest form of IPC because processes directly access memory.
  • Since multiple processes can operate simultaneously, synchronization is required.
  • Semaphores and shared memory are usually used together, with semaphores synchronizing access to shared memory.
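The combination described above can be sketched with Python's multiprocessing module, which wraps OS-level shared memory and semaphores; the counter layout and iteration counts are invented for the example.

```python
from multiprocessing import Process, Semaphore, Value

def increment(counter, sem, n):
    for _ in range(n):
        # The semaphore (used here as a mutex) synchronizes access to the
        # shared memory; without it, concurrent updates could be lost.
        sem.acquire()
        counter.value += 1
        sem.release()

if __name__ == "__main__":
    counter = Value("i", 0, lock=False)  # an int in shared memory, no built-in lock
    sem = Semaphore(1)                   # binary semaphore guarding the counter
    procs = [Process(target=increment, args=(counter, sem, 10000)) for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(counter.value)  # 40000 with the semaphore; typically less without it
```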

Sockets#

Sockets are also a mechanism for inter-process communication, enabling communication between processes on different hosts. A socket can be seen as an endpoint for inter-process communication, with each socket having a unique name that other processes can access, connect to, and communicate data.

Summary#

  • Pipes: Slow speed, limited capacity.
  • Message Queues: Capacity is limited by the system, and care must be taken when reading for the first time to consider any previous unread data.
  • Semaphores: Cannot pass complex messages, only used for synchronization.
  • Shared Memory: Capacity is easy to control and access is fast, but synchronization must be maintained; for example, while one process is writing, other processes must be careful about their reads and writes, much like thread safety. Shared memory can also be used for communication between threads, but this is unnecessary, since threads within the same process already share the process's memory.

IV. Synchronization Mechanisms#

Background#

  • Concurrent execution of multiple processes (or threads) can lead to situations where they constrain one another, for example when two processes:
    1. Share a hardware device of which only one exists.
    2. Share the same memory area.
    3. Depend on each other's results when operating on shared resources.

If multiple processes must cooperate in a particular temporal order to complete a task, the relationship is called synchronization; if they are not cooperating toward a common task but are related only because they share exclusive resources, the relationship is called mutual exclusion, which is a special form of synchronization.

  • For synchronization mechanisms, the following four rules must be followed:
    1. Progress when free: if no process/thread is in the critical section, a requesting process/thread may enter.
    2. Wait when busy: if a process/thread is already in the critical section, other processes/threads must not enter.
    3. Bounded waiting: a process/thread waiting to enter the critical section must not be kept waiting indefinitely.
    4. Yield while waiting (optional): a process/thread that cannot enter the critical section should give up the CPU, for example by switching to the blocked state.

Semaphores#

  • Semaphores and PV primitive operations were invented by Dijkstra and are one of the most widely used mutual exclusion methods.

  • Principles of semaphores and PV operations:

    1. When a process wants to enter the shared region, it first performs the P operation, which decrements the semaphore (S = S - 1).
    2. When a process leaves the shared region, it performs the V operation, which increments the semaphore (S = S + 1).
    3. The operations of processes entering and exiting the shared area are atomic operations (the execution process cannot be interrupted).
  • Defects: The readability of the program is relatively poor, and the management of semaphores is scattered among the participating objects, which may lead to deadlocks, starvation, and other issues.
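In most thread libraries the P and V operations appear under other names. As a rough sketch (assuming Python's threading module, where acquire plays the role of P and release the role of V), guarding a shared region looks like this:

```python
import threading

S = threading.Semaphore(1)  # S initialized to 1: at most one thread in the shared region
shared = []

def use_shared_region(item):
    S.acquire()              # P operation: S - 1; blocks if S would drop below 0
    try:
        shared.append(item)  # code inside the shared (critical) region
    finally:
        S.release()          # V operation: S + 1; wakes a waiting thread, if any

threads = [threading.Thread(target=use_shared_region, args=(i,)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(shared)
```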

Monitors (Unique to Process Synchronization)#

  • Monitors are an extension and improvement of the semaphore mechanism and are a simpler synchronization method.
  • A monitor is an object or module that can be safely accessed by multiple processes/threads.
  • The methods within a monitor are protected by a mutex, meaning that only one visitor is allowed to use them at any given time.
  • Characteristics: Safety, Mutual Exclusion, Sharing.

Critical Section (Thread Synchronization)#

A critical section serializes access by multiple threads to a shared resource or a section of code. It is fast and well suited to controlling access to data. At any given time, only one thread is allowed to access the shared resource; if several threads attempt access, then once one thread has entered, the others are suspended and must wait until the thread inside the critical section exits. Only after the critical section is released can another thread claim it.

Mutual Exclusion#

A mutex is more complex than a critical section. It allows safe sharing of a public resource not only among different threads within the same application but also among threads of different applications.

Events (Thread Synchronization)#

Events maintain thread synchronization through notification operations and can also facilitate operations for comparing the priorities of multiple threads.

Differences in Synchronization Mechanisms Between Processes and Threads#

  • Semaphores, monitors, and mutexes are synchronization mechanisms for processes, while semaphores and mutexes can also be used for thread synchronization, but monitors are only used for process synchronization.
  • Thread synchronization, in addition to semaphores and mutexes, also includes critical sections and events; there is no mention of these two methods being used as synchronization methods for processes (unknown).

Classic Synchronization Problems#

  • Producer-Consumer Problem
  • Dining Philosophers Problem
  • Reader-Writer Problem

Detailed treatments of these classic problems can be found in standard operating-system references.
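As an illustration of the first of these problems, here is a hedged sketch using counting semaphores from Python's threading module; the buffer size and item counts are arbitrary.

```python
import threading

BUF_SIZE = 5
buffer = []
mutex = threading.Semaphore(1)         # mutual exclusion on the buffer
empty = threading.Semaphore(BUF_SIZE)  # counts free slots
full = threading.Semaphore(0)          # counts filled slots

def producer():
    for item in range(10):
        empty.acquire()   # P(empty): wait for a free slot
        mutex.acquire()   # P(mutex): enter the critical section
        buffer.append(item)
        mutex.release()   # V(mutex)
        full.release()    # V(full): signal that an item is available

def consumer():
    for _ in range(10):
        full.acquire()    # P(full): wait for an item
        mutex.acquire()
        item = buffer.pop(0)
        mutex.release()
        empty.release()   # V(empty): a slot has been freed
        print("consumed", item)

p = threading.Thread(target=producer)
c = threading.Thread(target=consumer)
p.start(); c.start()
p.join(); c.join()
```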


V. Deadlock#

Deadlock refers to a situation where multiple processes are in a state of impasse due to competing for resources. If no external force intervenes, the processes in deadlock cannot continue execution.

Causes of Deadlock#

  • Insufficient system resources.
  • Improper order of process execution.
  • Improper resource allocation, etc.

Four Necessary Conditions for Deadlock#

  • Mutual Exclusion Condition: Processes use the allocated resources exclusively.
  • Hold and Wait Condition: Processes that are blocked do not release the resources they have acquired.
  • No Preemption Condition: Resources that have been acquired by a process cannot be forcibly taken away until the process has finished using them.
  • Circular Wait Condition: There exists a circular wait chain of processes and resources when deadlock occurs.

Deadlock Handling#

  • Deadlock Prevention: Break one or more of the four necessary conditions for deadlock; this is relatively simple to implement, but if restrictions are too strict, it may reduce system resource utilization and throughput.
  • Deadlock Avoidance: Prevent the system from entering an unsafe state (a state that may lead to deadlock) during dynamic resource allocation, such as the Banker's Algorithm. Refer to Banker's Algorithm.
  • Deadlock Detection: Allow the system to run and potentially enter deadlock; after deadlock occurs, use certain algorithms to detect it and identify the resources and processes related to the deadlock, taking relevant measures to eliminate the detected deadlock. This is challenging to implement.
  • Deadlock Recovery: In conjunction with deadlock detection, free the system from deadlock (by terminating processes or preempting resources). For processes and resources detected as being in deadlock, release some resources by terminating or suspending them and reallocating them to processes in a blocked state, changing them to the ready state. This is also challenging to implement.
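The Banker's Algorithm mentioned under deadlock avoidance reduces to a safety check: repeatedly find a process whose remaining need can be met with the currently available resources, pretend it finishes, and reclaim its allocation. A minimal sketch, with an invented three-process, three-resource state:

```python
def is_safe(available, allocation, need):
    """Return (safe?, safe_sequence) for the given resource-allocation state."""
    work = list(available)
    n = len(allocation)
    finished = [False] * n
    sequence = []
    progressed = True
    while progressed:
        progressed = False
        for i in range(n):
            # Process i can finish if its remaining need fits within 'work'.
            if not finished[i] and all(need[i][j] <= work[j] for j in range(len(work))):
                # When it finishes, it returns all of its allocated resources.
                work = [work[j] + allocation[i][j] for j in range(len(work))]
                finished[i] = True
                sequence.append(i)
                progressed = True
    return all(finished), sequence

# Hypothetical state: 3 resource types, 3 processes.
available = [2, 1, 0]
allocation = [[1, 0, 1], [1, 1, 0], [0, 1, 1]]
need = [[1, 1, 0], [0, 1, 1], [2, 0, 0]]
print(is_safe(available, allocation, need))  # (True, [0, 1, 2])
```

If the check returns True, granting the pending request keeps the system in a safe state; otherwise the request is delayed.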

VI. Scheduling Algorithms#

First-Come, First-Served (FCFS)#

Can be used as both job scheduling and process scheduling algorithms; schedules jobs or processes in the order they arrive; therefore, it is more favorable for long jobs.

Shortest Job First (SJF)#

A job scheduling algorithm that selects the job with the shortest estimated time from the ready queue for processing until a result is obtained or it cannot continue execution; disadvantage: unfavorable for long jobs; does not consider job importance; running time is estimated and unreliable.

Highest Priority First (HPF)#

Can be used as both job scheduling and process scheduling algorithms; when scheduling jobs, it selects the job with the highest priority from the ready queue for processing; since it involves priority, it can be divided into preemptive and non-preemptive; the determination of priority can also be divided into static priority (a fixed value determined in advance based on process type, resource requirements, user requests, etc.) and dynamic priority (increases or decreases with the progress of the process or waiting time).

Highest Response Ratio Next (HRRN)#

FCFS may cause dissatisfaction among short job users, while SJF may cause dissatisfaction among long job users, leading to the proposal of HRRN, which selects the job with the highest response ratio for execution. Response ratio = 1 + (job waiting time / job processing time).

Round Robin#

Processes are placed in a queue in the order of arrival, and the CPU time slice is allocated to the process at the front of the queue. When the time slice expires, a timer interrupt occurs, pausing the current process and placing it at the end of the queue, repeating the cycle.
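A rough simulation of the behavior just described (process names, burst times, and the time quantum are invented):

```python
from collections import deque

def round_robin(bursts, quantum):
    """Simulate round robin; bursts maps process name -> remaining CPU time."""
    queue = deque(bursts.items())
    clock = 0
    finish = {}
    while queue:
        name, remaining = queue.popleft()
        run = min(quantum, remaining)        # run for one time slice (or less)
        clock += run
        remaining -= run
        if remaining > 0:
            queue.append((name, remaining))  # time slice expired: go to the tail
        else:
            finish[name] = clock             # process completed at this time
    return finish

print(round_robin({"P1": 5, "P2": 3, "P3": 8}, quantum=2))
```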

Multilevel Feedback Queue#

Currently recognized as a better scheduling algorithm; multiple ready queues are set up, each with different priorities. The first queue has the highest priority, and the others decrease in priority. Higher priority queues are allocated shorter time slices. When a process arrives, it is placed in the first queue according to FCFS. If it does not complete after being scheduled, it is placed at the end of the second queue for scheduling. If it still does not complete after the second scheduling, it is placed at the end of the third queue, and so on. Only when the previous queue is empty will the next queue's processes be scheduled.


VII. Memory Management#

Basic Knowledge#

Linking#

Linking combines the compiled object modules into a load module and forms the complete logical address space.
Three linking methods:

  • Static Linking: Linking before loading.
  • Dynamic Linking at Load Time: Linking while loading.
  • Dynamic Linking at Runtime: Linking and loading only when needed during execution.

Purpose: To determine the complete logical address.

Loading#

Loading places the load module into memory and establishes the physical addresses.

Three loading methods:

  • Absolute Loading: Generates absolute addresses at compile time, single program environment.
  • Static Relocation: Converts logical addresses to physical addresses during loading, multi-program batch processing operating systems.
  • Dynamic Relocation: Converts logical addresses to physical addresses at runtime, requiring the setting of relocation registers, modern operating systems.

Purpose: To determine the final physical address.

Fragmentation#

  • Internal Fragmentation (allocated): Memory areas allocated to a process where some parts are unused.
  • External Fragmentation (unallocated): Some free partitions in memory are too small to be utilized.

Contiguous Allocation Management Methods#

  • Single Contiguous Allocation: Memory is divided into system and user areas, allowing only one user program (no external fragmentation, internal fragmentation exists).
  • Fixed Partition Allocation: The entire user space is divided into several fixed-size partitions, which can be equal or unequal. Each partition can only load one job, managed by a partition descriptor table (no external fragmentation, internal fragmentation exists).
  • Dynamic Partition Allocation: Partitions are dynamically established based on the size of the process, using a free partition table or free partition chain for management. The dynamic partition algorithm selects an appropriate free partition for allocation (external fragmentation exists, no internal fragmentation).

Dynamic Partition Allocation Algorithms#

First Fit Algorithm (FF)#

When allocating memory, it starts searching from the head of the free partition table until it finds the first free partition that meets the size requirements. This algorithm is simple to implement, prioritizing the use of lower-address free partitions, thus preserving larger free partitions in higher addresses for later allocation of larger jobs. The downside is that lower addresses are continuously partitioned, leaving many small free partitions that are difficult to utilize, and each search starts from the lower address, which undoubtedly increases search overhead.

Next Fit Algorithm (NF)#

Unlike the first fit algorithm, when allocating memory for a process it does not always start searching from the low-address end, but from the free partition just after the one found in the previous search. A roving search pointer is therefore maintained to indicate where the next search should start, and if the end of the free-partition chain is reached without satisfying the request, the search wraps around to the first free partition. This algorithm spreads allocations more evenly across the free partitions and somewhat reduces the search overhead, but it tends to split up the large free partitions.

Best Fit Algorithm (BF)#

Each time memory is allocated for a job, the smallest free partition that can satisfy the request is allocated to the process; that is, the algorithm looks for the qualifying free partition of minimum size. This avoids "wasting a large partition on a small job," but it easily produces small, hard-to-use fragments.

Worst Fit Algorithm (WF)#

Opposite to the best fit algorithm, it always selects the largest free partition that satisfies the request; that is, the algorithm looks for the qualifying free partition of maximum size. This leaves the remaining part of each partition relatively large and usable, which minimizes fragmentation and benefits small and medium jobs, but it quickly consumes the large free partitions, which is unfavorable for later large jobs.
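The strategies above differ only in which free partition is chosen; here is a small sketch of first fit and best fit over an invented free-partition table.

```python
def first_fit(free_partitions, request):
    # Scan from the lowest address and take the first partition that is large enough.
    for start, size in sorted(free_partitions):
        if size >= request:
            return start
    return None

def best_fit(free_partitions, request):
    # Among all partitions that are large enough, take the smallest one.
    candidates = [(size, start) for start, size in free_partitions if size >= request]
    return min(candidates)[1] if candidates else None

# Hypothetical free-partition table: (start address, size in KB).
free = [(100, 50), (300, 20), (400, 120), (600, 30)]
print(first_fit(free, 25))  # 100: the first partition from the low end that fits
print(best_fit(free, 25))   # 600: the smallest partition that still fits
```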

Non-Contiguous Allocation Management Methods#

Basic Paging Storage Management#

Basic paging storage management does not have page replacement functionality (i.e., it does not implement virtual memory), so the entire program's pages must be loaded into memory before it can run. Since program data is stored in different pages, and pages are distributed discretely in memory, a page table is needed to record the mapping relationship between logical addresses and actual storage addresses to achieve mapping from page numbers to physical block numbers. Since the page table is also stored in memory, accessing data in a paging system requires two memory accesses (one to access the page table in memory to find the specified physical block number, adding the page offset to obtain the actual physical address; the second is to access memory based on the first obtained physical address to retrieve data).

To reduce the efficiency impact caused by two memory accesses, paging management introduces the Translation Lookaside Buffer (TLB) mechanism (similar to caching mechanisms). In memory management with TLB, when accessing memory data, the page number is first queried in the TLB. If found, it indicates that the page table entry to be accessed is in the TLB, and the corresponding physical block number is read directly from the TLB. If not found, the page table in memory is accessed to obtain the physical address, while adding that mapping entry to the TLB, which may involve TLB replacement algorithms.
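The two memory accesses and the TLB shortcut described above can be sketched as follows; the page size, page-table contents, and the toy TLB are invented for the example.

```python
PAGE_SIZE = 4096                      # assume 4 KiB pages
page_table = {0: 5, 1: 9, 2: 3}       # page number -> physical block number
tlb = {}                              # small cache of recently used page-table entries

def translate(logical_address):
    page, offset = divmod(logical_address, PAGE_SIZE)
    if page in tlb:                   # TLB hit: no extra memory access for the page table
        block = tlb[page]
    else:                             # TLB miss: consult the page table in memory
        block = page_table[page]      # KeyError here would correspond to a page fault
        tlb[page] = block             # cache the mapping for the next access
    return block * PAGE_SIZE + offset

print(hex(translate(1 * PAGE_SIZE + 0x2A)))  # page 1 -> block 9, same offset -> 0x902a
```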

Advantages and disadvantages: No external fragmentation, high memory utilization. However, the contents of each page are unrelated, making programming and sharing difficult.

Supplementary Concepts:

  • Page Frame = Memory Block = Physical Block = Physical Page (Memory Partition)
  • Page = Logical Address Space Partition of the Process
  • Pages and Page Frames correspond one-to-one via the Page Table; each Page Table Entry consists of a "Page Number" and a "Block Number," and the page number is implicit, so it does not occupy memory space.
  • Page Table Length refers to the total number of page table entries in the page table, i.e., the total number of pages.
  • Page Table Entry Length refers to the storage space occupied by each page table entry.
  • Page Size refers to the storage space occupied by a single page.

Basic Segmented Storage Management#

Programs are divided into segments based on content or procedural relationships (functions), with each segment having its own name. A user job or process containing segments corresponds to a two-dimensional linear virtual space, i.e., a two-dimensional virtual memory. Segmented management allocates memory in segments and then uses address mapping mechanisms to convert segmented virtual addresses into actual physical addresses.

  • The address space of a process: Divided into several segments based on the logical relationships of the program, with each segment having a segment name and starting from 0 for addressing.
  • Logical Address Structure: Segment Number and Offset (Offset within the Segment).
  • Memory Allocation Rules: Allocation is done in segments, with each segment occupying contiguous space in memory, and segments need not be adjacent.
  • Segment Table Entries consist of Segment Number (implicit, does not occupy storage space), Segment Length, and Base Address.
  • The difference in address conversion from the page table is that it also checks whether the offset exceeds the segment length based on the segment table.

Advantages and disadvantages:

  • Different types of segments can have different protections.
  • Segments can be shared, including through dynamic linking for code sharing.
  • No internal fragmentation occurs, but external fragmentation can occur, leading to lower memory utilization compared to paging.

Differences Between Paging and Segmentation#

  • A page is a physical unit of information, a discrete allocation mechanism proposed from the perspective of system memory utilization; a segment is a logical unit of information, containing a complete set of meaningful information, proposed from the user's perspective for memory management.
  • The size of a page is fixed and determined by the system; the size of a segment is uncertain and determined by the user.
  • The logical address structure of paging is one-dimensional (offset), while segmentation is two-dimensional (segment name + offset).

Segmented Paging Storage Management#

  • Processes are divided into logical modules into segments, and each segment is then paged, which avoids external fragmentation and only produces a small amount of internal fragmentation.
  • Logical Address Structure (two-dimensional): Segment Number, Page Number, and Page Offset; each Page Table Entry records the Page Number and the Block Number where the page is stored.

Address transformation: From logical address to segment number, page number, page offset --> Compare segment number with the segment length recorded in the segment table register to check for out-of-bounds --> Find the corresponding segment table entry based on the segment table starting address and segment number --> Check the page number against the page table length recorded in the segment table to check for out-of-bounds --> Query the page table entry based on the page table starting address and page number --> Obtain the physical address from the block number and page offset --> Access the target unit.

In simple terms, the process involves three memory accesses: checking the segment table --> checking the page table --> accessing the target unit (if a TLB is set, it can achieve one memory access).

Memory Space Expansion#

Overlay Technique#

The overlay technique divides a program into a fixed area (program segments that remain in memory throughout execution) and several overlay areas (program segments that are loaded in and out during execution). It is not transparent to the user and increases the programming burden, and overlaying takes place within a single program or process.

Swapping Technique#

The swapping technique is used when memory is tight, swapping out certain processes to free up memory space and then swapping in certain processes. External storage disks are divided into file areas and swap areas, with swapped-out processes placed in the swap area, occurring between different processes (or jobs).

Virtual Storage Technology (Virtual Memory)#

  • The principle of virtual memory usage: Locality Principle (similar to the reason for using B-trees as index data structures in MySQL).
    1. Temporal Locality: Instructions and data currently accessed will likely be accessed again soon.
    2. Spatial Locality: Memory units surrounding the currently accessed memory space are likely to be accessed soon.
    3. Cache Technology: Frequently used data is placed in faster storage.

Based on the locality principle, when loading a program, only part of the program is loaded into memory, leaving the rest in external storage, allowing the program to execute. During program execution, when the accessed information is not in memory, the operating system loads the required part into memory and continues executing the program. On the other hand, the operating system swaps out temporarily unused content from memory to external storage, freeing up space for information that needs to be loaded into memory. This way, the system appears to provide the user with a much larger storage capacity than the actual memory, known as virtual storage.

  • Definition of virtual memory: A program can run without being fully loaded; it is dynamically loaded during execution, and if memory is insufficient, it undergoes replacement.

  • Characteristics of virtual memory:

    1. Multiplicity: There is no need to load the entire job into memory at once during execution; it can be divided into multiple loads.
    2. Swappability: There is no need for the job to remain resident in memory throughout its execution; it can be swapped in and out during the job's execution.
    3. Virtuality: Logically expands memory capacity, with users seeing a memory capacity much larger than the actual capacity.
  • Implementation of virtual memory technology: Demand Paging, Demand Segmentation, Demand Segmented Paging.

  • Costs of using virtual memory:

    1. Management of virtual memory requires establishing many data structures, occupying additional memory.
    2. The conversion from virtual addresses to physical addresses increases instruction execution time.
    3. Paging in and out requires disk I/O, consuming time.
    4. If a page contains only part of the data, memory is wasted.

Page Replacement Algorithms#

Optimal Page Replacement Algorithm (OPT)#

An algorithm that has only theoretical significance, used to evaluate other page replacement algorithms. The replacement strategy is to replace the page that will not be accessed for the longest time in the future.

First-In, First-Out Page Replacement Algorithm (FIFO)#

The operating system maintains a linked list of all pages currently in memory, with the most recently entered page at the tail and the earliest entered page at the head. When a page fault occurs, the page at the head of the list is evicted, and the newly loaded page is appended to the tail. This is a simple and crude replacement algorithm that does not consider page access frequency information.
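A small sketch of FIFO replacement that simply counts page faults; the reference string and frame counts are invented, and the second call also illustrates Belady's anomaly discussed later in this section.

```python
from collections import deque

def fifo_faults(reference_string, frame_count):
    frames = deque()            # head = oldest resident page, tail = most recent arrival
    faults = 0
    for page in reference_string:
        if page not in frames:  # page fault: the page is not resident
            faults += 1
            if len(frames) == frame_count:
                frames.popleft()        # evict the page that entered earliest
            frames.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3), fifo_faults(refs, 4))  # 9 vs 10: more frames, more faults
```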

Least Recently Used Algorithm (LRU)#

The algorithm assigns each page an access field to record the time t since the last access of the page. Each time a replacement occurs, the page with the largest t value is replaced (implementation can be done using registers or stacks).

Clock Replacement Algorithm (also known as Not Recently Used Algorithm NRU)#

Pages are set with an access bit; when a page is accessed, the access bit is set to 1. All pages are stored in a circular queue, with a pointer pointing to the oldest page. When replacing a page, if the current page pointed to has an access bit of 0, it is replaced; otherwise, the access bit is set to 0, and the pointer is moved in a circular manner until a page with an access bit of 0 is found.
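A sketch of the clock scan just described, with the resident pages held in a fixed-size circular list; the reference string and frame count are invented, and treating a newly loaded page as "just accessed" is one common convention.

```python
def clock_replace(reference_string, frame_count):
    frames = [None] * frame_count     # resident pages, arranged in a circle
    use_bit = [0] * frame_count       # access bit for each frame
    hand = 0                          # pointer to the next replacement candidate
    faults = 0
    for page in reference_string:
        if page in frames:
            use_bit[frames.index(page)] = 1   # page hit: set its access bit
            continue
        faults += 1
        # Advance the hand until a frame with access bit 0 is found,
        # clearing access bits (a "second chance") along the way.
        while use_bit[hand] == 1:
            use_bit[hand] = 0
            hand = (hand + 1) % frame_count
        frames[hand] = page           # replace the victim page
        use_bit[hand] = 1             # the new page counts as just accessed
        hand = (hand + 1) % frame_count
    return faults

print(clock_replace([1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5], 3))
```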

Improved Clock Algorithm#

Based on the Clock algorithm, an additional modified bit is added, and replacement is determined based on both the access bit and the modified bit. Pages with both access and modified bits set to 0 are prioritized for replacement, followed by pages with an access bit of 0 and a modified bit of 1.

Least Frequently Used Algorithm (LFU)#

A register is set up for each page to record how many times it has been accessed, and the page with the smallest count is replaced at each replacement. The problem is that the register does not accurately reflect the true access frequency: because accesses are fast relative to the register-update interval, accessing a page once and accessing it 100 times within the same interval are recorded the same way. Moreover, LFU and LRU are quite similar and can use the same hardware support; the key distinction is that one is based on counts and the other on time (for example, with access registers pa = 001111 and pb = 111000, LRU would evict pa, while LFU would evict pb).

Page Buffering Algorithm (PBA)#

During replacement, pages, regardless of whether they have been modified, are not swapped out to disk but are first temporarily held in a linked list of pages in memory (modified and unmodified page lists can also be undifferentiated). When they are accessed again, they can be directly retrieved from these lists without disk I/O. When the number of modified pages in the list reaches a certain threshold, sequential write operations to disk are performed (effectively merging multiple I/O operations into one).

Page Allocation Strategies#

Resident Set: Refers to the collection of physical blocks allocated to a process in demand paging storage management (i.e., the number of physical blocks allocated to the process by the system).

Fixed Allocation Local Replacement: The resident set is allocated before the process runs, and when a page fault occurs, only a page from the process can be swapped out.

Variable Allocation Global Replacement: Whenever a page fault occurs, a new physical block is allocated, which may come from free physical blocks or require swapping out pages from other processes.

Variable Allocation Local Replacement: Dynamically increases or decreases the physical blocks allocated to a process based on the frequency of page faults.

When to load pages: Preloading strategy (before running); demand paging strategy (during running).

Where to load pages from: the UNIX approach is that a page used for the first time is loaded from the file area; pages that are swapped out are written to the swap area and, when needed again, are loaded back from the swap area.

Thrashing Phenomenon: Caused by insufficient physical blocks allocated to a process, where just swapped-out pages are immediately swapped back into main memory, and just swapped-in pages are immediately swapped out of main memory. If this is due to a faulty page replacement strategy, the replacement algorithm can be modified to resolve the issue. If it is due to too many running programs, causing the program to be unable to load all frequently accessed pages into memory simultaneously, the number of concurrent programs should be reduced; otherwise, terminate the process or increase the size of the resident set.

Working Set: Refers to the set of pages that a process actually accesses within a certain time interval (the size of the resident set is generally not less than the size of the working set to prevent thrashing).

Supplementary:

  • Page Fault (or Page Error): When a program references a part of the address space not in physical memory, the operating system is responsible for loading the missing part into physical memory and re-executing the failed instruction.
    An example problem calculating page faults.
  • Belady's Anomaly: When using the FIFO algorithm, if a process is not allocated all the pages it requires, there are times when increasing the number of allocated pages results in a higher page fault rate.

VIII. File System#

Disk Scheduling Algorithms#

  1. First-Come, First-Served (FCFS)
  2. Shortest Seek Time First (SSTF): Allows the request closest to the current track to start the disk drive, i.e., the job with the shortest seek time is executed first, regardless of the order in which requests arrive, thus overcoming the issue of excessive movement of the disk arm in the FCFS scheduling algorithm.
  3. Scan Algorithm (SCAN) or Elevator Scheduling Algorithm: Always starts from the current position of the disk arm, selecting the request closest to the current disk arm in the direction of movement. If there are no requests in the direction of movement, the disk arm changes direction. In this scheduling method, the movement of the disk arm is similar to elevator scheduling, hence the name.
  4. Circular Scan Algorithm (CSCAN): The circular scan scheduling algorithm is an improvement on the scan algorithm. The disk arm moves in a single direction, from the outside to the inside. Starting from the current position, it selects the request closest to the current disk arm in the direction of movement. If there are no requests in that direction, it returns to the outermost position and accesses the job request with the smallest cylinder number.
  5. NStepSCAN Algorithm: The N-step SCAN algorithm divides the disk request queue into several sub-sequences of length N, processing these sub-queues in order using the FCFS algorithm, while each sub-queue is processed according to the SCAN algorithm. When N is large, the algorithm approaches the SCAN algorithm; when N=1, it degenerates into the FCFS algorithm. The N-step SCAN algorithm can effectively avoid the "disk arm sticking" phenomenon.
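As an illustration of the SCAN/elevator order from item 3 above, here is a minimal sketch with an invented request queue and head position.

```python
def scan_order(requests, head, direction="up"):
    """Return the order in which SCAN services the requested tracks."""
    lower = sorted(t for t in requests if t < head)   # tracks below the head
    upper = sorted(t for t in requests if t >= head)  # tracks at or above the head
    if direction == "up":
        # Service everything in the current direction, then reverse.
        return upper + lower[::-1]
    return lower[::-1] + upper

# Hypothetical request queue (track numbers) with the head at track 53, moving up.
print(scan_order([98, 183, 37, 122, 14, 124, 65, 67], head=53))
# -> [65, 67, 98, 122, 124, 183, 37, 14]
```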

IX. I/O Models#

Reference: I/O in Operating Systems and High-Performance I/O Models

I/O Between Devices#

  • Polling
    The CPU actively polls various devices to check their status; if there is data, it performs I/O.
  • Interrupt
    When a device has data, it sends an interrupt, and the CPU decides whether to respond to the interrupt and then processes the device's I/O. The CPU does not need to poll the device status frequently; it simply passively receives interrupts.
  • Direct Memory Access (DMA)
    If an interrupt were raised for every byte, transferring 1 KB of data would require 1024 interrupts, wasting CPU time. The DMA method was therefore introduced: the CPU only provides the starting and ending addresses to the DMA controller, which handles the data transfer in between. This raises the CPU's involvement from the byte level to the block level.
  • Channel Control
    The DMA method can only control one device's data block; for multiple blocks of data, the CPU still needs to intervene multiple times. Therefore, channels were introduced to control I/O, which are more powerful than DMA, capable of controlling multiple blocks of data and I/O for multiple devices, further liberating the CPU's involvement in the I/O process.

I/O Between User Processes#

  • Blocking
    User threads call certain system functions to retrieve data from the kernel, and the thread remains in a blocked state until the data reaches the kernel cache.

  • Non-blocking
    The user thread's call returns immediately regardless of whether data is present in the kernel cache; it may or may not get data, and the thread never enters a blocked state.

  • I/O Multiplexing
    Multiplexing means one thread manages multiple I/O operations; the thread is still blocked during the call, and when one or several I/O operations have data, it returns. The thread needs to traverse all I/O operations to determine which one has data. For example, the socket's select() function allows the thread to call select() and enter a blocked state; when any I/O has data, the thread exits the blocked state and gets the opportunity to continue executing.

  • Signal-Driven I/O
    A signal and a callback function are registered for an I/O operation; once the signal is triggered, the callback function reads the data. For example, a "readable" signal can be registered for a socket, and when data arrives and is readable, the signal is triggered, executing the callback function to copy data from the kernel cache to user space.

  • Asynchronous I/O
    In asynchronous I/O, after the operating system completes the copy of data from the kernel to user space, it notifies the user thread via a signal that it can proceed to the next operation, eliminating the need for the user thread to block while copying data.
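The I/O multiplexing idea from the list above (one thread watching many sockets) can be sketched with Python's selectors module, which uses epoll or kqueue where the OS provides them and falls back to select; the address, port, and echo behavior are invented for the example.

```python
import selectors
import socket

sel = selectors.DefaultSelector()          # epoll/kqueue/select, whichever the OS provides

listener = socket.socket()
listener.bind(("127.0.0.1", 9000))         # hypothetical address and port
listener.listen()
listener.setblocking(False)
sel.register(listener, selectors.EVENT_READ)

while True:
    # Blocks until at least one registered fd is ready, then returns only those fds,
    # so the single thread does not have to poll every socket itself.
    for key, _ in sel.select():
        sock = key.fileobj
        if sock is listener:
            conn, _ = sock.accept()        # a new client connection is ready
            conn.setblocking(False)
            sel.register(conn, selectors.EVENT_READ)
        else:
            data = sock.recv(1024)         # readable: data has reached the kernel buffer
            if data:
                sock.sendall(data)         # echo it back
            else:
                sel.unregister(sock)       # client closed the connection
                sock.close()
```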

Principles of select, epoll, kqueue#

  • select
    Select manages multiple sockets; when it receives an I/O interrupt from the network card, it returns without knowing which socket fd corresponds to the interrupt. The user thread must traverse to determine which I/O has data.
  • epoll
    When epoll receives an I/O interrupt, it looks up which socket fd corresponds to the interrupt. Epoll builds a red-black tree (a type of balanced binary tree) for efficient searching. The user registers interested socket events by inserting the socket fd into the red-black tree, using the interrupt number as the key, which can be understood as a tuple of (interrupt number, socket fd). Removing an event means deleting a node from the tree.

When an I/O interrupt occurs, epoll copies the data from the network card to the kernel cache, looks up the corresponding fd in the red-black tree based on the interrupt number, and adds the fd to the ready list, preparing to return it to the user thread. The user directly receives the ready fd.

  • kqueue
    When a socket I/O interrupt occurs, it looks up the corresponding socket fd in a hash table and then places it in a linked list to return. When the user registers an interested event, it adds a fd to the hash table.