- Add page for file descriptors
- More info on message passing
Overview
A process is a program in execution. It is an entity maintained by the operating system.
It is the job of the OS to give the process memory space. How the process allocates and deallocates space in memory is decided by the program itself.
A process consists of the following memory regions:
- The text region contains the program instructions itself.
- The stack contains space for local variables and arguments of functions
Local variables assigned to the stack are removed when a function returns.
- The data region contains space for global and/or static variables
- The heap contains space for dynamically allocated variables (memory created at runtime with malloc)
A process can have the following states (in a time-sharing OS):
Scheduling
Scheduling is how the OS selects which processes run on the CPU. The goal is to maximize CPU utilization in a multiprogramming OS.
The following structures are used to manage processes in the OS.
- Job Queue - This is the set of all processes in a system
- Ready Queue - This is the set of all processes residing in memory, and ready to execute
- Device Queues - The set of processes waiting for an I/O device
Long Term Scheduler
- Selects processes to be brought into ready queue
- Controls the degree of multiprogramming
- Controls the mix of CPU-bound and I/O-bound processes
- Can take longer periods of time to schedule processes or "jobs"
- Infrequently invoked
Short Term Scheduler
- Selects the processes to be executed next and allocates CPU
- Used to limit scheduling overhead
- Frequently invoked
Creation
Processes can create other processes. There is a singular root process for any OS called the primordial process. A process that creates another one is called the parent process. The created process is called the child process.
Parent and child processes can choose to share regions of memory. Unix-like systems do not do this by default, and the parent and child each have their own context.
Parent-child processes also execute concurrently. In Unix-like systems the wait() syscall can be used in the parent to wait for the child to complete.
Parent-child processes are not guaranteed to finish execution in any specific order. To synchronize parent and child processes, IPC mechanisms such as #Pipes or the wait() syscall can be used.
In Unix-like systems, creating a child process copies the entire context of the parent process, and the child continues execution from the point at which it was created.
Inter-Process Communications
There are two main ways processes can communicate:
- Shared memory
- Message passing
Shared Memory
Shared memory involves sharing some region of memory between processes. Each process can read and write to the memory region. This is typically a faster method of sharing data.
If processes are not careful, one process can overwrite data in shared memory that another process still needs.
The following can be implemented for shared memory
- System-wide sharing, or anonymous name
- Restricting read, write, access
- Dealing with race conditions (atomic, synchronized access)
Most thread level communication is done via shared memory.
The following syscalls are used on Unix:
- shmget - Create some shared memory space
- shmat - Add the shared memory to the process address space
- shmdt - Remove the shared memory from the process address space
- shmctl - Perform some operation on the shared memory
Message Passing
Message passing involves sending and receiving messages. Messages are typically not overwritten, and message passing is typically used when sending smaller amounts of data. It is slower than sharing memory, although it makes it easier to synchronize processes.
Message passing provides two basic operations
- send
- receive
Pipes
Also called anonymous pipes, they are the most basic form of IPC on all Unix systems. Pipes are unidirectional, meaning they can only be used to communicate one way between processes. They can only be created between parent and child processes or sibling processes. Data is read in FIFO order.
Pipes only exist as long as the processes that use them are alive. Any premature exit will lose all data in a pipe.
This works because the pipe() syscall creates two new File Descriptors for messages to be passed through, one for reading and one for writing. Since File Descriptors are copied from the parent to its child processes, they can pass messages through the pipe without any issues.
Pipes use the following syscall(s) in Unix:
- pipe - To create a pipe
- dup2 - To "duplicate" one file descriptor into another
Example:
/* Program used for syncing processes */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    char *s, buf[1024];
    int fds[2];
    s = "EECS 678. Pipe program 3\n";
    /* create a pipe */
    pipe(fds);
    if (fork() == 0) {
        /* child process: read() blocks until the parent has written */
        printf("Child line 1\n");
        read(fds[0], buf, strlen(s));
        printf("Child line 2\n");
    } else {
        /* parent process: write the message from s into the pipe */
        printf("Parent line 1\n");
        write(fds[1], s, strlen(s));
        printf("Parent line 2\n");
    }
    return 0;
}
FIFO
Also called named pipes, these are much more powerful than anonymous pipes. These can be used by any number of processes and do not require the parent-child or sibling relationship between processes.
These appear as a special type of file in the file system. They only allow half-duplex communication between processes. This means that each process can choose to read OR write, but not both.
There must be at least one writer and one reader open for the FIFO to operate. If there is not, a process opening the FIFO is blocked in a waiting state until the other end is opened.
If there are multiple readers of the FIFO the output of the writer can be distributed among the reading processes. It depends on the implementation in the OS however.
Message Queues
Message queues use indirect communication, or mailboxes. They are similar to #FIFO in the sense that queues can be used by multiple processes. Processes can also use any number of queues to communicate and can both send to and receive from a queue.
#Synchronization may be required for queues, and the capacity of the queue is defined by the OS. This capacity can be changed by the user, however.
When sending messages in a queue, you can specify the message type to help receivers identify which type of message to receive from the queue.
Unix Domain Sockets
This is a two-way communication channel between processes on the same machine. Sockets are a special type of file in Unix-like systems and are mostly used for the client-server model, where the server waits for client requests for data or processing. Network sockets usually use UDP or TCP; Unix domain sockets provide the corresponding datagram and stream socket types locally.
The following system calls are used for Unix sockets:
- socket - Create a new Unix socket
- bind - Assign an address to a socket (creates socket file)
bind creates the socket file, as it takes in an address which is just the filename of the socket. If you are creating an IP socket you would use an IP address and port instead.
- listen - Listen to incoming client requests as a server
- accept - Create a new connected socket from a client
accept returns a file descriptor that the program can use to communicate with the client with recv and send.
- recv - Retrieve messages from socket
- send - Send message to socket
- close - Close the socket connection
Signals
Add section
Programming
Creating Processes
The fork() syscall can be used to create a child process.
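#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t ret;
    /* fork another process */
    ret = fork();
    if (ret < 0) { /* error occurred */
        fprintf(stderr, "Fork Failed\n");
        exit(EXIT_FAILURE);
    }
    else if (ret == 0) { /* child process */
        /* replace the child's program image with ls */
        execlp("/bin/ls", "ls", (char *)NULL);
        /* only reached if execlp itself fails */
        perror("execlp");
        exit(EXIT_FAILURE);
    }
    else { /* parent process */
        /* parent will wait for the child to complete */
        wait(NULL);
        printf("Child Complete\n");
        exit(0);
    }
}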
If the return value of fork() is < 0, the child process was not created. If the return value is == 0, we are "in" the child process. Otherwise we are in the parent, and the return value is the PID of the child.
Definitions
Process Control Block
The process control block or PCB contains metadata about a process. It is used for scheduling and context switching.
It holds the following information:
- Process state
- Program counter
- CPU registers
- CPU scheduling information
- Memory-management information
- Accounting information
- I/O status information
Context Switching
Context switching is the process of storing the state of the running process and restoring the state of another, so that the CPU can move between processes.
The switch from kernel to user mode is a mode switch
- This only applies to time-shared or multiprogramming OSes
- The context is represented by the PCB
Context switching is pure overhead and the OS does nothing useful when switching context.
Child Process
A child process is a process which was created by another process. In POSIX systems this can be done with the fork syscall.
Parent Process
A parent process is a process which has spawned another process (child process).
There is a special parent process called the primordial process which is what spawns all user processes in an OS.
Zombie Process
A zombie process is a child process which has finished execution and is waiting for the parent to collect its exit status.
Orphan Process
An orphan process is a child process which is still executing even though its parent has finished executing.
If a process is orphaned it will be adopted by some other process which will become its new parent. There will always be a parent process to adopt an orphaned process as all processes in an OS are children of the primordial process.
Reference
- Kulkarni, Prasad. Various lectures. The University of Kansas, 2024.