
High Concurrency Reactor Server [Medium]

2024-07-12


4. Process Control and Process Synchronization

1. Signal

1.1 Basic Concepts of Signals

A signal is a software interrupt and a method of passing messages between processes. It is used to notify a process that an event has occurred, but it cannot pass any data to the process.

Signals are generated for many reasons. In the shell, you can send one with the kill and killall commands:

kill -<signal type> <process ID>
killall -<signal type> <process name>

1.2 Signal Types

| Signal name | Signal value | Default action | Reason for signaling |
| --- | --- | --- | --- |
| SIGHUP | 1 | A | The controlling terminal hangs up or the controlling process terminates |
| SIGINT | 2 | A | Keyboard interrupt, Ctrl+c |
| SIGQUIT | 3 | C | Keyboard quit, Ctrl+\ |
| SIGILL | 4 | C | Illegal instruction |
| SIGTRAP | 5 | C | Breakpoint instruction |
| SIGABRT | 6 | C | Abort signal issued by abort(3) |
| SIGBUS | 7 | C | Bus error |
| SIGFPE | 8 | C | Floating-point exception |
| SIGKILL | 9 | A | kill -9 kills the process; this signal cannot be caught or ignored |
| SIGUSR1 | 10 | A | User-defined signal 1 |
| SIGSEGV | 11 | C | Invalid memory reference (array out of bounds, null pointer dereference) |
| SIGUSR2 | 12 | A | User-defined signal 2 |
| SIGPIPE | 13 | A | Write to a pipe that has no reader |
| SIGALRM | 14 | A | Alarm signal, sent by the alarm() function |
| SIGTERM | 15 | A | Termination signal, the default signal sent by kill |
| SIGSTKFLT | 16 | A | Stack fault |
| SIGCHLD | 17 | B | Sent when a child process ends |
| SIGCONT | 18 | (continue) | Resume a stopped process |
| SIGSTOP | 19 | D | Stop a process; this signal cannot be caught or ignored |
| SIGTSTP | 20 | D | Stop typed at the terminal, Ctrl+z |
| SIGTTIN | 21 | D | Background process tries to read from the terminal |
| SIGTTOU | 22 | D | Background process tries to write to the terminal |
| SIGURG | 23 | B | Urgent condition on a socket |
| SIGXCPU | 24 | C | CPU time limit exceeded |
| SIGXFSZ | 25 | C | File size limit exceeded |
| SIGVTALRM | 26 | A | Virtual timer signal |
| SIGPROF | 27 | A | Profiling timer signal |
| SIGWINCH | 28 | B | Window size changed |
| SIGPOLL | 29 | B | Pollable event (Sys V) |
| SIGPWR | 30 | A | Power failure |
| SIGSYS | 31 | C | Illegal system call |

A: the default action is to terminate the process.

B: the default action is to ignore the signal.

C: the default action is to terminate the process and produce a core dump.

D: the default action is to stop the process. A stopped process can be resumed later.

1.3 Signal Processing

There are three ways for a process to handle signals:

  1. Use the system's default action. For most signals, the default action is to terminate the process.
  2. Install a signal handler function. When the signal arrives, this function handles it.
  3. Ignore the signal and do nothing, as if it had never been sent.

The signal() function sets how the program handles a signal.

Function declaration:

#include <signal.h>

typedef void (*sighandler_t)(int);
sighandler_t signal(int signum, sighandler_t handler);

Parameter Description:

  • signum: the signal to catch.
  • handler: a pointer to the signal handler function. The handler takes one integer parameter, which is the number of the signal being delivered. Three macros are related to this parameter:
  1. SIG_DFL: the default signal handling. Passing SIG_DFL as the second argument of signal() tells the system to apply its default action for the signal.
  2. SIG_IGN: ignore the signal. Passing SIG_IGN as the second argument of signal() makes the process discard the signal when it arrives, without any processing. This can keep the process from being terminated or interrupted unexpectedly.
  3. SIG_ERR: indicates an error. It is not meant to be passed as the second argument of signal(). It is the value signal() returns when the call fails, so it is used to detect and handle errors from signal().
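
The three cases above can be tried out in a few lines. The following is a minimal sketch, not taken from the original post: the names on_signal, catch_sigusr1, and ignore_sigusr1 are made up for illustration, and SIGUSR1 is chosen only because it is a harmless user-defined signal. It installs a handler, raises the signal inside the same process, and then repeats the experiment with SIG_IGN.

```cpp
#include <cassert>
#include <signal.h>

// Written by the handler; sig_atomic_t is safe to assign inside a signal handler.
static volatile sig_atomic_t g_last_signal = 0;

// Signal handler: receives the number of the signal being delivered.
extern "C" void on_signal(int signum) {
    g_last_signal = signum;
}

// Installs on_signal for SIGUSR1, raises SIGUSR1 in this process, and
// returns the signal number recorded by the handler (-1 if signal() failed).
int catch_sigusr1() {
    g_last_signal = 0;
    if (signal(SIGUSR1, on_signal) == SIG_ERR) {
        return -1;                   // SIG_ERR is how signal() reports failure
    }
    raise(SIGUSR1);                  // the handler runs before raise() returns
    signal(SIGUSR1, SIG_DFL);        // restore the default action
    return g_last_signal;
}

// Sets SIGUSR1 to SIG_IGN and raises it: the signal is discarded,
// so the recorded value stays 0 and the process keeps running.
int ignore_sigusr1() {
    g_last_signal = 0;
    if (signal(SIGUSR1, SIG_IGN) == SIG_ERR) {
        return -1;
    }
    raise(SIGUSR1);
    signal(SIGUSR1, SIG_DFL);
    return g_last_signal;
}
```

Calling catch_sigusr1() from main() should return SIGUSR1, while ignore_sigusr1() should return 0. Without either call, the default action of SIGUSR1 would terminate the process.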


1.4 What are the uses of signals?

A service program runs in the background. Forcibly killing it is a poor way to stop it: the process dies at once and gets no chance to do any cleanup.

If the service program instead installs a handler for a signal and puts its cleanup code in that handler, it can exit in an orderly way when the signal is sent to it.

Sending signal 0 to a service program checks whether it is still alive. No signal is actually delivered; only the existence and permission checks are performed.
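
The liveness check with signal 0 could look roughly like this. It is a sketch under assumptions: process_alive and spawn_and_reap are hypothetical helper names, and the check uses the kill() library function rather than the shell command.

```cpp
#include <cassert>
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Returns true if a process with this PID exists.
// kill(pid, 0) delivers nothing; it only performs the existence and
// permission checks. EPERM means the process exists but we are not
// allowed to signal it.
bool process_alive(pid_t pid) {
    if (kill(pid, 0) == 0) {
        return true;
    }
    return errno == EPERM;
}

// Helper for the demonstration: forks a child that exits immediately,
// reaps it, and returns the PID that is no longer in use.
pid_t spawn_and_reap() {
    pid_t pid = fork();
    if (pid == 0) {
        _exit(0);
    }
    if (pid > 0) {
        waitpid(pid, nullptr, 0);
    }
    return pid;
}
```

process_alive(getpid()) is true for the caller itself. For a PID returned by spawn_and_reap() it is normally false, unless the system has already reused that ID.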


1.5 Sending Signals

Linux provides the kill and killall commands for sending signals to programs. Inside a program, the kill() library function sends signals to other processes.

Function declaration:

int kill(pid_t pid, int sig);

kill() sends the signal specified by sig to the process specified by pid.

The pid parameter has four cases:

  1. pid > 0: send the signal to the process whose ID is pid.
  2. pid = 0: send the signal to every process in the same process group as the caller, including the caller itself. A parent process often uses this to signal its child processes. Note that this behavior depends on the system implementation.
  3. pid < -1: send the signal to every process in the process group whose ID is |pid|.
  4. pid = -1: send the signal to every process the caller has permission to signal, except the calling process itself.
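
As an illustration of the pid > 0 case, the sketch below forks a child, signals it with kill(), and reports which signal ended it. The function name signal_child is an assumption for this example, and it only makes sense for signals whose default action terminates the process.

```cpp
#include <cassert>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child that just waits, sends it `sig` with kill(), and returns
// the number of the signal that terminated the child (-1 on any other outcome).
int signal_child(int sig) {
    pid_t pid = fork();
    if (pid == -1) {
        return -1;
    }
    if (pid == 0) {
        for (;;) {
            pause();                 // child: sleep until a signal arrives
        }
    }
    if (kill(pid, sig) == -1) {      // pid > 0: signal exactly this process
        return -1;
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

signal_child(SIGTERM) should return SIGTERM, because the child installs no handler and the default action terminates it.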

2. Process Termination

There are eight ways to terminate a process. Five of them are normal terminations:

  1. Returning from main() with return;
  2. Calling exit() from any function;
  3. Calling _exit() or _Exit() from any function;
  4. The last thread returning from its start routine (the thread's main function) with return;
  5. The last thread calling pthread_exit().

Three are abnormal terminations:

  1. Calling abort();
  2. Receiving a signal;
  3. The last thread responding to a cancellation request.

2.1 Process Termination Status

In main(), the value returned by return is the termination status. If main() has no return statement and exit() is not called, the termination status of the process is 0.

In the shell, check the termination status of the last process with:

echo $?

The three functions that terminate a process normally (exit() and _Exit() are specified by ISO C, _exit() by POSIX):

void exit(int status);
void _exit(int status);
void _Exit(int status);

status is the termination status of the process.
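
The value that echo $? prints is the same value a parent process can read with waitpid(). The sketch below assumes a hypothetical helper child_exit_status; it uses _exit() in the child so that nothing inherited from the parent is flushed or run twice.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs a child that terminates with the given status and returns the
// termination status as the parent observes it (-1 on error).
int child_exit_status(int code) {
    pid_t pid = fork();
    if (pid == -1) {
        return -1;
    }
    if (pid == 0) {
        _exit(code);                 // child: terminate immediately
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    // Only the low 8 bits of the status are passed to the parent.
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

child_exit_status(7) returns 7, and child_exit_status(259) returns 3, because only the lowest 8 bits of the status reach the parent.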


2.2 Resource release issue

  • return returns from the function and calls the destructors of its local objects. A return from main() also calls the destructors of global objects.
  • exit() terminates the process. The destructors of local objects are not called; only the destructors of global objects are.
  • _exit() and _Exit() exit immediately without performing any cleanup.
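
The difference between exit() and _exit() can be observed with a small experiment. The sketch below is an assumption-laden illustration, not the author's code: Tracer, g_fd, and destructors_run are invented names. A child process holds one global and one local Tracer object, and each destructor writes its tag into a pipe that the parent reads.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <sys/wait.h>
#include <unistd.h>

static int g_fd = -1;   // write end of the pipe; only set inside the child

// Writes its tag to g_fd when destroyed, so the parent can see which
// destructors actually ran in the child.
struct Tracer {
    char tag;
    explicit Tracer(char t) : tag(t) {}
    ~Tracer() {
        if (g_fd >= 0) {
            ssize_t n = write(g_fd, &tag, 1);
            (void)n;
        }
    }
};

static Tracer g_tracer('G');   // global object

// Runs a child that ends with exit() (use_exit = true) or _exit()
// (use_exit = false) and returns the tags of the destructors that ran.
std::string destructors_run(bool use_exit) {
    int fds[2];
    if (pipe(fds) == -1) return "error";
    pid_t pid = fork();
    if (pid == -1) { close(fds[0]); close(fds[1]); return "error"; }
    if (pid == 0) {
        close(fds[0]);
        g_fd = fds[1];
        Tracer local('L');           // local object, never destroyed below
        if (use_exit) {
            std::exit(0);            // runs destructors of global objects only
        }
        _exit(0);                    // runs no destructors at all
    }
    close(fds[1]);
    std::string seen;
    char c;
    while (read(fds[0], &c, 1) == 1) seen += c;   // ends when the child is gone
    close(fds[0]);
    waitpid(pid, nullptr, 0);
    return seen;
}
```

destructors_run(true) yields "G" (only the global object was destroyed), and destructors_run(false) yields an empty string.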

2.3 Process termination function

A process can register termination functions with atexit() (at least 32 can be registered). exit() calls these functions automatically.

int atexit(void (*function)(void));

exit() calls the termination functions in the reverse order of their registration.
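
The reverse calling order can be checked with the following sketch. The handler names first, second, third and the helper atexit_order are hypothetical; the handlers are registered in a child process so that the calling program's own exit path is not affected, and they report through a pipe.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <sys/wait.h>
#include <unistd.h>

static int g_out = -1;   // write end of the pipe; only set inside the child

static void emit(char c) {
    if (g_out >= 0) {
        ssize_t n = write(g_out, &c, 1);
        (void)n;
    }
}
static void first()  { emit('1'); }
static void second() { emit('2'); }
static void third()  { emit('3'); }

// Registers first, second, third (in that order) in a child process,
// lets the child call exit(), and returns the order in which they ran.
std::string atexit_order() {
    int fds[2];
    if (pipe(fds) == -1) return "error";
    pid_t pid = fork();
    if (pid == -1) { close(fds[0]); close(fds[1]); return "error"; }
    if (pid == 0) {
        close(fds[0]);
        g_out = fds[1];
        std::atexit(first);
        std::atexit(second);
        std::atexit(third);
        std::exit(0);                // calls third, second, first
    }
    close(fds[1]);
    std::string order;
    char c;
    while (read(fds[0], &c, 1) == 1) order += c;
    close(fds[0]);
    waitpid(pid, nullptr, 0);
    return order;
}
```

atexit_order() returns "321": the function registered last runs first.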


3. Calling the executable program

3.1 system() function

The system() function is a simple way to run a program: pass the program and its arguments to system() as a single string.

Function declaration:

int system(const char *string);

The return value of system() is somewhat awkward to interpret:

  1. If the program to run does not exist, system() returns a non-zero value;
  2. If the program runs and terminates with status 0, system() returns 0;
  3. If the program runs and terminates with a non-zero status, system() returns a non-zero value.
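
On Linux the non-zero return value is a wait status, so the actual exit status of the command can be extracted with the same macros that waitpid() uses. A minimal sketch, assuming a hypothetical wrapper named run_command and a POSIX shell behind system():

```cpp
#include <cassert>
#include <cstdlib>
#include <sys/wait.h>

// Runs `command` through system() and returns the exit status of the
// command, or -1 if system() itself failed or the command was killed.
int run_command(const char *command) {
    int status = std::system(command);
    if (status == -1) {
        return -1;                   // the shell could not be started
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_command("exit 3") returns 3. A command that does not exist makes the shell exit with status 127, which is the non-zero value described in case 1.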

3.2 exec function family

The exec family of functions provides another way to run a program (a binary file or a shell script) from within a process.

The exec family is declared as follows:

int execl(const char *path, const char *arg, ...);
int execlp(const char *file, const char *arg, ...);
int execle(const char *path, const char *arg, ..., char * const envp[]);
int execv(const char *path, char *const argv[]);
int execvp(const char *file, char *const argv[]);
int execvpe(const char *file, char *const argv[], char *const envp[]);

Notes

  1. If the program cannot be executed, exec returns -1 and the reason for the failure is stored in errno.
  2. The new program keeps the process ID of the original process, but it replaces the code segment, data segment, and stack of the original process.
  3. If exec succeeds, it does not return. The called program replaces the calling program, so no code after the exec call is executed.
  4. In practice, execl() and execv() are used most often; the others are rarely needed.
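
Because a successful exec never returns, it is usually called in a child process. The sketch below shows that pattern with execl(); the helper name run_with_execl is invented, and the path /bin/sh is an assumption about the system.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks, replaces the child with `sh -c <script>` through execl(), and
// returns the exit status of the script (-1 on error).
int run_with_execl(const char *script) {
    pid_t pid = fork();
    if (pid == -1) {
        return -1;
    }
    if (pid == 0) {
        // The argument list must end with a null pointer.
        execl("/bin/sh", "sh", "-c", script, (char *)nullptr);
        _exit(127);                  // reached only if execl() failed
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

run_with_execl("exit 5") returns 5. The line after execl() matters: it is the only code of the child that can still run, and only when the call has failed.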

4. Create a process

4.1 Linux processes 0, 1, and 2

All processes in a Linux system form a tree.

  • **Process 0 (the system process)** is the ancestor of all processes; it creates processes 1 and 2.
  • **Process 1 (systemd)** is responsible for performing kernel initialization and system configuration.
  • **Process 2 (kthreadd)** is responsible for the scheduling and management of all kernel threads.

The pstree command shows the process tree:

pstree -p <process ID>

4.2 Process Identification

Each process has a unique process ID represented by a non-negative integer. Although unique, process IDs can be reused. When a process terminates, its process ID becomes a candidate for reuse. Linux uses a delayed reuse algorithm to make the ID of a newly created process different from the ID used by the most recently terminated process. This prevents a new process from being mistaken for a terminated process using the same ID.

Functions to get process IDs:

pid_t getpid(void);    // Get the ID of the current process.
pid_t getppid(void);   // Get the ID of the parent process.

4.3 fork() function

An existing process can call fork() to create a new process.

Function declaration:

pid_t fork(void);

The new process created by fork() is called the child process.

fork() is called once but returns twice. The difference between the two returns is that the return value in the child process is 0, while the return value in the parent process is the process ID of the newly created child.

Both the child and the parent continue executing the code that follows the fork() call. The child is a copy of the parent: it gets a copy of the parent's data space, heap, and stack (a copy of its own, not memory shared with the parent).

After fork(), the order in which the parent and the child run is not defined.
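
The two return values and the copy semantics can be seen in one small function. This is a sketch with an invented name, parent_value_after_fork: the child changes a variable and checks getppid(), and the parent then looks at its own copy of the variable.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Returns the parent's copy of `value` after the child changed its own
// copy, or -1 if anything went wrong.
int parent_value_after_fork() {
    int value = 1;
    pid_t parent = getpid();
    pid_t pid = fork();
    if (pid == -1) {
        return -1;
    }
    if (pid == 0) {
        // Child: fork() returned 0 and getppid() names the forking process.
        value = 100;                 // changes only the child's copy
        _exit(getppid() == parent ? 0 : 1);
    }
    // Parent: fork() returned the process ID of the child.
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        return -1;
    }
    return value;                    // unchanged by the child
}
```

parent_value_after_fork() returns 1: the assignment in the child never reaches the parent's memory.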


4.4 Two Usages of fork()

  1. The parent process wants to duplicate itself so that parent and child run different code. This is very common in network service programs: the parent waits for client connection requests, and when a request arrives it calls fork() and lets the child handle the request while the parent goes back to waiting for the next connection.
  2. The process wants to run another program. This is very common in shells: the child process calls exec immediately after returning from fork().

4.5 Shared Files

One property of fork() is that the file descriptors open in the parent are duplicated into the child, and the parent and the child share the same file offset.

If the parent and child processes write to the same file descriptor without any form of synchronization, their output may be intermixed.
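
The shared offset is easy to observe when the two writes are ordered explicitly. In this sketch (the function name shared_offset_demo and the temporary file name are assumptions), the parent waits for the child before writing, so the only question is where the parent's write lands.

```cpp
#include <cassert>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Opens a temporary file, forks, lets the child write 6 bytes and then the
// parent write 7 bytes through the inherited descriptor.
// Returns the final file size, or -1 on error.
long shared_offset_demo() {
    char path[] = "/tmp/fork_offset_XXXXXX";   // hypothetical file name pattern
    int fd = mkstemp(path);
    if (fd == -1) {
        return -1;
    }
    unlink(path);                    // the file vanishes once fd is closed

    pid_t pid = fork();
    if (pid == -1) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        ssize_t n = write(fd, "child\n", 6);   // moves the shared offset to 6
        _exit(n == 6 ? 0 : 1);
    }
    waitpid(pid, nullptr, 0);        // synchronize: the child writes first
    ssize_t n = write(fd, "parent\n", 7);      // continues at offset 6
    struct stat st;
    long size = -1;
    if (n == 7 && fstat(fd, &st) == 0) {
        size = (long)st.st_size;
    }
    close(fd);
    return size;
}
```

shared_offset_demo() returns 13. If each process had its own offset, the parent would start at 0 and overwrite the child's data, leaving a 7-byte file.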


At this point the file contains only 100,000 lines of data.


At this point there should be 200,000 lines of data. One line is missing, possibly because the file writes are not atomic: without a synchronization mechanism, the two processes may write to the file at the same time, so that their data interferes with each other.

4.6 vfork() function

vfork() is called like fork() and returns the same values, but the semantics of the two differ.

vfork() creates a new process whose purpose is to exec a new program. It does not copy the parent's address space, because the child calls exec immediately and never uses that address space. If the child does use the parent's address space, the results are unpredictable.

Another difference between vfork() and fork(): vfork() guarantees that the child runs first. The parent resumes execution only after the child calls exec or exit.

5. Zombie Processes

In the operating system, a zombie process is a child process that has terminated but whose parent process has not yet read its exit status. Although the zombie process is no longer running, it still occupies an entry in the process table so that the kernel can save the exit status information of the process (such as process ID, exit status, etc.) until the parent process reads this information.

5.1 Causes of zombie processes

If the parent process exits before the child process, the child is adopted by process 1 (this is also one way to let a process run in the background).

If the child process exits before the parent process, and the parent process does not process the child process's exit information, then the child process will become a zombie process.

5.2 The harm of zombie processes

The kernel retains a data structure for each child process, including the process number, termination status, CPU time used, etc. If the parent process handles the child process exit information, the kernel will release this data structure. If the parent process does not handle the child process exit information, the kernel will not release this data structure, and the child process number will always be occupied. The process numbers available to the system are limited. If a large number of zombie processes are generated, the system will not be able to generate new processes due to the lack of available process numbers.

5.3 How to avoid zombie processes

  1. Handle the SIGCHLD signal: when a child process exits, the kernel sends SIGCHLD to the parent. If the parent calls signal(SIGCHLD, SIG_IGN), it tells the kernel that it is not interested in the child's exit, and the child's data structure is released as soon as the child exits.
  2. Use the wait()/waitpid() functions: the parent calls them to wait for the child to end and to obtain its exit status, which releases the resources the child still occupies.

pid_t wait(int *stat_loc);
pid_t waitpid(pid_t pid, int *stat_loc, int options);
pid_t wait3(int *status, int options, struct rusage *rusage);
pid_t wait4(pid_t pid, int *status, int options, struct rusage *rusage);

The return value is the process ID of the child.

stat_loc receives the termination information of the child (the macros below take the integer that stat_loc points to):

a) If the child terminated normally, the macro WIFEXITED(stat_loc) returns true and the macro WEXITSTATUS(stat_loc) yields the termination status;

b) If the child terminated abnormally, the macro WTERMSIG(stat_loc) yields the number of the signal that terminated the process.
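
The first approach, ignoring SIGCHLD, can be checked from the parent's side: once the child is gone there is nothing left to wait for. The sketch below relies on the Linux behavior that waitpid() then fails with ECHILD; the function name child_is_auto_reaped is made up for this example.

```cpp
#include <cassert>
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Returns true if, with SIGCHLD set to SIG_IGN, an exited child leaves
// no termination information behind (so it cannot linger as a zombie).
bool child_is_auto_reaped() {
    void (*previous)(int) = signal(SIGCHLD, SIG_IGN);
    if (previous == SIG_ERR) {
        return false;
    }
    bool reaped = false;
    pid_t pid = fork();
    if (pid == 0) {
        _exit(0);                    // child: exit immediately
    }
    if (pid > 0) {
        // waitpid() blocks until the child is gone and then fails with
        // ECHILD, because the kernel already discarded its exit status.
        errno = 0;
        reaped = (waitpid(pid, nullptr, 0) == -1 && errno == ECHILD);
    }
    signal(SIGCHLD, previous);       // restore the earlier disposition
    return reaped;
}
```

The price of this approach is that the parent can no longer learn how its children ended; when the exit status matters, wait()/waitpid() is the better choice.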


If the parent process is busy, it can catch the SIGCHLD signal and call wait()/waitpid() in the signal handler.
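
A handler of that kind might look like the following sketch. Two things here are choices made for the example rather than anything from the original post: the handler is installed with sigaction() instead of signal(), and it loops with WNOHANG because several children can exit before the handler runs once. The names on_sigchld and reap_children are hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t g_reaped = 0;   // children collected so far

// SIGCHLD handler: collect every child that has already terminated.
extern "C" void on_sigchld(int) {
    int saved_errno = errno;         // waitpid() may overwrite errno
    while (waitpid(-1, nullptr, WNOHANG) > 0) {
        g_reaped = g_reaped + 1;
    }
    errno = saved_errno;
}

// Forks `count` children that exit at once and returns how many of them
// the handler reaped while the parent never called wait() itself.
int reap_children(int count) {
    g_reaped = 0;
    struct sigaction sa = {};
    struct sigaction old = {};
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGCHLD, &sa, &old) == -1) {
        return -1;
    }
    int started = 0;
    for (int i = 0; i < count; ++i) {
        pid_t pid = fork();
        if (pid == 0) {
            _exit(0);                // child: exit immediately
        }
        if (pid > 0) {
            ++started;
        }
    }
    // The parent is "busy": it only polls the counter between work items.
    while (g_reaped < started) {
        usleep(1000);
    }
    sigaction(SIGCHLD, &old, nullptr);   // restore the earlier handler
    return g_reaped;
}
```

reap_children(3) returns 3 when all forks succeed. A single waitpid() call per signal would not be enough, since SIGCHLD signals that arrive close together can be merged into one delivery.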


6. Multi-process and signals

For how to send signals between processes, see section 1.5, Sending Signals.

In a multi-process service program, if the child process receives an exit signal, the child process exits on its own.

If the parent process receives an exit signal, it should send an exit signal to all child processes and then exit itself.
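
The rule for the parent can be sketched as follows. Everything specific here is an assumption for illustration: SIGTERM plays the role of the exit signal, raise() stands in for an operator's kill command, and the names on_sigterm and supervise are invented. SIGTERM is kept blocked outside sigsuspend() so that a signal cannot slip in between testing the flag and going to sleep.

```cpp
#include <cassert>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <vector>

static volatile sig_atomic_t g_stop = 0;     // set once SIGTERM arrives

extern "C" void on_sigterm(int) { g_stop = 1; }

// Starts `count` children, simulates an exit signal for the parent,
// forwards it to all children, and returns how many exited cleanly.
int supervise(int count) {
    g_stop = 0;
    sigset_t block, old;
    sigemptyset(&block);
    sigaddset(&block, SIGTERM);
    sigprocmask(SIG_BLOCK, &block, &old);    // SIGTERM stays pending for now
    sigset_t wait_mask = old;
    sigdelset(&wait_mask, SIGTERM);          // mask used while sleeping
    void (*previous)(int) = signal(SIGTERM, on_sigterm);   // inherited by children

    std::vector<pid_t> children;
    for (int i = 0; i < count; ++i) {
        pid_t pid = fork();
        if (pid == 0) {
            while (!g_stop) sigsuspend(&wait_mask);   // child: wait for SIGTERM
            _exit(0);                                  // cleanup would go here
        }
        if (pid > 0) children.push_back(pid);
    }

    raise(SIGTERM);                          // stand-in for `kill <parent pid>`
    while (!g_stop) sigsuspend(&wait_mask);  // parent receives its exit signal

    for (pid_t pid : children) kill(pid, SIGTERM);    // forward to every child
    int clean = 0;
    for (pid_t pid : children) {
        int status = 0;
        if (waitpid(pid, &status, 0) == pid &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0) {
            ++clean;
        }
    }
    signal(SIGTERM, previous);
    sigprocmask(SIG_SETMASK, &old, nullptr);
    return clean;                            // a real server would now exit too
}
```

supervise(3) returns 3: every child received the forwarded signal, ran its handler, and exited with status 0 instead of being killed by the default action.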


7. Shared Memory

Multiple threads share the address space of their process. If several threads need to access the same block of memory, a global variable is enough.

With multiple processes, the address space of each process is independent and not shared. If several processes need to access the same block of memory, global variables do not work; only shared memory can be used.

Shared memory allows multiple processes (they do not need to be related) to access the same memory, and it is the most efficient way to share and transfer data between processes. A process attaches the shared memory to its own address space. If one process modifies data in the shared memory, the data read by the other processes changes as well.

Shared memory provides no locking mechanism: while one process reads or writes shared memory, nothing prevents other processes from reading or writing it. To lock reads and writes of shared memory, use a semaphore. Linux provides a set of functions for working with shared memory.

7.1 shmget function

This function is used to create/get shared memory.

int shmget(key_t key, size_t size, int shmflg);

  • key: the key of the shared memory, an integer (key_t is an integer typedef), usually written in hexadecimal, for example 0x5005. Different shared memory segments must not use the same key.
  • size: the size of the shared memory in bytes.
  • shmflg: the access permissions of the shared memory, expressed like file permissions. For example, 0666|IPC_CREAT means: create the shared memory if it does not exist.
  • Return value: the shared memory ID (a non-negative integer) on success, or -1 on failure (insufficient system memory, no permission).


Use ipcs -m to view the system's shared memory, including the key (key), shared memory ID (shmid), owner (owner), permissions (perms), and size (bytes).

Use ipcrm -m <shared memory ID> to delete a shared memory segment manually.


Note: data stored in shared memory cannot use containers (such as std::string or std::vector, whose contents live on the heap of a single process); only basic data types can be used.

7.2 shmat function

This function attaches shared memory to the address space of the current process.

void *shmat(int shmid, const void *shmaddr, int shmflg);

  • shmid: the shared memory identifier returned by shmget().
  • shmaddr: the address at which the shared memory is attached in the current process. Usually 0, which lets the system choose the address.
  • shmflg: flags, usually 0.

Returns the start address of the shared memory on success, and (void *)-1 on failure.

7.3 shmdt function

This function detaches the shared memory from the current process. It is the inverse of shmat().

int shmdt(const void *shmaddr);

  • shmaddr: the address returned by shmat().

Returns 0 on success and -1 on failure.

7.4 shmctl function

This function operates on shared memory. The most common operation is deleting it.

int shmctl(int shmid, int command, struct shmid_ds *buf);

  • shmid: the shared memory ID returned by shmget().
  • command: the operation to perform on the shared memory. To delete the shared memory, pass IPC_RMID.
  • buf: the address of the data structure used by the operation. To delete the shared memory, pass 0.

Returns 0 on success and -1 on failure.

Note that shared memory created by root cannot be deleted by ordinary users, regardless of the permissions it was created with.
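
The four functions are normally used together: create, attach, use, detach, delete. The sketch below walks through that life cycle with a parent and a child. It deviates from the text in one labeled way: it uses IPC_PRIVATE instead of a fixed key such as 0x5005, so that the example cannot collide with an existing segment. The names Counter and shared_counter_demo are hypothetical.

```cpp
#include <cassert>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Plain data only: no containers and no pointers into private memory.
struct Counter {
    int value;
    char label[16];
};

// Creates a segment, lets a child write into it, and returns the value
// the parent reads back afterwards (-1 on error).
int shared_counter_demo() {
    // Assumption for this sketch: IPC_PRIVATE always yields a new segment.
    int shmid = shmget(IPC_PRIVATE, sizeof(Counter), 0600 | IPC_CREAT);
    if (shmid == -1) {
        return -1;
    }
    void *addr = shmat(shmid, nullptr, 0);   // let the system pick the address
    if (addr == (void *)-1) {
        shmctl(shmid, IPC_RMID, nullptr);
        return -1;
    }
    Counter *counter = static_cast<Counter *>(addr);
    counter->value = 1;

    int result = -1;
    pid_t pid = fork();                      // the child inherits the attachment
    if (pid == 0) {
        counter->value = 42;                 // written by the child ...
        shmdt(addr);
        _exit(0);
    }
    if (pid > 0) {
        waitpid(pid, nullptr, 0);            // waiting replaces locking here
        result = counter->value;             // ... and read by the parent
    }
    shmdt(addr);                             // detach from this process
    shmctl(shmid, IPC_RMID, nullptr);        // delete the segment
    return result;
}
```

shared_counter_demo() returns 42, in contrast to an ordinary variable, where a child's write stays invisible to the parent. Two unrelated programs would call shmget() with the same fixed key instead of relying on fork().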


7.5 Circular Queue

7.6 Circular Queue Based on Shared Memory
