1. File descriptor
1. Getting to know file descriptors first
When we learned open, the interface that opens a file, we saw that its return value is a file descriptor. At that time, we simply assumed it was a small integer.

Seeing is believing, so let's verify it:

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h> // close

int main()
{
    int fd = open("add.txt", O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd < 0)
    {
        perror("open");
        return 1;
    }
    // Print the value of fd on the display
    fprintf(stdout, "open fd: %d\n", fd);
    close(fd);
    return 0;
}

This verifies that fd really is a small integer. For the open function, the file descriptor is returned if the file is opened successfully, and -1 is returned if it fails. But what is this file descriptor actually for?

To see what file descriptors are for, we have to mention how the operating system manages files. Files are opened by processes, so managing open files is part of managing processes, and the operating system manages things by describing them first and then organizing them. When we open a file, the operating system creates a corresponding data structure (struct file) to describe it. The task_struct of the process holds a pointer, struct files_struct *files, to a table called files_struct. The core of that table is a pointer array, fd_array[], in which each element is a pointer to an open file (struct file*), and the files in memory are managed through these pointers. The subscript of this array is what we call a file descriptor.

That is the essence of the file descriptor fd: it is an array subscript, and the file to be managed can be found through the file descriptor fd.

Now we know what a file descriptor is. But when we open a file with open, why is its file descriptor 3?

This requires an understanding of how file descriptors are allocated.

2. Allocation rules for file descriptors
By default, a Linux process starts with 3 open file descriptors: standard input 0, standard output 1, and standard error 2. This means subscripts 0 to 2 of fd_array[] are already occupied, so my newly opened file is assigned fd = 3. Does that mean file descriptor allocation always starts from 3?

Let's close fd = 0 before calling open and see what happens:

close(0);//Close fd = 0 for observation

This time the program prints fd = 0.

Next, close fd = 1 instead. What do we find?

close(1);//Close fd = 1 for observation

This time, when we run the program, the value of fd is not printed on the screen.

Is there a bug in the program? Next, we close fd = 2 instead and leave everything else unchanged.

close(2);//Close fd = 2 for observation

This time fd = 2 is printed, so there is no bug. Then why did the three runs behave differently?

Summary of phenomena

close fd = 0: fd = 0 is printed on the screen

close fd = 1: fd is not printed on the screen

close fd = 2: fd = 2 is printed on the screen

Why is this so?

The allocation rule for file descriptors is this: scanning fd_array[] from small to large, the smallest subscript that is not occupied is assigned as the new fd.

This is why fd = 0 and fd = 2 are printed after we close fd = 0 and fd = 2: once closed, the subscript is unoccupied and the operating system can allocate it again.

As for fd = 1: by the same rule, the newly opened file is assigned fd = 1. Nothing appears on the display because fd = 1 originally pointed to the file that controls the display's output, and stdout still writes to fd = 1, which no longer leads there. So the program is not broken; its output is simply no longer headed for the screen.

2. Redirection
1. Understanding Redirection
Let's keep looking at the code above. When we close fd = 1 and then open our file, the pointer at subscript 1 no longer points to the display but to our file, so anything printed to standard output is written to our file.

But we also observed that add.txt is empty. Wasn't the content supposed to be printed to the file? This is related to the buffer (which will be discussed in detail next time): the output is still sitting in the stdout buffer when close(fd) is called, so it never reaches the file. Here we only need to flush the buffer.

We use the C library function fflush to force the flush, calling fflush(stdout) before close(fd).

Now the value of fd is indeed written to add.txt.

We call this phenomenon redirection: the pointer at fd = 1, which used to point to the display, now points to our own file.

The essence of redirection is this: the fd used by the upper layer stays the same, while in the kernel the struct file* stored at that fd is changed to point to a different file.

But redirecting this way is a bit troublesome, since we have to close an fd by hand every time. In fact, the operating system provides a dedicated redirection interface.

2. Redirected interface function dup2
Through the redirection interface dup2, we can easily carry out the redirection work.

Function prototype:

int dup2(int oldfd, int newfd);

There are three interfaces in the dup family (dup, dup2, and dup3); the most commonly used one is dup2.

Function parameters

The manual page for dup2 roughly says: newfd becomes a copy of oldfd, and newfd is closed first if necessary. In other words, dup2 performs for us the close(1) step that we did by hand at first. The manual also notes:

If oldfd is not a valid file descriptor, the call fails and newfd is not closed.
If oldfd is a valid file descriptor and newfd has the same value as oldfd, then dup2() does nothing and returns newfd.
oldfd: the fd of the file we opened

newfd: the fd we want to redirect (for example, 1 for standard output); after the call it refers to the same file as oldfd

The naming of these parameters can be misleading, so let's understand them through the code below.

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h> // dup2, close

int main()
{
    int fd = open("add.txt", O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd < 0)
    {
        perror("open");
        return 1;
    }
    if (dup2(fd, 1) < 0) // redirection: fd 1 now refers to add.txt
    {
        perror("dup2");
        return 1;
    }
    fprintf(stdout, "open fd: %d\n", fd);
    close(fd);
    return 0;
}

When we run the program, nothing is printed on the screen; the output is written to add.txt instead.

3. Two kinds of redirection
The following introduces two common redirections: input redirection (<) and output redirection (>).

Input redirection (<)

Taken literally, input redirection means designating another device or file to replace the keyboard as the input source.

Command format: command < file

Let’s further understand in the code:

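#include <stdio.h>
#include <fcntl.h>
#include <unistd.h> // dup2, close

int main()
{
    int fd = open("add.txt", O_RDONLY);
    if (fd < 0)
    {
        perror("open");
        return 1;
    }
    dup2(fd, 0); // redirection: fd 0 now refers to add.txt
    close(fd);   // the extra descriptor is no longer needed
    char line[64];
    while (1)
    {
        printf("<");
        if (fgets(line, sizeof(line), stdin) == NULL) break;
        printf("%s", line);
    }
    return 0;
}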

Suppose we have written some content into add.txt in advance. Now run the program.

The program reads the content of add.txt line by line directly from the file instead of from the keyboard; that is input redirection.

This is similar to the shell command:

cat < add.txt

Output redirection (>)

Taken literally, output redirection means designating another device or file to replace the monitor as the output destination.

Command format: command > file

Output redirection sends the standard output of a command to the specified file. If the file already contains data, the original data is cleared before the new data is written.

By hmimcu