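Scenario: read data from the local disk, then send that data to a remote peer over a socket.
read(file, user_buf, len);    // read data from disk into user_buf
write(socket, user_buf, len); // write the same buffer to the socket (NIC)
The complete process is as follows:
1: Read the file from disk into a kernel buffer.
2: Copy the data from the kernel buffer into user_buf. // read() done
3: Copy the data from user_buf back into a kernel (socket) buffer.
4: Copy the data from the kernel buffer to the NIC.
The process above takes 4 context switches (each of the two system calls enters and leaves the kernel) and 4 copies.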
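Introducing the zero-copy technique:
1: Read the file from disk into a kernel buffer.
2: Send the data from the kernel buffer to the NIC.
The process above takes 2 context switches (a single system call that enters and leaves the kernel once) and 2 copies.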
System interface
ssize_t sendfile(int out_fd, int in_fd, off_t *offset, size_t count);
sendfile() copies data between one file descriptor and another. Because this copying is done within the kernel, sendfile() is more efficient than the combination of read and write, which would require transferring data to and from user space.
in_fd should be a file descriptor opened for reading and out_fd should be a descriptor opened for writing.
If offset is not NULL, then it points to a variable holding the file offset from which sendfile() will start reading data from in_fd. When sendfile() returns, this variable will be set to the offset of the byte following the last byte that was read. If offset is not NULL, then sendfile() does not modify the file offset of in_fd; otherwise the file offset is adjusted to reflect the number of bytes read from in_fd.
If offset is NULL, then data will be read from in_fd starting at the file offset, and the file offset will be updated by the call.
count is the number of bytes to copy between the file descriptors.
The in_fd argument must correspond to a file which supports mmap(2)-like operations (i.e., it cannot be a socket). In Linux kernels before 2.6.33, out_fd must refer to a socket. Since Linux 2.6.33 it can be any file. If it is a regular file, then sendfile() changes the file offset appropriately.
=============================================================================
Scenario:
When reading data from a stream object, the caller has to pass a buffer to the stream, so one memory copy takes place at that point. An example is the socket read and write operations in asio.
Protocol Buffers provides the ZeroCopyInputStream and ZeroCopyOutputStream interfaces, which represent abstract I/O streams to and from which protocol buffers can be read and written.
For a few simple implementations of these interfaces, see zero_copy_stream_impl.h.
These interfaces are different from classic I/O streams in that they try to minimize the amount of data copying that needs to be done. To accomplish this, responsibility for allocating buffers is moved to the stream object, rather than being the responsibility of the caller. So, the stream can return a buffer which actually points directly into the final data structure where the bytes are to be stored, and the caller can interact directly with that buffer, eliminating an intermediate copy operation.
As an example, consider the common case in which you are reading bytes from an array that is already in memory (or perhaps an mmap()ed file). With classic I/O streams, you would do something like:
char buffer[BUFFER_SIZE];
input->Read(buffer, BUFFER_SIZE);
DoSomething(buffer, BUFFER_SIZE);
Then, the stream basically just calls memcpy() to copy the data from the array into your buffer. With a ZeroCopyInputStream, you would do this instead:
const void* buffer;
int size;
input->Next(&buffer, &size);
DoSomething(buffer, size);
Here, no copy is performed. The input stream returns a pointer directly into the backing array, and the caller ends up reading directly from it.
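To make the idea concrete, here is a small self-contained sketch of a stream that hands out pointers into its backing array. SimpleArrayInputStream and SumBytes are hypothetical names written for this example; they only mirror the shape of Next() and are not the protobuf implementation.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical stand-in for a zero-copy input stream over an in-memory array.
// It never copies: Next() returns a pointer into the backing array itself.
class SimpleArrayInputStream {
public:
    SimpleArrayInputStream(const void* data, int size, int block_size)
        : data_(static_cast<const char*>(data)),
          size_(size),
          block_size_(block_size),
          pos_(0) {}

    // Hands out the next chunk (at most block_size bytes).
    // Returns false once the whole array has been consumed.
    bool Next(const void** out, int* out_size) {
        if (pos_ >= size_) return false;
        int n = std::min(block_size_, size_ - pos_);
        *out = data_ + pos_;  // points directly into the backing array
        *out_size = n;
        pos_ += n;
        return true;
    }

private:
    const char* data_;
    int size_;
    int block_size_;
    int pos_;
};

// Example consumer: sums all bytes by walking the chunks in place.
long SumBytes(SimpleArrayInputStream* input) {
    long sum = 0;
    const void* buffer;
    int size;
    while (input->Next(&buffer, &size)) {
        const unsigned char* p = static_cast<const unsigned char*>(buffer);
        for (int i = 0; i < size; ++i) sum += p[i];
    }
    return sum;
}
```

Because the stream owns the buffers, the caller loops on Next() until it returns false instead of sizing its own buffer up front, and the chunk size is decided by the stream rather than by the caller.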