An Introduction to I/O Models on Linux

Blocking I/O


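Blocking I/O is the most basic model: when a process makes an I/O call whose data is not yet available, the process is put to sleep (the Sleeping state). Typical examples are read on a FIFO pipe and read on a TCP stream socket. The call switches the process from user mode into kernel mode, where the kernel executes the read system call. If the underlying stream (a pipe or a socket) has no data ready to be read yet, read blocks and the process is suspended. Only once data arrives does the kernel copy it from the kernel buffer into the process's user-space buffer; then read returns and the process can move on to the next step. The whole sequence is serial: each step has to finish before the next one begins.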

[Figure: blocking I/O model]

#!/usr/bin/python

'''
    TCP socket server (blocking)
'''

import sys
import socket

HOST = ''               # empty string: listen on all interfaces
PORT = 9243
DATA_BUFFER = 4096      # maximum number of bytes read per recv call

# create a TCP socket
try:
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print('Socket created')
except socket.error as msg:
    print('Failed to create socket. Error Code : ' + str(msg.errno) + ' Message ' + str(msg.strerror))
    sys.exit(1)

# bind the socket to the local host and port
try:
    s.bind((HOST, PORT))
except socket.error as msg:
    print('Bind failed. Error Code : ' + str(msg.errno) + ' Message ' + str(msg.strerror))
    sys.exit(1)

s.listen(5)             # queue up to 5 pending connections

while True:
    # wait for the next client to connect
    connection, address = s.accept()            # blocking
    while True:
        data = connection.recv(DATA_BUFFER)     # blocking
        if data:
            print(data)
        else:
            break                               # empty result: the peer closed the connection
    connection.close()                          # close the client socket

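The code above is a server that uses blocking calls throughout. accept blocks until a client has completed the TCP three-way handshake; recv blocks until the network card has received data and raised an interrupt, and the kernel has copied that data into the user process's buffer.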
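The same behaviour can be observed on a pipe without any network setup. The following is only a minimal sketch: the helper names delayed_write and timed_blocking_read and the 0.2 second delay are made up for illustration. It measures how long os.read stays blocked while the pipe is still empty.

```python
import os
import threading
import time

def delayed_write(fd, delay, payload):
    """Write payload to fd after sleeping for delay seconds."""
    time.sleep(delay)
    os.write(fd, payload)

def timed_blocking_read(delay=0.2):
    """Return (data, seconds spent inside the blocking os.read call)."""
    read_fd, write_fd = os.pipe()
    writer = threading.Thread(target=delayed_write,
                              args=(write_fd, delay, b"hello"))
    writer.start()
    start = time.monotonic()
    data = os.read(read_fd, 4096)   # blocks: the pipe is empty until the writer runs
    elapsed = time.monotonic() - start
    writer.join()
    os.close(read_fd)
    os.close(write_fd)
    return data, elapsed

if __name__ == "__main__":
    data, elapsed = timed_blocking_read()
    print(data, "arrived after %.2f s" % elapsed)
```

The reading thread does nothing useful for roughly the whole delay, which is exactly the cost of the blocking model.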

Non-blocking I/O


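The process calls recvfrom on a socket that is in non-blocking mode. If no data is ready, the call returns immediately with an error instead of putting the process to sleep, so the process keeps asking the kernel whether data has arrived. This is the so-called polling model, and it makes the CPU spend a great deal of time spinning idly.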

[Figure: non-blocking I/O with polling]

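while True:
    # wait for the next client to connect
    connection, address = s.accept()        # connection is a new socket
    connection.setblocking(False)           # make recvfrom return immediately
    while True:
        print("waiting for data")
        try:
            data, _ = connection.recvfrom(DATA_BUFFER)  # read up to DATA_BUFFER bytes
        except BlockingIOError:             # no data yet (EAGAIN/EWOULDBLOCK)
            continue                        # ask again at once: busy polling
        if not data:
            break                           # the peer closed the connection
    connection.close()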

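Once the connection is established, the code above polls continuously and keeps printing "waiting for data". The process no longer sleeps, but in practice it gets no more done than if it did. Worse, it keeps issuing system calls, so the CPU spins idly. Under normal circumstances this model has no advantage over the blocking one. This is what is called non-blocking "busy" polling.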
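To get a feel for how wasteful busy polling is, one can count the failed attempts. This is a rough sketch under assumptions of my own: it uses a local socketpair instead of a real TCP connection, recv instead of recvfrom, and the helper names delayed_send and busy_poll are hypothetical.

```python
import socket
import threading
import time

def delayed_send(sock, delay, payload):
    """Send payload on sock after sleeping for delay seconds."""
    time.sleep(delay)
    sock.sendall(payload)

def busy_poll(delay=0.1):
    """Return (data, number of failed recv attempts before data arrived)."""
    reader, writer = socket.socketpair()
    reader.setblocking(False)           # recv now raises instead of sleeping
    sender = threading.Thread(target=delayed_send,
                              args=(writer, delay, b"ping"))
    sender.start()
    attempts = 0
    while True:
        try:
            data = reader.recv(4096)
            break
        except BlockingIOError:         # EAGAIN/EWOULDBLOCK: nothing to read yet
            attempts += 1
    sender.join()
    reader.close()
    writer.close()
    return data, attempts

if __name__ == "__main__":
    data, attempts = busy_poll()
    print(data, "after", attempts, "failed attempts")
```

The exact count depends on the machine, but every failed attempt is a system call that achieved nothing.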

The select Model


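In blocking I/O mode, a process or thread can handle the I/O events of only one stream at a time. To handle several streams at once you need either multiple processes (fork) or multiple threads (pthread_create), and unfortunately neither approach is very efficient. To avoid spinning the CPU, we can bring in an agent. (First there was an agent called select, later another called poll; the two are essentially the same.) This agent can watch the I/O events of many streams at once. While nothing is happening it blocks the current thread; when one or more streams have an I/O event, the thread wakes up and our program then walks through all the streams once to find the ones that are ready (so we can drop the word "busy").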

[Figure: I/O multiplexing with select]

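When a user process calls select, the whole process blocks while the kernel "watches" every socket that was handed to select. As soon as data is ready on any one of them, select returns and reports that socket as readable; we then call recvfrom to copy the datagram into the application's buffer.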
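The two phases described here (wait in select, then copy with recvfrom) can be sketched with a UDP socket on the loopback interface. The function name wait_for_datagram, the payload, and the one second timeout are assumptions made for this illustration only.

```python
import select
import socket

def wait_for_datagram(timeout=1.0):
    """Send one datagram to ourselves and receive it via select + recvfrom."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free port
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender.sendto(b"datagram", receiver.getsockname())

    # Phase 1: block in select until the kernel reports the socket readable.
    readable, _, _ = select.select([receiver], [], [], timeout)

    data = None
    if receiver in readable:
        # Phase 2: recvfrom copies the datagram into our own buffer.
        data, _addr = receiver.recvfrom(4096)

    receiver.close()
    sender.close()
    return data

if __name__ == "__main__":
    print(wait_for_datagram())
```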

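import select

time_out = 5    # seconds to wait in each select call (illustrative value)

while True:
    print("Select blocking...")
    infds, outfds, errfds = select.select([s], [], [], time_out)
    # infds is empty when the timeout expired without any activity
    for sock in infds:
        # the listening socket is readable: a client is waiting to be accepted
        connection, address = sock.accept()     # connection is a new socket
        print("Connected successfully!")
        connection.close()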

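Compared with blocking I/O, select has little to offer for a single stream; its one advantage is that it can wait on many descriptors of interest at the same time.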
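That advantage is easier to see when more than one descriptor is watched. The sketch below is an assumption-laden illustration, not a server: it uses three local socketpairs in place of client connections, and read_ready and demo are hypothetical helper names.

```python
import select
import socket

def read_ready(readers, timeout=1.0):
    """Block in select, then read once from each socket reported readable.

    Returns a list aligned with readers; entries stay None for sockets
    that had nothing to read.
    """
    readable, _, _ = select.select(readers, [], [], timeout)
    results = [None] * len(readers)
    for sock in readable:
        results[readers.index(sock)] = sock.recv(4096)
    return results

def demo():
    """Watch three streams at once while only two of them carry data."""
    pairs = [socket.socketpair() for _ in range(3)]
    readers = [reader for reader, _ in pairs]
    pairs[0][1].sendall(b"first")       # only streams 0 and 2 receive data
    pairs[2][1].sendall(b"third")
    results = read_ready(readers)
    for reader, writer in pairs:
        reader.close()
        writer.close()
    return results

if __name__ == "__main__":
    print(demo())
```

A single thread waits on all three streams, and the idle one costs nothing beyond its slot in the list passed to select.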

Appendix: using the select function

select.select(rlist, wlist, xlist[, timeout])

This is a straightforward interface to the Unix select() system call. The first three arguments are sequences of ‘waitable objects’: either integers representing file descriptors or objects with a parameterless method named fileno() returning such an integer:

rlist: wait until ready for reading
wlist: wait until ready for writing
xlist: wait for an “exceptional condition” (see the manual page for what your system considers such a condition)
Empty sequences are allowed, but acceptance of three empty sequences is platform-dependent. (It is known to work on Unix but not on Windows.) The optional timeout argument specifies a time-out as a floating point number in seconds. When the timeout argument is omitted the function blocks until at least one file descriptor is ready. A time-out value of zero specifies a poll and never blocks.

The return value is a triple of lists of objects that are ready: subsets of the first three arguments. When the time-out is reached without a file descriptor becoming ready, three empty lists are returned.

Among the acceptable object types in the sequences are Python file objects (e.g. sys.stdin, or objects returned by open() or os.popen()), socket objects returned by socket.socket(). You may also define a wrapper class yourself, as long as it has an appropriate fileno() method (that really returns a file descriptor, not just a random integer).
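Two of the rules quoted above, the zero timeout and the fileno() wrapper, can be tried out with a short sketch. PipeReader and poll_once are hypothetical names invented for this illustration, and selecting on a pipe is assumed to run on a Unix system.

```python
import os
import select

class PipeReader:
    """Hypothetical wrapper: select only needs a fileno() method."""

    def __init__(self, fd):
        self._fd = fd

    def fileno(self):
        return self._fd

def poll_once(obj):
    """Zero timeout: report readiness right away without blocking."""
    readable, writable, exceptional = select.select([obj], [], [], 0)
    return readable, writable, exceptional

read_fd, write_fd = os.pipe()
wrapped = PipeReader(read_fd)

before = poll_once(wrapped)     # pipe is empty: three empty lists
os.write(write_fd, b"x")
after = poll_once(wrapped)      # the wrapper object itself is returned as readable

os.close(read_fd)
os.close(write_fd)

if __name__ == "__main__":
    print(before)
    print(after[0] == [wrapped])
```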

