Beej's Guide to Network Programming
Author: Beej
Source: http:/

2. What is a socket?

You hear talk of "sockets" all the time, and perhaps you are wondering just what they are exactly. Well, they're this: a way to speak to other programs using standard Unix file descriptors.

What? OK, you may have heard some Unix hacker state, "Jeez, everything in Unix is a file!" What that person may have been talking about is the fact that when Unix programs do any sort of I/O, they do it by reading or writing to a file descriptor. A file descriptor is simply an integer associated with an open file. But (and here's the catch), that file can be a network connection, a FIFO, a pipe, a terminal, a real on-the-disk file, or just about anything else. Everything in Unix is a file! So when you want to communicate with another program over the Internet you're gonna do it through a file descriptor, you'd better believe it.

"Where do I get this file descriptor for network communication, Mr. Smarty-Pants?" is probably the last question on your mind right now, but I'm going to answer it anyway: You make a call to the socket() system routine. It returns the socket descriptor, and you communicate through it using the specialized
send() and recv() (man send, man recv) socket calls.

"But, hey!" you might be exclaiming right about now. "If it's a file descriptor, why in the name of Neptune can't I just use the normal read() and write() calls to communicate through the socket?" The short answer is, "You can!" The longer answer is, "You can, but send() and recv() offer much greater control over your data transmission."

What next? How about this: there are all kinds of sockets. There are DARPA Internet addresses (Internet Sockets), path names on a local node (Unix Sockets), CCITT X.25 addresses (X.25 Sockets that you can safely ignore), and probably many others depending on which Unix flavor you run. This document deals only with the first: Internet Sockets.

2.1. Two Types of Internet Sockets

What's this? There are two types of Internet sockets? Yes. Well, no. I'm lying. There are more, but I didn't want to scare you. I'm only going to talk about two types here. Except for this sentence, where I'm going to tell you that "Raw Sockets" are also very powerful and you should look them up.

All right, already. What are the two types? One is "Stream Sockets"; the other is "Datagram Sockets", which may hereafter be referred to as "SOCK_STREAM" and "SOCK_DGRAM", respectively. Datagram sockets are sometimes called "connectionless sockets". (Though they can be connect()'d if you really want.)

Stream sockets are reliable two-way connected communication streams. If you output two items into the socket in the order 1, 2, they will arrive
in the order 1, 2 at the opposite end. They will also be error free. Any errors you do encounter are figments of your own deranged mind, and are not to be discussed here.

What uses stream sockets? Well, you may have heard of the telnet application, yes? It uses stream sockets. All the characters you
type need to arrive in the same order you type them, right? Also, web browsers use the HTTP protocol, which uses stream sockets to get pages. Indeed, if you telnet to a web site on port 80 and type "GET /", it'll dump the HTML back at you!

How do stream sockets achieve this high level of data transmission quality? They use a protocol called "The Transmission Control Protocol", otherwise known as "TCP". TCP makes sure your data arrives sequentially and error-free. You may have heard of "TCP" before as the better half of "TCP/IP", where "IP" stands for "Internet Protocol". IP deals primarily with Internet routing and is not generally responsible for data integrity.

Cool. What about Datagram sockets? Why are they called connectionless? What is the deal, here, anyway? Why are they unreliable? Well, here are some facts: if you send a datagram, it may arrive. It may arrive out of order. If it arrives, the data within the packet will be error-free.

Datagram sockets also use IP for routing, but they don't use TCP; they use the "User Datagram Protocol", or "UDP" (see RFC 768.)

Why are they connectionless? Well, basically, it's because you don't have to maintain an open connection as you do with stream sockets. You just build a packet, slap an IP header on it with destination information, and send it out. No connection needed. They are generally used for packet-by-packet transfers of information. Sample applications: tftp, bootp, etc.

"Enough!" you may scream. "How do these programs even work if datagrams might get lost?!" Well, my human friend, each has its own protocol on top of UDP. For example, the tftp protocol says that for each packet that gets sent, the recipient has to send back a packet that says, "I got it!" (an "ACK" packet.) If the sender of the original packet gets no reply in, say, five seconds, he'll re-transmit the packet until he finally gets an ACK. This acknowledgment procedure is very important when implementing SOCK_DGRAM applications.

2.2. Low level Nonsense and Network Theory

Since I just mentioned layering of protocols, it's time to talk about how networks really work, and to show some examples of how SOCK_DGRAM packets are built. Practically, you can probably skip this section. It's good background, however.

Hey, kids, it's time to learn about Data Encapsulation! This is very very important. Basically, it says this: a packet is born, the packet is wrapped ("encapsulated") in a header (and rarely a footer) by the first protocol (say, the TFTP protocol), then the whole thing (TFTP header included) is encapsulated again by the next protocol (say, UDP), then again by the next (IP), then again by the final protocol on the hardware (physical) layer (say, Ethernet).

When another computer receives the packet, the hardware strips the Ethernet header, the kernel strips the IP and UDP headers, the TFTP program strips the TFTP