Summary of Computer Networks#
I. Architecture#
Application Layer
: Application-layer protocols define the rules for communication and interaction between application processes, providing data transmission services for specific applications (e.g., HTTP, DNS). Its data unit is the message.

Transport Layer
: Provides general data transmission services (TCP or UDP) for communication between processes on two hosts.

Network Layer
: Encapsulates the segments or user datagrams produced by the transport layer into packets for transmission.

Data Link Layer
: Assembles the IP datagrams handed down from the network layer into frames and transmits them over the link between two adjacent nodes.

Physical Layer
: Transmits data as bits over the physical medium.
Physical Layer Channel Multiplexing Technology#
Basic Concepts#
A channel is a medium that carries information in one direction. Channels include:

- Simplex channel: one party sends and the other receives.
- Half-duplex channel: both parties can send messages, but not at the same time.
- Full-duplex channel: both parties can send information simultaneously.
Multiplexing refers to channel sharing, with several common channel multiplexing technologies: frequency division multiplexing, time division multiplexing, statistical time division multiplexing. Signal multiplexing and demultiplexing are performed through multiplexers and demultiplexers.
Channel Multiplexing Technologies#
1. Frequency Division Multiplexing (FDM)
Users occupy different frequency bandwidths to multiplex the same channel at the same time.
2. Time Division Multiplexing (TDM)
Divides time into equal time slots for multiplexing frames, occupying the same bandwidth at different times.
3. Statistical Time Division Multiplexing (STDM)
An improvement on time division multiplexing that allocates time slots dynamically on demand rather than in a fixed pattern (the number of slots per frame is smaller than the number of users attached to the concentrator, so each transmitted STDM frame is filled with data).
4. Wavelength Division Multiplexing (WDM)
Frequency division multiplexing in optical communication.
5. Code Division Multiplexing (CDM)
Each user communicates using the same frequency band at the same time, but each user uses a selected different code type, allowing independent communication between users.
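The CDM idea can be sketched as a toy example with 4-chip orthogonal codes (real systems use far longer code sequences; the codes and users below are illustrative):

```python
# Sketch of code division multiplexing with orthogonal chip sequences.
# Each user is assigned a mutually orthogonal chip sequence of +1/-1.
CODES = {
    "A": [+1, +1, +1, +1],
    "B": [+1, -1, +1, -1],
    "C": [+1, +1, -1, -1],
}

def encode(bit, code):
    """Transmit the code for bit 1, its negation for bit 0."""
    sign = 1 if bit else -1
    return [sign * c for c in code]

def decode(signal, code):
    """Correlate the combined signal with one user's code to recover its bit."""
    score = sum(s * c for s, c in zip(signal, code)) / len(code)
    return 1 if score > 0 else 0

# All users transmit simultaneously on the same band; their signals add up.
combined = [sum(chips) for chips in zip(encode(1, CODES["A"]),
                                        encode(0, CODES["B"]),
                                        encode(1, CODES["C"]))]

print(decode(combined, CODES["A"]))  # 1
print(decode(combined, CODES["B"]))  # 0
print(decode(combined, CODES["C"]))  # 1
```

Because the codes are orthogonal, correlating the summed signal with one user's code cancels out every other user's contribution, which is exactly why the users can share the band independently.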
Data Link Layer#
The main functions of the data link layer include:
- Link management
- Frame synchronization
- Flow control
- Error control
- Distinguishing data from control information
- Transparent transmission
- Addressing
The three basic issues of the data link layer are: framing, transparent transmission, and error detection.
Framing#
Framing adds a header and trailer to the IP datagram to form a frame, so that the receiving end can locate the start and end of each frame in the physical layer's bit stream, i.e., it performs frame delimitation.
In addition to the control information carried in the header and trailer, the link-layer protocol specifies an upper limit on the length of a frame's data portion: the Maximum Transmission Unit (MTU).
Transparent Transmission#
Transparent transmission means the data link layer imposes no restrictions on the data it carries: the data the receiver gets is identical to the data the sender sent, i.e., the data link layer is completely transparent to the frames it transmits.
Byte Stuffing:
To prevent the data portion from containing a frame delimiter, which would make the receiver believe the frame ended prematurely, the sender inserts an escape character before any control character that appears in the data. The receiver's data link layer removes the inserted escape characters.
Bit Stuffing:
Whenever five consecutive 1s are detected, a 0 is immediately inserted to ensure that six consecutive 1s do not occur.
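Both stuffing schemes can be sketched in Python. The 0x7E/0x7D values below follow the common HDLC/PPP convention; treat this as an illustration, not a full framing implementation:

```python
FLAG = 0x7E    # frame delimiter (HDLC/PPP convention)
ESC  = 0x7D    # escape byte

def byte_stuff(payload: bytes) -> bytes:
    """Insert an escape byte before any flag or escape byte in the data."""
    out = bytearray()
    for b in payload:
        if b in (FLAG, ESC):
            out.append(ESC)
        out.append(b)
    return bytes(out)

def byte_unstuff(stuffed: bytes) -> bytes:
    """Drop each escape byte and keep the byte that follows it."""
    out = bytearray()
    it = iter(stuffed)
    for b in it:
        out.append(next(it) if b == ESC else b)
    return bytes(out)

def bit_stuff(bits: str) -> str:
    """After five consecutive 1s, insert a 0 so the data never contains
    six 1s in a row (which would look like the 01111110 flag)."""
    out, run = [], 0
    for bit in bits:
        out.append(bit)
        run = run + 1 if bit == "1" else 0
        if run == 5:
            out.append("0")
            run = 0
    return "".join(out)

data = bytes([0x01, 0x7E, 0x7D, 0x02])
print(byte_unstuff(byte_stuff(data)) == data)  # True
print(bit_stuff("0111111"))                    # 01111101
```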
Error Detection#
Error detection deals with bit errors that occur during transmission;
Bit Error Rate: The ratio of erroneous bits to the total number of bits transmitted over a period of time;
Currently, the error detection method used at the data link layer is Cyclic Redundancy Check;
Note:
The data link layer guards against bit errors, not against all transmission errors. Transmission errors also include frame loss, frame duplication, frames arriving out of order, etc.
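As a sketch of how CRC works, the modulo-2 division at its core can be written out in a few lines of Python, using the textbook example of data 101001 with generator 1101:

```python
def mod2_remainder(bits: str, generator: str) -> str:
    """Modulo-2 (XOR) long division; returns the remainder of bits / generator."""
    k = len(generator) - 1
    buf = list(bits)
    for i in range(len(buf) - k):
        if buf[i] == "1":
            for j, g in enumerate(generator):
                buf[i + j] = str(int(buf[i + j]) ^ int(g))
    return "".join(buf[-k:])

def crc_fcs(data: str, generator: str) -> str:
    """FCS = remainder of (data shifted left by k bits) divided by the generator."""
    return mod2_remainder(data + "0" * (len(generator) - 1), generator)

data, gen = "101001", "1101"
fcs = crc_fcs(data, gen)
print(fcs)                              # 001
print(mod2_remainder(data + fcs, gen))  # 000 -> zero remainder, frame accepted
```

The receiver divides the whole received frame (data plus FCS) by the same generator; a zero remainder means no detected bit errors.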
II. IP Address#
Overview#
For computers to achieve network communication, they must have a network address for quick location. The IP address is the unique identity ID of a computer on the network, analogous to the need for a specific residential address for delivery in the real world.
IP Address = Network Address + Host Address (also known as: composed of host number and network number)
Network Segment Division#
Class A Address: starts with 0; first byte range 0 - 127 (usable addresses: 1.0.0.0 - 126.255.255.255);
Class B Address: starts with 10; first byte range 128 - 191 (128.0.0.0 - 191.255.255.255);
Class C Address: starts with 110; first byte range 192 - 223 (192.0.0.0 - 223.255.255.255);
Addresses reserved for private internal use: 10.0.0.0 - 10.255.255.255, 172.16.0.0 - 172.31.255.255, 192.168.0.0 - 192.168.255.255.
Subnet Mask#
- When subnetting, part of what was originally the host number is reassigned to the network number; the subnet mask distinguishes which part is the network number and which part is the host number.
- A subnet mask cannot exist on its own; it must be used together with an IP address.
- The sole function of the subnet mask is to split an IP address into two parts: the network address and the host address.
Calculating Subnet Mask#
Calculating Subnet Mask Using the Number of Subnets#
Calculation rules:

- Convert the number of subnets to binary.
- Count the bits of that binary value; call it N.
- Take the classful subnet mask of the IP address and set the first N bits of its host portion to 1 to obtain the subnet mask.
For example, to divide the Class B address 168.195.0.0 into 27 subnets:

- 27 = 11011 in binary.
- That value has five bits, so N = 5.
- Setting the first 5 bits of the host portion of the Class B mask 255.255.0.0 to 1 gives 255.255.248.0, the subnet mask for dividing 168.195.0.0 into 27 subnets.
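The rule above can be sketched in Python (a hypothetical helper, not part of any standard library):

```python
def mask_from_subnets(num_subnets: int, class_prefix_len: int = 16) -> str:
    """Borrow N host bits, where N is the bit length of the subnet count.
    class_prefix_len is 8 for Class A, 16 for Class B, 24 for Class C."""
    n = num_subnets.bit_length()            # 27 -> 0b11011 -> 5 bits
    mask = (0xFFFFFFFF << (32 - class_prefix_len - n)) & 0xFFFFFFFF
    return ".".join(str((mask >> s) & 0xFF) for s in (24, 16, 8, 0))

print(mask_from_subnets(27))                # 255.255.248.0
```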
Calculating Subnet Mask Using the Number of Hosts#
Calculation rules:

- Convert the number of hosts to binary.
- If the host count is at most 254 (remember to account for the two reserved addresses in each subnet), take the number of bits of that value, denoted N, with N < 8. If it is greater than 254, then N > 8, meaning the host address occupies more than 8 bits.
- Start from 255.255.255.255 (all host bits of that class set to 1) and clear the last N bits to 0, counting from the right; the result is the subnet mask.
For example, to divide the Class B address 168.195.0.0 into subnets of 700 hosts each:

- 700 = 1010111100 in binary.
- That value has ten bits, so N = 10.
- Starting from 255.255.255.255 and clearing the last 10 bits gives 11111111.11111111.11111100.00000000, i.e., 255.255.252.0, the subnet mask for dividing 168.195.0.0 into subnets of 700 hosts.
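This rule, too, can be sketched in Python; the helper below is hypothetical, and the standard `ipaddress` module is used to cross-check the result:

```python
import ipaddress

def mask_from_hosts(num_hosts: int) -> str:
    """Keep the top 32 - N bits, where N is the bit length of the host count
    (the count should already allow for the two reserved addresses)."""
    n = num_hosts.bit_length()              # 700 -> 0b1010111100 -> 10 bits
    mask = (0xFFFFFFFF << n) & 0xFFFFFFFF
    return ".".join(str((mask >> s) & 0xFF) for s in (24, 16, 8, 0))

print(mask_from_hosts(700))                 # 255.255.252.0

# Cross-check with the standard library: a /22 network holds 2^10 - 2 hosts.
net = ipaddress.ip_network("168.195.0.0/22")
print(net.netmask, net.num_addresses - 2)   # 255.255.252.0 1022
```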
III. UDP#
Overview#
- The User Datagram Protocol (UDP) adds multiplexing/demultiplexing and error detection on top of the IP datagram service. It is connectionless and provides unreliable transmission.
- UDP merely adds a header to the data handed down by the application layer before passing it to the network layer.
- A user datagram received from the network layer has its header removed, and the data is passed up to the application layer unchanged.
UDP Header Format#
The UDP header fields are simple, consisting of 4 fields, each 2 bytes long, totaling 8 bytes.
- Source Port: Used when a reply is needed; can be set to zero if not needed.
- Destination Port: Must be used when delivering the message at the endpoint; otherwise, who receives the data?
- Length: The length of the UDP datagram, with a minimum value of 8 bytes, which is just the header.
- Checksum: Checks whether the user datagram has errors during transmission; if errors are found, it is discarded.
During transmission, if the receiving UDP finds that the destination port in the received datagram does not exist, it will discard it directly, and the Internet Control Message Protocol (ICMP) will send an "unreachable port" error message to the sender.
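A minimal sketch of a UDP exchange in Python, using the loopback interface and letting the OS pick the port, shows the connectionless sendto/recvfrom pattern described above:

```python
import socket

# UDP server side: just bind; there is no listen/accept for UDP.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))              # port 0: let the OS choose
addr = server.getsockname()

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.sendto(b"ping", addr)               # destination given per datagram

data, peer = server.recvfrom(1024)         # demultiplexed by destination port
server.sendto(data, peer)                  # reply to the source port
reply, _ = client.recvfrom(1024)
print(reply)                               # b'ping'
server.close()
client.close()
```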
For details on cyclic redundancy check, see: Introduction and Implementation Analysis of CRC (Cyclic Redundancy Check Code)
IV. TCP#
TCP, one of the main transport-layer protocols, is a connection-oriented, end-to-end, reliable, full-duplex, byte-stream-oriented transmission protocol.
TCP Segment#
Although TCP is byte-stream-oriented, the unit of data it transmits is the segment. A TCP segment consists of a header and a data portion. The first 20 bytes of the header are fixed; options of 4n bytes may be appended dynamically as needed, up to a maximum of 40 bytes.
- Source Port and Destination Port: 2 bytes each; TCP's demultiplexing is implemented through ports.
- Sequence Number: 4 bytes, range [0, 2^32 - 1]. TCP is byte-stream-oriented and numbers every byte in order. For example, if a segment has sequence number 201 and carries 100 bytes of data, the first data byte is numbered 201 and the last 300. When the maximum is reached, numbering wraps around to 0.
- Acknowledgment Number: 4 bytes; the sequence number of the first byte of the next segment expected from the other party. Acknowledgment number = N means all data up to sequence number N - 1 has been correctly received.
- Data Offset: 4 bits; the distance from the start of the segment to the start of its data portion, which indirectly gives the header length.
- Reserved: 6 bits, reserved for future use; currently set to 0.
- URG (Urgent): when URG = 1, the urgent pointer field is valid and the segment carries urgent data that should be sent as soon as possible.
- ACK: the acknowledgment number is valid only when ACK = 1; once the connection is established, every segment has ACK = 1.
- PSH (Push): when the receiver gets a segment with PSH = 1, it delivers it to the receiving application as soon as possible instead of waiting for its buffer to fill. Rarely used in practice.
- RST (Reset): RST = 1 indicates a serious error in the TCP connection; it must be released and re-established.
- SYN (Synchronize): synchronizes sequence numbers when establishing a connection. SYN = 1 with ACK = 0 marks a connection request; SYN = 1 with ACK = 1 means the other party accepts the connection. Used during TCP connection establishment.
- FIN (Finish): FIN = 1 means the sender of this segment has no more data to send and requests release of its direction of the connection. Used during TCP connection termination.
- Window: 2 bytes; the sender's own receive window, telling the other party how much data it is allowed to send.
- Checksum: 2 bytes; covers both the header and the data portion.
- Urgent Pointer: 2 bytes; when URG = 1, it gives the number of urgent bytes in the segment (urgent data comes first, followed by normal data).
- Options: variable length, up to 40 bytes, e.g., Maximum Segment Size (MSS). MSS is the length of the data portion, not of the whole segment, and defaults to 536 bytes. Other options include window scaling and timestamps.
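As an illustration of the field layout, the fixed 20-byte header can be packed and unpacked with Python's `struct` module; the sample SYN segment below is hand-built and hypothetical:

```python
import struct

def parse_tcp_header(segment: bytes) -> dict:
    """Unpack the fixed 20-byte TCP header into its fields."""
    src, dst, seq, ack, off_flags, window, checksum, urg = struct.unpack(
        "!HHIIHHHH", segment[:20])
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "data_offset": (off_flags >> 12) * 4,   # header length in bytes
        "flags": {name: bool(off_flags & bit) for name, bit in
                  [("URG", 0x20), ("ACK", 0x10), ("PSH", 0x08),
                   ("RST", 0x04), ("SYN", 0x02), ("FIN", 0x01)]},
        "window": window,
        "checksum": checksum,
        "urgent_ptr": urg,
    }

# A hand-built SYN segment: ports 1234 -> 80, seq 100, data offset 5 words.
hdr = struct.pack("!HHIIHHHH", 1234, 80, 100, 0, (5 << 12) | 0x002, 65535, 0, 0)
info = parse_tcp_header(hdr)
print(info["dst_port"], info["flags"]["SYN"], info["data_offset"])  # 80 True 20
```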
Three-Way Handshake#
- First: the client sends a connection request segment to the server with SYN = 1 and seq = x, the client's ISN (Initial Sequence Number). After sending, the client enters the SYN_SENT state.
- Second: the server receives the segment and replies with an acknowledgment segment with ACK = 1 and ack = x + 1; since the server's own sequence number must also be acknowledged by the client, the segment additionally carries SYN = 1 and seq = y. After sending, the server enters the SYN_RCVD state.
- Third: the client receives the segment and sends an acknowledgment segment with ACK = 1 and ack = y + 1. After sending, the client enters the ESTABLISHED state, and the server also enters ESTABLISHED upon receiving it. At this point the connection is established.
Reason for the three-way handshake:
To avoid wasting resources. If the server's acknowledgment in the second step is delayed by the network and does not reach the client in time, the client assumes its request was lost and sends a new connection request, which the server acknowledges as well. Without the third step, the server would have created two connections and be waiting for two clients to send data when in reality only one client exists, wasting resources.
Four-Way Handshake#
The four-way handshake refers to the client and server each sending a request to terminate the connection while responding to each other's requests.
- First Handshake: the client sends a packet with FIN = 1, seq = x to the server, indicating it has no more data to transmit and wants to close its direction of the connection. After sending, the client enters the FIN_WAIT_1 state.
- Second Handshake: on receiving the request, the server replies with an acknowledgment packet with ACK = 1, ack = x + 1, confirming the disconnection, and enters the CLOSE_WAIT state. The client, upon receiving this packet, enters the FIN_WAIT_2 state. The client-to-server data direction is now closed.
- Third Handshake: the server sends a packet with FIN = 1, seq = y to the client, indicating it has no more data to send. After sending, it enters the LAST_ACK state and waits for the client's acknowledgment.
- Fourth Handshake: on receiving the request, the client sends an acknowledgment packet with ACK = 1, ack = y + 1 to the server and enters the TIME_WAIT state (it may need to retransmit this acknowledgment). The server enters the CLOSED state upon receiving the acknowledgment; the server-to-client direction is now closed. After a waiting period, the client also enters the CLOSED state.
Note:
- By default (without changing socket options), when you call close (or closesocket, hereafter referred to as close), if there is still data in the send buffer, TCP will continue to send the data until completion.
- Sending FIN only indicates that this end can no longer send data (the application layer can no longer call send), but it can still receive data.
Reason for the four-way handshake:
Since a TCP connection is full-duplex and both parties can actively send data, when one party closes its direction it must notify the other, so the peer can finish its own transmission and close its direction in turn.
Protocols that run over TCP include: FTP (File Transfer Protocol), Telnet (remote login), SMTP (Simple Mail Transfer Protocol), POP3 (for receiving mail, the counterpart of SMTP), HTTP, etc.
Supplement on "Three-Way Handshake, Four-Way Handshake"#
ISN#
An important function of the three-way handshake is to exchange ISNs between the client and server, allowing each party to know how to assemble data by sequence number when receiving data next.
If ISN is fixed, an attacker can easily guess subsequent acknowledgment numbers.
ISN = M + F(localhost, localport, remotehost, remoteport)
M is a timer that increments by 1 every 4 milliseconds. F is a hash algorithm that generates a random value based on source IP, destination IP, source port, and destination port. The hash algorithm must not be easily deduced by external parties.
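A toy sketch of this scheme follows; the secret, the hash choice, and the helper name are illustrative, not what any real TCP stack uses:

```python
import hashlib
import time

# Hypothetical sketch: a slowly incrementing clock plus a keyed hash
# of the connection four-tuple, as described above.
SECRET = b"per-boot-random-secret"        # must remain unguessable

def isn(localhost, localport, remotehost, remoteport):
    m = int(time.monotonic() * 1000 / 4)  # increments by 1 every 4 ms
    four_tuple = f"{localhost}:{localport}-{remotehost}:{remoteport}".encode()
    f = int.from_bytes(hashlib.sha256(SECRET + four_tuple).digest()[:4], "big")
    return (m + f) % 2**32                # sequence numbers wrap at 2^32

print(isn("10.0.0.1", 40000, "203.0.113.5", 80))
```

Because F depends on the four-tuple and a secret, an observer of one connection's ISN learns nothing about another connection's, while M keeps successive ISNs of the same four-tuple moving forward.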
Detailed Explanation of TIME_WAIT State#
The TIME_WAIT state lasts 2 * Maximum Segment Lifetime (MSL). The reasons for it are:

- The peer in the LAST_ACK state may time out waiting for the ACK and resend its FIN, so one purpose of TIME_WAIT is to allow a lost ACK to be retransmitted.
- To ensure the old connection's packets die out. If a new connection is later established with the same IP addresses and ports, leftover packets could be mistaken for segments of the new connection (a TCP session is uniquely identified by the four-tuple of [source IP, source port, destination IP, destination port]). Since a TCP segment lives at most MSL in the network, waiting 2MSL guarantees that all unreceived or late packets in both directions have disappeared or been discarded by routers, so a connection established afterwards will never receive stale application data from the original connection.
Sequence Number Wraparound#
Since ISN is random, sequence numbers can easily exceed 2^31-1. TCP's judgment of issues like packet loss and out-of-order relies on comparing sequence numbers, leading to the so-called TCP sequence number wraparound issue.
Half-Connection and Full-Connection#
Half-Connection Queue:
After the server receives the client's SYN for the first time, it enters the SYN_RCVD state. At this point, the connection has not been fully established, and the server will place this request in a queue, which we call the half-connection queue.
Full-Connection Queue:
Connections that have completed the three-way handshake and are established will be placed in the full-connection queue. If the queue is full, packet loss may occur.
For other details such as SYN flood attacks, SYN Cache, SYN Cookies, SYN Proxy firewalls, etc., please refer to: Do You Really Understand "Three-Way Handshake, Four-Way Handshake"?
TCP Reliable Transmission#
Reliable transmission should meet the following criteria:
- The transmission channel introduces no errors;
- Transmitted data is correct: no errors, no loss, no duplication, and in order.
Implementation of TCP reliable transmission:

- The three-way handshake establishes the TCP connection and the four-way handshake releases it, ensuring the established transmission channel is reliable.
- TCP employs the ARQ protocol to ensure the correctness of data transmission.
- TCP uses the sliding window protocol for flow control, so the receiver can process incoming data in time.
- TCP uses slow start, congestion avoidance, fast retransmit, and fast recovery for congestion control, avoiding network congestion.
Sliding Window Protocol#
The sliding window protocol means that the sender has a sending window, and data within the range of the sending window is allowed to be sent. The starting position and size of the sending window are dynamically adjusted based on the acknowledgment information received from the receiver and the size of the receiving window.
The window edges can move in three ways:

- The trailing (left) edge moves right, which happens when sent data has been acknowledged.
- The leading (right) edge moves right, allowing more data to be sent; this happens when the receive window grows or network congestion eases.
- The leading edge moves left, which happens when the receiver wishes to shrink the send window. The TCP standard strongly discourages this, because the sender may already have sent data that now falls outside the shrunken window, causing errors.
- The trailing edge can never move left, because TCP has already removed acknowledged data outside the window from the buffer.
ARQ Protocol#
ARQ (Automatic Repeat reQuest) is an error-control strategy for unreliable networks that achieves reliable data transmission using acknowledgments (ACKs, messages from the receiver telling the sender whether a packet arrived correctly) and timeouts (the period during which an acknowledgment is awaited). It encompasses mechanisms such as stop-and-wait ARQ, continuous ARQ, error detection, positive acknowledgment, retransmission after timeout, and negative acknowledgment with retransmission.
Stop-and-Wait ARQ:
"Stop-and-wait" means that after sending a packet, the sender stops and waits for the receiver's acknowledgment before sending the next packet. There are two error scenarios:
- When the receiver receives an erroneous data packet, it discards the packet.
- If the sender does not receive an acknowledgment within a certain time, it retransmits the packet, i.e., timeout retransmission.
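The stop-and-wait behavior can be sketched as a toy simulation; the 30% loss rate and the channel model below are made up for illustration:

```python
import random

random.seed(1)  # make the toy run deterministic

def lossy_send(pdu, loss_rate=0.3):
    """Model an unreliable channel: drop the PDU with the given probability."""
    return None if random.random() < loss_rate else pdu

def stop_and_wait(packets):
    received = []      # what the receiver has delivered upward
    expected = 0       # receiver's next expected sequence number
    attempts = 0
    for seq, data in enumerate(packets):
        while True:
            attempts += 1
            frame = lossy_send((seq, data))
            if frame is not None:
                fseq, fdata = frame
                if fseq == expected:       # new frame: deliver it upward
                    received.append(fdata)
                    expected += 1
                # whether new or a duplicate, the receiver (re)sends an ACK,
                # which may itself be lost on the way back
                ack = lossy_send(fseq)
                if ack == seq:             # ACK arrived: send the next packet
                    break
            # frame or ACK lost: the sender times out and retransmits
    return received, attempts

delivered, attempts = stop_and_wait(["a", "b", "c"])
print(delivered)   # ['a', 'b', 'c'] despite losses, at the cost of extra attempts
```

Note how the receiver's `expected` counter filters out duplicates caused by lost ACKs, so the data is delivered exactly once and in order.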
Continuous ARQ Protocol:

- The sliding window protocol combined with automatic repeat request forms the continuous ARQ protocol. Depending on how data is retransmitted after a timeout, continuous ARQ divides into Go-Back-N ARQ and Selective Repeat ARQ.
- The sender slides its window forward by one packet position each time an acknowledgment arrives.
- The receiver generally uses cumulative acknowledgment: it need not acknowledge every packet, but can acknowledge the last in-order packet after receiving several, indicating that everything up to that point has been correctly received.
Go-Back-N ARQ#
GBN lets the sender transmit several frames in succession without stopping to wait for an acknowledgment after each one; as acknowledgments arrive, it keeps sending. This cuts waiting time and raises channel utilization. However, if one frame is received in error, that frame and all frames after it must be retransmitted.
Selective Repeat ARQ#
In contrast to Go-Back-N ARQ, receiving an erroneous frame does not force retransmission of all subsequent frames; only the erroneous frame is resent, further improving utilization. The trade-off is buffer space: correctly received out-of-order frames must be cached until everything has arrived correctly and can be acknowledged.
Flow Control#
When sending data through a TCP connection, if the sender sends data too slowly, it can lead to resource wastage; if the sender sends data too quickly, the receiver may not be able to keep up, leading to data loss. Flow control refers to sending data quickly and reasonably within the range that the receiver can handle.
Flow Control Based on Sliding Window#
When a TCP connection is established, the receiver states the size of its receive window in its acknowledgment segment, and it can dynamically adjust that size in each subsequent acknowledgment it sends. For example:
The sender sends 100 bytes of data starting from sequence number 1, and the receiver declares its receive window size as 300 bytes in the acknowledgment segment. The sender then sends 300 bytes of data, and the receiver adjusts its receive window size to 50 bytes in the acknowledgment segment. After sending 50 bytes of data, the sender receives an acknowledgment segment from the receiver, which declares the receive window as 0.
When the receive window is 0, the sender stops sending data until the receiver sends an acknowledgment segment indicating that the window size has changed. However, this acknowledgment segment may not be received by the sender, and if it is lost, both parties will be waiting, resulting in a deadlock. To prevent this situation, TCP specifies that when the receive window is 0, a persistent timer is started to periodically send probe packets to determine whether the state of the receive window being 0 has changed.
Additionally, the TCP standard specifies that when the receive window is 0, it will not accept normal data, but can accept zero window probe segments, acknowledgment segments, and segments carrying urgent data.
Congestion Control#
TCP congestion control commonly uses four algorithms: slow start, congestion avoidance, fast retransmit, and fast recovery.
Slow Start#
TCP maintains a congestion window, denoted cwnd, tied to the sender maximum segment size (SMSS). The slow start algorithm specifies that after the congestion window is initialized, each acknowledgment of a new segment increases it by one SMSS. The window is measured in bytes, but slow start grows it in SMSS units. Under slow start, the congestion window doubles after each transmission round: exponential growth.
Congestion Avoidance#
Besides the congestion window cwnd, the algorithm maintains another variable, the slow start threshold ssthresh. Once cwnd has grown exponentially to reach or exceed ssthresh, slow start stops and the congestion avoidance algorithm takes over. Congestion avoidance increases cwnd by 1/cwnd SMSS for each acknowledgment received, so instead of doubling per round as in slow start, cwnd grows by one SMSS per round: additive growth. When congestion occurs (e.g., a timeout), cwnd is set back to 1 SMSS and ssthresh to half the current window, but at least 2 segments.
In summary: Additive increase, multiplicative decrease.
Fast Retransmit#
If an individual segment is lost in the network without any congestion, the sender receives no acknowledgment for it and retransmits after a timeout, wrongly concluding that congestion has occurred and needlessly starting slow start, which reduces transmission efficiency. The fast retransmit algorithm lets the sender learn about individual segment losses sooner: it requires the receiver to send acknowledgments immediately, without delay, even for out-of-order segments. For example:
After receiving M1, the receiver sends an acknowledgment for M1; M2 is lost, and then the receiver sends repeated acknowledgments for M1 every time it receives M3, M4, and M5. The fast retransmit algorithm specifies that when three duplicate acknowledgments are received, the sender considers that the M2 segment is lost and immediately retransmits the M2 segment without waiting for a timeout to retransmit, thus avoiding the sender mistakenly believing that congestion has occurred.
Fast Recovery#
After executing the fast retransmit algorithm, the sender knows that only individual segments are lost, not that congestion has occurred. It will not execute the slow start algorithm but will execute the fast recovery algorithm: set the threshold value ssthresh = cwnd/2, and set cwnd = ssthresh + 3 SMSS. Setting the congestion window value to the threshold value plus 3 segments is because the sender has received three acknowledgment packets, indicating that three packets have left the network and reached the receiver's buffer. These three acknowledgment packets no longer occupy network resources, allowing for an appropriate increase in the size of the congestion window.
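The cwnd evolution described in the last few sections can be traced with a toy model; the units are SMSS, and the initial threshold and loss round are arbitrary illustrative values:

```python
def cwnd_trace(rounds, ssthresh=16, loss_round=8):
    """Toy trace of cwnd through slow start, congestion avoidance,
    and the fast-recovery adjustment after three duplicate ACKs."""
    cwnd, trace = 1, []
    for rnd in range(rounds):
        trace.append(cwnd)
        if rnd == loss_round:                 # three dup ACKs: fast recovery
            ssthresh = max(cwnd // 2, 2)
            cwnd = ssthresh + 3               # 3 segments have left the network
        elif cwnd < ssthresh:                 # slow start: exponential growth
            cwnd = min(cwnd * 2, ssthresh)
        else:                                 # congestion avoidance: +1 per round
            cwnd += 1
    return trace

print(cwnd_trace(12))
# [1, 2, 4, 8, 16, 17, 18, 19, 20, 13, 14, 15]
```

The trace shows the doubling phase up to ssthresh, the additive phase, and the multiplicative-decrease step at the loss round.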
Difference Between Congestion Control and Flow Control#
- Congestion control acts on the network: it prevents too much data from being injected so the network is not overloaded.
- Flow control acts on the receiver: it throttles the sender's rate so the receiver can keep up and packets are not lost.
V. Differences Between TCP and UDP#
- TCP is byte-stream-oriented, while UDP is message-oriented.
- TCP is connection-oriented (requires three-way handshake), while UDP is connectionless.
- TCP provides reliable data transmission services, with mechanisms for retransmission of lost packets and guarantees of data order; UDP may lose packets and does not guarantee data order.
- Each TCP connection can only be point-to-point; UDP supports one-to-one, one-to-many, many-to-one, and many-to-many interactive communication.
- TCP requires more system resources, while UDP requires less.
Concrete programming differences (starting from the different type parameter passed to socket()):
- UDP Server does not need to call listen and accept.
- UDP sends and receives data using sendto/recvfrom functions.
- TCP: Address information is determined during connect/accept.
- UDP: Address information must be specified each time in sendto/recvfrom functions.
- UDP: shutdown function is ineffective.
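A minimal sketch contrasting the two socket types (loopback only, hypothetical payload); note how the TCP peer's address is fixed at connect/accept time rather than per call:

```python
import socket
import threading

# TCP: SOCK_STREAM, with listen/accept on the server side.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)
addr = srv.getsockname()

def serve():
    conn, _ = srv.accept()          # peer address fixed at accept time
    conn.sendall(conn.recv(1024))   # echo back whatever arrives
    conn.close()

t = threading.Thread(target=serve)
t.start()

cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.connect(addr)                   # three-way handshake happens here
cli.sendall(b"hello")               # no per-call address: the connection holds it
reply = cli.recv(1024)
print(reply)                        # b'hello'
cli.close()
t.join()
srv.close()
```

With SOCK_DGRAM instead, there would be no listen/accept/connect, and every sendto/recvfrom call would carry the peer's address explicitly.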
VI. HTTP Protocol#
HTTP stands for Hypertext Transfer Protocol. It is the protocol for transferring hypertext (such as HTML pages) from a web server to a local browser, and it relies on the TCP/IP protocol suite for data transmission.
Message Format#
Common Request Headers#
Accept
: Tells the server the data types supported by the client.

Accept-Charset
: Tells the server the character encoding used by the client.

Accept-Encoding
: Tells the server the data compression formats supported by the client.

Accept-Language
: The client's language environment.

Host
: The hostname the client wants to access.

If-Modified-Since
: The client tells the server the timestamp of its cached copy of the resource.

Referer
: Tells the server which resource the client came from (used for anti-leeching).

User-Agent
: Tells the server the client's software environment.

Cookie
: Lets the client send data to the server.

Connection
: Whether to close the connection after this request or keep it alive.

Date
: The current time.
Common Response Headers#
Location
: The redirect address; used together with the 302 status code.

Server
: The type of server.

Content-Encoding
: The compression format of the data the server sends to the browser.

Content-Length
: The length of the data the server sends to the browser.

Content-Language
: The languages supported by the server.

Content-Type
: The type and content encoding of the data the server sends to the browser.

Last-Modified
: The last modification time of the resource on the server.

Refresh
: Timed refresh.

Content-Disposition
: Tells the browser to open the resource in download mode (used for file downloads).

Transfer-Encoding
: Tells the browser the data transfer format.

Set-Cookie
: Cookie information the server sends to the browser (used for session management).

Expires
: Tells the browser how long to cache the returned resource; -1 or 0 means do not cache.

Cache-Control
: no-cache.

Pragma
: no-cache. (The server uses these two headers to tell the browser not to cache data.)

Connection
: The connection state between server and browser: close (close the connection) or keep-alive (keep it open).
Request Status Codes#
Status Code Classification#
- 1XX- Informational, the server has received the request and requires the requester to continue operations.
- 2XX- Success, the request has been successfully received, understood, and processed.
- 3XX - Redirection, further action is needed to complete the request.
- 4XX - Client error, the request contains syntax errors or cannot be completed.
- 5XX - Server error, an error occurred on the server while processing the request.
Common Request Status Codes#
200 OK
: Request succeeded.

204 No Content
: The server successfully processed the request but returns no content.

301 Moved Permanently
: The requested resource has been moved; permanent redirect.

302 Found
: Temporary redirect.

304 Not Modified
: The client's cached copy of the resource is still the latest; the client should use its cache.

400 Bad Request
: The client request has syntax errors and cannot be understood by the server.

401 Unauthorized
: The client is not authorized to access the data.

403 Forbidden
: No permission to access the resource.

404 Not Found
: The requested resource does not exist; the URL may have been entered incorrectly.

405 Method Not Allowed
: The request method is not allowed for this resource.

406 Not Acceptable
: The content characteristics of the requested resource cannot satisfy the conditions in the request headers, so no response entity can be generated.

500 Internal Server Error
: Internal server error.

502 Bad Gateway
: The server, acting as a gateway or proxy, received an invalid response from the upstream server.

503 Service Unavailable
: The server cannot respond due to maintenance or overload.

504 Gateway Timeout
: The gateway or proxy server did not receive a response from the upstream server within the specified time.
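The 200/304 pair above drives HTTP caching, together with the If-Modified-Since request header. The sketch below shows the server-side decision as a small Python function; `respond` is a hypothetical helper, not a real API:

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def respond(last_modified, if_modified_since=None):
    """Return the status code for a conditional GET.

    last_modified: datetime when the resource last changed on the server.
    if_modified_since: the If-Modified-Since request header value, or None.
    """
    if if_modified_since is not None:
        cached = parsedate_to_datetime(if_modified_since)
        if last_modified <= cached:
            return 304  # Not Modified: the client's cache is still fresh
    return 200          # OK: send the full resource

mtime = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(respond(mtime))                          # 200 (no cache on the client)
print(respond(mtime, format_datetime(mtime)))  # 304 (cache is up to date)
```

`format_datetime` and `parsedate_to_datetime` handle the RFC-style date format that HTTP headers use.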
Differences Between Post and Get Requests#
- GET request data is appended after the URL (i.e., the data is carried in the request line of the HTTP message), with a ? separating the URL from the transmitted data and parameters joined by &. POST places the submitted data in the body of the HTTP message.
- GET can submit only a small amount of data (because it is appended to the URL), while POST can submit much larger amounts; the exact limits differ between browsers and servers.
- POST is more secure than GET: GET data is visible to everyone in the URL. Saying GET is "safe" refers only to it not modifying information — GET is used to retrieve or query resource information, while POST is used to update resource information.
- GET is idempotent, while POST is not.
- GET's encoding type is application/x-www-form-urlencoded, while POST's encoding types are application/x-www-form-urlencoded or multipart/form-data.
- GET parameters are retained in the browser history, while POST parameters are not.
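The first difference — GET parameters in the URL versus POST parameters in the body — can be sketched with Python's urllib (the URL is a placeholder):

```python
from urllib.parse import urlencode, urlsplit, parse_qs

params = {"q": "network", "page": "2"}

# GET: parameters travel in the URL after '?', joined by '&'
get_url = "http://example.com/search?" + urlencode(params)
print(get_url)  # http://example.com/search?q=network&page=2

# POST: the same data would instead be sent in the request body
post_body = urlencode(params).encode()
print(post_body)  # b'q=network&page=2'

# Either way the server can recover the original parameters
print(parse_qs(urlsplit(get_url).query))  # {'q': ['network'], 'page': ['2']}
```

Note that the encoding of the data is identical in both cases (application/x-www-form-urlencoded); only its position in the HTTP message differs.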
Idempotence#
The idempotence of an HTTP method means that a single request and multiple identical requests for a resource should have the same side effects: sending the same request once or sending it N times yields the same result on the server.

Idempotent methods: GET, HEAD, DELETE, PUT.

PUT and POST can both create or update a resource; a key difference is that PUT is idempotent while POST is not.
HTTP/1.x and HTTP/2.0#
HTTP/1.1#
- Introduced persistent connections: the TCP connection is not closed by default and can be reused for multiple requests without declaring Connection: keep-alive.
- Introduced pipelining, allowing the client to send multiple requests over the same TCP connection without waiting for each response, further improving the efficiency of the HTTP protocol.
- New methods: PUT, PATCH, OPTIONS, DELETE.
- Remaining drawback: the HTTP protocol is stateless, so each request must carry all of its header information. Many request fields are repeated, wasting bandwidth and slowing transmission.
HTTP/2.0#
- HTTP/2 is a fully binary protocol: header information and data body are both binary, collectively called "frames" (header frames and data frames). Binary framing is simpler and more robust to parse than the text format of HTTP/1.x (note that framing is not encryption; confidentiality still requires TLS).
- Multiplexes the TCP connection: the client and server can send multiple requests or responses concurrently within one connection, without matching them up in order, which avoids application-level head-of-line blocking. This bidirectional, real-time communication is called multiplexing.
- HTTP/2 allows the server to proactively push resources to the client before they are requested, known as server push.
- Introduced a header compression mechanism (HPACK): headers are compressed, and both sides maintain a table of previously sent header fields so that repeated fields need not be retransmitted.
HTTPS#
HTTPS communicates via HTTP, using the SSL/TLS protocol to encrypt data packets, making it the secure version of HTTP. The main differences from HTTP are:
- HTTPS requires obtaining a certificate from a CA.
- HTTP and HTTPS use different ports; the former is 80, while the latter is 443.
- HTTP operates over TCP, with all transmitted content in plaintext, while HTTPS operates over SSL/TLS, which runs over TCP, with all transmitted content encrypted.
- HTTPS effectively prevents operator hijacking.
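As a small illustration of the TLS side, Python's ssl module creates contexts that verify the server certificate and hostname by default — precisely the checks that make HTTPS resistant to hijacking:

```python
import ssl

# A default client-side TLS context: certificate validation is on by default
ctx = ssl.create_default_context()

print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True: the server must present a valid CA-signed certificate
print(ctx.check_hostname)                    # True: the certificate must match the requested hostname
```

To actually connect, one would wrap a TCP socket with `ctx.wrap_socket(sock, server_hostname=...)`, which performs the TLS handshake before any HTTP data is exchanged; this example stays offline and only inspects the defaults.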
VII. Process from Browser Inputting Domain Name to Displaying Page#
Event Sequence:
- DNS Domain Resolution: browser cache --> operating system cache --> local hosts file --> router cache --> ISP DNS server cache --> root / top-level / authoritative DNS servers, looking up the IP address corresponding to the domain name.
- Establishing TCP Connection: TCP uses a three-way handshake to establish a connection with the server, providing reliable transmission services.
- Sending HTTP Request: Sends the HTTP message to the server through the TCP connection.
- Server Processes Request and Returns HTTP Message.
- Browser Renders Page: The browser first parses the HTML file to construct the DOM tree, then parses the CSS file to construct the rendering tree. Once the rendering tree is complete, the browser begins to layout the rendering tree and draws it on the screen.
Protocols Involved:
- Application Layer: HTTP (WWW access protocol), DNS (domain name resolution service). DNS maintains a mapping table between domain names and IP addresses to resolve the domain name.
- Transport Layer: TCP (providing reliable data transmission for HTTP), UDP (used by DNS for transmission).
- Network Layer: IP (transmission and routing of IP packets), ICMP (error reporting during network transmission), ARP (mapping the default gateway's IP address to its physical MAC address).
VIII. Differences Between Session and Cookie#
- Cookie data is stored in the client's browser, while session data is stored on the server.
- Cookies are not very secure: others can analyze cookies stored locally and use them for cookie spoofing.
- For security-sensitive data, sessions should be used.
- Sessions are kept on the server for a certain period; as traffic grows they consume server resources, so to reduce server load, cookies should be preferred where possible.
- A single cookie cannot exceed 4 KB of data, and many browsers limit a site to at most 20 cookies.
Supplement#
Sessions rely on cookies to some extent. When the server executes the session mechanism, it generates a session ID, which is sent to the client. The client includes this ID in the HTTP request header for each request to the server, and this ID is stored on the client side, with the storage container being cookies.
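A minimal sketch of this session mechanism, with an in-memory store and illustrative names (`SESSIONID`, `server_login`, and `server_handle` are all made up for this example):

```python
import uuid

# In-memory session store on the "server" side
sessions = {}

def server_login(username):
    """Server side: create a session and return the Set-Cookie header value."""
    session_id = uuid.uuid4().hex
    sessions[session_id] = {"user": username}
    return f"SESSIONID={session_id}; HttpOnly"

def server_handle(cookie_header):
    """Server side: look up the session from the client's Cookie header."""
    pairs = dict(pair.split("=", 1) for pair in cookie_header.split("; "))
    return sessions.get(pairs.get("SESSIONID"))

set_cookie = server_login("alice")
# The browser stores the cookie and sends it back on every subsequent request:
cookie_header = set_cookie.split(";")[0]
print(server_handle(cookie_header))  # {'user': 'alice'}
```

The session ID is the only thing stored client-side; all the actual session data stays on the server, which is why sessions are safer but cost server memory.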
IX. Long Connections and Short Connections vs. WebSocket#
Differences Between Long Connections and Short Connections#
In HTTP/1.0, short connections are used by default: a connection is established for each HTTP operation and closed once the task completes. An HTTP/1.0 exchange can opt into a long connection by setting Connection: keep-alive in the header; starting from HTTP/1.1, long connections are the default.
In the case of long connections, once a webpage is fully opened, the TCP connection used for transmitting HTTP data between the client and server will not close. If the client accesses another webpage on the same server, it will continue to use the already established connection. Keep-Alive does not maintain the connection permanently; it has a keep-alive time that can be set in different server software (like Apache). Both the client and server must support long connections to implement them.
The long and short connections of the HTTP protocol are essentially the long and short connections of the TCP protocol.
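The effect of a persistent connection can be demonstrated entirely with the standard library: the sketch below starts a throwaway local HTTP/1.1 server and sends three requests over one http.client connection (all names are illustrative):

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class EchoHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # HTTP/1.1: keep-alive is the default

    def do_GET(self):
        body = self.path.encode()  # echo the requested path back
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the example output quiet
        pass

server = HTTPServer(("127.0.0.1", 0), EchoHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# One TCP connection carries several requests in turn (persistent connection)
conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
results = []
for path in ("/a", "/b", "/c"):
    conn.request("GET", path)
    resp = conn.getresponse()
    results.append((resp.status, resp.read().decode()))
conn.close()
server.shutdown()
print(results)  # [(200, '/a'), (200, '/b'), (200, '/c')]
```

With a short connection, each of the three requests would have paid for its own TCP three-way handshake; here all three share one socket.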
Application Scenarios for Long and Short Connections#
- Long connections are often used in scenarios with frequent operations and point-to-point communication where the number of connections cannot be too large. Each TCP connection requires a three-way handshake, which takes time; if every operation had to connect first, processing speed would drop significantly. By not closing the connection after each operation, the next operation can send its data directly without re-establishing a TCP connection. For example, database connections use long connections: using short connections for frequent communication would cause socket errors, and frequent socket creation also wastes resources.
- In contrast, web services generally use short connections, because long connections consume server resources. With thousands or even millions of clients connecting frequently, short connections save resources; if each of thousands of users held a long connection open, the server load would be significant. Short connections therefore suit high-concurrency scenarios where individual users do not operate frequently.
Short Polling vs. Long Polling#
Short and long polling are fundamentally different from long and short connections. Long and short connections refer to the mechanism of establishing and maintaining TCP connections between the client and server, while long and short polling refer to the way the client requests the server and the server responds.
- Short Polling: the client repeatedly sends HTTP requests to check whether the target event has completed. Advantage: simple to write. Disadvantage: wastes bandwidth and server resources.
- Long Polling: the server holds the HTTP request (via a loop, sleep, or wait) until the target event occurs or the request times out, and only then returns the HTTP response. Advantage: does not issue frequent requests when there are no messages. Disadvantage: more complex to write.
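A minimal in-process sketch of the long-polling idea, using a threading.Event to stand in for the target event (all names and timings are illustrative):

```python
import threading
import time

event_data = {"value": None}
ready = threading.Event()

def long_poll(timeout=5.0):
    """Server-side sketch: hold the request until data arrives or timeout."""
    if ready.wait(timeout):  # block here instead of returning immediately
        return {"status": 200, "data": event_data["value"]}
    return {"status": 204, "data": None}  # timed out: nothing new to report

def produce():
    time.sleep(0.2)          # some time later, the target event occurs
    event_data["value"] = "done"
    ready.set()              # wake up the held request

threading.Thread(target=produce).start()
response = long_poll()       # this call blocks for ~0.2s, then returns
print(response)  # {'status': 200, 'data': 'done'}
```

A short-polling client would instead call the equivalent of `long_poll(timeout=0)` in a loop, issuing many requests that mostly return empty.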
WebSocket#
Establishing a WebSocket Connection#
The client browser first sends an HTTP request to the server. It differs from an ordinary HTTP request by carrying some additional header information, among which "Upgrade: websocket" indicates that this is a request for a protocol upgrade; if the server agrees, it replies with status code 101 Switching Protocols, and the connection switches to the WebSocket protocol.
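The upgrade request also carries a Sec-WebSocket-Key header; the server proves it understood the handshake by returning Sec-WebSocket-Accept, computed per RFC 6455 as base64(SHA-1(key + fixed GUID)):

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the WebSocket handshake
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def websocket_accept(sec_websocket_key):
    """Compute the Sec-WebSocket-Accept value the server must return."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode()).digest()
    return base64.b64encode(digest).decode()

# Example key taken from RFC 6455 section 1.3
key = "dGhlIHNhbXBsZSBub25jZQ=="
print(websocket_accept(key))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

This challenge-response only guards against misrouted or cached responses; it is not authentication or encryption.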
Differences Between WebSocket and HTTP Long Connections#
- HTTP/1.1 uses Connection: keep-alive for long connections, with persistent connections being the default in HTTP/1.1. Multiple HTTP requests can be completed over a single TCP connection, but each request still requires sending a header separately. Keep-Alive does not maintain the connection permanently; it has a keep-alive time that can be set in different server software (like Apache).
- The long connection of WebSocket is a true full-duplex connection: after the initial TCP connection is established, both parties can send data without attaching request headers each time, and the connection persists until either the client or the server actively closes it. Unlike an HTTP long connection, WebSocket gives flexible control over when to close the connection, rather than the server simply dropping it when the Keep-Alive timeout expires (which is inflexible).
X. Reference Links#
- How to Calculate Subnet Mask in Computer Networks
- Detailed Explanation of TCP/UDP Protocols
- Summary of Computer Network Knowledge Points (Xie Xiren, 7th Edition)
- Do You Really Understand "Three-Way Handshake, Four-Way Handshake"?
- Network Learning Notes (II): Principles of TCP Reliable Transmission
- Computer Networks (5th)
- Differences Between HTTP Protocol Versions
- Clarification on Long Connections, Short Connections, Long Polling, Short Polling, and WebSocket