From entering a URL to displaying the web page

While preparing for front-end interviews recently, I found that this question comes up almost every time. I searched for information online and organized it into these notes.

The overall process is as follows:

Enter the URL
DNS resolution
Establish TCP connection
Send HTTP request
Server permanent redirection
Server processes the request and returns an HTTP response
Browser displays HTML
Connection ends

Enter the URL#

URL stands for Uniform Resource Locator. It identifies where a resource is located and how to access it. It consists of: protocol://hostname[:port]/path/;parameters?query#fragment
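As a rough illustration of how these parts map onto a concrete address, the sketch below splits a URL with Python's standard urllib.parse module. The URL itself is made up for the example.

```python
from urllib.parse import urlparse

# Hypothetical URL, used only to show each component.
url = "https://www.example.com:8080/docs/page;type=a?q=1#top"
parts = urlparse(url)

print(parts.scheme)    # protocol
print(parts.hostname)  # hostname
print(parts.port)      # optional port
print(parts.path)      # path
print(parts.params)    # ;parameters attached to the last path segment
print(parts.query)     # query
print(parts.fragment)  # fragment (never sent to the server)
```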

DNS resolution#

DNS (Domain Name System) is a distributed database on the Internet that maps domain names to IP addresses. The process of obtaining the IP address that corresponds to a hostname is called domain name resolution. In effect, DNS resolution finds out which machine holds the required resources: it acts as a translator, converting the hostname in the URL into an IP address. The lookup order is as follows:

Browser cache: Check the records the browser cached from earlier visits
Operating system cache: Check the resolver cache that the operating system keeps in memory
Hosts file: Check the hosts file on the local disk
Router cache: Some routers cache the domain names they have resolved
ISP (Internet Service Provider) DNS cache: If no local cache has the answer, the query goes to the ISP's DNS server (the local DNS server), which checks its own cache
Root DNS server: If the answer is still not found, a root server receives the request, determines which top-level domain (TLD) server manages the domain, and returns that server's IP address to the requester.

The local DNS server then queries the TLD server and, after that, the domain's authoritative name server, which holds the actual record. The local DNS server returns the IP address to the computer, and the hostname-to-IP mapping is saved in the cache.
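The lookup order above can be sketched as a chain of caches that is checked in turn before falling back to a real query. This is only a toy model: the cache contents, the `query_upstream` function, and the 192.0.2.10 address (a documentation-only range) are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

Cache = Dict[str, str]

def resolve(hostname: str,
            caches: List[Tuple[str, Cache]],
            query_upstream: Callable[[str], str]) -> Tuple[str, str]:
    """Return (ip, source), checking each cache layer in order."""
    for name, cache in caches:
        if hostname in cache:
            return cache[hostname], name
    # No layer had the answer: ask the DNS hierarchy (simulated here).
    ip = query_upstream(hostname)
    # Assumption for this sketch: only the first layer stores the new mapping.
    caches[0][1][hostname] = ip
    return ip, "upstream"

caches = [
    ("browser cache", {}),
    ("os cache", {}),
    ("hosts file", {"localhost": "127.0.0.1"}),
    ("router cache", {}),
]

def fake_upstream(hostname: str) -> str:
    # Stand-in for the ISP / root / TLD / authoritative servers.
    return {"www.example.com": "192.0.2.10"}[hostname]

first = resolve("www.example.com", caches, fake_upstream)
second = resolve("www.example.com", caches, fake_upstream)
print(first)   # answered by the simulated upstream query
print(second)  # answered by the browser cache this time
```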

Expansion:
DNS query methods:

Recursive: The client asks the local DNS server, which takes responsibility for the whole lookup. It queries other DNS servers on the client's behalf (generally the root server first, then down level by level) and returns the final result to the client.
Iterative: The local DNS server does not chase the answer itself. It returns to the client's DNS program the IP addresses of other DNS servers that can resolve the domain name, and the program then queries those servers itself (used when the local DNS server cannot answer the client's DNS query).
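In both modes somebody has to walk the chain of referrals from the root downwards; the difference is whether the local DNS server or the client does the walking. A minimal sketch of that walk, using an invented three-server hierarchy (the server names and records below are hypothetical):

```python
# Toy DNS hierarchy: each server either refers to another server or answers.
SERVERS = {
    "root":       {"referral": {"com": "tld-com"}},
    "tld-com":    {"referral": {"example.com": "ns-example"}},
    "ns-example": {"answer": {"www.example.com": "192.0.2.10"}},
}

def walk_referrals(hostname: str, start: str = "root"):
    """Follow referrals level by level; return (ip, servers_visited)."""
    server = start
    visited = []
    while True:
        visited.append(server)
        record = SERVERS[server]
        if hostname in record.get("answer", {}):
            return record["answer"][hostname], visited
        for suffix, next_server in record.get("referral", {}).items():
            if hostname == suffix or hostname.endswith("." + suffix):
                server = next_server
                break
        else:
            raise LookupError(f"{server} cannot resolve {hostname}")

ip, visited = walk_referrals("www.example.com")
print(ip, visited)
```

In a recursive query the local DNS server would run this loop and hand back only `ip`; in an iterative query the requester receives each referral and issues the next query itself.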

DNS optimization methods:

DNS caching
DNS load balancing
- Why is it needed: If every request for a resource goes to the same machine, that machine may be overloaded and crash.
- Principle: Configure multiple IP addresses for one hostname. The DNS server answers successive queries with different addresses, rotating through the recorded IP addresses in order, so that visitors are directed to different machines.
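The rotation described above is essentially round-robin. A minimal sketch, assuming a hypothetical `RoundRobinRecord` class and documentation-range addresses (real DNS servers usually rotate the order of the whole record set rather than return a single address):

```python
from itertools import cycle

class RoundRobinRecord:
    """One hostname backed by several IP addresses (illustrative only)."""

    def __init__(self, hostname, addresses):
        self.hostname = hostname
        self._addresses = cycle(addresses)  # endless rotation over the list

    def answer(self):
        # Each query gets the next address in recorded order.
        return next(self._addresses)

record = RoundRobinRecord(
    "www.example.com", ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
)
answers = [record.answer() for _ in range(4)]
print(answers)
```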

Establish TCP connection#

After obtaining the IP address, the TCP connection is established through a three-way handshake.

First handshake: The client sends a SYN (synchronize sequence numbers) packet to the server and enters the SYN_SENT state, waiting for the server to confirm.
Second handshake: After the server receives the SYN packet, it acknowledges it and sends its own SYN in the same packet, that is, a SYN+ACK packet. The server enters the SYN_RCVD state.
Third handshake: The client receives the server's SYN+ACK packet and sends an ACK packet to the server. Once this packet has been sent and received, both the client and the server are in the ESTABLISHED state.
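The three steps can be traced with a small simulation of the state changes and sequence numbers. This is a teaching sketch, not real networking code: the initial sequence numbers are arbitrary, and in practice the operating system's TCP stack performs the handshake when a socket connects.

```python
def three_way_handshake(client_isn: int, server_isn: int):
    """Return a log of (packet, client_state, server_state) per step."""
    client, server = "CLOSED", "LISTEN"
    log = []

    # 1. Client -> server: SYN carrying the client's initial sequence number.
    syn = {"flags": "SYN", "seq": client_isn}
    client = "SYN_SENT"
    log.append((syn, client, server))

    # 2. Server -> client: SYN+ACK, acknowledging client_isn + 1.
    syn_ack = {"flags": "SYN+ACK", "seq": server_isn, "ack": syn["seq"] + 1}
    server = "SYN_RCVD"
    log.append((syn_ack, client, server))

    # 3. Client -> server: ACK, acknowledging server_isn + 1.
    ack = {"flags": "ACK", "seq": syn_ack["ack"], "ack": syn_ack["seq"] + 1}
    client = "ESTABLISHED"   # after sending the ACK
    server = "ESTABLISHED"   # after receiving the ACK
    log.append((ack, client, server))
    return log

log = three_way_handshake(client_isn=100, server_isn=300)
for packet, client_state, server_state in log:
    print(packet, client_state, server_state)
```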

Expansion:
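Why three-way handshake: To prevent a stale connection request (one that was delayed in the network and is no longer valid) from suddenly reaching the server and being treated as a new connection. With the third handshake, the server opens the connection only after the client confirms that it really asked for it.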

Send HTTP request#

After the TCP connection is established, the client sends an HTTP request. An HTTP request message contains three parts:

Request line: Request method + URL + protocol/version
Request header: Transmits additional information about the request and the client itself
Request body: Data to be transmitted
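To make the three parts concrete, here is a minimal sketch that assembles a raw HTTP/1.1 request by hand. The `build_request` helper, the host, and the form data are invented for the example; real clients add more headers.

```python
def build_request(method, path, host, headers=None, body=b""):
    """Assemble request line + headers + blank line + body as bytes."""
    lines = [f"{method} {path} HTTP/1.1"]        # request line
    lines.append(f"Host: {host}")                # required in HTTP/1.1
    for name, value in (headers or {}).items():  # request headers
        lines.append(f"{name}: {value}")
    if body:
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"       # blank line ends the headers
    return head.encode("ascii") + body           # request body

raw = build_request(
    "POST", "/login", "www.example.com",
    {"Content-Type": "application/x-www-form-urlencoded"},
    b"user=alice",
)
print(raw.decode("ascii"))
```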

Server permanent redirection#

The server may respond to the browser with a 301 (Moved Permanently) response. For example, accessing http://google.com/ will automatically redirect to http://www.google.com/.

Purpose:

Search engines then treat the addresses with and without "www" as one website, so visits count toward a single ranking instead of being split, which would lower the site's position in search results.
Serving the same page from different addresses hurts caching, because a page with multiple names may be stored in the cache several times.
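On the server side, this kind of redirect amounts to comparing the requested host with one canonical hostname. The sketch below is an assumption about how such a check might look, not how any particular site implements it; `CANONICAL_HOST` and `redirect_if_needed` are hypothetical names.

```python
CANONICAL_HOST = "www.example.com"  # hypothetical canonical hostname

def redirect_if_needed(host: str, path: str):
    """Return a raw 301 response for non-canonical hosts, else None."""
    if host == CANONICAL_HOST:
        return None  # already canonical: handle the request normally
    return (
        "HTTP/1.1 301 Moved Permanently\r\n"
        f"Location: http://{CANONICAL_HOST}{path}\r\n"
        "Content-Length: 0\r\n"
        "\r\n"
    )

print(redirect_if_needed("example.com", "/about"))
```

Because the redirect is permanent, browsers may cache it and go straight to the canonical address on later visits.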

Server processes the request and returns an HTTP response#

After receiving the TCP packets on its listening port, the backend reassembles the TCP data, parses the HTTP message, and wraps it in an HTTP Request object for the upper layers to use. The HTTP response consists of four parts:

Status line: Protocol version, status code, status description
Response header: Consists of key-value pairs, with each pair on a separate line separated by ":"
Blank line: Separates the response headers from the response body
Response body
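The four parts can be picked out of a raw response with a few string operations. This is a simplified sketch: `parse_response` is a made-up helper, the sample response is invented, and details such as chunked encoding or repeated headers are ignored.

```python
def parse_response(raw: bytes) -> dict:
    """Split a raw HTTP response into status line, headers and body."""
    # The blank line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "version": version,
        "status": int(code),
        "reason": reason,
        "headers": headers,
        "body": body,
    }

raw_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 14\r\n"
    b"\r\n"
    b"<h1>Hello</h1>"
)
response = parse_response(raw_response)
print(response["status"], response["headers"], response["body"])
```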

Expansion:
In larger websites, requests first go to a reverse proxy, and the same application is deployed on multiple servers so that a large number of user requests can be distributed across machines. For example, the client sends the request to Nginx, Nginx forwards it to an application server, and the result is returned to the client through Nginx.

Browser displays HTML#

Displaying HTML in the browser is a process of parsing and rendering. The general process is as follows:

Parse the HTML file to build the DOM tree
Parse the CSS to build the CSSOM tree, and combine it with the DOM tree to build the render tree
Lay out the render tree (compute the position and size of each node) and paint it on the screen
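The first step, turning HTML text into a tree, can be imitated with Python's standard html.parser module. This is only a toy: `DomBuilder` is a hypothetical class, it ignores void elements such as <br> and malformed markup, and a real browser parser does far more.

```python
from html.parser import HTMLParser

class DomBuilder(HTMLParser):
    """Build a nested dict tree from well-formed HTML (toy example)."""

    def __init__(self):
        super().__init__()
        self.root = {"tag": "#document", "children": []}
        self._stack = [self.root]  # currently open elements

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "attrs": dict(attrs), "children": []}
        self._stack[-1]["children"].append(node)
        self._stack.append(node)

    def handle_endtag(self, tag):
        # Close the element only if it matches the innermost open one.
        if len(self._stack) > 1 and self._stack[-1]["tag"] == tag:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            self._stack[-1]["children"].append({"tag": "#text", "text": text})

def build_dom(html: str) -> dict:
    builder = DomBuilder()
    builder.feed(html)
    builder.close()
    return builder.root

dom = build_dom(
    "<html><body><h1 class='title'>Hello</h1><p>World</p></body></html>"
)
body = dom["children"][0]["children"][0]
print([child["tag"] for child in body["children"]])
```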

Expansion:
About reflow and repaint:

Each node in the DOM tree is laid out as a box (the box model). The process in which the browser calculates each box's position, size, and other geometric attributes is called reflow.
Once these attributes are determined, the browser draws the content. This drawing process is called repaint.

Reflow and repaint are unavoidable during the initial page load, but both are expensive and should be triggered as rarely as possible afterwards.

JS parsing and execution mechanism:
When the parser encounters a script during parsing, it pauses HTML parsing (and therefore rendering) until the JS file has been loaded and executed, because JS may modify the DOM, for example with document.write. This is why JS is usually placed at the end of the HTML. JS is parsed and executed by the browser's JS engine. JS is single-threaded, so time-consuming work such as I/O reads and writes must not block everything behind it. This requires a mechanism that lets later tasks run first while the slow ones are pending, which is the distinction between synchronous and asynchronous tasks.
The execution mechanism of JS can be seen as a main thread plus a task queue.
Synchronous tasks run on the main thread, where they form an execution stack;
Asynchronous tasks do not run on the main thread right away; when one of them has a result, an event (its callback) is placed in the task queue;
The engine first runs everything on the stack, then takes events from the task queue and runs the corresponding callbacks.
This process repeats continuously and is called the event loop.
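The "main thread plus task queue" model can be imitated in a few lines. The sketch below is written in Python rather than JS and is only an analogy: `run_event_loop` and `set_timeout` are hypothetical stand-ins, and real engines also distinguish microtasks, timers, and rendering steps.

```python
from collections import deque

def run_event_loop(script):
    """Run the synchronous script first, then drain the task queue."""
    task_queue = deque()
    output = []

    def set_timeout(callback):
        # Stand-in for an async API: the callback is queued, not run now.
        task_queue.append(callback)

    # 1. Synchronous tasks run to completion on the "main thread".
    script(output, set_timeout)

    # 2. Only then are queued callbacks taken out and run, one at a time.
    while task_queue:
        callback = task_queue.popleft()
        callback()
    return output

def script(output, set_timeout):
    output.append("sync 1")
    set_timeout(lambda: output.append("async callback"))
    output.append("sync 2")

order = run_event_loop(script)
print(order)
```

Even though the callback is registered between the two synchronous statements, it runs only after both of them have finished.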

Connection ends#

Nowadays, to reduce request latency, the TCP connection is usually kept alive and is closed only later, for example when the current page is closed. The connection is closed with a four-way handshake (here the client is the side that closes first):

The client sends a FIN and enters the FIN_WAIT_1 state.
The server receives the FIN and sends an ACK to the client, with the acknowledgment number set to the received sequence number + 1. The server enters the CLOSE_WAIT state, and the client enters the FIN_WAIT_2 state when the ACK arrives.
The server sends a FIN packet once it has no more data to send, and enters the LAST_ACK state.
The client receives the FIN, sends an ACK to the server, and enters the TIME_WAIT state. The server enters the CLOSED state after receiving this ACK; the client stays in TIME_WAIT for a while to make sure the ACK arrived, then also enters the CLOSED state.
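The four steps of closing a connection can be traced with the same kind of simulation as the handshake. Again this is a sketch with arbitrary sequence numbers, assuming the client is the side that closes first; the real state machine lives in the operating system's TCP stack.

```python
def four_way_close(client_seq: int, server_seq: int):
    """Return a log of (flag, fields, client_state, server_state)."""
    client, server = "ESTABLISHED", "ESTABLISHED"
    log = []

    # 1. Client -> server: FIN. The client stops sending data.
    client = "FIN_WAIT_1"
    log.append(("FIN", {"seq": client_seq}, client, server))

    # 2. Server -> client: ACK of the FIN (received sequence number + 1).
    server = "CLOSE_WAIT"
    client = "FIN_WAIT_2"  # entered when the ACK arrives
    log.append(("ACK", {"ack": client_seq + 1}, client, server))

    # 3. Server -> client: FIN, once the server has nothing left to send.
    server = "LAST_ACK"
    log.append(("FIN", {"seq": server_seq}, client, server))

    # 4. Client -> server: final ACK. The client later leaves TIME_WAIT
    #    and becomes CLOSED as well (the waiting period is not modeled).
    client = "TIME_WAIT"
    server = "CLOSED"      # entered when the ACK arrives
    log.append(("ACK", {"ack": server_seq + 1}, client, server))
    return log

close_log = four_way_close(client_seq=500, server_seq=900)
for step in close_log:
    print(step)
```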