How a Browser Displays a Web Page

A web page does not simply appear the moment you enter a URL into the browser.

In reality, several steps happen in order: the browser interprets the URL, obtains an IP address through DNS, connects to the server, establishes a secure connection with HTTPS, sends an HTTP request, receives the server's response, and renders the page.

However, not every process necessarily starts from the beginning every time. DNS results may be cached, an existing connection may be reused, or data may be loaded from the browser cache.

Rather than going deep into detailed specifications, this article walks through what happens before a web page is displayed, following the actual flow of communication.

What Happens After You Enter a URL?

For example, suppose you enter the string "https://example.com" in the browser address bar.

The browser first checks the entered string.

It determines whether it is a search keyword or a website URL, and if it is a URL, it interprets the destination.

A URL mainly contains the following kinds of information.

| Element | Example | Meaning |
| --- | --- | --- |
| Scheme | https | Shows which method is used for communication |
| Hostname | example.com | Shows the name of the website you want to connect to |
| Port number | 443 | Shows which communication endpoint to connect to. It is often omitted |
| Path | /about | Shows which location on the server is being requested |
| Query string | ?id=10 | Passes additional information to the server |
| Fragment | #section | Shows a position within the page. Usually not sent to the server |
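
As a rough illustration, the elements in the table can be pulled out of a URL with Python's standard `urllib.parse` module. The helper name `describe_url` and the `DEFAULT_PORTS` table are assumptions made for this sketch; a real browser's URL parser handles many more cases.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes discussed in this article.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe_url(url: str) -> dict:
    """Split a URL into scheme, hostname, port, path, query, and fragment."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "hostname": parts.hostname,
        # parts.port is None when the URL omits the port, so fall back
        # to the default port for the scheme.
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path or "/",
        "query": parts.query,
        "fragment": parts.fragment,
    }


info = describe_url("https://example.com/about?id=10#section")
for name, value in info.items():
    print(f"{name}: {value}")
```

Note that the fragment is available to the parser on the client side even though it is usually not sent to the server.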

A URL is a format that makes it easy for humans to specify a website.

However, when looking for the server to communicate with on a network, an IP address is ultimately necessary.

For that reason, the browser proceeds to look up the IP address from the host name.

Looking Up the IP Address With DNS

The browser cannot connect to a server using only the name "example.com."

DNS is the mechanism that fills this gap.

DNS maps domain names to IP addresses.

The browser or OS first checks whether DNS results remain locally. If you have accessed the same domain in the past, it may be possible to obtain the IP address from the DNS cache.

If there is no information in the cache, it queries a DNS resolver. The DNS resolver looks for the IP address corresponding to the target domain name, querying multiple DNS servers as needed.

| Step | What happens | Purpose |
| --- | --- | --- |
| 1 | The browser or OS checks the cache | Check whether a previously looked-up IP address can be reused |
| 2 | Query the DNS resolver | Look up the IP address corresponding to the domain name |
| 3 | Obtain an A record or AAAA record | Get the IPv4 or IPv6 destination |
| 4 | Decide the destination using the IP address | Identify the actual communication partner |

The IP address obtained here is not necessarily fixed.

Even for the same domain name, different IP addresses may be returned depending on region, DNS settings, load balancing, CDN, and other factors.
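
The cache-then-resolver order described above can be sketched in a few lines of Python. This is a simplified model, not how any particular browser or OS implements it: the class name `DnsCache`, the fixed `ttl_seconds` value, and the injectable `lookup` and `clock` parameters are all assumptions for illustration. Real caches use the TTL delivered with each DNS record.

```python
import socket
import time


def system_lookup(hostname: str) -> list[str]:
    """Ask the OS resolver for the IPv4/IPv6 addresses of a host name."""
    results = socket.getaddrinfo(hostname, None, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return sorted({entry[4][0] for entry in results})


class DnsCache:
    """Remember lookup results for a fixed time (hypothetical fixed TTL)."""

    def __init__(self, lookup=system_lookup, ttl_seconds=60.0, clock=time.monotonic):
        self._lookup = lookup
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # hostname -> (expires_at, addresses)

    def resolve(self, hostname: str) -> list[str]:
        now = self._clock()
        cached = self._entries.get(hostname)
        if cached and cached[0] > now:
            # Step 1: a previous answer is still valid, so reuse it.
            return cached[1]
        # Steps 2 and 3: no usable cache entry, so query the resolver.
        addresses = self._lookup(hostname)
        self._entries[hostname] = (now + self._ttl, addresses)
        return addresses
```

Calling `DnsCache().resolve("example.com")` twice in a row would trigger only one real lookup; the second call is answered from the cache until the entry expires.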

Connecting to the Server

Once the IP address is known, the browser connects to the server that has that IP address.

In web communication, not only the destination IP address but also the port number matters.

Normally, HTTP uses port 80, and HTTPS uses port 443.

| Communication method | Main port number | Meaning |
| --- | --- | --- |
| HTTP | 80 | Often used for unencrypted web communication |
| HTTPS | 443 | Often used for web communication protected by TLS |

With HTTP/1.1 and HTTP/2, TCP is normally used for connections to servers.

By contrast, HTTP/3 uses a mechanism called QUIC, which runs over UDP.

This article does not go into the details of HTTP/2 or HTTP/3, but the important point is that the browser uses an IP address and port number to create a connection with the server.

Also, a new connection is not necessarily created every time. If a connection to the same server already remains, the browser may reuse the existing connection.
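
Connection reuse can be pictured as a lookup table keyed by scheme, host, and port. The following is a minimal sketch under that assumption; the `ConnectionPool` class and its `connect` parameter are hypothetical names, and real browsers add limits, timeouts, and protocol-specific rules on top.

```python
DEFAULT_PORTS = {"http": 80, "https": 443}


class ConnectionPool:
    """Keep one open connection per (scheme, host, port) and hand it out again."""

    def __init__(self, connect):
        # connect is any callable (host, port) -> connection object.
        self._connect = connect
        self._open = {}  # (scheme, host, port) -> connection

    def get(self, scheme: str, host: str, port=None):
        port = port or DEFAULT_PORTS[scheme]
        key = (scheme, host, port)
        if key not in self._open:
            # No existing connection to this destination, so create one.
            self._open[key] = self._connect(host, port)
        return self._open[key]
```

For a real TCP connection, `connect` could be something like `lambda host, port: socket.create_connection((host, port))`. The pool itself does not care how the connection is made, which is also why the sketch is easy to try without touching the network.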

Creating a Secure Connection With HTTPS

If the URL starts with "https," the browser communicates with the server using HTTPS.

HTTPS is communication in which HTTP is protected by TLS.

TLS mainly has the following roles.

| Role | Meaning | Importance |
| --- | --- | --- |
| Encryption | Makes communication content harder for third parties to read | Protects passwords and page content |
| Tamper detection | Checks whether content has been changed during communication | Makes it easier to prevent modification by a man-in-the-middle attacker |
| Server identity verification | Checks whether the destination is the server for the intended domain | Helps defend against fake sites and man-in-the-middle attacks |

The important point here is that HTTPS is not simply "a mechanism that encrypts communication."

The browser checks the certificate presented by the server and verifies whether that certificate is valid for the domain name being accessed.

Through this check, the browser can judge that "the server I am connected to is likely the intended server."

However, even when HTTPS is used, not all information about the communication is hidden.

For example, HTTPS alone cannot completely hide the destination IP address, the timing of the communication, or the amount of data transferred.

Also, depending on the environment, the destination domain name may be visible along the network path, for example through unencrypted DNS queries or the Server Name Indication (SNI) field of the TLS handshake.

In other words, HTTPS is very important for protecting communication content and verifying the other party, but it is not a mechanism that fully guarantees anonymity.
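
The certificate check described in this section can be tried with Python's standard `ssl` module. This sketch is not a model of browser internals; the function names `make_tls_context` and `fetch_certificate_subject` are assumptions for illustration. It only shows that a client verifies the certificate chain and matches the certificate against the host name it intended to reach.

```python
import socket
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Create client-side TLS settings that verify the server's identity."""
    context = ssl.create_default_context()
    # Both settings are already the defaults of create_default_context();
    # they are repeated here to show which checks matter.
    context.check_hostname = True            # certificate must match the host name
    context.verify_mode = ssl.CERT_REQUIRED  # certificate chain must be trusted
    return context


def fetch_certificate_subject(hostname: str, port: int = 443, timeout: float = 5.0):
    """Connect, complete the TLS handshake, and return the certificate subject."""
    context = make_tls_context()
    with socket.create_connection((hostname, port), timeout=timeout) as raw:
        # server_hostname is the name the certificate is checked against.
        with context.wrap_socket(raw, server_hostname=hostname) as tls:
            return tls.getpeercert()["subject"]
```

If the certificate is not valid for the requested name, `wrap_socket` raises an `ssl.SSLCertVerificationError` instead of returning a connection, which corresponds to the warning page a browser would show.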

Requesting a Page With HTTP/HTTPS

Once a secure connection is created, the browser requests the page from the server.

This request is called an HTTP request.

With HTTPS, the contents of the HTTP request are sent while protected by TLS.

For example, the browser may send the following kinds of information to the server.

| Information | Meaning | Note |
| --- | --- | --- |
| Method | Which operation it wants to perform | GET is often used for page retrieval |
| Path | Which page or data it wants | Such as / or /about |
| Host | Which domain the request is for | Also used when multiple sites are handled by one IP |
| User-Agent | Information such as browser and OS | Used for environment detection and display adjustment |
| Accept-Language | Preferred language | Used for decisions such as displaying Japanese pages |
| Referer | Which page the request came from | May not be sent depending on settings or specifications |
| Cookie | Identifying information stored for the site | Used to maintain login state and settings |

The server looks at this request and decides what data to return.

Even when accessing the same URL, the returned content may change depending on login state, cookies, language settings, device type, and other factors.
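
To make the request concrete, here is a sketch that builds the text of a simple HTTP/1.1 GET request. The function name `build_get_request` and the `ExampleBrowser/1.0` User-Agent value are made up for illustration. HTTP/2 and HTTP/3 carry the same kind of information in a binary format, so this text form applies to HTTP/1.1 only.

```python
from typing import Optional


def build_get_request(host: str, path: str = "/", headers: Optional[dict] = None) -> bytes:
    """Build the bytes of a minimal HTTP/1.1 GET request."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # A blank line marks the end of the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_get_request(
    "example.com",
    "/about",
    {"User-Agent": "ExampleBrowser/1.0", "Accept-Language": "en-US,en;q=0.9"},
)
print(request.decode("ascii"))
```

With HTTPS, these bytes are written to the TLS connection rather than to the raw socket, so an observer on the network path does not see the path or headers in plain text.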

The Server Returns Data

After receiving a request from the browser, the server returns an HTTP response.

An HTTP response includes a status code, headers, body, and other parts.

| Element | Example | Meaning |
| --- | --- | --- |
| Status code | 200 | Shows whether the request succeeded, whether there was an error, and similar information |
| Header | Content-Type, Cache-Control | Shows the data type, cache policy, and similar information |
| Body | HTML and similar data | The actual data used by the browser |
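
The three parts in the table can be separated from a raw HTTP/1.1 response with a short parser. This is a deliberately simplified sketch: the `parse_response` name and the sample response are assumptions, and it ignores chunked transfer encoding, compression, and repeated header names that a real client must handle.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP/1.1 response into status code, headers, and body."""
    # The header section ends at the first blank line.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    # The status line looks like "HTTP/1.1 200 OK".
    status_code = int(status_line.split(" ", 2)[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body


sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Cache-Control: max-age=3600\r\n"
    b"\r\n"
    b"<!doctype html><title>Example</title>"
)
status, headers, body = parse_response(sample)
print(status, headers["content-type"], len(body))
```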

HTML is central to a web page, but HTML alone does not necessarily complete the whole page.

Many web pages are displayed by combining multiple kinds of data.

| Data | Role |
| --- | --- |
| HTML | Represents the page structure |
| CSS | Specifies appearance such as colors, spacing, layout, and text size |
| JavaScript | Performs processing and dynamic changes on the page |
| Images | Displays photos, icons, diagrams, and similar content |
| Fonts | Adjusts how text looks |
| Video and audio | Plays media content |

The browser first receives HTML, then additionally obtains the CSS, JavaScript, images, fonts, and other resources specified in that HTML.

For that reason, even opening a single web page often causes multiple HTTP requests in practice.
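
One way to see why a single page leads to several requests is to list the resources an HTML document refers to. The sketch below uses Python's standard `html.parser`; the `ResourceCollector` class, the sample page, and the host `cdn.example.net` are hypothetical. It only looks at stylesheets, scripts, and images, whereas a real browser discovers many more resource types.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collect URLs of stylesheets, scripts, and images referenced by a page."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and attributes.get("rel") == "stylesheet":
            target = attributes.get("href")
        elif tag in ("script", "img"):
            target = attributes.get("src")
        else:
            target = None
        if target:
            # Resolve relative references against the page URL.
            self.resources.append(urljoin(self.base_url, target))


page = """<html><head>
<link rel="stylesheet" href="/style.css">
<script src="https://cdn.example.net/app.js"></script>
</head><body><img src="images/logo.png"></body></html>"""

collector = ResourceCollector("https://example.com/about")
collector.feed(page)
for url in collector.resources:
    print(url)
```

Each collected URL corresponds to a further HTTP request unless the resource can be served from a cache.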

The Browser Displays It on Screen

When the browser receives data from the server, it converts that data into a form that can be displayed on screen.

Broadly, the following kinds of processing happen.

| Step | What happens |
| --- | --- |
| Parse HTML | Read the page structure |
| Apply CSS | Reflect appearance rules |
| Calculate layout | Decide where each element is placed |
| Paint | Display text, images, backgrounds, and similar content on screen |
| Run JavaScript | Change page content or behavior as needed |

The point to understand here is that the browser is not receiving a finished screen from the server.

The server returns materials such as HTML, CSS, JavaScript, and images.

The browser interprets them and assembles the page according to its own screen size, settings, and supported features.

For that reason, even the same web page may be displayed differently depending on PC, smartphone, browser type, screen width, and settings.

One Page Can Still Cause Multiple Communications

An important point about web page display is that opening one page does not necessarily mean communicating with only one destination.

Even if the page body is obtained from "example.com," images, ads, analytics, external fonts, external scripts, and similar resources may be loaded from other domains.

| Communication destination | Purpose | Example |
| --- | --- | --- |
| Page body server | Obtain HTML | Body text and page structure |
| Image delivery server | Obtain images | Article images and icons |
| Advertising server | Display ads | Banner ads and ad scripts |
| Analytics server | Measure access status | Analytics tags |
| External font server | Load fonts | Web fonts |
| External script server | Add functionality | Embedded widgets and UI parts |

From the user's view, it may look like they only "opened one page," but behind the scenes the browser may be communicating with multiple servers.
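
Grouping a page's resource URLs by host name gives a rough picture of how many destinations are involved. In this sketch the function name `group_by_host` and all host names other than example.com are hypothetical. It compares exact host names, which is simpler than the "same site" rules browsers actually apply.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_host(page_url: str, resource_urls: list):
    """Group resource URLs by host and list the hosts other than the page's own."""
    page_host = urlsplit(page_url).hostname
    grouped = defaultdict(list)
    for url in resource_urls:
        grouped[urlsplit(url).hostname].append(url)
    other_hosts = sorted(host for host in grouped if host != page_host)
    return dict(grouped), other_hosts


page_url = "https://example.com/article"
resources = [
    "https://example.com/style.css",
    "https://img.example.net/photo.jpg",
    "https://ads.example.org/banner.js",
    "https://img.example.net/icon.png",
]
grouped, other_hosts = group_by_host(page_url, resources)
print("Hosts other than the page itself:", other_hosts)
```

Every host in `other_hosts` learns at least that a request was made to it, along with whatever headers the browser sends to that host.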

Additional communication may also occur not only immediately after opening the page, but also in response to scrolling, button operations, searches, form input, video playback, and similar actions.

Many modern web pages are also designed to load only the minimum data at first and fetch more in response to user actions.

Communication May Be Skipped Because of Caches

When displaying a web page, the browser does not necessarily obtain all data from the server every time.

The browser may temporarily store images, CSS, JavaScript, and other data obtained in the past.

This is called a cache.

When the cache is valid, the browser may use data stored inside the device without obtaining the same data from the server again.

| Target | Effect of cache |
| --- | --- |
| DNS results | Reduces the need to look up the IP address from the domain name again |
| Images | Avoids downloading the same image many times |
| CSS | Reuses files related to page appearance |
| JavaScript | Uses the same script without fetching it again |
| Connection | Existing connections may be reused |

Caches are important for increasing display speed and reducing communication volume.

However, when thinking about anonymity and privacy, it is also necessary to pay attention to information that remains inside the browser, such as caches and cookies.
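
Whether a stored copy is still "valid" is commonly decided from the `Cache-Control` response header. The following sketch covers only the `max-age`, `no-store`, and `no-cache` directives; the function names `max_age_seconds` and `can_reuse` are assumptions for illustration, and real browsers also apply revalidation and heuristic freshness rules that are left out here.

```python
import re
from typing import Optional


def max_age_seconds(cache_control: str) -> Optional[int]:
    """Return the max-age value from a Cache-Control header, if present."""
    match = re.search(r"(?:^|,)\s*max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def can_reuse(cache_control: str, age_seconds: float) -> bool:
    """Decide whether a stored response may be used without contacting the server."""
    directives = {part.strip().lower() for part in cache_control.split(",")}
    # no-store: the response should not be stored at all.
    # no-cache: the stored copy must be revalidated with the server first.
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = max_age_seconds(cache_control)
    # Without max-age this sketch conservatively refuses to reuse the copy.
    return max_age is not None and age_seconds < max_age


print(can_reuse("max-age=3600", age_seconds=120))
print(can_reuse("max-age=3600", age_seconds=4000))
```

The first call describes a copy stored two minutes ago with a one-hour lifetime, so it can be reused; the second describes a copy older than its lifetime, so the browser would go back to the server.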

Important Viewpoints for Thinking About Anonymity

When thinking about anonymity, it is important to remember that opening a web page is not a single, one-time communication.

Web access involves multiple elements, including DNS, IP addresses, HTTPS, cookies, User-Agent, Referer, external scripts, and caches.

Even when communication content is protected by HTTPS, other factors remain separate concerns: the destination IP address, communication timing, communication volume, headers sent by the browser, identification through cookies, and additional communication to external servers.

| Viewpoint | What to check |
| --- | --- |
| DNS | Which domain name was queried |
| IP address | Which server was connected to |
| HTTPS | Whether communication content encryption, tamper detection, and server identity verification are performed |
| Cookie | Whether the user may be identified as the same user |
| User-Agent | Whether environment information such as browser and OS is being sent |
| Referer | Whether the page the user came from is communicated |
| External communication | Whether communication also occurs with servers other than the page body |
| Cache | Whether past browsing or obtained data remains inside the device |

It is especially important to note that "communication content is encrypted" and "no one can tell who accessed where" are not the same thing.

HTTPS is very important for protecting communication content and verifying the identity of the server you connect to.

However, when thinking about anonymity, you need to look separately not only at HTTPS, but also at DNS, IP addresses, information sent by the browser, cookies, external communication, and other factors.

Summary

Before a browser displays a web page, multiple processes happen step by step.

First, the browser interprets the entered URL and checks the host name.

Next, DNS is used to look up the IP address corresponding to the host name.

Once the IP address is known, the browser connects to the server.

For HTTPS, TLS performs encryption, tamper detection, and server identity verification, creating a secure connection.

On that basis, the browser sends an HTTP request, and the server returns data such as HTML, CSS, JavaScript, and images.

The browser parses the received data and displays it on screen.

In addition, while displaying one page, additional communication may occur through images, ads, analytics tags, external fonts, external scripts, and similar resources.

Opening a web page involves more communication and processing than it appears to from the outside.

Understanding this flow makes it easier to organize not only the basic structure of the web, but also the points to examine when thinking about anonymity and privacy.

Related Tools

DNS Leak Test

DNSLeakTest

An external resource related to this article. Open it only when it fits your situation and threat model.

Why it is listed: It can help with the article topic, but it is outside Anonymity Sense and should be checked before use.

URL : https://www.dnsleaktest.com/

WebRTC Leak Test

BrowserLeaks WebRTC

An external resource related to this article. Open it only when it fits your situation and threat model.

Why it is listed: It can help with the article topic, but it is outside Anonymity Sense and should be checked before use.

URL : https://browserleaks.com/webrtc
