The trick behind the magic: how your web browser works

Lucia Rodriguez
6 min read · Sep 2, 2019

The Internet’s impact on our lives is enormous. Since the ARPANET project, the Internet, its network protocols, and its capabilities have kept growing, and it has changed everything for us.

But how does it work? I’ll explain it through an everyday task: browsing the web. In this example, my goal is to describe what happens when you type https://www.holbertonschool.com into your web browser.

First of all, what is a web browser?

We’re used to surfing and enjoying Internet content, most of which is very visual. Sometimes we run into a raw HTML document and prefer to close the page. The truth is that all our beloved, beautiful websites and web pages are written in HTML, and one of a browser’s jobs is to translate this code into graphics, animations, fonts, and colors, to mention just a few attributes.

So… let’s start at the very beginning: the address bar of your favorite web browser.

But what does https://www.holbertonschool.com mean? Let’s dissect it.

https:// This part is the scheme, which usually names the protocol to use. Here it means that we’ll communicate with the destination server using HTTPS. HTTP stands for Hypertext Transfer Protocol, and it is the basis for transferring content on the web. The S in HTTPS stands for Secure: by typing it, you ask for encrypted communication between your machine and the remote server. HTTPS gets its encryption from TLS (Transport Layer Security), the successor of the now-deprecated SSL; both were designed to protect your data while it is being sent or received. The double slash right after “https:” introduces the authority component, the part of the URL that names the server we want to reach. In this case, the authority is just the hostname.

www. This part means that we want to talk to the www subdomain of the domain we want to visit. When you buy a domain, you can configure subdomains to direct your visitors to a specific server or a specific part of your site.

holbertonschool.com: This is the main domain we want to visit. Together with the www subdomain, it forms the full hostname, www.holbertonschool.com, of the server we want to communicate with.
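To make the dissection concrete, here is a minimal sketch that splits the same URL with Python’s standard library. The DEFAULT_PORTS table is an assumption added for illustration; it only lists the well-known default ports for the two schemes.

```python
from urllib.parse import urlsplit

# Well-known default ports, used when the URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}

url = "https://www.holbertonschool.com"
parts = urlsplit(url)

scheme = parts.scheme                      # the protocol to use
hostname = parts.hostname                  # the authority, here just a hostname
port = parts.port or DEFAULT_PORTS[scheme] # no explicit port, so fall back
labels = hostname.split(".")               # subdomain, domain, top-level domain

print("scheme:  ", scheme)
print("hostname:", hostname)
print("port:    ", port)
print("labels:  ", labels)
```

Reading the labels from right to left gives the top-level domain, the domain that was bought, and the www subdomain configured on it.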

After this you hit ENTER!

Setting coordinates…

The first thing our web browser does after we hit Enter is look for a way to communicate with the holbertonschool.com server. Although it is fast most of the time, it’s not that easy.

To visit anything on the Internet, you need its Internet Protocol (IP) address, a numerical label assigned to each device connected to the network. But we’re not machines, and remembering the IP address of every site we want to visit is difficult. That’s why the Domain Name System (DNS) was created: it translates domain names, which are easy to remember, into IP addresses, which are not.

So the first task in visiting a server is to find out which door to knock on. For this, our web browser looks up DNS records. It starts with its own cache and, if that fails, moves on to the operating system cache, our router’s cache, and our Internet Service Provider’s (ISP) DNS cache. If there is still no answer, the lookup goes to the root DNS servers. They don’t know every domain themselves, but they point to the servers for each top-level domain (such as .com), which in turn point to the authoritative name servers that hold the domain’s records and IP addresses.

DNS queries are mostly sent over the User Datagram Protocol (UDP) because of its speed. The answer comes back the same way unless it would be larger than 512 bytes, the traditional limit for DNS over UDP, in which case the Transmission Control Protocol (TCP) is used instead. If the answer contains an IP address for that domain, our web browser gets everything ready to set sail.
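As a rough illustration of how small a DNS question is, the sketch below builds the bytes of a query for our hostname by hand. It follows the standard DNS message layout; the function name build_dns_query and the query ID 0x1234 are arbitrary choices made for this example, and nothing is sent over the network.

```python
import struct

def build_dns_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query asking for an A record (IPv4 address)."""
    # Header: ID, flags (0x0100 = standard query, recursion desired),
    # then four counters: 1 question, 0 answers, 0 authority, 0 additional.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: every label is prefixed with its length,
    # and the whole name ends with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.split(".")
    ) + b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = IN (Internet).
    question = qname + struct.pack("!HH", 1, 1)
    return header + question

query = build_dns_query("www.holbertonschool.com")
print(len(query), "bytes, well under the 512-byte UDP limit")
print(query.hex())
```

Sending these bytes in a UDP datagram to a resolver on port 53 would bring back a reply in the same binary format, with the answer records appended.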

TCP, IP, HTTP, UDP? Too many to remember!

Honestly, yes. But all of them belong to the Internet protocol suite, a set of protocols organized in four layers (application, transport, internet, and link) whose goal is to ensure communication at different levels and in different ways.

Sailing!

Web surfing works a lot like quid pro quo: you give me something, and I give you something in exchange. In this case, that something is a set of headers. For web browsers, headers are like statements of intent: the browser sends some and gets some in response. We chose to protect our data when talking to the remote server, so before sending the HTTP request, the browser first performs a handshake with the server to agree on the keys that will secure our data. The server holds a private key and shares the matching public key inside its certificate; using them, both sides derive shared session keys to encrypt and decrypt the content. The certificate we get from the web server during this process is the proof that the server is who it claims to be.
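The handshake can be sketched with Python’s ssl module. This is only an illustration of the idea, not what a browser runs internally: the helper name fetch_certificate_subject is made up for this example, and it is defined but not called here because it needs network access.

```python
import socket
import ssl

# A default client context checks the server certificate against the
# system's trusted authorities and checks that it matches the hostname.
context = ssl.create_default_context()

def fetch_certificate_subject(hostname: str, port: int = 443) -> dict:
    """Do a TLS handshake and return the subject of the server certificate."""
    with socket.create_connection((hostname, port), timeout=5) as raw_sock:
        # wrap_socket performs the handshake; it raises
        # ssl.SSLCertVerificationError if the certificate cannot be trusted.
        with context.wrap_socket(raw_sock, server_hostname=hostname) as tls_sock:
            cert = tls_sock.getpeercert()
            # "subject" is a tuple of name groups, each holding (key, value) pairs.
            return dict(group[0] for group in cert["subject"])

print("certificate required:", context.verify_mode == ssl.CERT_REQUIRED)
print("hostname checked:    ", context.check_hostname)
```

Calling fetch_certificate_subject("www.holbertonschool.com") on a connected machine would return the names the certificate was issued for, which is the proof of identity mentioned above.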

So, headers look similar to this:
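As a rough sketch, the snippet below assembles the header block a browser could send and parses the header block a server could answer with. The header values (the example-browser user agent and the sample response) are made-up placeholders for illustration, not traffic captured from the real site.

```python
def build_request(method: str, host: str, path: str = "/") -> str:
    """Assemble the text of a minimal HTTP/1.1 request."""
    lines = [
        f"{method} {path} HTTP/1.1",        # method, resource, protocol version
        f"Host: {host}",                    # which site we want on that server
        "User-Agent: example-browser/1.0",  # placeholder browser name
        "Accept: text/html",                # the type of content we prefer
        "Connection: close",
    ]
    # Lines end with CRLF; an empty line closes the header block.
    return "\r\n".join(lines) + "\r\n\r\n"

def parse_response_head(head: str):
    """Split a response header block into status code, reason, and headers."""
    status_line, *header_lines = head.strip().split("\r\n")
    _version, status, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(status), reason, headers

request = build_request("GET", "www.holbertonschool.com")
sample_response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html; charset=utf-8\r\n"
    "Content-Length: 1234\r\n"
    "\r\n"
)
status, reason, headers = parse_response_head(sample_response)

print(request)
print(status, reason, headers)
```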

But before we get a response, our data takes a little trip.

First stop: the firewall

Firewalls are software programs, hardware devices, or a combination of both, and their job is to protect servers from unauthorized access by allowing, redirecting, or blocking traffic on specific ports. The first step in getting our web page is to get through the firewall. In this case, the firewall probably redirects traffic between ports.

Getting distributed (load balancer)

When your site gets a lot of traffic, it is highly advisable to put a load balancer in front of it. Its job is to distribute incoming requests across several servers so that none of them gets overloaded.

Each server

A server can be many things! A server is a program or a hardware device that provides services and content to other devices, called clients, and servers come in many types. A web server deals with all the HTML we need to show our visitors. An application server handles the requests and produces the content they ask for. The web server provides the frame and the application server provides the photo. If our request involves a query (for example, we’re asking the server for a specific date, name, or other piece of data), the request that started in the web browser will reach the server’s database. This is a program that stores data in an organized way so that the answer can be retrieved and presented properly.

So, we have made it past the firewall and been routed by the load balancer to a web server, which takes our order and passes it to the application server; the latter builds a response (consulting the database if necessary). After that, we get the response built by the web and application servers. But before the page itself arrives, the HTTP response headers are sent to us.
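One simple way to distribute requests is round robin, where each new request goes to the next server in the list. The sketch below assumes that strategy purely for illustration (real load balancers offer several others), and the server names web-01 to web-03 are hypothetical.

```python
from itertools import cycle

class RoundRobinBalancer:
    """Hand out servers in a fixed rotating order."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        # cycle() repeats the list forever: first, second, third, first, ...
        self._rotation = cycle(servers)

    def pick(self) -> str:
        """Return the server that should handle the next request."""
        return next(self._rotation)

balancer = RoundRobinBalancer(["web-01", "web-02", "web-03"])
assignments = [balancer.pick() for _ in range(5)]
print(assignments)
```

With three servers and five requests, the fourth request wraps around to the first server again, so no single machine takes all the load.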

Contents of a header

Besides the content type and the name of the web browser, the headers we send have a pivotal component: the method. Methods are the ways we can interact with the remote server. The most used, even if we don’t realize it, is GET, which retrieves content from the server if we are allowed to access the resource. Then there are POST (to submit data to the server and get an answer or change something), PUT (to replace something on the server with our data), DELETE (to delete a given resource), and other methods, depending on the server. To find out which methods a server allows, we can send it a request with the OPTIONS method.
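To see methods in action without touching a real site, the sketch below starts a tiny throwaway server on the local machine and talks to it with three different methods. DemoHandler, the ask helper, and the choice to allow only GET and OPTIONS are all assumptions made for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class DemoHandler(BaseHTTPRequestHandler):
    ALLOWED = "GET, OPTIONS"  # the only methods this demo server implements

    def do_GET(self):
        body = b"<h1>Hello</h1>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_OPTIONS(self):
        # Answer with the list of allowed methods and no body.
        self.send_response(204)
        self.send_header("Allow", self.ALLOWED)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the console quiet

def ask(method: str, port: int):
    """Send one request and return (status, Allow header, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request(method, "/")
    response = conn.getresponse()
    result = (response.status, response.getheader("Allow"), response.read())
    conn.close()
    return result

# Port 0 lets the operating system pick any free port.
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

options_result = ask("OPTIONS", port)
get_result = ask("GET", port)
delete_result = ask("DELETE", port)  # not implemented by DemoHandler

server.shutdown()
server.server_close()

print("OPTIONS:", options_result[0], options_result[1])
print("GET:    ", get_result[0], get_result[2])
print("DELETE: ", delete_result[0])
```

The OPTIONS reply carries the Allow header listing what the server accepts, and the DELETE request is refused with an error status because this demo server never defined it.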

And back!

Once we get an answer, our web browser reads and decodes it, rendering the resource based on the HTML we received. In our case, on September 1, 2019, we got this:

Resources:

https://tools.ietf.org/html/rfc3986#section-3

https://ns1.com/resources/dns-protocol

http://blog.fourthbit.com/2014/12/23/traffic-analysis-of-an-ssl-slash-tls-session/

https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods

https://developer.mozilla.org/en-US/docs/Learn/Getting_started_with_the_web/How_the_Web_works
