An IP address (in the original version of the protocol, IPv4) has the format #.#.#.#, four numbers between 0 and 255 separated by dots. Each number has 256 possible values, so it takes exactly 8 bits to represent, and each IP address is therefore made of 32 bits. But with 32 bits, we can only represent about 4.3 billion values. Since there are more than 4 billion devices connected to the internet, there is a newer version of the protocol, IPv6, with 128-bit addresses, that the world is starting to transition to.
A message sent from one computer to another might be labeled with 1.2.3.4:80 (an IP address followed by a port number) as the destination address, and 5.6.7.8 as the return address. There are other complexities, but those are the basics of how computers can communicate over a network.
In a browser, though, we type an address like http://www.example.com/ rather than an IP address. It turns out that there's another technology called DNS, Domain Name System, that many internet providers and organizations maintain, which converts domain names (like example.com) into IP addresses.
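The address arithmetic above can be checked with a short Python sketch. It uses the standard `ipaddress` module; the helper name `split_host_port` is made up for illustration, and the example address is the placeholder 1.2.3.4:80 from the notes rather than a real server.

```python
import ipaddress

def split_host_port(endpoint: str) -> tuple:
    """Split text like '1.2.3.4:80' into an IPv4 address and a port number."""
    host, _, port = endpoint.rpartition(":")
    return ipaddress.IPv4Address(host), int(port)

address, port = split_host_port("1.2.3.4:80")

# Each of the four numbers fits in 8 bits, so the whole address is one 32-bit value.
as_bits = format(int(address), "032b")
octets = [as_bits[i:i + 8] for i in range(0, 32, 8)]

print(address, port)   # 1.2.3.4 80
print(octets)          # ['00000001', '00000010', '00000011', '00000100']
print(2 ** 32)         # 4294967296 possible IPv4 addresses
print(ipaddress.IPv6Address("::1").max_prefixlen)  # 128 bits in an IPv6 address
```

The DNS step itself needs a network connection, so it is left out of the sketch. In Python, `socket.gethostbyname("example.com")` asks the system's resolver for an address.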
The domain name ends in a top-level domain, here .com. There are many others, such as .net, .org, .us, .uk, and more.
The www in front of a domain name is actually a subdomain, and there might be many of them, each pointing to a different server or set of servers. It's not required, and www is only used by convention. For example, MIT uses web.mit.edu for its main website's address.
The / at the end implies that we're asking for the root page of the site, which is conventionally index.html, where .html indicates that the file is written in HTML, a language we'll soon look at.
HTTP, Hypertext Transfer Protocol, is another set of rules and conventions for communicating. For example, humans might have the convention of shaking hands when they meet. When our browser communicates with web servers through HTTP, both computers likewise follow a protocol for making requests and responses.
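To tie together the pieces of the URL described above, here is a minimal Python sketch that takes http://www.example.com/ apart with the standard `urllib.parse` module. Splitting the hostname on dots is a simplification made for this example: it works for www.example.com, but some real-world suffixes, such as .co.uk, span two labels.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.example.com/")

scheme = parts.scheme        # the protocol, 'http'
hostname = parts.hostname    # 'www.example.com'
path = parts.path            # '/', the root page

# Naive split on dots (an assumption that holds for this three-label hostname)
labels = hostname.split(".")
subdomain, domain, tld = labels[0], labels[1], "." + labels[2]

print(scheme, hostname, path)
print(subdomain, domain, tld)   # www example .com
```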
A request for a webpage will look like this:
GET / HTTP/1.1
Host: www.example.com
...
GET is an HTTP verb that indicates we want to fetch some resource. The / indicates we're looking for the default page, and HTTP/1.1 indicates the version of HTTP our browser is using.
Host: www.example.com is included, since the same server might be listening for and responding to requests for multiple websites. There are also other pieces of information included in the ..., to help the server respond to us appropriately.
The response from the server might look like this:
HTTP/1.1 200 OK
Content-Type: text/html
...
The response starts with the version of HTTP the server is using, HTTP/1.1. Then, 200 is a numeric code that means OK, that the server was able to understand and respond to the request.
Content-Type: text/html indicates that the content of the response is text written in the language called HTML.
We can open a browser like Chrome, and open the Developer Tools with View > Developer > Developer Tools. A panel will open.
We can click the Network tab, and if we type harvard.edu into the address bar and press enter, a lot will happen very quickly. We can scroll to the very top, click the first request for harvard.edu, and see in the right panel, under “Request Headers”, that the browser indeed sends a request that starts with what we expected.
We can scroll in the same panel and see that the response headers are slightly different. The status code, 301, seems to say “Moved Permanently”. And if we look down to “Location:”, we see that the new location is https://www.harvard.edu. There's a www, and also a different protocol, HTTPS, a secure version of HTTP that encrypts our communication.
Another HTTP code, 404, is “Not Found”, and we get that back if we're trying to get some URL that the server can't find. These are some interesting ones:
200 OK
301 Moved Permanently
302 Found
304 Not Modified
401 Unauthorized
403 Forbidden
404 Not Found
418 I'm a Teapot
500 Internal Server Error
...
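As a rough illustration of how a program might read a response like the ones above, the following Python sketch splits the status line and headers by hand and looks up the standard phrases for several of the listed codes. The `raw_response` text is a hypothetical example modeled on the 301 redirect described earlier, not a capture of real traffic, and `parse_response_head` is a made-up helper name. In practice a library such as `http.client` would handle this parsing.

```python
from http import HTTPStatus

# Hypothetical response text, modeled on the 301 redirect described in the notes
raw_response = (
    "HTTP/1.1 301 Moved Permanently\r\n"
    "Location: https://www.harvard.edu\r\n"
    "\r\n"
)

def parse_response_head(raw):
    """Return (version, status code, reason phrase, headers) from response text."""
    head = raw.split("\r\n\r\n", 1)[0]          # everything before the blank line
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")    # split at the first colon only
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers

version, code, reason, headers = parse_response_head(raw_response)
print(version, code, reason)
print(headers["location"])

# The first digit of a status code gives the general class of the response
status_classes = {2: "success", 3: "redirection", 4: "client error", 5: "server error"}
for number in (200, 301, 302, 304, 401, 403, 404, 500):
    print(number, HTTPStatus(number).phrase, "->", status_classes[number // 100])
```

A browser that receives a 301 like this one would typically make a second request to the address in the Location header, which is why typing harvard.edu ends up at https://www.harvard.edu.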