Introduction to HTTP
Why do we need to automate at a protocol level?
Imagine that we use a functional test automation tool such as Selenium or Rational Robot. These tools simulate the user's actions in the browser: filling in inputs, choosing options in combo boxes, and clicking buttons and links. That means these tools need to open a browser for each user they simulate. Now imagine that we need to simulate one hundred users: we would need to open one hundred browsers and execute the user actions in them. And this kind of test typically simulates thousands of users, or even hundreds of thousands.
Instead of executing the user actions in the browser, load simulation tools reproduce the traffic that a real user would generate, at the protocol level. For each virtual user they run a thread that opens a socket and sends and receives mostly text. Running hundreds of such threads on a small number of machines is feasible.
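To make the idea concrete, here is a minimal sketch of protocol-level virtual users in Python. It is not how any particular load testing tool is implemented; the function names `virtual_user` and `run_virtual_users` are hypothetical, and each virtual user simply sends one GET request as text over a socket and records the status code it gets back.

```python
import socket
import threading


def virtual_user(host, port, path, results, index):
    """One virtual user: open a socket, send a GET as text, read the reply."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    raw = b"".join(chunks)
    status_line = raw.split(b"\r\n", 1)[0].decode("ascii", "replace")
    # A status line looks like "HTTP/1.1 200 OK"; keep only the numeric code.
    results[index] = int(status_line.split(" ")[1])


def run_virtual_users(host, port, path, users):
    """Run `users` virtual users concurrently and return their status codes."""
    results = [None] * users
    threads = [
        threading.Thread(target=virtual_user, args=(host, port, path, results, i))
        for i in range(users)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Calling something like `run_virtual_users("localhost", 8080, "/", 100)` against a test server of your own would start one hundred threads, none of which needs a browser.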
HTTP stands for Hypertext Transfer Protocol. It is a protocol for transferring text with a specific structure and content. A user (or agent) asks for a certain resource, and the server responds. That's why we say "HTTP Request" and "HTTP Response". The communication is a sequence of requests and responses between the user and the server.
HTTP is a stateless protocol: by itself, it provides no way to identify which requests come from the same user. To address this problem there is a special mechanism called cookies: small pieces of information that are stored on the client side and sent to the server attached to each request. One of them is typically a session identifier, which the server uses to identify the user and their session.
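The cookie round trip can be sketched with Python's standard `http.cookies` module. The cookie name `JSESSIONID` and its value are made up for illustration; real applications choose their own names.

```python
from http.cookies import SimpleCookie

# Hypothetical value of a Set-Cookie header, as a server might send it
# in a response to start a session.
set_cookie_value = "JSESSIONID=abc123; Path=/; HttpOnly"

# The client stores the cookie...
jar = SimpleCookie()
jar.load(set_cookie_value)

# ...and sends it back in the Cookie header of each later request,
# which is how the server recognizes the same user again.
cookie_header = "; ".join(
    f"{name}={morsel.value}" for name, morsel in jar.items()
)
print("Cookie:", cookie_header)
```

A load simulation script has to do the same thing a browser does here: capture the session cookie from a response and attach it to the following requests.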
HTTP runs on top of TCP/IP and uses port 80 by default, unless another port is explicitly specified.
Here is an example of the message exchange between a user and the server: first the request from a user trying to access www.example.com, and then the response from the server:
Example taken from https://tools.ietf.org/html/rfc7230#section-2.1.
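Since HTTP messages are structured text, they can be built and taken apart with plain string operations. The sketch below uses two hypothetical helpers, `build_request` and `parse_response`, and a response abridged from the RFC example referenced above. It assumes a text body and ignores details such as chunked encoding.

```python
def build_request(method, path, host, headers=None, body=""):
    """Return the text of an HTTP/1.1 request message."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # An empty line separates the headers from the (possibly empty) body.
    return "\r\n".join(lines) + "\r\n\r\n" + body


def parse_response(raw):
    """Split a response message into status code, reason, headers and body."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body


request = build_request(
    "GET", "/hello.txt", "www.example.com", {"Accept-Language": "en, mi"}
)

# Abridged response: only some of the headers are kept.
response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Length: 51\r\n"
    "Content-Type: text/plain\r\n"
    "\r\n"
    "Hello World! My payload includes a trailing CRLF.\r\n"
)
code, reason, headers, body = parse_response(response)
print(request)
print(code, reason, headers["content-type"])
```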
The communication process
Let's see a summary of the different steps of a communication between the user and the server, focusing on what we need to understand for performance testing automation:
- A user accesses a URL (by clicking on a link or typing it in the browser).
- The browser parses the URL, separating its different parts. In that way the browser identifies the protocol to use (HTTP or its secure version, HTTPS), the server name or IP, the port (if it is not declared, it is port 80 for HTTP, as we mentioned before, or 443 for HTTPS), and the resource on this server that the user wants to access. Some parameters can also be declared and sent to the server.
- The browser opens a TCP connection to the server, to the corresponding port.
- The HTTP request is sent through this connection.
- The server sends the HTTP response on the same connection. This response includes the status code (indicating whether there was an error or everything went well), the data type of the requested resource, and the resource itself.
- The browser closes the connection (sometimes the connections are reused in different requests, but it is not important at this point).
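The steps above can be sketched with Python's standard `http.client` module. The helper name `fetch` is hypothetical, and the sketch only handles plain HTTP (no HTTPS, redirects or connection reuse).

```python
import http.client
from urllib.parse import urlsplit


def fetch(url):
    """Request a plain-HTTP URL and return (status, content_type, body)."""
    # Split the URL into its parts.
    parts = urlsplit(url)
    port = parts.port or 80  # default port for plain HTTP
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query

    # Open a TCP connection to the server on the corresponding port.
    conn = http.client.HTTPConnection(parts.hostname, port, timeout=5)
    try:
        conn.connect()
        # Send the HTTP request through this connection.
        conn.request("GET", target)
        # Read the response: status code, data type and the resource itself.
        response = conn.getresponse()
        body = response.read()
        return response.status, response.getheader("Content-Type"), body
    finally:
        # Close the connection.
        conn.close()
```

For example, `fetch("http://localhost:8080/index.html")` against a local test server would return the status code, the content type and the raw bytes of the page.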
URL is the acronym for "Uniform Resource Locator". It's the first thing we see in a request, and it helps us identify and distinguish the different requests.
In the image you can see that the URL specifies:
- The protocol (for a web page, HTTP, or HTTPS if it's secured).
- The host or domain, identifying where you can find the server. Names are useful because they let us find a server without having to remember its IP.
- The port, as a part of the TCP communication. If it is not specified, port 80 is used for HTTP (443 for HTTPS). The port the server actually listens on depends on the configuration of the web server.
- The resource and its path, indicating what we are looking for, and where it is.
- After the "?" symbol, you find the parameters sent to the server as a part of the request.
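The parts listed above can be extracted with Python's standard `urllib.parse` module. The URL below is a made-up example.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical URL containing every part described above.
url = "http://www.example.com:8080/courses/list.html?topic=performance&level=1"

parts = urlsplit(url)
params = parse_qs(parts.query)  # each parameter maps to a list of values

print("protocol:  ", parts.scheme)
print("host:      ", parts.hostname)
print("port:      ", parts.port)
print("resource:  ", parts.path)
print("parameters:", params)

# When the URL does not declare a port, the parser reports None and the
# client falls back to the default port of the protocol.
no_port = urlsplit("http://www.example.com/courses/list.html")
print("port when omitted:", no_port.port)
```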
There are different kinds of requests, and we need to understand some of their peculiarities in order to automate at a protocol level. The two most important and common ones are:
- GET: fetch a resource. All the information required by the server to locate the element is provided in the URL.
- POST: create a new resource. This kind of request usually has data in the payload of the message.
So, the difference lies mainly in the payload. We will need to pay special attention to the data sent in requests and responses, and the request type (GET or POST) tells us whether to look for that data only in the URL or also in the payload.
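As a small illustration of where the data travels, the sketch below builds (but does not send) one request of each type with Python's `urllib.request.Request`. The URL and parameter names are made up.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical parameters, encoded as name=value pairs.
params = urlencode({"topic": "performance", "level": "1"})

# GET: the data travels in the URL and there is no payload.
get_request = Request("http://www.example.com/courses?" + params)

# POST: the URL stays clean and the data travels in the payload.
post_request = Request(
    "http://www.example.com/courses", data=params.encode("ascii")
)

for req in (get_request, post_request):
    print(req.get_method(), req.full_url, req.data)
```

Nothing goes over the network here; the `Request` objects only describe what would be sent.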
As part of the response message, the server indicates the status in a code, which is a three-digit number whose first digit ranges from 1 to 5. The status code is important for a performance tester because load simulation tools interpret it automatically. The HTTP spec defines five ranges for specific types of responses:
- 1xx: Informational Messages.
- 2xx: Successful: The most common code is 200 OK, which means that the server sends the resource in the message body of the response.
- 3xx: Redirection: This response indicates that the client has to do something else to get what it is looking for. For example, if it is a 301 or 302, a new request needs to be made to another URL, which is specified in the "Location" header of the response. If it is a 304, the client should use its cached copy.
- 4xx: Client Error: There was a problem, and it is on the client side, e.g., the client is requesting a non-existent resource (404) or lacks the required credentials or permissions (401 or 403).
- 5xx: Server Error: There was a problem, and it's in the server side, e.g., the server or service is unavailable (503). Typically, it also occurs when the server program throws an uncaught exception.
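The five ranges can be told apart by the first digit of the code. Here is a minimal sketch: the helper `status_class` is hypothetical, while `HTTPStatus` comes from Python's standard library and supplies the usual reason phrases.

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code):
    """Return the name of the range a status code belongs to."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]


for code in (200, 302, 404, 503):
    print(code, HTTPStatus(code).phrase, "->", status_class(code))
```

A load test script could use a check like this to count every 4xx or 5xx response as a failed request.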
Now we will see more details related to HTTP but also to HTML, an acronym which stands for HyperText Markup Language. It is the language used for describing web documents (web pages), and any browser is able to interpret the different tags and elements present in them in order to render them correctly on the screen.
GET and POST
Firstly, any link we click generates a GET request. Then, if we are filling in data in a form, clicking the "submit" button can generate a different kind of request. If we pay attention to the HTML code of a form, we will find an attribute called "method" which indicates whether the request associated with the "submit" action will be a POST or a GET:
- <FORM METHOD="GET">
- <FORM METHOD="POST">
If the method is POST, the data entered in the form is sent in the body of the request; if the method is GET, the data is sent in the query string of the URL. The following example is a great way to see it explained. Suppose we have a form like this one:
When the user clicks the button it generates the following GET message:
If, instead, the form declares the POST method, as here:
Then, the request is a POST message with the data in the body of the HTTP request:
Pay attention to the way the variables and their values are sent, and to how the variable-value pairs are separated from one another with the special character "&" (although sometimes another separator, such as a comma ",", is used).
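To see how the variable-value pairs can be recovered from either kind of message, here is a small sketch. Both messages and the field names `user` and `course` are invented for illustration, and `form_fields` is a hypothetical helper, not part of any tool.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical messages carrying the same two form fields.
get_message = (
    "GET /login?user=alice&course=http HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "\r\n"
)
post_message = (
    "POST /login HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "Content-Length: 22\r\n"
    "\r\n"
    "user=alice&course=http"
)


def form_fields(message):
    """Return the (variable, value) pairs carried by a request message."""
    head, _, body = message.partition("\r\n\r\n")
    method, target, _version = head.split("\r\n")[0].split(" ")
    # POST carries the pairs in the body, GET in the query string of the URL.
    encoded = body if method == "POST" else urlsplit(target).query
    # parse_qsl splits the pairs on "&" and each pair on "=".
    return parse_qsl(encoded)


print(form_fields(get_message))
print(form_fields(post_message))
```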
When a user enters a value containing spaces or special characters (&, $, #, etc.), the browser has to encode them in order to respect the protocol, because some characters, such as the space, have a special meaning. If you remember the HTTP request example, a message is a group of words separated by spaces, colons, and some other characters. If the parameters contained any of these characters, they could break the parser in charge of identifying the different elements of the request. That is why, when we fill a field with something like "performance testing course", what is transferred over the wire is the same string, but encoded. Browsers submit form data as "performance+testing+course", both in the URL for a GET and in the payload for a POST (with the default form encoding), while "performance%20testing%20course" is the percent-encoded form used elsewhere in URLs; both decode to the same value. The server is then responsible for decoding all data before using it.
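Both encoded spellings of the example string can be reproduced with Python's standard `urllib.parse` module, which is a convenient way to check what a script should send. The parameter name `q` and the value "load & stress" are made up for illustration.

```python
from urllib.parse import quote, quote_plus, unquote_plus, urlencode

value = "performance testing course"

percent_encoded = quote(value)    # spaces become %20
form_encoded = quote_plus(value)  # spaces become +
print(percent_encoded)
print(form_encoded)

# Both spellings decode back to the original string.
print(unquote_plus(percent_encoded) == value)
print(unquote_plus(form_encoded) == value)

# Characters with a special meaning, such as "&", are escaped so that
# they cannot be confused with the separator between pairs.
pair = urlencode({"q": "load & stress"})
print(pair)
```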
How does the browser process the response?