Introduction to HTTP

Why do we need to automate at a protocol level?

Imagine that we use a functional test automation tool such as Selenium or Rational Robot. These tools simulate user actions in the browser: filling in inputs, choosing options in combo boxes, and clicking buttons and links. That means they need to open a browser for each user they simulate. To simulate one hundred users, we would need to open one hundred browsers and execute the user actions in each of them. Now consider that this kind of test typically simulates thousands of users, or even hundreds of thousands.
Instead of executing user actions in a browser, load simulation tools reproduce the traffic that a real user would generate, at the protocol level. For each virtual user they run a thread that opens a socket and sends and receives data, most of it text. Running hundreds of such threads on a small number of machines is feasible.
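
The thread-per-virtual-user idea can be sketched in a few lines of Python. This is only an illustration, not how any particular load tool is implemented: the local HelloHandler server is a hypothetical stand-in for the system under test, and the names virtual_user and run_load, as well as the figure of 10 users, are assumptions made for the example.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the system under test."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the console quiet


def virtual_user(host, port, results, index):
    """One virtual user: open a socket, send text, read text, close."""
    request = (
        f"GET / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    ).encode("ascii")
    chunks = []
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(request)
        while True:
            chunk = conn.recv(4096)
            if not chunk:  # the server closed the connection
                break
            chunks.append(chunk)
    status_line = b"".join(chunks).split(b"\r\n", 1)[0].decode("ascii")
    results[index] = int(status_line.split()[1])  # e.g. 200


def run_load(user_count):
    """Start the stand-in server and run one thread per virtual user."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    results = [None] * user_count
    threads = [
        threading.Thread(
            target=virtual_user,
            args=("127.0.0.1", server.server_port, results, i),
        )
        for i in range(user_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    server.shutdown()
    server.server_close()
    return results


statuses = run_load(10)
print(statuses.count(200), "of", len(statuses), "virtual users received a 200")
```

No browser is involved at any point: each virtual user costs one thread and one socket, which is why this approach scales so much better than driving real browsers.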

 

HTTP basics

HTTP stands for Hypertext Transfer Protocol. It is a protocol for transferring text with a specific structure and content. A user (or agent) asks for a certain resource, and the server responds. That's why we talk about an "HTTP Request" and an "HTTP Response". The communication is a sequence of requests and responses between the user and the server.

 

 

HTTP is a stateless protocol: by itself, it gives the server no way to identify which requests come from the same user. To solve this problem there is a mechanism called cookies: small pieces of information that the server sends to the client, which stores them and sends them back attached to each subsequent request. One of them is typically a session identifier, which lets the server identify the user and their session.
HTTP works on top of TCP/IP and uses port 80 by default, unless another port is explicitly specified.
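
The cookie round trip described above can be sketched with Python's standard http.cookies module. The cookie names and values (JSESSIONID, lang) and the helper name cookie_header_from are assumptions made for the example; a real application chooses its own.

```python
from http.cookies import SimpleCookie


def cookie_header_from(set_cookie_values):
    """Build the value of the Cookie request header from the
    Set-Cookie header values received in earlier responses."""
    jar = SimpleCookie()
    for value in set_cookie_values:
        jar.load(value)  # attributes such as Path are stored, not echoed back
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())


# Hypothetical Set-Cookie values sent by a server in a response.
received = [
    "JSESSIONID=abc123; Path=/; HttpOnly",
    "lang=en; Path=/",
]
print("Cookie:", cookie_header_from(received))
```

A load script has to do the same thing a browser does: capture the session identifier from the response and send it back with every later request of that virtual user.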

 

Here is an example of the message exchange between a user and the server: first the request of a user trying to access www.example.com, and then the response from the server:

 

 

Example taken from https://tools.ietf.org/html/rfc7230#section-2.1.

As you can see, the messages carry many parameters with a lot of information about the communication. The important thing to notice in the example is that a message has "headers" and a "body". The example request has only headers (e.g., "User-Agent" and "Host"; there could be many more). The example response has headers (e.g., "Server" and "Content-Length") and a body, which is the payload of the message. The payload typically contains the resource we are looking for, which could be HTML, JavaScript, CSS, an image, or any other file. In the example, the body is just a piece of text saying "Hello World! My payload includes a trailing CRLF.".
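
To make the split between headers and body concrete, here is a minimal sketch that separates the two parts of a raw response. The message is an abridged version of the response in the RFC example cited above (only three of its headers are kept), and parse_response is a hypothetical helper, not a complete HTTP parser.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP response into status, headers and body."""
    # An empty line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    version, status, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(status), reason, headers, body


RAW_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Server: Apache\r\n"
    b"Content-Length: 51\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"Hello World! My payload includes a trailing CRLF.\r\n"
)

status, reason, headers, body = parse_response(RAW_RESPONSE)
print(status, reason)
print(headers)
print(len(body), "bytes of payload")
```

Note that the "Content-Length" header announces the size of the body in bytes, including the trailing CRLF.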

The communication process

Let's see a summary of the different steps of a communication between the user and the server, focusing on what we need to understand for performance testing automation:

  1. A user accesses a URL (by clicking on a link or typing it in the browser).
  2. The browser parses the URL, separating its different parts. In that way the browser identifies the protocol to use (HTTP or its secure version, HTTPS), the server name or IP address, the port (if it is not declared, the default is used: port 80 for HTTP, as mentioned before, or port 443 for HTTPS), and the resource on this server that the user wants to access. Some parameters can also be declared and sent to the server.
  3. The browser opens a TCP connection to the server, on the corresponding port.
  4. The HTTP request is sent through this connection.
  5. The server sends the HTTP response on the same connection. This response includes the response code (indicating whether there was an error or everything went well), the data type of the requested resource, and the resource itself.
  6. The browser closes the connection (sometimes connections are reused for several requests, but that is not important at this point).
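
The steps above can be sketched with Python's standard http.client module, with one line of code per step. This is an illustration under assumptions: DemoHandler is a hypothetical local server that plays the role of the web server, and fetch, the path /index.html and the parameter lang=en are made up for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical local server playing the role of the web server."""

    def do_GET(self):
        body = b"<html><body>demo</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the console quiet


def fetch(url):
    parts = urlsplit(url)                        # step 2: split the URL
    port = parts.port or 80                      # default port for HTTP
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    conn = http.client.HTTPConnection(parts.hostname, port, timeout=5)
    try:
        conn.connect()                           # step 3: open the TCP connection
        conn.request("GET", target)              # step 4: send the HTTP request
        response = conn.getresponse()            # step 5: read the HTTP response
        return (response.status,
                response.getheader("Content-Type"),
                response.read())
    finally:
        conn.close()                             # step 6: close the connection


server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/index.html?lang=en"  # step 1
status, content_type, body = fetch(url)
server.shutdown()
server.server_close()
print(status, content_type, len(body), "bytes")
```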

 

URL basics

URL is the acronym for "Uniform Resource Locator". It is the first thing we see about a request, and it also helps us identify and distinguish the different requests.

 

 

 

In the image you can see that the URL specifies:

  • The protocol (for a web page it will be HTTP, or HTTPS if it is secured).
  • The host or domain, identifying where the server can be found. Domain names let us reach a server without having to remember its IP address.
  • The port, as part of the TCP communication. If it is not specified, the default for the protocol is used: port 80 for HTTP and port 443 for HTTPS. Any other port has to be written in the URL, and it must match the port the web server is configured to listen on.
  • The resource and its path, indicating what we are looking for and where it is.
  • After the "?" symbol, the parameters sent to the server as part of the request.
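
The same decomposition can be done in code with urllib.parse from the Python standard library. The URL used here and the helper name describe are assumptions made for the example.

```python
from urllib.parse import parse_qs, urlsplit

# Default ports used when the URL does not declare one.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe(url):
    """Return the parts of a URL listed above as a dictionary."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path,
        "parameters": parse_qs(parts.query),
    }


for key, value in describe(
    "http://www.example.com:8080/catalog/item.html?id=42&lang=en"
).items():
    print(f"{key}: {value}")
```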

 

Request Types

There are different kinds of requests, and we need to understand some of their peculiarities in order to automate at a protocol level. The two most important and common ones are:

  • GET: fetch a resource. All the information required by the server to locate the element is provided in the URL.
  • POST: send data to the server to be processed, for example to create a new resource. This kind of request usually carries data in the payload of the message.

So the main difference lies in the payload. We need to pay special attention to the data sent in requests and responses, and the request type (GET or POST) tells us whether to look for that data only in the URL or also in the payload.

Status Codes

As part of the response message, the server indicates the status with a code, which is a three-digit number whose first digit ranges from 1 to 5. The status code is important for a performance tester because load simulation tools interpret it automatically, for example to decide whether a request passed or failed. The HTTP specification defines five ranges for specific types of responses:

  • 1xx: Informational messages.
  • 2xx: Successful. The most common code is 200 OK, which means that the server sends the resource in the message body of the response.
  • 3xx: Redirection. This response indicates that the client has to do something else to get what it is looking for. For example, with a 301 or 302 the client needs to make a new request to another URL, which is specified in the "Location" header of the response. A 304 indicates that the client should use its cached copy.
  • 4xx: Client Error. There was a problem, and it is on the client side, e.g., the client is requesting a non-existent resource (404), has not authenticated (401), or does not have the required permissions (403).
  • 5xx: Server Error. There was a problem, and it is on the server side, e.g., the server or service is unavailable (503). An uncaught exception in the server program typically produces a 500.
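
The five ranges above map directly onto the first digit of the code, which is roughly how a test script can interpret a response. The sketch below is an assumption about one reasonable policy (treating 4xx and 5xx as failures), not the behaviour of any specific tool; classify and is_failure are hypothetical names.

```python
from http import HTTPStatus

CATEGORIES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def classify(code):
    """Return the category of a status code based on its first digit."""
    return CATEGORIES.get(code // 100, "Unknown")


def is_failure(code):
    """Assumed policy: only 4xx and 5xx responses count as failures."""
    return code >= 400


for code in (200, 301, 304, 401, 404, 500, 503):
    verdict = "FAIL" if is_failure(code) else "ok"
    print(code, HTTPStatus(code).phrase, "->", classify(code), verdict)
```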

 

Now we will see more details related to HTTP but also to HTML, an acronym that stands for HyperText Markup Language. It is the language used for describing web documents (web pages), and any browser is able to interpret its tags and elements in order to render the pages correctly on the screen.

GET and POST

First, any link we click generates a GET request. If we are filling in data in a form, clicking the "submit" button can generate either kind of request. If we look at the HTML code of a form, we will find an attribute called "method", which indicates whether the request associated with the "submit" action will be a POST or a GET:

  • <FORM METHOD="GET">
  • <FORM METHOD="POST">

If the method is POST, the data entered in the form will be sent in the body of the request, and if the method is GET, the data will be sent in the query string, in the URL. I believe that the following example, taken from here, is a great way to see it explained. If we have a form like this one:

 

When the user clicks the button it generates the following GET message:

But instead, if we have the POST method defined as here:

Then, the request is a POST message with the data in the body of the HTTP request:

Pay attention to the way the variables and their values are sent, and how the different variable-value pairs are separated from each other with a special character: usually "&", but sometimes a comma ",".
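
As an illustration of the two cases, the sketch below builds the raw text of the request that a form submission would produce for each method. The host, the action /login and the field names are hypothetical, and build_form_request is a helper made up for the example; real browsers add many more headers.

```python
from urllib.parse import urlencode


def build_form_request(method, host, action, fields):
    """Return the raw HTTP request text for a form submission."""
    encoded = urlencode(fields)  # name=value pairs joined with "&"
    if method.upper() == "GET":
        # GET: the data travels in the URL, after the "?".
        return f"GET {action}?{encoded} HTTP/1.1\r\nHost: {host}\r\n\r\n"
    # POST: the data travels in the body, after the empty line.
    return (
        f"POST {action} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(encoded)}\r\n"
        "\r\n"
        f"{encoded}"
    )


fields = {"user": "alice", "course": "performance"}
get_request = build_form_request("GET", "www.example.com", "/login", fields)
post_request = build_form_request("POST", "www.example.com", "/login", fields)
print(get_request)
print(post_request)
```

The encoded data is identical in both cases; only its position in the message changes.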

Encoding

When a user enters a value containing spaces or special characters (&, $, #, etc.), the browser has to encode them in order to respect the protocol. Some characters have a special meaning, for example the space. If you remember the HTTP request example, it is a group of words separated by spaces, colons, and some other characters. If the parameters contained any of these characters unencoded, they could break the parser in charge of identifying the different elements of a request. That is why, when we fill in a field with something like "performance testing course", what travels over the wire is the same string but encoded. Form data is encoded in the same way whether it goes in the URL of a GET or in the payload of a POST (with the default content type, application/x-www-form-urlencoded): each space becomes "+", giving "performance+testing+course". Generic URL percent-encoding, used for example in the path of a URL, encodes the space as "%20" instead, giving "performance%20testing%20course". The server then has the responsibility to decode all data before using it.
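
Both encodings are available in urllib.parse from the Python standard library, which makes it easy to check what a given value looks like on the wire. The field names q and tag and the value "R&D #1" are made up for the example.

```python
from urllib.parse import parse_qs, quote, quote_plus, urlencode

value = "performance testing course"

# Generic percent-encoding: the space becomes %20.
print(quote(value))        # performance%20testing%20course

# Form encoding (application/x-www-form-urlencoded): the space becomes +.
print(quote_plus(value))   # performance+testing+course

# Encoding a whole set of form fields, including "&" and "#" inside a value.
encoded = urlencode({"q": value, "tag": "R&D #1"})
print(encoded)             # q=performance+testing+course&tag=R%26D+%231

# What the server does on its side: decode before using the data.
print(parse_qs(encoded))
```

Because the "&" inside the value is sent as "%26", the server can still tell it apart from the "&" that separates the two fields.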

How does the browser process the response?

When the browser receives a response containing an HTML file, it parses the content according to the HTML specification, generating an internal structure known as the DOM (Document Object Model). Even before it has received the entire HTML document, the browser begins rendering the page on the screen.

The HTML may reference many embedded objects, such as images, CSS (Cascading Style Sheets) files, JavaScript files, and more. Once the browser realizes it needs these files, it requests them from the server as secondary requests (requests generated after the primary request, which are necessary to complete the content of the whole resource). So, as the browser finds tags that require fetching other URLs, it sends a GET request to retrieve each of these files, following a process similar to that of the primary request.

Once the browser has all the resources for the web page, it can finish rendering, applying all the styles (those specified in the CSS files), executing the JavaScript code, and waiting for the next user action. In fact, since the HTML structure is the first thing to be processed and shown, most forms are displayed before the process ends, so the user can start filling in fields and clicking links or buttons without having to wait for the whole page to complete.
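
A script working at the protocol level has to find the secondary requests itself, since there is no browser to do it. The sketch below uses html.parser from the Python standard library to list the embedded resources of a page. It is a simplification: it only looks at the img, script and link tags, the page and base URL are made up, and EmbeddedResourceFinder is a hypothetical name.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class EmbeddedResourceFinder(HTMLParser):
    """Collect the URLs a browser would fetch as secondary requests."""

    # Tag -> attribute holding the URL of the embedded resource.
    SOURCES = {"img": "src", "script": "src", "link": "href"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        wanted = self.SOURCES.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative URLs against the URL of the page.
                self.resources.append(urljoin(self.base_url, value))


PAGE = """<html><head>
<link rel="stylesheet" href="/css/site.css">
<script src="js/app.js"></script>
</head><body>
<img src="logo.png">
<a href="/next">next</a>
</body></html>"""

finder = EmbeddedResourceFinder("http://www.example.com/shop/index.html")
finder.feed(PAGE)
for url in finder.resources:
    print("GET", url)
```

The link to /next is not listed: it only produces a request when the user clicks on it, and that request will be a new primary request.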

 

 
