Hypertext Transfer Protocol is the full name of the protocol best known as HTTP. Let’s start with why it is used. It is the core protocol used to access the World Wide Web, and all of today’s web apps use it. It is a simple protocol that was originally developed for retrieving static text-based resources. Since then, it has been extended in many ways to support the complex distributed applications that are commonplace today.
What model does HTTP use? It uses a message-based model, which is best explained with an example: the client sends a request message, and the server returns a response message. HTTP itself is connectionless: although it uses the stateful TCP protocol as its transport mechanism, each exchange of request and response is an independent transaction and may use a different TCP connection.
HTTP Requests
Every HTTP message, whether a request or a response, consists of one or more headers, each on a separate line, followed by a mandatory blank line and then an optional message body.
The first line of every HTTP request has three items, separated by spaces:
-A verb indicating the HTTP method. The most commonly used one is GET, whose function is to retrieve a resource from the web server. GET requests don’t have a message body, so no further data follows the blank line after the message headers.
-The requested URL. It usually functions as the name of the resource being requested, together with an optional query string containing the parameters that the client is passing to that resource. The query string is indicated by the ? character in the URL.
-The version of HTTP being used. The versions discussed here are 1.0 and 1.1, and most browsers use 1.1 by default. The most important difference for you to know is that in version 1.1 the Host request header is mandatory. This matters when you are attacking a web app. If you are still new to cyber attacks and safety, there are good sources on the web where you can read up on cyber security tips.
Some other points of interest in a typical request are:
-The Referer header. It indicates the URL from which the request originated.
-The User-Agent header. It provides information about the browser or other client software that generated the request.
-The Host header. It specifies the hostname that appeared in the full URL being accessed. This is necessary when multiple websites are hosted on the same server.
-The Cookie header. It is used to submit additional parameters that the server has issued to the client.
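To make the request layout described above concrete, here is a minimal Python sketch that splits a raw request into its first line, its headers, and its body. The sample request, the hostname www.example.com, and the function name parse_request are all made up for illustration; a real parser would need to handle many more edge cases.

```python
# A hypothetical sample request: first line, headers, then the mandatory blank line.
RAW_REQUEST = (
    "GET /search?q=http&page=2 HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "User-Agent: ExampleBrowser/1.0\r\n"
    "Referer: https://www.example.com/index.html\r\n"
    "Cookie: session=abc123\r\n"
    "\r\n"
)


def parse_request(raw):
    """Split a raw HTTP request into method, URL, version, headers and body."""
    # The blank line (two consecutive CRLFs) separates the headers from the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # The first line holds three items separated by spaces.
    method, url, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    return method, url, version, headers, body


method, url, version, headers, body = parse_request(RAW_REQUEST)
print(method, url, version)
print(headers["host"], "|", headers["referer"])
```

Because this is a GET request, the body that comes back is an empty string: nothing follows the blank line.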
HTTP Responses
The first line of every HTTP response consists of three items, separated by spaces:
-The version of HTTP being used.
-A numeric status code indicating the result of the request. The most common status code is 200, which means that the request was successful and that the requested resource is being returned.
-A textual “reason phrase” that further describes the status of the response.
Some other points of interest in a typical response are:
-The Server header. It contains a banner indicating the web server software being used, and sometimes other details such as installed modules and the server operating system.
-The Set-Cookie header. It issues the browser a further cookie, which the browser submits back in the Cookie header of subsequent requests to this server.
-The Pragma header. It instructs the browser not to store the response in its cache. The Expires header serves a similar purpose here: it indicates that the response content expired in the past and therefore should not be cached.
-The Content-Type header. It indicates the type of content in the message body, for example that the body contains an HTML document.
-The Content-Length header. It indicates the length of the message body in bytes.
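The response layout can be sketched in the same way. The Python example below builds a made-up response (the server banner, cookie value, and HTML body are assumptions for illustration), parses the first line and the headers, and checks that Content-Length matches the actual body size.

```python
# Hypothetical response body; Content-Length is computed from it rather than hard-coded.
BODY = b"<html><body>Hello</body></html>"
RAW_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Server: ExampleServer/1.0\r\n"
    b"Set-Cookie: tracking=xyz789\r\n"
    b"Pragma: no-cache\r\n"
    b"Expires: Thu, 01 Jan 1970 00:00:00 GMT\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Length: " + str(len(BODY)).encode("ascii") + b"\r\n"
    b"\r\n"
) + BODY


def parse_response(raw):
    """Split a raw HTTP response into version, status code, reason, headers and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # First line: HTTP version, numeric status code, reason phrase.
    version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


version, status, reason, headers, body = parse_response(RAW_RESPONSE)
print(version, status, reason)
print("declared length:", headers["content-length"], "actual length:", len(body))
```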
HTTP Methods
Let’s start with GET and POST. Whenever you’re attacking web apps, you will deal almost exclusively with these two most commonly used methods. It is important to know the differences between them, because they can affect an app’s security if overlooked. This is also a common theme in internet security tips, so if you are not yet familiar with the basics of security, it is worth checking a few good sources, including this blog.
So, let me explain GET. This method is designed to retrieve resources. It can be used to send parameters to the requested resource in the URL query string. This enables users to bookmark the URL of a dynamic resource, so that they or other users can retrieve the equivalent resource on a subsequent occasion. URLs are displayed on-screen and are logged in various places, such as the browser history and the web server’s access logs. They are also transmitted in the Referer header to other sites when external links are followed. For all of these reasons, the query string shouldn’t be used to transmit sensitive information.
Now that we have covered GET, let’s move on to POST. This method is designed to perform actions. With POST, request parameters can be sent both in the URL query string and in the body of the message. The URL can still be bookmarked, but any parameters sent in the message body will be excluded from the bookmark. They will also be excluded from the various places where URLs are logged, and from the Referer header. All in all, POST should always be used when an action is being performed.
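The difference in where the parameters travel can be sketched with Python’s standard urllib module. The URL and the parameter names below are hypothetical, and nothing is sent over the network; the example only builds the two request objects and prints what each one would carry.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://www.example.com/account"  # hypothetical endpoint
params = {"user": "alice", "action": "view"}   # hypothetical parameters
encoded = urlencode(params)

# GET: the parameters become part of the URL (the query string after "?").
get_request = Request(BASE_URL + "?" + encoded, method="GET")

# POST: the same parameters travel in the message body instead.
post_request = Request(
    BASE_URL,
    data=encoded.encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    method="POST",
)

print(get_request.get_method(), get_request.full_url, get_request.data)
print(post_request.get_method(), post_request.full_url, post_request.data)
```

Only the GET URL contains the parameters, which is why they end up in bookmarks, logs, and the Referer header, while the POST body does not.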
These are not the only methods that HTTP supports. There are various others, each created for a specific purpose. Here are the ones you are most likely to need:
HEAD. It functions in the same way as a GET request, except that the server shouldn’t return a message body in its response. The server should return the same headers that it would have returned to the corresponding GET request. It can therefore be used to check whether a resource is present before making a GET request for it.
TRACE. It is designed for diagnostic purposes. The server should return in the response body the exact contents of the request message it received. This can be used to detect the effect of any proxy servers between the client and the server that may manipulate the request.
OPTIONS. It asks the server to report the HTTP methods that are available for a particular resource. The server typically returns a response containing an Allow header that lists the available methods.
PUT. It attempts to upload the specified resource to the server, using the content contained in the body of the request. If this method is enabled, you may be able to leverage it to attack the app, for example by uploading an arbitrary script and executing it on the server.
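As a small illustration of how the Allow header from an OPTIONS response might be examined, the Python sketch below parses a header value and picks out the methods worth a closer look. The header value is made up, and the choice of which methods to flag (only PUT here, based on the note above) is an assumption for the example, not a complete checklist.

```python
# Methods to flag for review; an illustrative assumption based on the PUT note above.
METHODS_TO_REVIEW = {"PUT"}


def parse_allow(header_value):
    """Turn an Allow header value such as 'GET, HEAD' into a list of method names."""
    return [m.strip().upper() for m in header_value.split(",") if m.strip()]


def methods_to_review(methods):
    """Return the advertised methods that appear in METHODS_TO_REVIEW."""
    return sorted(set(methods) & METHODS_TO_REVIEW)


allowed = parse_allow("GET, HEAD, OPTIONS, PUT")  # hypothetical header value
print("advertised:", allowed)
print("review:", methods_to_review(allowed))
```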
URLs
What does URL stand for? It stands for uniform resource locator: a unique identifier for a web resource through which that resource can be retrieved. The most common format is protocol://hostname[:port]/[path/]file[?param=value].
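The parts of a URL can be pulled apart with Python’s standard urllib.parse module. The URL below is a made-up example; the sketch simply shows which piece is which.

```python
from urllib.parse import parse_qs, urlsplit

url = "https://www.example.com:8443/shop/item.php?id=42&lang=en"  # hypothetical URL
parts = urlsplit(url)

print("protocol:", parts.scheme)
print("hostname:", parts.hostname)
print("port:", parts.port)
print("path:", parts.path)
# parse_qs returns each parameter as a list, because a name may appear more than once.
print("parameters:", parse_qs(parts.query))
```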
REST
REST stands for representational state transfer. It is an architectural style for distributed systems in which requests and responses contain representations of the current state of the system’s resources. The core technologies of the World Wide Web, including HTTP and the URL format, conform to the REST architectural style.
HTTP Headers
It is not hard to guess that HTTP supports a large number of headers. Some are designed to be used in both requests and responses, while others are specific to one message type. The following are the common headers you may see when attacking a web app.
General Headers
-Connection. It tells the other end of the communication whether it should close the TCP connection after the HTTP transmission has completed or keep it open for further messages.
-Content-Encoding. It specifies what kind of encoding, such as gzip, is being used for the content contained in the message body.
-Content-Length. It specifies the length of the message body in bytes.
-Content-Type. It specifies the type of content contained in the message body, for example text/html for HTML documents.
-Transfer-Encoding. It specifies any encoding that was performed on the message body to facilitate its transfer over HTTP.
Request Headers
-Accept. It tells the server what kinds of content the client is willing to accept.
-Accept-Encoding. It tells the server what kinds of content encoding the client is willing to accept.
-Authorization. It submits credentials to the server for one of the built-in HTTP authentication types.
-Cookie. It submits to the server the cookies that the server previously issued.
-Host. It specifies the hostname that appeared in the full URL being requested.
-If-Modified-Since. It specifies the time at which the browser last received the requested resource. If the resource has not changed since then, the server may tell the client to use its cached copy by returning a 304 response.
-If-None-Match. It specifies an entity tag, which is an identifier denoting the contents of the message body.
-Origin. It is mostly used in cross-domain Ajax requests to indicate the domain from which the request originated.
-User-Agent. It provides information about the browser or other client software that generated the request.
Response Headers
-Access-Control-Allow-Origin. It indicates whether the resource can be retrieved via cross-domain Ajax requests.
-Cache-Control. It passes caching directives to the browser, for example no-cache.
-ETag. It specifies an entity tag. Clients can submit this identifier in the If-None-Match header of later requests for the same resource to tell the server which version they hold in their cache.
-Expires. It tells the browser for how long the contents of the message body are valid.
-Location. It is used in redirection responses (those with a status code starting with 3) to specify the target of the redirect.
-Pragma. It passes caching directives to the browser, for example no-cache.
-Server. It provides information about the web server software being used.
-Set-Cookie. It issues a cookie to the browser, which the browser submits back to the server in subsequent requests.
-WWW-Authenticate. It is used in responses that have a 401 status code, and provides details of the type(s) of authentication that the server supports.
-X-Frame-Options. It indicates whether and how the current response may be loaded within a browser frame.
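To show how a pair of these headers works together, here is a minimal Python sketch of the ETag and If-None-Match exchange. The way the entity tag is derived (a truncated SHA-256 hash of the body) and the function names are assumptions for illustration; real servers choose their own scheme for generating entity tags.

```python
import hashlib


def make_etag(body):
    """Derive an entity tag from the body (hypothetical scheme: truncated SHA-256)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(request_headers, body):
    """Return (status, headers, body) for a resource, honouring If-None-Match."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        # The client's cached copy is still current, so no body is sent.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag, "Content-Length": str(len(body))}, body


resource = b"<html><body>Hello</body></html>"

# First visit: the client has nothing cached.
status1, headers1, body1 = respond({}, resource)
# Second visit: the client presents the entity tag it received earlier.
status2, headers2, body2 = respond({"If-None-Match": headers1["ETag"]}, resource)

print(status1, len(body1))
print(status2, len(body2))
```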
Cookies
Cookies are a key part of the HTTP protocol that most web apps rely on, and they can often be used as a vehicle for exploiting vulnerabilities. The cookie mechanism enables the server to send items of data to the client, which the client stores and resubmits to the server. Unlike other types of request parameters, cookies continue to be resubmitted in each subsequent request without any particular action being required by the app or the user.
Cookies normally consist of a name/value pair, but they can consist of any string that doesn’t contain a space.
The Set-Cookie header can also include several optional attributes, which control how the browser handles the cookie:
-Expires. It sets a date until which the cookie is valid.
-Domain. It specifies the domain for which the cookie is valid.
-Path. It specifies the URL path for which the cookie is valid.
-Secure. If this attribute is set, the cookie is submitted only in HTTPS requests.
-HttpOnly. If this attribute is set, the cookie cannot be directly accessed via client-side JavaScript.
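The attributes listed above can be seen in action with Python’s standard http.cookies module. The cookie name, value, domain, path, and expiry date below are made up for illustration; the sketch builds the Set-Cookie header a server might send, and then the Cookie header a browser would send back.

```python
from http.cookies import SimpleCookie

# Server side: build a Set-Cookie header with the optional attributes.
cookie = SimpleCookie()
cookie["session"] = "abc123"                                  # hypothetical name/value pair
cookie["session"]["domain"] = "www.example.com"               # hypothetical domain
cookie["session"]["path"] = "/app"
cookie["session"]["expires"] = "Wed, 01 Jan 2031 00:00:00 GMT"
cookie["session"]["secure"] = True                            # only sent over HTTPS
cookie["session"]["httponly"] = True                          # hidden from client-side JavaScript
set_cookie_header = cookie.output()

# Client side: only the name/value pair is resubmitted, never the attributes.
cookie_header = "Cookie: " + "; ".join(
    f"{name}={morsel.value}" for name, morsel in cookie.items()
)

print(set_cookie_header)
print(cookie_header)
```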
Status Codes
Every HTTP response message must contain a status code in its first line, indicating the result of the request. The status codes fall into five groups, according to the code’s first digit:
-1xx: informational.
-2xx: the request was successful.
-3xx: the client is redirected to a different resource.
-4xx: the request contains an error of some kind.
-5xx: the server encountered an error while fulfilling the request.
Now let’s look at the status codes that you’re most likely to encounter when attacking a web app.
-100 Continue. It is sent in some circumstances when a client submits a request containing a body. It indicates that the request headers were received and that the client should continue sending the body.
-200 OK. It indicates that the request was successful and that the response body contains the result of the request.
-201 Created. It indicates that the request was successful and that a resource was created as a result, for example in response to a PUT request.
-301 Moved Permanently. It redirects the browser permanently to a different URL, which is specified in the Location header.
-302 Found. It redirects the browser to a different URL in the same way as 301, the only difference being that in this case the redirect is temporary.
-304 Not Modified. It instructs the browser to use its cached copy of the requested resource.
-400 Bad Request. It indicates that the client submitted an invalid HTTP request.
-401 Unauthorized. The server requires HTTP authentication before the request can be granted.
-403 Forbidden. No one is allowed to access the requested resource, regardless of authentication.
-404 Not Found. The requested resource doesn’t exist.
-405 Method Not Allowed. The method used in the request is not supported for the specified URL.
-413 Request Entity Too Large. The body of the request is too large for the server to handle.
-414 Request URI Too Long. Similar to the previous one, but here it is the URL used in the request that is too long for the server to handle.
-500 Internal Server Error. The server encountered an error while fulfilling the request.
-503 Service Unavailable. The web server itself is working, but the app accessed via the server is not responding.
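Since the group of a status code is determined by its first digit, classifying a code takes only an integer division. The Python sketch below does this and looks up the standard reason phrase with the built-in http.HTTPStatus enum; the group labels and the function name describe are chosen for this example.

```python
from http import HTTPStatus

# Group labels keyed by the first digit of the status code.
STATUS_GROUPS = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code):
    """Return a short description of a status code and the group it belongs to."""
    group = STATUS_GROUPS.get(code // 100, "unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Not every number is a registered status code.
        phrase = "(no standard reason phrase)"
    return f"{code} {phrase}: {group}"


for code in (200, 302, 404, 500):
    print(describe(code))
```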
HTTPS
HTTP uses plain TCP as its transport mechanism, which is unencrypted and can therefore be intercepted by an attacker who is suitably positioned on the network. HTTPS is essentially the same application-layer protocol as HTTP, but it is tunneled over a secure transport mechanism called SSL (Secure Sockets Layer, whose modern successor is TLS). This protects the privacy and integrity of the data passing over the network.
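From the client’s point of view, the only change when moving from HTTP to HTTPS is the secure layer underneath. The Python sketch below uses the standard ssl and http.client modules; the helper name fetch_status is made up for this example, and the function is only defined here, not called, so nothing is sent over the network.

```python
import http.client
import ssl

# The default context verifies the server's certificate and hostname.
context = ssl.create_default_context()


def fetch_status(host, path="/"):
    """Send a HEAD request over HTTPS and return the numeric status code."""
    conn = http.client.HTTPSConnection(host, context=context, timeout=10)
    try:
        # The HTTP message itself is unchanged; only the transport is encrypted.
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()


print("certificate required:", context.verify_mode == ssl.CERT_REQUIRED)
print("hostname checked:", context.check_hostname)
```

Calling fetch_status("www.example.com") against a reachable host would perform the handshake first and then exchange an ordinary HTTP request and response inside the tunnel.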
HTTP Proxies
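A proxy is a server that mediates access between the client browser and the destination web server. Once a browser is configured to use a proxy, it makes all of its requests to that proxy, which relays them to the relevant web servers and forwards their responses back to the browser. Most proxies also provide additional services, such as caching, authentication, and access control.
When a browser sends an unencrypted HTTP request through a proxy, it places the full URL into the request, including the protocol prefix and the hostname, so that the proxy knows where to forward it. When HTTPS is used, the browser cannot perform the SSL handshake with the proxy itself, because that would break the secure tunnel and leave the communications vulnerable to interception. Instead, the browser uses the CONNECT method to ask the proxy to act as a plain TCP-level relay to the destination server, and then performs the SSL handshake with that server through the relay.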
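The effect of a proxy on the first line of a request can be sketched in a few lines of Python. The function name request_line and the URLs are made up for illustration. The sketch covers three cases: a direct request, a plain HTTP request through a proxy (full URL in the request), and an HTTPS request through a proxy, where the browser first asks the proxy to open a tunnel with the CONNECT method.

```python
from urllib.parse import urlsplit


def request_line(url, via_proxy):
    """Return the first line a browser would send for a GET of the given URL."""
    parts = urlsplit(url)
    if via_proxy and parts.scheme == "https":
        # HTTPS through a proxy: ask the proxy to relay raw TCP to host:port.
        port = parts.port or 443
        return f"CONNECT {parts.hostname}:{port} HTTP/1.1"
    if via_proxy:
        # Plain HTTP through a proxy: the full URL goes into the request.
        return f"GET {url} HTTP/1.1"
    # Direct request: only the path and the query string are sent.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return f"GET {target} HTTP/1.1"


print(request_line("http://www.example.com/index.html?x=1", via_proxy=False))
print(request_line("http://www.example.com/index.html?x=1", via_proxy=True))
print(request_line("https://www.example.com/login", via_proxy=True))
```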
HTTP Authentication
The HTTP protocol includes its own mechanisms for authenticating users, using various authentication schemes. For example:
-Basic. A very simple authentication mechanism that sends the user’s credentials as a Base64-encoded string in a request header with each message.
-NTLM. A challenge-response mechanism that uses a version of the Windows NTLM protocol.
-Digest. A challenge-response mechanism that uses MD5 checksums of a nonce together with the user’s credentials.
The schemes mentioned above are most commonly seen within organizations, where they are used to access intranet-based services.
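To see why Basic authentication is considered so simple, here is a short Python sketch that builds the Authorization header and then decodes it again. The function names are made up for this example, and the username and password are the classic sample values from the HTTP authentication specification.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header used by the Basic scheme."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return "Authorization: Basic " + token


def decode_basic(header):
    """Recover the username and password from a Basic Authorization header."""
    token = header.split("Basic ", 1)[1]
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password


header = basic_auth_header("Aladdin", "open sesame")
print(header)
print(decode_basic(header))
```

Base64 is an encoding, not encryption: anyone who can see the header can recover the credentials, which is why Basic authentication should only be used over HTTPS.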