As the usual cybersecurity tips remind us, all user input is untrusted. Many attacks against web applications work by submitting unexpected input that causes behavior the application's designer never intended. That is why handling user input safely is a key requirement for an application's security. Input-based vulnerabilities can arise anywhere in an application's functionality and in almost any technology it uses. The necessary defense against these threats is input validation, which you have probably already heard about in other internet security tips.
That said, we should not overlook the fact that no single protective mechanism can be employed everywhere. Defending against malicious input is not as easy as it sounds.
Varieties of Input
Web applications receive user-supplied data in many forms, and some kinds of input validation are not feasible or desirable for all of them. Often the app can impose very stringent validation checks on a specific item of input. For example, a username submitted to a login function may be required to have a maximum length of eight characters and to contain only alphabetic characters. In other cases the app must tolerate a much wider range of possible input. For example, a field might allow up to 50 characters as long as it contains no HTML markup.
There are also situations where the application may need to accept arbitrary input from users. As always, this is easiest to understand through an example.
Let's say a user of a blogging app creates a blog about web app hacking. The posts, and the comments on them, may legitimately contain explicit attack strings, because those strings are the subject under discussion. The app cannot simply reject this input because it looks potentially malicious. It has to store the input in a database, write it to disk, and display it back to users in a way that is safe.
In addition to the many kinds of input that users enter through the browser interface, a typical app receives numerous items of data that began their life on the server and were sent to the client so that the client can transmit them back on subsequent requests. Examples are cookies and hidden form fields, which ordinary users of the app do not see but which an attacker can view and modify. In these cases the app can often perform very specific validation of the data it receives. For example, a cookie indicating the user's preferred language can be checked against the specific set of values the server is known to issue. When the app detects that server-generated data has been modified in a way an ordinary user with a standard browser could not cause, this often indicates that the user is probing the application for vulnerabilities. In that case the application should reject the request and log the incident for investigation.
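The two checks described above can be sketched in Python as follows. This is a minimal illustration, not a complete validator: the function names are made up for this example, "alphabetic" is assumed to mean ASCII letters, and "no HTML markup" is approximated by rejecting angle brackets.

```python
import re

# Assumed rule from the example above: one to eight ASCII letters.
USERNAME_PATTERN = re.compile(r"[A-Za-z]{1,8}")


def is_valid_username(value: str) -> bool:
    """Strict check: only alphabetic characters, at most eight of them."""
    return USERNAME_PATTERN.fullmatch(value) is not None


def is_valid_free_text(value: str, max_length: int = 50) -> bool:
    """Looser check: any non-empty text up to max_length without angle brackets.

    Rejecting '<' and '>' is a crude stand-in for "contains no HTML markup".
    """
    if not 0 < len(value) <= max_length:
        return False
    return "<" not in value and ">" not in value
```

With these definitions, is_valid_username("alice") passes while "alice123" and "averylongname" fail, and is_valid_free_text accepts "12 Main Street" but rejects "<b>hi</b>".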
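As a rough sketch of the language cookie example, the check below accepts only values the server could have issued itself and logs anything else. The set of language codes, the logger name, and the function name are assumptions made for illustration.

```python
import logging

logger = logging.getLogger("input_handling")

# Hypothetical set of language codes that the server itself writes into the cookie.
ISSUED_LANGUAGES = frozenset({"en", "de", "fr"})


def language_from_cookie(cookie_value: str, client_id: str):
    """Return the language code, or None if the cookie holds an unexpected value."""
    if cookie_value in ISSUED_LANGUAGES:
        return cookie_value
    # An ordinary browser would never send this value, so record it for investigation.
    logger.warning("Modified language cookie from %s: %r", client_id, cookie_value)
    return None
```

A caller would treat a None result as a reason to reject the request rather than fall back to a default silently.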
Approaches to Input Handling
Several different approaches are commonly taken to the problem of handling user input. Different approaches are preferable for different situations and different types of input, and a combination of approaches is sometimes needed. Let us introduce them one by one.
“Reject Known Bad”
We will start with this one. It typically employs a blacklist containing a set of literal strings or patterns that are known to be used in attacks. The validation mechanism blocks any data that matches the blacklist and allows everything else. It is worth saying up front that this is generally regarded as the least effective approach to validating user input, for two main reasons.
The first is that a typical vulnerability in a web app can be exploited using a wide variety of input, which may be encoded or represented in various ways. A blacklist will almost certainly omit some of the input patterns that can be used to attack the app. The second is that exploitation techniques are constantly evolving, so new methods are unlikely to be blocked by an existing blacklist.
In addition, numerous blacklist-based filters are vulnerable to NULL byte attacks. Because strings are handled differently in managed and unmanaged execution contexts, inserting a NULL byte anywhere before a blocked expression can cause some filters to stop processing the input and therefore fail to identify the expression.
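To make the first weakness concrete, here is a deliberately naive blacklist filter. The blocked strings and the function name are invented for this illustration. The point is only that trivial variations of a blocked pattern pass straight through.

```python
# Hypothetical blacklist of literal strings seen in earlier attacks.
BLACKLIST = ["<script>", "select"]


def naive_blacklist_allows(value: str) -> bool:
    """Allow the input unless it contains a blacklisted literal string."""
    return not any(bad in value for bad in BLACKLIST)


attempts = [
    "<script>alert(1)</script>",     # matches the blacklist and is blocked
    "<ScRiPt>alert(1)</ScRiPt>",     # different case, allowed
    "%3Cscript%3Ealert(1)",          # URL-encoded bracket, allowed
    "<img src=x onerror=alert(1)>",  # different attack pattern, allowed
]
for attempt in attempts:
    print(attempt, "->", "allowed" if naive_blacklist_allows(attempt) else "blocked")
```

Only the first attempt is blocked. Each fix (lowercasing, decoding, adding more patterns) closes one gap and leaves others open, which is why this approach is considered weak.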
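Python strings do not stop at a NULL byte, so the sketch below only simulates the effect: c_style_view is a made-up helper that mimics an unmanaged routine reading a string up to the first NULL byte, and the filter is assumed to inspect only that view.

```python
def c_style_view(data: bytes) -> bytes:
    """Mimic an unmanaged (C-style) routine that stops at the first NULL byte."""
    return data.split(b"\x00", 1)[0]


def simulated_filter_allows(data: bytes) -> bool:
    """A blacklist filter that, by assumption, only sees the C-style view."""
    return b"<script>" not in c_style_view(data)


plain = b"<script>alert(1)</script>"
with_null = b"\x00" + plain

print(simulated_filter_allows(plain))      # the filter sees the blocked expression
print(simulated_filter_allows(with_null))  # the filter sees an empty string
print(b"<script>" in with_null)            # yet the full payload is still there
```

Whether a real filter behaves this way depends on the platform and libraries involved, so treat this as an illustration of the mechanism rather than a test for any particular product.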
“Accept Known Good”
Now for the second approach. This one employs a whitelist containing a set of literal strings or patterns, or a set of criteria, that is known to match only benign input. The validation mechanism allows data that matches the whitelist and blocks everything else. Where it is feasible, this is regarded as the most effective way of handling potentially malicious input, because a carefully constructed whitelist leaves an attacker unable to use crafted input to interfere with the app's behavior.
Keep in mind, though, that there are numerous situations where the application must accept data for processing that does not meet any reasonable criteria for what is known to be “good”.
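A whitelist can be a set of literal values or a set of criteria. The sketch below shows one of each. The parameter names (a sort column and a quantity) and the allowed ranges are hypothetical, chosen only to illustrate the idea.

```python
# Hypothetical whitelist of literal values for a "sort by" parameter.
ALLOWED_SORT_COLUMNS = frozenset({"name", "price", "date"})


def is_allowed_sort_column(value: str) -> bool:
    """Literal whitelist: the value must be one of the known column names."""
    return value in ALLOWED_SORT_COLUMNS


def is_allowed_quantity(value: str) -> bool:
    """Criteria-based whitelist: ASCII digits only, and between 1 and 99."""
    return value.isascii() and value.isdigit() and 1 <= int(value) <= 99
```

Anything that does not match is rejected without the filter needing to know what an attack looks like, which is the main advantage over a blacklist.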
Sanitization
Sanitization transforms input from its original form into an acceptable one, usually by encoding or escaping it. Methods commonly used in web apps include HTML entity encoding and URL encoding schemes. HTML entity encoding replaces certain metacharacters with their corresponding character entity references, which are predefined and have a fixed name and format (for example, < becomes &lt;).
Input can reach the app encoded in many different ways. Because web applications and browsers support more than one character encoding, attackers commonly try to exploit weaknesses in encoding and decoding routines. Applications that require internationalization are particularly good candidates for sanitization.
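Both encodings mentioned above are available in the Python standard library. The sketch below uses html.escape and urllib.parse.quote. The wrapper function names are made up for this example.

```python
import html
from urllib.parse import quote


def to_html_text(value: str) -> str:
    """HTML entity encoding: &, <, > and quotes become entity references."""
    return html.escape(value, quote=True)


def to_url_component(value: str) -> str:
    """URL encoding: every reserved character is percent-encoded."""
    return quote(value, safe="")


print(to_html_text('<script>alert("x")</script>'))
# &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;
print(to_url_component("a b&c=d"))
# a%20b%26c%3Dd
```

Which encoding is appropriate depends on where the data ends up: entity encoding for text placed in an HTML page, URL encoding for values placed in a query string.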
Safe Data Handling
Why do so many web app vulnerabilities arise? Because user-supplied data is processed in an unsafe way. In many cases these vulnerabilities can be avoided not by validating the input itself but by ensuring that the processing performed on it is inherently safe. Safe programming methods are available that avoid the common problems. For example, SQL injection attacks can be prevented through the correct use of parameterized queries for database access. In other situations, the app's functionality can be designed so that inherently unsafe practices, such as passing user input to an operating system command interpreter, are avoided entirely. This approach cannot be applied to every kind of task, but wherever it is available it is a highly effective way of handling potentially malicious input.
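A parameterized query can be sketched with Python's built-in sqlite3 module. The table layout and function names here are assumptions for the demo. The important part is the ? placeholder, which keeps user input as data and never lets it become part of the SQL statement.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a single hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
    conn.execute("INSERT INTO users (username) VALUES (?)", ("alice",))
    return conn


def find_user(conn: sqlite3.Connection, username: str):
    """Look up a user with a parameterized query instead of string concatenation."""
    cursor = conn.execute(
        "SELECT id, username FROM users WHERE username = ?",
        (username,),
    )
    return cursor.fetchone()


conn = make_demo_db()
print(find_user(conn, "alice"))        # the matching row
print(find_user(conn, "' OR '1'='1"))  # None: the attack string is just a name
```

The classic injection string finds nothing because the database compares it literally against the username column.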
Semantic Checks
It is important to understand what semantic checking actually is. In general, a semantic check goes beyond verifying that something fits the expected grammatical structure and makes sure that it also makes sense in context. Sometimes the input supplied by an attacker is identical to the input an ordinary user might submit. What makes it malicious is the circumstances under which it is submitted. For example, an attacker might try to gain access to another user's bank account by changing an account number transmitted in a hidden form field. No amount of syntactic validation will tell the two cases apart. The app has to verify that the submitted account number actually belongs to the user who submitted it.
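The bank account example comes down to an ownership check on the server. In the sketch below, the account data, user names, and function name are hypothetical. A real app would look the owner up in its own data store using the authenticated session.

```python
# Hypothetical server-side record of which user owns which account.
ACCOUNT_OWNERS = {"1001": "alice", "1002": "bob"}


def may_access_account(session_user: str, account_number: str) -> bool:
    """Semantic check: the submitted account must belong to the logged-in user."""
    owner = ACCOUNT_OWNERS.get(account_number)
    return owner is not None and owner == session_user
```

Both "1001" and "1002" are syntactically valid account numbers, so only this kind of check can tell that alice asking for "1002" is an attack.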
Boundary Validation
You have probably already come across the idea of validating data across trust boundaries. The core security problem with web apps arises because data received from users is untrusted. Input validation checks implemented on the client side can improve performance and the user's experience, but they provide no assurance about the data that actually reaches the server.
At this boundary, where user data first reaches the server, the app needs to take measures to defend itself against malicious input.
It is tempting to picture input validation as a single line between the “bad” internet and the “good” server-side app. But a typical application needs to defend itself against a huge variety of input-based attacks, and it is very difficult to devise a single mechanism at the external boundary that defends against all of them.
Many app functions chain together a series of different types of processing, with the output of one step becoming the input of the next. It would be very hard for a developer to implement a validation mechanism at the external boundary that foresees all the possible results of processing each piece of user input.
This is where the concept of boundary validation comes in. Each individual component or functional unit of the server-side app treats its input as coming from a potentially malicious source. Data validation is performed at each of these trust boundaries, in addition to the external frontier between client and server. This provides a solution to the problems just described. Every component can protect itself against the specific types of crafted input to which it may be vulnerable. Each check can be performed against whatever value the data has as a result of previous transformations. And because the checks are implemented at different stages of processing, they are unlikely to come into conflict with each other.
For example, if a failed login causes the app to send a warning email to the user, any user-supplied data incorporated into the email may need to be checked for SMTP injection attacks.
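As a sketch of validation at the boundary of the email component, the check below rejects carriage returns and line feeds, the characters typically used to inject extra SMTP headers. The function names and the header layout are assumptions for illustration, and a real mail component may need further checks.

```python
def checked_header_value(value: str) -> str:
    """Reject values that contain line breaks before they reach an email header."""
    if "\r" in value or "\n" in value:
        raise ValueError("line break in email header value")
    return value


def build_warning_headers(recipient: str) -> str:
    """Build the headers of a hypothetical failed-login warning email."""
    to_value = checked_header_value(recipient)
    return f"To: {to_value}\r\nSubject: Failed login attempt\r\n"
```

A recipient such as "user@example.com" passes, while "user@example.com\r\nBcc: attacker@example.com" raises ValueError instead of silently adding a Bcc header. The check lives in the email component itself, so it applies no matter which earlier validation the data already went through.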
Multistep Validation and Canonicalization
A common problem for input-handling mechanisms arises when user-supplied input is manipulated across several steps as part of the validation logic. If this process is not handled carefully, an attacker may be able to construct crafted input that smuggles malicious data through the validation mechanism. This often occurs when the app attempts to sanitize user input by removing or encoding certain characters or expressions.
A related problem concerns data canonicalization, the process of converting or decoding data into a common form. When input is sent from a user's browser, it may be encoded in various ways. These encoding schemes exist so that unusual characters and binary data can be transmitted safely over HTTP.
Suppose the server-side app uses an input filter to block certain JavaScript expressions and characters. If the input is decoded only after the filter has been applied, suitably encoded input may succeed in bypassing that filter.
There is not always a single solution to these problems. One approach is to repeat the sanitization steps until no further changes are made to the input. But where the desired sanitization involves escaping a problematic character, this may result in an infinite loop. Often the problem can be addressed only on a case-by-case basis, depending on the types of validation being performed.
Where feasible, the best solution may be to avoid attempting to clean some kinds of bad input and simply reject it altogether.
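A small sketch shows how this smuggling works. Assume a hypothetical sanitizer that removes the literal string <script> in a single pass. An attacker can nest the blocked expression so that removing it once reassembles it.

```python
def strip_script_once(value: str) -> str:
    """Hypothetical sanitizer: remove every literal '<script>' in one pass."""
    return value.replace("<script>", "")


crafted = "<scr<script>ipt>alert(1)</scr<script>ipt>"
print(strip_script_once(crafted))
# <script>alert(1)</script>
```

The output is exactly the expression the sanitizer was meant to remove, because the pieces on either side of each deleted <script> join up after the deletion.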
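The order of filtering and decoding is what matters here. The sketch below compares the two orders using urllib.parse.unquote. The filter and function names are invented for this example, and the filter itself is deliberately simplistic.

```python
from urllib.parse import unquote


def contains_script_tag(value: str) -> bool:
    """Simplistic filter: look for an opening script tag, ignoring case."""
    return "<script" in value.lower()


def filter_then_decode(raw: str):
    """Unsafe order: the filter runs on the still-encoded input."""
    if contains_script_tag(raw):
        return None  # blocked
    return unquote(raw)


def decode_then_filter(raw: str):
    """Safer order: canonicalize first, then filter the decoded value."""
    decoded = unquote(raw)
    if contains_script_tag(decoded):
        return None  # blocked
    return decoded


payload = "%3Cscript%3Ealert(1)%3C/script%3E"
print(filter_then_decode(payload))  # <script>alert(1)</script>
print(decode_then_filter(payload))  # None
```

Even the safer order decodes only once, so doubly encoded input could still slip through if a later component decodes the value again.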
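One way to picture the infinite loop is to repeat a sanitization step until the input stops changing. In the sketch below, all names are hypothetical and the pass limit is an added safeguard. A removal step settles after a few passes, while an escaping step changes the input on every pass and never settles.

```python
def apply_until_stable(step, value: str, max_passes: int = 10) -> str:
    """Repeat a sanitization step until it no longer changes the value."""
    for _ in range(max_passes):
        result = step(value)
        if result == value:
            return result
        value = result
    # Without this limit, an escaping step would loop forever.
    raise ValueError("sanitization did not settle")


def remove_script_tag(value: str) -> str:
    """Removal step: delete every literal '<script>'."""
    return value.replace("<script>", "")


def escape_quote(value: str) -> str:
    """Escaping step: put a backslash in front of every single quote."""
    return value.replace("'", "\\'")
```

Calling apply_until_stable(remove_script_tag, "<scr<script>ipt>x") returns "x", while apply_until_stable(escape_quote, "a'b") raises ValueError because each pass adds another backslash.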