Encoding Schemes For Data




Web applications use several different encoding schemes for their data. Both the HTTP protocol and the HTML language are historically text-based, and the different encoding schemes exist to ensure that these mechanisms can safely handle unusual characters and binary data. If you have read general cyber security tips, you have probably already heard about these text-based protocols and their mechanisms. When attacking an application, you will frequently need to encode data using the relevant scheme to ensure that it is handled in the way you intend. In many cases you may also be able to manipulate the encoding schemes an application uses to cause behavior that its designers did not intend.

URL Encoding

URLs may contain only the printable characters in the US-ASCII character set, that is, those with ASCII codes in the range 0x20 to 0x7e. Several characters within this range are also restricted because they have a special meaning within the URL scheme itself or within the HTTP protocol.

Why is the URL encoding scheme used? It encodes any problematic characters within the extended ASCII character set so that they can be safely transported over HTTP.

% character: the prefix of a URL-encoded character, followed by the character's two-digit ASCII code expressed in hexadecimal. For example, %3d represents = and %25 represents % itself.

+ character: represents a URL-encoded space (as does %20), so be aware of it.
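The two rules above can be tried out with Python's standard urllib.parse module. This is only a minimal sketch; the sample value in raw is a made-up parameter value chosen to contain a space and a few restricted characters.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

# Hypothetical parameter value containing a space and restricted characters.
raw = "name=a b&c%"

# safe="" asks quote() to percent-encode every reserved character.
encoded = quote(raw, safe="")
# quote_plus() does the same but writes a space as "+".
form_encoded = quote_plus(raw)

print(encoded)       # name%3Da%20b%26c%25
print(form_encoded)  # name%3Da+b%26c%25

# Decoding reverses the process; only unquote_plus() treats "+" as a space.
print(unquote(encoded) == raw)            # True
print(unquote_plus(form_encoded) == raw)  # True
```

Note that unquote() leaves a literal + untouched, so which decoder is appropriate depends on whether the value came from a form-style query string.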

Unicode Encoding

Unicode is a character encoding standard designed to support all of the world's writing systems. It employs various encoding schemes, some of which can be used to represent unusual characters in web applications.

16-bit Unicode encoding. This works in a similar way to the URL encoding we already talked about. For transmission over HTTP, the 16-bit Unicode-encoded form of a character is the %u prefix followed by the character's Unicode code point expressed in hexadecimal.

UTF-8. This is a variable-length encoding standard that employs one or more bytes to express each character. For transmission over HTTP, the UTF-8-encoded form of a multibyte character simply uses each byte expressed in hexadecimal and preceded by the % prefix.
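As a rough illustration of the two forms, the sketch below encodes the copyright sign and the not-equal sign. The helper percent_u_encode is a hypothetical function written for this example (it is not part of any library) and it assumes characters that fit in 16 bits; the UTF-8 form uses the standard library's quote(), which encodes as UTF-8 by default.

```python
from urllib.parse import quote, unquote


def percent_u_encode(text: str) -> str:
    """Hypothetical helper: write each character as %u plus its code point.

    Assumes every character fits in 16 bits (the Basic Multilingual Plane).
    """
    return "".join(f"%u{ord(ch):04x}" for ch in text)


copyright_sign = "\u00a9"  # the copyright sign
not_equal = "\u2260"       # the not-equal sign

# 16-bit form: %u prefix followed by the code point in hexadecimal.
print(percent_u_encode(copyright_sign))  # %u00a9
print(percent_u_encode(not_equal))       # %u2260

# UTF-8 form: each byte of the encoding, in hexadecimal, after a % prefix.
print(quote(copyright_sign))  # %C2%A9
print(quote(not_equal))       # %E2%89%A0

# Decoding the UTF-8 form gives back the original character.
print(unquote("%E2%89%A0") == not_equal)  # True
```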

Unicode encoding is of interest when attacking web apps because it can sometimes be used to defeat input validation mechanisms. If an input filter blocks certain malicious expressions, but the component that later processes the input understands Unicode encoding, it may be possible to bypass the filter by using various standard and malformed Unicode encodings.
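To make the idea of a malformed encoding concrete, the sketch below uses the byte pair C0 AF, an overlong two-byte form that some lenient decoders have historically treated as a forward slash. The filter function is hypothetical, and whether a given target decodes these bytes leniently is an assumption about that target; Python's own UTF-8 decoder is strict and rejects them.

```python
from urllib.parse import unquote_to_bytes

# Overlong two-byte sequence in percent-encoded form (illustrative payload).
payload = "..%c0%af"


def naive_filter_allows(value: str) -> bool:
    """Hypothetical filter that only looks for a literal '../' in the input."""
    return "../" not in value


# The filter sees no '../' in the encoded text, so it lets the input through.
print(naive_filter_allows(payload))  # True

# After percent-decoding, the raw bytes are 2E 2E C0 AF.
raw_bytes = unquote_to_bytes(payload)
print(raw_bytes.hex())  # 2e2ec0af

# A strict decoder refuses the malformed sequence instead of yielding '/'.
try:
    raw_bytes.decode("utf-8")
    strict_decoder_accepts = True
except UnicodeDecodeError:
    strict_decoder_accepts = False

print(strict_decoder_accepts)  # False
```

The gap an attacker looks for is a filter that checks the input in one form while a later component decodes it in another, more permissive way.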

HTML Encoding

HTML encoding is used to represent problematic characters so that they can be safely incorporated into an HTML document. Various characters have special meaning as metacharacters within HTML and are used to define a document's structure rather than its content.

To use these characters safely as part of the document's content, it is necessary to HTML-encode them. For example, &lt; represents <, &gt; represents >, and &amp; represents &. Any character can also be written using its decimal or hexadecimal code, such as &#34; or &#x22; for a double quote.

When you're attacking a web app, the main interest in HTML encoding is probing for cross-site scripting vulnerabilities. If the application returns the user's input unmodified within its responses, it is probably vulnerable, whereas if the dangerous characters are HTML-encoded, it may be safe.
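A small sketch of HTML encoding and of the reflection check described above, using Python's standard html module. The probe string and the two response bodies are invented for illustration, and reflected_unencoded is a hypothetical helper, not a complete cross-site scripting test.

```python
import html

# Hypothetical attacker-supplied value.
user_input = '<script>alert("xss")</script>'

# html.escape() replaces the metacharacters with named entities.
escaped = html.escape(user_input)
print(escaped)  # &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;

# Named, decimal, and hexadecimal entities all decode to the same character.
print(html.unescape("&quot; &#34; &#x22;"))  # " " "


def reflected_unencoded(probe: str, response_body: str) -> bool:
    """Hypothetical check: does the response echo the probe byte for byte?"""
    return probe in response_body


# Two made-up response bodies: one echoes the input raw, one encodes it.
unsafe_body = "<p>You searched for " + user_input + "</p>"
safe_body = "<p>You searched for " + escaped + "</p>"

print(reflected_unencoded(user_input, unsafe_body))  # True
print(reflected_unencoded(user_input, safe_body))    # False
```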

Base64 Encoding

Base64 encoding allows any binary data to be safely represented using only printable ASCII characters. It is commonly used to encode email attachments for safe transmission over SMTP. But that's not all: it is also used to encode user credentials in basic HTTP authentication.
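As a sketch of the basic authentication case, the helpers below build and decode the credential token, assuming the usual username:password layout. The function names and the sample credentials are invented for this example.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the header line for basic HTTP authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Authorization: Basic {token}"


def decode_basic_token(token: str) -> tuple:
    """Recover (username, password) from the Base64 token."""
    decoded = base64.b64decode(token).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password


# Sample credentials, chosen only for illustration.
print(basic_auth_header("user", "pass"))   # Authorization: Basic dXNlcjpwYXNz
print(decode_basic_token("dXNlcjpwYXNz"))  # ('user', 'pass')
```

Because the token decodes this easily, basic authentication offers no confidentiality on its own.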

The encoding process handles input data in blocks of three bytes. Each block is divided into four chunks of six bits each. Six bits allow 64 different possible values, so each chunk can be represented using a set of 64 characters. Base64 draws this set from printable ASCII characters only: the uppercase and lowercase letters, the digits, and the characters + and /. If the final block of input contains fewer than three bytes, the output is padded with one or two = characters.
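The block-and-chunk process can be sketched by hand and compared with the standard library. encode_block is a hypothetical helper that handles only a full three-byte block; padding is shown using base64.b64encode itself.

```python
import base64

# The 64-character set, indexed by six-bit value.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode_block(block: bytes) -> str:
    """Encode exactly three bytes as four Base64 characters."""
    if len(block) != 3:
        raise ValueError("this sketch only handles full three-byte blocks")
    value = int.from_bytes(block, "big")  # the 24 bits of the block
    # Split the 24 bits into four six-bit chunks, most significant first.
    chunks = [(value >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[chunk] for chunk in chunks)


print(encode_block(b"Man"))                     # TWFu
print(base64.b64encode(b"Man").decode("ascii")) # TWFu

# Shorter final blocks are padded with = characters.
print(base64.b64encode(b"Ma").decode("ascii"))  # TWE=
print(base64.b64encode(b"M").decode("ascii"))   # TQ==
```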

Many web apps use Base64 encoding to transmit binary data within cookies and other parameters. It is also sometimes used to obfuscate sensitive data in order to prevent trivial modification. Base64 is an encoding rather than encryption, so anyone who recognizes it can decode it.

Base64-encoded strings can often be recognized easily by their specific character set and by the presence of padding characters (=) at the end of the string.
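One way to act on those two clues is a small heuristic like the one below. try_base64_decode is a hypothetical helper; it will also accept ordinary words that happen to fit the pattern, so treat a successful decode as a hint rather than proof.

```python
import base64
import binascii
import re
from typing import Optional

# Characters from the Base64 set, followed by at most two padding characters.
BASE64_RE = re.compile(r"[A-Za-z0-9+/]+={0,2}")


def try_base64_decode(value: str) -> Optional[bytes]:
    """Return the decoded bytes if value looks like Base64, else None."""
    if len(value) % 4 != 0 or not BASE64_RE.fullmatch(value):
        return None
    try:
        return base64.b64decode(value, validate=True)
    except binascii.Error:
        return None


print(try_base64_decode("TWFu"))         # b'Man'
print(try_base64_decode("TWE="))         # b'Ma'
print(try_base64_decode("not base64!"))  # None
```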

Hex Encoding

Some applications use straightforward hexadecimal encoding, mostly when transmitting binary data, with ASCII characters representing the hexadecimal block.

As with Base64, hex-encoded data can usually be spotted easily. Our advice is to attempt to decode any such data that the server sends to the client in order to understand its function. There are many other internet security tips and tricks that can improve your safety in today's technology world.
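Hex encoding and decoding are one-liners in Python. In this sketch the username "daf" is just a sample value, and try_hex_decode is a hypothetical helper for testing whether a parameter decodes cleanly.

```python
from typing import Optional

# Hex-encode a sample username: each byte becomes two hexadecimal digits.
cookie_value = b"daf".hex()
print(cookie_value)                 # 646166
print(bytes.fromhex(cookie_value))  # b'daf'


def try_hex_decode(value: str) -> Optional[bytes]:
    """Return the decoded bytes if value is valid hexadecimal, else None."""
    try:
        return bytes.fromhex(value)
    except ValueError:
        return None


print(try_hex_decode("646166"))  # b'daf'
print(try_hex_decode("xyz"))     # None
```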

Remoting and Serialization Frameworks

In recent years, various frameworks have evolved for creating user interfaces in which client-side code can remotely access various programmatic APIs implemented on the server side.

So, what do developers get from it? It allows them to partly abstract away from the distributed nature of web apps and to write code in a manner that is closer to the paradigm of a conventional desktop app.

These frameworks typically provide stub APIs for use on the client side. They also automatically handle both the remoting of those API calls to the relevant server-side functions and the serialization of any data that is passed to those functions.

Examples of these kinds of remoting and serialization frameworks include Flex and AMF, Silverlight and WCF, and Java serialized objects.
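The stub idea can be sketched in a few lines. Everything here is hypothetical and heavily simplified: JSON stands in for the binary formats that the frameworks above actually use, and the "server" is an ordinary function in the same process rather than a remote endpoint.

```python
import json
from typing import Any, Callable


def add(a: int, b: int) -> int:
    """A server-side function exposed to clients (illustrative)."""
    return a + b


SERVER_FUNCTIONS = {"add": add}


def server_handle(request_text: str) -> str:
    """Deserialize a call, run the matching function, serialize the result."""
    request = json.loads(request_text)
    result = SERVER_FUNCTIONS[request["method"]](*request["args"])
    return json.dumps({"result": result})


class ClientStub:
    """Client-side stub: method calls are serialized and sent to a transport."""

    def __init__(self, transport: Callable[[str], str]) -> None:
        self._transport = transport

    def __getattr__(self, method: str) -> Callable[..., Any]:
        def call(*args: Any) -> Any:
            request_text = json.dumps({"method": method, "args": list(args)})
            return json.loads(self._transport(request_text))["result"]
        return call


# The caller writes what looks like a local call; the stub does the remoting.
api = ClientStub(server_handle)
print(api.add(2, 3))  # 5
```

From an attacker's point of view, the interesting part is the serialized request itself, since it can be intercepted and modified before it reaches the server.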

So, let us summarize the most important point. When attacking an application, the first step is to map the target application's content and functionality. This lets you establish how it actually functions, how it attempts to defend itself, and what technologies it uses.