What is Internal Implementation Disclosure?
Internal Implementation Disclosure is the process where a web application leaks information about the software to a malicious hacker.

By Tim Trott | Privacy & Security | March 8, 2016

Internet Security 101

This article is part of a series of articles.

The process of gathering this information is known as fingerprinting, and with leaked implementation details a malicious hacker can mount a more targeted attack on your website or company.

Typical examples of internal implementation disclosure include error messages and server signatures. The wording of a particular error message, even a 404 page, can identify what software and platform you are running the website on.

So let's have a look at a few ways in which an attacker builds up a risk profile.

Internal Implementation Disclosure - Server Response Headers

One of the first places an attacker may look (and the first place you should secure) is the response headers. When we looked at Man in the Middle attacks we looked at the HTTP response headers, specifically at the cookies. But these headers also contain quite a bit of useful information about the server and the technology the website is running on.

HTTP/1.1 200 OK 
Date: Wed, 09 Mar 2016 15:58:49 GMT 
Server: Apache/2.4.1 (Unix)
Last-Modified: Wed, 09 Mar 2016 15:57:25 GMT 
Accept-Ranges: bytes 
Content-Length: 30188 
Cache-Control: max-age=3, must-revalidate 
Expires: Wed, 09 Mar 2016 15:58:52 GMT 
X-Clacks-Overhead: GNU Terry Pratchett 
Vary: Accept-Encoding,Cookie 
Keep-Alive: timeout=5, max=100 
Connection: Keep-Alive 
Content-Type: text/html; charset=UTF-8

So from this, we can see straight away that the server is running on Apache version 2.4.1 on a Unix box. Now this is quite important information because now an attacker knows the server and version I am using (I'm not, by the way, using this version), and they can look for vulnerabilities in this version. This can easily be done by checking on the CVE Details database of known software vulnerabilities.

An attacker can also use additional information such as session cookie names and vendor-specific headers (Microsoft IIS is notorious for sending out signatures this way).
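As a rough illustration, a response's headers can be audited for this kind of leak with a few lines of Python. This is a minimal sketch: the list of header names is an assumption chosen for the example and is not exhaustive, and the function name is hypothetical.

```python
import re

# Header names that commonly reveal server or framework details.
# Illustrative and non-exhaustive; extend it for your own stack.
REVEALING_HEADERS = {"server", "x-powered-by", "x-aspnet-version", "x-aspnetmvc-version"}

# Matches a dotted version number such as 2.4.1 anywhere in a header value.
VERSION_PATTERN = re.compile(r"\d+(?:\.\d+)+")


def find_disclosing_headers(headers):
    """Return (name, value, has_version) for each revealing header found."""
    findings = []
    for name, value in headers.items():
        if name.lower() in REVEALING_HEADERS:
            has_version = bool(VERSION_PATTERN.search(value))
            findings.append((name, value, has_version))
    return findings


# A subset of the response headers shown above.
example = {
    "Server": "Apache/2.4.1 (Unix)",
    "Content-Type": "text/html; charset=UTF-8",
    "X-Clacks-Overhead": "GNU Terry Pratchett",
}

for name, value, has_version in find_disclosing_headers(example):
    print(name, value, "(version exposed)" if has_version else "(no version)")
```

Running a check like this against your own site is a quick way to see what an attacker sees before they do.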

Internal Implementation Disclosure - HTTP Fingerprinting

HTTP fingerprinting is used to identify the software being run on the server. The most obvious first telltale sign is the script extension. If you are using index.php there is a high chance that the website is running PHP, most likely on a Linux/Unix box under Apache or nginx. If you are using index.asp, chances are that you are running classic ASP on an out-of-date version of IIS on a Microsoft server. If you are using index.aspx, chances are you are running a more recent version of IIS and the .NET framework.

By modifying HTTP requests in a tool such as Fiddler (for example, changing the HTTP version from HTTP/1.1 to HTTP/1.0), you can also deduce server information from the response that comes back. Indeed, even an invalid HTTP request can return a server error which may disclose server information. Most websites have custom 404 error pages which hide the server information, but not everyone goes to the effort of hiding the server error pages which occur when a malformed request is detected. Typically these pages will contain a footer detailing the error that occurred, the date and time, and the server name and version.
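The probes described above can be sketched in Python without any particular tool. This is only an illustration under stated assumptions: the function names and the malformed request line are made up for the example, and the sample response is canned rather than captured from a real server. Only ever send probes like these to servers you own or have permission to test.

```python
def build_probe_requests(host):
    """Return raw request bytes for a normal, a downgraded and a malformed probe."""
    normal = f"GET / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    return {
        "http/1.1": normal.encode("ascii"),
        "http/1.0": b"GET / HTTP/1.0\r\n\r\n",
        # Deliberately invalid protocol token to provoke a server error page.
        "malformed": b"GET / JUNK/9.9\r\n\r\n",
    }


def summarise_response(raw):
    """Extract the status line and the Server header from raw response bytes."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    status_line = lines[0]
    server = None
    for line in lines[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "server":
            server = value.strip()
    return status_line, server


# Canned example of what an error response might look like.
sample = (
    b"HTTP/1.1 400 Bad Request\r\n"
    b"Server: Apache/2.4.1 (Unix)\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<html>...</html>"
)
print(summarise_response(sample))
```

Comparing the status line and Server header across the three probes shows how much more a server gives away when it is handed something unexpected.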

Apache default 404 error page leaks Internal Implementation Disclosure

Internal Implementation Disclosure - Disclosure via robots.txt

Most website owners use the robots.txt file to give instructions about their site to web robots; this is called the Robots Exclusion Protocol. When a robot wants to visit a page on the site, it first checks for the existence of the robots.txt file and, if found, looks inside to see if it can access the intended URL.

It is common practice to add certain resources to this file to prevent search engines from indexing sensitive areas of the site. While this is a good thing to do from the point of view of a good well-behaved search engine, in reality, you are telling any malicious hacker the location of areas on the site you want to be kept secret. This can include the URL to the administration areas, pages or directories to be kept private.

An example robots.txt may look like this:

User-agent: *
Disallow: /feed/
Disallow: /trackback/
Disallow: /admin/
Disallow: /includes/
Disallow: /files/
Disallow: /xmlrpc.php
Sitemap: /sitemap.xml

From this, we can see that the /admin/ folder is not to be indexed by search engines and is, therefore, a target for attack. We can also see a /files/ folder which again is another target as is xmlrpc.php. Not only does this disclose information about the server software being run, but also lists out several attack vectors which can be used.

I'm not saying don't use robots.txt, just be careful what you put in it. Any sensitive information should be secured away by more than just relying on hiding it from web robots.
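To see how little effort this takes an attacker, here is a minimal Python sketch that pulls the Disallow entries out of the example file above and flags the ones that look interesting. The keyword list and function names are assumptions made for illustration.

```python
# Substrings that hint at a sensitive area. Purely illustrative.
SENSITIVE_HINTS = ("admin", "backup", "private", "includes", "files", "login")

ROBOTS_TXT = """\
User-agent: *
Disallow: /feed/
Disallow: /trackback/
Disallow: /admin/
Disallow: /includes/
Disallow: /files/
Disallow: /xmlrpc.php
Sitemap: /sitemap.xml
"""


def disallowed_paths(robots_txt):
    """Return every non-empty Disallow path, ignoring comments."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


def flag_sensitive(paths):
    """Return the paths containing any of the sensitive hints."""
    return [p for p in paths if any(hint in p.lower() for hint in SENSITIVE_HINTS)]


print(flag_sensitive(disallowed_paths(ROBOTS_TXT)))
```

Running the same check against your own robots.txt is a cheap way to spot entries that advertise more than they hide.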

Internal Implementation Disclosure - Risks in HTML Source

Information can also be leaked through the HTML source. Most people don't realise how much information can be gleaned from looking at the page source. All developers know how to do this, as do all malicious hackers. Each detail may seem small and insignificant, but added together they can be used to build up a good profile of the software you are running, the server software and the operating system. This information can then be used to create a highly targeted attack on your site.

Most of the information is pretty mundane but there are certain telltale signs which can lead to internal information disclosure.

The first thing an attacker may do is examine the source to identify the server software and any client-side frameworks such as jQuery.

As with robots.txt, the presence of certain URLs in the source can be indicative of the software being run. For example, if the page is loading content from /wp-content/ and the login link accesses /wp-admin/, chances are the website is running on WordPress.

If there are links to responses accessed from DependencyHandler.axd, or scripts being loaded from a /umbraco/ folder you can be fairly sure that the website is running Umbraco on a Microsoft platform.

Another risk is from yourself or your developers. We've all done it, added in a var_dump or HTML comment with a variable for debugging. They are not shown on the front end and can be easily forgotten, but they are still there, waiting for anyone to view the source.

Some of the riskier HTML comments left by developers have included a link to download a backup or SQL dump file. The link was intended as an easy way for a developer to download the file in a test environment and wasn't meant to reach production, but it got left in and, as a result, the database was leaked. Such a small mistake can have disastrous consequences.
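A simple pre-release check can catch leftovers like these. The sketch below uses Python's standard html.parser module to collect HTML comments and flag any containing suspicious keywords. The keyword list, the sample page and the file path in it are all made up for illustration.

```python
from html.parser import HTMLParser

# Keywords that suggest a comment should not reach production. Illustrative only.
SUSPICIOUS = (".sql", ".bak", ".zip", "password", "todo", "debug")


class CommentCollector(HTMLParser):
    """Collects the text of every HTML comment in a document."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        self.comments.append(data.strip())


def risky_comments(html):
    """Return the comments that contain any suspicious keyword."""
    collector = CommentCollector()
    collector.feed(html)
    collector.close()
    return [c for c in collector.comments
            if any(keyword in c.lower() for keyword in SUSPICIOUS)]


page = (
    "<html><!-- nav start --><body>"
    "<!-- download: /backup/site.sql -->"
    "<p>Hello</p></body></html>"
)
print(risky_comments(page))
```

A check like this could run as part of a build or deployment step, so forgotten debugging comments are caught before anyone can view the source.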

Internal Implementation Disclosure - Internal Error Message Leakage

As we briefly mentioned in HTTP Fingerprinting, application error messages can reveal the server software being used. This normally happens when internal exception messages bubble up to the top and are shown to the user. Common errors include being unable to access the database, data that could not be validated, and null reference exceptions.

Most website owners create a custom 404 error page for when something isn't found, but the 500 internal server error page is often omitted.

Certainly, for ASP.Net error messages, the Yellow Screen of Death gives out details of the error in question but also includes line numbers and actual lines of code that failed. This can also reveal, in the instance of database errors, SQL queries or even connection string details. This is a gold mine for malicious hackers to compromise your application.

Further information given out includes the file path to the file causing the error, and a full stack trace showing the current execution path and even server details and exact version information. This is a massive security leak and you should be doing all you can to hide this information away in a custom error page which only gives out the information that the public should see - a human-readable error only.
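The general idea is independent of ASP.Net, and can be sketched in Python: log the full details privately and show the visitor only a generic message with a reference they can quote. This is a minimal sketch rather than a drop-in solution; the function names, logger name and the failing page are hypothetical.

```python
import logging
import uuid

logger = logging.getLogger("app.errors")


def run_safely(handler, *args, **kwargs):
    """Call handler; on failure, log the details and return a generic message."""
    try:
        return 200, handler(*args, **kwargs)
    except Exception:
        # Short random reference so support staff can find the log entry.
        error_id = uuid.uuid4().hex[:8]
        # The full stack trace goes to the server-side log only.
        logger.exception("Unhandled error %s", error_id)
        return 500, f"Sorry, something went wrong. Reference: {error_id}"


def broken_page():
    # Simulates an internal failure whose message contains sensitive details.
    raise RuntimeError("Login failed for user 'sa' on DBSERVER01")


status, body = run_safely(broken_page)
print(status, body)
```

The visitor sees a human-readable error, while the stack trace, file paths and connection details stay in a log that only you can read.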

ASP.Net Exception Yellow Screen of Death

Of course, you should be testing code and eliminating problems like this before the application reaches production. Performing an action on a website and being presented with an error message, default or custom, still identifies an area of the site which contains code which may have a vulnerability and deserves further investigation. We'll see more on this in the SQL injection tutorial.

Internal Implementation Disclosure - Poor Access Controls

The last thing we are going to look at is areas where there are poor, or no, access controls in place on things like diagnostic information such as tracing information, backups and server logs. ASP.Net has several features which may be enabled or installed and which, without proper access controls, prove to be a gold mine to malicious attackers. The built-in trace tool will store page requests and errors in a log accessed at /trace.axd. Another popular logger is ELMAH, a logging extension which also logs to a publicly accessible resource. Logging frameworks log details in very predictable paths, all of which will be scanned by an attacker. If it's unprotected you are pretty much open to attacks.

Some smaller websites store backups and logs in a folder in publicly accessible areas of the site, or worse still a developer may log information to a publicly available area because it's easy to access. Although not advertised, it is still accessible.
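You can check your own site for these predictable paths before an attacker does. The Python sketch below takes a function that returns the HTTP status code for a URL, so it can be wired up to whichever HTTP client you prefer. The /backup/ and /logs/ entries are illustrative guesses rather than a definitive list, and the function names are hypothetical.

```python
from urllib.parse import urljoin

# Predictable diagnostic and backup locations. The last two are guesses.
CANDIDATE_PATHS = ("/trace.axd", "/elmah.axd", "/backup/", "/logs/")


def find_exposed(base_url, fetch_status, paths=CANDIDATE_PATHS):
    """Return the URLs for which fetch_status(url) reports 200 OK."""
    exposed = []
    for path in paths:
        url = urljoin(base_url, path)
        if fetch_status(url) == 200:
            exposed.append(url)
    return exposed


def fake_fetch_status(url):
    # Stand-in for a real HTTP client: pretends only the trace log is public.
    return 200 if url.endswith("/trace.axd") else 404


print(find_exposed("https://example.test", fake_fetch_status))
```

Anything this kind of check reports should either be removed from the public site or placed behind proper authentication.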