- Introduction to Hacking
- History of Cryptography
- Online Privacy And Why It Matters
- Supercookies: The Web's Latest Tracking Device
- Ultimate Guide to SSL for the Newbie
- How Internet Security and SSL Works to Secure the Internet
- Man in the Middle Hacking and Transport Layer Protection
- Social Engineering
- Cookie Security and Session Hijacking
- What is Cross Site Scripting? (XSS)
- What is Internal Implementation Disclosure?
- Parameter Tampering and How to Protect Against It
- What are SQL Injection Attacks?
- Protection Against Cross Site Attacks
The process is known as fingerprinting: through internal implementation disclosure, a malicious hacker can mount a far more targeted attack on your website or company.
Typical examples of internal implementation disclosure include error messages and server signatures. The wording of a particular error message, even a 404 page, can identify the software and platform the website is running on.
So let's have a look at a few ways in which an attacker builds up a risk profile.
Server Response Headers
One of the first places an attacker may look (and the first place you should secure) is the response headers. When we looked at Man in the Middle attacks we examined the HTTP response headers, specifically the cookies. But these headers also contain quite a bit of useful information about the server and the technology the website is running on.
HTTP/1.1 200 OK
Date: Wed, 09 Mar 2016 15:58:49 GMT
Server: Apache/2.4.1 (Unix)
Last-Modified: Wed, 09 Mar 2016 15:57:25 GMT
Accept-Ranges: bytes
Content-Length: 30188
Cache-Control: max-age=3, must-revalidate
Expires: Wed, 09 Mar 2016 15:58:52 GMT
X-Clacks-Overhead: GNU Terry Pratchett
Vary: Accept-Encoding,Cookie
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=UTF-8
So from this we can see straight away that the server is running Apache version 2.4.1 on a Unix box. This is quite important information: now that an attacker knows the server and version I am using (I'm not, by the way, using this version), they can look for vulnerabilities in that version. This can easily be done by checking the CVE Details database of known software vulnerabilities.
An attacker can also use additional information such as session cookie names and vendor-specific headers (Microsoft IIS is notorious for sending out signatures this way).
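To see how little effort this reconnaissance takes, here is a minimal Python sketch that picks the revealing headers out of a raw response block. The list of header names is illustrative, not exhaustive:

```python
def extract_leaky_headers(raw_headers):
    """Given a raw HTTP response header block, return the headers that
    commonly disclose server implementation details."""
    # Header names an attacker checks first (an illustrative sample).
    revealing = {"server", "x-powered-by", "x-aspnet-version", "via"}
    leaks = {}
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() in revealing:
            leaks[name.strip()] = value.strip()
    return leaks

raw = ("HTTP/1.1 200 OK\n"
       "Server: Apache/2.4.1 (Unix)\n"
       "Content-Type: text/html; charset=UTF-8")
print(extract_leaky_headers(raw))  # {'Server': 'Apache/2.4.1 (Unix)'}
```

Run against the example response above, this immediately isolates the Apache version string an attacker would feed into a vulnerability database search.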
HTTP fingerprinting is used to identify the software being run on the server. The most obvious telltale sign is the script extension. If you are using index.php there is a high chance that the website is running on a Linux/Unix box under Apache or nginx with PHP. If you are using index.asp, chances are you are using an out-of-date version of IIS on a Microsoft server. If you are using index.aspx, chances are you are running a more recent version of IIS and the .NET framework.
By modifying the HTTP requests in Fiddler, for example changing the HTTP version from HTTP/1.1 to HTTP/1.0 you can also deduce server information based on the information it returns. Indeed, even an invalid HTTP request can return a server error which may disclose server information. Most websites have custom 404 error pages which hide the server information, but not everyone goes to the effort of hiding server error pages which occur when a malformed request is detected. Typically these pages will contain a footer detailing the error that occurred, date and time, and the server name and version.
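The downgrade trick above can be sketched in a few lines of Python. The probe builder is separated from the socket send so the raw request can be inspected first; the host and port are whatever server you are testing against:

```python
import socket

def build_downgrade_probe(path="/"):
    """Craft a minimal HTTP/1.0 request with no Host header; many
    servers answer this with a default page naming the software."""
    return f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii")

def send_probe(host, probe, port=80):
    """Deliver the raw probe over a plain socket and collect the
    full reply, including any version-revealing error page."""
    with socket.create_connection((host, port), timeout=10) as s:
        s.sendall(probe)
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("latin-1", errors="replace")
```

A malformed probe (for example an invalid method string) can be built the same way; the point is simply that degenerate requests often fall through to default error handling that has not been customised.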
Disclosure via robots.txt
Most website owners use the robots.txt file to give instructions about their site to web robots; this is called the Robots Exclusion Protocol. When a robot wants to visit a page on the site, it first checks for the existence of the robots.txt file and, if found, looks inside to see if it can access the intended URL.
It is common practice to add certain resources to this file to prevent search engines from indexing sensitive areas of the site. While this works well for a well-behaved search engine, in reality you are telling any malicious hacker the location of the areas of the site you want kept secret. This can include the URL of the administration area and any pages or directories meant to be private.
An example robots.txt may look like this:
User-agent: *
Disallow: /feed/
Disallow: /trackback/
Disallow: /admin/
Disallow: /includes/
Disallow: /files/
Disallow: /xmlrpc.php
Sitemap: /sitemap.xml
From this we can see that the /admin/ folder is not to be indexed by search engines and is therefore a target for attack. We can also see a /files/ folder, which is another target, as is xmlrpc.php. Not only does this disclose information about the server software being run, it also lists a number of attack vectors ready to be probed.
I'm not saying don't use robots.txt, just be careful what you put in it. Any sensitive resource should be secured with proper access controls, not just hidden from web robots.
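As a defender you can audit your own robots.txt the same way an attacker would: pull out the Disallow entries, which are exactly the paths that will be probed first. A minimal Python sketch:

```python
def disallowed_paths(robots_txt):
    """Extract the Disallow entries from a robots.txt body."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if line.lower().startswith("disallow:"):
            path = line.split(":", 1)[1].strip()
            if path:
                paths.append(path)
    return paths

robots = ("User-agent: *\n"
          "Disallow: /admin/\n"
          "Disallow: /files/\n"
          "Disallow: /xmlrpc.php\n"
          "Sitemap: /sitemap.xml")
print(disallowed_paths(robots))  # ['/admin/', '/files/', '/xmlrpc.php']
```

Every path this returns should be protected by authentication or removed from the file, not merely hidden from crawlers.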
Risks in HTML source
Information can also be leaked through the HTML source. Most people don't realise how much can be gleaned from looking at the page source, and every developer (and every malicious hacker) knows how to do it. Each clue may seem small and insignificant, but added up they build a good profile of the software you are running, the server software and the operating system. This information can then be used to create a highly targeted attack on your site.
Most of the information is pretty mundane, but there are certain telltale signs which can lead to internal information disclosure.
The first thing an attacker may examine is the source itself, to identify the server software and even any client-side frameworks such as jQuery.
As with robots.txt, the presence of certain URLs in the source can be indicative of the software being run. For example, if the page is loading content from /wp-content/ and the login link points to /wp-admin/, chances are the website is running on WordPress.
If there are links to responses accessed from DependencyHandler.axd, or scripts being loaded from a /umbraco/ folder you can be fairly sure that the website is running Umbraco on a Microsoft platform.
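These source-level tells are easy to scan for mechanically. The signature table in this Python sketch is a small hypothetical sample; a real fingerprinting tool would carry hundreds of entries:

```python
# Illustrative URL fragments and the platform each one hints at.
SIGNATURES = {
    "/wp-content/": "WordPress",
    "/umbraco/": "Umbraco (Microsoft stack)",
    "dependencyhandler.axd": "Umbraco / ASP.NET",
    "jquery": "jQuery on the client",
}

def fingerprint_html(html):
    """Return the platform hints whose signature fragments appear
    anywhere in the page source (case-insensitive)."""
    lowered = html.lower()
    return {hint for frag, hint in SIGNATURES.items() if frag in lowered}
```

Running this over a saved page source gives a quick picture of what your site is advertising about its stack without you ever saying a word.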
Another risk comes from yourself or your developers. We've all done it: added a var_dump or an HTML comment containing a variable for debugging. These may not be visible on the rendered page and are easily forgotten, but they are still in the source, waiting for anyone who views it.
Some of the riskier HTML comments seen in the wild have included a link to download a backup or SQL dump file. Obviously this was intended as an easy way for a developer to grab the file in a test environment and was never meant to reach production, but it got left in, and as a result the database was leaked. Such a small mistake can have disastrous consequences.
Internal Error Message leakage
As we briefly mentioned under HTTP fingerprinting, application error messages can reveal the server software being used. This normally happens when internal exception messages bubble up to the top and are shown to the user. Common examples include failures to access the database, data validation errors and null reference exceptions.
Most website owners create a custom 404 error page for when something isn't found, but the 500 internal server error page is often omitted.
For ASP.NET in particular, the Yellow Screen of Death gives out details of the error in question, including line numbers and the actual lines of code that failed. In the case of database errors it can also reveal the SQL query or even connection string details. This is a gold mine for a malicious hacker looking to compromise your application.
Further information given out includes the file path of the file causing the error and a full stack trace showing the current execution path, along with server details and exact version information. Clearly this is a massive security leak, and you should be doing all you can to hide it behind a custom error page that shows only what the public should see: a human-readable error message.
Of course you should be testing code and eliminating problems like this before the application reaches production. Performing an action on a website and being presented with an error message, default or custom, still identifies an area of the site which contains code which may have a vulnerability and deserves further investigation. We'll see more on this in the SQL injection tutorial.
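On the defensive side, the usual pattern is to log the full exception server-side under a reference id and return only a generic message to the client. A framework-agnostic Python sketch (the function name and response shape are my own, not from any particular framework):

```python
import logging
import uuid

log = logging.getLogger("app")

def safe_error_response(exc):
    """Log the full exception server-side under a short reference id,
    and return only a generic, human-readable message to the client."""
    ref = uuid.uuid4().hex[:8]  # correlation id for support/debugging
    log.error("unhandled error [%s]", ref, exc_info=exc)
    # No stack trace, file paths, SQL or version info leaves the server.
    return 500, f"Something went wrong. Reference: {ref}"
```

The reference id lets support staff find the real stack trace in the logs without any internal detail ever reaching the browser.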
Poor access controls
The last thing we are going to look at is areas with poor, or no, access controls on diagnostic information such as tracing output, backups and server logs. ASP.NET has several features which, if enabled or installed without proper access controls, prove to be a gold mine for malicious attackers. The built-in trace tool stores page requests and errors in a log accessed at /trace.axd. Another popular logger is Elmah, a logging extension which also logs to a web-accessible resource (commonly /elmah.axd). Logging frameworks write to very predictable paths, all of which will be scanned by an attacker; if they are unprotected you are pretty much open to attack.
Some smaller websites store backups and logs in publicly accessible areas of the site, or worse still a developer may log information to a publicly available area because it's easy to reach. Although not advertised, it is still accessible.
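A simple self-audit is to walk a checklist of well-known diagnostic paths and flag any that respond successfully. In this Python sketch the path list is illustrative, and `fetch_status` stands in for whatever HTTP client you use, so the logic can be tested without touching the network:

```python
# Illustrative checklist of diagnostic endpoints attackers scan for.
DIAGNOSTIC_PATHS = [
    "/trace.axd",    # ASP.NET built-in trace viewer
    "/elmah.axd",    # Elmah error log
    "/backup.zip",   # careless backup dumps
    "/logs/",        # exposed log directories
]

def audit_paths(fetch_status, paths=DIAGNOSTIC_PATHS):
    """Report which well-known diagnostic paths answer with HTTP 200.
    `fetch_status` is any callable mapping a path to a status code."""
    return [p for p in paths if fetch_status(p) == 200]
```

Anything this flags on your own site should be locked behind authentication or removed entirely; if your audit script can find it, so can an attacker's scanner.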