Høgskolen i Gjøvik


Hammersland, Rune

Hammersland, Rune: Finding weaknesses in web applications through the means of fuzzing

Presentation

Topic covered by the project

Fuzzing is a technique developed by Barton P. Miller at the University of Wisconsin in the USA. He and his colleagues have successfully used fuzzing to discover flaws in command line tools for UNIX-like systems, command line tools and GUI programs running under the X11 Window System, as well as command line tools and GUI programs running on Microsoft Windows and Apple Mac OS X. For the command line tools, they generated a file containing random strings of characters, which they piped to a program to find out whether the program crashed (dumped core) or hung (typically by entering an infinite loop). For the GUI applications, they generated random key presses and mouse clicks, as well as other mouse events such as drag and scroll, and sent them to the program under test.
Using this technique, they discovered that several programs handled random input poorly, and many of them crashed. Where source code was available, they studied the core dump and the source code to find out where the problem occurred. Many of the problems were due to simple mistakes, such as neglecting to check the return value of a function before using the result.
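The command line procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the tooling Miller et al. used: the function names (`random_bytes`, `fuzz_once`, `fuzz`), the timeout and the input sizes are assumptions made for the example, and the crash check relies on POSIX behaviour, where a process killed by a signal is reported with a negative return code.

```python
import random
import subprocess
import sys


def random_bytes(rng, max_len=1024):
    """Return a random-length sequence of random bytes (the fuzz data)."""
    length = rng.randint(0, max_len)
    return bytes(rng.randrange(256) for _ in range(length))


def fuzz_once(command, data, timeout=5.0):
    """Pipe data to the standard input of command and classify the outcome.

    Returns "hang" if the program does not finish within the timeout,
    "crash" if it was killed by a signal (negative return code on POSIX),
    and "ok" otherwise.
    """
    try:
        result = subprocess.run(
            command,
            input=data,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "hang"
    return "crash" if result.returncode < 0 else "ok"


def fuzz(command, runs=20, seed=0):
    """Run command repeatedly on random input and collect the failing inputs."""
    rng = random.Random(seed)  # fixed seed so that failures can be reproduced
    failures = []
    for _ in range(runs):
        data = random_bytes(rng)
        outcome = fuzz_once(command, data)
        if outcome != "ok":
            failures.append((outcome, data))
    return failures
```

A call such as `fuzz(["sort"])` would feed twenty random inputs to the `sort` utility and return the inputs that made it crash or hang, if any. Keeping the failing input alongside the outcome is what makes it possible to go back to the source code afterwards.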
Little or no research has been done on fuzzing of web applications. Some tools are available, among them Paros, SPIKE and RFuzz. The first two work by acting as an HTTP proxy, which allows you to modify the POST or GET values passed to a web site. The last one is more of a fuzzing framework: it enables a programmer to fuzz web sites programmatically and, optionally, to generate statistics through the R project or the Ruby Reports library.
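To make the idea of modifying GET values concrete, the following Python sketch replaces the value of one query parameter in a URL with random printable characters. It does not reproduce how Paros, SPIKE or RFuzz work internally; the function names and the example URL are hypothetical, and only the Python standard library is used.

```python
import random
import string
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def random_string(rng, max_len=64):
    """Return a random string of printable ASCII characters."""
    length = rng.randint(1, max_len)
    return "".join(rng.choice(string.printable) for _ in range(length))


def fuzz_query(url, rng):
    """Return url with the value of one randomly chosen GET parameter replaced.

    Parameter names and order are preserved; a URL without a query string
    is returned unchanged.
    """
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    if not params:
        return url
    index = rng.randrange(len(params))
    name, _ = params[index]
    params[index] = (name, random_string(rng))
    # urlencode percent-encodes the fuzz data, so the result is still a valid URL
    return urlunsplit(parts._replace(query=urlencode(params)))
```

For example, `fuzz_query("http://localhost:8080/search?q=hello&page=2", random.Random(0))` returns the same URL with either `q` or `page` set to random data. The fuzzed URL could then be requested from a locally installed test instance of the application.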
Keywords

Problem description

As evidenced by Miller et al., many applications are not robust enough against random input. While they have researched how fuzzing affects command line and GUI applications, little or no research has been done on how it affects web applications. Tools do exist, but to the writer's knowledge, no reports have been published on how web applications stand up to fuzzing. With the ubiquitous blogs and user-contributed web sites that exist in this Web 2.0 world, it would be interesting to find out how robust the most used applications are. When handling large amounts of user input, it is important that no input can put the web application in an undefined state, in other words, crash it. Many programmers choose to use a web framework to avoid having to handle these problems themselves, and others make their own frameworks to simplify things. In both cases erroneous user input might still affect their application, as nothing prevents the programmer from doing "stupid" things such as evaluating the user input as code (e.g. when using a dynamic language like Perl). Articles have been written on how a programmer can evaluate untrusted code "safely"; however, that is outside the scope of this project.

Justification, motivation and benefits

Because so many web sites give users the possibility to collaborate and contribute to the site, they are also vulnerable to erroneous input and to users with malicious intent. By typing random data into the fields provided, either by accident or on purpose, users may put the web application in an undefined state, where it no longer responds to new requests.
Through fuzz testing, we can find out how well web applications handle random input, as opposed to the input the programmer expected (whether legitimate or illegitimate). By discovering where the applications fail to handle the fuzz data in a controlled manner, we can find out which programming practices resulted in the sloppy code, and possibly correct the mistakes made.

Research questions

Questions we are looking to answer are:

  • Is fuzzing useful for testing web applications?
  • Which web applications handle user input in a manner that is safe from fuzz data?
  • If a web application fails to handle the fuzz data:
    • Why is this?
    • Is it due to bad programming practices?
    • How can the application be fixed?
  • Are there any web frameworks that are more vulnerable to fuzz testing than others?

As stated earlier, research has been done on how well command line and GUI applications handle fuzz data. Some research has also been done on fuzzing of network protocols, but to the writer's knowledge, similar tests have not been done on web applications.

Planned contributions

This project will look at several high-profile web applications that can be installed on a local machine, and at how they handle fuzz data as input. We will not look at how fuzz testing affects hosted solutions, such as YouTube, as testing other people's production systems would be unethical. We will create a detailed listing of the flaws found in the web applications tested, and where possible we will include information on why the application failed and how to fix the mistake, much as Miller et al. did. We might also check how these applications stand up to SQL injection attacks and cross-site scripting attacks, but this is not directly related to the random testing technique we know as "fuzzing".

17.03.2009