Medusa Group Security Blog: Introduction to fuzzing

The applications we use keep getting more complex. Every user input and every field in a configuration file can cause a crash, and it is getting harder to test programs manually in a way that covers all cases.
Fuzzing is an automated testing technique that makes it possible to run a practically unlimited number of test cases. A fuzzer feeds a program unexpected or random input data while monitoring its behaviour. The goal is to make the program crash or behave unexpectedly while processing that input.

How does fuzzing work?

Fuzzers generate a large amount of data and provide it as input to a program. To be more efficient, a fuzzer is given either a set of example inputs or a grammar that generates inputs the program can normally process without crashing. This initial set of inputs is called a corpus.

During each iteration, the fuzzer randomly mutates an input from the corpus by performing one or more of the following actions:

Flips any random bit (from 0 to 1, or vice-versa).
Sets a random value to a random byte (to any value from 0 to 255).
Increments a random value by 1.
Permutes random parts of an input.
Removes a random part from an input.

The exact mutations differ from fuzzer to fuzzer, but most fuzzers apply the techniques described above.
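
The mutations listed above can be sketched in a few lines of Python. This is only an illustration, not how any particular fuzzer implements them: the function names and the max_steps parameter are made up for this example, and "permuting parts" is simplified to swapping the two parts of the input around a random cut point.

```python
import random


def flip_bit(data: bytes, rng: random.Random) -> bytes:
    """Flip one randomly chosen bit (0 -> 1 or 1 -> 0)."""
    buf = bytearray(data)
    pos = rng.randrange(len(buf))
    buf[pos] ^= 1 << rng.randrange(8)
    return bytes(buf)


def set_byte(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one randomly chosen byte with a random value in 0..255."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def increment_byte(data: bytes, rng: random.Random) -> bytes:
    """Add 1 to one randomly chosen byte, wrapping 255 back to 0."""
    buf = bytearray(data)
    pos = rng.randrange(len(buf))
    buf[pos] = (buf[pos] + 1) % 256
    return bytes(buf)


def swap_parts(data: bytes, rng: random.Random) -> bytes:
    """Swap the two parts of the input around a random cut point."""
    if len(data) < 2:
        return data
    cut = rng.randrange(1, len(data))
    return data[cut:] + data[:cut]


def delete_chunk(data: bytes, rng: random.Random) -> bytes:
    """Remove a random chunk, always leaving at least one byte."""
    if len(data) < 2:
        return data
    start = rng.randrange(len(data))
    max_len = len(data) - start
    if start == 0:
        max_len -= 1  # never delete the whole input
    length = rng.randrange(1, max_len + 1)
    return data[:start] + data[start + length:]


# Hypothetical list of mutators; real fuzzers have many more.
MUTATORS = [flip_bit, set_byte, increment_byte, swap_parts, delete_chunk]


def mutate(data: bytes, rng: random.Random, max_steps: int = 3) -> bytes:
    """Apply between one and max_steps randomly chosen mutators."""
    if not data:
        return data
    for _ in range(rng.randint(1, max_steps)):
        data = rng.choice(MUTATORS)(data, rng)
    return data
```

Passing an explicit random.Random instance keeps a run reproducible, which helps when a particular mutation sequence has to be replayed.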

The mutated data is then provided as input to the program.

During fuzzing, the program's state is monitored to detect crashes and to save the related information: the input data that caused the crash, register values, information about the environment, and so on. The crash data can then be analyzed to find and fix the root cause of a bug (or exploit it ;) ).
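
As a rough sketch of that monitoring step, the Python snippet below runs a target on one input and records the failure together with the input that caused it. Everything here is an assumption made for illustration: toy_target is a hypothetical length-prefixed parser with a deliberate bug, CrashRecord and run_once are invented names, and a Python exception stands in for the crash of a native program, which a real fuzzer would detect from outside the process.

```python
import traceback
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CrashRecord:
    input_data: bytes  # the input that triggered the failure
    error: str         # exception type and message
    trace: str         # where in the target the failure happened


def toy_target(data: bytes) -> int:
    """Hypothetical target: first byte is the payload length.

    Bug: the declared length is trusted, so a payload shorter than
    declared causes an out-of-range read.
    """
    length = data[0]
    payload = data[1:1 + length]
    return payload[length - 1]  # last byte of the declared payload


def run_once(target: Callable[[bytes], object],
             data: bytes) -> Optional[CrashRecord]:
    """Run the target on one input; return a record if it fails."""
    try:
        target(data)
    except Exception as exc:
        return CrashRecord(
            input_data=data,
            error=f"{type(exc).__name__}: {exc}",
            trace=traceback.format_exc(),
        )
    return None
```

Here run_once(toy_target, b"\x02ab") returns None because the input is well formed, while an input that declares five payload bytes but carries only two yields a CrashRecord holding that input.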

Genetic algorithms in fuzzing

To be more efficient, modern fuzzers employ genetic algorithms. For that reason, most of them need to instrument the target executable so that they can track code coverage during the fuzzing process. Instead of relying only on the initial corpus provided by the user, such fuzzers constantly add new inputs to the corpus.

New inputs are added based on code coverage: if a newly tested mutated input reaches a part of the program that was not reached before, it is added to the corpus. That way the fuzzer tries to increase code coverage so that as many parts of the program as possible are tested.
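
The coverage feedback loop can be sketched as follows. This is a toy model under stated assumptions, not AFL++'s actual algorithm: toy_target is a hypothetical function that reports which branches it reached (a real fuzzer gets this from instrumentation), the branch names are invented, and the only mutation used is overwriting one random byte.

```python
import random


def toy_target(data: bytes) -> set:
    """Hypothetical instrumented target: returns IDs of branches reached."""
    covered = {"entry"}
    if len(data) > 0 and data[0] == ord("F"):
        covered.add("F")
        if len(data) > 1 and data[1] == ord("U"):
            covered.add("FU")
            if len(data) > 2 and data[2] == ord("Z"):
                covered.add("FUZ")
    return covered


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one random byte with a random value (input must be non-empty)."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(seeds, iterations: int, rng: random.Random):
    """Grow the corpus with every input that reaches a new branch."""
    corpus = list(seeds)
    seen = set()
    for seed in corpus:
        seen |= toy_target(seed)

    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        coverage = toy_target(candidate)
        if not coverage <= seen:  # at least one branch not seen before
            seen |= coverage
            corpus.append(candidate)
    return corpus, seen
```

Because an input starting with "F" is kept as soon as it is found, later mutations can build on it to reach the nested "FU" and "FUZ" branches one step at a time instead of having to guess all three bytes at once.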

AFL++

AFLplusplus (AFL++) is a fuzzer based on the AFL (American Fuzzy Lop) fuzzer by Michał “lcamtuf” Zalewski. It was developed to improve the capabilities of AFL and to continue its support, since the original AFL is no longer maintained.

AFL++ is easy to use. To start fuzzing, take the following steps:

Select a target binary.
Compile the binary with the AFL++ compilers to instrument it.
Prepare the initial corpus.
Run AFL++.
Analyze the results.

Installing AFL++

Download the latest version with:

$ git clone https://github.com/AFLplusplus/AFLplusplus
$ cd AFLplusplus

Build and install:

$ make distrib
$ sudo make install

Selecting a target binary

The target binary will be fuzzgoat, an intentionally vulnerable application written to demonstrate fuzzing. Fuzzgoat can be cloned from its repository as follows:

$ git clone https://github.com/fuzzstati0n/fuzzgoat  
$ cd fuzzgoat/

Compiling the binary with AFL++

To instrument the target binary during compilation, afl-gcc will be used:

$ export CC=afl-gcc
$ make

Prepare initial corpus

The corpus is the set of files the fuzzer uses as initial input to perform mutations on. AFL++ is a "smart" fuzzer: by using genetic algorithms, it can discover the input structure that a target binary expects. Since fuzzgoat is an intentionally vulnerable program, AFL++ will figure out the structure soon enough, so an arbitrary input file is sufficient here:

$ mkdir fuzzgoat_in
$ echo Medusagroup.tech > fuzzgoat_in/input

Running AFL++

Before starting the fuzzing process, create an output folder where crash information will be stored:

$ mkdir fuzzgoat_out

To start fuzzing the following command will be used:

$ afl-fuzz -i fuzzgoat_in -o fuzzgoat_out -- ./fuzzgoat @@

Breaking the above command into parts:

-i fuzzgoat_in specifies the input directory to take the test cases from.
-o fuzzgoat_out specifies the output directory where files for crashes, hangs and the queue are stored.
-- separates AFL++'s arguments from the target's command line. AFL++'s arguments go on the left side and the target's run command on the right side.
@@ marks the position in the target's command line where AFL++ inserts the path of the test file.

After running the command, AFL++ displays a status screen with information about the fuzzing process, such as run time, saved crashes, number of cycles done and more.

Analyzing the results

The crash files can be found in the fuzzgoat_out folder (recent AFL++ versions typically place them in fuzzgoat_out/default/crashes). They can be used to reproduce and analyze the issue.
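
Reproducing the crashes can be automated with a short script. The sketch below is one possible approach under assumptions, not part of AFL++: the build_command and replay names are invented, the command template uses the same @@ placeholder as the afl-fuzz command line, and the crash directory is assumed to contain one file per crashing input, plus possibly a README.txt, which is skipped.

```python
import subprocess
from pathlib import Path


def build_command(template, input_path):
    """Replace the @@ placeholder with the path of the input file."""
    return [str(input_path) if arg == "@@" else arg for arg in template]


def replay(template, crash_dir, timeout=5.0):
    """Run the target once per crash file and collect the exit status.

    Returns (file name, return code) pairs. On POSIX a negative return
    code -N means the process was killed by signal N; None means the
    run exceeded the timeout.
    """
    results = []
    for path in sorted(Path(crash_dir).iterdir()):
        if not path.is_file() or path.name == "README.txt":
            continue
        try:
            proc = subprocess.run(
                build_command(template, path),
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
            status = proc.returncode
        except subprocess.TimeoutExpired:
            status = None
        results.append((path.name, status))
    return results
```

For the example in this post, a call such as replay(["./fuzzgoat", "@@"], "fuzzgoat_out/default/crashes") would run fuzzgoat on every saved crash file; the directory path is an assumption and may need adjusting to the actual output layout.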
