How safe is it out there?
Zeroing in on the vulnerabilities of application
security
Abstract
The article presents a statistical analysis of
results obtained from numerous application level penetration
tests performed by Imperva experts for various customers over
the years 2000 - 2003. The research dives into the types of
vulnerabilities found, their sources, the risk they incur, and
their effects. The institutions whose applications were tested
include banks, government institutions, telecommunication firms
and even information security vendors. The article presents a
unique opportunity to take a peek into the usually secluded data
regarding the actual risk posed to web applications. It shows a
constant increase in risk level over years and an overwhelming
overall percentage of applications susceptible to information
theft (over 57%), direct financial damage (over 22%), denial of
service (11%) and execution of arbitrary code (over 8%). The
article analyses results of first time penetration tests as well
as repeat tests (retests) in order to evaluate the evolution of
application security within Web applications over time. Our
conclusion is that without proper application security devices
and secure software development education, the inherent risk to
an application does not decrease and may even increase over
time. Taking into consideration that the organizations whose
applications are included in this report are considered security
aware (they showed the insight to order costly penetration
tests), the results paint a bleak picture of the current state
of Web application security.
Introduction
Application hacking is a security field that
contains a vast number of techniques enabling an attacker to
compromise the confidentiality, integrity, and availability of
an application. One of the common techniques used by application
developers and service providers to increase the security of
their internal data is to have experts try to hack into their
systems in order to locate the numerous security holes within
the systems, so that they can be patched. These efforts, known
as application penetration tests, are very different from one
another and require the testers to constantly develop new
techniques and new methods for the tests to continue to be
successful. Imagination is the key ingredient required by the
tester in order to bypass the fences built by the developers.
Application penetration tests are tests focused on the flow of
the application, as opposed to network penetration tests, which
attack the entire network surrounding the application.
Penetration tests are usually confidential and
their results are usually kept within the closed boundaries of
the tester and the developers of the application, for obvious
reasons. Since the results of such tests are usually
confidential, not many statistical analyses of test results are
ever done. This report is a first attempt to conduct a thorough
analysis of real penetration test results.
Methodology
Data Collection
Between 2000 and 2003,
Imperva ADC conducted hundreds of application penetration
tests for numerous customers. Some of the penetration tests were
first-time tests conducted against a system. Others were repeat
tests (retests) that took place either shortly after the first
test or after a period of a few months. The outcome of each
penetration test is a written report submitted to the customer.
The report contains a detailed description of identified
vulnerabilities. Each vulnerability is classified by technique
(e.g. SQL injection, Cross Site Scripting), severity (e.g.
Critical, High or Medium) and potential effect (e.g. direct
financial damage or denial of service). The analysis presented
in this paper is based on the data gathered from 306 such
reports of which 73 were obtained from retests. The raw data
used for the analysis is provided in the appendix.
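As a minimal sketch of how one classified finding could be represented for this kind of analysis, the record below carries the three classifications named above. The field names, the report identifiers, and the sample records are assumptions made for illustration; they are not taken from the actual reports.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    CRITICAL = "Critical"
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"
    INFORMATIVE = "Informative"


@dataclass(frozen=True)
class Finding:
    report_id: str          # hypothetical identifier of the test report
    year: int               # year in which the test was performed
    technique: str          # e.g. "SQL injection"
    severity: Severity
    effect: str             # e.g. "denial of service"
    is_retest: bool = False


# Made-up sample records, for illustration only (not data from the study).
findings = [
    Finding("r1", 2002, "SQL injection", Severity.CRITICAL, "information theft"),
    Finding("r1", 2002, "Cross-site scripting", Severity.MEDIUM, "information theft"),
    Finding("r2", 2003, "Parameter tampering", Severity.HIGH, "direct financial damage"),
]

# Tally findings by severity and count the distinct reports they came from.
by_severity = Counter(f.severity for f in findings)
report_count = len({f.report_id for f in findings})
```

Keeping the report identifier on every finding makes it possible to aggregate either per vulnerability or per report.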
Classification
As mentioned earlier we used three types of
classifications for the analysis of the data: Technique,
Severity and Potential Outcome. For describing severity we used
five values: Critical, High, Medium, Low and Informative.
Classifying the techniques is a bit more complicated and we were
forced to create a scale that combines both technique and
purpose.
Cross-site scripting: an attack aimed
at pushing a script tag into a server that would be sent
from the server to an innocent user browsing the Web server,
thus causing the script to be activated in the innocent
user's browser.
SQL injection: an attack that manipulates
input data sent to the server, causing it to run an
SQL statement built from that input that would pull data or
change the contents of its internal data.
Parameter tampering: changing the data within
a parameter sent from one Web page to another in a way that
would alter the behavior of the latter page.
Known vulnerabilities: using known
vulnerabilities and exploits on commercial software
platforms. This class holds dozens of attacks that are
widely known and published.
Cookie poisoning: changing the
contents of a cookie saved on the client's computer in such a
way that it would change the normal flow of the application.
Access to administration area and internal modules:
allowing unauthorized access to administrative areas or
other internal modules of an application.
Directory traversal: allowing access to
unauthorized server directories.
Improper management of permissions:
improper management of the server's permissions, allowing a
non-privileged user to access modules that weren't
originally intended to be seen by that user.
Buffer overflow: data sent as input to
the server that overflows the boundaries of the input area,
thus causing the server to misbehave. Buffer overflows can
be used to make the server run code sent into the
overflowed buffer.
Forceful browsing: the ability of an attacker
to directly access unauthorized Web pages by bypassing the
logical flow of the application, possibly avoiding
authentication requirements and credentials checking.
Denial of service: causing the site to
malfunction due to some sort of denial of the service it is
offering by means of bandwidth consumption, site defacement,
and such.
Session hijacking: capturing the
session of another user, which in effect means being able to
impersonate the user in the eyes of the application.
Brute force: attacks designed to steal
passwords or session IDs by enumerating a large
number of password/session ID options.
Information gathering: attacks whose purpose
is not to actually perform an attack, but rather to reveal
information on the system, which can further assist in other
attacks.
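To make parameter tampering and cookie poisoning more concrete, here is a small sketch of one common countermeasure: signing any value the server hands to the client, so that a modified value is detected when it comes back. This is an illustration, not a technique described in the tested applications; the key and the function names sign and verify are hypothetical.

```python
import hashlib
import hmac
from typing import Optional

SECRET_KEY = b"server-side-secret"  # hypothetical; never sent to the client


def sign(value: str) -> str:
    """Return 'value|signature' for use in a cookie or hidden form field."""
    mac = hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()
    return f"{value}|{mac}"


def verify(token: str) -> Optional[str]:
    """Return the original value if the signature matches, otherwise None."""
    value, _, mac = token.rpartition("|")
    expected = hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes avoids leaking how much matched.
    if hmac.compare_digest(mac.encode(), expected.encode()):
        return value
    return None


token = sign("price=100")
tampered = token.replace("price=100", "price=1")  # attacker edits the value
```

Here verify(token) returns the original value, while verify(tampered) returns None because the signature no longer matches. Where possible, keeping such values on the server and never trusting the client copy is simpler still.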
Setting Risk Levels within Tests
Each vulnerability found during a penetration test
is classified with one of 5 risk levels:
Critical
High
Medium
Low
Informative (1)
The risk level takes into consideration several
factors. The most important factor is the potential damage of
such an attack. Attacks allowing direct financial damage (such
as purchasing of stocks at half price, or wire transferring
money to a 3rd party in an offshore bank account) or sensitive
information theft are naturally of critical or high risk.
Potential damage varies according to the nature of the
applications. For instance, ecommerce sites usually take denial
of service attacks more seriously than an extranet employee
portal does.
The second factor is the complexity of the required
attack. Vulnerabilities which require exceptional skills to take
advantage of are obviously of lower risk than those that can be
easily exploited by script kiddies.
Lastly, the source of attack can influence the risk
level. If an attack can only be performed by a subset of the
company's employees, its risk level is likely to be lower than
that of an attack that can be carried out by any hacker coming
from an anonymous proxy.
Results
The results of the tests are as follows: (2)




The results are presented for each year separately.
The results describe the total number of vulnerabilities of each
type found, rather than the number of applications vulnerable to
a specific type of attack. While the difference in terminology
might seem minor, the difference in the results is significant.
Many applications tend to have a specific vulnerability
appearing numerous times.
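The difference between the two ways of counting can be illustrated with a short sketch. The application names and findings below are made up for the example and are not data from the study.

```python
from collections import Counter

# Hypothetical (application, vulnerability type) findings.
findings = [
    ("app_a", "SQL injection"),
    ("app_a", "SQL injection"),
    ("app_a", "SQL injection"),
    ("app_b", "SQL injection"),
    ("app_b", "Parameter tampering"),
    ("app_c", "Parameter tampering"),
]

# Count every occurrence (the measure used in the results).
occurrences = Counter(vuln for _, vuln in findings)

# Count each application at most once per vulnerability type.
affected_apps = Counter(vuln for _, vuln in set(findings))
```

In this made-up data, SQL injection accounts for four of the six findings but affects only two applications, the same number of applications as parameter tampering with its two findings.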
Looking at the entire collection of results from
all the years together, we can draw the following
conclusions. Parameter tampering is the most common
vulnerability, constituting 16% of all the vulnerabilities. The
second is improper management of permissions, at 13% of the
vulnerabilities. The third is SQL injection at 10%, and the
fourth is cross-site scripting at 9%. Figure 5 presents a chart
with the distribution of all the vulnerability types.
We also attach a chart illustrating the
distribution of risk, which shows that vulnerabilities of
high and critical risk are the most common: more than 50% of
all vulnerabilities fall into one of these two categories.

Figure 1 Percentage of Attack Types, and risks: 2000 - 2003
Discussion
The Evolution of Application Security Risks
The first step in our analysis takes a bird's-eye view of the
results by looking at the number and associated risk level of
the vulnerabilities found in each application over time. Figure
1 below compares the number of critical and high risk
vulnerabilities to the average number of vulnerabilities per
test over time. We find that although the average number of
vulnerabilities per tested application tends to decrease with
time, the average risk level increases.

Figure 1 Overall Risk by Year: 2000 - 2003
Are There Any Secure Applications Out
There?
Another look at the results compares the number of tests in
which we found critical vulnerabilities with the number of tests
in which we found no vulnerabilities. The portion of presumably
secure applications is very small and relatively stable. Some of
the tests that yielded no vulnerabilities (in fact all of them
in 2001) are retests performed after previously uncovered
vulnerabilities were fixed. In contrast, the portion of tests
yielding critical vulnerabilities constantly grows over the
years to an overwhelming 89% in 2003.
| | 2000 | 2001 | 2002 | 2003 |
| % of reports without any problem | 0% | 4% (3) | 9% (4) | 3% (5) |
| % of reports with CRITICAL problems (regardless of the number of critical problems) | 47% | 61% | 71% | 89% |
Vulnerabilities and Their Effects
In order to emphasize the meaning of the results we classify the
vulnerabilities by effect on the system. We use four categories
which can be stated in simple terms that apply not only to
information security experts, but to end users as well.
1. Execution of arbitrary code on the server
2. Unauthorized arbitrary information retrieval
(includes private information theft)
3. Direct financial damage
4. Denial of Service
Each class may be the consequence of a wide variety
of attacks. For example, execution of arbitrary code can be
achieved by either taking control of the server using improper
permissions management, or by performing a buffer overflow
attack that causes a terminal window to be opened at the client
side.
As can be seen from the results graphs, execution
of arbitrary code is fairly infrequent. However, direct
financial loss is a risk we found in almost a quarter of tested
applications. Most amazing of all is the continuous rise in
applications susceptible to information theft and denial of
service, where 60% of tested applications were found to be at
risk.

It's important to note that while both denial of
service and execution of code appear to remain static over time,
the actual vulnerabilities changed. In 2000, most of the DoS and
arbitrary code execution vulnerabilities were related to the web
server platform. Over time, production web server platforms of
security-aware organizations became generally more secure, yet
with the increase of vulnerabilities in the applications
themselves, the numbers remain relatively stable.
Another important issue to notice is the generally
low number of denial of service vulnerabilities. This is
partially due to the fact that in a large portion of the
penetration tests performed, the customer limited the test to
non-destructive tests. This means that the number of
denial of service vulnerabilities found would likely have been
higher without these limitations.
Is Penetration Testing a Silver Bullet for
Application Security Risks?
It is our experience that many organizations regard penetration
tests as an important means of mitigating application security
risks. This assumes that once a test report is issued,
vulnerabilities are fixed and that new vulnerabilities are not
introduced. Many organizations, however, do not bother with
repeating the penetration test after the problems were allegedly
fixed. The information we collected over the years from
customers that DO repeat penetration tests indicates that
failing to perform a repeat penetration test may lead to a false
sense of security.
The vulnerabilities revealed in retests can be categorized
into three classes: vulnerabilities that were missed by the
first penetration test, vulnerabilities that were uncovered by
the first penetration test but appeared again in the
retest, and new vulnerabilities that were introduced during the
period between penetration tests. All of the retests in which
vulnerabilities were encountered displayed vulnerabilities
of either High or Critical risk level.
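The three classes can be expressed as a simple decision rule. The sketch below assumes, for illustration only, that each retest finding carries an identifier comparable with the first report and a flag recording whether the flaw already existed when the first test was run. In practice that flag would come from investigating the application's change history, and both names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetestFinding:
    vuln_id: str                 # hypothetical identifier shared across reports
    existed_at_first_test: bool  # hypothetical flag, set after investigation


def classify(finding: RetestFinding, first_test_ids: set) -> str:
    """Assign a retest finding to one of the three classes."""
    if finding.vuln_id in first_test_ids:
        return "repeated"  # reported by the first test and found again
    if finding.existed_at_first_test:
        return "missed"    # present at the first test but not reported
    return "new"           # introduced between the two tests


first_test_ids = {"sqli-login", "xss-search"}
retest = [
    RetestFinding("sqli-login", True),
    RetestFinding("traversal-download", True),
    RetestFinding("tamper-checkout", False),
]
classes = [classify(f, first_test_ids) for f in retest]
```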
Retest analysis
In one third of all retests we found previously
encountered vulnerabilities, half of which the programmers
claimed to have fixed. These figures indicate that programmers
either did not understand the problem, did not know how to fix
it, or on many occasions just tried to hide it (e.g. by disabling
detailed error messages on the web server, hoping to avoid SQL
injection attacks). In 10% of the retests we found
vulnerabilities that already existed at the time of the first
test but were missed by it. This is not due to
incompetence of the first penetration testing team. Most of the
applications we tested required many man-years of work to
construct. In comparison, the calendar time reserved for
penetration tests ranged from 4 to 14 days with at most 2 testers.
In a single case of a system that required hundreds of man-years
to construct, the calendar time reserved for penetration testing
exceeded 2 months and the penetration test team included 3 people
at a time. In 60% of the retests we found completely new
vulnerabilities, which were either introduced during the fixing
of the first group of identified vulnerabilities or were
introduced as the application's development evolved.
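The following sketch illustrates why hiding detailed error messages does not remove an SQL injection flaw. It uses Python's built-in sqlite3 module with an in-memory database; the table, the function names, and the credentials are invented for the example and do not come from any tested application.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")


def login_hidden_errors(name: str, password: str) -> bool:
    """Still vulnerable: input is concatenated into SQL, errors merely hidden."""
    query = (
        f"SELECT 1 FROM users WHERE name = '{name}' AND password = '{password}'"
    )
    try:
        return conn.execute(query).fetchone() is not None
    except sqlite3.Error:
        return False  # generic failure, no detailed error message shown


def login_parameterized(name: str, password: str) -> bool:
    """Fixed: placeholders keep the input as data, never as SQL text."""
    query = "SELECT 1 FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchone() is not None


payload = "' OR '1'='1"  # classic injection string supplied as the password
```

With this payload, login_hidden_errors("alice", payload) still succeeds even though no error message is ever displayed, whereas login_parameterized("alice", payload) rejects it and continues to accept the real password.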
The Problem of the Ever Changing
Application
Only a small portion of all organizations bother to periodically
conduct penetration tests against their applications. Most will
perform at most one penetration test upon the launch of a new
application; some will perform a second penetration test soon
after the vulnerabilities from the first penetration test are
patched. However, in a few cases we were able to perform
periodic application penetration tests at intervals of 1 year, 6
months, and (in one case) 1 month. This gave us an
opportunity to analyze the behavior of security risks for a
single application over time.
It turns out that all retests that were performed
with a long period between tests revealed vulnerabilities that
were already uncovered by the first penetration test. In 60% of
those retests we found vulnerabilities that were actually fixed
after the first penetration test and were reintroduced over
time. A little bit of detailed research within the tested
organizations revealed that those vulnerabilities were
introduced during various change cycles that the applications
went through during the period between penetration tests.
Some of the changes were introduced by programmers who had never
seen the report of the first penetration test. Nonetheless, some
of the changes that reintroduced old vulnerabilities were
performed by the same programmers who introduced the original
ones.
In 50 of the 73 applications, the retest was
performed immediately after the initial test and we found that
the security holes were indeed fixed. In 23 of the immediate
retests, we observed the same errors. This is due to one of two
reasons: either the customers didn't fix the bugs; or the fix
was incorrect.
In the following chart we can see that the risk
level of the application established after the periodic
penetration test (excluding those that were performed within a
month of each other) remained the same or even increased.

By comparing the type of vulnerabilities that were
uncovered in the retest to those found in the original
penetration tests, we find that in many cases programmers fix a
specific instance of a vulnerability rather than eliminate it
completely. This was very evident with SQL injection
vulnerabilities that were only fixed in those same modules that
we explicitly mentioned in our reports. Other modules suffering
from the same vulnerability were not fixed. One of the major
reasons for this type of behavior is that the penetration test
(and the changes its results incurred) is the last stage in an
already delayed project. Hence programmers are under
tremendous time pressure. Also, in some cases in which
application programming is outsourced, the subcontractor is not
being paid for the time and effort put into patching the
vulnerabilities (the subcontractor was bound by the contract to
deliver a secure application). Hence, it is in the interest of
programmers to invest as little time as possible.
Conclusions
The above analysis of penetration test results
clearly allows us to draw a number of conclusions regarding
application security risks and the ways to mitigate them.
First and foremost the results clearly show that
application level hacking does pose a prominent risk to most
applications. Also, the risk incurred by application level
hacking is on the rise. This is mainly due to the increasing
functionality of applications (they provide much more access to
sensitive data) and the constant evolution of hacking
techniques. Keeping in mind that the sample population used for
our analysis is security-aware organizations, it is likely that
more general figures regarding risk level are much higher.
Considering the fact that applications tend to
undergo many and frequent changes, even an application that went
through a thorough penetration test and patching cycle before it
was launched is likely to become vulnerable over a time frame of
6 months to a year. Hence, application security cannot be
reduced to a singular effort at a single point in time.
Another conclusion is that penetration tests and
fixes, however important as an audit mechanism, do not
yield completely secure applications. This is not because of a
lack of penetration testing skills but rather a lack of resources. No
organization would invest an equivalent amount of time in
application QA and application penetration testing. Hence no
organization can expect the two processes to have the same
yield. Moreover, knowing that applications subjected to a
thorough QA process still display bugs, one cannot expect
applications that went through a much shorter penetration test
to display no vulnerabilities. In addition, if an application
does not undergo penetration testing very frequently (i.e. with
every change that the application undergoes) we can assume that
there are periods of time (between the penetration tests) in
which the risk level is very high.
It is therefore our conclusion, based on the
analysis detailed in this paper, that in order to truly mitigate
application security risks over time, organizations must
incorporate into their networks true application security
solutions. Such solutions, which are the application security
equivalent of network access controls (e.g. Firewall, NIDS,
Router ACLs) will protect applications against application level
attack techniques in a constant manner, providing protection
against both known and unknown attacks.
Appendix
And the Oscar goes to
Here are the most risky attacks that application developers and
penetration testers should be aware of since they are the most
likely to succeed.

Or, if we try to say how many applications out
there suffer from attacks, we can extrapolate these numbers to
the following conclusion. (Notice that the numbers don't add up
to 100%; rather, each represents the percentage of attacks that
included the attack class.)

Following is a tabular view of the exact results of
the analysis:

Footnotes
(1) Note that information disclosure attacks are
not necessarily of 'Informative' risk level. Their risk
level is set to informative when the information gathered
has no immediate outcome that allows further attacking the
system. A source-disclosure vulnerability, for instance, is
likely to be classified with a higher risk level, assuming
that the source can be later used for identifying other
vulnerabilities.
(2) See Appendix for
tabular view of the results.
(3) All of them are retests where the initial test
had some errors.
(4) 8% were initial tests without any error, and 1%
are retests.
(5) 2% were initial tests without any error, and 1%
are retests.