If you do a little reading about how to test software, you will quickly find that there are a lot of ways to test it, and a lot of different terms in use. In this article I give a brief overview of the types of testing commonly used in software testing, and offer advice on when each type is appropriate.
Black box Testing
Black box and White box testing are two of the more common terms you’ll see when it comes to testing. They are, however, more a way of classifying testing than actual testing types. The theory behind Black box testing is that, while testing, you treat the software application as a black box that you cannot see into. In other words, you have no idea of its inner workings, and you test it based on inputs and expected outputs. As an example, consider a simple calculator application. With Black box testing, you have no idea how the calculator was coded, what language was used, what algorithms are used, how data is stored, etc. You just see a calculator that you can test by entering various calculations and checking that the application gives the correct answer. In practice, pure Black box testing would only occur in User Acceptance testing, Beta testing, or some other kind of testing where a third party is doing the testing. For any testing done ‘in-house’, the tester would typically have at least some knowledge of the architecture of the software.
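The calculator example can be sketched in a few lines of Python. The `calculate` function and its "number operator number" input format are assumptions made for illustration only. The point is that the test cases consist of nothing but inputs and expected outputs, and never look at how the answer is produced.

```python
def calculate(expression: str) -> float:
    """Hypothetical calculator under test. A black box tester never reads this body."""
    left, operator, right = expression.split()
    a, b = float(left), float(right)
    if operator == "+":
        return a + b
    if operator == "-":
        return a - b
    if operator == "*":
        return a * b
    if operator == "/":
        return a / b
    raise ValueError(f"unknown operator: {operator}")


# Each black box test case is just an input and the output we expect to see.
CASES = [
    ("2 + 3", 5.0),
    ("10 - 4", 6.0),
    ("6 * 7", 42.0),
    ("9 / 3", 3.0),
]


def run_black_box_cases():
    """Return a list of (input, expected, actual) for every case that failed."""
    failures = []
    for expression, expected in CASES:
        actual = calculate(expression)
        if actual != expected:
            failures.append((expression, expected, actual))
    return failures
```

An empty list from `run_black_box_cases()` means every calculation gave the expected answer.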
White box Testing
White box testing is on the opposite end of the testing spectrum. In White box testing, the tester has intimate knowledge of the architecture and the code. This knowledge allows the testers to tailor their tests to specific potential problem areas. For example, if some data value is stored as a 16-bit signed integer, the tester can try to see what happens if values greater than 2^15 - 1 (32,767), the largest value such an integer can hold, are used. As another example, if the tester knows that the back end is using a relational database, they may try entering data that contains some of the SQL keywords or reserved characters, such as LIKE or %. Having internal knowledge of the software can also help make testing more efficient. For example, if the tester knows that the same sorting algorithm is used on several of the application’s screens, they can test sorting on one of the screens more thoroughly and then only do a quick test of sorting on the other screens, on the assumption that the same defects would be found on all the screens. As I mentioned, Black box and White box testing are more classifications than actual forms of testing, and form the ends of a continuum along which other testing types fit. Traditional software testing, performed by an in-house test team, is neither all Black box nor all White box, and is often referred to as Gray box testing.
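The 16-bit boundary case mentioned above can be sketched as follows. The `store_as_int16` routine is hypothetical: it stands in for a storage layer that silently wraps values, which is only one of several ways a real system might misbehave. Knowing the storage type tells the tester exactly which values to try.

```python
INT16_MIN = -(2 ** 15)     # -32768, smallest 16-bit signed value
INT16_MAX = 2 ** 15 - 1    #  32767, largest 16-bit signed value


def store_as_int16(value: int) -> int:
    """Hypothetical storage routine that wraps silently, like a 16-bit signed field."""
    return (value - INT16_MIN) % (2 ** 16) + INT16_MIN


def values_corrupted_by_storage(values):
    """Return the values that do not survive a round trip through storage."""
    return [v for v in values if store_as_int16(v) != v]


# White box knowledge of the storage type suggests testing right at the edges.
BOUNDARY_VALUES = [INT16_MIN - 1, INT16_MIN, INT16_MAX, INT16_MAX + 1]
```

Calling `values_corrupted_by_storage(BOUNDARY_VALUES)` reports the two values that fall just outside the range, which is exactly the kind of defect a tester without knowledge of the storage type would be unlikely to look for.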
Unit Testing
On the testing continuum, Unit Testing lies far on the White box side of things. Unit testing is the testing done by developers while they are developing their code. With unit testing, whenever you write a method, function, class, etc., you should be writing a corresponding unit test (or, more typically, a set of tests). These unit tests should test all aspects of this method, function or class. For example, if you write a function that parses a String and searches for any numbers in the String, you would write several unit tests that thoroughly test this function, testing for both positive (a String that contains numbers) and negative (a String with no numbers) results, and testing any error handling, such as inputting an empty String.
The main problem with Unit Testing is that it is often not done, or not done completely. Many developers don’t have the discipline or, more likely, the time to write several tests every time they write a new function. It is definitely a discipline that needs to be developed, but if it can be accomplished it can be very beneficial. Once you have a complete set of unit tests, you can be much more confident when making changes to your code, because after making any changes you just run your set of unit tests again, and if they pass you are good to go. It can also help give the test team more solid builds to test with, since a full set of unit tests can be run on each build before the build is handed over for testing.
Unit testing will definitely not be able to find all the problems in the software. Problems with the User Interface, for example, cannot be easily exposed with Unit Testing. Also, since by definition Unit Testing tests each unit in isolation, some problems that occur in the interactions between functions will not be easily found either. Since unit tests work at the code level, however, they may be able to find problems more efficiently than having a tester find the problem through the User Interface.
By its nature, Unit Testing is something that lends itself to being automated. Most of the time, unit tests are actually small programs themselves that perform the testing. This allows the execution of the tests to be done automatically, and since typically no user interface is involved, the tests can be run quite quickly. There are Unit Testing packages available for almost any modern programming language. Some of the more common ones include NUnit for .NET programming and JUnit for Java. There are also unit testing packages for other technologies, such as SQLUnit for testing SQL databases, and XUnit and WUnit for testing XML.
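As an illustration of the string-parsing example described earlier, here is a minimal sketch using Python's standard `unittest` module. The function name `find_numbers` and its exact behaviour (returning whole numbers as a list of integers, and rejecting non-string input) are assumptions made for the example, not a prescribed design.

```python
import re
import unittest
from typing import List


def find_numbers(text: str) -> List[int]:
    """Hypothetical unit under test: return every whole number found in text."""
    if not isinstance(text, str):
        raise TypeError("text must be a string")
    return [int(match) for match in re.findall(r"\d+", text)]


class FindNumbersTest(unittest.TestCase):
    def test_string_with_numbers(self):
        # Positive case: the String contains numbers.
        self.assertEqual(find_numbers("abc 12 def 7"), [12, 7])

    def test_string_without_numbers(self):
        # Negative case: the String contains no numbers.
        self.assertEqual(find_numbers("no digits here"), [])

    def test_empty_string(self):
        # Error handling: an empty String should not cause a failure.
        self.assertEqual(find_numbers(""), [])

    def test_non_string_input_is_rejected(self):
        # Error handling: anything that is not a String is refused.
        with self.assertRaises(TypeError):
            find_numbers(None)
```

Saved in a file, these tests can be run with `python -m unittest`, which makes it easy to run the whole set again after every change or on every build.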
Back end Testing
This is another form of testing that is near the White box end of the spectrum. Back end testing focuses on the data storage part of the application. This data is typically stored in a database, but may also be stored in some kind of file, such as XML. For the more complex database applications, Back end testing would involve testing any stored procedures that are used to store, manipulate and/or retrieve data. For any application that stores data, Back end testing would also involve making sure that data entered through the User Interface makes it into the back end and is stored as expected. It could also involve making sure that the data that is stored can be accessed and retrieved as expected.
Functional Testing
Functional Testing is what people typically think of when they think of Software Testing, and it lies somewhere in the middle between White box and Black box testing. Functional testing typically involves running an application through the user interface, and testing that a piece of functionality, or a feature, works as expected. For example, a course management system may have functions to add a Student, add a Course, assign a Student to a Course, etc. In functional testing, the tester tests that each of these functions works as expected. How the functionality should work is typically defined by the system requirements, and testers write and execute test cases based on these requirements. Tests would include using both valid and invalid data, and checking for proper error handling.
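The course management example can be sketched as a handful of functional test cases. The `CourseManager` class below is a hypothetical stand-in for the real application, and the tests call it directly rather than driving a user interface, which is a simplification. Each test exercises one function with either valid or invalid data.

```python
class CourseManager:
    """Hypothetical stand-in for the course management system under test."""

    def __init__(self):
        self.students = set()
        self.courses = set()
        self.assignments = set()

    def add_student(self, name):
        if not name or not name.strip():
            raise ValueError("student name must not be empty")
        self.students.add(name)

    def add_course(self, title):
        if not title or not title.strip():
            raise ValueError("course title must not be empty")
        self.courses.add(title)

    def assign(self, student, course):
        if student not in self.students:
            raise KeyError(f"unknown student: {student}")
        if course not in self.courses:
            raise KeyError(f"unknown course: {course}")
        self.assignments.add((student, course))


def test_add_student_with_valid_name():
    system = CourseManager()
    system.add_student("Ada")
    assert "Ada" in system.students


def test_add_student_rejects_blank_name():
    system = CourseManager()
    try:
        system.add_student("   ")
    except ValueError:
        return  # proper error handling: the invalid data was refused
    raise AssertionError("a blank student name was accepted")


def test_assign_rejects_unknown_student():
    system = CourseManager()
    system.add_course("Testing 101")
    try:
        system.assign("Nobody", "Testing 101")
    except KeyError:
        return  # proper error handling: the unknown student was refused
    raise AssertionError("an unknown student was assigned to a course")
```

In a real project each test case would be traced back to the requirement that defines how the function should behave.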
System Testing
System Testing is quite similar to Functional testing, and they are often done together. System Testing also involves running the application through the user interface and testing its functionality based on the requirements, but it tries to focus more on the system as a whole. In the above example of a Course Management application, in System Testing you might test a scenario that includes creating a user, then testing that the user can be added to a course, and then that a grade can be added to the user for the course. Often System Testing involves using several individual pieces of the software’s functionality together in one test.
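The scenario above can be sketched as a single end-to-end test. The table and column names are invented for illustration, and an in-memory SQLite database (via Python's standard `sqlite3` module) stands in for whatever back end the real application uses. The test strings several functions together and then checks the final state of the system as a whole.

```python
import sqlite3


def create_schema(conn):
    """Create the hypothetical tables used by the scenario."""
    conn.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE courses (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE enrollments (
            user_id INTEGER NOT NULL REFERENCES users(id),
            course_id INTEGER NOT NULL REFERENCES courses(id),
            grade TEXT,
            PRIMARY KEY (user_id, course_id)
        );
        """
    )


def run_scenario(conn):
    """Create a user, add the user to a course, grade the user, then read it all back."""
    user_id = conn.execute(
        "INSERT INTO users (name) VALUES (?)", ("Ada",)
    ).lastrowid
    course_id = conn.execute(
        "INSERT INTO courses (title) VALUES (?)", ("Testing 101",)
    ).lastrowid
    conn.execute(
        "INSERT INTO enrollments (user_id, course_id) VALUES (?, ?)",
        (user_id, course_id),
    )
    conn.execute(
        "UPDATE enrollments SET grade = ? WHERE user_id = ? AND course_id = ?",
        ("A", user_id, course_id),
    )
    return conn.execute(
        """
        SELECT u.name, c.title, e.grade
        FROM enrollments e
        JOIN users u ON u.id = e.user_id
        JOIN courses c ON c.id = e.course_id
        """
    ).fetchall()
```

A test would open a connection with `sqlite3.connect(":memory:")`, call `create_schema`, and check that `run_scenario` returns a single row containing the user, the course and the grade.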
Regression Testing
Regression testing is a critical part of testing, but is often overlooked. Whenever a defect gets fixed, a new feature gets added, or code gets refactored or changed in any way, there is always a chance that the changes may break something that was previously working. Regression testing is the testing of features, functions, etc. that have been tested before, to make sure they still work after a change has been made to the software. Within a set of release cycles, the flow is typically as follows: The testers test the software and find several defects. The developers fix the defects, possibly add a few more features, and give it back to be tested. The testers then have to test not only the new features, but also all of the old features to make sure they still work. Questions often arise as to how much Regression Testing needs to be done. Ideally, in Regression Testing, everything would be tested just as thoroughly as it was the first time, but this becomes impractical as time goes on and more and more features, and therefore test cases, get added. When choosing which tests to execute during regression testing, some compromises need to be made. When deciding, you will want to focus on what has changed. If a feature has been significantly added to or changed, then you will want to execute a lot of tests against this feature. If a defect has been fixed in a particular area, you will want to check that area to see that the fix didn’t cause new defects. If, on the other hand, a feature has been working well for some time and hasn’t been modified, only a quick test may need to be executed.
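One way to picture this compromise is as a simple selection rule. The sketch below is an assumption about how a team might organise its test cases, not a standard algorithm: every test case is tagged with the feature area it covers and with whether it counts as a quick check. Changed areas get all of their tests, and unchanged areas get only the quick ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegressionCase:
    """Hypothetical record describing one test case in the regression suite."""
    name: str
    area: str            # feature area the test covers
    quick: bool = False   # True if the test is a short check of the area


def select_regression_tests(cases, changed_areas):
    """Pick every test for a changed area, but only quick tests for the rest."""
    selected = []
    for case in cases:
        if case.area in changed_areas or case.quick:
            selected.append(case.name)
    return selected


SUITE = [
    RegressionCase("login_happy_path", "login", quick=True),
    RegressionCase("login_lockout_after_failures", "login"),
    RegressionCase("report_totals", "reports", quick=True),
    RegressionCase("report_export_all_formats", "reports"),
]
```

If only the reports feature changed in a build, `select_regression_tests(SUITE, {"reports"})` keeps both report tests and the quick login check, and skips the longer login test.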
Smoke Testing
Smoke testing is also sometimes referred to as sanity testing. It is a quick test, or set of tests, that is meant to check that the software is stable enough for further testing. It will typically test installation of the software, starting it up, and executing one or two of the main functions or features. It is often used as a decision gate before proceeding to further testing, i.e. after the test team gets a new build, they run the sanity tests, and if those pass they then proceed to further testing.
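The decision gate can be sketched as a small runner that stops at the first failing check. The check functions below are placeholders; in practice they would install the build, start it, and exercise a main feature.

```python
def run_smoke_tests(checks):
    """Run (name, check) pairs in order and stop at the first failure.

    Returns (True, None) if every check passed, otherwise (False, name)
    where name identifies the check that failed.
    """
    for name, check in checks:
        try:
            passed = bool(check())
        except Exception:
            passed = False  # a crash during a check counts as a failure
        if not passed:
            return False, name
    return True, None


# Hypothetical checks; each returns True when the step works.
def installs_cleanly():
    return True


def starts_up():
    return True


def opens_main_screen():
    return True


SMOKE_CHECKS = [
    ("install", installs_cleanly),
    ("startup", starts_up),
    ("main screen", opens_main_screen),
]
```

Only when `run_smoke_tests(SMOKE_CHECKS)` reports success would the build move on to further testing. Otherwise it goes back to the developers along with the name of the failed check.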
Beta Testing
Beta testing is testing that is done after all the features have been added to a software application and the application is stable. The application is then given to a small group of users who try it out and provide feedback. This testing is very helpful, because developers and even testers are often focused too much on the details of the application and forget about basic usability and the overall user experience. Beta testing can often uncover problems in these areas.
Acceptance Testing
This type of testing falls on the Black box side of the testing spectrum. Acceptance testing is when the final customer or user does their own testing. This is most common when a software development company is contracted to develop a software application for another company. The software development company develops and tests the software and then gives it to the purchasing company, who then test it to see whether the software meets their needs. This testing is often not as thorough as some of the other testing, but it often reveals a lot of defects, mainly caused by requirements not being clearly understood by the development team.
Performance Testing
Performance testing is often placed in the category of Non-Functional testing, and looks at how a software application or system reacts when put under load or stress. This kind of testing is critical for large, multi-user systems, but should be done for smaller applications as well. Performance testing can be broken into two main categories: Load testing and Stress testing. The purpose of Load testing is to check that the system is responsive while operating under the expected maximum load. The definition of responsive is up to the requirements specification of the system, and determining the expected load will take a bit of analysis. For a generic web-based application, as an example, a test might be to make sure that a page is rendered in under two seconds while 1000 users are accessing the system. The purpose of Stress testing, by contrast, is to see how much the system can handle before failing, how well it fails, and whether it can recover. In the above example, a stress test could be designed to slowly keep adding active users to the system and see at what point page loading either stops or becomes unacceptably slow, and then to see whether the system returns to normal once the users are removed. Of all testing types, performance testing is one of the most challenging, because much analysis and many assumptions need to be made about how many users may use the system, how they will use it, etc. Then there is the challenge of actually creating this load. This is typically done through the use of simulator tools, which simulate a large number of users performing typical actions on a system.
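The difference between the two categories can be sketched with a toy simulation. Everything here is an assumption made for illustration: the "system" is a fake page request that can serve only a few users at once, the numbers are far smaller than a real test would use, and a real project would use a dedicated load simulation tool rather than threads in one script. The load test checks the response time at the expected number of users. The stress test keeps adding users until the limit is exceeded, then checks that the system recovers.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical system under test: it serves at most CAPACITY requests at once,
# and each request needs WORK_SECONDS of server time. Extra requests must wait.
CAPACITY = 4
WORK_SECONDS = 0.02
_slots = threading.Semaphore(CAPACITY)


def fake_page_request() -> float:
    """Simulate one user loading a page and return the response time in seconds."""
    start = time.perf_counter()
    with _slots:                    # wait for a free server slot
        time.sleep(WORK_SECONDS)    # pretend to render the page
    return time.perf_counter() - start


def slowest_response(user_count: int) -> float:
    """Run user_count simulated users at once and return the worst response time."""
    with ThreadPoolExecutor(max_workers=user_count) as pool:
        futures = [pool.submit(fake_page_request) for _ in range(user_count)]
        return max(future.result() for future in futures)


def load_test(expected_users: int, limit_seconds: float) -> bool:
    """Load test: is the system responsive at the expected number of users?"""
    return slowest_response(expected_users) < limit_seconds


def stress_test(limit_seconds: float, start_users: int, step: int, max_users: int):
    """Stress test: add users until the limit is exceeded, then check recovery.

    Returns (breaking_point, recovered). breaking_point is None if the system
    stayed within the limit all the way up to max_users.
    """
    users = start_users
    while users <= max_users:
        if slowest_response(users) >= limit_seconds:
            # Remove the extra users and see whether the system is back to normal.
            recovered = load_test(start_users, limit_seconds)
            return users, recovered
        users += step
    return None, True
```

For example, `load_test(8, 0.5)` asks whether eight simultaneous users all get a response in under half a second, while `stress_test(0.5, 20, 40, 140)` ramps up from 20 users in steps of 40 and reports the first user count at which that limit is broken.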
- JUnit – http://www.junit.org/
- NUnit – http://www.nunit.org/index.php
- JUnit Test Infected – http://junit.sourceforge.net/doc/testinfected/testing.htm
- Software Testing on Wikipedia – http://en.wikipedia.org/wiki/Software_testing
- Software Testing – http://www.ece.cmu.edu/~koopman/des_s99/sw_testing/
Note: this article first appeared in the March 2010 Issue (Volume 23, No. 3) of ASPects, The Monthly Newsletter of the Association of Shareware Professionals.