Testing Tips and Tricks of the Trade

A ‘top ten’ list of State Change Tests by Matt Pierce, Volt

As testers, we have all had our share of nasty bugs that are difficult to pin down. The bug might be intermittent, so we spend hours attempting to discern why the failure is inconsistent over time. The app might work on all test machines except one, so we spend hours attempting to discern why the failure is inconsistent across machines. Worst of all, the bug might have occurred only once, so we spend hours attempting to determine the exact repro path that broke the app. In the worst cases, we do not find an answer, and are faced with the decision of what to do with the bug.

Do we just report the bug’s symptom(s) and classify it as non-reproducible, "confident" that we tried everything possible to isolate a consistent repro path? Or is there additional testing that we can do to further identify the bug so that it will actually get fixed?

I have encountered these difficult bugs several times in my career. Occasionally, I felt compelled to take the easy way out by writing off a bug as non-reproducible. I rationalized my actions: "Well, I’ve spent three days trying to repro this bug and have to move on," or "Well, it only fails on that weird machine in the lab but runs everywhere else." Ironically, this negative mentality prevented me from performing my job (finding bugs and ensuring they get fixed). I should have been thinking positively: "I know the app works here but not there; thus I just need to determine the differences between machines, or the differences over time, and there I will find the cause of the bug." Had I been thinking positively, I would have realized that I was just a couple of State Snapshots and a File Comparison away from detecting additional symptoms, or even the causal agent.

What is a State Snapshot and how can it assist you in isolating additional symptoms, or even identifying causal agents? A State Snapshot is a picture of the state of a computer (or network) at a specific point in time in a specific condition. A State Snapshot is captured and then placed into a file. The use of files enables easy capture and comparison of multiple State Snapshots so that the differences are readily apparent. Causal agents and additional symptoms will virtually always exist somewhere in the snapshot delta. Thus, by filtering out the similarities, you are reducing the effort it takes to identify additional symptoms, or even causal agent(s).
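
To make the idea of a snapshot delta concrete, here is a minimal sketch in Python (not a tool from this article). It assumes each snapshot has already been reduced to a mapping from item name to value; the function name `snapshot_delta` and the sample entries are hypothetical and exist only for illustration.

```python
def snapshot_delta(passed, failed):
    """Return the items that differ between two snapshots (dicts of name -> value)."""
    # Items present in one state but missing from the other.
    only_in_passed = {k: passed[k] for k in passed.keys() - failed.keys()}
    only_in_failed = {k: failed[k] for k in failed.keys() - passed.keys()}
    # Items present in both states but with different values.
    changed = {
        k: (passed[k], failed[k])
        for k in passed.keys() & failed.keys()
        if passed[k] != failed[k]
    }
    return only_in_passed, only_in_failed, changed


# Hypothetical snapshot contents, for illustration only.
passed = {"app.dll": "1.2", "odbc.ini": "dsn=prod", "helper.dll": "3.0"}
failed = {"app.dll": "1.1", "odbc.ini": "dsn=prod", "extra.log": "present"}

only_in_passed, only_in_failed, changed = snapshot_delta(passed, failed)
print("Only in passed state:", only_in_passed)
print("Only in failed state:", only_in_failed)
print("Changed:", changed)
```

Everything the two states have in common (here, the odbc.ini entry) drops out, leaving only the candidates worth investigating.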

The State Snapshot Process consists of four general steps:

  1. Repro Path: First, identify a repro path to both the failed state and the passed state. The different repro paths could be different machines, different operating systems, different points in time (before and after install), etc. Note that this step is very difficult for non-reproducible bugs…hence, all the more reason to take a State Snapshot immediately after a bug has occurred when there is reason to suspect the bug is non-reproducible.
  2. Refine Repro Path: Next, attempt to reduce the number of steps in the repro paths so that the failed state matches the passed state as closely as possible. This will minimize the number of differences you need to review. (Translation: don’t try to compare registries on an NT machine against a Win95 machine because there will be thousands of non-applicable differences to wade through. Rather, try to compare the differences between registries of two Win95 machines. Or better, the same Win95 machine before and after some general action breaks the system—like an install.)
  3. File Snapshot: Next, capture all relevant state information into a text file format. The techniques vary and are listed in the next section below. Examples include using RegEdit.exe to dump the registry to a file, using the DOS DIR command, etc. Be sure to get at least two State Snapshots: one for the failed state and one for the passed state (more if there are additional states to compare).
  4. File Comparison: Finally, review the state differences with a file comparison utility such as ExamDiff.exe (http://www.nisnevich.com/examdiff/examdiff.htm) or fc.exe (WinNT).
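
If no comparison utility is at hand, the comparison step can be approximated with a short script. The following is a minimal sketch using Python's standard difflib module; the function name `compare_snapshots`, the file names, and the sample contents are assumptions made for illustration, not part of the process described above.

```python
import difflib
import tempfile
from pathlib import Path


def compare_snapshots(passed_path, failed_path):
    """Return only the lines that differ between two snapshot text files."""
    passed_lines = Path(passed_path).read_text().splitlines()
    failed_lines = Path(failed_path).read_text().splitlines()
    diff = list(difflib.unified_diff(
        passed_lines, failed_lines,
        fromfile=str(passed_path), tofile=str(failed_path),
        lineterm="", n=0,  # n=0 drops unchanged context lines
    ))
    # The first two lines are the '---'/'+++' file headers; '@@' lines mark hunks.
    # What remains is prefixed with '-' (passed state only) or '+' (failed state only).
    return [line for line in diff[2:] if not line.startswith("@@")]


# Hypothetical snapshot files, written to a temporary directory for the demo.
with tempfile.TemporaryDirectory() as tmp:
    passed = Path(tmp, "passed.txt")
    failed = Path(tmp, "failed.txt")
    passed.write_text("app.dll 1.2\nodbc.ini present\n")
    failed.write_text("app.dll 1.1\nodbc.ini present\n")
    for line in compare_snapshots(passed, failed):
        print(line)
```

Lines that match in both snapshots are filtered out, so a snapshot of thousands of lines shrinks to the handful that actually changed.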

Below, I have listed 10 uses and methods for the State Snapshot Process:

  1. Registry: Do you think you might have a registry problem? Perhaps your app’s setup.exe is not putting items into the registry, or is changing the wrong items? Do you want to know if uninstall returns the registry to the exact same state as before the install?
     RegEdit.exe has a menu item "Export Registry File" under the "Registry" menu permitting you to take a snapshot of the entire registry. You can use this feature to compare registry snapshot files before and after install, before install vs. after uninstall, between two machines where one fails and the other passes, etc. Shotgun.exe and similar tools help to automate registry state tracking.

  2. Directory: Do you think you might be missing some files on machine 1 but machine 2 has them all? Do you think that the install changed a file somewhere, but you don’t know which one and where?
     You can shell out and use the DOS DIR command to take snapshots of the hard-drive file names, file sizes, and file date/time stamps in multiple states. These State Snapshots can then be compared to determine which files changed. To make a snapshot, shell out to DOS and run DIR <path> /s > <target file>. For example: DIR C:\ /s > C:\TEMP\PASSED.TXT

  3. SQL Server Schema: Does one test server work and another have errors? Did the last build include back-end changes of which you are unaware?
     If yes, then a schema comparison will assist you in quickly identifying what component changed or is missing. Go into SQL Server Enterprise Manager and select a Server and Database. Then select the menu item "Object", and the sub-item "Generate SQL Scripts…". Next, check "All Objects", select each of the checkboxes in the "Scripting Options" frame (ignore the Security frame if you want), and be sure the "File Options" is set to "Single File". Next, click the "Script" button. Finally, enter a target filename when the "Save As" dialog pops up and press "Save". You now have a large text file containing all generated scripts for all tables, stored procedures, triggers, views, defaults, rules, etc. Take a similar snapshot of the same database in another state (on another server, before or after an upgrade patch is run, etc.) and compare the two schemas.

  4. SQL Server Data: Do you want to determine where data went at the backend after a front-end operation (as you’re quickly learning the backend of a new app)? Do you want to check if the reference tables (zipcodes, etc.) on your ship CD match those in your master database?
     Save the results of a SQL query execution window into a text file for multiple states. Then, compare the State Snapshot files to look for data mismatches, record count variances, etc. For example, you could take multiple state snapshots of a query that lists the record counts for all tables. Then compare the files to quickly isolate which tables had data added or removed. You could also do a simple select to dump the content of a target table to a file, then compare the contents over time, or across systems.

  5. Compatibility: Is the new code broken? Well, let’s compare it to the output produced from the old system. Do reports from the old system match up with reports from the new system? Do data operations on the new system produce the same results on the same data as the old system?
     Compatibility issues are great for file comparison utilities. To test printed output: set up a new printer that outputs to a file (not a COM or LPT port). Use "Start" button > "Settings" > "Printers" > "Add Printer", "My Computer", "File", … If you can, set up a text type printer; but if you must (due to the nature of the app), go ahead and set up binary output via HP or other printer. Next, run both the old and new versions of the test app. Print out identical reports, one from each system. Perform a file comparison on the two report files to locate any differences. The same snapshot comparison between old and new can be applied to tables modified by a test case in SQL Server, to data files created by the app, etc.

  6. Release CD: Do the Gold CD contents really match the final build drop?
     WinDiff.exe gives you a slick way of comparing all files in a directory against what should be a mirror image in your release drop point.

  7. Missing .DLL Files: Does your app not run after an install on machine N but works fine on all others? Does the app work on the developer’s machine, but not on your test machines?
     If yes, then the failing machine is probably just missing a DLL, or other support file. Grab a utility like WinNT’s Wps.exe, or MSOffice’s SysInfo to export a list of all loaded .DLL’s to snapshot files for comparison (sort by running column in SysInfo).

  8. Code Changes: Where were the code changes concentrated this build? What is a crude estimate of the churn rate (when lacking a better method)?
     When testing a new build, it is often a good idea to know where changes occurred and focus testing accordingly for that build-test cycle. If your group is using VSS, then you can quickly sort all source files by date, and then do a "Show History" and "Diff" the new build against previous builds. If you are not using VSS, then you can use WinDiff (or other util) to locate areas of source code change.

  9. INI File: Do you think something changed in the ODBC.INI or the application .INI during install?
     For those of us occasionally testing legacy apps, this could occur. To use the file comparison utility, just rename your .INI files to represent the different states (e.g., Failed.ini, Passed.ini, etc.) and then compare them.

  10. Find Files or Folders: Did you just do an install and forget where the error.log or setup.log went? Do you wonder what files have changed today since you did an install, or since your machine was broken by some unknown change?
     Well, the Win95 / WinNT4.0 "Find File or Folder" dialog can be of great assistance. Unlike the state change tracking methods listed above, this one does not require a file comparison utility. Simply right click the "Start" button and click "Find…". Set the "Look In" folder to C:\, and the File Spec to "*.*". Now for the most important part: select the "Date Modified" tab and set the date option to "During the Previous 1 days". Click the "OK" button, and presto, you now have a list of all the files touched by today’s activities.
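
The "files touched recently" search from the last item can also be scripted. The sketch below is one possible approach in Python, not a replacement for the Find dialog: it approximates "previous 1 days" with a rolling 24-hour window, and the function name `files_modified_within` is hypothetical.

```python
import os
import time


def files_modified_within(root, days=1):
    """List files under root whose modification time falls within the last `days` days."""
    cutoff = time.time() - days * 24 * 60 * 60
    recent = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getmtime(path) >= cutoff:
                    recent.append(path)
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
    return sorted(recent)


# Example: list everything under the current directory touched in the last day.
for path in files_modified_within(".", days=1):
    print(path)
```

Redirecting this output to a text file also turns it into a State Snapshot that can be compared against a later run.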

In conclusion, don’t give up on those seemingly intermittent, non-reproducible, or otherwise bizarre bugs. By using the State Snapshot process, you might be able to isolate additional symptoms that give you a consistent repro path, or that trigger recognition in the developer’s mind. On bugs that are external to the code, you might even be able to identify the subtle causal agent behind the bug (e.g., the wrong version of a DLL, missing ODBC drivers, or a corrupt registry setting).