February 21, 2004
"Using testbeds to accelerate technology maturity... SCRover" by Boehm et al
The paper has interesting ideas about how to measure reliability and how to use seeded errors to show that you've improved it. The results are in the context of UML architecture-level work, but the ideas are interesting for us at the machine level.
Their seeded defect model for evaluating a technology's impact on reliability is particularly interesting. Here are some quotes:
Suppose an experiment shows that in a given situation, the technology being evaluated finds 3 defects. How can we tell whether this is 100% of 3 defects or 3% of 100 defects? The best technique found to date is the seeded defect technique adapted from previous statistical techniques to software testing [18].
[18] is H. Mills, "On The Statistical Validation of Computer Programs", IBM Federal Systems Division Report 72-6015, 1972.
If we insert 10 representative defects into the software, and the technology being evaluated finds 6 of them, the maximum likelihood estimate is that the technology has found 60% of both the seeded and the unseeded defects. In general, if we insert I seeded defects, and the technology finds S seeded defects and U unseeded defects the maximum likelihood estimate of the total number T of unseeded defects is T = I*(U/S).
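To make the arithmetic in that quote concrete, here is a minimal sketch of the estimate T = I*(U/S). The function name and the pairing of the two examples (10 seeded defects with 6 found, plus the 3 unseeded defects from the earlier quote) are my own assumptions for illustration, not something the paper spells out.

```python
def estimate_total_unseeded(seeded_inserted, seeded_found, unseeded_found):
    """Seeded-defect estimate of the total number of unseeded defects.

    Implements T = I * (U / S) from the quote above:
      I = seeded_inserted, S = seeded_found, U = unseeded_found.
    """
    if seeded_found <= 0:
        # With no seeded defects found the detection rate is zero
        # and the estimate is undefined.
        raise ValueError("need at least one seeded defect found")
    return seeded_inserted * unseeded_found / seeded_found


if __name__ == "__main__":
    # Assumed example: 10 seeded, 6 of them found, 3 unseeded found.
    total = estimate_total_unseeded(10, 6, 3)
    print(total)  # 5.0, i.e. the 3 found are 60% of an estimated 5
```

So under those assumed numbers, the answer to "100% of 3 or 3% of 100?" would be "60% of about 5."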
The "seeded defects" are really just the defects found during normal code and architecture reviews. The experiment, then, is to see how many of those known, I mean seeded, errors were found using the new analysis technique.
"These defects were classified under a categorization schema similar to Orthogonal Defect Classification [5]." [5] is R. Chillarege, I.S. Bhandari, J.K. Chaar, M.J. Halliday, D.S. Moebus, B.K. Ray, and M-Y. Wong, “Orthogonal Defect Classification - A Concept for In-Process Measurements”, IEEE Transactions on Software Engineering, 18(11), 1992.
The other interesting thing about the paper is their analysis of their Mae tool.
USC’s Mae technology serves as an intermediate step between the UML diagrams and the implemented system. Mae is an extensible architectural evolution environment developed on top of xADL 2.0 [6] that provides functionality for capturing, evolving, and analyzing functional architectural specification [21].
Basically, they are using UML to describe their system and they want to convince themselves that the UML diagrams say the right things. That's hard because UML isn't executable and doesn't have a formal model.
So their results are interesting, but not that applicable to our tool since we are working on machine code, not architectural level stuff.
In the end, I think we use the same seeded error model and work to improve the maximum likelihood estimate by finding more errors in the code until we are able to find all the errors in a piece of code. We focus in on the "most difficult, intricate 1000 lines of machine code in the software," then throw the rest into the environment. Real time works because we have real time in the environment. That's not bad. Year to year, we show that we are decreasing the MLE for the number of remaining errors.
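A rough sketch of what "decreasing the MLE for the number of remaining errors" could look like, assuming "remaining" means the estimated total T = I*(U/S) minus the U unseeded errors already found. That reading is mine, and the yearly numbers below are made up purely for illustration.

```python
def estimate_remaining(seeded_inserted, seeded_found, unseeded_found):
    """Estimated unseeded errors still undetected: T - U, with T = I * (U / S).

    Assumption: "remaining errors" is the estimated total minus those found.
    """
    if seeded_found <= 0:
        raise ValueError("need at least one seeded error found")
    total = seeded_inserted * unseeded_found / seeded_found
    return total - unseeded_found


if __name__ == "__main__":
    # Hypothetical (I, S, U) per year, not real results.
    runs = {"year 1": (10, 4, 2), "year 2": (10, 8, 4)}
    for label, (i, s, u) in runs.items():
        print(label, estimate_remaining(i, s, u))
```

With these invented inputs the estimate of remaining errors drops from 3.0 to 1.0 as the tool finds more of the seeded errors, which is the kind of trend we would want to report.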
Open questions for us:
- What kinds of errors are we going to find? Look up the Orthogonal Defect Classification model.
- How are we going to seed errors? We are claiming that we can find errors that are difficult to find using any other method. So we have to have a method for seeding realistic errors without the ability to detect them.
Posted by jones at February 21, 2004 10:36 PM