« Numbered Pseudocode in Latex | Main | Java Swing Componets and how to use them »

October 09, 2003

"Formal Analysis of the Remote.." by Havelund

[Formal Analysis of the Remote Agent Before and After Flight - Havelund, Lowry, Park, Pecheur, Penix, Visser, White (ResearchIndex)] This is a methodology paper about using explicit model checkers to find concurrency errors in space flight software written in Java. This is a rare methodology paper in the sense that the authors were able to analyze a non-intentional bug that occured during deployment in a working piece of code.

1. What problem is being addressed?
NASA wants to launch a remote agent into space. They'd like some reassurance, before the launch, that the RA will function correctly. This is a hard problem because a flight control system is built on several microprocessors using complex concurrent software. Complex concurrent software is difficult to debug. Explicit state model checking is a good tool for debugging complex concurrent software, but explicit model checkers suffer from capacity limits and input language limits. The real problem is that NASA wants/needs to be convinced that explicit model checking can result in more reliable flight control systems.

2. What approach is being taken to solve the problem?
The authors completed two case studies in which they applied EMC to the RA. In the first, they translated a hunk of LISP code to PROMELA and ran spin on an abstraction of the resulting PROMELA. This required alot of work and was an inspiration for the java pathfinder. Using SPIN located 5 high-quality we-couldn't-have-found-them errors. When the mission was in flight, the controller experienced a deadlock. The second verification effort was a case study to find the deadlock that occured in space. The deadlock was found and was isomorphic to one of the original 5 errors.
In the second case study, the verification team got a copy of all the java code and nothing else. They set up a front end "review team" to look over the code and identify regions likelty to contain the error. They set up a back end team to recieve the code identified by the front end team and actually find the error. The review team identified about 700 lines of code. they turned that 700 lines of code over to the JPF the JPF translated it into promela and spin found the error.
More specifically, the error depended on a missing critical section around a conditional wait.

3. What is the claim about the approach?
"formal methods tools can find concurrency errors that indeed lead to loss of spacecraft functions, even for complex software required for autonomy"

4. How do the authors support their claim?
The claim is an existence claim. They support their claim by showing that such an error has been found using SPIN.

5. Do you believe they established their claim?
Yes.

6. How does their approach, claim and justification relate to your problem and solution?
The deadlock that happened in flight was caused by a race condition between two concurrent threads. We are argueing that threads are the main problem in flight controll software.

Posted by jones at October 9, 2003 10:58 AM

Comments

Post a comment

Thanks for signing in, . Now you can comment. (sign out)

(If you haven't left a comment here before, you may need to be approved by the site owner before your comment will appear. Until then, it won't appear on the entry. Thanks for waiting.)


Remember me?