« May 2004 | Main | July 2004 »

June 30, 2004

[Research] hash function references

[Partitioned Hashing] Our work in model checking using magnetic disk needs some more background in hash functions since that turns out to be a central part of the work.

Turns out "partitioned hashing" in database land means splinigin the key into partitions and using each parition in the hash value computation.

"Uniform hashing" means each key has the same probability of ending up in each given chain. < ahref="http://www.personal.kent.edu/~rmuhamma/Algorithms/MyAlgorithms/hashTable.htm">uniform hashing

"Perfect hashing" means there are no collisions.

Posted by jones at 08:00 AM | Comments (0)

June 28, 2004

[Research] Statis analysis introduction

[Introduction to Static Analysis and Logic Programming, Deadline analysis of interrupt driven software by Brylow and Palsberg, STatic checking of interrupt-driven software" also by Brylow, Damgaard and Palsberg] The main idea is to use static anlysis to identify program points where concurrency errors might occur. Then use carefully controlled interleaving at the assembly level to either confirm or refute those concurrency errors. Static anslysis is a kind of control abstraction, the model checking is a refinement of the control abstraction to locate errors.

Posted by jones at 09:00 AM | Comments (0)

June 24, 2004

[Misc.] Big List of Presidential Candidates

[Politics1 - 2004 U.S. Presidential Election (P2004)] FYI

Posted by jones at 08:36 PM | Comments (0)

June 08, 2004

[Tech Support] everything you ever wanted to konw about gdb breakpoints

[Debugging with GDB - Set Breaks] Including how to break at a certain line a given file.

Posted by jones at 09:51 PM | Comments (0)

June 03, 2004

[Research] Magnetic disk in model checking

[ Murphi description language and automatic verifier , Using Magnetic Disk, della Penna, Exploiting Locality, CHARMe 2003, della Penna, Integrating RAM and Disk]

We are having a hard time reproducing the results of the original Murphi-with-a-disk paper. We aren't seeing 15% slowdowns due to using disk. We think its because their RAM table is so small that their RAM results are really slow.

della Penna also has conflicting claims regarding the speed of the original algorithm. The della Penna algorithm is 10x faster than the previous algorithm and only gets a slowdown of 3x slower, on average. The previous algorithm only gets a 15% slowdown so 10x faster than a 15% slower algorithm is not going to be 3x slower?!

della Penna's experiments allow 1x, 1/2 and 1/10 of the memory needed by the disk based algorithm. They report the numbers from the 1/10 memory results. This means that when 1/10 of the memory needed for RAM-only is used by the disk-based algorithm, then they are 3x slower on average. However, if they use 1x the memory needed for RAM only, then they get 1.89x slower on average. And, they are 1.95x slower when using 1/2 the needed memory.

We are going to get blindsided by compression options. We didn't compress anything. That puts us in a different ballpark.

In relation to our work:


  1. We did most of our experiments with 1x the memory needed to do the whole thing in RAM.
  2. We didn't use compression etc. options, I think.
  3. We still claim that disk is irrelevant.
  4. We don't understand the resutls in the Stern Dill paper, but can probably reproduce them if we use compression an the right amounts of memory.

Later that day... In our experiments, the relative speed up is very sensitive to the amount of memory given to the RAM based algorithm. If we allocate just enough memory to the hashtable to completely store the required states (and use the double hash hash table) then the RAM algorithm slows significantly due to the inefficiency of a large double hash hash table with a high hash load. The slowdown when using just-enough memory and a chained hashtable is less severe because chained hash tables, in genral, perform more efficiently at high hash loads. Typically, one computes speed up relative to the fastest known sequential algorithm--rather than the sequential algorithm i nthe worst possible setting.

Second, the number of states in the model is a signficant facter in the speedup/slowdown computation that is missed in the previous experimental setups. Previous experiments focus on "speedup/slowdown as a function of the ratio between memory needed to complete the model and memory allocated in the disk based algorithm." So allocating 1/10 the required memory to the disk based algorihtm is not as important as the number of states in the model. Because, as the number of states increases, then the time required to do checkTable increases significantly thus increasing the disk-algorithm verification time at all memory allocation ratios.

Third, we can cut down our file IO time by appending the newly added states to a partition rather than rewriting the entire partition to disk after a swap. This will be paritcularly helpful at the end when partitions are kind of big.

Fourth, we get alot of swaps at the beginning and end of the search. as expected. However, using a chained hash table the swaps at the begining are less costly because the tables are so small. The swaps at the end are quite costly since the tables are larger--unless we do the append trick rather than writing the whole file.

Posted by jones at 09:57 AM | Comments (0)