« December 2004 | Main | February 2005 »

January 31, 2005

[Projects] Discussion of Crossing the Desert

I was a little surprised today that so many people were having so many issues with the project. Well, I wasn't surprised that there were issues, but I was surprised that I didn't hear anything until today.

Here's the idea: post your problems related to project 1 here. If you've got a solution to your problem or someone elses' post it here too. I get mail everytime a comment is posted.

Posted by jones at 06:48 PM | Comments (2)

[Projects] Crossing the desert

[File distribution, project description (see page 5)] This project is due January 31. The problem is solvable using a greedy algorithm, the main challenge is designing the cost function. (Distribution has been fixed).

Posted by jones at 03:28 PM | Comments (2)

January 27, 2005

[Announcements] Online access to C# and .net reference books

[Safari Books Online] It turns out that our library has a subscription to an online technical book service. It turns out that it includes almost all of the O'Reilly books. While O'Reilly is most famous for its unix-y technical books, O'Reilly also has a slew of books on C# and .net programming.

So you can get access to a bunch of C# and .net programming reference books for free. And that's a good thing.

If you are accessing from off-campus then you will have to type in your RouteY information to get access.

Here's a note from John Christensen at the library that gives more details about this and other collections available online:
"I am covering Computer Science this year for the library while we try to hire a new librarian over that area since our last one left. I teach classes about library use which include computer science students and I asked some of them today how to get the word out about a computer science resource the library has. They suggested that I contact the faculty and ask you to get the information out through announcements in classes as well as an announcement on the CS webpage. The resource I am referring to is the Safari collection of computer science books that the library subscribes to. You may or may not be familiar with the Safari collection. You can view it at: http://proquest.safaribooksonline.com/?uicode=byuprovo We have a partial subscription to Safari. Our subscription includes all the O’Reilly books in the collection. They represent about 1/5 of the collection and we are doing this as a trial. I have been told by the library automations people over here that O’Reilly is probably the premier publisher and there are a number of our automations people that have had personal subscriptions to Safari before the library started their subscription.

The CS students I talked to all said that if they had known about Safari they would have used it a lot and will now be trying it out. As you can see from the webpage from Safari, it is the full text of the books and they are indexed so you can search for answers to questions within the books and then go right to the material to answer those questions. Could you let students know about Safari in your classes? I have sent a similar email to your CS Webmaster also asking for his help in spreading the word. Students can get to it on the BYU library webpage by going to “Search by Discipline,” then “Physical and Mathematical Sciences,” then “Computer Science.” On our Computer Science page it is the last item at the bottom of the page. It is available off-campus. If you are off-campus and connect you will get a page asking for your Y-Net information and when you input that information you will be allowed into Safari just as though you were on campus."

Posted by jones at 10:39 AM | Comments (0)

January 26, 2005

[Homework] Minimum spanning trees (due 1/26)

1. Can a minimum spanning tree have a cycle? why or why not.
2. Suppose you are going to build a network connecting n computers. Suppose you know the cost to connect any pair of computers. Should you use Kruskal's algorithm or Prim's algorithm? Why?

solution
1. No. It wouldn't be a tree first of all. And second of all, if a "minimum spanning graph,er,tree" had a cycle, then you could drop at least one edge and get something that is more minimal and still spans every vertex in the graph. There's an assumption about edge costs being made here. What is it?

2. Prim's. See first full paragraph on page 198.

Posted by jones at 02:34 PM | Comments (2)

January 25, 2005

[Lectures] Minimum spanning tree notes and reading

We didn't exactly get through all of Monday's lecture, however, we did have a productive discussion of minimum spanning trees. At the end of class, I said read section 3.6 (I think) that's wrong you want to read section 6.3 for the homework. The lecture notes are in this directory... [Index of /~jones/cs312/lectures/prior_semesters] On wednesday we are going to move on to Dijkstra's algorithm for shortest paths.

Posted by jones at 08:57 AM | Comments (0)

January 24, 2005

[Homework] Parts of a Greedy Crossing the Desert Algorithm (due 1/24)

Answer the Five Demands of a Greedy Algorithm for the crossing the desert problem. Due 1/24 by email before 1pm.

The Five Demands of a Greedy Algorithm are:


  1. What is the candidate set for a given node? Meaning, which nodes can I choose the next node in my solution from?
  2. What is the solution function? Meaning, how do I know if I have created a solution and I can stop?
  3. What is the feasibility function? Meaning, how do I know if adding more candidates will eventually result in a solution?
  4. What is the selection function? Meaning, how do I chose the next candidate for inclusion?
  5. What is the objective function? Meaning, how do I measure the goodness of a solution?

Solution


  1. You may choose any node within capacity/3 from a node in the solution set if you need a round-trip caching-trip to build a cache or you may choose any node with capacity/2 from a node if you don't need a round-trip caching trip.
  2. You can stop when you have added the start node to the solution set. You also may stop if there are no more nodes to add or if you have exceeding 1 Million food on all paths.
  3. The feasibility function is whether or not you have exceeded 1 Million food on all paths.
  4. Select the node from which it is cheapest to build the cache at a node in the solution set.
  5. How much food does it take to get from start to finish?

Posted by jones at 03:02 PM | Comments (2)

January 21, 2005

[Study Problems] Prim's algorithm on a disconnected graph

What happens if Prim's algorithm is deployed on a disconnected graph? A disconnected graph has some nodes that are not connected to any other nodes.

Posted by jones at 02:33 PM | Comments (1)

[Homework] Computing the cost (due 1/21)

Suppose you are at an oasis at (2,7) and want to travel to an oasis at (10,16). Your carrying capacity is 100 and you want to cache 105 food units at oasis (10,16). How much food do you need to build a cache of 105 food units at (10,16) assuming you start at oasis (2,7) and end at oasis (10,16)? Explain how you computed your result. Due 1/21.

solution:

You need to store 105 food at the end. The distance beween the oases is

 
sqrt ((2-10)^2 + (7-16)^2) = 12.041
So you can carry with you an extra
 
LastTripCacheCap = 100 - 2 * 12.041 = 75.918
food units to build the cache on your last trip to (10,16).

That means you will need

 
CacheRoundTrips = max (0, ceil (105 - 75.918) / (100 - 3 * 12.041))) = 1
caching trip.
The cost of that caching trip is
 
CacheRoundTripsCost = 2 * 12.041 + (105 - 75.918) = 53.164

food units. And a total of
 
TotalCost = (100 - 12.041) + 43.164 = 141.123

food units to build a cache of 105 food at (10,16) and end a oasis (10,16)

Posted by jones at 08:40 AM | Comments (0)

January 19, 2005

[Homework] Homework Grades

The first two homeworks have already been graded and placed on blackboard. If you have questions about your grade then send me an email and I'll try to explain why you got the score you did.

Also, I have not recieved submissions from many people. Please check to make sure that if you submitted homework it has been graded.

Posted by terry at 05:39 PM | Comments (0)

[Homework] amortized analysis (due 1/19)

Use the accounting trick method to show that the count algorithm from the text page 113 has amortized linear running time. Assume that the counter is intially set to 0. Due January 19 before class via email.

solution
As in the textbook, add 2 tokens per call to count and remove 1 token for each trip through the while loop. Each trip, except for the last one, through the while loop flips a bit from 1 to 0. The last trip flips a bit from 0 to 1. This is like adding 1 to a binary number and rippling the carry bits until there are no more carry bits to ripple.

The crux of the homework is to show that each call either increases the number of tokens in the account by the increase in the number of bits set to 1 or decreases the number of tokens in the account by the decrease in the number of bits set to 1.

Posted by jones at 08:33 AM | Comments (0)

[Lectures] Notes for amortized analysis and Crossing the Dessert

The notes I used for the lectures on 1/19 and 1/14 in which the topics were amortized analysis and crossing the desert can be found in 0-04-amortizedpnp.pdf and 0-05-breatherdesert.pdf at this location Index of /~jones/cs312/lectures

Posted by jones at 07:51 AM | Comments (0)

January 18, 2005

[Projects] New Visualizer for the Desert Lab.

Roger has thoughtly spent his nights and weekends working away to create a new visualizer for the desert lab. I have tried it. It works great! The difference between the new version and the version we distributed before is that the new version graphically shows all the nodes so it might make it easier to guess what your solution is doing.

If you are interested in this new visualizer download it from where you downloaded the lab originally (the new version has been posted in place of the old version).

If you have any problems then please send an email to cs312ta[at]gmail[dot]com.

Posted by terry at 06:47 PM | Comments (0)

[Projects] Maximum Parsimony Scoring

[A good explanation (.ppt), Google Search: maximum parsimony scoring algorithm ] It looks like max parsimony scoring can/should be done one letter at a time. That's good because that's the only way I know how to do it!

Maximum parsimony is the process of finding the simplest explanation possible for a set of data. In phylogenetics, it means finding the simplest tree that might explain the evolution of a set of taxa. Of course, the trick is how to define simple. For this lab, we are going to define "simple" as requires the fewest mutations of each nucleotide assuming that each nucleotide's mutation is independant of the others.

Posted by jones at 09:29 AM | Comments (0)

January 15, 2005

[Exams] Do you know what you think you know?

There was an interesting article about the relationthip between self-assessment and actual competency. I am not sure about their methods, but the conclusions are interesting--especially in light of our discussion about homework and study problems.

Basically, the more people thought they know, the less they actually konw. And the less people thought they knew, the more they actually knew.

Posted by jones at 08:26 AM | Comments (1)

January 14, 2005

[Announcements] We will do 312 in C#

Based on feedback and discussion, we have decided to do 312 in C# this year. The first project is designed for you to write a C++ dll for an application (because we needed to get the project designed before the C# decision). The remaining projects will require you to use C#.

I put a little link to learning C# at google for your convenience.
[Google Search: learning C# ]

Posted by jones at 01:23 PM | Comments (0)

[Study Problems] max rule, big O, NP

I think the general feeling in the class is that some homework would be helpful to let people know what they should be learning at this point in the class. Here's a few questions you should be able to answer. Its due Wednesday January 19 (before class, via email as usual). Well, I am not sure where I was getting that feeling from, but that's not what I got discussing that idea with the class. So we've created "study problems." Study problems are like homework problems except they aren't required or graded. I will post solutions later.

1. Prove that O(f(n) + g(n) + h(n)) = O(max(f(n),g(n),h(n)). You don't need to give the level of detail we gave in class last week, but you do need to write the proof. I would expect to see proofs that are about 3-6 lines long.

solution Proof idea: the proof follows the proof in the book except the constant is 3 instead of 2.
Proof: suppose q(x) is in O(f(x) + g(x) + h(x)) then q(x) is in O(max(f(x), g(x), h(x)) because three times the greatest of f(x), g(x) and h(x) is greater than or equal to the sum, more precisely:

 
q(x) <= c(f(x) + g(x) + h(x)) <= 3c(max(f(x), g(x), h(x)).

Next, suppose q(x) is in O(max(f(x), g(x), h(x)) then q(x) is in O(f(x) + g(x) + h(x)) because the greatest of f(x),g(x) and h(x) is less than or equal the sum of all three (assuming all three functions are positive valued). In formulas:
 
q(x) <= c max (f(x),g(x),h(x)) <= c (f(x) + g(x) + h(x)).

Proving the max rule for a fixed number of functions is pretty easy, once you see the pattern. However, proving the max rule for any number of functions using induction is a little harder--but not much.

2. Put these functions in order of increasing growth with the most slowly growing functions on the left. That is, put f(n) before g(n) if f(n) is O(g(n)). (n-2)!, 2^(2n), 0.0001n^4, n log n.

solution
n log n, 0.0001n^4, 2^(2n), (n-2)!.
There were two non-obvious things here. The first one you should all have known. The 0.0001 on the front of n^4 is irrelevant. Second, a factorial grows faster than an exponential. To see that a factorial grows faster than an exponential, recall that n! includes multiplication by n, n-1, n-2, all the way down to 1 and 2^n includes multiplication of 2 n-times. So in the factorial term, you multiply n terms, of which n-2 are geater than 2, and in the exponential term, you multiply n terms all of which are equal to 2.


3. What is the difference between "undecidable" and "NP-Complete"?

solution The difference is that "undecidable" means you can't solve it and "NP-complete" means is probably really hard to solve, but solvable given enough time.

This is an easy question designed to make sure people realize that NP-complete doens't mean undecidable. It just means really hard.

Posted by jones at 08:26 AM | Comments (1)

January 13, 2005

[VisualStudio] Graph Visualization and Layout in C#

The his entry covers stuff I have looked at and tried to get graph layout and visualization to work in C# for the phylogenetics project.

The basic plan thus far is:



Before we can really use this, we need to figure out if the students can install the GraphViz and SVG Viewer on the downstairs lab machines. This is going to be critical for the viewer to work. The only other choice is to try and ship the viewer as a standalone application. I am not sure, however, that we can statically link in the things we need for that.

Posted by egm at 12:30 PM | Comments (0)

January 12, 2005

[Announcements] Devotional Thoughts

[For the Strength of Youth (pdf)] We will have a thought from For the Strength of Youth each Wednesday after a devotional or forum. The person giving the thought will select a paragraph or less from For the Strength of Youth and say how its related to the devotional or forum. This should take 2-3 minutes. If you missed the devotional or forum, you can find a summary in the Religion Section of NewsNet or the Campus Section.

Mail me if you don't want to give one of these thoughts and I won't ask you to. Otherwise, you are fair game!

Posted by jones at 02:41 PM | Comments (0)

[Projects] Random thoughts on a phylogenetics lab

We think we have reached concensus on a lab from phylogenetics. The following isn't quite a full write up of the lab, but its a convenient place to record the ideas that will go into it.

Do a simplification of branch swapping. In which we randomly pick an interior edge, delete it AND its endpoints to form a quartet of trees. Then we recurse on the four subtrees. The recursion bottoms-out at say 4 taxa (leaves). The recombination step takes the 4 subtrees and recombines them in any of the 24 possible re-joinings of 4 subtrees around a single (new) interior edge. A scoring function based on maximum parsimony ranks the 24 recombinations and picks the very best one. This one is passed up to the parent for recombination and the next level up.

The tree datastructure isn't really a tree because it doesn't have a root. It doesn't have a root because that doesn't make sense from a biological stand point (I think). So the tree is really a Steiner topology. We then need a data structure that supports the splitting and joining of Steiner topologies. One way to do this is to do a doubly linked list with three links per node rather than just 2.

An interior edge is an edge that connects two interior points. An interior point is a branching point with degree three. In the datastructure, an interior point is a node with three non-null pointers.

The biological interpretation still makes sense, but the difficulty of implementing the algorithm has been reduced. That means we will still find meaningful solutions, but not optimal solutions.

Problems to solve:

Posted by jones at 01:29 PM | Comments (0)

January 11, 2005

[Homework] What does it take to show NP-completeness?

If you want to prove that problem X is NP-complete, what do you have to prove about problem X? Turn this in (the normal way, via email or on paper) by 2pm Wednesday January 13. If you skimed over Chapter 12 already, this will be quite easy. Solution has been posted below...

Solution: You must show that X is in NP. This can be done by either giving a non-deterministic Turing machine that solves X is polynomial time, or, by giving a deterministic algorithm that verifies a solution to X in polynomial time. One can prove that a problem is in NP if and only if there is a deterministic algorithm for verifying "guessed" solutions in polynomial time.

Second, you must show that every other problem in NP can be reduced to X in polynomial time. More precisely, you must find a function f such that: for all Y in NP, y is in Y iff f(y) is in X. And you must show that your function f can be computed in polynomial time on a deterministic turing machine.

The first problem that was shown to be NP complete is the satisfiability problem. Given a boolean formula, like (a /\ b \/ c) the satisfiability problem is the problem of determining whether or not values can be assigned to a, b and c such that the formula evaluates to true. Showing that satisfiability is NP complete required showing how to take any non-deterministic turing machine and reduce it to a satisfiability problem in polynomial time. Interestingly, this was done using a process that looks a lot like synthesizing a circuit from a program. In the end, the satisfiability problem looks like a circuit you might use to implement the program.

Posted by jones at 04:59 PM | Comments (0)

[Projects] Part of a Phylogenetics project

[Majority Tree Algorithm] After dividing up a tree into pieces, one needs a way to recombine the trees. This algorithm looks like a good idea for 312.

Posted by jones at 02:13 PM | Comments (0)

[VisualStudio] Memory optimization in managed code

[Memory Lane: Rediscover the Lost Art of Memory Optimization in Your Managed Code -- MSDN Magazine, January 2005] A student in another section pointed out this article. Its an interesting read.

Posted by jones at 02:11 PM | Comments (0)

January 10, 2005

[Projects] Disk covering method

[The Disk Covering Method for Large Scale Tree Reconstruction] We are still looking for a good description of a divide and conquer method for phylogenetics. This algorithm looks like a good candidate. Now we need a more detailed description.

Posted by jones at 08:47 PM | Comments (0)

[Projects] How hard was it to learn C#?

Based on what I am hearing from you all, here are the conclusions I've reached regarding C# in 312...

  1. This wasn't a very good experiment. Something that actually required writing more C# would have been helpful.
  2. The visual Studio aspect of the project (ie, the windows and buttons and etc) were interesting and easy. Though it was wierd to use code you didn't write.
  3. The most annoying part of the change was the syntax (like figuring out how to write "exponent").
  4. VisualStudio's help interface got mixed reviews.
  5. The majority of students were either neutral or for using using C#.

Posted by jones at 02:51 PM | Comments (0)

[Lectures] Finishing big-O, and notation in lectures

The lecture on the max rule proof on Friday got a little dense. I think it would have been better if we'd discussed the idea behind the proof before diving into the detail. We might also need to discuss big-O a little more too. Very few lectures in 312 will have that much notation!

By the way, the reading for each day and the lecture notes will help you understand the lectures better. The big-O stuff comes out of these lecture notes: 011-asymptoticbounds.pdf and 012-asymptotic2.pdf in the old lecture notes directory.

Posted by jones at 02:27 PM | Comments (0)

January 08, 2005

[VisualStudio] Remote desktop client for Mac

[Remote Desktop Client for Mac] If you've got a mac and a windows machine and a network connection, then you can use this program to connect to your windows machine from your mac. I use it at home all the time. rdesktop is a linux/unix/X11 version of the same thing that you might also find useful.

Posted by jones at 12:03 PM | Comments (0)

January 06, 2005

[Syllabus] TA Hours

The TA hours for this class can be found on the following webpage.

http://students.cs.byu.edu/~cs312ta/class-info/w05-001/office-hours.html

Hope to see you here.

Posted by terry at 12:12 PM | Comments (0)

January 05, 2005

[Syllabus] Terry Contact Information

The best way to get in hold of me is through email. My email address is

terry AT byu DOT edu.

Also during office hours I'll try to be logged into Windows Messenger from the above email address.

If anyone wants to meet with me, but can't come during my office hours then send me an email and I'll try to find a time that will work.

Posted by terry at 05:15 PM | Comments (0)

[Homework] How hard is it to learn C#?

[Visual C# Developer Center] In which you spend no more than 3 hours writing a simple C# windows application that does simple data manipulation with a button and text box. The purpose is to see how hard it is for you to write in C#. Details in the full body of this post... Due Monday January 10th by 1pm

Eric Mercer (who teaches the other section) and I were trying to guess how hard it would be to have 312 students write in C# rather than C++. We think it wouldn't be that hard, but want to try an experiment to find out. Eric and I agree that using C# is much more natural and elegant in VisualStudio than using C++ in VisualStudio. But the problem is that 312 is supposed to be about algorithms not learning a new language. So if learning C# is going to be a big drag, then we'll stick with C++. Otherwise, if we can use C# without distracting from the purpose of the class then the projects will be a whole lot more fun.

So here's the homework...

Create and build a Visual C# application that


See the helpful post on using VisualStudio for tips.

Posted by jones at 01:51 PM | Comments (10)

[Projects] Creating a windows application with a textbox and button in C#

A few pointers for creating a windows application in C# that has a textbox and a button. Clicking the button changes the value in the textbox. This has alot of interface programming problems, but should be a nice introduction to what the tools do.

Posted by jones at 01:46 PM

[Syllabus] CS 312 Section 003 syllabus

[Syllabus for CS 312 Section 003] Contains class policies, assignments, grading information etc. Also contains contact information.

Posted by jones at 09:53 AM | Comments (0)

January 04, 2005

[Lectures] Notes from Last Year

[Lecture notes] They are mostly pdf files, with some text files. The first two digits of the file name are roughly the order in which the lectures were given. We will probably not do very many powerpoint lectures (since there's only 10 students in the class, we can just discuss stuff) put these should give you an idea of what's in there.

Posted by jones at 11:07 AM | Comments (0)

January 03, 2005

[Schedule] CS 312 Section 003 schedule

[312 Schedule] This is what the schedule looks like today. I have a day by day schedule, but its likely to change as the semester goes on. The project due dates might even change on this one.

Posted by jones at 09:29 PM | Comments (0)

[Projects] Drill core alignment project

[Drill core alignment] In which you use the longest common subsequence algorithm to align soil layers in drill core samples. The hard part is figuring out what to do with misaligned layers. We restrict the problem to two dimensions to make it easier. Due on or about March 2.

Posted by jones at 11:38 AM | Comments (0)

January 02, 2005

[Projects] Crossing the desert

[Crossing the desert] the project description is on page 5. We will use a different input format because you will be reading problems from an Access database and writing to a database. Well, the TA code will handle the database stuff and you get to implement the function that solves the problem. More details about the TA code (including a link to the TA code) coming soon.

Posted by jones at 07:43 PM | Comments (0)