Personal Projects

Combined project!

Making life easier for a research scientist: A case study

A crime scene investigator, Dr. Doolittle, is called to the scene of a gruesome murder. As he pulls into the Chicago zoo, Dr. Doolittle finds the mane attraction, a large male African lion, lying in a pool of blood. The zookeeper has isolated several suspects who were found near the scene and are also covered with blood: a chimpanzee, river buffalo, opossum, platypus, and rat. Unfortunately, the lion was so badly maimed during the attack that it is difficult to tell whether the cause of death was strangulation, crushing, or biting. Fortunately, Dr. Doolittle knows that all of the suspects' genomes have been sequenced, so he collects a sample of the blood at the scene to take back to the lab.

Dr. Doolittle centrifuges this blood sample, separating it into red blood cells, buffy coat, and plasma. After discarding the buffy coat and plasma, he lyses the red blood cells and extracts the total RNA. Using advanced sequencing techniques, he obtains the following strand of RNA:

For this step, we will need to write a program for ourselves that reverse-translates the target protein sequence into a strand of mRNA, which we can then give to the students in the case study.
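A minimal sketch of such a reverse-translation helper might look like the following. It assumes the standard genetic code and picks one codon per amino acid arbitrarily, since many different mRNA strands encode the same protein. The names `PREFERRED_CODON` and `reverse_translate` are hypothetical and not part of the case study itself.

```python
# One codon per amino acid, chosen arbitrarily from the standard genetic code.
# Assumption: any synonymous codon is acceptable for the student handout.
PREFERRED_CODON = {
    "A": "GCU", "R": "CGU", "N": "AAU", "D": "GAU", "C": "UGU",
    "Q": "CAA", "E": "GAA", "G": "GGU", "H": "CAU", "I": "AUU",
    "L": "CUU", "K": "AAA", "M": "AUG", "F": "UUU", "P": "CCU",
    "S": "UCU", "T": "ACU", "W": "UGG", "Y": "UAU", "V": "GUU",
    "*": "UAA",  # stop codon
}


def reverse_translate(protein):
    """Return one possible mRNA strand encoding the given protein.

    `protein` is a string of one-letter amino acid codes; '*' marks a stop.
    Raises ValueError on any letter that is not a standard amino acid.
    """
    codons = []
    for residue in protein.upper():
        if residue.isspace():
            continue  # tolerate line breaks in pasted sequences
        if residue not in PREFERRED_CODON:
            raise ValueError(f"unknown amino acid code: {residue!r}")
        codons.append(PREFERRED_CODON[residue])
    return "".join(codons)
```

For example, `reverse_translate("MVL*")` yields an mRNA strand that starts with the codon for methionine and ends with a stop codon. A more realistic handout sequence could vary the codon choice per position, but a fixed table keeps the output reproducible.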

    1. Using your handout, translate this RNA sequence into the corresponding amino acid sequence.
    2. Find out which protein this amino acid sequence corresponds to by performing a BLAST search at
    3. Find the 3D structure for this protein using the Protein Data Bank (PDB) at

Phew, that was a lot of work! Dr. Doolittle knows that you are a computer scientist (programmer?) and thinks you may be able to simplify the process he went through. He really hopes you can help, because he's gotten calls for five more zoo crimes already! As he walks you through the steps, you realize that you could indeed write a few short programs to automate them.

First, you write a program to translate the RNA strand into its protein sequence. We'll write this program.

    1. Open Emacs.
    2. Write the following program: (will be provided)
    3. Input the RNA sequence.
    4. Run the program. 
    5. Copy the protein sequence.
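The program to be provided is not shown here, but a minimal sketch of the translation step could look like this. It assumes the standard genetic code, reads codons starting at the first base of the input (no search for a start codon), and stops at the first stop codon. The names `CODON_TABLE` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G order for
# each codon position ('*' marks a stop codon).
_BASES = "UCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(rna):
    """Translate an RNA strand into a one-letter amino acid sequence.

    Assumptions: the reading frame starts at the first base, translation
    ends at the first stop codon, and a trailing partial codon is ignored.
    DNA-style input is accepted by treating T as U. Codons containing
    unrecognized letters are reported as 'X'.
    """
    rna = "".join(rna.split()).upper().replace("T", "U")
    protein = []
    for start in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE.get(rna[start:start + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate(input("RNA sequence: ")))
```

Saved to a file and run from the command line, the script prompts for the RNA sequence and prints the protein sequence, which matches steps 3 through 5 above.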

Next, you write a program that takes the protein sequence and uses information from both the NCBI and PDB databases to give you the identity of the murderer and the structure of the protein. We'll write this program, too!

    1. Open Emacs.
    2. Write the following program: (will be provided)
    3. Paste the protein sequence.
    4. Run the program.
    5. Marvel at how much faster the process becomes.
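The program itself is still to be provided. As a rough sketch of the two lookups involved, the code below only builds the request URLs and sends nothing over the network. It assumes the NCBI BLAST URL API at `blast.ncbi.nlm.nih.gov/Blast.cgi` (a `CMD=Put` submission with `PROGRAM`, `DATABASE`, and `QUERY` parameters) and the RCSB file download address `files.rcsb.org/download/<ID>.pdb`. The function names are hypothetical, and the choice of the `nr` database is an assumption.

```python
from urllib.parse import urlencode

# Assumed service endpoints; check the current NCBI and RCSB documentation
# before relying on them.
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"
PDB_DOWNLOAD_URL = "https://files.rcsb.org/download/{pdb_id}.pdb"


def blast_submit_url(protein, database="nr"):
    """Build the URL that would submit a protein BLAST (blastp) search.

    Submitting returns a request ID; fetching the finished results is a
    separate request that is not sketched here.
    """
    params = {
        "CMD": "Put",
        "PROGRAM": "blastp",
        "DATABASE": database,  # assumption: 'nr' suits the case study
        "QUERY": protein,
    }
    return BLAST_URL + "?" + urlencode(params)


def pdb_download_url(pdb_id):
    """Build the download URL for a PDB entry given its four-character ID."""
    return PDB_DOWNLOAD_URL.format(pdb_id=pdb_id.upper())
```

A complete program would submit the search, wait for the results, read the species from the best hit, and then fetch the matching structure. Those steps depend on the services' response formats and are left out of this sketch.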

Now Dr. Doolittle uses these programs to find the culprits in the other five cases. (This is to illustrate how taking the time to write a program saves time in the long run.)

Case 1: RNA = blah
Case 2: RNA = blah
Case 3: RNA = blah
Case 4: RNA = blah
Case 5: RNA = blah
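To illustrate the time saved, the cases could be processed in one batch rather than one at a time. The sketch below assumes the cases are stored as text lines in the "Case N: RNA = sequence" form used above, and that `analyze` is whatever function does the per-sequence work (for example, a translation routine). The names `parse_cases` and `run_all` are hypothetical.

```python
import re

# Matches lines such as "Case 3: RNA = AUGGUU".
CASE_LINE = re.compile(r"^Case\s+(\d+):\s*RNA\s*=\s*(\S+)$")


def parse_cases(text):
    """Return {case number: RNA sequence} for every matching line in text."""
    cases = {}
    for line in text.splitlines():
        match = CASE_LINE.match(line.strip())
        if match:
            cases[int(match.group(1))] = match.group(2)
    return cases


def run_all(cases, analyze):
    """Apply `analyze` to each case's RNA sequence, in case-number order."""
    return {number: analyze(rna) for number, rna in sorted(cases.items())}
```

With the five sequences saved in a file, `run_all(parse_cases(open("cases.txt").read()), analyze)` would handle every case in a single call. The file name `cases.txt` is only an example.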

**NOTE** from "Developing Bioinformatics Computer Skills":
With a little clever programming, you can develop scripts that allow you to hit a web server with multiple requests without entering them manually into a form, but if you're capable of doing that, you're probably able to download a local copy of the software and run it on your own machine. Using your own processor in such cases avoids slow data transfer to and from remote sites and is also considered more polite than running huge jobs on someone else's server.
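As a sketch of the local alternative the note describes, the code below builds and runs a command for the `blastp` program from the NCBI BLAST+ suite. It assumes BLAST+ is installed and that a protein database has already been downloaded or built on the local machine. The function names and all file and database names are hypothetical.

```python
import subprocess


def local_blastp_command(query_fasta, database, output_path):
    """Build the argument list for a local blastp search.

    -outfmt 6 requests tab-separated output, which is easy to parse.
    """
    return [
        "blastp",
        "-query", query_fasta,  # FASTA file holding the protein sequence
        "-db", database,        # name of a local BLAST protein database
        "-outfmt", "6",
        "-out", output_path,
    ]


def run_local_blastp(query_fasta, database, output_path):
    """Run the search locally; raises CalledProcessError if blastp fails."""
    command = local_blastp_command(query_fasta, database, output_path)
    subprocess.run(command, check=True)
```

Running searches this way keeps repeated jobs off the remote server, at the cost of storing and updating the database locally.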