There are maybe three pounds of microorganisms in the average human being, and their behavior affects their human hosts (and vice versa). There is now a program to sequence the genomes of these organisms, to create the larger genome of the human together with its associated microorganisms. That combined genome will be a couple of orders of magnitude larger than the genome of the individual human. One assumes that it will also be more diverse.
There are at least 1000 diseases now classified in the International Classification of Diseases. We can think of diseases as the result of the function of the genes, or of the interaction of organisms with different genomes. Thus an infectious disease is the result of the response of the human, and the human's associated organisms, to the infecting agent, each determined by its genome and its history.
And of course we are interested not only in the interrelationship of genes and disease, but also in development and in all of the traits of interest to people.
The Celera human genome project established a benchmark in the use of computer power.
When the genome project was established at Celera in 1998, the company purchased and connected 700 CPUs and 70 terabytes of hard drive space. This computing system was set up to run the initial test of their algorithm code, which was used to sequence the genome of the Drosophila fruit fly, with 13-fold coverage of the genome, successfully in 1999. The most surprising thing about this effort was that the team coded the algorithm and sequenced the 120 megabase pair genome of the fruit fly to that degree of completeness in just 11 months. Myers (Gene Myers, a professor of Computer Science at Berkeley) then modified the process so that Whole Genome Shotgun Sequencing would produce 5-fold coverage of the human genome, as he believed that would be adequate to provide a complete sequence. In addition, Venter purchased 4 supercomputers, referred to as the GeneMatcher, from a company called Paracel Inc. Paracel, a company that typically produced computers for government agencies such as the NSA, created this machine specifically for matching character strings, such as putting together sequences of DNA like a puzzle. It was composed of 7000 processors arranged to perform over 1000 times faster than any Pentium computer. With this new technology, Celera began sequencing the human genome on September 8, 1999, and completed the first assembly of the whole human genome on June 17, 2000, only 9 months after the project began.
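To give a rough feel for what "13-fold" and "5-fold" coverage mean, here is a small back-of-the-envelope sketch in Python. It is not Celera's method. It uses the standard idealized Poisson model of shotgun sequencing, in which reads land uniformly at random and the chance that a given base is missed by every read is about e^(-coverage). The 120 megabase fruit fly genome comes from the account above; the 500-base read length and the function names are assumptions made only for illustration.

```python
import math


def expected_fraction_covered(coverage: float) -> float:
    """Idealized Poisson estimate of the fraction of bases hit by at least one read."""
    return 1.0 - math.exp(-coverage)


def reads_needed(genome_bp: int, coverage: float, read_len_bp: int) -> int:
    """Number of reads whose total length equals coverage times the genome length."""
    return math.ceil(genome_bp * coverage / read_len_bp)


FLY_GENOME_BP = 120_000_000  # fruit fly genome size, as stated in the text
READ_LEN_BP = 500            # assumed read length, for illustration only

if __name__ == "__main__":
    for fold in (5, 13):
        covered = expected_fraction_covered(fold)
        reads = reads_needed(FLY_GENOME_BP, fold, READ_LEN_BP)
        print(f"{fold}-fold: about {covered:.4%} of bases covered, {reads:,} reads")
```

Under these simplified assumptions, 5-fold coverage already leaves well under one percent of bases unread, which suggests why a lower coverage could be thought adequate. Real genomes have repeats and uneven sampling, so the actual picture is harder than this model implies.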
Understanding the relationship of the genome to disease will involve statistical analysis of the health histories and individual genomes of tens or hundreds of thousands of people. It is expected that few if any conditions will be explained by a single gene. Even eye color is a complex phenomenon under the control of several different genes. So think about the computer power that will be used in the coming generation to clarify the genetic basis of disease and behavior.