GeNeViSTA
1990 | HGP started; ELSI program founded |
1992 | Second generation Human Genetic Map developed |
1994 | HGP’s human genetic mapping goal achieved |
1995 | HGP’s human physical mapping goal achieved; First bacterial genome (H. influenzae) sequenced |
1996 | First human gene map established; Pilot project for Human Genome sequence began in the US; Yeast genome sequenced; Bermuda principles established |
1997 | E. coli genome sequenced |
1998 | Roundworm (C. elegans) genome sequenced; Celera Genomics entered the HGP race (Private HGP) |
1999 | Full scale Human Genome sequence began; Sequence of chromosome 22 completed |
2000 | Draft version of both Public and Private HGP completed; Fruit fly genome sequenced; Executive order issued barring genetic discrimination in US federal workplace |
2001 | Draft version of HGP published |
2002 | Draft version of mouse genome sequenced, completed and published; Draft version of rice genome sequenced, completed and published |
2003 | HGP completed with all its goals achieved |
Firstly, the HGP welcomed collaborators from any nation, in an effort to move beyond borders and establish an all-inclusive effort aimed at understanding the molecular basis of human life using several different approaches. The group of publicly funded researchers that eventually assembled (18 countries and more than 200 laboratories) was known as the International Human Genome Sequencing Consortium (IHGSC). It is unfortunate that India was not among the 18 countries. Secondly, the HGP worked on the Bermuda principles drafted in 1996. These required that all human genome sequence assemblies greater than 2 kb be made freely and publicly available within 24 hours of assembly. This was usually done by uploading the sequences on the very night they were generated. It gave researchers all around the world access to HGP data and greatly accelerated ongoing research. A number of terms and definitions were introduced because of the HGP, some of which are given in Table II.
BAC | Bacterial artificial chromosome vector carrying a genomic DNA insert, typically 100-200 kb |
Contig | A contiguous sequence of DNA created by assembling shorter, overlapping sequenced fragments of a chromosome (whether natural or artificial, as in BACs). |
Scaffold | The result of connecting contigs by linking information from paired-end reads from plasmids, paired-end reads from BACs, known messenger RNAs or other sources. The contigs in a scaffold are ordered and oriented with respect to one another. |
Sequence tagged site | A sequence tagged site (STS) is a short DNA segment that occurs only once in a genome and whose exact location and order of bases are known. |
Genetic map | A genome map in which polymorphic loci are positioned relative to one another on the basis of the frequency with which they recombine during meiosis. The unit of distance is centimorgans (cM), denoting a 1% chance of recombination. |
Physical map | A map showing the locations of identifiable markers spaced along the chromosomes. A physical map may be constructed from a set of overlapping clones. |
Functional genomics | The study of genomes to determine the biological function of all the genes and their products. |
Draft genome sequence | The sequence produced by combining the information from the individual sequenced clones (by creating merged sequence contigs and then employing linking information to create scaffolds) and positioning the sequence along the physical map of the chromosomes. |
Methylation | Addition of methyl groups to DNA to suppress gene transcription. |
SNP | Single Nucleotide Polymorphism (SNP): A common single-base-pair variation in a DNA sequence. |
Haplotype | A specific combination of alleles or sequence variations that are likely to be inherited together. |
Genomic library | Contains DNA fragments that represent the entire genome of the organism (coding and non coding). |
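The contig definition in the table above can be illustrated with a toy example. The following is a minimal sketch, assuming error-free reads taken from a single strand, that greedily merges the pair of fragments with the longest suffix-prefix overlap. The function names and the `min_len` threshold are hypothetical choices for illustration; this is not the assembly software used by the HGP or Celera, which also had to handle sequencing errors, repeats and both DNA strands.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two fragments with the longest overlap.

    Toy assumptions: reads are error-free, come from one strand, and no
    read is wholly contained inside another.
    """
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        # Join the pair, writing the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)


reads = ["ATGCGT", "CGTACC", "ACCGGA", "TTTTAA"]
contigs = greedy_assemble(reads)
```

Here the first three reads overlap by three bases each and collapse into one contig, while the fourth read shares no overlap and remains a separate contig. Ordering and orienting such contigs relative to one another, using paired-end or other linking information, is what turns them into a scaffold.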
The ultimate goal of the Human Genome Project was to decode the exact sequence of all 3.2 billion nucleotide bases that make up the human genome and to identify all of the estimated genes in human DNA [Collins et al., 1998].
One of the goals of the HGP was to decipher the genomes of model organisms such as mice, fruit flies and roundworms. Such small organisms are easier to manipulate, so experiments on them, especially breeding experiments, provide vital information about developmental and functional genetics that can be applied to human health and disease. Studying different genomes would also give insight into the evolutionary conservation of genes and the development of unique genes, and could lead to an understanding that helps in combating human diseases.
Celera Genomics (‘Celera’ is derived from the Latin for swiftness) was established in 1998 by the Perkin Elmer Corporation and Craig Venter. Venter had earlier headed TIGR, a non-profit genomics research institution, where he had sequenced the genome of H. influenzae using the whole-genome shotgun sequencing method [Weber & Myers, 1997]. Celera announced that it would finish sequencing the human genome in three years, which heralded a race between the government-funded HGP and Celera. Celera began its project later than the HGP but used a faster method, powered by the world’s largest private assemblage of supercomputers. Celera planned to file preliminary patents on over 6,000 genes and full patents on a few hundred genes before releasing its sequence. Celera sequenced the human genome at a fraction of the cost of the public project: about $300 million of private funding versus approximately $3 billion of taxpayer dollars. However, a significant portion of the human genome had already been sequenced when Celera entered the field, and Celera incurred no cost in obtaining these data, which were freely available to the public from GenBank. In 2000, with the intervention of the White House, Celera and the public HGP officially “tied” and jointly announced the completion of the initial HGP draft (Fig 1).
Several key projects helped to crystallize the HGP. These included:
Two main phases of the public HGP were:
Celera used two independent data sets together with two distinct computational approaches to determine the sequence of the human genome. The first data set was generated by Celera using DNA from five anonymous individuals, out of which one DNA sample was selected blindly. Plasmid clone libraries with inserts of 2, 10 and 50 kb were prepared, and the clones were sequenced and tracked from both ends. This generated 27.27 million DNA sequence reads (5.11-fold coverage of the genome). The second data set was obtained from the BAC clones of the publicly funded Human Genome Project; here, Celera “shredded” the Human Genome Project DNA sequence into 550-base-pair sequence reads, giving a total of 16.05 million sequence reads. The company then used a whole-genome assembly method and a regional chromosome assembly method to assemble the sequence of the human genome (Figure 3).
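The read counts and fold coverage quoted above are linked by simple arithmetic: coverage equals the number of reads multiplied by the mean read length, divided by the genome size. The sketch below works through this relationship. The genome size of 2.9 billion bases is an assumption made for illustration and is not stated in this article (which elsewhere quotes 3.2 billion bases for the whole genome); the function names are likewise hypothetical.

```python
def fold_coverage(n_reads: float, mean_read_len_bp: float, genome_size_bp: float) -> float:
    """Average number of times each base of the genome is sequenced."""
    return n_reads * mean_read_len_bp / genome_size_bp


def implied_read_length(n_reads: float, coverage: float, genome_size_bp: float) -> float:
    """Mean read length (bp) implied by a read count and a fold coverage."""
    return coverage * genome_size_bp / n_reads


# ASSUMPTION: genome size used for the coverage calculation (not from the article).
GENOME_SIZE_BP = 2.9e9

# Celera's own data set: 27.27 million reads at 5.11-fold coverage.
celera_read_len = implied_read_length(27.27e6, 5.11, GENOME_SIZE_BP)

# Shredded public data: 16.05 million reads of 550 bp each.
shredded_coverage = fold_coverage(16.05e6, 550, GENOME_SIZE_BP)

print(f"Implied mean Celera read length: {celera_read_len:.0f} bp")
print(f"Coverage from shredded HGP data: {shredded_coverage:.2f}-fold")
```

Under this assumed genome size, the Celera reads work out to a mean length of about 543 bp, and the shredded public data add roughly 3-fold coverage. A different assumed genome size changes both figures proportionally.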
The completed HGP drafts were published separately, a day apart, by the Public HGP (Nature, 15 February 2001) and the Private Human Genome Project (Science, 16 February 2001) [IHGSC, 2001; Venter et al., 2001]. However, the formal announcement of the completion of the HGP draft had already been made jointly by the two organizations on 26 June 2000.
The $3.8 billion spent on the HGP may well represent the best single investment ever made in science. From 1988 to 2010, HGP added $796 billion in U.S. economic output, $244 billion in personal income for Americans and generated 3.8 million job-years of employment [Lander, 2011].
Along with the human genome, the genome sequences of 599 viruses and viroids, 205 naturally occurring plasmids, 185 organelles, 31 eubacteria, seven archaea, one fungus (yeast), two animals (roundworm and mouse) and one plant (mustard weed) were determined [IHGSC, 2004].
Sequencing of 99% of euchromatic DNA was finished to 99.99% accuracy. The number of genes in the human genome was estimated to be around 22,000, with protein-coding sequence making up roughly 1.5% of the genome, and 3.7 million SNPs were mapped (Table III).
Other major feats were:
Genome Sequencing | HGP begins-1990 | HGP ends-2003 | 12 years after HGP |
Cost to Generate a Human Genome Sequence | ~$1 billion | $10-50 million | $1000 |
Human Genome Sequences | 0 | 1 | Thousands |
Number of Genes with Known Phenotype/ Disease-Causing Mutation | ~53 | ~1474 | ~9578 |
Number Phenotypes/Disorders with Known Molecular Basis | ~61 | ~2264 | More than 6000 |
Number of Published Genome-Wide Association Studies (GWAS) | 0 | 0 | ~1600 |
Since the completion of the HGP there has been a revolution in the field of genetics, and we have now entered the “omics” era. Scientists the world over are studying different microbial, plant, animal, human and cancer genomes, and new genomes of different organisms are being sequenced. Among the most important contemporary human genome-related projects are the International HapMap Project, the 1000 Genomes Project, the ENCODE project, the Human Epigenome Project, the 100,000 Genomes Project (UK) and the 1,000,000 Genomes Project (US).
The completion of the HGP transformed genetics into genomics. High-throughput sequencing techniques and microarray technology have made analysis of the genome simpler and cheaper. This has transformed not only research but patient care as well. Identification of causative genes for monogenic disorders, genome-wide association studies for multifactorial disorders, and genomic and expression analysis of tumors are important applications of the results of the HGP. While this has opened an exciting new era for clinicians and geneticists, the challenges are also apparent. They include the need for functional validation of each new sequence variant identified, and for a better understanding of modifier genes in order to predict pathogenicity and to improve understanding of genotype-phenotype correlation. We now know that the more we learn about the human genome, the more there is to explore. These words, quoted in the HGP draft article, befit our concluding thoughts.
“We shall not cease from exploration. And the end of all our exploring will be to arrive where we started, and know the place for the first time.” – T. S. Eliot.
1. Collins FS, et al. New goals for the U. S. Human Genome Project: 1998-2003. Science 1998; 282: 682-689.
2. International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature 2001; 409: 860-921.
3. International Human Genome Sequencing Consortium. Finishing the euchromatic sequence of the human genome. Nature 2004; 431: 931-945.
4. Lander ES. Initial impact of the sequencing of the human genome. Nature 2011; 470.
5. Venter JC, et al. The sequence of the human genome. Science 2001; 291: 1304-1351.
6. Weber JL, Myers EW. Human whole-genome shotgun sequencing. Genome Res 1997; 7: 401-409.