A gene is a region of DNA that encodes a function. A chromosome consists of a long strand of DNA containing many genes. A human chromosome can contain up to about 250 million base pairs of DNA and thousands of genes.
In biology, a gene is a sequence of DNA or RNA that codes for a molecule that has a function. During gene expression, the DNA is first copied into RNA. The RNA can be directly functional or be the intermediate template for a protein that performs a function. The transmission of genes to an organism’s offspring is the basis of the inheritance of phenotypic traits. The particular DNA sequences that an organism carries for its genes make up its genotype. The genotype, together with environmental and developmental factors, determines the phenotype. Most biological traits are under the influence of polygenes (many different genes) as well as gene–environment interactions. Some genetic traits are readily visible, such as eye color or number of limbs, and some are not, such as blood type, risk for specific diseases, or the thousands of basic biochemical processes that constitute life.
Genes can acquire mutations in their sequence, leading to different variants, known as alleles, in the population. These alleles encode slightly different versions of a protein, which cause different phenotypic traits. The colloquial phrase “having a gene” (e.g., “good genes,” “hair colour gene”) typically refers to carrying a different allele of the same, shared gene. Genes evolve through natural selection (survival of the fittest) acting on their alleles.
The concept of a gene continues to be refined as new phenomena are discovered. For example, regulatory regions of a gene can be far removed from its coding regions, and coding regions can be split into several exons. Some viruses store their genome in RNA instead of DNA, and some gene products are functional non-coding RNAs. Therefore, a broad, modern working definition of a gene is any discrete locus of heritable, genomic sequence that affects an organism’s traits by being expressed as a functional product or by regulating gene expression.
The term gene was introduced by the Danish botanist, plant physiologist and geneticist Wilhelm Johannsen in 1909. It is derived from the Ancient Greek γόνος (gonos), meaning offspring and procreation.
The expression of genes encoded in DNA begins by transcribing the gene into RNA, a second type of nucleic acid that is very similar to DNA, but whose monomers contain the sugar ribose rather than deoxyribose. RNA also contains the base uracil in place of thymine. RNA molecules are less stable than DNA and are typically single-stranded. Genes that encode proteins are composed of a series of three-nucleotide sequences called codons, which serve as the “words” in the genetic “language”. The genetic code specifies the correspondence during protein translation between codons and amino acids. The genetic code is nearly the same for all known organisms.:4.1
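The relationship between the DNA and RNA alphabets, and the idea of codons as three-nucleotide "words", can be illustrated with a minimal Python sketch. The function names and the example sequence are hypothetical and chosen only for illustration; the input is assumed to be a coding-strand sequence written 5′ to 3′ that is already in the correct reading frame.

```python
def dna_to_rna(coding_strand):
    """Rewrite a DNA coding-strand sequence in the RNA alphabet (T becomes U)."""
    return coding_strand.upper().replace("T", "U")


def split_codons(rna):
    """Split an RNA sequence into consecutive three-nucleotide codons.

    Trailing bases that do not fill a complete codon are dropped.
    """
    usable = len(rna) - len(rna) % 3
    return [rna[i:i + 3] for i in range(0, usable, 3)]


# Hypothetical 12-base example sequence.
rna = dna_to_rna("ATGGCCTTTTAA")
codons = split_codons(rna)
print(rna)
print(codons)
```

The sketch treats sequences as plain strings; it says nothing about strand direction or chemistry beyond the uracil-for-thymine substitution described above.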
The total complement of genes in an organism or cell is known as its genome, which may be stored on one or more chromosomes. A chromosome consists of a single, very long DNA helix on which thousands of genes are encoded.:4.2 The region of the chromosome at which a particular gene is located is called its locus. Each locus contains one allele of a gene; however, members of a population may have different alleles at the locus, each with a slightly different gene sequence.
The majority of eukaryotic genes are stored on a set of large, linear chromosomes. The chromosomes are packed within the nucleus in complex with storage proteins called histones to form a unit called a nucleosome. DNA packaged and condensed in this way is called chromatin.:4.2 The manner in which DNA is stored on the histones, as well as chemical modifications of the histone itself, regulate whether a particular region of DNA is accessible for gene expression. In addition to genes, eukaryotic chromosomes contain sequences involved in ensuring that the DNA is copied without degradation of end regions and sorted into daughter cells during cell division: replication origins, telomeres and the centromere.:4.2 Replication origins are the sequence regions where DNA replication is initiated to make two copies of the chromosome. Telomeres are long stretches of repetitive sequence that cap the ends of the linear chromosomes and prevent degradation of coding and regulatory regions during DNA replication. The length of the telomeres decreases each time the genome is replicated and has been implicated in the aging process. The centromere is required for binding spindle fibres to separate sister chromatids into daughter cells during cell division.:18.2
Prokaryotes (bacteria and archaea) typically store their genomes on a single large, circular chromosome. Similarly, some eukaryotic organelles contain a remnant circular chromosome with a small number of genes.:14.4
In all organisms, two steps are required to read the information encoded in a gene’s DNA and produce the protein it specifies. First, the gene’s DNA is transcribed to messenger RNA (mRNA).:6.1 Second, that mRNA is translated to protein.:6.2 RNA-coding genes must still go through the first step, but are not translated into protein. The process of producing a biologically functional molecule of either RNA or protein is called gene expression, and the resulting molecule is called a gene product.
The nucleotide sequence of a gene’s DNA specifies the amino acid sequence of a protein through the genetic code. Sets of three nucleotides, known as codons, each correspond to a specific amino acid.:6 The principle that three sequential bases of DNA code for each amino acid was demonstrated in 1961 using frameshift mutations in the rIIB gene of bacteriophage T4 (see Crick, Brenner et al. experiment).
Additionally, a “start codon” and three “stop codons” indicate the beginning and end of the protein-coding region. There are 64 possible codons (four possible nucleotides at each of three positions, hence 4³ = 64 possible codons) and only 20 standard amino acids; hence the code is redundant and multiple codons can specify the same amino acid. The correspondence between codons and amino acids is nearly universal among all known living organisms.
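The codon arithmetic above can be checked directly. The following Python sketch builds the standard genetic code from its usual compact tabulation (bases in the order T, C, A, G, amino acids as one-letter symbols, `*` for a stop codon) and counts how many codons map to each amino acid. The variable names are illustrative, and the table is the standard code only; the variant codes used by some organisms and organelles are not covered.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TTT, TTC, TTA, TTG, TCT, ... order.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Map each of the 4 * 4 * 4 codons to its amino acid (or '*' for stop).
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons that specify each amino acid (the redundancy of the code).
degeneracy = Counter(CODON_TABLE.values())
amino_acids = set(CODON_TABLE.values()) - {"*"}

print(len(CODON_TABLE), "codons")
print(len(amino_acids), "amino acids,", degeneracy["*"], "stop codons")
for aa, count in sorted(degeneracy.items(), key=lambda item: -item[1]):
    print(aa, count)
```

In this table, leucine, serine and arginine are each specified by six codons, while methionine and tryptophan have one codon each, which is the redundancy the paragraph describes.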
Transcription produces a single-stranded RNA molecule known as messenger RNA, whose nucleotide sequence is complementary to the DNA from which it was transcribed.:6.1 The mRNA acts as an intermediate between the DNA gene and its final protein product. The gene’s DNA is used as a template to generate a complementary mRNA. Because the mRNA is synthesised as the complement of the template strand, its sequence matches that of the gene’s coding strand (with uracil in place of thymine). Transcription is performed by an enzyme called an RNA polymerase, which reads the template strand in the 3′ to 5′ direction and synthesizes the RNA from 5′ to 3′. To initiate transcription, the polymerase first recognizes and binds a promoter region of the gene. Thus, a major mechanism of gene regulation is the blocking or sequestering of the promoter region, either through tight binding by repressor molecules that physically block the polymerase, or by organizing the DNA so that the promoter region is not accessible.:7
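The strand relationships in transcription can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the function names and the example sequence are hypothetical, the coding strand is written 5′ to 3′, and the template strand is written 3′ to 5′ so that the two strands align base by base. Promoters, polymerase binding and termination are not modelled.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def template_from_coding(coding_5to3):
    """Return the template strand, written 3'->5', that pairs with the coding strand."""
    return "".join(DNA_COMPLEMENT[base] for base in coding_5to3)


def transcribe(template_3to5):
    """Read the template 3'->5' and build the complementary mRNA 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in template_3to5)


coding = "ATGGCCTTTTAA"                  # hypothetical coding strand, 5'->3'
template = template_from_coding(coding)  # template strand, 3'->5'
mrna = transcribe(template)              # mRNA, 5'->3'

print(template)
print(mrna)
# The mRNA matches the coding strand, with U in place of T.
print(mrna == coding.replace("T", "U"))
```

Writing the template in the 3′ to 5′ direction avoids an explicit reversal step; if both strands were stored 5′ to 3′, the template would need to be reversed before pairing.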
In prokaryotes, transcription occurs in the cytoplasm; for very long transcripts, translation may begin at the 5′ end of the RNA while the 3′ end is still being transcribed. In eukaryotes, transcription occurs in the nucleus, where the cell’s DNA is stored. The RNA molecule produced by the polymerase is known as the primary transcript and undergoes post-transcriptional modifications before being exported to the cytoplasm for translation. One of these modifications is the splicing out of introns, which are sequences in the transcribed region that do not encode protein. Alternative splicing mechanisms can result in mature transcripts from the same gene having different sequences and thus coding for different proteins. This is a major form of regulation in eukaryotic cells and also occurs in some prokaryotes.:7.5
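Splicing can be pictured as keeping the exon segments of the primary transcript and discarding the introns between them. The Python sketch below uses a made-up transcript of three exons and two introns; the sequences, the `splice` function and the half-open coordinate convention are all assumptions for illustration. Exon skipping is used here as one example of alternative splicing; other forms exist and are not modelled.

```python
def splice(primary_transcript, exons):
    """Join the exon segments of a transcript.

    exons is a list of (start, end) pairs, 0-based, end exclusive.
    """
    return "".join(primary_transcript[start:end] for start, end in exons)


# Hypothetical primary transcript laid out as alternating exons and introns.
segments = [
    ("exon", "AUGGCC"),
    ("intron", "GUAAGUUUCAG"),
    ("exon", "UUUGGA"),
    ("intron", "GUACGUCCAG"),
    ("exon", "CAUUAA"),
]
pre_mrna = "".join(seq for _, seq in segments)

# Derive exon coordinates from the segment lengths.
exons = []
position = 0
for kind, seq in segments:
    if kind == "exon":
        exons.append((position, position + len(seq)))
    position += len(seq)

full_length = splice(pre_mrna, exons)                 # all three exons
exon_skipped = splice(pre_mrna, [exons[0], exons[2]])  # middle exon skipped

print(full_length)
print(exon_skipped)
```

The two outputs differ in sequence, which is how a single gene can give rise to mature transcripts that code for different proteins.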
Translation is the process by which a mature mRNA molecule is used as a template for synthesizing a new protein.:6.2 Translation is carried out by ribosomes, large complexes of RNA and protein responsible for carrying out the chemical reactions to add new amino acids to a growing polypeptide chain by the formation of peptide bonds. The genetic code is read three nucleotides at a time, in units called codons, via interactions with specialized RNA molecules called transfer RNA (tRNA). Each tRNA has three unpaired bases known as the anticodon that are complementary to the codon it reads on the mRNA. The tRNA is also covalently attached to the amino acid specified by the complementary codon. When the tRNA binds to its complementary codon in an mRNA strand, the ribosome attaches its amino acid cargo to the new polypeptide chain, which is synthesized from amino terminus to carboxyl terminus. During and after synthesis, most new proteins must fold to their active three-dimensional structure before they can carry out their cellular functions.:3
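The reading of codons during translation can be sketched in Python as follows. This is a simplified illustration, not a model of the ribosome: the function names and the example mRNA are hypothetical, the table is the standard genetic code in the RNA alphabet, translation is assumed to start at the first AUG found, and anticodon pairing is treated as strict base complementarity (wobble pairing is ignored).

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UUU, UUC, UUA, UUG, UCU, ... order.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
START_CODON = "AUG"
STOP = "*"
PAIRING = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon(codon):
    """Anticodon pairing with a codon, written 3'->5' so it aligns base by base."""
    return "".join(PAIRING[base] for base in codon)


def translate(mrna):
    """Translate from the first start codon to the first in-frame stop codon.

    Returns one-letter amino acid symbols, amino terminus first.
    """
    start = mrna.find(START_CODON)
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == STOP:
            break
        protein.append(amino_acid)
    return "".join(protein)


peptide = translate("GGAUGGCCUUUUAAGG")  # hypothetical mRNA
print(peptide)
print(anticodon("AUG"))
```

The left-to-right order of the returned string corresponds to synthesis from the amino terminus to the carboxyl terminus; protein folding is outside the scope of the sketch.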
Genes are regulated so that they are expressed only when the product is needed, since expression draws on limited resources.:7 A cell regulates its gene expression depending on its external environment (e.g. available nutrients, temperature and other stresses), its internal environment (e.g. cell division cycle, metabolism, infection status), and, in a multicellular organism, its specific role. Gene expression can be regulated at any step: from transcriptional initiation, to RNA processing, to post-translational modification of the protein. The regulation of lactose metabolism genes in E. coli (the lac operon) was the first such mechanism to be described, in 1961.
A typical protein-coding gene is first copied into RNA as an intermediate in the manufacture of the final protein product.:6.1 In other cases, the RNA molecules are the actual functional products, as in the synthesis of ribosomal RNA and transfer RNA. Some RNAs known as ribozymes are capable of enzymatic function, and microRNA has a regulatory role. The DNA sequences from which such RNAs are transcribed are known as non-coding RNA genes.
Some viruses store their entire genomes in the form of RNA, and contain no DNA at all. Because they use RNA to store genes, their cellular hosts may synthesize their proteins as soon as they are infected, without the delay of waiting for transcription. On the other hand, RNA retroviruses, such as HIV, require the reverse transcription of their genome from RNA into DNA before their proteins can be synthesized. RNA-mediated epigenetic inheritance has also been observed in plants and, very rarely, in animals.
1. Gericke, Niklas Markus; Hagberg, Mariana (5 December 2006). “Definition of historical models of gene function and their relation to students’ understanding of genetics”. Science & Education. 16 (7–8): 849–881. Bibcode:2007Sc&Ed..16..849G. doi:10.1007/s11191-006-9064-4.
4. Johannsen, W. (1905). Arvelighedslærens elementer (“The Elements of Heredity”. Copenhagen). Rewritten, enlarged and translated into German as Elemente der exakten Erblichkeitslehre (Jena: Gustav Fischer, 1909).
5. Noble D (September 2008). “Genes and causation” (Free full text). Philosophical Transactions of the Royal Society of London. Series A, Mathematical and Physical Sciences. 366 (1878): 3001–3015. Bibcode:2008RSPTA.366.3001N. doi:10.1098/rsta.2008.0086. PMID 18559318.
9. Vries, H. de, Intracellulare Pangenese, Verlag von Gustav Fischer, Jena, 1889. Translated in 1908 from German to English by C. Stuart Gager as Intracellular Pangenesis, Open Court Publishing Co., Chicago, 1910.
10. Gerstein MB, Bruce C, Rozowsky JS, Zheng D, Du J, Korbel JO, Emanuelsson O, Zhang ZD, Weissman S, Snyder M (June 2007). “What is a gene, post-ENCODE? History and updated definition”. Genome Research. 17 (6): 669–681. doi:10.1101/gr.6339607. PMID 17567988.
12. Avery, OT; MacLeod, CM; McCarty, M (1944). “Studies on the Chemical Nature of the Substance Inducing Transformation of Pneumococcal Types: Induction of Transformation by a Desoxyribonucleic Acid Fraction Isolated from Pneumococcus Type III”. The Journal of Experimental Medicine. 79 (2): 137–58. doi:10.1084/jem.79.2.137. PMC 2135445. PMID 19871359. Reprint: Avery, OT; MacLeod, CM; McCarty, M (1979). “Studies on the chemical nature of the substance inducing transformation of pneumococcal types. Inductions of transformation by a desoxyribonucleic acid fraction isolated from pneumococcus type III”. The Journal of Experimental Medicine. 149 (2): 297–326. doi:10.1084/jem.149.2.297. PMC 2184805. PMID 33226.
13. Hershey, AD; Chase, M (1952). “Independent functions of viral protein and nucleic acid in growth of bacteriophage”. The Journal of General Physiology. 36 (1): 39–56. doi:10.1085/jgp.36.1.39. PMC 2147348. PMID 12981234.
15. Watson, J. D.; Crick, FH (1953). “Molecular Structure of Nucleic Acids: A Structure for Deoxyribose Nucleic Acid” (PDF). Nature. 171 (4356): 737–8. Bibcode:1953Natur.171..737W. doi:10.1038/171737a0. PMID 13054692.
16. Benzer S (1955). “Fine Structure of a Genetic Region in Bacteriophage”. Proc. Natl. Acad. Sci. U.S.A. 41 (6): 344–54. Bibcode:1955PNAS...41..344B. doi:10.1073/pnas.41.6.344. PMC 528093. PMID 16589677.
17. Benzer S (1959). “On the Topology of the Genetic Fine Structure”. Proc. Natl. Acad. Sci. U.S.A. 45 (11): 1607–20. Bibcode:1959PNAS...45.1607B. doi:10.1073/pnas.45.11.1607. PMC 222769. PMID 16590553.
18. Min Jou W, Haegeman G, Ysebaert M, Fiers W (May 1972). “Nucleotide sequence of the gene coding for the bacteriophage MS2 coat protein”. Nature. 237 (5350): 82–8. Bibcode:1972Natur.237...82J. doi:10.1038/237082a0. PMID 4555447.
19. Sanger, F; Nicklen, S; Coulson, AR (1977). “DNA sequencing with chain-terminating inhibitors”. Proceedings of the National Academy of Sciences of the United States of America. 74 (12): 5463–7. Bibcode:1977PNAS...74.5463S. doi:10.1073/pnas.74.12.5463. PMC 431765. PMID 271968.
20. Adams, Jill U. (2008). “DNA Sequencing Technologies”. Nature Education Knowledge. SciTable. Nature Publishing Group. 1 (1): 193.
25. Alberts B, Johnson A, Lewis J, Raff M, Roberts K, Walter P (2002). Molecular Biology of the Cell (Fourth ed.). New York: Garland Science. ISBN 978-0-8153-3218-3.
27. Bolzer, Andreas; Kreth, Gregor; Solovei, Irina; Koehler, Daniela; Saracoglu, Kaan; Fauth, Christine; Müller, Stefan; Eils, Roland; Cremer, Christoph; Speicher, Michael R.; Cremer, Thomas (2005). “Three-Dimensional Maps of All Chromosomes in Human Male Fibroblast Nuclei and Prometaphase Rosettes”. PLoS Biology. 3 (5): e157. doi:10.1371/journal.pbio.0030157. PMC 1084335. PMID 15839726.
28. Braig M, Schmitt CA (March 2006). “Oncogene-induced senescence: putting the brakes on tumor development”. Cancer Research. 66 (6): 2881–4. doi:10.1158/0008-5472.CAN-05-4006. PMID 16540631.
29. Bennett, PM (March 2008). “Plasmid encoded antibiotic resistance: acquisition and transfer of antibiotic resistance genes in bacteria”. British Journal of Pharmacology. 153 Suppl 1: S347–57. doi:10.1038/sj.bjp.0707607. PMC 2268074. PMID 18193080.
30. International Human Genome Sequencing Consortium (October 2004). “Finishing the euchromatic sequence of the human genome”. Nature. 431 (7011): 931–45. Bibcode:2004Natur.431..931H. doi:10.1038/nature03001. PMID 15496913.
32. Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (July 2008). “Mapping and quantifying mammalian transcriptomes by RNA-Seq”. Nature Methods. 5 (7): 621–8. doi:10.1038/nmeth.1226. PMID 18516045.
33. Pennacchio, L. A.; Bickmore, W.; Dean, A.; Nobrega, M. A.; Bejerano, G. (2013). “Enhancers: Five essential questions”. Nature Reviews Genetics. 14 (4): 288–95. doi:10.1038/nrg3458. PMC 4445073. PMID 23503198.
34. Maston, G. A.; Evans, S. K.; Green, M. R. (2006). “Transcriptional Regulatory Elements in the Human Genome”. Annual Review of Genomics and Human Genetics. 7: 29–59. doi:10.1146/annurev.genom.7.080505.115623. PMID 16719718.
35. Mignone, Flavio; Gissi, Carmela; Liuni, Sabino; Pesole, Graziano (2002-02-28). “Untranslated regions of mRNAs”. Genome Biology. 3 (3): reviews0004. doi:10.1186/gb-2002-3-3-reviews0004. ISSN 1465-6906. PMC 139023. PMID 11897027.
36. Bicknell AA, Cenik C, Chua HN, Roth FP, Moore MJ (December 2012). “Introns in UTRs: why we should stop ignoring them”. BioEssays. 34 (12): 1025–34. doi:10.1002/bies.201200073. PMID 23108796.
37. Salgado, H.; Moreno-Hagelsieb, G.; Smith, T.; Collado-Vides, J. (2000). “Operons in Escherichia coli: Genomic analyses and predictions”. Proceedings of the National Academy of Sciences. 97 (12): 6652–6657. Bibcode:2000PNAS...97.6652S. doi:10.1073/pnas.110147297. PMC 18690. PMID 10823905.
40. Spilianakis CG, Lalioti MD, Town T, Lee GR, Flavell RA (June 2005). “Interchromosomal associations between alternatively expressed loci”. Nature. 435 (7042): 637–45. Bibcode:2005Natur.435..637S. doi:10.1038/nature03574. PMID 15880101.
41. Williams, A; Spilianakis, CG; Flavell, RA (April 2010). “Interchromosomal association and gene regulation in trans”. Trends in Genetics. 26 (4): 188–97. doi:10.1016/j.tig.2010.01.007. PMC 2865229. PMID 20236724.
42. Beadle GW, Tatum EL (1941). “Genetic Control of Biochemical Reactions in Neurospora”. Proc. Natl. Acad. Sci. U.S.A. 27 (11): 499–506. Bibcode:1941PNAS...27..499B. doi:10.1073/pnas.27.11.499. PMC 1078370. PMID 16588492.
43. Horowitz NH, Berg P, Singer M, Lederberg J, Susman M, Doebley J, Crow JF (2004). “A centennial: George W. Beadle, 1903-1989”. Genetics. 166 (1): 1–10. doi:10.1534/genetics.166.1.1. PMC 1470705. PMID 15020400.
45. Parra G, Reymond A, Dabbouseh N, Dermitzakis ET, Castelo R, Thomson TM, Antonarakis SE, Guigó R (January 2006). “Tandem chimerism as a means to increase protein complexity in the human genome”. Genome Research. 16 (1): 37–44. doi:10.1101/gr.4145906. PMC 1356127. PMID 16344564.
49. Woodson SA (May 1998). “Ironing out the kinks: splicing and translation in bacteria”. Genes & Development. 12 (9): 1243–7. doi:10.1101/gad.12.9.1243. PMID 9573040.
51. Koonin, Eugene V.; Dolja, Valerian V.; Morris, T. Jack (January 1993). “Evolution and Taxonomy of Positive-Strand RNA Viruses: Implications of Comparative Analysis of Amino Acid Sequences”. Critical Reviews in Biochemistry and Molecular Biology. 28 (5): 375–430. doi:10.3109/10409239309078440. PMID 8269709.
53. Domingo, E; Escarmís, C; Sevilla, N; Moya, A; Elena, SF; Quer, J; Novella, IS; Holland, JJ (June 1996). “Basic concepts in RNA virus evolution”. FASEB Journal. 10 (8): 859–64. PMID 8666162.
55. Miko, Ilona (2008