Ch 21 Genomes and Their Evolution

Genomics: the study of whole sets of genes and their interactions

Bioinformatics: the application of computational methods to the storage and analysis of biological data


I. New approaches have accelerated the pace of genome sequencing

1. Three stages used by the Human Genome Project

a. linkage mapping: based on recombination frequencies (remember the Sordaria lab?) of markers such as RFLPs and STRs

b. physical mapping

c. DNA sequencing: dideoxy chain termination method

2. Whole-genome shotgun approach to genome sequencing: a computer recognizes overlapping fragments and assembles them into one continuous sequence
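
A minimal sketch of the overlap idea behind shotgun assembly, assuming error-free fragments from one strand and a simple greedy merge. The function names and the toy fragments are made up for illustration; real assemblers are far more elaborate.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len: int = 3):
    """Repeatedly merge the pair of fragments with the longest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    # drop fragments that are fully contained in another fragment
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing overlaps any more; leave the remaining contigs separate
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


# toy fragments (hypothetical), overlapping by 3 nt each
contigs = greedy_assemble(["ATGCGT", "CGTTAC", "TACGGA"])
print(contigs)
```

With these three toy fragments the overlaps CGT and TAC let the merge produce a single contig.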


II: Use bioinformatics to analyze genomes and their functions: centralized resources for analyzing genome sequences, such as NCBI and its GenBank database

Identifying protein-coding genes and understanding genes and their products at the systems level (gene circuits and protein interaction networks)

Proteomics: the study of full sets of proteins
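
One simple way software can flag candidate protein-coding genes is to scan for open reading frames (a start codon followed in frame by a stop codon). The sketch below assumes a forward-strand DNA string with no introns; `find_orfs` and the toy sequence are illustrative only and do not represent how any particular gene-finding tool works.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2):
    """Return (start, end) index pairs of ATG...stop open reading frames.

    Only the three forward reading frames are scanned; min_codons counts
    codons before the stop codon (including the ATG).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ATG in this frame
    return sorted(orfs)


toy = "CCATGAAATTTTAGGG"  # hypothetical sequence
for start, end in find_orfs(toy):
    print(start, end, toy[start:end])
```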


III: Genomes vary in size, number of genes, and gene density

A.     genome size

B.     Number of genes: alternative RNA splicing and post-translational modifications = more proteins than genes

C.      Gene density: humans and other mammals have low gene density

Bacterial genes generally lack introns

Eukaryotic genomes: long regulatory sequences

Humans have about 10,000 times as much noncoding DNA as bacteria
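
Point B can be made concrete with a small count. The sketch assumes the simplest model of alternative splicing (some exons may be skipped, and exon order never changes); the exon names are hypothetical.

```python
from itertools import combinations


def splice_isoforms(exons, optional):
    """List every mRNA isoform obtainable by skipping any subset of the optional exons."""
    skippable = [e for e in exons if e in optional]
    isoforms = []
    for r in range(len(skippable) + 1):
        for skipped in combinations(skippable, r):
            isoforms.append(tuple(e for e in exons if e not in skipped))
    return isoforms


# one hypothetical gene with four exons, two of which can be skipped
isoforms = splice_isoforms(["E1", "E2", "E3", "E4"], optional={"E2", "E3"})
print(len(isoforms))  # number of distinct transcripts from this one gene
for iso in isoforms:
    print("-".join(iso))
```

One gene with two skippable exons already yields four transcripts under this model, which is the sense in which a genome can specify more proteins than it has genes.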


IV: Multicellular eukaryotes have much noncoding DNA and many multigene families

Coding DNA (1.5%) = genes for proteins, rRNA, tRNA, miRNA    Noncoding DNA = plays important roles

Evidence: persistence of noncoding sequences in diverse genomes

Ex. human, rat, and mouse share about 500 regions of identical noncoding DNA

                        Higher level of conservation than in exons

Unique noncoding DNA: gene fragments and pseudogenes (former genes that have accumulated mutations over a long time and become nonfunctional)


Repetitive DNA: sequences that are present in multiple copies in the genome

   About ¾ of repetitive DNA = transposable elements and related sequences

  1. Transposable elements: DNA segments that can move within the genome; do not code for a normal functioning protein; 100s to 1000s of bp
    a. Transposition: the process by which “jumping genes” move (they never actually jump out of the DNA)

    b. Moves via a type of recombination process

    c. Ex. Indian corn (Barbara McClintock): identified color changes in corn kernels that made sense only if genetic elements jumped and disrupted the original genes for kernel color


Eukaryotes: 2 types of transposable elements

1.      transposons: move within the genome by means of a DNA intermediate, by a “cut and paste” or a “copy and paste” mechanism

2.      retrotransposons (*** most eukaryotic transposable elements): move by means of an RNA intermediate (also code for reverse transcriptase)

-        always leave a copy at the original site
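
A toy model of the two outcomes, treating a chromosome as a list of labeled segments. It only shows the bookkeeping (whether the original copy stays behind), not the enzymes or the DNA/RNA intermediates; all names are hypothetical.

```python
def cut_and_paste(genome, element, new_pos):
    """Remove the element from its original site and insert it at new_pos.

    new_pos is an index into the genome after the element has been cut out.
    """
    i = genome.index(element)
    rest = genome[:i] + genome[i + 1:]
    return rest[:new_pos] + [element] + rest[new_pos:]


def copy_and_paste(genome, element, new_pos):
    """Leave the original copy in place and insert a new copy at new_pos.

    Retrotransposons always behave this way; some transposons do too.
    """
    if element not in genome:
        raise ValueError("element not present in genome")
    return genome[:new_pos] + [element] + genome[new_pos:]


genome = ["geneA", "TE", "geneB", "geneC"]
print(cut_and_paste(genome, "TE", 2))   # element count unchanged
print(copy_and_paste(genome, "TE", 3))  # element count increases by one
```

Copy and paste is the route by which the number of copies grows, which fits with retrotransposons making up most of the transposable elements in eukaryotic genomes.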



***REVERSE TRANSCRIPTASE: its presence suggests that retroviruses may have evolved from retrotransposons


  2. Sequences related to transposable elements

- 25%-50% of most mammalian genomes; even higher in many plants and amphibians

- Single unit is 100s to 1000s bp

- Dispersed copies are similar (some can still move, others too mutated to move)


Humans and Primates:

Alu elements (10% of human genome): each about 300 bp long, dispersed throughout the chromosomes; do not seem to code for a functional product but can affect functional genes


            L1: LINE-1 (17% of genome): about 6500 bp, with a low rate of transposition

                        Within L1 are sequences that block the progress of RNA polymerase

Q: What would be an advantage of blocking RNA polymerase?

                        L1 is found in about 80% of human genes: perhaps it helps regulate gene expression?


  3. Repetitive DNA (15%): not related to transposable elements

Q: Then how did the repetitive DNA get there?

      Errors during DNA replication or recombination (shuffling, crossing over ..), e.g. in meiosis

                        About 5% of the genome: large duplications, each 10,000 to 300,000 bp

                        Copied from one chromosomal location to another


            Simple sequence DNA (3%): 15-50 nt long

                        STR (short tandem repeat): repeating unit 2-5 nt long, ex. GTTACGTTAC (unit GTTAC, 2 copies)

                                    Used in genetic profiling

                                    Often found at telomeres and centromeres


Q: why are STRs found here?

                                    May help organize chromatin

Telomeric DNA binds proteins that protect chromosome ends from degradation and from joining to other chromosomes
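
STRs can be found with a simple pattern search, and the copy number at a site is what differs between people in genetic profiling. A minimal sketch, assuming a clean DNA string; `find_strs` is a made-up helper and the flanking bases in the second example are invented.

```python
import re

# a unit of 2-5 nt (shortest tried first) followed by at least one more copy of itself
STR_PATTERN = re.compile(r"([ACGT]{2,5}?)\1+")


def find_strs(seq: str):
    """Return (start, unit, copies) for each tandem repeat of a 2-5 nt unit."""
    hits = []
    for m in STR_PATTERN.finditer(seq.upper()):
        unit = m.group(1)
        copies = len(m.group(0)) // len(unit)
        hits.append((m.start(), unit, copies))
    return hits


print(find_strs("GTTACGTTAC"))           # the example from the notes
print(find_strs("AAGTTACGTTACGTTACTT"))  # same unit, one more copy, with flanking bases
```

Two individuals carrying different numbers of copies at the same site would give different `copies` values here.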


  4. Genes and multigene families: coding DNA = proteins, tRNA, sRNA, rRNA

More than half of human genes are in families

Identical: ex. rRNA genes, where RNA is the final product

Nonidentical: globin-related families of genes

                        Alpha globin: chromosome 16; different forms with different affinities for oxygen

                        Beta globin: chromosome 11

                        Also some pseudogenes

Q: Why different forms of the same gene?

 Different forms are expressed at different times in development, allowing hemoglobin (Hb) to work effectively in changing environments.

            - embryonic and fetal forms have a higher affinity for oxygen

Q: Why are rRNAs transcribed from a single unit repeated 100s to 1000s of times?

Helps make millions of ribosomes quickly for active protein synthesis

Primary transcript is cleaved into 3 rRNA molecules that combine with proteins = ribosome

21.5 Duplication, rearrangement and mutation of DNA contribute to Evolution

A. Duplication of entire chromosome sets is usually lethal

            Polyploidy: one or more extra sets of chromosomes; more common in plants

                                    Extra sets of chromosomes can accumulate mutations, and genes may evolve novel functions as long as one essential set of genes is still expressed


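B. Alterations of chromosome structure: rearrangements can contribute to SPECIATION (they interfere with successful mating between populations)

            Since diverging about 6 million years ago: humans n=23, chimpanzees n=24

                        2 ancestral chromosomes fused in the human lineage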

Q: How do we know 2 ancestral chromosomes fused?

                        *** telomere sequences are evidence, since they are found in the interior of the fused chromosome
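
The telomere evidence amounts to finding telomere repeats where chromosome ends should not be. A rough sketch, assuming the human telomeric repeat unit TTAGGG (not stated in the notes) and scanning one strand only; the toy chromosome and the `edge` cutoff are invented for illustration.

```python
import re


def internal_telomere_runs(seq, min_copies=3, edge=20, unit="TTAGGG"):
    """Return (start, end) of telomere-like repeat runs lying at least `edge` nt from both ends."""
    pattern = re.compile(f"(?:{unit}){{{min_copies},}}")  # e.g. (?:TTAGGG){3,}
    hits = []
    for m in pattern.finditer(seq):
        if m.start() >= edge and m.end() <= len(seq) - edge:
            hits.append((m.start(), m.end()))
    return hits


# toy "fused" chromosome: telomere repeats at both ends and one run in the interior
toy = "TTAGGG" * 4 + "ACGT" * 10 + "TTAGGG" * 3 + "ACGT" * 10 + "TTAGGG" * 4
print(internal_telomere_runs(toy))
```

Only the interior run is reported; the runs at the two ends are what a normal, unfused chromosome would show.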

Q: What can you infer about the similar blocks of DNA on the human and mouse chromosomes?

Similar blocks in mouse and human suggest these regions have stayed together (remained stable) since the two lineages diverged


Looking at 8 mammal species, researchers reconstructed evolutionary history from chromosomal rearrangements: duplications, inversions (breaking and reinserting)

            Over the last 100 million years: dinosaurs go extinct (about 66 million years ago)

                        Mammalian adaptive radiation: unequal crossing over, lots of mistakes

Q: Why so many mistakes? Why is adaptive radiation a mechanism for speciation?


                        8 species: duplications, inversions = meiotic mistakes

            Chromosomal rearrangement – generation of new species

            *hot spots (breakage points do not seem to be random) = associated with congenital diseases?


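C. Duplication and divergence of gene-sized regions of DNA

- transposable elements provide sites for crossing over even when chromatids are misaligned (unequal crossing over = duplication on one chromatid, deletion on the other)

            - slippage of the DNA template during replication

            - can create STRs

            - duplicated regions can persist as multigene families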
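
A toy model of unequal crossing over between two copies of a transposable element (TE) that flank a gene. Chromatids are lists of labeled segments, and the crossover is assumed to happen inside the mispaired TE copies; the names are hypothetical.

```python
def unequal_crossover(chromatid, i, j):
    """Cross over between the copy at index i on one chromatid and the copy at index j (> i) on the other.

    Both chromatids start out identical. Returns (duplication_product, deletion_product).
    """
    if chromatid[i] != chromatid[j]:
        raise ValueError("crossover sites must carry the same repeated element")
    deletion = chromatid[:i + 1] + chromatid[j + 1:]     # loses everything between the two copies
    duplication = chromatid[:j + 1] + chromatid[i + 1:]  # gains a second copy of that stretch
    return duplication, deletion


chromatid = ["A", "TE", "gene", "TE", "B"]
dup, dele = unequal_crossover(chromatid, 1, 3)
print(dup)   # gene appears twice
print(dele)  # gene is missing
```

The duplicated copy is the raw material: one copy can keep the original function while the other accumulates mutations.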



D. Evolution of Genes with related functions “the human globin genes”

Duplication events can lead to evolution of multigene families

- 450-500 million years ago: an ancestral globin gene duplicated and diverged into alpha and beta globin

- myoglobin and plant leghemoglobin function as monomers

Q: How can duplication of genes give rise to families of genes with similar functions?

            *after duplication, the copies must accumulate different mutations in order for families to arise

E. Evolution of Genes with novel functions

Duplicated genes do not have to keep similar functions

One copy can change enough to code for a new protein product

            Lysozyme and alpha lactalbumin: (both found in mammals, only lysozyme in birds)

                        Look extremely similar in code and protein structure

Q: What can we infer about the Evolutionary history of birds and mammals regarding these 2 proteins?

            Duplication occurred in the mammalian lineage after it diverged from the bird lineage, and one copy accumulated mutations that gave it a new function


F. Rearrangement of parts of genes: exon duplication and shuffling

            Ex. Collagen: highly repetitive amino acid sequence = repetitive patterns of exons

Mixing of exons, even from nonallelic genes = meiotic errors and recombination

                                    Exon shuffling

Tissue plasminogen activator (TPA): extracellular protein that helps control blood clotting

4 domains of 3 types: each is encoded by an exon (each type of exon is also found in other proteins)


G. How transposable elements contribute to Genomic Evolution

1. promote recombination

2. disrupt cellular genes or control elements (remember activators and enhancers for turning up the volume on production)

            3. carry entire genes or individual exons to new locations

Q: How is it possible that these 3 transposition processes can speed up evolution?

Since transposable elements are scattered throughout the chromosomes, they allow crossing over between homologous regions on different chromosomes

Chromosomal translocations are usually lethal; however, this may account for why the alpha globin and beta globin genes are found on different chromosomes

Also, Alu insertions may create a weak alternative splice site = sometimes you get an alternative protein while still getting the original one most of the time


21.6 Comparing genome sequences provides clues to evolution and development

            Distantly related species: analyzing highly conserved genes

            Closely related species

            Humans and chimps: 1.2% difference in single-nucleotide substitutions

-        2.7% difference due to deletions and insertions (duplications, repetitive DNA)

-         1/3 of human duplications are not found in chimps

o       Some of these are associated with human disease

o       More Alu elements in humans

o       More copies of a retroviral provirus in chimps

Some genes evolve faster in humans

-        defense against malaria, TB

-        one gene regulates brain size

Genes classified by function

-        genes that code for transcription factors (evolve fast)


Ex. FOXP2: a transcription factor whose gene shows rapid change in the human lineage and functions in vertebrate vocalization

            Mutations here lead to severe speech and language impairment

                        Only 2 aa differences between a human and a chimp


B. Comparing genomes in a species

            Genetic variation in humans (species is about 200,000 years old = short time)

            Much of the diversity in humans is due to SNPs (single nucleotide polymorphisms), which occur about once in 300 bp

                        Some inversions, duplications and deletions
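
Counting single-nucleotide differences between two aligned sequences is the basic operation behind both the SNP rate and species-to-species percentages. A minimal sketch, assuming the two sequences are already aligned and the same length; the 8 nt sequences are invented.

```python
def snp_positions(seq_a: str, seq_b: str):
    """Return the 0-based positions where two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


def percent_difference(seq_a: str, seq_b: str) -> float:
    """Single-nucleotide differences as a percentage of aligned positions."""
    return 100.0 * len(snp_positions(seq_a, seq_b)) / len(seq_a)


person_1 = "ACGTACGT"  # hypothetical aligned sequences
person_2 = "ACGAACGT"
print(snp_positions(person_1, person_2))
print(percent_difference(person_1, person_2))
```

A rate of 1 SNP per 300 bp corresponds to about 0.33% on this measure; insertions and deletions are not counted because the sketch assumes equal-length alignments.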


C. Comparing developmental processes: Hox genes
    1. Evo-devo: evolutionary developmental biology

Widespread conservation of developmental genes among animals

Homeobox (same in many invertebrates and vertebrates)

-        180 nt sequence encoding a 60 amino acid domain (the homeodomain)

-        Homeotic genes in Drosophila specify the identity of body segments along the anterior-posterior axis
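
The 180 nt and 60 amino acid figures are the same number seen two ways, because each codon is 3 nt. A quick check, using a placeholder sequence that is the right length but is not a real homeobox:

```python
def split_codons(seq: str):
    """Split a coding sequence into consecutive 3 nt codons."""
    if len(seq) % 3 != 0:
        raise ValueError("length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


placeholder = "ATG" + "GCT" * 59  # 180 nt placeholder, NOT an actual homeobox sequence
codons = split_codons(placeholder)
print(len(placeholder), len(codons))  # nucleotides, then codons (= amino acids encoded)
```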

Homeobox sequences are not limited to homeotic (Hox) genes; they are also found in other genes elsewhere on the chromosomes

-        bicoid (polarity of egg)

-        segmentation

-        can act as a transcription regulator

Homeobox-related sequences are even found in regulatory genes of yeast and plants

Q: So what is the evolutionary significance of hox genes (homeobox DNA)?

Homeobox DNA evolved early in life and was sufficiently valuable to organisms to be conserved in plants and animals over hundreds of millions of years.


Proteins with homeodomains probably regulate development by coordinating the transcription of batteries of developmental genes (remember activators and enhancers from last unit)

-        switching on and off

-        in embryos, different combinations of Hox genes are active in different parts of the embryo