  Schema for Most Conserved - PhastCons Conserved Elements, 5-way Vertebrate Multiz Alignment
  Database: bosTau4    Primary Table: phastConsElements5way    Row Count: 1,005,876
Format description: Browser extensible data
field       example  SQL type           description
bin         585      smallint unsigned  Indexing field to speed chromosome range queries.
chrom       chr1     varchar(255)       Reference sequence chromosome or scaffold
chromStart  5171     int unsigned       Start position in chromosome
chromEnd    5242     int unsigned       End position in chromosome
name        lod=18   varchar(255)       Name of item
score       253      int unsigned       Score from 0-1000
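
The bin field can be recomputed from chromStart and chromEnd. The sketch below assumes the standard UCSC hierarchical binning scheme (five levels, finest bins of 128 kb, each coarser level 8 times larger); this page only states that bin is an indexing field, so the constants and the function name bin_from_range are assumptions for illustration.

```python
# Assumed standard UCSC binning scheme: offsets of the first bin at each
# level, ordered from the finest level (128 kb bins) to the coarsest.
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]
BIN_FIRST_SHIFT = 17  # 2**17 = 131072 bases per finest-level bin
BIN_NEXT_SHIFT = 3    # each coarser level covers 8x more bases


def bin_from_range(start, end):
    """Return the smallest bin that fully contains [start, end).

    start is 0-based and end is exclusive, matching chromStart/chromEnd.
    """
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError("range %d-%d is outside the binning scheme" % (start, end))


# The example row in the schema (chr1, 5171-5242) falls in the first
# finest-level bin.
example_bin = bin_from_range(5171, 5242)
```

Under this scheme, every item that lies entirely within the first 128 kb of a chromosome gets the same bin value, which is consistent with the sample rows all sharing bin 585.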

  Sample Rows
 
bin  chrom  chromStart  chromEnd  name    score
585  chr1   5171        5242      lod=18  253
585  chr1   7608        7862      lod=36  341
585  chr1   12492       12885     lod=52  387
585  chr1   27436       27532     lod=16  238
585  chr1   27554       27688     lod=30  318
585  chr1   27799       27837     lod=14  221
585  chr1   27952       28191     lod=52  387
585  chr1   37376       37444     lod=18  253
585  chr1   37578       37640     lod=17  246
585  chr1   38038       38122     lod=16  238

Note: all start coordinates in our database are 0-based, not 1-based. End coordinates are exclusive, so the length of an item is chromEnd - chromStart.
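
As a small illustration of the coordinate convention, the sketch below converts a row's 0-based start and exclusive end into the 1-based, inclusive position string commonly typed into a genome browser. The helper names are hypothetical and not part of any UCSC tool.

```python
def item_length(chrom_start, chrom_end):
    """Length in bases of a 0-based, end-exclusive interval."""
    return chrom_end - chrom_start


def to_one_based_position(chrom, chrom_start, chrom_end):
    """Format a 0-based, end-exclusive interval as a 1-based, inclusive
    position string: only the start shifts by one, the end is unchanged."""
    return "%s:%d-%d" % (chrom, chrom_start + 1, chrom_end)


# First sample row: chr1, chromStart 5171, chromEnd 5242.
first_length = item_length(5171, 5242)
first_position = to_one_based_position("chr1", 5171, 5242)
```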


  Most Conserved (phastConsElements5way) Track Description
 

Description

This track shows predictions of conserved elements produced by the phastCons program. PhastCons is part of the PHAST (PHylogenetic Analysis with Space/Time models) package. The predictions are based on a phylogenetic hidden Markov model (phylo-HMM), a type of probabilistic model that describes both the process of DNA substitution at each site in a genome and the way this process changes from one site to the next.

Methods

Best-in-genome pairwise alignments were generated for each species using blastz, followed by chaining and netting. A multiple alignment was then constructed from these pairwise alignments using multiz. Predictions of conserved elements were then obtained by running phastCons on the multiple alignments with the --most-conserved option.

PhastCons constructs a two-state phylo-HMM with a state for conserved regions and a state for non-conserved regions. The two states share a single phylogenetic model, except that the branch lengths of the tree associated with the conserved state are multiplied by a constant scaling factor rho (0 <= rho <= 1). The free parameters of the phylo-HMM, including the scaling factor rho, are estimated from the data by maximum likelihood using an EM algorithm. This procedure is subject to certain constraints on the "coverage" of the genome by conserved elements and the "smoothness" of the conservation scores. Details can be found in Siepel et al. (2005).
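
To make the role of rho concrete, here is a deliberately simplified sketch. It assumes only two aligned species separated by a single branch of length t and a Jukes-Cantor substitution model; phastCons itself works on a full tree with a richer model, so every function below is illustrative rather than a description of its implementation. The conserved state uses branch length rho * t, the non-conserved state uses t, and the per-column log-odds are summed over a segment while ignoring the HMM transition terms.

```python
import math


def jc69_prob(same, t):
    """Jukes-Cantor probability that a base is unchanged (same=True) or has
    become one specific different base (same=False) after branch length t."""
    decay = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * decay if same else 0.25 - 0.25 * decay


def column_log_odds(x, y, t, rho):
    """Log probability of the aligned pair (x, y) under the conserved state
    (branch length rho * t) minus that under the non-conserved state (t).

    Requires 0 < rho <= 1; rho = 0 would give mismatches zero probability.
    """
    same = x == y
    conserved = 0.25 * jc69_prob(same, rho * t)  # 0.25 = uniform base frequency
    neutral = 0.25 * jc69_prob(same, t)
    return math.log(conserved) - math.log(neutral)


def segment_log_odds(seq_a, seq_b, t, rho):
    """Sum of per-column log-odds over a gap-free pairwise segment."""
    return sum(column_log_odds(x, y, t, rho) for x, y in zip(seq_a, seq_b))


# Hypothetical parameter values chosen only for illustration.
identical = segment_log_odds("ACGT", "ACGT", t=0.5, rho=0.3)
one_mismatch = segment_log_odds("ACGT", "ACGA", t=0.5, rho=0.3)
```

With rho below 1, identical columns favor the conserved state and mismatched columns favor the non-conserved state, so longer runs of agreement accumulate higher log-odds.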

The predicted conserved elements are segments of the alignment that are likely to have been "generated" by the conserved state of the phylo-HMM. Each element is assigned a log-odds score equal to its log probability under the conserved model minus its log probability under the non-conserved model. The "score" field associated with this track contains transformed log-odds scores, taking values between 0 and 1000. (The scores are transformed using a monotonic function of the form a * log(x) + b, where x is the raw log-odds score.) The raw log-odds scores are retained in the "name" field and can be seen on the details page or in the browser when the track's display mode is set to "pack" or "full".
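
The constants a and b are not given on this page. As a rough check of the stated form, the sketch below estimates them from two of the sample rows (lod=18 with score 253, and lod=52 with score 387) and applies the result to the other rows. The fitted values, the use of the natural logarithm, and the clamping to 0-1000 are assumptions made for illustration, not the track's documented parameters.

```python
import math

# (raw log-odds from the name field, score) pairs taken from the sample rows.
SAMPLE = [(18, 253), (36, 341), (52, 387), (16, 238),
          (30, 318), (14, 221), (17, 246)]


def fit_two_point(p, q):
    """Solve score = a * log(lod) + b exactly through two (lod, score) points."""
    (x1, y1), (x2, y2) = p, q
    a = (y2 - y1) / (math.log(x2) - math.log(x1))
    b = y1 - a * math.log(x1)
    return a, b


def transform(lod, a, b):
    """Apply a * log(lod) + b, round, and clamp to the 0-1000 score range
    (the clamping is an assumption)."""
    return max(0, min(1000, round(a * math.log(lod) + b)))


A, B = fit_two_point((18, 253), (52, 387))
predicted = [(lod, transform(lod, A, B)) for lod, _ in SAMPLE]
```

The choice of logarithm base does not matter for the fit, since a change of base is absorbed into a.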

Credits

This track was created at UCSC using the following programs:

  • Blastz and multiz by Minmei Hou, Scott Schwartz and Webb Miller of the Penn State Bioinformatics Group.
  • AxtBest, axtChain, chainNet, netSyntenic, and netClass by Jim Kent at UCSC.
  • PhastCons by Adam Siepel at Cornell University.

References

PhastCons

Siepel A, Bejerano G, Pedersen JS, Hinrichs A, Hou M, Rosenbloom K, Clawson H, Spieth J, Hillier LW, Richards S, et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 2005 Aug;15(8):1034-50.

Chain/Net

Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution's cauldron: Duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci USA. 2003 Sep 30;100(20):11484-9.

Multiz

Blanchette M, Kent WJ, Riemer C, Elnitski L, Smit AF, Roskin KM, Baertsch R, Rosenbloom K, Clawson H, Green ED, et al. Aligning multiple genomic sequences with the threaded blockset aligner. Genome Res. 2004 Apr;14(4):708-15.

Blastz

Chiaromonte F, Yap VB, Miller W. Scoring pairwise genomic sequence alignments. Pac Symp Biocomput. 2002:115-26.

Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison R, Haussler D, Miller W. Human-Mouse Alignments with BLASTZ. Genome Res. 2003 Jan;13(1):103-7.