  Schema for TransMap UCSC - TransMap UCSC Gene Mappings
  Database: bosTau4    Primary Table: transMapAlnUcscGenes    Row Count: 101,095
Format description: Summary info about a patSpace alignment
field        example                          SQL type           info    description
bin          585                              smallint unsigned  range   Indexing field to speed chromosome range queries.
matches      750                              int unsigned       range   Number of bases that match that aren't repeats
misMatches   183                              int unsigned       range   Number of bases that don't match
repMatches   0                                int unsigned       range   Number of bases that match but are part of repeats
nCount       0                                int unsigned       range   Number of 'N' bases
qNumInsert   0                                int unsigned       range   Number of inserts in query
qBaseInsert  0                                int unsigned       range   Number of bases inserted in query
tNumInsert   0                                int unsigned       range   Number of inserts in target
tBaseInsert  0                                int unsigned       range   Number of bases inserted in target
strand       +                                char(2)            values  + or - for strand. First character query, second target (optional)
qName        uc008koy.1-1.1                   varchar(255)       values  Query sequence name
qSize        945                              int unsigned       range   Query sequence size
qStart       0                                int unsigned       range   Alignment start position in query
qEnd         945                              int unsigned       range   Alignment end position in query
tName        chr1                             varchar(255)       values  Target sequence name
tSize        161106243                        int unsigned       range   Target sequence size
tStart       41242                            int unsigned       range   Alignment start position in target
tEnd         42175                            int unsigned       range   Alignment end position in target
blockCount   5                                int unsigned       range   Number of blocks in alignment
blockSizes   53,221,257,170,232,              longblob                   Size of each block
qStarts      0,54,276,536,713,                longblob                   Start of each block in query.
tStarts      41242,41295,41516,41773,41943,   longblob                   Start of each block in target.
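
The bin field is easier to interpret with a small example. The sketch below assumes this table uses the standard UCSC binning scheme (five levels, smallest bins of 128 kb, each level eight times larger); the schema page itself does not state which scheme applies, and the function name is illustrative.

```python
# Offsets of the first bin at each level, from the finest (128 kb) to the coarsest (512 Mb).
BIN_OFFSETS = (512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0)
BIN_FIRST_SHIFT = 17  # finest bins span 2**17 bases
BIN_NEXT_SHIFT = 3    # each coarser level is 2**3 times larger


def bin_from_range(start: int, end: int) -> int:
    """Return the smallest bin that fully contains the half-open range [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError(f"range {start}-{end} does not fit the standard binning scheme")


# tStart and tEnd from the example column above.
print(bin_from_range(41242, 42175))  # 585
```

Under that assumption, the example alignment (tStart 41242, tEnd 42175) falls entirely inside the first 128 kb bin, which gives the bin value 585 shown in the table.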

  Connected Tables and Joining Fields
        bosTau4.transMapInfoUcscGenes.mappedId (via transMapAlnUcscGenes.qName)
      hgFixed.transMapSrcUcscGenes.id (via transMapAlnUcscGenes.qName)

  Sample Rows
 
bin matches misMatches repMatches nCount qNumInsert qBaseInsert tNumInsert tBaseInsert strand qName qSize qStart qEnd tName tSize tStart tEnd blockCount blockSizes qStarts tStarts
585 750 183 0 0 0 0 0 0 + uc008koy.1-1.1 945 0 945 chr1 161106243 41242 42175 5 53,221,257,170,232, 0,54,276,536,713, 41242,41295,41516,41773,41943,
585 801 173 0 0 0 0 0 0 + uc001nhz.1-1.2 987 0 986 chr1 161106243 41242 42218 7 53,224,254,170,217,22,34, 0,54,279,536,713,930,952, 41242,41295,41519,41773,41943,42161,42184,
585 740 196 0 0 0 0 0 0 + uc008kpc.1-1.5 942 0 937 chr1 161106243 52417 53353 2 387,549, 0,388, 52417,52804,
732303105819300000-uc007zzg.1-1.1374303743chr116110624310143315864411110,27,45,13,26,26,13,6,14,7,16,15,3,47,11,18,16,26,19,14,11,9,32,10,10,46,11,25,4,60,8,22,12,17,45,47,24,11,15,18,30,10,24,8,19, ...0,10,37,82,95,121,157,170,176,190,201,217,232,235,282,293,361,377,404,423,437,448,457,489,499,509,555,576,602,619,680,688,710,72 ...101433,101460,101492,101544,101574,101609,101635,101653,101661,101677,101684,101701,101729,101735,101783,101795,101813,101833,10 ...
73241683220300000-uc002yuf.1-1.1380003800chr116110624310143715825958118,4,73,59,18,29,47,7,22,71,50,27,15,36,136,5,16,27,14,26,19,45,12,96,10,34,32,49,44,62,26,43,27,10,16,44,8,72,24,29,221,182,10 ...0,118,122,198,257,275,305,393,442,465,537,593,620,635,671,807,812,831,870,888,914,935,983,995,1092,1102,1136,1168,1223,1267,1340 ...101437,101573,101584,101657,101732,101751,101780,101827,101834,101856,101927,101977,102005,102021,102058,102200,102208,102224,10 ...
586 85 64 0 0 0 0 0 0 - uc007zzf.1-1.1 705 227 386 chr1 161106243 163666 163832 6 19,16,36,20,51,7, 319,346,364,400,420,471, 163666,163685,163701,163745,163773,163825,
7316935633000000+uc007zze.1-1.1232402324chr11611062432190793390825619,25,18,70,7,58,26,84,174,160,91,102,226,28,19,37,11,28,25,61,9,49,36,15,16,4,15,17,4,12,10,41,45,32,42,39,25,19,71,12,19,38,20 ...0,19,44,65,136,143,204,230,314,488,648,742,845,1071,1099,1118,1155,1167,1195,1222,1284,1293,1342,1378,1393,1409,1413,1428,1445,1 ...219079,219099,219139,219157,219227,219235,219293,219322,331232,333962,337305,337396,337498,337725,337760,337782,337821,337832,33 ...
7319243883300000+uc002yue.1-1.1242902429chr1161106243219113339083354,19,19,35,124,84,174,160,184,32,37,272,27,40,21,33,48,94,12,17,31,163,34,27,33,78,19,94,25,57,124,20,37,27,140,0,4,53,72,116,240,324,498,658,842,874,914,1187,1214,1254,1278,1311,1359,1462,1474,1491,1522,1685,1719,1746,1779,1857,1876,1980,2 ...219113,219120,219139,219160,219195,219322,331232,333962,337305,337490,337523,337560,337832,337865,337912,337933,337967,338016,33 ...
731832393000000+uc002yud.1-1.1227402274chr11611062432198213390833715,8,6,13,46,2,54,21,174,160,184,32,37,272,27,40,21,33,48,94,12,17,31,163,34,27,33,78,19,94,25,57,124,20,37,27,140,0,17,25,31,45,92,94,148,169,343,503,687,719,759,1032,1059,1099,1123,1156,1204,1307,1319,1336,1367,1530,1564,1591,1624,1702,1721, ...219821,219836,219846,219858,219871,219917,219923,219978,331232,333962,337305,337490,337523,337560,337832,337865,337912,337933,33 ...
731744373000000+uc002yuc.1-1.1216202162chr11611062432204303390833128,29,174,160,184,32,37,272,27,40,21,33,48,94,12,17,31,163,34,27,33,78,19,94,25,57,124,20,37,27,140,0,28,57,231,391,575,607,647,920,947,987,1011,1044,1092,1195,1207,1224,1255,1418,1452,1479,1512,1590,1609,1713,1739,1797,1924,195 ...220430,220459,331232,333962,337305,337490,337523,337560,337832,337865,337912,337933,337967,338016,338110,338124,338149,338183,33 ...
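
The three list fields (blockSizes, qStarts, tStarts) are comma-terminated strings that have to be read together. The following sketch decodes them for the first sample row and checks a few relationships that hold for that row; the helper names are illustrative and not part of any UCSC tool.

```python
def parse_list(field: str) -> list[int]:
    """Split a comma-terminated list such as '53,221,257,' into integers."""
    return [int(item) for item in field.split(",") if item]


def decode_blocks(block_sizes: str, q_starts: str, t_starts: str):
    """Return one (qStart, qEnd, tStart, tEnd) tuple per block, all 0-based half-open."""
    sizes = parse_list(block_sizes)
    q = parse_list(q_starts)
    t = parse_list(t_starts)
    if not (len(sizes) == len(q) == len(t)):
        raise ValueError("block lists have different lengths")
    return [(qs, qs + n, ts, ts + n) for n, qs, ts in zip(sizes, q, t)]


# First sample row (uc008koy.1-1.1, strand +).
blocks = decode_blocks(
    "53,221,257,170,232,",
    "0,54,276,536,713,",
    "41242,41295,41516,41773,41943,",
)
aligned_bases = sum(q_end - q_start for q_start, q_end, _, _ in blocks)

print(len(blocks))      # 5, equal to blockCount
print(aligned_bases)    # 933, equal to matches + misMatches (750 + 183)
print(blocks[-1][1])    # 945, equal to qEnd
print(blocks[-1][3])    # 42175, equal to tEnd
```

For rows on the - strand, PSL stores qStarts relative to the reverse-complemented query, so the last check against qEnd applies only to + strand rows as written.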

Note: all start coordinates in our database are 0-based, not 1-based.
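
As a small illustration of what this means in practice, the sketch below converts a stored range into the 1-based form usually typed into a position box. It assumes that end coordinates are stored as exclusive (half-open ranges), which is the usual UCSC convention but is not stated in the note above; the function names are illustrative.

```python
def to_one_based_position(chrom: str, start: int, end: int) -> str:
    """Format a 0-based, half-open range as a 1-based, inclusive position string."""
    return f"{chrom}:{start + 1}-{end}"


def span_length(start: int, end: int) -> int:
    """Length of a 0-based, half-open range."""
    return end - start


# tName, tStart and tEnd from the example row.
print(to_one_based_position("chr1", 41242, 42175))  # chr1:41243-42175
print(span_length(41242, 42175))                    # 933
```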


  TransMap UCSC (transMapAlnUcscGenes) Track Description
 

Description

This track contains UCSC Gene alignments produced by the TransMap cross-species alignment algorithm from other vertebrate species in the UCSC Genome Browser. For closer evolutionary distances, the alignments are created using syntenically filtered BLASTZ alignment chains, resulting in a prediction of the orthologous genes in cow.

TransMap maps genes and related annotations in one species to another using synteny-filtered pairwise genome alignments (chains and nets) to determine the most likely orthologs. For example, for the mRNA TransMap track on the human assembly, more than 400,000 mRNAs from 23 vertebrate species were aligned at high stringency to the native assembly using BLAT. The alignments were then mapped to the human assembly using the chain and net alignments produced using blastz, which has higher sensitivity than BLAT for diverged organisms.

Compared to translated BLAT, TransMap finds fewer paralogs and aligns more UTR bases. For closely related low-coverage assemblies, a reciprocal-best relationship is used in the chains and nets to improve the synteny prediction.

Display Conventions and Configuration

This track follows the display conventions for PSL alignment tracks.

This track may also be configured to display codon coloring, a feature that allows the user to quickly compare cDNAs against the genomic sequence. Several types of alignment gap may also be colored.

Methods

  1. Source transcript alignments were obtained from vertebrate organisms in the UCSC Genome Browser Database. BLAT alignments of RefSeq Genes, GenBank mRNAs, and GenBank Spliced ESTs to the cognate genome, along with UCSC Genes, were used as available.
  2. For all vertebrate assemblies that had BLASTZ alignment chains and nets to the cow (bosTau4) genome, a subset of the alignment chains was selected as follows:
    • For organisms whose branch distance was no more than 0.5 (as computed by phyloFit, see Conservation track description for details), syntenic filtering was used. Reciprocal best nets were used if available; otherwise, nets were selected with the netFilter -syn command. The chains corresponding to the selected nets were used for mapping.
    • For more distant species, where the determination of synteny is difficult, the full set of chains was used for mapping. This allows more genes to map, at the expense of some mapping to paralogous regions. The post-alignment filtering step removes some of the duplications.
  3. The pslMap program was used to do a base-level projection of the source transcript alignments via the selected chains to the cow genome, resulting in pairwise alignments of the source transcripts to the genome.
  4. The resulting alignments were filtered with pslCDnaFilter with a global near-best criterion of 0.5% in finished genomes (human and mouse) and 1.0% in other genomes. Alignments where less than 20% of the transcript mapped were discarded.
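
The decision rules in steps 2 and 4 can be sketched roughly as follows. This is a simplified illustration, not the logic of netFilter or pslCDnaFilter: the score function, the definition of coverage as aligned bases over qSize, the order of the two filters, and all names are assumptions made for the example.

```python
from dataclasses import dataclass

MAX_SYNTENIC_BRANCH_DISTANCE = 0.5  # from step 2
MIN_TRANSCRIPT_COVERAGE = 0.20      # from step 4


def choose_chain_set(branch_distance: float, has_reciprocal_best: bool) -> str:
    """Pick which chains to map through for one source assembly (step 2)."""
    if branch_distance <= MAX_SYNTENIC_BRANCH_DISTANCE:
        return "reciprocal-best" if has_reciprocal_best else "syntenic"
    return "all"


@dataclass
class Alignment:
    q_name: str
    q_size: int
    matches: int
    mis_matches: int
    rep_matches: int = 0


def coverage(aln: Alignment) -> float:
    """Fraction of the transcript that aligned (assumed definition)."""
    return (aln.matches + aln.mis_matches + aln.rep_matches) / aln.q_size


def score(aln: Alignment) -> int:
    """Hypothetical stand-in score; pslCDnaFilter uses its own scoring."""
    return aln.matches + aln.rep_matches - aln.mis_matches


def filter_alignments(alns: list[Alignment], near_best: float) -> list[Alignment]:
    """Filter all mappings of ONE transcript: coverage first, then near-best."""
    covered = [a for a in alns if coverage(a) >= MIN_TRANSCRIPT_COVERAGE]
    if not covered:
        return []
    best = max(score(a) for a in covered)
    return [a for a in covered if score(a) >= best * (1.0 - near_best)]


candidates = [
    Alignment("tx.1-1.1", 945, 750, 183),  # best mapping
    Alignment("tx.1-1.2", 945, 748, 183),  # within 1.0% of the best score
    Alignment("tx.1-1.3", 945, 700, 183),  # well below the best score
    Alignment("tx.1-1.4", 945, 100, 20),   # covers less than 20% of the transcript
]
kept = filter_alignments(candidates, near_best=0.01)
print([a.q_name for a in kept])
```

Here near_best would be 0.005 for finished genomes and 0.01 otherwise, following step 4.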

To ensure unique identifiers for each alignment, cDNA and gene accessions were made unique by appending a suffix for each location in the source genome and again for each mapped location in the destination genome. The format is:

   accession.version-srcUniq.destUniq
Where srcUniq is a number added to make each source alignment unique, and destUniq is added to give the subsequent TransMap alignments unique identifiers.

For example, in the cow (bosTau4) genome, there are two alignments of mRNA BC149621.1. These are assigned the identifiers BC149621.1-1 and BC149621.1-2. When these are mapped to the human (hg18) genome, BC149621.1-1 maps to a single location and is given the identifier BC149621.1-1.1. However, BC149621.1-2 maps to two locations, resulting in BC149621.1-2.1 and BC149621.1-2.2. Note that multiple TransMap mappings are usually the result of tandem duplications, where both chains are identified as syntenic.
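
A short sketch of how such identifiers could be taken apart, using the examples above. The regular expression and the names TransMapId, parse_transmap_id, and mappings_per_source are illustrative and assume accessions that contain no "." character.

```python
import re
from collections import Counter
from typing import NamedTuple, Optional

# accession.version-srcUniq, optionally followed by .destUniq
ID_PATTERN = re.compile(
    r"^(?P<accession>[^.]+)\.(?P<version>\d+)-(?P<src>\d+)(?:\.(?P<dest>\d+))?$"
)


class TransMapId(NamedTuple):
    accession: str
    version: int
    src_uniq: int
    dest_uniq: Optional[int]  # None for a source alignment that is not yet mapped


def parse_transmap_id(identifier: str) -> TransMapId:
    """Split an identifier of the form accession.version-srcUniq[.destUniq]."""
    match = ID_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"not a TransMap identifier: {identifier!r}")
    dest = match.group("dest")
    return TransMapId(
        match.group("accession"),
        int(match.group("version")),
        int(match.group("src")),
        int(dest) if dest is not None else None,
    )


def mappings_per_source(identifiers: list[str]) -> Counter:
    """Count mapped locations for each source alignment."""
    counts: Counter = Counter()
    for identifier in identifiers:
        parsed = parse_transmap_id(identifier)
        counts[f"{parsed.accession}.{parsed.version}-{parsed.src_uniq}"] += 1
    return counts


mapped = ["BC149621.1-1.1", "BC149621.1-2.1", "BC149621.1-2.2"]
print(parse_transmap_id("BC149621.1-2.2"))
print(dict(mappings_per_source(mapped)))  # {'BC149621.1-1': 1, 'BC149621.1-2': 2}
```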

Credits

This track was produced by Mark Diekhans at UCSC from cDNA sequence data submitted to the international public sequence databases by scientists worldwide.

References

Zhu J, Sanborn JZ, Diekhans M, Lowe CB, Pringle TH, Haussler D. Comparative genomics search for losses of long-established genes on the human lineage. PLoS Comput Biol. 2007 Dec;3(12):e247.

Stanke M, Diekhans M, Baertsch R, Haussler D. Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics. 2008 Mar 1;24(5):637-44.

Siepel A, Diekhans M, Brejová B, Langton L, Stevens M, Comstock CL, Davis C, Ewing B, Oommen S, Lau C et al. Targeted discovery of novel human exons by comparative genomics. Genome Res. 2007 Dec;17(12):1763-73.