Schema for Spliced ESTs - Cow ESTs That Have Been Spliced
Database: bosTau4 Primary Table: intronEst Row Count: 870,826
Format description: Summary info about a patSpace alignment
field | example | SQL type | description |
---|---|---|---|
bin | 886 | smallint unsigned | Indexing field to speed chromosome range queries. |
matches | 429 | int unsigned | Number of bases that match that aren't repeats |
misMatches | 0 | int unsigned | Number of bases that don't match |
repMatches | 0 | int unsigned | Number of bases that match but are part of repeats |
nCount | 1 | int unsigned | Number of 'N' bases |
qNumInsert | 1 | int unsigned | Number of inserts in query |
qBaseInsert | 1 | int unsigned | Number of bases inserted in query |
tNumInsert | 3 | int unsigned | Number of inserts in target |
tBaseInsert | 1120 | int unsigned | Number of bases inserted in target |
strand | - | char(2) | + or - for strand. First character query, second target (optional) |
qName | AA961327 | varchar(255) | Query sequence name |
qSize | 453 | int unsigned | Query sequence size |
qStart | 16 | int unsigned | Alignment start position in query |
qEnd | 447 | int unsigned | Alignment end position in query |
tName | chr11 | varchar(255) | Target sequence name |
tSize | 110171769 | int unsigned | Target sequence size |
tStart | 39476297 | int unsigned | Alignment start position in target |
tEnd | 39477847 | int unsigned | Alignment end position in target |
blockCount | 5 | int unsigned | Number of blocks in alignment |
blockSizes | 21,4,84,132,189, | longblob | Size of each block |
qStarts | 6,28,32,116,248, | longblob | Start of each block in query. |
tStarts | 39476297,39476318,39476672,... | longblob | Start of each block in target. |
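The bin value in the first field can be reproduced from tStart and tEnd. The sketch below assumes the standard five-level UCSC binning scheme (finest bins 128 kb wide, each coarser level eight times wider); the function and constant names are illustrative, and the extended scheme used for very long sequences is not covered.

```python
# Offsets of the first bin at each level, finest (128 kb) to coarsest (512 Mb).
# Assumption: the standard five-level UCSC binning scheme.
BIN_OFFSETS = (512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0)
BIN_FIRST_SHIFT = 17   # 2**17 bases = 128 kb, width of the finest bins
BIN_NEXT_SHIFT = 3     # each level is 2**3 = 8 times wider than the last


def bin_from_range(start, end):
    """Return the smallest bin that fully contains the 0-based, half-open range [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError("range does not fit the standard binning scheme")


# tStart and tEnd from the example column of the table above.
print(bin_from_range(39476297, 39477847))  # 886
```

A range query can then restrict itself to the handful of bins that could overlap the region before comparing tStart and tEnd, which is what makes the field useful as an index.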

Connected Tables and Joining Fields
bosTau4.all_est.qName (via intronEst.qName)
bosTau4.estOrientInfo.name (via intronEst.qName)
bosTau4.gbCdnaInfo.acc (via intronEst.qName)
bosTau4.gbSeq.acc (via intronEst.qName)
bosTau4.gbStatus.acc (via intronEst.qName)
bosTau4.imageClone.acc (via intronEst.qName)
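Each connection above is a plain equality join on the EST accession. The sketch below illustrates one of them with an in-memory SQLite database rather than the real bosTau4 database: both tables are toy stand-ins, only the joining fields (intronEst.qName and gbCdnaInfo.acc) come from the list above, and the `label` column is hypothetical.

```python
import sqlite3

# Toy stand-ins for the real tables; only qName and acc mirror the schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE intronEst (qName TEXT, tName TEXT, tStart INTEGER, tEnd INTEGER);
    CREATE TABLE gbCdnaInfo (acc TEXT, label TEXT);  -- 'label' is hypothetical
    INSERT INTO intronEst VALUES ('AA961327', 'chr11', 39476297, 39477847);
    INSERT INTO intronEst VALUES ('AA908022', 'chr14', 23167527, 23168756);
    INSERT INTO gbCdnaInfo VALUES ('AA961327', 'toy annotation');
""")

# Join each alignment to its annotation row through the accession.
rows = conn.execute("""
    SELECT e.qName, e.tName, e.tStart, e.tEnd, g.label
    FROM intronEst AS e
    JOIN gbCdnaInfo AS g ON g.acc = e.qName
    ORDER BY e.qName
""").fetchall()
print(rows)  # [('AA961327', 'chr11', 39476297, 39477847, 'toy annotation')]
```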
bin | matches | misMatches | repMatches | nCount | qNumInsert | qBaseInsert | tNumInsert | tBaseInsert | strand | qName | qSize | qStart | qEnd | tName | tSize | tStart | tEnd | blockCount | blockSizes | qStarts | tStarts |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
886 | 429 | 0 | 0 | 1 | 1 | 1 | 3 | 1120 | - | AA961327 | 453 | 16 | 447 | chr11 | 110171769 | 39476297 | 39477847 | 5 | 21,4,84,132,189, | 6,28,32,116,248, | 39476297,39476318,39476672,39477130,39477658, |
886 | 425 | 0 | 0 | 0 | 3 | 6 | 4 | 1121 | + | AA961328 | 451 | 0 | 431 | chr11 | 110171769 | 39476298 | 39477844 | 7 | 23,86,132,137,8,12,27, | 0,23,109,241,382,391,404, | 39476298,39476670,39477130,39477658,39477797,39477805,39477817, |
959 | 475 | 6 | 0 | 10 | 4 | 7 | 12 | 4481 | + | AA908026 | 530 | 32 | 530 | chr11 | 110171769 | 49080309 | 49085281 | 14 | 38,251,88,4,33,3,16,4,11,5,6,11,8,13, | 32,70,321,410,414,447,450,466,470,481,490,497,508,517, | 49080309,49084809,49085061,49085149,49085154,49085188,49085192,49085209,49085214,49085226,49085236,49085244,49085256,49085268, |
761 | 454 | 0 | 0 | 3 | 0 | 0 | 3 | 772 | - | AA908022 | 460 | 3 | 460 | chr14 | 81345643 | 23167527 | 23168756 | 4 | 195,74,100,88, | 0,195,269,369, | 23167527,23168051,23168384,23168668, |
761 | 377 | 3 | 0 | 7 | 2 | 2 | 3 | 772 | - | AA908011 | 409 | 20 | 409 | chr14 | 81345643 | 23167597 | 23168756 | 6 | 56,69,74,100,36,52, | 0,57,126,200,300,337, | 23167597,23167653,23168051,23168384,23168668,23168704, |
968 | 453 | 2 | 0 | 7 | 1 | 1 | 2 | 507 | - | AA908009 | 480 | 17 | 480 | chr19 | 65312493 | 50259966 | 50260935 | 3 | 66,72,324, | 0,67,139, | 50259966,50260280,50260611, |
757 | 294 | 0 | 0 | 6 | 0 | 0 | 2 | 2167 | + | AA961325 | 575 | 17 | 317 | chr21 | 69173390 | 22579779 | 22582246 | 3 | 127,66,107, | 17,144,210, | 22579779,22581188,22582139, |
757 | 295 | 0 | 0 | 3 | 0 | 0 | 2 | 2167 | - | AA961326 | 501 | 10 | 308 | chr21 | 69173390 | 22579781 | 22582246 | 3 | 125,66,107, | 193,318,384, | 22579781,22581188,22582139, |
658 | 615 | 4 | 0 | 4 | 0 | 0 | 5 | 1424 | + | AA908015 | 665 | 8 | 631 | chr23 | 53376148 | 9694325 | 9696372 | 6 | 71,81,149,173,92,57, | 8,79,160,309,482,574, | 9694325,9694483,9694920,9695744,9696222,9696315, |
658 | 465 | 2 | 0 | 1 | 1 | 2 | 4 | 1119 | + | AA908012 | 474 | 0 | 470 | chr23 | 53376148 | 9694326 | 9695913 | 5 | 70,81,149,140,28, | 0,70,151,300,442, | 9694326,9694483,9694920,9695744,9695885, |
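The blockSizes and tStarts lists together describe where each aligned block sits on the target, so the gaps between blocks (candidate introns) can be recovered from a row. The following is a minimal sketch; the function name is illustrative, and it assumes the comma-terminated list format shown in the sample rows.

```python
def target_gaps(block_sizes, t_starts):
    """Return (gap_start, gap_end) for each target gap between consecutive blocks.

    Both arguments are the comma-terminated strings stored in the table.
    """
    sizes = [int(x) for x in block_sizes.split(",") if x]
    starts = [int(x) for x in t_starts.split(",") if x]
    gaps = []
    for i in range(len(starts) - 1):
        gap_start = starts[i] + sizes[i]   # end of block i (exclusive)
        gap_end = starts[i + 1]            # start of block i + 1
        if gap_end > gap_start:            # skip inserts that exist only in the query
            gaps.append((gap_start, gap_end))
    return gaps


# First sample row (AA961327).
gaps = target_gaps("21,4,84,132,189,",
                   "39476297,39476318,39476672,39477130,39477658,")
print(len(gaps), sum(end - start for start, end in gaps))  # 3 1120
```

For this row the result agrees with the stored tNumInsert (3) and tBaseInsert (1120) values.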
Note: all start coordinates in our database are 0-based, not
1-based. See explanation
here.
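As a small illustration of the note above, the sketch below converts a stored range to a 1-based position string. It assumes that end coordinates are exclusive (half-open ranges), which is consistent with the sample rows, where the last tStarts value plus the last block size equals tEnd; the function name is illustrative.

```python
def to_one_based_position(chrom, start, end):
    """Format a 0-based, half-open range as a 1-based, closed position string."""
    # Only the start shifts: an exclusive 0-based end equals the inclusive 1-based end.
    return f"{chrom}:{start + 1}-{end}"


print(to_one_based_position("chr11", 39476297, 39477847))  # chr11:39476298-39477847
```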

Spliced ESTs (intronEst) Track Description
Description
This track shows alignments between the genome and cow expressed sequence tags
(ESTs) in GenBank, limited to ESTs that show signs of splicing when
aligned. ESTs are single-read sequences, typically about
500 bases in length, that usually represent fragments of transcribed genes.
To be considered spliced, an EST must show
evidence of at least one canonical intron, i.e. the genomic
sequence between EST alignment blocks must be at least 32 bases in
length and have GT/AG ends. By requiring splicing, the level
of contamination in the EST databases is drastically reduced
at the expense of eliminating many genuine 3' ESTs.
For a display of all ESTs (including unspliced), see the
cow EST track.
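The splicing criterion above can be sketched as a check on the genomic sequence between two alignment blocks. This is an illustration only, not the code used to build the track; the function names are hypothetical, and accepting the reverse-complemented signal (CT...AC) for introns on the opposite strand is an assumption not stated in the description.

```python
MIN_INTRON_LENGTH = 32  # minimum gap length given in the description


def is_canonical_intron(gap_seq, min_length=MIN_INTRON_LENGTH):
    """True if the genomic gap is long enough and has GT/AG ends.

    Assumption: a gap read from the opposite genomic strand shows the
    reverse-complemented signal CT...AC, so both forms are accepted.
    """
    seq = gap_seq.upper()
    if len(seq) < min_length:
        return False
    forward = seq.startswith("GT") and seq.endswith("AG")
    reverse = seq.startswith("CT") and seq.endswith("AC")
    return forward or reverse


def looks_spliced(gap_seqs):
    """An EST counts as spliced if at least one gap is a canonical intron."""
    return any(is_canonical_intron(seq) for seq in gap_seqs)
```

For example, `looks_spliced(["GT" + "A" * 30 + "AG"])` is true, while a 14-base gap with the same ends fails the length requirement.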
Display Conventions and Configuration
This track follows the display conventions for
PSL alignment tracks. In dense display mode, darker shading
indicates a larger number of aligned ESTs.
The strand information (+/-) indicates the
direction of the match between the EST and the matching
genomic sequence. It bears no relationship to the direction
of transcription of the RNA with which it might be associated.
The description page for this track has a filter that can be used to change
the display mode, alter the color, and include/exclude a subset of items
within the track. This may be helpful when many items are shown in the track
display, especially when only some are relevant to the current task.
To use the filter:
- Type a term in one or more of the text boxes to filter the EST
display. For example, to apply the filter to all ESTs expressed in a specific
organ, type the name of the organ in the tissue box. To view the list of
valid terms for each text box, consult the table in the Table Browser that
corresponds to the factor on which you wish to filter. For example, the
"tissue" table contains all the types of tissues that can be
entered into the tissue text box. Multiple terms may be entered at once,
separated by a space. Wildcards may also be used in the
filter.
- If filtering on more than one value, choose the desired combination
logic. If "and" is selected, only ESTs that match all filter
criteria will be highlighted. If "or" is selected, ESTs that
match any one of the filter criteria will be highlighted.
- Choose the color or display characteristic that should be used to
highlight or include/exclude the filtered items. If "exclude" is
chosen, the browser will not display ESTs that match the filter criteria.
If "include" is selected, the browser will display only those
ESTs that match the filter criteria.
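The combination logic in the steps above can be sketched as follows. This illustrates the and/or and include/exclude behavior only; it is not the browser's implementation. The record fields, the function names, and the use of shell-style wildcards (`*`, `?`) are assumptions.

```python
from fnmatch import fnmatchcase


def field_matches(value, terms):
    """True if value matches any space-separated term (shell-style wildcards)."""
    return any(fnmatchcase(value.lower(), term.lower()) for term in terms.split())


def passes_filter(est, criteria, logic="and"):
    """Combine per-field results with 'and' (all must match) or 'or' (any one)."""
    results = [field_matches(est.get(field, ""), terms)
               for field, terms in criteria.items() if terms]
    if not results:
        return False  # no criteria entered, so nothing counts as a match
    return all(results) if logic == "and" else any(results)


def apply_filter(ests, criteria, logic="and", mode="include"):
    """Return the ESTs to display: 'include' keeps matches, 'exclude' drops them."""
    keep_matches = mode == "include"
    return [est for est in ests
            if passes_filter(est, criteria, logic) == keep_matches]


# Hypothetical annotation records for two ESTs.
ests = [{"name": "est1", "tissue": "liver", "library": "lib1"},
        {"name": "est2", "tissue": "brain", "library": "lib1"}]
print([e["name"] for e in apply_filter(ests, {"tissue": "liv*"})])  # ['est1']
```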
This track may also be configured to display base labeling, a feature that
allows the user to display all bases in the aligning sequence or only those
that differ from the genomic sequence. For more information about this option,
click
here.
Several types of alignment gap may also be colored;
for more information, click
here.
Methods
To make an EST, RNA is isolated from cells and reverse
transcribed into cDNA. Typically, the cDNA is cloned
into a plasmid vector and a read is taken from the 5'
and/or 3' primer. For most — but not all — ESTs, the
reverse transcription is primed by an oligo-dT, which
hybridizes with the poly-A tail of mature mRNA. The
reverse transcriptase may or may not make it to the 5'
end of the mRNA, which may or may not be degraded.
In general, the 3' ESTs mark the end of transcription
reasonably well, but the 5' ESTs may end at any point
within the transcript. Some of the newer cap-selected
libraries cover transcription start reasonably well. Before the
cap-selection techniques
emerged, some projects used random rather than poly-A
priming in an attempt to retrieve sequence distant from the
3' end. These projects were successful at this, but as
a side effect also deposited sequences from unprocessed
mRNA and perhaps even genomic sequences into the EST databases.
Even outside of the random-primed projects, there is a
degree of non-mRNA contamination. Because of this, a
single unspliced EST should be viewed with considerable
skepticism.
To generate this track, cow ESTs from GenBank were aligned
against the genome using blat. Note that the maximum intron length
allowed by blat is 750,000 bases, which may eliminate some ESTs with very
long introns that might otherwise align. When a single
EST aligned in multiple places, the alignment with the
highest base identity was taken as the best. Only alignments with
at least 96% base identity to the genomic sequence and a base identity
within 0.5% of that best alignment are displayed in this track.
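The selection rule can be sketched as below for the alignments of a single EST. The text does not define how base identity is computed, so the formula here (matching bases, including repeat matches, divided by all aligned bases excluding N's) is an assumption, as is reading "within 0.5%" as 0.5 percentage points; the function names are illustrative.

```python
def base_identity(aln):
    """Percent identity over aligned bases (assumed formula, see the note above)."""
    matched = aln["matches"] + aln["repMatches"]
    aligned = matched + aln["misMatches"]
    return 100.0 * matched / aligned if aligned else 0.0


def near_best(alignments, min_identity=96.0, window=0.5):
    """Keep alignments within `window` points of the best and at or above min_identity."""
    if not alignments:
        return []
    best = max(base_identity(a) for a in alignments)
    return [a for a in alignments
            if base_identity(a) >= min_identity
            and best - base_identity(a) <= window]
```

With hypothetical alignments at 99.0%, 98.75%, 98.0% and 95.0% identity, only the first two would be kept: the third falls outside the 0.5-point window and the fourth is below the 96% floor.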
Credits
This track was produced at UCSC from EST sequence data
submitted to the international public sequence databases by
scientists worldwide.
References
Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. GenBank: update. Nucleic Acids Res. 2004 Jan 1;32(Database issue):D23-6.
Kent WJ. BLAT - the BLAST-like alignment tool. Genome Res. 2002 Apr;12(4):656-64.