Bayesian sensor
New!
Our new Bayesian splice site (SS) sensor has been shown to outperform the contemporary
Maximum Entropy Sensor on 5'
SS for several representative test sets, such as a set of 250 human genes and short 5' UTR human gene fragments excluded
from the learning set, as well as a collection of 183 rat genes. On the same test sets, the performance of our new Bayesian 3'
SS sensor is as good as or better than that of the
Maximum Entropy Sensor.
Please refer to our CSB2005 poster for details.
Our implementation of the Bayesian SS sensor is available as a Perl wrapper.
We believe that the sensor's performance could generalize to a broad spectrum of
tetrapod organisms, since the genes responsible for recognition of splicing motifs
(such as those encoding U1 and U2 snRNPs) are among the most conserved genes known.
SpliceScan
New!
Our new ab initio gene annotation tool, SpliceScan, exploits interactions between different splicing signals.
Our prediction relies on Donor/Donor, Donor/Acceptor, Acceptor/Acceptor and Acceptor/Donor interactions plus a set of ESE/ISE/ESS
signals that substantially enhance prediction quality. The tool was conceived as a splicing simulator with objectives
similar to those of the ExonScan engine. Our
tool explores a different approach to splice site detection, based on splice site definition.
If available, it includes information related to the exon and intron definition models, combined with
the predictions of the Bayesian splice site sensor. This approach is especially efficient for short pre-mRNA fragments.
SpliceScan has been shown to perform better than the SpliceView,
GeneSplicer,
NNSplice,
NetUTR and Genio tools
on the ROC curve for the test set of 250 human genes excluded from the learning set and for the collection of 183 rat genes.
For the test set of short 5' UTR human gene fragments, with cross-correlation
removed between the SpliceScan learning set and the test set, our tool outperforms all the contemporary gene structure
prediction methods, as can be seen here.
- Run the SpliceScan tool online for 5' SS (donor) and
3' SS (acceptor)
- Download the SpliceScan tool
- Poster presented at
CSB2005 and short
report
- Recent article in the Biology Direct journal and my
dissertation, which explain SpliceScan in more detail
- Updated ROC diagrams for different applications. In this experiment, cross-correlation has been removed between the learning and test sets.
- The ROC curves were obtained using our online web crawling application,
which is capable of querying test sequences against various web tools and parsing the results. Please note the difference between the standard
ROC curves (False positive fraction vs. True positive fraction) and the ROC curves we use (Sensitivity vs. 1 - Specificity).
- We used the MHMMotif tool to learn some of the ESE/ISE motifs used by SpliceScan
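The difference between the two kinds of ROC axes mentioned in the list above can be illustrated with a small Python sketch. It rests on an assumption that this page does not state explicitly: that Specificity follows the convention common in gene prediction, TP / (TP + FP), so that 1 - Specificity is the fraction of predictions that are wrong. The function name `roc_points` and the toy scores and labels are hypothetical and are not taken from SpliceScan or from the web crawling application.

```python
from typing import List, Sequence, Tuple


def roc_points(
    scores: Sequence[float],
    labels: Sequence[bool],
    thresholds: Sequence[float],
) -> List[Tuple[float, float, float, float]]:
    """Return (threshold, sensitivity, false positive fraction, 1 - specificity).

    Assumption: specificity = TP / (TP + FP), the gene prediction convention.
    A candidate site is predicted positive when its score >= threshold.
    """
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    points = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        # Sensitivity is the same as the true positive fraction.
        sensitivity = tp / positives if positives else 0.0
        # Standard ROC x-axis: false positives over all true negatives.
        fpf = fp / negatives if negatives else 0.0
        # Assumed x-axis here: false positives over all predictions made.
        one_minus_sp = fp / (tp + fp) if (tp + fp) else 0.0
        points.append((t, sensitivity, fpf, one_minus_sp))
    return points


# Hypothetical toy data: 2 real sites among 8 candidate sites.
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2, 0.1, 0.05]
labels = [True, True, False, False, False, False, False, False]

for t, sn, fpf, one_minus_sp in roc_points(scores, labels, [0.5]):
    print(f"threshold={t} Sn={sn:.3f} FPF={fpf:.3f} 1-Sp={one_minus_sp:.3f}")
```

With many more non-sites than real sites, as is typical for splice site candidates, the two x-axes can differ considerably at the same threshold, so curves drawn under the two conventions are not directly comparable.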
GIGOgene engine
GIGOgene test results
We used the following gene structure prediction quality test
framework to obtain our data.
Exon-level precision based on the Genie
learning set of 462 human genes. Our GIGOgene application
outperforms the contemporary
homology-based annotation tools we have looked at in terms of exon-level
sensitivity and specificity.
| Program | TE | AE | PE | ESn | ESp |
| Galahad | 4744 | 4909 | 4790 | 96.64% | 99.04% |
| Spidey | 4827 | 4909 | 4847 | 98.33% | 99.59% |
| EST2Genome | 4742 | 4909 | 4752 | 96.60% | 99.79% |
| Sim4 | 4837 | 4909 | 4845 | 98.53% | 99.83% |
| BLAT | 4832 | 4909 | 4902 | 98.43% | 98.57% |
| GIGOgene | 4864 | 4909 | 4865 | 99.08% | 99.98% |
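The percentage columns can be recomputed from the count columns. The sketch below assumes that TE is the number of correctly predicted exons, AE the number of annotated exons, and PE the number of predicted exons, so that ESn = TE / AE and ESp = TE / PE. These expansions are not spelled out on this page, but the arithmetic agrees with the figures reported above. The class and function names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExonCounts:
    te: int  # assumed meaning: predicted exons that match an annotated exon
    ae: int  # assumed meaning: annotated exons in the test set
    pe: int  # assumed meaning: exons predicted by the program


def exon_sensitivity(c: ExonCounts) -> float:
    """ESn: fraction of annotated exons that were predicted correctly."""
    return c.te / c.ae


def exon_specificity(c: ExonCounts) -> float:
    """ESp: fraction of predicted exons that are correct."""
    return c.te / c.pe


# Counts copied from the table for the 462-gene test set.
RESULTS = {
    "Galahad": ExonCounts(4744, 4909, 4790),
    "Spidey": ExonCounts(4827, 4909, 4847),
    "EST2Genome": ExonCounts(4742, 4909, 4752),
    "Sim4": ExonCounts(4837, 4909, 4845),
    "BLAT": ExonCounts(4832, 4909, 4902),
    "GIGOgene": ExonCounts(4864, 4909, 4865),
}

for name, counts in RESULTS.items():
    sn = exon_sensitivity(counts)
    sp = exon_specificity(counts)
    print(f"{name:<11} ESn={sn:.2%} ESp={sp:.2%}")
```

Note that ESp under this reading is a precision-style measure (correct predictions over all predictions), which is the usual convention in gene structure prediction.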
We have compared the performance of different programs on human
genes containing microexons. First, we parsed gene structures for the
whole human genome to find genes containing microexons (2-11 nt). Then
we carefully checked that the splice sites in the predicted genomic
structures were canonical. We compared the predicted structures with the
annotations of other programs and obtained the following results in terms of
exon-level sensitivity and specificity:
| Program | TE | AE | PE | ESn | ESp |
| Galahad | 1220 | 1422 | 1278 | 85.79% | 95.46% |
| Spidey | 1251 | 1422 | 1334 | 87.97% | 93.78% |
| EST2Genome | 1270 | 1422 | 1318 | 89.31% | 96.36% |
| Sim4 | 1278 | 1422 | 1326 | 89.87% | 96.38% |
| BLAT | 1375 | 1422 | 1424 | 96.69% | 96.56% |
| GIGOgene | 1420 | 1422 | 1422 | 99.86% | 99.86% |
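The selection step for the microexon test set (find genes with a 2-11 nt exon, then check that the splice sites are canonical) could be sketched as follows. This is a minimal illustration and not the GIGOgene code: it assumes forward-strand genes, exons given as 0-based half-open (start, end) coordinates on the genomic sequence, and "canonical" meaning GT at the start and AG at the end of every intron. All function names and the toy sequences are hypothetical.

```python
from typing import List, Sequence, Tuple

Exon = Tuple[int, int]  # 0-based, half-open (start, end) on the genomic sequence

MICROEXON_MIN_NT = 2
MICROEXON_MAX_NT = 11


def has_microexon(exons: Sequence[Exon]) -> bool:
    """True if any exon is between 2 and 11 nt long (inclusive)."""
    return any(
        MICROEXON_MIN_NT <= end - start <= MICROEXON_MAX_NT
        for start, end in exons
    )


def introns(exons: Sequence[Exon]) -> List[Exon]:
    """Intervals between consecutive exons, in genomic order."""
    ordered = sorted(exons)
    return [
        (prev_end, next_start)
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]


def splice_sites_canonical(genomic: str, exons: Sequence[Exon]) -> bool:
    """True if every intron starts with GT and ends with AG (assumed rule)."""
    for start, end in introns(exons):
        intron = genomic[start:end].upper()
        if len(intron) < 4:
            return False
        if not (intron.startswith("GT") and intron.endswith("AG")):
            return False
    return True


# Hypothetical toy gene: exon, intron, 3 nt microexon, intron, exon.
parts = ["ATGGCTGCTAAAGCT", "GTAAGTTTTTCAG", "GAT", "GTGAGTCCCTAG", "GCTGCTAAAGCTTAA"]
genomic = "".join(parts)

exons = []
offset = 0
for index, part in enumerate(parts):
    if index % 2 == 0:  # even-numbered parts are exons
        exons.append((offset, offset + len(part)))
    offset += len(part)

keep = has_microexon(exons) and splice_sites_canonical(genomic, exons)
print("gene kept for the microexon test set:", keep)
```

Genes on the reverse strand would need their sequence reverse-complemented before the GT/AG check, which is omitted here for brevity.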
Contact e-mail: Alexander Churbanov