Download | View accepted manuscript: Guide to Threshold Selection for Motif Prediction Using Positional Weight Matrix (PDF, 435 KiB) |
---|
Author | Pan, Youlian; Phan, Sieu |
---|
Format | Text, Article |
---|
Conference | International MultiConference of Engineers and Computer Scientists 2008 (IMECS'08) at The 2008 IAENG International Conference on Bioinformatics (ICB'08), March 19-21, 2008, Hong Kong, China |
---|
Subject | sequence motif; motif; positional weight matrix; matrice position-poids; log-odd score; score log-odd; statistical expectation; espérance statistique; goodness-of-fit; validité de la concordance |
---|
Abstract | In biological sequence research, the positional weight matrix (PWM) is often used to search for putative transcription factor binding sites. A log-odd score is usually applied to measure the closeness of a subsequence to the PWM. However, the log-odd score depends on motif length, so no single threshold applies universally. In this paper, we propose an alternative scoring index (G) that varies from zero, where the subsequence is not much different from the background, to one, where the subsequence best fits the PWM. We also propose a measure for evaluating the statistical expectation at each G index. We investigated the PWMs from the TRANSFAC database and found that the statistical expectation is significantly (p < 0.0001) correlated with both the length of the PWMs and the threshold G value. We applied this method to two PWMs (GCN4_C and ROX1_Q6) of yeast transcription factor binding sites and two PWMs (HIC1-02, HIC1_03) of the human tumor suppressor (HIC-1) binding sites from the TRANSFAC database. Finally, our method compares favorably with the broadly used Match method. The results indicate that our method is more flexible and can provide better confidence. |
---|
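The abstract contrasts a length-dependent log-odd score with an index bounded between zero and one. The sketch below illustrates that contrast only. The abstract does not give the formula for G, so `normalized_index` is a hypothetical stand-in (the score divided by the best achievable score, clipped at zero), not the authors' definition. The function names, the uniform background, the pseudocount, and the toy count matrix are all assumptions made for illustration.

```python
import math

BASES = "ACGT"


def log_odds_matrix(counts, background=None, pseudocount=1.0):
    """Convert per-position base counts into log2-odds weights.

    The uniform background and the pseudocount are illustrative assumptions.
    """
    background = background or {b: 0.25 for b in BASES}
    matrix = []
    for column in counts:
        total = sum(column[b] for b in BASES) + pseudocount * len(BASES)
        matrix.append({
            b: math.log2(((column[b] + pseudocount) / total) / background[b])
            for b in BASES
        })
    return matrix


def log_odds_score(matrix, subsequence):
    """Sum the per-position weights; the range grows with motif length."""
    return sum(col[base] for col, base in zip(matrix, subsequence))


def normalized_index(matrix, subsequence):
    """Hypothetical 0-to-1 index, NOT the paper's G.

    Divides the score by the best achievable score and clips negative
    values (worse than background) to zero.
    """
    best = sum(max(col.values()) for col in matrix)
    if best <= 0:
        return 0.0
    return max(0.0, log_odds_score(matrix, subsequence) / best)


def scan(matrix, sequence, threshold):
    """Return (start, window, index) for windows at or above the threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        index = normalized_index(matrix, window)
        if index >= threshold:
            hits.append((start, window, index))
    return hits


# Toy count matrix invented for this example; it is not taken from TRANSFAC.
TOY_COUNTS = [
    {"A": 8, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 8, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 8, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 8},
]
```

Calling `scan(log_odds_matrix(TOY_COUNTS), "GGACGTGG", 0.9)` reports the single window that matches the toy consensus. Because the index is bounded, the same threshold can be reused across matrices of different widths, which is the motivation the abstract gives for moving away from a raw log-odd cutoff.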
Publication date | 2008 |
---|
Language | English |
---|
NRC number | NRCC 49881 |
---|
NPARC number | 8913400 |
---|
Record identifier | 47384da9-c7d2-41af-a3ef-13a49f0b3202 |
---|
Record created | 2009-04-22 |
---|
Record modified | 2020-08-12 |
---|