Row | Data used | What the data is used for | Outcome |
---|---|---|---|
1 | The 998 sequences in dbogap-str and the 1,083 sequences in glycos | Predicting the likelihood of O-GlcNAc glycosylation with sequence and structural data | Table 11 |
2 | The 340 sequences in dbogap-seq and the 361 sequences in Ngly | Predicting the likelihood of O-GlcNAc glycosylation with only sequence data. An O-glycosylated sequence is considered mispredicted if its predicted probability of O-glycosylation is less than 50%; conversely, a sequence that is not O-glycosylated is considered mispredicted if its predicted probability is above 50% | Table 14. About 11% of the sequences in dbogap-seq are mispredicted as not being O-GlcNAc glycosylated, and 9% of the sequences in Ngly are mispredicted as being O-GlcNAc glycosylated
3 | The 259 sequences in Oglc-non-dbogap | Calculating the out-of-sample misprediction rate with the LPM estimated for the exercise outlined in Row 2 of this Table | 54 of the 259 sequences (≈ 21%) are mispredicted as not being O-GlcNAc glycosylated
4 | The 2,079 sequences in Ogal | Calculating the out-of-sample misprediction rate with the LPM estimated for the exercise outlined in Row 2 of this Table | 656 of the 2,079 sequences (≈ 31.6%) are mispredicted as not being O-GalNAc glycosylated
5 | The 236 sequences in WSTW-Uniprot | Checking whether any of these sequences are O-glycosylated | None are O-glycosylated. This again indicates that ~(W – S/T – W) is likely necessary for O-glycosylation
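
The misprediction rates reported in Rows 3 and 4 can be reproduced from the counts in the table. The sketch below is illustrative only: the function names are hypothetical, and the treatment of a predicted probability of exactly 50% (counted here as a positive prediction) is an assumption, since the table only states the rule for probabilities below 50%.

```python
def is_mispredicted(prob: float, is_glycosylated: bool, threshold: float = 0.5) -> bool:
    """Return True if the thresholded prediction disagrees with the label.

    Assumption: a probability equal to the threshold counts as a
    positive (glycosylated) prediction.
    """
    predicted_positive = prob >= threshold
    return predicted_positive != is_glycosylated


def misprediction_rate(mispredicted: int, total: int) -> float:
    """Return the share of mispredicted sequences, in percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * mispredicted / total


# Counts taken from Rows 3 and 4 of the table.
rate_oglc_non_dbogap = misprediction_rate(54, 259)
rate_ogal = misprediction_rate(656, 2079)

print(f"Oglc-non-dbogap: {rate_oglc_non_dbogap:.1f}%")
print(f"Ogal: {rate_ogal:.1f}%")
```

Both values agree with the rounded percentages given in the Outcome column.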