In eukaryotes, DNA methylation plays a vital role in the regulation of gene expression. Furthermore, it has been observed that abnormal DNA methylation can lead to coordinated changes in the expression of genes, resulting in cancer growth and metastasis. The most common type of DNA modification is the methylation of cytosine in the CpG dinucleotide. Methylation in non-CpG sequences is less frequent, representing up to 15-20% of total 5-methylcytosine. CpGs occur on average once per 80 dinucleotides throughout most of the genome. However, there are regions of the genome where the CpG frequency is around five times higher than the average. These regions are known as CpG islands and comprise 1-2% of the genome. CpG islands have a high G+C content (greater than 50%) and range in size from 200 bp to several thousand base pairs.
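The CpG island criteria above (length of at least 200 bp, G+C content above 50%, CpG frequency around five times the genome average of one per 80 dinucleotides) can be illustrated with a minimal Python sketch. The function names and the way the three criteria are combined are assumptions made for illustration; this is not the prediction method used by the server.

```python
# Illustrative only: a rough check of the CpG island criteria described above.
# Function names and the exact cut-offs used here are assumptions.

GENOME_AVERAGE_CPG = 1 / 80  # about one CpG per 80 dinucleotides


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_density(seq: str) -> float:
    """CpG dinucleotides per dinucleotide position in the sequence."""
    seq = seq.upper()
    positions = len(seq) - 1
    if positions <= 0:
        return 0.0
    # "CG" cannot overlap itself, so str.count gives the exact number.
    return seq.count("CG") / positions


def looks_like_cpg_island(seq: str, min_len: int = 200,
                          min_gc: float = 0.5,
                          enrichment: float = 5.0) -> bool:
    """True if the sequence meets all three criteria from the text."""
    return (len(seq) >= min_len
            and gc_fraction(seq) > min_gc
            and cpg_density(seq) >= enrichment * GENOME_AVERAGE_CPG)
```

For example, `looks_like_cpg_island("CG" * 100)` is true, whereas an A+T-only sequence of the same length fails both the G+C and the CpG frequency tests.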
The main function of DNA methylation is the repression of gene expression. Methylation of CpGs may disrupt the binding of certain transcription factors. In addition, DNA methylation promotes the association of specific 5-methylcytosine-binding proteins and other structural proteins with the DNA, which packs the DNA into a structure that is inaccessible to transcription factors. Not surprisingly, it has been suggested that methylation patterns may compartmentalize the genome into transcriptionally active (non-methylated) and inactive (methylated) regions. DNA methylation can alter the flow of genetic information and reprogram genome function, and it is therefore recognized as the major epigenetic modification. Genomic methylation patterns in non-dividing, differentiated somatic cells are generally stable and heritable. However, there are instances where methylation patterns undergo significant changes that alter the phenotype.
Users can enter a nucleotide sequence in one of the standard formats, such as GenBank, EMBL, GCG, or plain format. The sequence can either be pasted into the text area or uploaded directly as a sequence file. All characters other than the four nucleotide bases (adenine, guanine, cytosine and thymine) are ignored. The method accepts only a single sequence per prediction run, and each query is limited to 10,000 nucleotides.
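The input rules above can be sketched in a few lines of Python. This is a hypothetical illustration, not the server's code: it assumes the input is already in plain format (GenBank, EMBL and GCG files carry headers that would need to be parsed first), that lowercase bases are accepted, and that an over-length query is rejected with an error. The name `clean_sequence` is invented for the example.

```python
import re

MAX_LENGTH = 10_000  # per-query limit stated in the text


def clean_sequence(raw: str) -> str:
    """Keep only A, C, G and T from a plain-format sequence string.

    Assumptions (not confirmed by the source): lowercase input is
    accepted, and queries over the limit raise an error.
    """
    # Drop whitespace, digits, ambiguity codes and any other characters.
    seq = re.sub(r"[^ACGT]", "", raw.upper())
    if len(seq) > MAX_LENGTH:
        raise ValueError(
            f"sequence has {len(seq)} nucleotides; limit is {MAX_LENGTH}"
        )
    return seq
```

For instance, `clean_sequence("acg tn\n12CG")` returns `"ACGTCG"`: the space, the `n`, the newline and the digits are dropped.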
The method allows the user to select the threshold used for assigning methylation sites. The threshold can vary from -1.5 to 1.5. The higher the threshold, the higher the specificity of the results, although some methylation sites might go undetected. Conversely, results obtained at low thresholds may contain a large number of false positives. It is therefore advisable to use the optimum threshold (-0.455), at which sensitivity and specificity were equal. The sensitivity and specificity at different thresholds are shown in the graph below.
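The trade-off described above can be made concrete with a small Python sketch. It assumes that each candidate cytosine receives a numeric score on the same scale as the threshold and that a site is called methylated when its score is at or above the threshold. The inclusive comparison and the function names are assumptions, and the scores in the usage note are made up for illustration.

```python
DEFAULT_THRESHOLD = -0.455  # optimum threshold quoted in the text


def call_sites(scores, threshold=DEFAULT_THRESHOLD):
    """Return True for each score called methylated at this threshold."""
    if not -1.5 <= threshold <= 1.5:
        raise ValueError("threshold must lie between -1.5 and 1.5")
    # Assumption: a score equal to the threshold counts as methylated.
    return [score >= threshold for score in scores]


def sensitivity_specificity(scores, labels, threshold=DEFAULT_THRESHOLD):
    """Compute (sensitivity, specificity) against known True/False labels."""
    calls = call_sites(scores, threshold)
    tp = sum(1 for c, l in zip(calls, labels) if c and l)
    fn = sum(1 for c, l in zip(calls, labels) if not c and l)
    tn = sum(1 for c, l in zip(calls, labels) if not c and not l)
    fp = sum(1 for c, l in zip(calls, labels) if c and not l)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity
```

With the made-up scores `[-1.0, -0.5, 0.2, 0.9]` and labels `[False, False, True, True]`, a threshold of 1.0 calls no sites, so sensitivity drops to 0 while specificity stays at 1. A threshold of -1.5 calls every site, which reverses the two values.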
After the analysis, the results are shown in a user-friendly format. They include a summary of the submitted sequence in terms of its length, the selected threshold and the time of prediction. The predicted methylated cytosines are shown in bold red type to distinguish them from the remaining cytosines, as shown below.