Distribution of short oligopeptides in a dataset of selected polypeptides
- MOJ Proteomics & Bioinformatics
Varun Ravishankar,1 Natasha Kelkar,1,2 Nachiket Pathak,1 Rutuj Kolhe,1 Onkar Ghuge,1 Shantanu Madiwale,1 Dhanashree Deore,1 Anupam Saraph,3 Milner Kumar,4 Anil Gore,5 SP Modak6,7
DNA bases act as an alphabet, and nucleotide triplets, each representing an amino acid or a punctuation mark, dictate the order and frequency of occurrence of the amino acids in a newly synthesized polypeptide. The existence of the triplet code in DNA raises the possibility that there is another code, or linguistic formulation, in which the 20 amino acids serve as letters and which dictates the frequency and serial order of amino acids along different polypeptide strings. With this in mind, we created a database of di-, tri-, tetra- and pentapeptides and examined the distribution and frequency of occurrence of different types of short oligopeptides in a set of 51,865 polypeptide sequences selected from the Swiss-Prot database.
Keywords: di-, tri-, tetra- and pentapeptides, oligopeptide matrices, forbidden oligopeptides, clustering algorithm
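The kind of tally described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy sequences, and the choice to count overlapping windows are all assumptions made for illustration. It also assumes that a "forbidden" oligopeptide means one of the 20^k possible k-residue strings that never occurs in the dataset.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids as one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def count_oligopeptides(sequences, k):
    """Count every overlapping window of k residues across all sequences.

    Hypothetical helper: overlapping windows are an assumption, not
    something the abstract specifies.
    """
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            window = seq[i:i + k]
            # Skip windows containing non-standard symbols such as X, B or Z.
            if all(res in AMINO_ACIDS for res in window):
                counts[window] += 1
    return counts


def absent_oligopeptides(counts, k):
    """List the k-residue strings, out of 20**k possible, never observed."""
    all_kmers = ("".join(p) for p in product(AMINO_ACIDS, repeat=k))
    return [kmer for kmer in all_kmers if kmer not in counts]


# Toy input for illustration only; real input would be Swiss-Prot sequences.
toy_sequences = ["MKVLAAGIV", "MKVLSAGIV"]

dipeptides = count_oligopeptides(toy_sequences, 2)
missing = absent_oligopeptides(dipeptides, 2)

print(dipeptides.most_common(3))
print(len(missing), "of", 20 ** 2, "possible dipeptides are absent")
```

The same two functions apply unchanged for k = 3, 4 and 5, although the space of possible pentapeptides (20^5 = 3,200,000) is large enough that enumerating absent ones takes noticeably longer.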