ExoPred Help Page

Universidad Complutense de Madrid - Immunology Department





GENERAL DESCRIPTION

USER GUIDE

    Input

    Options

    Output




GENERAL DESCRIPTION

ExoPred is a web-based tool that implements a Random Forest algorithm trained on a dataset containing 2992 vertebrate exosome proteins collected from EXOCARTA and UNIPROT. Proteins with an N-terminal signal peptide and/or transmembrane regions were discarded. Likewise, sequence similarity was reduced so that no two exosome proteins share more than 80% identity. The dataset also contains 2961 vertebrate non-exosome proteins randomly collected from UNIPROT and subjected to the same criteria as the exosome proteins.

The ExoPred interface has been designed for simple and intuitive use. ExoPred first runs BLASTP [4] against the UNIPROT database and processes the BLAST output to identify the UNIPROT identifier (ID) of protein hits with identity higher than 90% over more than 90% of their entire length. Using these identifiers, ExoPred then retrieves taxa and sub-cellular location information from the UNIPROT annotations and transfers it to the relevant input query proteins. ExoPred also detects proteins with a leader sequence or transmembrane regions using SignalP [1] and TMHMM [2], and predicts sub-cellular locations using PSORT [3]. The model for predicting exosome secretion is only executed on vertebrate proteins without a signal peptide or transmembrane regions.
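
The hit-selection step described above can be sketched as follows. This is only an illustration, not the server's actual code: the BlastHit record, its field names, the hit identifiers and the choice to measure coverage as alignment length divided by hit length are all assumptions made for this example.

```python
from typing import List, NamedTuple


class BlastHit(NamedTuple):
    """Hypothetical record holding the fields needed from one BLASTP hit."""
    hit_id: str              # identifier of the database protein
    percent_identity: float  # percent identity reported by BLAST
    alignment_length: int    # number of aligned positions
    hit_length: int          # full length of the database protein


def is_equivalent(hit: BlastHit,
                  min_identity: float = 90.0,
                  min_coverage: float = 0.90) -> bool:
    """True if the hit exceeds both thresholds quoted in the text.

    Assumption: coverage is approximated as alignment length / hit length.
    """
    coverage = hit.alignment_length / hit.hit_length
    return hit.percent_identity > min_identity and coverage > min_coverage


def select_equivalents(hits: List[BlastHit]) -> List[str]:
    """Return the identifiers of all hits that pass both thresholds."""
    return [hit.hit_id for hit in hits if is_equivalent(hit)]


# Made-up hits for illustration only.
example_hits = [
    BlastHit("HIT_A", 99.0, 570, 575),  # high identity, almost full length
    BlastHit("HIT_B", 95.0, 300, 575),  # high identity, partial coverage
    BlastHit("HIT_C", 85.0, 575, 575),  # full length, identity too low
]
print(select_equivalents(example_hits))
```

Only HIT_A passes in this example, because the other two hits fail either the coverage or the identity threshold.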


USER GUIDE

Input

The input data for ExoPred can be one or several protein sequences in FASTA format, as shown in the example below, which can be pasted or uploaded to the server. All input sequences must be in upper-case letters and in one-letter amino acid code. To run the server, click the "RUN" button.
>7B2_PIG Neuroendocrine protein 7B2 OS=Sus scrofa OX=9823 GN=SCG5 PE=1 SV=2
MVSTMLSGLVLWLTFGWTPALAYSPRTPDRVSETDIQRLLHGVMEQLGIARPRVEYPAHQ
AMNLVGPQSIEGGAHEGLQHLGPFGNIPNIVAELTGDNTPKDFSEDQGYPDPPNPCPIGK
TDDGCLENTPDTAEFSREFQLHQHLFDPEHDYPGLGKWNKKLLYEKMKGGQRRKRRSVNP
YLQGQRLDNVVAKKSVPHFSDEDKDPE
>CD81_CHLAE CD81 antigen OS=Chlorocebus aethiops OX=9534 GN=CD81 PE=2 SV=1
MGVEGCTKCIKYLLFVFNFVFWLAGGVILGVALWLRHDPQTTNLLYLELGDKPAPNTFYV
GIYILIAVGAVMMFVGFLGCYGAIQESQCLLGTFFTCLVILFACEVAAGIWGFVNKDQIA
KDVKQFYDQALQQAVVDDDANNAKAVVKTFHETLDCCGSSTLAALTTSVLKNNLCPSGSN
IISNLLKKDCHQKIDELFSGKLYLIGIAAIVVAVIMIFEMILSMVLCCGIRNSSVY
>TKFC_HUMAN Triokinase/FMN cyclase OS=Homo sapiens OX=9606 GN=TKFC PE=1 SV=2
MTSKKLVNSVAGCADDALAGLVACNPNLQLLQGHRVALRSDLDSLKGRVALLSGGGSGHE
PAHAGFIGKGMLTGVIAGAVFTSPAVGSILAAIRAVAQAGTVGTLLIVKNYTGDRLNFGL
AREQARAEGIPVEMVVIGDDSAFTVLKKAGRRGLCGTVLIHKVAGALAEAGVGLEEIAKQ
VNVVAKAMGTLGVSLSSCSVPGSKPTFELSADEVELGLGIHGEAGVRRIKMATADEIVKL
MLDHMTNTTNASHVPVQPGSSVVMMVNNLGGLSFLELGIIADATVRSLEGRGVKIARALV
GTFMSALEMPGISLTLLLVDEPLLKLIDAETTAAAWPNVAAVSITGRKRSRVAPAEPQEA
PDSTAAGGSASKRMALVLERVCSTLLGLEEHLNALDRAAGDGDCGTTHSRAARAIQEWLK
EGPPPASPAQLLSKLSVLLLEKMGGSSGALYGLFLTAAAQPLKAKTSLPAWSAAMDAGLE
AMQKYGKAAPGDRTMLDSLWAAGQELQAWKSPGADLLQVLTKAVKSAEAAAEATKNMEAG
AGRASYISSARLEQPDPGAVAAAAILRAILEVLQS
>RN5A_HUMAN 2-5A-dependent ribonuclease OS=Homo sapiens OX=9606 GN=RNASEL PE=1 SV=2
MESRDHNNPQEGPTSSSGRRAAVEDNHLLIKAVQNEDVDLVQQLLEGGANVNFQEEEGGW
TPLHNAVQMSREDIVELLLRHGADPVLRKKNGATPFILAAIAGSVKLLKLFLSKGADVNE
CDFYGFTAFMEAAVYGKVKALKFLYKRGANVNLRRKTKEDQERLRKGGATALMDAAEKGH
VEVLKILLDEMGADVNACDNMGRNALIHALLSSDDSDVEAITHLLLDHGADVNVRGERGK
TPLILAVEKKHLGLVQRLLEQEHIEINDTDSDGKTALLLAVELKLKKIAELLCKRGASTD
CGDLVMTARRNYDHSLVKVLLSHGAKEDFHPPAEDWKPQSSHWGAALKDLHRIYRPMIGK
LKFFIDEKYKIADTSEGGIYLGFYEKQEVAVKTFCEGSPRAQREVSCLQSSRENSHLVTF
YGSESHRGHLFVCVTLCEQTLEACLDVHRGEDVENEEDEFARNVLSSIFKAVQELHLSCG
YTHQDLQPQNILIDSKKAAHLADFDKSIKWAGDPQEVKRDLEDLGRLVLYVVKKGSISFE
DLKAQSNEEVVQLSPDEETKDLIHRLFHPGEHVRDCLSDLLGHPFFWTWESRYRTLRNVG
NESDIKTRKSESEILRLLQPGPSEHSKSFDKWTTKINECVMKKMNKFYEKRGNFYQNTVG
DLLKFIRNLGEHIDEEKHKKMKLKIGDPSLYFQKTFPDLVIYVYTKLQNTEYRKHFPQTH
SPNKPQCDGAGGASGLASPGC
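
Before submitting, it may help to check locally that the sequences follow the format requirements above. The sketch below is not part of ExoPred; the function names are hypothetical, and it assumes that only the 20 standard amino acid letters are accepted, which the server may or may not enforce.

```python
from typing import List, Tuple

# Assumption: only the 20 standard one-letter amino acid codes are valid.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def parse_fasta(text: str) -> List[Tuple[str, str]]:
    """Split FASTA text into (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def invalid_residues(sequence: str) -> List[str]:
    """Return the sorted characters that are not upper-case standard residues."""
    return sorted(set(sequence) - VALID_RESIDUES)


# Made-up input: the second record is lower-case and should be flagged.
example = ">query_1 hypothetical\nMVSTMLSGLV\nLWLTF\n>query_2 hypothetical\nmvstmlsglv\n"
for name, seq in parse_fasta(example):
    problems = invalid_residues(seq)
    print(name, "OK" if not problems else f"invalid characters: {problems}")
```

Lower-case letters are reported as invalid because the check compares against upper-case codes only.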

Options

In ExoPred, users can also choose to:
  • Retrieve the sub-cellular location of input proteins as annotated in UNIPROT.
  • Predict the sub-cellular location using PSORT (version II) [3].
These two options are selected by default.


Output

Here follows a representative output with all the information the user can obtain.

Seq # | Protein Name | Vertebrata | Protein ID | Leader Seq | Transmembrane | Uniprot Subcellular location             | Psort Prediction                   | ExoPred
1     | 7B2_PIG      | Y          | P01165.2   | Y          | N             | Extracellular                            | Extracellular, including cell wall | NA
2     | CD81_CHLAE   | Y          | O97703.1   | N          | Y             | Plasma Membrane                          | Endoplasmic reticulum              | NA
3     | TKFC_HUMAN   | Y          | Q3LXA3.2   | N          | N             | Nucleus; Cytosol; Extracellular; Exosome | Cytoplasmic                        | Y
4     | RN5A_HUMAN   | Y          | Q05823.2   | N          | N             | Cytosol; Mitochondrion                   | Mitochondrial                      | N

The ExoPred output consists of a table that reports by default whether each input protein is from a vertebrate (Y/N), contains a signal peptide (Y/N) or transmembrane regions (Y/N), and can be secreted via exosomes (Y/N). As shown above, ExoPred also displays the sub-cellular location of input proteins as annotated in UNIPROT and as predicted by PSORT if the relevant options were checked at submission. Exosome secretion predictions are shown as NA (not available) for input proteins that do not meet the criteria mentioned above (vertebrate origin, no signal peptide and no transmembrane regions). For proteins without UNIPROT equivalents, ExoPred still determines whether they can be secreted by exosomes as long as they have no predicted signal peptide or transmembrane regions. In these cases, the taxa field, and the UNIPROT sub-cellular location field when selected, are shown as not found.
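
The rules that decide what appears in the ExoPred column can be summarised in a small sketch. This is one reading of the description above, not the server's code: the function name and arguments are hypothetical, and a protein without a UNIPROT equivalent is represented here by vertebrate=None.

```python
from typing import Optional


def exopred_column(vertebrate: Optional[bool],
                   leader_seq: bool,
                   transmembrane: bool,
                   model_predicts_exosome: bool) -> str:
    """Return the value shown in the ExoPred column: 'Y', 'N' or 'NA'.

    vertebrate is True/False when taxa were retrieved from UNIPROT and
    None when no UNIPROT equivalent was found (assumed encoding).
    """
    # A signal peptide or transmembrane region always excludes the protein.
    if leader_seq or transmembrane:
        return "NA"
    # Proteins annotated as non-vertebrate are excluded as well.
    if vertebrate is False:
        return "NA"
    # Vertebrate proteins, and proteins with unknown taxa, get a prediction.
    return "Y" if model_predicts_exosome else "N"


# Rows of the representative output above, plus one protein without taxa.
print(exopred_column(True, True, False, False))   # 7B2_PIG
print(exopred_column(True, False, True, False))   # CD81_CHLAE
print(exopred_column(True, False, False, True))   # TKFC_HUMAN
print(exopred_column(True, False, False, False))  # RN5A_HUMAN
print(exopred_column(None, False, False, True))   # no UNIPROT equivalent
```

The last argument stands in for the Random Forest result, which is only consulted when the protein is eligible.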




1) Nielsen, H. (2017). Predicting secretory proteins with SignalP. In Kihara, D. (ed.), Protein Function Prediction (Methods in Molecular Biology, vol. 1611), pp. 59–73. Springer.
2) Krogh, A., Larsson, B., von Heijne, G., & Sonnhammer, E. L. L. (2001). Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. Journal of Molecular Biology, 305(3), 567–580.
3) Horton, P., Park, K. J., Obayashi, T., Fujita, N., Harada, H., Adams-Collier, C. J., & Nakai, K. (2007). WoLF PSORT: protein localization predictor. Nucleic Acids Research, 35(Web Server issue), W585–W587.
4) Altschul, S. F., Gish, W., Miller, W., Myers, E. W., & Lipman, D. J. (1990). Basic local alignment search tool. Journal of Molecular Biology, 215(3), 403–410.



Please cite:

Ras-Carmona, A., Gomez-Perosanz, M., & Reche, P. A. (2021). Prediction of unconventional protein secretion by exosomes. BMC bioinformatics, 22(1), 333. https://doi.org/10.1186/s12859-021-04219-z



For questions about this site: