What is WCOACH?

WCOACH [1] is a graph clustering algorithm for detecting protein complexes in weighted protein-protein interaction (PPI) networks. Protein complex prediction is a challenging problem in computational biology. WCOACH is a modification of the COACH [2] method that addresses this problem in weighted PPI networks.

In our experiments, the weight of each interaction is the semantic similarity between its two proteins, based on the Gene Ontology (GO) structure. The semantic similarity is calculated with the csbl.go package using Lin's method [3]. However, the algorithm works with any weighted network and extracts significant modules based on the interaction weights.
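As an illustration of the kind of weight described above, here is a minimal sketch of Lin's term-level similarity [3], which compares two terms through the information content of their most informative common ancestor. The function name and its arguments are hypothetical and are not part of csbl.go; how csbl.go combines term-level scores into a single score for a protein pair is not covered here.

```python
import math


def lin_similarity(p_a, p_b, p_common):
    """Lin (1998) similarity between two terms.

    p_a, p_b  -- probability (relative annotation frequency) of each term
    p_common  -- probability of their most informative common ancestor
    All probabilities must lie in (0, 1].
    """
    ic_a = -math.log(p_a)            # information content of term a
    ic_b = -math.log(p_b)            # information content of term b
    ic_common = -math.log(p_common)  # information content of the ancestor
    if ic_a + ic_b == 0:
        # Both terms carry no information (probability 1).
        # Returning 0.0 here is an assumption of this sketch.
        return 0.0
    return 2.0 * ic_common / (ic_a + ic_b)
```

The score lies between 0 and 1: it is 1 when the two terms coincide with their common ancestor and approaches 0 when the only shared ancestor is very general.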

How to run the application

To use the algorithm, download the installation file from the Download menu (the source code is also available). The software runs only on Windows and requires .NET Framework 4 or higher. After installing the application, run the algorithm with the following steps:

Cluster your networks with WCOACH:

1. Import the input network by pressing the Import Weighted Network or Import Unweighted Network button. The algorithm can also be run on unweighted networks; for these, select the Unweighted radio button at the top of the main form (see Image 1).

Input format: The input network should be a text file with 3 columns (2 columns for unweighted networks). Each row is an edge (interaction). Column 1 contains the source vertices, column 2 the destination vertices, and column 3 the weights. The elements of each row should be separated by a tab. Some weighted PPI networks of yeast are available from the Download menu.

2. The algorithm has 2 parameters that can be set in the main form:
Neighborhood affinity threshold: The algorithm applies this threshold to control the overlap between predicted cores. The default is 0.85.
Minimum size of clusters: Each output cluster (protein complex) contains at least this many proteins. The default is 3.

3. After importing an input network and setting the thresholds, press the Start Analysis button to detect protein complexes. When the analysis finishes, a message box reports that detection is complete and shows the number of predicted clusters.

4. The results can be saved to a text file by pressing the Save Results button.
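If you prepare network files with a script, the input format from step 1 can be checked before importing. The following is a minimal Python sketch, not part of the WCOACH application: the function name read_network is hypothetical, and treating edges as undirected and giving unweighted edges a weight of 1.0 are assumptions made for this example.

```python
def read_network(lines, weighted=True):
    """Parse tab-separated edge rows into {(vertex_a, vertex_b): weight}.

    lines    -- any iterable of text rows, e.g. an open file
    weighted -- True for 3-column rows, False for 2-column rows
    """
    expected = 3 if weighted else 2
    edges = {}
    for line_number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # ignore blank rows
        fields = line.split("\t")
        if len(fields) != expected:
            raise ValueError(
                f"row {line_number}: expected {expected} tab-separated "
                f"columns, found {len(fields)}"
            )
        source, target = fields[0].strip(), fields[1].strip()
        # Assumption: unweighted edges get weight 1.0.
        weight = float(fields[2]) if weighted else 1.0
        # Assumption: interactions are undirected, so the pair is sorted.
        edges[tuple(sorted((source, target)))] = weight
    return edges


# Made-up sample rows (P1, P2, P3 are placeholder protein names).
sample = ["P1\tP2\t0.82\n", "P2\tP3\t0.40\n"]
network = read_network(sample)
```

A row with the wrong number of columns raises a ValueError that names the offending row, which makes formatting mistakes easy to locate before the file is loaded in the application.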

Image 1: Main form of the application

Evaluation results:

To evaluate the results, press the Evaluation button in the main form. The application can calculate Precision, Recall, F-measure, the number of predicted clusters that match real complexes (Ncp), and the number of real complexes that match predicted clusters (Ncb) with the following steps (see Image 2):

1. First, import the predicted clusters of WCOACH or another algorithm by pressing the Import Predicted Clusters button. If the Use Results of WCOACH radio button is selected, the current WCOACH results are loaded. To load results from another file, select the From a file radio button.

2. Import a benchmark set file by pressing the Import Benchmark Complexes button. The real complexes of yeast (CYC2008) [4] are available in the Download menu.

Input format: The file loaded for evaluation should be in text format. Each row contains one cluster, with its vertices separated by tabs. The benchmark set file should be in the same format.

3. Set the Overlap Threshold and press the Start Analysis button to obtain the results. The Overlap Threshold controls how much a predicted cluster and a real complex must overlap to count as a match; the default is 0.5.

4. The evaluation results can be saved to a text file by pressing the Save Results button.
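The evaluation criteria above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the application's own code: it assumes the overlap between a predicted cluster p and a real complex b is scored as |p ∩ b|^2 / (|p| * |b|), the neighborhood affinity score used in the COACH paper [2], and that Precision = Ncp / (number of predicted clusters), Recall = Ncb / (number of real complexes), and F-measure is their harmonic mean. All function names are hypothetical.

```python
def parse_clusters(lines):
    """Read one cluster per row, with members separated by tabs."""
    clusters = []
    for line in lines:
        members = {name.strip() for name in line.split("\t") if name.strip()}
        if members:  # skip blank rows
            clusters.append(members)
    return clusters


def overlap_score(a, b):
    """Assumed overlap score: |a & b|^2 / (|a| * |b|)."""
    shared = len(a & b)
    return shared * shared / (len(a) * len(b))


def evaluate(predicted, benchmark, threshold=0.5):
    """Return Ncp, Ncb, Precision, Recall and F-measure as a dict."""
    # Predicted clusters that match at least one real complex.
    ncp = sum(
        1 for p in predicted
        if any(overlap_score(p, b) >= threshold for b in benchmark)
    )
    # Real complexes that match at least one predicted cluster.
    ncb = sum(
        1 for b in benchmark
        if any(overlap_score(p, b) >= threshold for p in predicted)
    )
    precision = ncp / len(predicted) if predicted else 0.0
    recall = ncb / len(benchmark) if benchmark else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return {
        "Ncp": ncp,
        "Ncb": ncb,
        "Precision": precision,
        "Recall": recall,
        "F-measure": f_measure,
    }


# Made-up clusters with placeholder protein names.
predicted = parse_clusters(["A\tB\tC\n", "X\tY\tZ\n"])
benchmark = parse_clusters(["A\tB\tC\tD\n", "M\tN\tO\n", "Q\tR\tS\n"])
scores = evaluate(predicted, benchmark, threshold=0.5)
```

In this made-up example only the first predicted cluster matches a real complex (overlap score 9 / 12 = 0.75), so Ncp and Ncb are both 1.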

Image 2: Evaluation Results form

References

[1] Kouhsar, M. et al. (2016) WCOACH: Protein complex prediction in weighted PPI networks, Genes & Genetic Systems.
[2] Wu, M. et al. (2009) A core-attachment based method to detect protein complexes in PPI networks, BMC Bioinformatics, 10(1), 169. http://www.biomedcentral.com/1471-2105/10/169
[3] Lin, D. (1998) An information-theoretic definition of similarity, In Proceedings of the 15th International Conference on Machine Learning (ICML), pp. 296-304.
[4] Pu, S. et al. (2009) Up-to-date catalogues of yeast protein complexes, Nucleic Acids Research, 37(3), 825-831.