Template based Contact Map Predictor

Please note that the WMC source code is freely distributed for academic use only (see the copyright statement below).


1. Download and configure WMC
2. Running WMC

Download and Configure WMC
The following instructions should work out of the box on UNIX-like systems. Mac should also work in principle, but is not yet supported. Windows requires some additional work, such as setting up a Cygwin environment.

1. Download the WMC source code.
2. Unzip and untar the files:

tar -xzvf WMC.v1.01.tgz

This will create a directory named WMC.v1.01.

3. Check that the required programs are installed:
4. When searching for templates, WMC requires the following protein sequence databases, indexed with the NCBI-BLAST formatdb program (see the formatdb documentation for details):
5. WMC also uses Perl and BioPerl:
If you run into any problems, please contact me.
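
The check in step 3 can be partly automated. The following is a minimal sketch, not part of WMC: it only looks for the programs named above (perl, formatdb) and tests whether Perl can load Bio::Seq, a core BioPerl module. The function names are made up for this illustration, and the full list of programs WMC needs may be longer.

```python
import shutil
import subprocess


def missing_programs(programs):
    """Return the names from `programs` that are not found on PATH."""
    return [name for name in programs if shutil.which(name) is None]


def has_bioperl(perl="perl"):
    """Return True if the given Perl can load Bio::Seq (a core BioPerl module)."""
    if shutil.which(perl) is None:
        return False
    result = subprocess.run(
        [perl, "-MBio::Seq", "-e", "1"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Only the programs named in this README are checked here.
    missing = missing_programs(["perl", "formatdb"])
    print("Missing programs:", ", ".join(missing) if missing else "none")
    print("BioPerl available:", has_bioperl())
```

If formatdb is reported missing, the legacy NCBI-BLAST package is probably not installed or not on your PATH.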

Running WMC
Run the Perl script WMC.pl.
WMC is controlled through command-line flags (for help, type: "perl WMC.pl").

USAGE: perl WMC.pl --Target_Seq --Out_Path --Out
Required parameters:
--Target_Seq: Input sequence file in FASTA format. IMPORTANT: in this release, only sequence names of the form >NAME1_NAME2 are supported (e.g. >My_PROT)
--Out_Path: Output directory; it is created automatically and holds all output files
--Out: Name of the prediction file
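
Because of the restriction on sequence names, it can help to check the FASTA header before starting a run. The sketch below is an illustration only and is not part of WMC. It assumes that NAME1 and NAME2 consist of letters and digits joined by a single underscore, which is stricter than WMC may actually require, and that each flag takes its value as the next argument. All function names are hypothetical.

```python
import re

# Assumption: NAME1 and NAME2 are letters/digits joined by one underscore.
HEADER_PATTERN = re.compile(r"^>[A-Za-z0-9]+_[A-Za-z0-9]+$")


def header_is_supported(header_line):
    """Return True if a FASTA header looks like >NAME1_NAME2."""
    return HEADER_PATTERN.match(header_line.strip()) is not None


def first_header(fasta_path):
    """Return the first header line of a FASTA file, or None if there is none."""
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                return line.rstrip("\n")
    return None


def build_wmc_command(target_seq, out_path, out):
    """Assemble the argument list for a run with the three required flags."""
    return [
        "perl", "WMC.pl",
        "--Target_Seq", target_seq,
        "--Out_Path", out_path,
        "--Out", out,
    ]
```

The list returned by build_wmc_command can be passed to subprocess.run from the WMC directory. The file and directory names you pass in are your own; none are prescribed here.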

Optional parameters:
--Templates_List : List of template PDB IDs to use (PDB ID format: 1CLLA for 1CLL chain A); when given, WMC does not search for templates
--PSI_Blast : BLAST file (vs. PDB) to start from
--Target_PDB_ID : If given, a prediction file with the extracted contact map (true label) will be created (the PDB ID should be in 1CLL_A format for entry 1CLL chain A)
--Power : Use 1/Distance^Power as the weight of each contact map group (default Power=3)
--Q_Align : Minimal alignment overlap as a fraction of the query length (default=0.5)
--S_Align : Minimal alignment overlap as a fraction of the subject length (default=0.5)
--Min_ID : Minimal percent of sequence identity (default=0)
--Min_Length : Minimal template length to consider, in amino acids (rather than as a percentage of the target length) (default=50% of the target length)
--Best_Templates <NUMBER> : Consider only the best NUMBER templates
--ID_Cutoff : Indicates whether the PDB templates are selected using the identity cutoff specified by --Min_ID
--E_Value_Cutoff : Indicates whether the PDB templates are selected using an E-value cutoff (default; to disable, use --noE_Value_Cutoff)
--NR_Struct : Indicates whether to take only one unique PDB entry for each template (i.e. discard 'redundant' information)
--Only_Overlap : For the averaging, consider only the number of sequences that overlap at both positions (without gaps) rather than the number of all templates
--Phylo_Mode : Consider the phylogenetic information in a weighted fashion
--Phylo_Sum_Mode : Consider the phylogenetic information in a weighted fashion (sum only, not average)
--GODZIK_Mode : Find templates according to the PDB-BLAST procedure described by Godzik and colleagues (Protein Science 2000) (default)
--Entropy_Like : Consider the amount of information in the paired columns
--Clusters : Group contact maps according to the constructed phylogenetic tree (specifically designed to consider 'redundant' data without bias) (default; to disable, use --noClusters)
--Include_Very_Close : Also consider structures of very close homologs (including the target itself, if it exists)
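
Two of the numeric options above, --Power and --Q_Align/--S_Align, are easier to picture with numbers. The following is a rough sketch and is not taken from the WMC source. It assumes that each contact map group has one positive distance, that the weights are normalized to sum to 1, and that the overlap thresholds are fractions, as the 0.5 defaults suggest. How WMC defines the distance and measures the overlap is not described here, so treat the function names and details as hypothetical.

```python
def group_weights(distances, power=3):
    """Weight each group by 1 / distance**power (distances must be positive)."""
    if any(d <= 0 for d in distances):
        raise ValueError("distances must be positive")
    return [1.0 / d ** power for d in distances]


def normalized(weights):
    """Scale weights so that they sum to 1 (an assumption, see above)."""
    total = sum(weights)
    return [w / total for w in weights]


def passes_overlap(aligned_length, query_length, subject_length,
                   q_align=0.5, s_align=0.5):
    """Check the aligned fraction of the query and of the subject.

    Simplification: one aligned length is used for both sequences.
    """
    return (aligned_length / query_length >= q_align
            and aligned_length / subject_length >= s_align)
```

With the default Power=3, a group at distance 2 gets one eighth of the weight of a group at distance 1, so a higher Power concentrates the prediction on the closest groups.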


EXAMPLES: