Category

Sequence Analysis


Usage

cd-hit-para.pl -i nr90 -o nr60 -c 0.6 -n 4 --B hosts --S 64


Manual

where
--B hosts is a file listing the available hostnames
--S 64 is the number of segments to split the input db into; this number should be several times the number of hosts
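
As a rough illustration of the sizing advice above, the following sketch derives a value for --S from the number of hosts. It assumes the hosts file holds one hostname per line (the format is not stated here), uses made-up hostnames, and picks a factor of 4 arbitrarily as "several times". It only prints the command, so it runs without cd-hit installed.

```shell
#!/bin/sh
# Hypothetical hosts file: one hostname per line (assumed format, made-up names).
hosts_file=$(mktemp)
printf '%s\n' node01 node02 node03 node04 > "$hosts_file"

# Count non-empty lines to get the number of hosts.
n_hosts=$(grep -c . "$hosts_file")

# "Several times the number of hosts": the factor 4 is an arbitrary choice.
factor=4
segments=$((n_hosts * factor))

# Print the command instead of running it.
cmd="cd-hit-para.pl -i nr90 -o nr60 -c 0.6 -n 4 --B hosts --S $segments"
echo "$cmd"

rm -f "$hosts_file"
```

More segments than hosts presumably lets faster hosts pick up extra pieces, which is why the segment count is kept well above the host count.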

More options:

--P program, "cd-hit" or "cd-hit-est", default "cd-hit"
--B filename of list of hosts,
    required unless the --Q or --L option is supplied
--L number of cpus on local computer, default 0
    when you are not running on a cluster, you can use
    this option to divide a big clustering job into small
    pieces; I suggest you just use "--L 1" unless you have
    enough RAM for each cpu
--S number of segments to split input db into, default 64
--Q number of jobs to submit to the queuing system, default 0
    by default, the program uses ssh mode to submit remote jobs
--T type of queuing system, "PBS" and "SGE" are supported, default "PBS"
--R restart file, used to resume after a run crashes
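
The options above imply three ways of running the script: ssh mode with a hosts file (--B), local mode (--L), and queue mode (--Q with --T). The sketch below prints one command line per mode. build_cmd is a hypothetical helper, not part of cd-hit, and the value 8 for --Q is an arbitrary example. Nothing is executed, so it runs without cd-hit installed.

```shell
#!/bin/sh
# build_cmd is a hypothetical helper that prints a cd-hit-para.pl command
# line for a given run mode; it does not run anything.
build_cmd() {
    mode=$1
    base="cd-hit-para.pl -i nr90 -o nr60 -c 0.6 -n 4 --S 64"
    case "$mode" in
        ssh)   echo "$base --B hosts" ;;        # remote jobs over ssh, needs a hosts file
        local) echo "$base --L 1" ;;            # one cpu on the local computer
        pbs)   echo "$base --Q 8 --T PBS" ;;    # 8 queued jobs (arbitrary number)
        sge)   echo "$base --Q 8 --T SGE" ;;
        *)     echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

local_cmd=$(build_cmd local)
sge_cmd=$(build_cmd sge)
echo "$local_cmd"
echo "$sge_cmd"
```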

