crass(1) General Commands Manual crass(1)

NAME

crass - the CRISPR Assembler.

SYNOPSIS

crass [-abcdDefgGhHkKlLnorsSVwxyz] file ...
 

DESCRIPTION

crass is a tool for finding and assembling reads from genomic and metagenomic datasets that contain Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR). crass searches through the dataset and identifies reads that contain repeated k-mers of a specific length separated by a spacer sequence. These candidate direct repeats are then curated internally to remove bad matches, and the reads containing direct repeats are output for further analysis.
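The search described above can be illustrated with a minimal Python sketch. This is not the crass implementation: the function name find_repeat_pairs, its parameters, and the toy read are hypothetical, and the sketch only reports exact k-mer repeats whose gap falls inside a given spacer length range.

```python
def find_repeat_pairs(read, k, min_spacer, max_spacer):
    """Return sorted (first_pos, second_pos, kmer) tuples for k-mers that
    recur in `read` after a gap of min_spacer..max_spacer bases.

    Illustration only; not the algorithm used by crass.
    """
    # Record every start position of every k-mer in the read.
    positions = {}
    for i in range(len(read) - k + 1):
        positions.setdefault(read[i:i + k], []).append(i)

    hits = []
    for kmer, starts in positions.items():
        # Compare each occurrence with the next occurrence of the same k-mer.
        for first, second in zip(starts, starts[1:]):
            gap = second - (first + k)  # bases between the two copies
            if min_spacer <= gap <= max_spacer:
                hits.append((first, second, kmer))
    return sorted(hits)


# Toy read: an 8-base repeat, a 5-base spacer, then the repeat again.
toy_read = "GATTACAG" + "TTCCA" + "GATTACAG"
pairs = find_repeat_pairs(toy_read, k=8, min_spacer=3, max_spacer=10)
```

The toy values are far shorter than the crass defaults for -d, -D, -s and -S; they are chosen only to keep the example readable.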
 

OPTIONS

 
crass [-eghrzGHL] [-a LAYOUT_TYPE] [-b INT] [-c COLOUR_TYPE] [-d INT] [-f INT] [-k INT] [-K INT] [-l INT] [-n INT] [-o LOCATION] [-s INT] [-w INT] [-x REAL] [-y REAL] [-D INT] [-S INT] file ...
 
-a LAYOUT_TYPE --layoutAlgorithm LAYOUT_TYPE
The Graphviz layout algorithm to be used when rendering graphs.
-b INT --numBins INT
The number of colour bins for the output graph. Default is to have as many colours as there are different values for the coverage of Nodes in the graph.
-c COLOUR_TYPE --graphColour COLOUR_TYPE
The colour scheme for the output graph, based on the coverage of each spacer in the CRISPR. Can be one of:
red-blue
Low coverage spacers are coloured in red and high coverage spacers are coloured blue. Intermediates are coloured in shades of purple.
blue-red
Low coverage spacers are coloured in blue and high coverage spacers are coloured red. Intermediates are coloured in shades of purple.
green-red-blue
Three tone colouring with low coverage spacers in green and high coverage spacers in blue.
red-blue-green
Three tone colouring with low coverage spacers in blue and high coverage spacers in green.
-d INT --minDR INT
The minimum length of the direct repeat to search for [Default: 23]
-D INT --maxDR INT
The maximum length of the direct repeat to search for [Default: 47]
-e --noDebugGraph
Option available only when the DEBUG preprocessor symbol is set. Turns off the generation of debugging graphs
-f INT --covCutoff INT
Defines the minimum number of reads that a putative CRISPR must contain to be considered real. [Default: 10]
-g --logToScreen
Print the logging info to stdout rather than to a file
-G --showSingletons
Set to show unattached spacers in the graph output
-h --help
Output basic usage information to screen
-H --removeHomopolymers
Correct for homopolymer errors [default: no correction]
-l INT --logLevel INT
The level of verbosity to output in the crass log file
-k INT --kmerCount INT
The number of kmers that two direct repeats must share to be considered part of the same cluster [Default: 12]
-K INT --graphNodeLen INT
The length of the kmer used to define a node in the graph. The lower the number, the more connected the graph will be, but the greater the chance of false positive edges [Default: 7]
-n INT --minNumRepeats INT
The minimum number of repeats that a candidate CRISPR locus must contain to be considered 'real' [Default: 3]
-o LOCATION --outDir LOCATION
The name of the output directory for the output files [Default: ./]
-r --noRendering
Option only available when the '--enable-rendering' configure option is set. Will turn off the generation of image files.
-s INT --minSpacer INT
The minimum length of the spacer to search for [Default: 26]
-S INT --maxSpacer INT
The maximum length of the spacer to search for [Default: 50]
-V --version
Print version and copyright information
-w INT --windowLength INT
The length of the window used when searching a genome. Must be between 6 and 9 [Default: 8]
-x REAL --spacerScalling REAL
A decimal number that represents the reduction in size of the spacer when the --removeHomopolymers option is set [Default: 0.7]
-y REAL --repeatScalling REAL
A decimal number that represents the reduction in size of the direct repeat when the --removeHomopolymers option is set [Default: 0.7]
-z --noScalling
Use the given spacer and direct repeat ranges when --removeHomopolymers is set. The default is to scale these values based on the values of -x and -y.
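The interaction between -x, -y and -z can be sketched as simple arithmetic. The following Python sketch is an illustration under stated assumptions, not the crass implementation: the function name scaled_range is hypothetical, and rounding to the nearest integer is an assumption, since this page does not say how crass rounds the scaled lengths.

```python
def scaled_range(minimum, maximum, scale=0.7, no_scaling=False):
    """Return the (min, max) length range searched when --removeHomopolymers is set.

    `scale` plays the role of -x (spacers) or -y (direct repeats);
    `no_scaling` plays the role of -z. Rounding to nearest is an assumption.
    """
    if no_scaling:
        # -z: use the ranges exactly as given on the command line.
        return minimum, maximum
    return round(minimum * scale), round(maximum * scale)


# Default direct repeat range (-d 23, -D 47) scaled by the default -y 0.7.
dr_range = scaled_range(23, 47, scale=0.7)
# Default spacer range (-s 26, -S 50) scaled by the default -x 0.7.
spacer_range = scaled_range(26, 50, scale=0.7)
# With -z the ranges are left untouched.
unscaled = scaled_range(23, 47, no_scaling=True)
```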
 

FILES

crass.<TIMESTAMP>.log
Log file containing information about the last execution of crass
Group_<NUM>_<DNA>.fa
FASTA file of all reads from a DR type.
Spacers_<NUM>_<DNA>.spacers.gv
File representing the graph of the DR type in Graphviz format
crass.<TIMESTAMP>.keys.gv
A file in Graphviz format that contains all of the colour codes for the coverage values in the output graph
crass.crispr
A crispr file representing all the information about each of the DR types identified
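The file name patterns above can be matched programmatically when post-processing an output directory. The Python sketch below is a hedged illustration: the function name classify and the kind labels are hypothetical, and it assumes that <NUM> is a run of decimal digits, that <DNA> consists of letters only, and that <TIMESTAMP> is any non-empty text, none of which is specified on this page.

```python
import re

# Patterns derived from the FILES list. The character classes for
# <NUM>, <DNA> and <TIMESTAMP> are assumptions, not documented formats.
PATTERNS = [
    ("log", re.compile(r"^crass\.(?P<timestamp>.+)\.log$")),
    ("keys_graph", re.compile(r"^crass\.(?P<timestamp>.+)\.keys\.gv$")),
    ("group_fasta", re.compile(r"^Group_(?P<num>\d+)_(?P<dna>[A-Za-z]+)\.fa$")),
    ("spacer_graph",
     re.compile(r"^Spacers_(?P<num>\d+)_(?P<dna>[A-Za-z]+)\.spacers\.gv$")),
    ("crispr", re.compile(r"^crass\.crispr$")),
]


def classify(name):
    """Return (kind, fields) for a crass output file name, or None if unknown."""
    for kind, pattern in PATTERNS:
        match = pattern.match(name)
        if match:
            return kind, match.groupdict()
    return None
```

For example, classify("Group_3_ACGTTG.fa") yields the kind "group_fasta" together with the group number and repeat sequence as strings.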

DIAGNOSTICS

The crass utility exits 0 on success, and >0 if an error occurs.
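A calling script can rely on this exit status to detect failure. The Python sketch below is one possible wrapper, not part of crass: the function names build_command and run_crass and the input file name reads.fa are hypothetical, and only options documented on this page (-o, -d, -D, -g) are used.

```python
import subprocess


def build_command(reads, out_dir="./", min_dr=23, max_dr=47, log_to_screen=False):
    """Assemble a crass argument list from options documented on this page."""
    cmd = ["crass", "-o", out_dir, "-d", str(min_dr), "-D", str(max_dr)]
    if log_to_screen:
        cmd.append("-g")  # print logging info to stdout instead of a file
    cmd.extend(reads)     # one or more input files, as in "file ..."
    return cmd


def run_crass(reads, **options):
    """Run crass and return True on exit status 0, False on any status >0.

    Requires crass to be installed and on PATH.
    """
    result = subprocess.run(build_command(reads, **options))
    return result.returncode == 0


# Hypothetical input file name; the command is built but not executed here.
example_cmd = build_command(["reads.fa"], out_dir="crass_out", log_to_screen=True)
```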
 

SEE ALSO

crass-assembler(1), crisprtools(1)
17/04/13 Darwin