crass(1)    General Commands Manual    crass(1)
NAME
crass — the CRISPR Assembler.
SYNOPSIS
crass [-abcdDefgGhHkKlLnorsSVwxyz] file ...
DESCRIPTION
crass is a tool for finding and assembling reads from genomic and metagenomic datasets that contain Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR).
crass searches through the dataset and identifies reads that contain repeated k-mers of a specific length separated by a spacer sequence. These candidate direct repeats are then curated internally to remove bad matches, and the reads containing direct repeats are output for further analysis.
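As a rough illustration of the search described above, the following Python sketch looks for a k-mer that occurs twice in a read with a spacer-sized gap between the two copies. It is not the crass implementation: the function name find_repeat_pairs and the fixed k-mer length are assumptions made for the example, and only the spacer bounds mirror the -s and -S defaults.

```python
def find_repeat_pairs(read, k=8, min_spacer=26, max_spacer=50):
    """Return (first_pos, second_pos, kmer) for k-mers repeated a spacer apart.

    Hypothetical helper: exact matching only, with no curation of bad matches.
    """
    hits = []
    seen = {}  # k-mer -> start positions seen so far
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        for j in seen.get(kmer, []):
            # Bases between the end of the earlier copy and the start of this one.
            gap = i - (j + k)
            if min_spacer <= gap <= max_spacer:
                hits.append((j, i, kmer))
        seen.setdefault(kmer, []).append(i)
    return hits


# A toy read: the same 8-mer at both ends with 30 bases in between.
toy_read = "ACGTACGT" + "T" * 30 + "ACGTACGT"
pairs = find_repeat_pairs(toy_read)
```

Reads for which the returned list is non-empty would be the candidates passed on for curation in a pipeline of this shape.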
OPTIONS
-
-
crass [-eghrzGHL] [-a LAYOUT_TYPE] [-b INT] [-c COLOUR_TYPE] [-d INT] [-f INT] [-k INT] [-K INT] [-l INT] [-n INT] [-o DIR] [-s INT] [-w INT] [-x REAL] [-y REAL] [-D INT] [-S INT] file ...
-
-a LAYOUT_TYPE --layoutAlgorithm LAYOUT_TYPE
-
The Graphviz layout algorithm to be used when rendering graphs.
-
-b INT --numBins INT
-
The number of colour bins for the output graph. Default is to have as many colours as there are different values for the coverage of Nodes in the graph.
-
-c COLOUR_TYPE --graphColour COLOUR_TYPE
-
The colour scheme for the output graph, based on the coverage of each spacer in the CRISPR. Can be one of:
-
red-blue
-
Low coverage spacers are coloured in red and high coverage spacers are coloured blue. Intermediates are coloured in shades of purple.
-
blue-red
-
Low coverage spacers are coloured in blue and high coverage spacers are coloured red. Intermediates are coloured in shades of purple.
-
green-red-blue
-
Three tone colouring with low coverage spacers in green and high coverage spacers in blue.
-
red-blue-green
-
Three tone colouring with low coverage spacers in red and high coverage spacers in green.
-
-d INT --minDR INT
-
The minimum length of the direct repeat to search for [Default: 23]
-
-D INT --maxDR INT
-
The maximum length of the direct repeat to search for [Default: 47]
-
-e --noDebugGraph
-
Option available only when the DEBUG preprocessor symbol is set. Turns off the generation of debugging graphs.
-
-f INT --covCutoff INT
-
Defines the minimum number of reads that a putative CRISPR must contain to be considered real. [Default: 10]
-
-g --logToScreen
-
Print the logging info to stdout rather than to a file
-
-G --showSingletons
-
Set to show unattached spacers in the graph output
-
-h --help
-
Output basic usage information to screen
-
-H --removeHomopolymers
-
Correct for homopolymer errors [default: no correction]
-
-l INT --logLevel INT
-
The level of verbosity to output in the crass log file
-
-k INT --kmerCount INT
-
The number of kmers that two direct repeats must share to be considered part of the same cluster [Default: 12]
-
-K INT --graphNodeLen INT
-
The length of the kmer used to define a node in the graph. The lower the number, the more connected the graph will be, but the greater the chance of false positive edges [Default: 7]
-
-n INT --minNumRepeats INT
-
The minimum number of repeats that a candidate CRISPR locus must contain to be considered 'real' [Default: 3]
-
-o LOCATION --outDir LOCATION
-
The name of the output directory for the output files [Default: ./]
-
-r --noRendering
-
Option only available when the '--enable-rendering' configure option is set. Will turn off the generation of image files.
-
-s INT --minSpacer INT
-
The minimum length of the spacer to search for [Default: 26]
-
-S INT --maxSpacer INT
-
The maximum length of the spacer to search for [Default: 50]
-
-V --version
-
Print version and copyright information
-
-w INT --windowLength INT
-
The length of the window used when searching a genome. Must be between 6 and 9 [Default: 8]
-
-x REAL --spacerScalling REAL
-
A decimal number that represents the reduction in size of the spacer when the --removeHomopolymers option is set [Default: 0.7]
-
-y REAL --repeatScalling REAL
-
A decimal number that represents the reduction in size of the direct repeat when the --removeHomopolymers option is set [Default: 0.7]
-
-z --noScalling
-
Use the given spacer and direct repeat ranges unchanged when --removeHomopolymers is set. The default is to scale these values based on the values of -x and -y.
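The effect of -x and -y on the search ranges can be worked through with a little arithmetic. The Python sketch below simply multiplies the default bounds by the default factor of 0.7. The helper name scaled_range is hypothetical, and because this page does not say how crass rounds the result, the values are left unrounded.

```python
def scaled_range(minimum, maximum, factor=0.7):
    """Scale a (min, max) length range by a factor (hypothetical helper).

    No rounding is applied: how crass rounds is not stated in this page.
    """
    return minimum * factor, maximum * factor


# Defaults from -s/-S scaled by -x, and from -d/-D scaled by -y.
spacer_range = scaled_range(26, 50)
repeat_range = scaled_range(23, 47)

# With -z (--noScalling) the given ranges would be used as they are,
# which corresponds to a factor of 1.
unscaled_spacer_range = scaled_range(26, 50, factor=1)
```

Under these assumptions the spacer range shrinks from 26-50 to roughly 18.2-35, and the direct repeat range from 23-47 to roughly 16.1-32.9.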
FILES
-
crass.<TIMESTAMP>.log
-
Log file containing information about the last execution of crass
-
Group_<NUM>_<DNA>.fa
-
FASTA file of all reads belonging to a direct repeat (DR) type.
-
Spacers_<NUM>_<DNA>.spacers.gv
-
File representing the graph of the DR type in Graphviz format
-
crass.<TIMESTAMP>.keys.gv
-
A file in Graphviz format that contains all of the colour codes for the coverage values in the output graph
-
crass.crispr
-
A crispr file representing all the information about each of the DR types identified
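When post-processing an output directory, it can help to sort the files listed above by kind. The following Python sketch does that with regular expressions built from the name templates in this section. The exact form of the placeholders is an assumption: <NUM> is taken to be digits, <DNA> a string of A, C, G and T, and <TIMESTAMP> any run of characters without a dot. The names PATTERNS and classify_output are hypothetical.

```python
import re

# One pattern per entry in FILES; the placeholder formats are assumptions.
PATTERNS = {
    "log": re.compile(r"^crass\.[^.]+\.log$"),
    "reads": re.compile(r"^Group_\d+_[ACGT]+\.fa$"),
    "spacer_graph": re.compile(r"^Spacers_\d+_[ACGT]+\.spacers\.gv$"),
    "colour_key": re.compile(r"^crass\.[^.]+\.keys\.gv$"),
    "crispr": re.compile(r"^crass\.crispr$"),
}


def classify_output(filename):
    """Return the kind of crass output file, or None if the name is unknown."""
    for kind, pattern in PATTERNS.items():
        if pattern.match(filename):
            return kind
    return None
```

For example, classify_output("Group_3_ACGTTGCA.fa") falls into the "reads" kind, while a name that matches none of the templates yields None.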
DIAGNOSTICS
The crass utility exits 0 on success, and >0 if an error occurs.
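A calling script can rely on this exit status to decide whether to continue. The Python sketch below shows one way to check it. The helper name succeeded is hypothetical, and stand-in commands are used so that the example runs without crass installed; the file names in the comment are placeholders.

```python
import subprocess
import sys


def succeeded(command):
    """Run a command and report whether it exited with status 0."""
    completed = subprocess.run(command)
    return completed.returncode == 0


# Stand-in commands that exit with a chosen status. A real call might look
# like succeeded(["crass", "-o", "out_dir", "reads.fa"]) (placeholder names).
ok = succeeded([sys.executable, "-c", "import sys; sys.exit(0)"])
bad = succeeded([sys.executable, "-c", "import sys; sys.exit(2)"])
```

Here ok reflects a zero exit status and bad a non-zero one, matching the success and error cases described above.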