See the Examples Page for a walkthrough and a selection of pre-made queries for some major pathogens.
The result of the search is a list of signature chains and the corresponding DNA sequence from the reference genome. A signature chain is a set of consecutive k-mer signature words. Each interval is given as the start of the first signature word and the end of the last signature word in the chain. Thus, the interval [s,e] contains exactly e-s-k+2 signature words, which together completely cover the interval [s,e] in the reference sequence. For some searches, more signatures will be found than can be displayed at once. To reduce the list to a more manageable size, move the "Signature chain length" slider to ~100bp. For convenience, also check the "Show corresponding gene info" and "Sort by sequence length" boxes to organize the table.
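The interval arithmetic above can be illustrated with a short Python sketch. It is not the tool's own code: the function names `words_in_interval` and `chain_intervals` are hypothetical, and the sketch assumes that s and e are inclusive coordinates and that a chain is formed whenever signature words start at consecutive positions.

```python
def words_in_interval(s, e, k):
    """Number of k-mer words contained in the inclusive interval [s, e]."""
    return e - s - k + 2


def chain_intervals(starts, k):
    """Merge runs of consecutive word start positions into (s, e) chains.

    `starts` holds the start coordinate of each signature word (assumed input).
    Each chain runs from the start of its first word to the end of its last word.
    """
    chains = []
    run_start = prev = None
    for pos in sorted(set(starts)):
        if prev is not None and pos == prev + 1:
            prev = pos  # word is adjacent to the previous one: extend the chain
            continue
        if prev is not None:
            chains.append((run_start, prev + k - 1))  # close the finished chain
        run_start = prev = pos  # begin a new chain
    if prev is not None:
        chains.append((run_start, prev + k - 1))
    return chains


if __name__ == "__main__":
    for s, e in chain_intervals([10, 11, 12, 40], k=5):
        print(s, e, words_in_interval(s, e, k=5))
```

With words of length 5 starting at positions 10, 11, 12 and 40, the sketch yields the chains [10,16] and [40,44], containing 3 words and 1 word respectively.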
Signature words are perfectly conserved across all target genomes and contain at least one difference from every background sequence. Therefore, a signature chain will contain a difference from the background at least every k bases. For some types of detection assays, these signatures can still cross-react with background sequences and return false positive detections. However, we have found that long signature chains (e.g. >100bp) are often quite dissimilar from the background and make good targets for detection assays. After identifying these candidate target sequences, we recommend performing a more sensitive background screen of the individual signatures using BLAST to ensure they are sufficiently unique. This can be done by selecting the desired signatures and choosing "Run BLAST search" from the pick list above the signature table. The output signatures can also be viewed graphically.
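The definition of a signature word can be sketched with plain set operations in Python. This is only an illustration under simplifying assumptions, not the method the search itself uses: the function names `kmers` and `signature_words` are hypothetical, sequences are short strings held in memory, and reverse complements are ignored.

```python
def kmers(seq, k):
    """Set of all length-k substrings (words) of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def signature_words(targets, backgrounds, k):
    """Words present in every target and absent from every background.

    `targets` must contain at least one sequence.
    """
    shared = set.intersection(*(kmers(t, k) for t in targets))
    for b in backgrounds:
        shared -= kmers(b, k)  # drop any word that matches the background exactly
    return shared


if __name__ == "__main__":
    targets = ["ACGTACGT", "TTACGTAA"]
    backgrounds = ["GGCGTAGG"]
    print(sorted(signature_words(targets, backgrounds, k=4)))
```

In this toy input, the words ACGT, CGTA and TACG occur in both targets, but CGTA also occurs in the background, so only ACGT and TACG remain as signature words. An exact-match filter of this kind says nothing about near matches, which is why a follow-up BLAST screen is still worthwhile.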
Genome Browser Display
Legend:
Navigating the genome browser
Below is an example representation of the signatures on the Bacillus anthracis Ames genome: