Traditional techniques cannot provide information on detrimental or neutral mutations; they can only report beneficial mutations that become fixed or nearly fixed in the population after a few rounds of adaptation. The information generated by CirSeq is unique in both density (single-nucleotide resolution) and nature (it covers lethal, neutral, and weakly beneficial alleles). We consistently obtained 10^5-10^6 reads per nucleotide, from which we calculate the mutation frequency at each nucleotide position in the viral genome. This provides an unprecedented wealth of genetic information.
CirSeq analysis is performed on the T-BioInfo platform. Below is a screenshot of the Virology analysis graph, which includes algorithms for analyzing CirSeq data as well as regular NGS data.
Data mining is critical for a researcher who has produced large quantities of data and needs to understand the complex phenomena behind them.
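The per-position mutation frequency mentioned above can be illustrated with a minimal Python sketch. It assumes that consensus sequences have already been aligned to the reference without gaps and are supplied as hypothetical (start, sequence) pairs; the function name and input layout are illustrative only and are not taken from the T-BioInfo platform.

```python
def mutation_frequencies(reference, aligned_consensuses):
    """Return the fraction of mismatching bases at each reference position.

    aligned_consensuses: iterable of (start, sequence) pairs, where `start`
    is the 0-based reference position of the first base (assumed layout).
    Positions with no coverage get None.
    """
    coverage = [0] * len(reference)
    mismatches = [0] * len(reference)
    for start, seq in aligned_consensuses:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos >= len(reference):
                break  # ignore bases that run past the end of the reference
            coverage[pos] += 1
            if base != reference[pos]:
                mismatches[pos] += 1
    return [m / c if c else None for m, c in zip(mismatches, coverage)]


# Toy example: three short consensuses against a four-base reference.
freqs = mutation_frequencies("ACGT", [(0, "ACGT"), (0, "ACTT"), (1, "CGT")])
```

With coverage in the range of 10^5 to 10^6 per position, the same ratio can resolve very rare variants, provided the sequencing error rate has been pushed below the mutation frequency of interest.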
Overall quality of the CirSeq libraries
General statistics for CirSeq-generated reads:
- 1. Lengths of reads
- 2. PHRED quality of nucleotides in the reads
- 1. Distribution of repeat sizes
- 2. Quality of consensuses
- 1. Virus genome
- 2. Host genome
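As an illustration of the first two statistics listed above (read lengths and PHRED quality), here is a small Python sketch. It assumes quality strings use the common Phred+33 ASCII encoding and that reads are already loaded as (sequence, quality string) pairs; both the input layout and the function names are assumptions made for this example.

```python
from statistics import mean


def phred_scores(quality_string, offset=33):
    """Decode an ASCII quality string into integer PHRED scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def read_statistics(reads):
    """Summarize read lengths and base qualities.

    reads: list of (sequence, quality_string) pairs (assumed layout).
    """
    lengths = [len(seq) for seq, _ in reads]
    scores = [q for _, qual in reads for q in phred_scores(qual)]
    return {
        "n_reads": len(reads),
        "mean_length": mean(lengths),
        "mean_phred": mean(scores),
        # A PHRED score Q corresponds to an error probability of 10^(-Q/10).
        "mean_error_prob": mean(10 ** (-q / 10) for q in scores),
    }


stats = read_statistics([("ACGT", "IIII"), ("AC", "5I")])
```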
Algorithm to obtain consensus sequences
- 1. Splitting the CirSeq reads into repeats, based on the most frequent distance between occurrences of the same k-mer in the read.
- 2. Mapping the consensuses of the CirSeq repeats onto the virus genome. Counts are calculated as:
- For symmetric data: the sum of mapped consensuses per position
- For non-symmetric data: the sum of the weights of individual consensuses per position
- 3. Strict filtering of consensuses, based on matching nucleotides of high PHRED quality in at least two repeats of the consensus.
- 4. Soft filtering of consensuses, based on matching nucleotides of medium PHRED quality in special combinations of three repeats. Each accepted combination at a position provides a probability (weight) that the position is a true consensus.
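The steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the platform's implementation: the k-mer size, the PHRED threshold of 20, the choice to mask unsupported positions with "N" instead of discarding the consensus, and all function names are hypothetical. The soft-filtering weights are not specified in the text, so the sketch only accepts them as an optional input when counting.

```python
from collections import Counter


def estimate_repeat_length(read, k=8):
    """Most frequent distance between consecutive occurrences of the same k-mer."""
    last_seen = {}
    distances = Counter()
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        if kmer in last_seen:
            distances[i - last_seen[kmer]] += 1
        last_seen[kmer] = i
    return distances.most_common(1)[0][0] if distances else None


def split_into_repeats(read, quals, period):
    """Cut a read and its PHRED scores into consecutive full-length repeats."""
    n = len(read) // period
    return [(read[i * period:(i + 1) * period], quals[i * period:(i + 1) * period])
            for i in range(n)]


def strict_consensus(repeats, min_phred=20, min_support=2):
    """Keep a base only if at least `min_support` high-quality repeats agree on it."""
    consensus = []
    for pos in range(len(repeats[0][0])):
        votes = Counter(seq[pos] for seq, q in repeats if q[pos] >= min_phred)
        ranked = votes.most_common(2)
        unique_top = len(ranked) == 1 or (len(ranked) == 2 and ranked[0][1] > ranked[1][1])
        if ranked and ranked[0][1] >= min_support and unique_top:
            consensus.append(ranked[0][0])
        else:
            consensus.append("N")  # masked: not enough high-quality agreement
    return "".join(consensus)


def per_position_counts(genome_length, mapped, weights=None):
    """Sum mapped consensuses per position; use weights for non-symmetric data.

    mapped: list of (start, length) pairs for consensuses placed on the genome.
    """
    counts = [0.0] * genome_length
    for idx, (start, length) in enumerate(mapped):
        w = 1.0 if weights is None else weights[idx]
        for pos in range(start, min(start + length, genome_length)):
            counts[pos] += w
    return counts


# Toy read: a 10-base unit repeated three times, with one low-quality error.
unit = "ACGTTGCAAT"
read = unit + "ACATTGCAAT" + unit
quals = [40] * len(read)
quals[12] = 10
period = estimate_repeat_length(read, k=4)
consensus = strict_consensus(split_into_repeats(read, quals, period))
```

The sketch does not handle the arbitrary rotation of a repeat relative to the genome, nor reads whose length is not a multiple of the repeat size; a real pipeline would need to address both before mapping.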