SeraSeq

Download

Command Line

The SeraSeq data set is a large collection of unfiltered and filtered BAM files, totaling more than 200 GB. Users are asked to download it from the command line with "wget http://big2.hanyang.ac.kr/SeraSeq/SeraSeq.tar.gz" or to use the DOWNLOAD option above.
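
Because the archive is larger than 200 GB, an interrupted transfer is costly. The following is a minimal sketch of a resumable download followed by extraction. The URL is the one given above; the variable and function names are illustrative, and the internal layout of the archive is not described on this page, so the listing step is only a suggested way to inspect it.

```shell
#!/bin/sh
# URL as stated on this page; the archive name is derived from it.
SERASEQ_URL="http://big2.hanyang.ac.kr/SeraSeq/SeraSeq.tar.gz"
SERASEQ_ARCHIVE="$(basename "$SERASEQ_URL")"

# Download into the current directory.
# -c resumes a partially downloaded file instead of starting over.
fetch_seraseq() {
    wget -c "$SERASEQ_URL"
}

# List the archive contents without extracting (layout is not documented here).
list_seraseq() {
    tar -tzf "$SERASEQ_ARCHIVE"
}

# Extract into the current directory; this needs free space in addition
# to the archive itself.
extract_seraseq() {
    tar -xzf "$SERASEQ_ARCHIVE"
}
```

After sourcing the snippet, run "df -h ." to check free disk space, then "fetch_seraseq && extract_seraseq". If the transfer stops, running "fetch_seraseq" again continues from the partial file, provided the server supports resumed downloads.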

Description

Targeted sequencing data were generated from cfDNA reference materials from SeraCare (Milford, MA, USA). DNA libraries were prepared using a KAPA Hyper Prep kit (Kapa Biosystems, Woburn, MA, USA) as described previously.
Hybrid selection for target enrichment was performed using customized baits targeting 38 cancer-related genes. After hybrid selection, the libraries were pooled, amplified, purified, quantified, and then subjected to cluster amplification according to the manufacturer’s protocol (Illumina, San Diego, CA, USA).
Flow cells were sequenced in 150-bp paired-end mode using a NextSeq 500/550 High Output Kit v2.5 (Illumina). The mean target coverage was 2023X. Two kinds of DNA mixtures (CMM and MMv2), with variant allele frequencies ranging from 0.5% to 5.0%, and a plasma-like DNA mixture (CR), with variant allele frequencies ranging from 0.5% to 2.5%, were generated along with wild-type (WT) DNA (Supplementary Table 10).
The WT material was used as the matched normal. The BreaKmer tool was excluded from this analysis because it failed to call variants in any sample, presumably because its approach is not feasible at such low allele frequencies.
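
To give a sense of scale for these allele frequencies, the short sketch below estimates how many reads would be expected to carry a variant at a given depth and variant allele frequency. It is simple arithmetic (depth multiplied by frequency) and assumes the mean target coverage applies at the variant site; actual per-site depth varies, and the function name is illustrative.

```shell
#!/bin/sh
# Expected number of variant-supporting reads = depth * variant allele frequency.
# Usage: expected_alt_reads DEPTH VAF   (VAF as a fraction, e.g. 0.005 for 0.5%)
expected_alt_reads() {
    awk -v depth="$1" -v vaf="$2" 'BEGIN { printf "%.0f\n", depth * vaf }'
}

# Lowest and highest frequencies stated above, at the mean coverage of 2023X.
expected_alt_reads 2023 0.005   # 0.5% (all mixtures, lower bound)
expected_alt_reads 2023 0.025   # 2.5% (CR, upper bound)
expected_alt_reads 2023 0.05    # 5.0% (CMM and MMv2, upper bound)
```

At the mean coverage, a 0.5% variant corresponds to roughly 10 supporting reads out of about 2,000, so callers have very little signal to work with at the low end of the range.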

Reference genome

Reads were aligned to the hg19 (GRCh37) human reference genome.