Question

From pooled fastq data to SNPs

0

6.8 years ago

Fedster ▴ 30

I have some pooled GBS data (96 samples) that was generated by two runs on an Illumina HiSeq 4000. The DNA was size-selected to between 100 and 200 bp. The output is two fastq.gz files (run 1 and run 2). Each sample has a unique barcode, and the barcodes are listed in a CSV file.

In addition I have a reference genome (as a single fasta file) for the organism in question.

What I want is the genotype (as SNPs) for each sample. To get there I need to demultiplex my data, ideally discard fragments whose quality is too low, align the remaining fragments, pile up, call SNPs, etc.
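The first two steps (demultiplexing and quality filtering) can be sketched in plain Python. This is a minimal illustration, not a replacement for a dedicated tool: it assumes inline barcodes at the very start of each read, exact barcode matches only, Phred+33 qualities, and a headerless CSV with `sample,barcode` columns. The function names and the `min_mean_q` threshold are hypothetical.

```python
import csv
from collections import defaultdict


def load_barcodes(csv_path):
    """Read sample,barcode rows (assumed layout, no header) into {barcode: sample}."""
    with open(csv_path, newline="") as handle:
        return {row[1].strip().upper(): row[0].strip()
                for row in csv.reader(handle) if len(row) >= 2}


def read_fastq(handle):
    """Yield (header, sequence, quality) from a 4-lines-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def mean_quality(qual, offset=33):
    """Mean Phred score of a quality string, assuming Phred+33 encoding."""
    return sum(ord(c) - offset for c in qual) / len(qual) if qual else 0.0


def demultiplex(records, barcodes, min_mean_q=20):
    """Assign reads to samples by exact barcode prefix and trim the barcode.

    Reads with no matching barcode, or whose trimmed mean quality is below
    min_mean_q, are dropped.
    """
    by_sample = defaultdict(list)
    # Try longer barcodes first so a short barcode that is a prefix of a
    # longer one does not claim the longer one's reads.
    ordered = sorted(barcodes, key=len, reverse=True)
    for header, seq, qual in records:
        for bc in ordered:
            if seq.startswith(bc):
                trimmed_seq, trimmed_qual = seq[len(bc):], qual[len(bc):]
                if mean_quality(trimmed_qual) >= min_mean_q:
                    by_sample[barcodes[bc]].append(
                        (header, trimmed_seq, trimmed_qual))
                break
    return by_sample
```

For gzipped input, the stream can be opened with `gzip.open(path, "rt")` and passed to `read_fastq`. Holding all reads in memory is only reasonable for small tests; for two full HiSeq runs, each sample's reads would be written to its own file as they are seen.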

Writing in January 2018, is there a preferred pipeline for this? I think I could do everything using Stacks, but other approaches might be available and offer greater speed or other benefits. All else being equal, a faster approach would be preferred.

fastq demultiplexing barcode GBS SNP • 1.8k views

updated 6.8 years ago by bari.ballew ▴ 470 • written 6.8 years ago by Fedster ▴ 30

score 1 · Answer 1 · 2018-01-26

1

6.8 years ago

bari.ballew ▴ 470

You probably want to check out the Broad Institute's GATK Best Practices (https://software.broadinstitute.org/gatk/best-practices/). Look mainly at the sections on data pre-processing and germline SNPs + indels. Alternatively, you could check out the bcbio pipelines (http://bcbio-nextgen.readthedocs.io/en/latest/contents/pipelines.html).

Note that you're basically asking how to analyze sequencing data from start to finish, so depending on your background, this will likely take a good bit of effort, both in reading and understanding the pipelines and in implementing them. Best of luck!
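For orientation only, the overall shape of such a workflow (align, sort and index, then call variants jointly) can be sketched as a small Python helper that assembles shell commands. This is not the GATK or bcbio workflow linked above; it assumes `bwa`, `samtools`, and `bcftools` are installed, single-end reads that are already demultiplexed into one FASTQ per sample, and hypothetical file names. The helper `build_commands` is illustrative and only builds command strings without running them.

```python
import shlex


def build_commands(reference, sample_fastqs, out_vcf="calls.vcf.gz"):
    """Return shell commands for align -> sort/index -> joint variant calling.

    sample_fastqs maps sample name -> demultiplexed FASTQ path (assumed names).
    """
    q = shlex.quote
    commands = [f"bwa index {q(reference)}"]
    bams = []
    for sample, fastq in sorted(sample_fastqs.items()):
        bam = f"{sample}.sorted.bam"
        # Read group with a sample tag so the caller can tell samples apart.
        read_group = f"@RG\\tID:{sample}\\tSM:{sample}"
        commands.append(
            f"bwa mem -R {q(read_group)} {q(reference)} {q(fastq)}"
            f" | samtools sort -o {q(bam)} -"
        )
        commands.append(f"samtools index {q(bam)}")
        bams.append(bam)
    # Pile up all samples together and call variants (SNPs and indels)
    # with the multiallelic caller, writing a compressed VCF.
    commands.append(
        f"bcftools mpileup -f {q(reference)} -Ou {' '.join(q(b) for b in bams)}"
        f" | bcftools call -mv -Oz -o {q(out_vcf)}"
    )
    return commands
```

Printing the result of `build_commands("ref.fa", {"s1": "s1.fq.gz"})` gives a script that can be reviewed before anything is run. Filtering the resulting VCF and any GBS-specific settings are left out of this sketch.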

0

I have used PyRAD before, and it was painfully slow. I would have hoped that natural selection among the many alternative methods would by now have produced a sensible standard: sensible in terms of usability, results, and speed.

6.8 years ago by Fedster ▴ 30