Hi All,
I am struggling to pre-process my 16S rRNA gene amplicon Illumina Sequencing data using QIIME. I have several issues that I can't find clear answers for on QIIME's website.
I have 4 files from the sequencing run: read 1, read 2, index 1 and index 2 (MiSeq paired-end, 2 x 250 cycles). The amplicon is the V1-V2 region, using the Schloss primer design (27F and 338R).
Workflow Plan:
1) Extract barcodes (extract_barcodes.py): with the option to re-orient reads. I am finding the reverse complement of my i7 adapter / linker / pad and barcodes at the beginning of some of my reads in the read 1 file, and vice versa in the read 2 file (but with the reverse complement of the i5 adapter and forward primer instead).
2) Join paired ends (join_paired_ends.py): with the option to update the index / barcode reads file to match the surviving joined pairs.
3) Split libraries (split_libraries.py): to de-multiplex and QC, with option -z to remove the reverse primer (and adapter / linker / pad sequence).
My Questions:
1) I am struggling to see how the extract barcodes script helps me and how best to use it in my case. On QIIME's website it says: for two index/barcode reads and two fastq reads... This situation can be treated as a special case of paired-end reads. One could supply the index files (labeled as index1.fastq, index2.fastq) and use the --input_type barcode_paired_end:
i.e.: extract_barcodes.py --input_type barcode_paired_end -f index1.fastq -r index2.fastq --bc1_len 8 --bc2_len 8 -o parsed_barcodes/
The output barcodes.fastq file would be used for downstream processing, and the reads1 and reads2 files could be ignored.... (This sentence is part of what I don't understand... I need the read 1 and 2 to join the reads and then de-multiplex samples in the other downstream scripts right?)
2) Setting up the mapping file for the split_libraries.py script with dual barcodes. THIS is my biggest issue. How do I list both barcodes when the formatting and script allow for only one barcode column? How do others handle dual barcodes with this script and mapping file format? I have been reading other pages but I can't find an answer or example on how to handle this and set up a mapping file to properly de-multiplex my dual-indexed samples. Also, I have seen that other scripts use the mapping file, for example to re-orient reads. So getting this right is critical, but I am very stuck on how to set this up in my case.
3) Are there any other examples or resources on handling paired-end Illumina MiSeq data in QIIME for first-time users, specifically for those with 4 original sequencing files (2 read files and 2 index read files)?
Thank you in advance!
Sara
Hi, Sara
Have you solved the problem? I have the same problem as you. My sequence data is from Illumina, with paired-end reads, and each sample has 2 barcodes and 2 index reads. I don't know how to set up the mapping file for the QIIME software. Do you have any suggestions?
Thank you!
Jintian
Take a look at this thread: How do i should proceed with 16S rRNA amplicon sequencing data from Illumina MiSeq using QIIME pipeline?
Hi, Sara
Thank you for your reply. I am still confused after looking at the thread. My sequencing data has not been demultiplexed. I only have paired-end reads (Read 1 and Read 2, with all samples contained in these two files), and 2 barcodes and 2 index reads for each sample. How should I set up the mapping file to demultiplex samples when using the QIIME software? Can you help me?
Thank you!
Jintian
Please use ADD REPLY/ADD COMMENT when responding to existing threads to keep them logically organized. Take a look at this link to see how to set up the mapping file: http://qiime.org/documentation/file_formats.html
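As a rough illustration of the dual-barcode question (a sketch under assumptions, not something confirmed in this thread): one approach often used with QIIME 1 is to treat the two index reads as a single longer barcode. The barcodes.fastq written by extract_barcodes.py with --input_type barcode_paired_end holds both barcodes joined together, so the single BarcodeSequence column in the mapping file would hold index 1 followed by index 2 (16 bases for two 8-base indexes). The Python below builds such a mapping file. The sample IDs, index sequences and primer string are hypothetical placeholders, and whether index 2 has to be reverse-complemented depends on how the run was set up, so check a few records of barcodes.fastq against your sample sheet first.

```python
import csv
import io

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def combined_barcode(index1, index2, rev_comp_index2=False):
    """Join the two index barcodes into one string (index 1 first).

    rev_comp_index2 is an assumption-dependent switch: set it only if the
    index 2 reads in your data are the reverse complement of the sample sheet.
    """
    if rev_comp_index2:
        index2 = reverse_complement(index2)
    return index1 + index2


def mapping_file_text(samples, linker_primer, rev_comp_index2=False):
    """Build tab-separated mapping file text with one combined barcode column.

    samples: iterable of (sample_id, index1, index2, description) tuples.
    """
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(
        ["#SampleID", "BarcodeSequence", "LinkerPrimerSequence", "Description"]
    )
    for sample_id, index1, index2, description in samples:
        barcode = combined_barcode(index1, index2, rev_comp_index2)
        writer.writerow([sample_id, barcode, linker_primer, description])
    return out.getvalue()


# Hypothetical samples and indexes, for illustration only.
samples = [
    ("Sample.1", "ATCGTACG", "TAGCGAGT", "first_sample"),
    ("Sample.2", "ACTATCTG", "CTGCGTGT", "second_sample"),
]
# Placeholder primer string; replace with the forward primer actually used.
FORWARD_PRIMER = "AGAGTTTGATCMTGGCTCAG"

print(mapping_file_text(samples, FORWARD_PRIMER), end="")
```

With a mapping file like this, the demultiplexing step would need to be told that the barcode length is 16 rather than the default 12-base Golay barcode. Running validate_mapping_file.py on the result is a reasonable sanity check before going further.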