FASTQ files from Illumina MiSeq to the "mapping file" in QIIME... Are barcode sequences needed?
8.5 years ago

Hello!

I am very new to QIIME and could really use some help with the simple stuff!

I am working with soil microbial community data, for which I have received sequences from Genewiz (Illumina MiSeq 16S sequencing, V3-V5). I am trying to make the mapping file, but I don't seem to have the barcode sequences. I believe this data has already been demultiplexed. Do I need the barcode and primer sequences to set up the mapping file, or can I do without them? I am struggling very hard! I know this is a simple question, but any help in getting the FASTQ files formatted correctly would be appreciated. Thank you!

Qiime Barcode fastq • 5.8k views
8.5 years ago

Hi,

I battled with this problem too, until I found the answer. I assume that your FASTQ files were already demultiplexed, that you are attempting to use the split_libraries_fastq.py command (see http://qiime.org/scripts/split_libraries_fastq.html), and that you have already merged your paired-end read files into a single file. In the QIIME documentation for this command you'll see that there is an option --barcode_type, which accepts the value 'not-barcoded'. You can use that one.

Putting things together, the QIIME command I used was:

 split_libraries_fastq.py \
    -i "$FASTQ" \
    -o slout \
    --barcode_type 'not-barcoded' \
    -m "$MAP_FILE" \
    --sample_ids sample.1

$FASTQ is the original FASTQ file and $MAP_FILE is your map file.
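
If you have several already-demultiplexed FASTQ files, the documentation for split_libraries_fastq.py describes -i and --sample_ids as comma-separated lists, with one sample ID per input file. Below is a minimal sketch of how such a command line could be assembled in Python. The helper name build_split_libraries_command and the file paths are hypothetical, and deriving the sample IDs from the file names is only my assumption (any unique IDs in the same order as the inputs should do). I have not checked what the map file needs to contain in the multi-file case, so treat this as a starting point.

```python
import os
import re


def build_split_libraries_command(fastq_paths, map_file, out_dir="slout"):
    """Return the argument list for one split_libraries_fastq.py call.

    Hypothetical helper: it only assembles the command, it does not run QIIME.
    """
    sample_ids = []
    for fastq_path in fastq_paths:
        # Assumption: use the file name (without extension) as the sample ID.
        name = os.path.splitext(os.path.basename(fastq_path))[0]
        # Keep only letters, digits and periods in the sample ID.
        sample_ids.append(re.sub(r"[^A-Za-z0-9.]", ".", name))
    return [
        "split_libraries_fastq.py",
        "-i", ",".join(fastq_paths),
        "-o", out_dir,
        "-m", map_file,
        "--sample_ids", ",".join(sample_ids),
        "--barcode_type", "not-barcoded",
    ]


# Hypothetical file names, for illustration only.
cmd = build_split_libraries_command(
    ["reads/soilA.fastq", "reads/soilB.fastq"], "dummy.map.txt"
)
print(" ".join(cmd))
```

The list form can be passed directly to subprocess.call if you prefer to launch QIIME from Python instead of pasting the printed command into a shell.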

I had to analyze many FASTQ files, and to avoid writing a map file by hand for each one, I wrote the following Python script, which generates a new "dummy" map file each time I need to run QIIME:

# Makes print() behave the same under Python 2 (used by QIIME 1) and Python 3
from __future__ import print_function

import sys
from os import path

def main():
    # First argument: the FASTQ file. Optional second argument: the map file name.
    fastq_filename = sys.argv[1]
    if len(sys.argv) > 2:
        new_mapping_file_name = sys.argv[2]
    else:
        # Default: replace the FASTQ file's extension with ".map.txt"
        new_mapping_file_name = path.splitext(fastq_filename)[0] + ".map.txt"

    # Barcode and linker/primer columns are left empty on purpose
    mapping_file_header = ['#SampleID','BarcodeSequence','LinkerPrimerSequence','Description','File']
    mapping_file_line1 = ['sample.1','','','single_file',path.basename(fastq_filename)]

    with open(new_mapping_file_name,"w") as f:
        f.write("\t".join(mapping_file_header) + "\n")
        f.write("\t".join(mapping_file_line1) + "\n")

    print("New mapping file is", new_mapping_file_name)

if __name__ == '__main__':
    main()

This script accepts two arguments: the first is your FASTQ filename and the second (optional) is the name you wish to give the map file. As you can see in the code, the map file is a tab-separated values file with just two lines: a header line and a single sample line whose 2nd and 3rd columns (barcode and linker/primer sequence) are empty. 'sample.1' is a dummy name I chose for the sample, and you can modify it at will. I don't think QIIME cares about the name itself, but if you change it, remember to change the sample name accordingly in the command-line option --sample_ids.

Cheers


@pescadordigital I have a follow-up question that I think is simple to solve, but I am very new to QIIME. I am running QIIME in a Jupyter Notebook. When I run your code for creating a dummy map, I get the following output:

New mapping file is /Users/gid/Library/Jupyter/runtime/kernel-5a62ed2f-e700-4161-ae33-75e8a7c79c3f.json

As you can see, the name of the JSON file is very long, and it is not saved in the working directory where my FASTQ files are located. Can you tell me how to correct these two things, so that I can run the first lines of code you provided? Any help would be really appreciated.
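
A likely explanation for that output: inside a Jupyter kernel, sys.argv holds the kernel's own launch arguments (typically "-f" followed by the path of the kernel connection file) rather than anything you typed, so the script treats that JSON path as its second argument. One possible way around it, sketched below, is to wrap the same logic in a function that takes the FASTQ path directly. The function name write_dummy_map and the demo file name are hypothetical, and the temporary directory is only there so the example runs anywhere. In your notebook you would pass the real path to your own FASTQ file.

```python
import os
import tempfile


def write_dummy_map(fastq_path, map_path=None, sample_id="sample.1"):
    """Write a two-line dummy map file and return its path.

    Hypothetical notebook-friendly variant of the script above: the paths
    come in as function arguments, so sys.argv is never consulted.
    """
    if map_path is None:
        # Default: same folder and base name as the FASTQ file.
        map_path = os.path.splitext(fastq_path)[0] + ".map.txt"
    header = ["#SampleID", "BarcodeSequence", "LinkerPrimerSequence",
              "Description", "File"]
    # Barcode and linker/primer columns are left empty.
    row = [sample_id, "", "", "single_file", os.path.basename(fastq_path)]
    with open(map_path, "w") as f:
        f.write("\t".join(header) + "\n")
        f.write("\t".join(row) + "\n")
    return map_path


# Demo with a made-up file name in a temporary directory.
demo_dir = tempfile.mkdtemp()
demo_fastq = os.path.join(demo_dir, "soil_reads.fastq")
map_path = write_dummy_map(demo_fastq)
print("New mapping file is", map_path)
```

Because the default map file name is derived from the FASTQ path, the map file lands next to the FASTQ file and gets a short name based on it. If you change sample_id here, the value given to --sample_ids should be changed to match.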
