Hello, I need some help.
I have to run droptag on 2 fastq.gz files (work assigned for an internship) coming from here
I want to use the droptag utility to demultiplex my fastq.gz files (it extracts the cell barcodes and UMIs from the library). The syntax is like this:
droptag [options] -c config.xml barcode_reads.fastq [barcode_umi_reads.fastq] gene_reads.fastq [library_tags.fastq]
But droptag needs a .fastq barcode file, and all I have are 3 .txt files with the barcodes of the 3 rounds of SPLIT-seq barcoding. Their format is like this:
For Round1 and 2:
#WellPosition Name Sequence
A1 Round1_01 /5Phos/CGCGCTGCATACTTGAACGTGATCCCATGATCGTCCGA
A2 Round1_02 /5Phos/CGCGCTGCATACTTGAAACATCGCCCATGATCGTCCGA
A3 Round1_03 /5Phos/CGCGCTGCATACTTGATGCCTAACCCATGATCGTCCGA
A4 Round1_04 /5Phos/CGCGCTGCATACTTGAGTGGTCACCCATGATCGTCCGA
For Round3:
#WellPosition Name Sequence
A1 Round3_01 /5Biosg/CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNAACGTGATGTGGCCGATGTTTCG
A2 Round3_02 /5Biosg/CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNAAACATCGGTGGCCGATGTTTCG
A3 Round3_03 /5Biosg/CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNATGCCTAAGTGGCCGATGTTTCG
I don't know what to do. Do I have to write a script? For now I don't even know what a barcode.fastq file should look like. Or is there already a utility that solves this problem? I looked on the web but didn't find anything promising.
And yes, I tried, droptag doesn't work with .txt files.
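If a script does turn out to be needed, a minimal sketch in Python for reading the barcode tables above might look like the following. It only parses the .txt format shown and strips the 5' modification tag (/5Phos/, /5Biosg/); it does not produce anything droptag is known to accept. The function names and the offset of 15 are assumptions: the offset is inferred by comparing the Round1 rows shown above, where the sequences differ in an 8-base stretch after the first 15 bases, and it may not hold for other rounds or files.

```python
import re

# 5' modification tags such as /5Phos/ or /5Biosg/ precede the bases.
MOD_TAG = re.compile(r"^/[^/]+/")


def parse_barcode_table(text):
    """Return {name: sequence} from a '#WellPosition Name Sequence' table."""
    barcodes = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the '#WellPosition ...' header.
        if not line or line.startswith("#"):
            continue
        well, name, seq = line.split()
        barcodes[name] = MOD_TAG.sub("", seq)
    return barcodes


def variable_region(seq, start, length=8):
    """Slice out the stretch that differs between wells.

    'start' and 'length' are assumptions inferred from the rows shown,
    not values taken from any official specification.
    """
    return seq[start:start + length]


round1 = """#WellPosition Name Sequence
A1 Round1_01 /5Phos/CGCGCTGCATACTTGAACGTGATCCCATGATCGTCCGA
A2 Round1_02 /5Phos/CGCGCTGCATACTTGAAACATCGCCCATGATCGTCCGA
"""

table = parse_barcode_table(round1)
for name, seq in table.items():
    print(name, variable_region(seq, start=15))
```

For the Round3 rows shown, the same comparison suggests the differing stretch starts after the 22 fixed bases and the 10 N positions, i.e. start=32, again as an assumption to check against the full file.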
In fact, it's okay: in this RNA-seq data the first raw fastq file contains only the reads and the second contains the barcodes, so the demultiplexing phase can process them together just fine.
So did the solution suggested by @ale_abd work in your case?
No, I couldn't apply it in my case, as the output is not really compatible with what droptag is looking for. But it could be a good solution for other people.