Hi,
I have FASTA sequences from a sample that includes spike-in controls. How can I align those spike-ins against the database using bowtie2 and filter them to create FASTQ files of the mapped reads? Any references or scripts to perform the task would be appreciated.
Thanks
Depending on how long and specific your spike-ins are, I recommend using BBMap's Seal, which can both remove and quantify them using kmer-matching. For example:
Hi swbarnes2, I prepared a new *.fa file with all the spike-in sequences, as shown below:
Would you be kind enough to send me a Bowtie2 script, given that my test file is B12_015.fastq? Thank you very much.
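For the Bowtie2 route, here is a minimal sketch rather than a tested pipeline. It only builds the two command lines (index the spike-in FASTA, then align the reads) so they can be inspected before running. B12_015.fastq comes from the question; the file name spikein.fa, the index prefix, and the output names are placeholders I made up, and the reads are assumed to be single-end. The flags used (-x, -U, -S, -p, --al, --un) are standard bowtie2 options: --al writes reads that aligned at least once to a FASTQ file, --un writes the ones that did not.

```python
import shlex


def bowtie2_commands(ref_fasta, reads_fastq, index_prefix="spikein_index", threads=4):
    """Return the argv lists for indexing the spike-ins and aligning the reads.

    All output file names below are hypothetical placeholders.
    """
    # Step 1: build a bowtie2 index from the spike-in FASTA.
    build = ["bowtie2-build", ref_fasta, index_prefix]
    # Step 2: align single-end reads and split them by alignment status.
    align = [
        "bowtie2",
        "-p", str(threads),                # number of alignment threads
        "-x", index_prefix,                # index built in step 1
        "-U", reads_fastq,                 # unpaired input reads
        "--al", "spikein_mapped.fastq",    # reads that aligned to a spike-in
        "--un", "spikein_unmapped.fastq",  # reads that did not align
        "-S", "spikein.sam",               # alignments in SAM format
    ]
    return [build, align]


# Print the commands so they can be checked or pasted into a shell.
for cmd in bowtie2_commands("spikein.fa", "B12_015.fastq"):
    print(shlex.join(cmd))
```

To actually run them, each list can be passed to subprocess.run(cmd, check=True). If the reads are much longer than the spike-ins, the default end-to-end mode may fail to align them, and bowtie2's --local mode is worth trying.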
With such short sequences, I think kmer-matching will probably work better than alignment... are those the full length of the spike-ins?
Yes, they are the full length of the spike-ins. Although 12 spike-ins were used, I gave only 3 as examples, which should be enough to show the command line for removal.
In that case, if you decide to use Seal, change the flag "k=31" to "k=20", or whatever the length of the shortest spike-in is. And you may want to allow a substitution with the "hdist" flag, e.g.
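To make the effect of k and hdist concrete, here is a small Python sketch of kmer-matching with a Hamming-distance allowance. It illustrates the idea only and is not Seal's implementation. The function names and the 20 nt spike-in sequence are invented for the example. A read counts as a spike-in hit if any of its kmers is within hdist substitutions of a spike-in kmer on either strand.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def kmers(seq, k):
    """Set of all kmers of length k in seq (empty if seq is shorter than k)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def with_substitutions(kmer_set, hdist):
    """Add every kmer within hdist substitutions of the given kmers."""
    current = set(kmer_set)
    for _ in range(hdist):
        expanded = set(current)
        for kmer in current:
            for i, base in enumerate(kmer):
                for alt in "ACGT":
                    if alt != base:
                        expanded.add(kmer[:i] + alt + kmer[i + 1:])
        current = expanded
    return current


def build_index(spikes, k, hdist=0):
    """Map each reference kmer (both strands, plus variants) to a spike-in name."""
    index = {}
    for name, seq in spikes.items():
        both_strands = kmers(seq, k) | kmers(revcomp(seq), k)
        for kmer in with_substitutions(both_strands, hdist):
            index.setdefault(kmer, name)
    return index


def split_reads(reads, index, k):
    """Split (read_id, sequence) pairs into matched and unmatched, with counts."""
    matched, unmatched, counts = [], [], Counter()
    for read_id, seq in reads:
        hit = next((index[km] for km in kmers(seq, k) if km in index), None)
        if hit is None:
            unmatched.append(read_id)
        else:
            matched.append(read_id)
            counts[hit] += 1
    return matched, unmatched, counts


# Hypothetical 20 nt spike-in and three toy reads.
spikes = {"spike1": "ACGTACGTTAGCTAGCTAGG"}
reads = [
    ("r1", "ACGTAAGTTAGCTAGCTAGG"),        # spike1 with one substitution
    ("r2", "T" * 24),                      # unrelated read
    ("r3", revcomp(spikes["spike1"])),     # spike1 on the opposite strand
]
matched, unmatched, counts = split_reads(reads, build_index(spikes, k=20, hdist=1), k=20)
print(matched, unmatched, dict(counts))
```

With hdist=0 the read carrying a substitution (r1) would be left in the unmatched set, which is the reason for allowing one mismatch when k equals the full spike-in length. The matched and unmatched lists correspond to removing the spike-ins, and the per-spike counts correspond to quantifying them.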