Format files in Trinity
6.2 years ago
luzglongoria ▴ 50

Hi there,

I am working with parasites, so my reads contain both parasite and host sequences. To filter the reads and keep only those from the parasite, I ran a selective capture and a selective depletion with bowtie2 (I have the genome of a closely related parasite and the host genome):

Command for selective capture:

bowtie2 --threads 4 --local --no-unal \
-x /home/luz_garcia_longoria/workspace/reference_genomes/parasitereference.fasta \
-q -k 1 --al aligned_reads.fastq \
-1 /home/luz_garcia_longoria/workspace/s21_1.fq,s22_1.fq,s23_1.fq,s24_1.fq,s25_1.fq,s31_1.fq,s32_1.fq,s33_1.fq,s34_1.fq,s35_1.fq \
-2 /home/luz_garcia_longoria/workspace/s21_2.fq,s22_2.fq,s23_2.fq,s24_2.fq,s25_2.fq,s31_2.fq,s32_2.fq,s33_2.fq,s34_2.fq,s35_2.fq | samtools view -b -o aligned_parasite.bam

Command for selective depletion:

bowtie2 --threads 4 --local --no-unal \
-x /home/luz_garcia_longoria/workspace/reference_genomes/parasite_host_reference.fasta \
-q -k 1 --un no_aligned_reads.fastq \
-1 /home/luz_garcia_longoria/workspace/s21_1.fq,s22_1.fq,s23_1.fq,s24_1.fq,s25_1.fq,s31_1.fq,s32_1.fq,s33_1.fq,s34_1.fq,s35_1.fq \
-2 /home/luz_garcia_longoria/workspace/s21_2.fq,s22_2.fq,s23_2.fq,s24_2.fq,s25_2.fq,s31_2.fq,s32_2.fq,s33_2.fq,s34_2.fq,s35_2.fq | samtools view -b -o no_aligned_host_parasite.bam
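One detail worth checking in both commands: bowtie2 treats each comma-separated entry after -1 and -2 as its own file path, so only the first file in each list carries the workspace directory. That is harmless if the command is run from inside that directory, but otherwise the remaining files would not be found. A minimal sketch of building the lists with the directory on every entry follows; the helper name join_reads is hypothetical, and the sample names are the ones from the commands above.

```shell
# Hypothetical helper: build a comma-separated list of read files,
# prefixing every sample with the same directory.
# Usage: join_reads DIR MATE SAMPLE...
join_reads() {
    dir="$1"
    mate="$2"
    shift 2
    list=""
    for sample in "$@"; do
        # Add a comma only between entries, never before the first one.
        list="${list:+$list,}${dir}/${sample}_${mate}.fq"
    done
    printf '%s\n' "$list"
}

# Print the mate 1 and mate 2 lists for the ten samples.
reads_dir=/home/luz_garcia_longoria/workspace
join_reads "$reads_dir" 1 s21 s22 s23 s24 s25 s31 s32 s33 s34 s35
join_reads "$reads_dir" 2 s21 s22 s23 s24 s25 s31 s32 s33 s34 s35
```

The two printed lists can then be passed to -1 and -2, for example through command substitution.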

Now I have two BAM files: one with the reads that aligned to the related parasite genome, and one that (I guess) holds the reads specific to my parasite species.

My next step is de novo assembly with Trinity. The problem is that I am not sure whether I can pass both files to Trinity in one command. I know I can merge the two files and use the merged file, but I am not sure this is correct. While searching, I found a page that briefly explains how to run Trinity with BAM files. This is the command they suggest:

$TRINITY_HOME/Trinity --genome_guided_bam alignments.hisat2.bam \
   --CPU 2 --max_memory 1G --genome_guided_max_intron 5000

So, my questions are:

1. Is it fine to use these two BAM files (combined into one) in this case?
2. How can I choose the value for the option '--genome_guided_max_intron'?
3. Would it be OK to convert my BAM file into a FASTQ file and then run Trinity, or would that be something very stupid?
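To make the BAM-to-FASTQ route from the last question concrete, here is a minimal sketch of what it might look like. It assumes samtools and Trinity are on the PATH and that the BAM holds paired-end reads. The CPU, memory and output names are placeholders, and the run helper and bam_to_trinity function are hypothetical names used only for illustration. By default the script only prints the commands (DRY_RUN=1), so nothing is executed until that is changed.

```shell
# Sketch only: prints the commands by default; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper: BAM -> paired FASTQ -> de novo Trinity assembly.
bam_to_trinity() {
    bam="$1"
    prefix="${bam%.bam}"
    # Sort by read name so that mates sit next to each other.
    run samtools sort -n -o "${prefix}.namesorted.bam" "$bam"
    # Write mate 1 and mate 2 to separate files; discard singletons and other reads.
    run samtools fastq -1 "${prefix}_1.fq" -2 "${prefix}_2.fq" \
        -0 /dev/null -s /dev/null "${prefix}.namesorted.bam"
    # De novo assembly from the read pairs, with no genome guidance.
    # CPU and memory values are placeholders.
    run Trinity --seqType fq --left "${prefix}_1.fq" --right "${prefix}_2.fq" \
        --CPU 4 --max_memory 20G --output "${prefix}_trinity"
}

bam_to_trinity aligned_parasite.bam
```

With this route Trinity sees only reads, so '--genome_guided_max_intron' does not apply; that option belongs to the genome-guided mode shown above.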

Thank you very much in advance.

RNA-Seq trinity format files • 1.8k views