Dear Friends, hi. (I'm not a native English speaker, so be ready for some possible language flaws.)
I have aligned my 6 pairs of left and right fastq files (2 treatments, 3 biological replicates each, paired-end RNA-seq, non-model fish) with STAR to create the "coordinate-sorted bam" files that Trinity needs for its genome-guided approach.
But now I have 6 coordinate-sorted bam files, and the Trinity command takes just one:
Trinity --genome_guided_bam rnaseq.coordSorted.bam \
        --genome_guided_max_intron 10000 \
        --max_memory 10G --CPU 10
What should I do now? Must I run the above command 6 times, or merge all 6 coordinate-sorted bam files into a single coordinate-sorted bam file?
These are the commands I used for STAR indexing and aligning, if needed:
STAR --runMode genomeGenerate --runThreadN 20 --genomeDir '/home/Zebrafish-genome-index-STAR' --genomeFastaFiles '/home/Zebrafish-genome-index-STAR/GCF_000002035.5_GRCz10_genomic.fasta'
Then I ran the command below 6 times, once for each of my 6 fastq sets:
STAR --genomeDir '/home/Zebrafish-genome-index-STAR' --runThreadN 24 --readFilesIn '/home/F1left.fastq' '/home/F1right.fastq' --outFileNamePrefix F1_Zebra --outSAMtype BAM SortedByCoordinate
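The six runs could also be written as a loop. This is only a minimal sketch: F1 is the one sample name given here, so the other five names (F2, F3, C1, C2, C3) and the helper name star_cmd are hypothetical, and the fastq files are assumed to follow the same /home/<sample>left.fastq and /home/<sample>right.fastq pattern.

```shell
# Sketch: print the STAR alignment command for each sample.
# Only F1 is a real sample name from this post; the others are placeholders.
star_cmd() {
    sample="$1"
    echo STAR --genomeDir /home/Zebrafish-genome-index-STAR \
        --runThreadN 24 \
        --readFilesIn "/home/${sample}left.fastq" "/home/${sample}right.fastq" \
        --outFileNamePrefix "${sample}_Zebra" \
        --outSAMtype BAM SortedByCoordinate
}

for s in F1 F2 F3 C1 C2 C3; do
    star_cmd "$s"
done
```

As written, this only prints the six commands so they can be checked first; removing the echo inside star_cmd would run STAR directly.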
Since you can specify only one bam as a genome guide, I suppose you will have to merge all six.
Dear genomax2, Hi and thank you.
Should I merge them with the Linux "cat" command, or is some tool or program needed?
samtools merge
would be the way to do it. Re-sort after merging. It does not look like you need to keep sample identification lines.
For "Re-sort after merging", do I need other tools?
About "to keep sample identification lines", I really don't know yet!
Thanks
samtools sort -o merged_sorted.bam merged.bam
will do it.
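Putting the steps from this thread together, a rough sketch might look like the following. The function names (run, merge_for_trinity), the DRY_RUN switch, and the intermediate file name merged.bam are assumptions made for illustration. The input names in the usage line assume STAR's usual <prefix>Aligned.sortedByCoord.out.bam output naming, and every sample name other than F1 is hypothetical.

```shell
# Sketch: merge the per-sample BAMs, re-sort, then give Trinity the single BAM.
# DRY_RUN=1 (the default here) only prints the commands; DRY_RUN=0 executes them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

merge_for_trinity() {
    # Arguments: the coordinate-sorted BAM files, one per sample.
    if [ "$#" -lt 2 ]; then
        echo "need at least two BAM files" >&2
        return 1
    fi
    run samtools merge merged.bam "$@"
    run samtools sort -o merged_sorted.bam merged.bam
    run Trinity --genome_guided_bam merged_sorted.bam \
        --genome_guided_max_intron 10000 \
        --max_memory 10G --CPU 10
}
```

Usage would be along the lines of: merge_for_trinity F1_ZebraAligned.sortedByCoord.out.bam F2_ZebraAligned.sortedByCoord.out.bam (and so on for all six files). Checking the printed commands first and then re-running with DRY_RUN=0 is the intended workflow.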