How do I deal with different lanes in RNA-seq alignment?
5.2 years ago
zizigolu ★ 4.3k

Hi

I have four lanes per sequencing run, with 4 FASTQ files each, so for each patient I have 16 FASTQ files.

I am aligning each lane separately with STAR, using this command:

STAR --genomeDir ./hg38_Genome --readFilesIn ./fastq1 ./fastq2

At the end I would have 4 SAM files.

Should I merge the SAM files from each lane afterwards, or should I merge the lanes during alignment with STAR?

Any help, please?

Thank you

RNA-Seq STAR alignment • 6.8k views
5.2 years ago
Ram 44k

I'd merge the FASTQs as that is easier.

5.2 years ago
leaodel ▴ 190

You don't have to concatenate your files beforehand when using STAR. Listing the read 1 files (followed by the read 2 files if using paired-end data), separated by commas, should do the trick and save you time:

STAR --readFilesIn file1_1.fastq,file2_1.fastq file1_2.fastq,file2_2.fastq


Are you sure they won't be treated as different samples when comma-separated values are used? The manual states that this syntax is for multi-sample alignment.


It won't! I have processed several datasets where one sample comes with multiple .fastq files due to sequencing depth. It's essentially the same as concatenating them, and of course I tested this before relying on it. It's important not to put spaces between the comma-separated files when you're feeding --readFilesIn.
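
To make the comma-separated form concrete for four lanes, here is a minimal sketch. The file names (sample_L001_R1.fastq and so on) are made up for illustration; only ./hg38_Genome comes from the question. It just builds the two lists and prints the resulting command instead of running STAR, so you can check the lane order first.

```shell
# Hypothetical lane file names; substitute your own.
r1_list=""
r2_list=""
for lane in L001 L002 L003 L004; do
    # Append with a comma and no spaces; the ${var:+...} part skips the comma on the first lane.
    r1_list="${r1_list:+$r1_list,}sample_${lane}_R1.fastq"
    r2_list="${r2_list:+$r2_list,}sample_${lane}_R2.fastq"
done

# R1 and R2 lists must be in the same lane order; a single space separates them.
star_cmd="STAR --genomeDir ./hg38_Genome --readFilesIn $r1_list $r2_list"
echo "$star_cmd"
```

If you want each lane to stay traceable in the output, STAR's --outSAMattrRGline option can, as far as I know, take one read-group entry per comma-separated input file (with the commas there surrounded by spaces); check the manual for your STAR version before relying on that.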


Looks like the STAR manual is wrong, then. It is the second most disappointing manual, right after RSEM.

5.2 years ago
cschu181 ★ 2.8k

If they are from the same replicate and are just spread across lanes to achieve more depth, you may as well merge them before alignment.


Thank you. The reads are paired-end, so should I do it like this?

STAR --genomeDir ./hg38_Genome --readFilesIn ./fastq1_lane1 ./fastq1_lane2 ./fastq1_lane3 ./fastq1_lane4 ./fastq2_lane1 ./fastq2_lane2 ./fastq2_lane3 ./fastq2_lane4

Use cat to concatenate them into a single file per forward and reverse read.


Sorry, how? Do you mean using cat to locate the folders, or something else?


It means concatenating the individual FASTQ files into one file.

cat test_rep1.fasta 
ATGC

cat test_rep2.fasta 
TGCATGC

# merge together
cat test_rep1.fasta > merged.fasta
cat test_rep2.fasta >> merged.fasta

cat merged.fasta 
ATGC
TGCATGC
5.2 years ago

As ATpoint said, concatenate the R1 files together with Unix's cat command (you can do this on the compressed files), do the same for the R2 files, and give those two combined files to STAR. Alternatively, you can align each lane separately and merge the BAMs afterwards with samtools merge.
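
A minimal sketch of the cat approach on compressed files follows. Everything named demo_* is made up so the example can run anywhere in a scratch directory; with real data you would skip the part that creates the files and point the two cat commands at your own lane files.

```shell
# Work in a scratch directory with tiny made-up reads (hypothetical demo_* names).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

for lane in L001 L002; do
    printf '@read_%s/1\nACGT\n+\nIIII\n' "$lane" | gzip > "demo_${lane}_R1.fastq.gz"
    printf '@read_%s/2\nTGCA\n+\nIIII\n' "$lane" | gzip > "demo_${lane}_R2.fastq.gz"
done

# gzip files can be concatenated directly; the result is still a valid gzip file.
# The glob expands in sorted order, so R1 and R2 end up in the same lane order.
cat demo_L00*_R1.fastq.gz > demo_merged_R1.fastq.gz
cat demo_L00*_R2.fastq.gz > demo_merged_R2.fastq.gz

# Sanity check: a FASTQ record is 4 lines, and both mates should have the same count.
r1_reads=$(( $(gzip -dc demo_merged_R1.fastq.gz | wc -l) / 4 ))
r2_reads=$(( $(gzip -dc demo_merged_R2.fastq.gz | wc -l) / 4 ))
echo "R1 reads: $r1_reads, R2 reads: $r2_reads"
```

The merged pair can then go to STAR as a single R1 file and a single R2 file. For gzipped input STAR needs to be told how to decompress, typically with --readFilesCommand zcat.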

Or ask the people who generated the FASTQs to remake them with --no-lane-splitting (a bcl2fastq option), so the reads are not split by lane in the first place.


Sorry, one more question. When I sort my SAM file with samtools sort -n and then get raw read counts with htseq, I have one million non-uniquely mapped reads, but when I sort with Picard SortSam instead I only have 200,000 non-uniquely mapped reads.

In your experience, what am I doing wrong?

Is having that many non-uniquely mapped reads a bad sign?

Should I use Picard SortSam, then?
