Sequence identifier considerations when merging FASTQ files before alignment
7.4 years ago
alpanagi ▴ 10

I concatenated two FASTQ files from the same sample but different lanes. I tried passing the result to Illumina's BaseSpace to run the STAR alignment, but it failed, saying that the samples are from different lanes (based on the sequence identifier). I edited the identifiers so that both files report the same lane, but I still get a random error with some files.

Now I am setting up the STAR aligner on my own PC. What role does the sequence identifier play in a STAR alignment, and how should I merge these two files so that downstream analysis is not affected?

RNA-Seq alignment sequencing • 2.0k views

Perhaps you should have done this in BaseSpace rather than concatenating the files.


I had no idea this could be done. This helped a lot. Thanks!


"but still get a random error when trying some files."

You'll have to be more specific.


I would have liked to know too, but the analysis doesn't actually describe the error, and support hasn't been able to elaborate so far.

7.4 years ago

The lane has nothing to do with alignment. STAR should run fine on the concatenated files when you run it yourself. The sequence identifier does not affect downstream analysis, except where it is used to keep paired-end mates matched or to detect optical duplicates (which relies on the tile and coordinate fields in the read name).
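To make the merging advice concrete, here is a minimal Python sketch. It assumes gzipped, four-line FASTQ records and Casava 1.8+ style Illumina headers (`@instrument:run:flowcell:lane:tile:x:y ...`). The function names `parse_illumina_id`, `merge_lanes`, and `mates_in_sync` are made up for illustration; they are not part of STAR or BaseSpace. The point is that lanes can be concatenated without touching the identifiers, as long as R1 and R2 are merged in the same lane order.

```python
import gzip
import shutil
from itertools import zip_longest


def parse_illumina_id(header):
    """Split a Casava 1.8+ style header into its fields (assumed format)."""
    name = header.lstrip("@").split()[0]
    instrument, run, flowcell, lane, tile, x, y = name.split(":")[:7]
    return {
        "instrument": instrument,
        "run": run,
        "flowcell": flowcell,
        "lane": int(lane),
        "tile": int(tile),  # tile, x and y are what optical duplicate detection uses
        "x": int(x),
        "y": int(y),
    }


def merge_lanes(lane_files, out_path):
    """Concatenate gzipped per-lane FASTQ files byte for byte.

    Concatenated gzip members form a valid gzip file, so no
    decompression is needed and the read identifiers stay untouched.
    """
    with open(out_path, "wb") as out:
        for path in lane_files:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)


def read_names(path):
    """Yield the read name of every record, without any /1 or /2 suffix."""
    with gzip.open(path, "rt") as fh:
        for i, line in enumerate(fh):
            if i % 4 == 0:  # header line of each four-line record
                name = line.split()[0]
                if name.endswith(("/1", "/2")):
                    name = name[:-2]
                yield name


def mates_in_sync(r1_path, r2_path):
    """Return True if R1 and R2 list the same read names in the same order."""
    pairs = zip_longest(read_names(r1_path), read_names(r2_path))
    return all(a == b for a, b in pairs)
```

For paired-end data, call `merge_lanes` once for the R1 files and once for the R2 files, passing the lanes in the same order both times, then confirm the result with `mates_in_sync` before handing the merged files to the aligner.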


This has more to do with how BaseSpace implements apps (of which STAR is one, I suppose).


That is ultimately what I wanted to know. Thanks!
