Question

STAR --readFilesIn error when using Unmapped.out.mate1/2 files

5.2 years ago

kokoko • 0

Hello, I'm very new to STAR. I am trying to run STAR on the output of its --outReadsUnmapped Fastx option (the Unmapped.out.mate1/2 files). Although they are FASTQ files, STAR keeps showing me this error:

EXITING because of fatal input ERROR: could not open readFilesIn=

And this is my command:

STAR  --runThreadN 12      \
  --runMode alignReads \
  --genomeSAindexNbases 10  \
  --genomeDir  ${PROJECT_DIR}ref_bacteria/ \
  --readFilesIn ${PROJECT_DIR}align/Sample_1/Sample_1Unmapped.out.mate1 \ 
  --outFileNamePrefix ${PROJECT_DIR}align/Sample_1/Sample_1    \
  --outFilterMismatchNoverLmax 0.02                            \          
  --outSAMtype BAM SortedByCoordinate  \
  --outSAMattributes All               \
  -quantMode GeneCounts                \
  --outSAMunmapped Within              \
  --sjdbGTFfile ${PROJECT_DIR}ref_bacteria/genes.gtf \
  --sjdbOverhang 100;

done

Can you tell me why I cannot use these Unmapped.out.mate1/2 files?

Thanks.

star RNA-Seq • 3.8k views

ADD COMMENT • link updated 3.0 years ago by benformatics 4.2k • written 5.2 years ago by kokoko • 0

There is no --outReadsUnmapped Fastx in your command line.

Furthermore, the error says that your --readFilesIn is not correct:

--readFilesIn ${PROJECT_DIR}align/Sample_1/Sample_1Unmapped.out.mate1

It should be a path to a FASTQ file, like:

--readFilesIn ${PROJECT_DIR}align/Sample_1/Sample_1.fastq

ADD REPLY • link 5.2 years ago by Bastien Hervé 6.5k

I think they are using the --outReadsUnmapped Fastx output as the input for this command.

Does the ${PROJECT_DIR} variable end in / ?

Can you run head on the Sample_1Unmapped.out.mate1 file?
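Two quick checks along these lines can be sketched in the shell. The first confirms that the expanded path points at a readable file. The second looks for whitespace after a trailing backslash in the job script: in that case the backslash escapes the space instead of continuing the line, so the command ends early and the program can receive a stray blank argument. The command as posted appears to have such whitespace on the --readFilesIn line, though that may only be a copy-paste artifact. Both function names and the script name `run_star.sh` are hypothetical.

```shell
# Hypothetical helper: report whether a reads file exists and is readable.
check_reads() {
  if [ -r "$1" ]; then
    echo "OK: $1"
  else
    echo "MISSING: $1"
    return 1
  fi
}

# Hypothetical helper: list the lines of a script where a backslash is
# followed by trailing whitespace, so it no longer continues the line.
find_broken_continuations() {
  grep -nE '\\[[:space:]]+$' "$1"
}

# Example usage (paths are assumptions):
#   check_reads "${PROJECT_DIR}align/Sample_1/Sample_1Unmapped.out.mate1"
#   find_broken_continuations run_star.sh
```

If the second check prints any line numbers, deleting the whitespace after those backslashes is worth trying before anything else.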

ADD REPLY • link 5.2 years ago by benformatics 4.2k

Yes, ${PROJECT_DIR} starts and ends with /. I can head that file, and it is certainly a FASTQ file.

It shows:

@A00718:115:HT7HLDSXX:4:1128:26549:26616 0:N:  00
GTCACCATGATGTCAGAGACAGGAATAACCTAAAATCCTCTGAGGGGTAGGTAATTCCAGACCTGGTGTTAAAAGGCCCCTCAGCAACCTTTTGTCATCAC
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00718:115:HT7HLDSXX:4:1128:3613:26631 0:N:  00
GTTCAGCACAAACACTCCCTTGTCCACAGCCACTAGCCCAACTCGCGCCCCCTGTTTAACTTCAATACTGAGTGTCGTTTGAAGCCCAGGTGCGAGATGTT
+
::FFFF:F,:,,FF,F,FFFF,F,FFFFFF:F,FF::,:,F:F:,,,F,,::F:,,F,,,,FF:FF,F:F:,F,FF,:,,:,:FFFF,,:,,,::FFF:,F
@A00718:115:HT7HLDSXX:4:1128:6198:26631 0:N:  00
ATCAAATACAAAGCTTTTTACAAAATTTTGAAGGCTGAACTCACTATGCACTAAGAGTTGTGCAAAGGGATTTACATATGTAATCTCAGTTAGTACTCAAA

ADD REPLY • link updated 5.2 years ago by Bastien Hervé 6.5k • written 5.2 years ago by kokoko • 0

benformatics is right. I used the --outReadsUnmapped Fastx output from another run as input.

ADD REPLY • link 5.2 years ago by kokoko • 0

Hello! I may be late to this topic, but I'm trying to do a very similar analysis and I'm quite desperate, because I don't really know how to.

Did you finally get to analyze those _unmapped.out.mate1/2 files? Is there any way to convert them to .fastq format?

And, apart from that, which bacteria reference database did you use?

I'm in my first days of analyzing RNA-seq data and I just need to go on analyzing those unmapped sequences.

Thank you so much in advance :)

ADD REPLY • link 4.3 years ago by miriam.gorostidi • 0

Hello, the unmapped files from STAR are FASTQ files. If you do:

head Sample_1Unmapped.out.mate1

You will see they are FASTQ-formatted, but I don't know why STAR does not add the .fastq extension.

Anyway, you can just add the extension by renaming the file:

mv Sample_1Unmapped.out.mate1 Sample_1Unmapped.out.mate1.fastq
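With several samples, the same rename can be looped over every unmapped-reads file in a directory. This is only a sketch: the function name is hypothetical, and it assumes the files follow the `*Unmapped.out.mate1` / `*Unmapped.out.mate2` naming seen in this thread.

```shell
# Hypothetical helper: append .fastq to every STAR unmapped-reads file
# (names ending in Unmapped.out.mate1 or Unmapped.out.mate2) in a directory.
add_fastq_extension() {
  dir="$1"
  for f in "$dir"/*Unmapped.out.mate[12]; do
    [ -e "$f" ] || continue   # the glob matched nothing
    mv -- "$f" "$f.fastq"
  done
}

# Example usage (directory is an assumption):
#   add_fastq_extension "${PROJECT_DIR}align/Sample_1"
```

Files that already end in .fastq do not match the pattern, so running it twice leaves them alone.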

ADD REPLY • link 4.3 years ago by Bastien Hervé 6.5k

Thank you so much!! I didn't expect it to be that easy... but indeed it worked :) Thank you!

ADD REPLY • link 4.3 years ago by miriam.gorostidi • 0

Can you specify the output path/name of the unaligned reads?
I want to align several files in parallel, so each output file should have a unique name.

ADD REPLY • link 3.0 years ago by elisheva ▴ 120

The sample name should carry over to the unmapped file names if you set it with --outFileNamePrefix.
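As a rough illustration of that idea, the sketch below prints one STAR command per sample, each with its own --outFileNamePrefix, so the unmapped reads would land in files such as align/Sample_1_Unmapped.out.mate1. It only prints the commands and does not run them. The function name, the `_R1.fastq` / `_R2.fastq` input naming, and the directory names are all assumptions made for the example.

```shell
# Hypothetical helper: print (not run) a STAR command for one sample,
# using the sample name as a unique output prefix.
star_cmd() {
  sample="$1"; genome_dir="$2"; reads_dir="$3"; out_dir="$4"
  printf 'STAR --genomeDir %s --readFilesIn %s %s --outReadsUnmapped Fastx --outFileNamePrefix %s\n' \
    "$genome_dir" \
    "$reads_dir/${sample}_R1.fastq" \
    "$reads_dir/${sample}_R2.fastq" \
    "$out_dir/${sample}_"
}

# Dry run over two assumed sample names.
for s in Sample_1 Sample_2; do
  star_cmd "$s" ref/ reads align
done
```

Because every output file name starts with the prefix, jobs run side by side should not overwrite each other as long as the prefixes differ.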

ADD REPLY • link 3.0 years ago by benformatics 4.2k