Question

Mapping paired-end reads to a draft genome using HISAT2

0

6.8 years ago

Farbod ★ 3.4k

Dear Biostars, Hi

I have RNA-seq data from a salmon species (3 biological replicates each for cond1 and cond2, all paired-end, so 12 FASTQ files), and I have already run a Trinity de novo assembly and a DEG analysis on them.

Recently, the draft genome of that salmon species has been released.

I want to run a genome-guided analysis followed by a DEG analysis on the same data and compare the results.

With a lot of help from Biostars and @Kevin Blighe, I have chosen the HISAT2 and StringTie approach for this purpose.

First, I built the genome index using:

./hisat2-build -p 6 '/home/Salmon-genome/GCF_salmon_genome.fna' ht2_base_salmon_genome

Now I have 8 *.ht2 files and want to map my reads to that reference.

Q1: Please check whether the mapping command below, which I found here, is correct for my case. In particular, I need help with paired-end mapping: what should I write instead of sample_1.fastq and sample_2.fastq, given that I have paired-end reads?

./hisat2 -p 6 -x path/to/reference/fileName -1 path/to/fastqFile/sample_1.fastq -2 path/to/fastqFile/sample_2.fastq -S /path/to/outDir/fileName.sam &> /path/to/outDir/fileName.sam.info

Q2: Can I map all 12 files to the reference genome in one script? If so, how? (Is it by chaining commands with && and repeating the command with different FASTQ file names?)

Thanks

RNA-Seq alignment HISAT2 genome guided • 4.9k views

ADD COMMENT • link updated 6.8 years ago by GenoMax 150k • written 6.8 years ago by Farbod ★ 3.4k

score 3 · Accepted Answer · 2018-06-24

3

6.8 years ago

GenoMax 150k

Q1: Replace sample_1.fastq with your R1 data file and sample_2.fastq with your R2 data file.

Q2: You could follow a parallel approach (see "Script to run blast locally with multiple files in a directory as queries") or use the method described in "Hisat2 multiple paired end reads".
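For Q2, a plain loop over sample names is one simple alternative to chaining commands with &&. The sketch below is a dry run: it only prints one HISAT2 command per sample so the paths can be checked before anything is executed. It assumes files named <sample>-left.fq and <sample>-right.fq (the naming given later in this thread); reads_dir and out_dir are hypothetical placeholders, not paths from the original post.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints one HISAT2 command per sample instead of running it.
# Assumptions: reads_dir and out_dir are hypothetical placeholders;
# the index basename is the one passed to hisat2-build in the question.

index="ht2_base_salmon_genome"
reads_dir="reads"     # hypothetical directory holding the FASTQ files
out_dir="aligned"     # hypothetical output directory

build_cmd() {
    # $1 = sample name (e.g. J1); left reads go to -1, right reads to -2
    local s="$1"
    printf './hisat2 -p 6 -x %s -1 %s/%s-left.fq -2 %s/%s-right.fq -S %s/%s.sam\n' \
        "$index" "$reads_dir" "$s" "$reads_dir" "$s" "$out_dir" "$s"
}

# One command per sample; samples are processed one after another.
for sample in J1 J2 J3 H1 H2 H3; do
    build_cmd "$sample"
done
```

Once the printed commands look right, the output can be piped to bash to run them in sequence. The per-sample log redirection from the question (&> fileName.sam.info) is left out here and would need to be added to the printed command if wanted.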

ADD COMMENT • link 6.8 years ago by GenoMax 150k

0

Entering edit mode

Dear @genomax, Hi and thank you. I guess all other parameters in the script were fine.

would you please explain more what do you mean by "R1 data file" ?

My fastq files names are as :

J1-left.fq, J1-right.fq, J2-left.fq, J2-right.fq, J3-left.fq, J3-right.fq, H1-left.fq, H1-right.fq, H2-left.fq, H2-right.fq, H3-left.fq and H3-right.fq.

ADD REPLY • link 6.8 years ago by Farbod ★ 3.4k

1

left is likely equivalent to R1, and right to R2. You can check the read headers of the left and right files; you may see something like:

R1 - @EAS139:136:FC706VJ:2:2104:15343:197393 1:Y:18:ATCACG

R2 - @EAS139:136:FC706VJ:2:2104:15343:197393 2:Y:18:ATCACG
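A quick way to make that check is to compare the read ID in the first header line of each file. The sketch below is only a sanity check on the first read, not a full pairing validation. The helper names first_read_id and same_first_read are made up for illustration. It handles headers like the ones above (ID, then a space, then the mate number) and also the older Illumina style where the ID ends in /1 or /2.

```shell
#!/usr/bin/env bash
# Compare the read ID on the first header line of two FASTQ files.
# Helper names are hypothetical; this checks only the first read.

first_read_id() {
    # First header line, cut at the first space, with a trailing /1 or /2 removed.
    head -n 1 "$1" | cut -d ' ' -f 1 | sed -E 's#/[12]$##'
}

same_first_read() {
    # Succeeds when both files start with the same read ID.
    [ "$(first_read_id "$1")" = "$(first_read_id "$2")" ]
}

# Run the check only when two file names are given on the command line.
if [ "$#" -eq 2 ]; then
    if same_first_read "$1" "$2"; then
        echo "first read IDs match"
    else
        echo "first read IDs differ"
    fi
fi
```

Saved as a script, it could be called with a left/right pair such as J1-left.fq and J1-right.fq. Matching IDs suggest the two files are mates in the same order; the mate number after the space (1 or 2) then shows which file is R1.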

ADD REPLY • link 6.8 years ago by GenoMax 150k

0

Hi there, I ran:

./hisat2 -p 6 -x ht2_base_salmon_genome -1 '/media/Seagate Backup Plus Drive/Bioinformatics/RNA-Seq/RNA_Seq_Data/clean_reads/H1/H1_clean_left.fq' -2 '/media/Seagate Backup Plus Drive/Bioinformatics/RNA-Seq/RNA_Seq_Data/clean_reads/H1/H1_clean_right.fq' -S H1.sam &> H1.sam.info

and then found this warning in the H1.sam.info file:

Warning: Same mate file "/media/Seagate" appears as argument to both -1 and -2

Extra parameter(s) specified: "Backup", "Plus", "Drive/Bioinformatics/RNA-Seq/RNA_Seq_Data/clean_reads/H1/H1_clean_left.fq", "Backup", "Plus", "Drive/Bioinformatics/RNA-Seq/RNA_Seq_Data/clean_reads/H1/H1_clean_right.fq"

Note that if <mates> files are specified using -1/-2, a <singles> file cannot also be specified. Please run bowtie separately for mates and singles.

Error: Encountered internal HISAT2 exception (#1)

Command: /home/Desktop/salmon-genome-2018/hisat2-2.1.0/hisat2-align-s --wrapper basic-0 -p 6 -x ht2_base_salmon_genome -S H1.sam -1 /media/Seagate -2 /media/Seagate Backup Plus Drive/Bioinformatics/RNA-Seq/RNA_Seq_Data/clean_reads/H1/H1_clean_left.fq Backup Plus Drive/Bioinformatics/RNA-Seq/RNA_Seq_Data/clean_reads/H1/H1_clean_right.fq

(ERR): hisat2-align exited with value 1

ADD REPLY • link 6.8 years ago by Farbod ★ 3.4k

2

You would be wise to remove spaces from file and directory paths. The error output shows the path being split at each space, so that only "/media/Seagate" reaches -1 and -2. I suggest you rename Seagate Backup Plus Drive to Seagate_Backup_Plus_Drive.
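If renaming the drive is not practical, one possible workaround is a symbolic link that exposes the same directory under a space-free path. This is a sketch under assumptions: the link location ($HOME/seagate in the usage comment) is a hypothetical choice, the helper name link_without_spaces is made up, and it has not been tested against HISAT2 itself.

```shell
#!/usr/bin/env bash
# Expose a directory whose path contains spaces under a space-free name.
# The helper name and the link location are hypothetical choices.

link_without_spaces() {
    # $1 = existing directory (may contain spaces)
    # $2 = new link path, which must not contain spaces itself
    case "$2" in
        *" "*) echo "link path contains a space: $2" >&2; return 1 ;;
    esac
    ln -s "$1" "$2"
}

# Example, left commented out because the paths are machine-specific:
# link_without_spaces '/media/Seagate Backup Plus Drive' "$HOME/seagate"
```

The FASTQ files would then be reachable through the link path, with the rest of the directory structure unchanged, and that space-free path is what would be passed to -1 and -2.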

ADD REPLY • link 6.8 years ago by GenoMax 150k

0

I know this is not a bioinformatics question, but do you have any idea how I can change the name of my external hard drive in Ubuntu 14.04?

I tried and got this error:

Cannot change label on mounted device of type filesystem:ntfs. (udisks-error-quark, 11)

ADD REPLY • link 6.8 years ago by Farbod ★ 3.4k

1

See if this helps: https://askubuntu.com/questions/731579/problem-in-renaming-the-partition-and-opening-a-file-through-terminal

If your drive is formatted as NTFS (the Windows file system), are you able to write to it under Ubuntu? As I recall, that requires installing some additional software (which you may already have done).

ADD REPLY • link 6.8 years ago by GenoMax 150k