Hello everyone, I'm currently using HISAT2 to map reads against a genome.
Here is the job file:
FILE=Genome.fa
READS1a=SRR9110374_1.fastq.gz
READS2a=SRR9110374_2.fastq.gz
###############################################################
# SOFTWARES #
##############################################################
# MAPPING HISAT2
HISAT2=/TOOLS/hisat2-2.1.0/
#bedtools
BEDTOOLS=/usr/bin/bedtools
###############################################################
# MAP SHORT READS TO INFER COVERAGE DEPTH
##############################################################
# copy on cluster
cd $OUT
# build index
$HISAT2/hisat2-build $FILE mapping_index
# mapping (hisat2 prints its alignment summary to stderr, hence 2>)
$HISAT2/hisat2 -k 1 -q -x mapping_index -1 $READS1a -2 $READS2a -S mapping.sam 2> stats_mapping.txt
But here is the issue:
The R1 and R2 files are quite large, and the .sam file I generate takes up a lot of memory space! So I wondered whether HISAT2 can be given a target coverage to reach. In other words, can we set the number of reads we want to map against the genome, without using all the reads present in the FASTQ files?
Thanks for your help.
Sorry, I was talking about hard disk space. So if I want a mapping coverage of 10X, I have to use -u 10, right?
No, see my answer. I removed the accepted tag as the question is not solved yet, even though Devon Ryan's answer is of course perfectly fine from the technical side on how to limit hisat2 to processing only a certain number of reads.
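To illustrate why -u 10 would not give 10X: -u (--upto) makes hisat2 stop after the first N reads or read pairs, so it is a read count and not a coverage. Below is a rough sketch of converting a target coverage into a value for -u. The genome size (100 Mb) and mate length (150 bp) are hypothetical, since neither is given in the question, and the calculation assumes every pair aligns.

```shell
#!/usr/bin/env bash
# Hypothetical values for illustration only; replace them with your own.
GENOME_SIZE=100000000   # genome length in bp (assumed: 100 Mb)
READ_LENGTH=150         # length of each mate in bp (assumed)
TARGET_COVERAGE=10      # desired mean depth (10X)

# Each read pair contributes 2 * READ_LENGTH bases, so
# pairs = coverage * genome size / (2 * read length), rounded up.
BASES_PER_PAIR=$(( 2 * READ_LENGTH ))
N_PAIRS=$(( (TARGET_COVERAGE * GENOME_SIZE + BASES_PER_PAIR - 1) / BASES_PER_PAIR ))

echo "Read pairs needed for ${TARGET_COVERAGE}X: ${N_PAIRS}"

# Print (do not run) the matching hisat2 call; -u limits the number of pairs processed.
echo "hisat2 -k 1 -q -u ${N_PAIRS} -x mapping_index -1 SRR9110374_1.fastq.gz -2 SRR9110374_2.fastq.gz -S mapping.sam"
```

Because -u simply takes the first N pairs in the files and not all of them will align, the depth actually achieved will probably be somewhat below the target.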