Issue with the Arima Genomics pipeline for Hi-C scaffolding


8 months ago

rj.rezwan ▴ 10

Hi, I am using the Arima Genomics mapping pipeline (https://github.com/ArimaGenomics/mapping_pipeline) to scaffold contigs into chromosomes. I have an issue at "Step 3A: Pair reads & mapping quality filter". After I ran this step, the combined file was only 72 MB, which is confusing because my f1 and f2 input files are 27 GB each. I also checked my log file, and it contains no errors. Could you please explain whether this file size is normal or whether there is an error in my script?

Here is my script:

MAPQ_FILTER=10
module load samtools/1.16.1 perl/5.38.0/intel2022.3
export PATH=/ibex/user/tariqr/scaffolding/arima_pipeline/:$PATH

two_read_bam_combiner.pl abc_f1.filter.bam abc_f2.filter.bam samtools $MAPQ_FILTER | samtools view -bS -t abc_assembly.hic.p_ctg.fasta - | samtools sort -@ 32 -o abc.combine.bam
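
File size alone is hard to interpret, so comparing record counts before and after this step may make the question more concrete. The following is only a minimal sketch, not part of the Arima pipeline: the helper names `count_reads` and `pct_retained` are hypothetical, and it assumes that each per-mate input BAM holds roughly one record per read and that the combined BAM holds two records per retained pair.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking how many read pairs survive Step 3A.

# Count alignment records in a BAM file (samtools view -c prints the count).
count_reads() {
    samtools view -c "$1"
}

# Percentage of pairs retained.
#   $1 = record count of the combined BAM (assumed: two records per pair)
#   $2 = record count of one per-mate input BAM (assumed: one record per read)
pct_retained() {
    awk -v out="$1" -v in_one="$2" 'BEGIN {
        if (in_one == 0) { print "NA"; exit }
        printf "%.2f\n", 100 * (out / 2) / in_one
    }'
}

# Example usage with the file names from the script above:
#   n_f1=$(count_reads abc_f1.filter.bam)
#   n_out=$(count_reads abc.combine.bam)
#   echo "pairs retained: $(pct_retained "$n_out" "$n_f1")%"
```

A very low percentage would suggest that reads are being dropped during pairing or MAPQ filtering rather than the output simply compressing well.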

plant pacbio assembly genome Hi-C
