Question

Error identifying alternative splicing using splAdder

7.5 years ago

bisht20diksha ▴ 30

Hello. I am trying to identify alternative splicing using splAdder. I have an annotation GTF file and three barley replicates each for control and treated samples, i.e. c1, c2, c3 and t1, t2, t3. I have generated sorted BAM and .bai files and put them in a working directory. When I ran splAdder,

python2.7 spladder.py -a genome_annotationfile.gtf -b c1Aligned.sorted,c2Aligned.sorted,c3Aligned.sorted,t1Aligned.sorted,t2Aligned.sorted,t3Aligned.sorted -o splAdder_result

it showed the following output:

/home/aasim/.local/lib/python2.7/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
WARNING: barley_genome_annotationfile.gtf does not have gene level information for transcript HORVU1Hr1G000010.1 - information has been inferred from tags
WARNING: barley_genome_annotationfile.gtf does not have gene level information for transcript HORVU1Hr1G000020.1 - information has been inferred from tags
WARNING: barley_genome_annotationfile.gtf does not have gene level information for transcript HORVU1Hr1G000060.1 - information has been inferred from tags
WARNING: barley_genome_annotationfile.gtf does not have gene level information for transcript HORVU1Hr1G000080.1 - information has been inferred from tags
WARNING: too many warnings for inferred tags

WARNING: a total of 39734 cases had no gene level information annotated - information has been inferred from tags
WARNING: removing 6046 genes from given annotation that overlap to each other:
list of excluded genes written to: barley_genome_annotationfile.gtf.genes_excluded_gene_overlap
WARNING: removing 2 genes from given annotation that share exact exon coordines:
list of excluded exons written to: barley_genome_annotationfile.gtf.genes_excluded_exon_shared
Augmenting splice graphs.
=========================
Generating splice graph ...
...done.

Loading introns from file ...
Traceback (most recent call last):
  File "spladder.py", line 322, in <module>
    spladder()
  File "spladder.py", line 223, in spladder
    spladder_core(CFG)
  File "/home/aasim/diksha/sam_files_cd_rt_/modules/core/spladdercore.py", line 21, in spladder_core
    genes = gen_graphs(genes, CFG)
  File "/home/aasim/diksha/sam_files_cd_rt_/modules/core/gen_graphs.py", line 83, in gen_graphs
    introns = get_intron_list(genes, CFG)
  File "/home/aasim/diksha/sam_files_cd_rt_/modules/reads.py", line 431, in get_intron_list
    [intron_list_tmp] = add_reads_from_bam(gg, CFG['bam_fnames'], ['intron_list'], CFG['read_filter'], CFG['var_aware'], CFG['primary_only'], CFG['ignore_mismatch_tag'])
  File "/home/aasim/diksha/sam_files_cd_rt_/modules/reads.py", line 155, in add_reads_from_bam
    (introns, spliced_coverage) = get_all_data(blocks[b], filenames, mapped=False, filter=filter, var_aware=var_aware, primary_only=primary_only, no_mm=no_mm)
  File "/home/aasim/diksha/sam_files_cd_rt_/modules/reads.py", line 314, in get_all_data
    (coverage_tmp, introns_tmp) = get_reads(fname, contig_name, block.start, block.stop, strand, filter, mapped, spliced, var_aware, collapse, primary_only, no_mm)
  File "/home/aasim/diksha/sam_files_cd_rt_/modules/reads.py", line 43, in get_reads
    if filter_read(read, filter, spliced, mapped, strand, primary_only, var_aware, no_mm):
  File "/home/aasim/diksha/sam_files_cd_rt_/modules/reads.py", line 472, in filter_read
    return filter['mismatch'] < tags['NM']
KeyError: 'NM'

Where is the problem?

Whenever you see an error like this and are not very familiar with Python, a good first step is to search for "Python KeyError"; that will give you an idea of where to look.

PS: There are a couple of warnings as well. GTF files are boring to work with ;)

Comment by lakhujanivijay 5.9k, 7.5 years ago

score 1 · Answer 1 · 2018-02-12

7.5 years ago

blawney ▴ 10

Note that I have not used this particular software, but at a quick glance it appears that your BAM files may lack the "NM" tag (edit distance to the reference, according to the SAM spec) in the optional alignment fields (the columns that follow the eleven mandatory ones).
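One way to check this is to look at a few alignment records in SAM text form (for example, the first lines printed by `samtools view` on one of the BAM files). The sketch below is only an illustration in plain Python: the helper name `optional_tags` and the two alignment lines are made up for the example, not taken from your data or from SplAdder.

```python
def optional_tags(sam_line):
    """Return the optional fields of one SAM alignment line as a dict."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    # Columns 1-11 are mandatory; every later column is TAG:TYPE:VALUE.
    for field in fields[11:]:
        tag, _type, value = field.split(":", 2)
        tags[tag] = value
    return tags


# Hypothetical records: eleven mandatory columns, then optional tags.
mandatory = ["read1", "0", "chr1H", "100", "255", "4M", "*", "0", "0", "ACGT", "IIII"]
with_nm = "\t".join(mandatory + ["NH:i:1", "NM:i:0"])
without_nm = "\t".join(mandatory + ["NH:i:1", "nM:i:0"])  # lowercase nM is a different tag

for label, line in [("with_nm", with_nm), ("without_nm", without_nm)]:
    print(label, "NM" in optional_tags(line))
```

Tag names are case sensitive, so a record carrying only `nM` would still count as missing `NM`. If the tag really is absent, `samtools calmd` is one tool that can compute the MD and NM tags from the reference sequence; I have not verified that this is enough to make SplAdder run.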

The KeyError exception is raised when you try to access a missing key in a Python dictionary, so it seems likely that the parser (pysam?) is not finding that tag in the alignments.
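As a rough illustration of that failure mode, the snippet below mimics the comparison on the last line of the traceback with ordinary dictionaries. The names `read_filter` and `tags` and their contents are assumptions made for the example; this is not SplAdder's actual code.

```python
# Hypothetical stand-ins for the objects seen in the traceback.
read_filter = {"mismatch": 2}   # maximum number of mismatches allowed
tags = {"NH": 1, "HI": 1}       # tags of one alignment; note there is no "NM"

try:
    # Same shape as the failing line: indexing a dict with a missing key.
    too_many_mismatches = read_filter["mismatch"] < tags["NM"]
except KeyError as err:
    message = "missing tag: {}".format(err)
    print(message)

# dict.get returns None instead of raising, which makes the absence explicit.
nm = tags.get("NM")
if nm is None:
    print("alignment has no NM tag")
```

The point is only that the exception says the key `'NM'` was looked up and not found, which is why the alignments themselves are the first thing to inspect.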