NCI-RBL/RBL_RBL3

RBL3 Pipeline Overview

Workflow

  • Adapters are trimmed from FASTQ files with Porechop
  • Alignment is performed with minimap2
  • Optional SAM clean-up can be enabled in the config: when the clean_transcript flag is turned on, TranscriptClean corrects mismatches, microindels, and noncanonical splice junctions in reads that have been mapped to the genome
  • QC is performed on FASTQ files with FastQC and on BAM files with SAMtools
  • Two workflows are completed for transcript identification and annotation:

    • The TALON workflow is performed, including read priming, transcript abundance counts, transcript filtering, and transcript summaries before and after filtering. See output summary below.
    • The FLAIR workflow is performed. See output summary below.

  • DEG analysis is completed via FLAIR count matrices, based on user-provided comparison groups
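
To make the optional branches in the workflow above concrete, here is a minimal Python sketch of how the list of steps could be assembled from a run configuration. It is an illustration only, not the pipeline's actual code: `clean_transcript` is the flag named above, while `RunConfig`, `deg_groups`, `plan_steps`, and the step labels are hypothetical names chosen for this example. The order follows the bullet list and may differ from the real execution order.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class RunConfig:
    # `clean_transcript` is the flag named in this README.
    clean_transcript: bool = False
    # Hypothetical field: user-provided comparison groups for DEG analysis.
    deg_groups: List[Tuple[str, str]] = field(default_factory=list)


def plan_steps(cfg: RunConfig) -> List[str]:
    """Return illustrative step labels in the order listed in the README."""
    steps = ["trim_adapters_porechop", "align_minimap2"]
    if cfg.clean_transcript:
        # SAM correction only runs when the flag is turned on.
        steps.append("correct_sam_transcriptclean")
    steps += ["qc_fastq_fastqc", "qc_bam_samtools", "talon_workflow", "flair_workflow"]
    if cfg.deg_groups:
        # DEG analysis depends on comparison groups being supplied.
        steps.append("deg_analysis")
    return steps


if __name__ == "__main__":
    cfg = RunConfig(clean_transcript=True, deg_groups=[("treated", "control")])
    for step in plan_steps(cfg):
        print(step)
```

With the defaults (`clean_transcript` off and no comparison groups), the sketch skips both the TranscriptClean step and the DEG step, which mirrors the optional parts of the workflow described above.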

Output Summary

  • Intermediate files are defined in Wiki
  • QC_report provides summaries of the sequencing alignment, by read length
  • MultiQC_report provides summaries of FASTQ and BAM file quality
  • Summary_report provides an overview of the parameters used in the workflow run, summaries of both the TALON and FLAIR workflows, comparisons between their outputs, and a DEG summary, if included.

Getting Started

Review the wiki page for a tutorial and general information on running the pipeline.