
Reference: How is my reference supposed to be formatted or formulated?

marabouboy edited this page Dec 13, 2021 · 3 revisions

INPUT

Input Reference: GTF

This section is dedicated to explaining the reference, GTF.

I will show which types of reference files are accepted, with examples of how the GTF files must be formatted.

1. GFF3/.gff3 format:

  • The most common gene annotation format is GFF3. FLAME, however, requires a specifically formatted GTF file.
  • My recommendation is to download the annotation file for your organism through NCBI.
    [Figure: How to download the gff3-file for the conversion into GTF file]
2. Conversion into GTF/.gtf format:

  • While you could use other open-source or proprietary software to convert the GFF3 file into a GTF file, FLAME is designed with the Galaxy web-based platform in mind. Support for the standardized format is planned to be integrated into FLAME.
  • My recommendation is to upload the GFF3 file into Galaxy.
    [Figure: How to upload data onto the Galaxy web-based platform]
  • After uploading the GFF3 file, convert it with the gffread function, using the important parameters highlighted in red:
    [Figure: How to convert GFF3 into GTF using the Galaxy web-based platform]
  • Ideally, your GTF annotation reference file should be formatted like this:
    [Figure: How the ideal GTF file for FLAME is formatted]
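
To sanity-check a converted file before running FLAME, a small script along the following lines may help. It is only a sketch and not FLAME's own parser: it assumes the generic GTF layout of nine tab-separated columns, and the names `GTF_COLUMNS`, `parse_gtf_line` and `exon_records`, as well as the sample records, are made up for illustration.

```python
# Generic GTF column names (general GTF layout, not taken from FLAME's code).
GTF_COLUMNS = ("seqname", "source", "feature", "start", "end",
               "score", "strand", "frame", "attribute")


def parse_gtf_line(line):
    """Split one GTF record into a dict; return None for comments and blank lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != len(GTF_COLUMNS):
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    record = dict(zip(GTF_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    if record["start"] > record["end"]:
        raise ValueError("start position is greater than end position")
    return record


def exon_records(lines):
    """Yield only the records whose third column is 'exon' (CDS lines are skipped)."""
    for line in lines:
        record = parse_gtf_line(line)
        if record is not None and record["feature"] == "exon":
            yield record


# Hypothetical sample data, only to show the expected shape of each line.
sample = [
    "# converted annotation\n",
    'chr1\texample\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    'chr1\texample\tCDS\t120\t180\t.\t+\t0\tgene_id "G1"; transcript_id "T1";\n',
]
for record in exon_records(sample):
    print(record["seqname"], record["start"], record["end"])
```

To check a real file, pass an open file handle instead of the `sample` list; a malformed line then raises a `ValueError`.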
3. Adding additional exons to your reference:

  • Once you have run FLAME once and it has suggested one or several novel exons, you can add them to the GTF annotation reference file manually: add one line per novel exon in which every field matches the existing exon entries except the exon start and end positions. Remember to specify "exon" and not "CDS" in the third column of the added line, as CDS entries are automatically filtered out in the FLAME workflow.
  • Ideally, your additional exon follows this format, with the novel exons highlighted in red:
    [Figure: Example of how additional exons are added to the GTF-file]
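
The manual edit described above can also be sketched in Python. This is a minimal sketch, not part of FLAME: the helper name `make_novel_exon_line`, the template record and the coordinates are hypothetical, and it assumes the generic nine-column, tab-separated GTF layout.

```python
def make_novel_exon_line(template_line, start, end):
    """Copy an existing GTF record, changing only the start and end positions.

    The third column is set to "exon" so the new line is not filtered out as CDS.
    """
    fields = template_line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    if start > end:
        raise ValueError("start position is greater than end position")
    fields[2] = "exon"      # column 3: feature type
    fields[3] = str(start)  # column 4: exon start
    fields[4] = str(end)    # column 5: exon end
    return "\t".join(fields) + "\n"


# Hypothetical existing exon line and novel exon coordinates.
template = 'chr1\texample\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";'
print(make_novel_exon_line(template, 350, 420), end="")
```

The returned line can then be added to the GTF file alongside the other exon entries of the same gene.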