
Translating transcript-relative loci to genome-relative loci

Uauy-Lab/transcript_to_genomic_loci

Code for translating transcript-relative loci to genome-relative loci

This repository contains the code used to translate transcript-relative loci (including the output of TargetFinder, https://github.com/carringtonlab/TargetFinder) into genome-relative loci. If a transcript-relative locus spans one or more introns, the output will include multiple genome-relative regions for that locus.

Script 01: Extracting exon information for loci transcripts (shell)

  • Extracts the exon structure of loci transcripts from a genome annotation GFF3 file
  • Splits the exon structure into separate GFF3 files for transcripts on the positive and negative strands of the genome
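
The two steps above can be sketched in a few lines of shell and awk. This is an illustrative sketch only, not the repository's script 01: the function name `extract_exons`, the output file names and the demo data are hypothetical, and it assumes that the loci file has the transcript ID in column 1 and that exon rows in the annotation carry a `Parent=<transcript ID>` attribute in column 9.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the repository's script 01.
# Usage: extract_exons LOCI_GFF3 ANNOTATION_GFF3 OUT_PREFIX
# Assumes column 1 of the loci file holds transcript IDs and that exon
# rows in the annotation name their transcript via a Parent= attribute.
extract_exons() {
  local loci_file=$1 annotation_file=$2 prefix=$3
  awk -F'\t' -v pos="${prefix}.positive.gff3" -v neg="${prefix}.negative.gff3" '
    BEGIN { printf "" > pos; printf "" > neg }        # create both outputs, even if empty
    /^#/ { next }                                     # skip GFF3 comment/pragma lines
    FILENAME == ARGV[1] { wanted[$1] = 1; next }      # loci file: remember transcript IDs
    $3 == "exon" {
      hit = 0
      n = split($9, attrs, ";")
      for (i = 1; i <= n; i++) {
        if (attrs[i] ~ /^Parent=/) {
          # Parent may list several comma-separated transcript IDs
          m = split(substr(attrs[i], 8), parents, ",")
          for (j = 1; j <= m; j++) if (parents[j] in wanted) hit = 1
        }
      }
      if (hit && $7 == "+") print > pos               # column 7 is the strand
      else if (hit && $7 == "-") print > neg
    }
  ' "$loci_file" "$annotation_file"
}

# Tiny made-up demo: tx1 (+) and tx3 (-) have loci, tx2 does not.
demo_dir=$(mktemp -d)
row='%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n'
printf "$row" \
  tx1 TargetFinder locus 15 25 . + . ID=locusA \
  tx3 TargetFinder locus 3 9 . + . ID=locusB > "$demo_dir/loci.gff3"
printf "$row" \
  chr1 demo exon 100 119 . + . 'ID=e1;Parent=tx1' \
  chr1 demo exon 200 229 . + . 'ID=e2;Parent=tx1' \
  chr2 demo exon 50 80 . - . 'ID=e3;Parent=tx2' \
  chr3 demo exon 10 40 . - . 'ID=e4;Parent=tx3' > "$demo_dir/annotation.gff3"

extract_exons "$demo_dir/loci.gff3" "$demo_dir/annotation.gff3" "$demo_dir/exons"
```

Matching on the `Parent` attribute rather than on a plain text search avoids picking up transcripts whose IDs merely share a prefix with a requested ID.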

Requirements:

  • A UNIX shell such as BASH

Input files:

  • A GFF3 file of transcript-relative loci you want to translate (loci_file)
  • A genome annotation GFF3 file (annotation_file)

Usage:

  • User input is required on lines 4 (path/to/working/directory), 9 (loci_file) and 66 (annotation_file)
  • Ensure the required input files are in the working directory. All output files will also be written to this directory.

Script 02: Translating the loci (R)

  • Generates a .txt file of genome-relative loci for the loci in loci_file
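
The core of the translation can be illustrated with a short shell and awk sketch. It is not the repository's R implementation: the function name `translate_locus` is hypothetical, and it assumes 1-based inclusive coordinates, with transcript position 1 at the lowest genomic coordinate on the positive strand and at the highest genomic coordinate on the negative strand. Exons are walked in transcript order, and every exon that overlaps the locus contributes one genome-relative region.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of transcript-to-genome translation for one locus.
# Usage: translate_locus STRAND T_START T_END
# Reads the transcript's exons on stdin as "genomic_start<TAB>genomic_end".
# Assumes 1-based inclusive coordinates throughout.
translate_locus() {
  local strand=$1 t_start=$2 t_end=$3 order=-n
  if [ "$strand" = "-" ]; then order=-nr; fi   # negative strand: walk exons from high to low
  sort "$order" -k1,1 | awk -v strand="$strand" -v ts="$t_start" -v te="$t_end" '
    {
      gs = $1; ge = $2; len = ge - gs + 1
      # This exon covers transcript positions offset+1 .. offset+len
      a = (ts > offset + 1)   ? ts : offset + 1
      b = (te < offset + len) ? te : offset + len
      if (a <= b) {                              # the locus overlaps this exon
        if (strand == "+") { s = gs + (a - offset - 1); e = gs + (b - offset - 1) }
        else               { s = ge - (b - offset - 1); e = ge - (a - offset - 1) }
        print s "\t" e                           # genomic start, genomic end
      }
      offset += len
    }'
}

# Made-up transcript with exons 100-119 (20 bp) and 200-229 (30 bp).
plus_regions=$(printf '100\t119\n200\t229\n' | translate_locus + 15 25)
minus_regions=$(printf '100\t119\n200\t229\n' | translate_locus - 25 35)
printf 'plus strand:\n%s\nminus strand:\n%s\n' "$plus_regions" "$minus_regions"
```

In this demo, transcript positions 15-25 on the positive strand map to 114-119 and 200-204, and transcript positions 25-35 on the negative strand map to 200-205 and 115-119. In both cases the locus crosses the intron, so two regions are reported and their lengths add up to the 11 bases of the locus.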

Requirements:

  • R with tidyr, plyr and dplyr packages installed

Input files: All required input files are generated by script 01.

  • annotation_file.exons.targets.positive.2.3.4.gff3
  • annotation_file.exons.targets.negative.2.3.4.gff3
  • loci_file.2.3.4.gff3

Usage:

  • User input is required on lines 13 (path/to/working/directory), 20 (annotation_file), 23 (annotation_file) and 25 (annotation_file)
  • The output file name can be changed on line 370
  • The working directory should be the same as that used for script 01. The output file will also be written to this directory.

Output file:

The output of this pipeline is a .txt file ("genomic_loci.txt") with 5 or more columns:

  1. User-provided locus ID
  2. Chromosome
  3. Genome strand
  4. Region 1 end
  5. Region 1 start
  6. Onwards (only if the locus spans multiple exons): end and start positions of each additional region
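
Because the number of columns varies from locus to locus, it can be convenient to reshape the output to one region per line before further analysis. The sketch below is an assumption-laden illustration rather than part of the pipeline: the function name `wide_to_long` is hypothetical, and it assumes the file is tab-delimited, has no header row, lists each region as an end column followed by a start column (the order given above), and may pad unused columns with empty fields or `NA`.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: reshape the wide output to one region per line.
# Usage: wide_to_long FILE
# Prints: locus ID, chromosome, strand, start, end, region number.
# Assumes tab-delimited input without a header, with (end, start) pairs
# from column 4 onwards and optional empty/NA padding.
wide_to_long() {
  awk -F'\t' 'BEGIN { OFS = "\t" }
    NF >= 5 {
      for (i = 4; i + 1 <= NF; i += 2) {
        if ($i == "" || $i == "NA") continue    # skip padding columns
        region = (i - 2) / 2                    # columns 4-5 are region 1, 6-7 region 2, ...
        print $1, $2, $3, $(i + 1), $i, region
      }
    }' "$1"
}

# Made-up example rows: locusA spans two exons, locusB only one.
demo_out=$(mktemp)
printf 'locusA\tchr1\t+\t119\t114\t204\t200\n' > "$demo_out"
printf 'locusB\tchr3\t-\t38\t32\tNA\tNA\n' >> "$demo_out"
long_rows=$(wide_to_long "$demo_out")
printf '%s\n' "$long_rows"
```

If the real file has a header line or uses a different delimiter, the field separator and the row filter would need to be adjusted accordingly.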
