genometools gt: determining the ORFs of transcripts

Run with singularity instead of installing

The conda site
The conda package is at v1.2.1, whereas the singularity image on the NIG (National Institute of Genetics) supercomputer is at 1.6.2, so I skipped installing and used singularity instead. (2022/09/05)
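Every command in this post goes through the same long image path, so a small wrapper can keep the command lines readable. The following is only a sketch: `GT_IMAGE` and `gt_run` are hypothetical names introduced here, and the image path is the one used in this post, which may differ on other systems.

```shell
# Path to the genometools singularity image used in this post
# (adjust to wherever the image lives on your system).
GT_IMAGE=/usr/local/biotools/g/genometools-genometools:1.6.2--py39h95ed972_0

# gt_run is a hypothetical wrapper: it forwards all of its arguments
# to the gt binary inside the image.
gt_run() {
    singularity exec "$GT_IMAGE" gt "$@"
}
```

With this in place, `gt_run cds --help` would be equivalent to the full `singularity exec ... gt cds --help` command.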

Help

singularity exec /usr/local/biotools/g/genometools-genometools:1.6.2--py39h95ed972_0 gt cds --help
Usage: /usr/local/bin/gt cds [option ...] [GFF3_file]
Add CDS (coding sequence) features to exon features given in GFF3 file.

-minorflen      set the minimum length an open reading frame (ORF) must have to
              be added as a CDS feature (measured in amino acids)
              default: 64
-startcodon     require than an ORF must begin with a start codon
              default: no
-finalstopcodon require that the final ORF must end with a stop codon
              default: no
-seqfile        set the sequence file from which to take the sequences
              default: undefined
-encseq         set the encoded sequence indexname from which to take the
              sequences
              default: undefined
-seqfiles       set the sequence files from which to extract the features
              use '--' to terminate the list of sequence files 
-matchdesc      search the sequence descriptions from the input files for the
              desired sequence IDs (in GFF3), reporting the first match
              default: no
-matchdescstart exactly match the sequence descriptions from the input files for
              the desired sequence IDs (in GFF3) from the beginning to the
              first whitespace
              default: no
-usedesc        use sequence descriptions to map the sequence IDs (in GFF3) to
              actual sequence entries.
              If a description contains a sequence range (e.g.,
              III:1000001..2000000), the first  part is used as sequence ID
              ('III') and the first range position as offset ('1000001')
              default: no
-regionmapping  set file containing sequence-region to sequence file mapping
              default: undefined
-v              be verbose
              default: no
-o              redirect output to specified file
              default: undefined
-gzip           write gzip compressed output file
              default: no
-bzip2          write bzip2 compressed output file
              default: no
-force          force writing to output file
              default: no
-help           display help and exit
-version        display version information and exit

File format for option '-regionmapping':

The file supplied to option -regionmapping defines a ``mapping''.  A mapping
maps the `sequence-region` entries given in the 'GFF3_file' to a sequence file
containing the corresponding sequence. Mappings can be defined in one of the
following two forms:

  mapping = {
    chr1  = "hs_ref_chr1.fa.gz",
    chr2  = "hs_ref_chr2.fa.gz"
  }

or

  function mapping(sequence_region)
    return "hs_ref_"..sequence_region..".fa.gz"
  end

The first form defines a Lua (http://www.lua.org) table named ``mapping''
which maps each sequence region to the corresponding sequence file.
The second one defines a Lua function ``mapping'', which has to return the
sequence file name when it is called with the `sequence_region` as argument.

Report bugs to https://github.com/genometools/genometools/issues.

Run

singularity exec /usr/local/biotools/g/genometools-genometools:1.6.2--py39h95ed972_0 gt gff3 -sortlines -tidy out.gff3 > out.sorted.gff3
singularity exec /usr/local/biotools/g/genometools-genometools:1.6.2--py39h95ed972_0 gt cds -startcodon -finalstopcodon -seqfile $GENOME -o out.sorted.orf.gff3 out.sorted.gff3

It failed with an error...

/usr/local/bin/gt cds: error: no mapping rule given and no MD5 tags present in the query seqid "chrX" -- no mapping can be defined
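The error message indicates that gt cds was given no rule for mapping the GFF3 seqid "chrX" to a sequence in the file passed with -seqfile. Before choosing a mapping option, it can help to check whether the seqids in the GFF3 file actually appear in the FASTA header lines. The sketch below assumes that `$GENOME` is an uncompressed FASTA file. The function names are hypothetical helpers, not part of genometools, and the plain substring match is only a rough approximation of how gt searches the descriptions.

```shell
# Hypothetical helpers (not part of genometools) for comparing
# GFF3 seqids with FASTA header lines.

# Print the unique sequence IDs found in column 1 of a GFF3 file.
gff3_seqids() {
    awk -F '\t' '!/^#/ && NF >= 9 { print $1 }' "$1" | sort -u
}

# Print each GFF3 seqid that occurs in no FASTA header line.
# Plain substring matching: "chr1" would also match a "chr10" header.
seqids_missing_from_fasta() {
    gff3_seqids "$1" | while IFS= read -r id; do
        if ! grep '^>' "$2" | grep -F -- "$id" > /dev/null; then
            printf '%s\n' "$id"
        fi
    done
}
```

For example, `seqids_missing_from_fasta out.sorted.gff3 "$GENOME"` prints nothing when every seqid is found in some header, which suggests that a description-based option such as -matchdesc has something to match against.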

Re-run with the -matchdesc option added to the gt cds command

-matchdesc      search the sequence descriptions from the input files for the
              desired sequence IDs (in GFF3), reporting the first match
              default: no
singularity exec /usr/local/biotools/g/genometools-genometools:1.6.2--py39h95ed972_0 gt gff3 -sortlines -tidy out.gff3 > out.sorted.gff3
singularity exec /usr/local/biotools/g/genometools-genometools:1.6.2--py39h95ed972_0 gt cds -matchdesc -startcodon -finalstopcodon -seqfile $GENOME -o out.sorted.orf.gff3 out.sorted.gff3

It worked!
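As a quick sanity check on the result, one can count how many CDS features were added to the output file. This is a minimal sketch: `count_gff3_features` is a hypothetical helper, and it assumes the output is an uncompressed, tab-separated GFF3 file with the feature type in column 3.

```shell
# Hypothetical helper: count the features of a given type (column 3)
# in a GFF3 file, skipping comment and directive lines.
count_gff3_features() {
    awk -F '\t' -v type="$2" '!/^#/ && $3 == type { n++ } END { print n + 0 }' "$1"
}
```

For the run above, `count_gff3_features out.sorted.orf.gff3 CDS` would print the number of CDS lines. A count of 0 would suggest that no ORF passed the -minorflen, -startcodon, and -finalstopcodon filters.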
