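Run with singularity instead of installing
conda site
The conda package is at v1.2.1, whereas the singularity image on the NIG (National Institute of Genetics) supercomputer is version 1.6.2, so I skipped the installation and used singularity instead. (2022/09/05)
Help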
singularity exec /usr/local/biotools/g/genometools-genometools:1.6.2--py39h95ed972_0 gt cds --help
Usage: /usr/local/bin/gt cds [option ...] [GFF3_file]
Add CDS (coding sequence) features to exon features given in GFF3 file.
-minorflen set the minimum length an open reading frame (ORF) must have to
be added as a CDS feature (measured in amino acids)
default: 64
-startcodon require than an ORF must begin with a start codon
default: no
-finalstopcodon require that the final ORF must end with a stop codon
default: no
-seqfile set the sequence file from which to take the sequences
default: undefined
-encseq set the encoded sequence indexname from which to take the
sequences
default: undefined
-seqfiles set the sequence files from which to extract the features
use '--' to terminate the list of sequence files
-matchdesc search the sequence descriptions from the input files for the
desired sequence IDs (in GFF3), reporting the first match
default: no
-matchdescstart exactly match the sequence descriptions from the input files for
the desired sequence IDs (in GFF3) from the beginning to the
first whitespace
default: no
-usedesc use sequence descriptions to map the sequence IDs (in GFF3) to
actual sequence entries.
If a description contains a sequence range (e.g.,
III:1000001..2000000), the first part is used as sequence ID
('III') and the first range position as offset ('1000001')
default: no
-regionmapping set file containing sequence-region to sequence file mapping
default: undefined
-v be verbose
default: no
-o redirect output to specified file
default: undefined
-gzip write gzip compressed output file
default: no
-bzip2 write bzip2 compressed output file
default: no
-force force writing to output file
default: no
-help display help and exit
-version display version information and exit
File format for option '-regionmapping':
The file supplied to option -regionmapping defines a ``mapping''. A mapping
maps the `sequence-region` entries given in the 'GFF3_file' to a sequence file
containing the corresponding sequence. Mappings can be defined in one of the
following two forms:
mapping = {
chr1 = "hs_ref_chr1.fa.gz",
chr2 = "hs_ref_chr2.fa.gz"
}
or
function mapping(sequence_region)
return "hs_ref_"..sequence_region..".fa.gz"
end
The first form defines a Lua (http://www.lua.org) table named ``mapping''
which maps each sequence region to the corresponding sequence file.
The second one defines a Lua function ``mapping'', which has to return the
sequence file name when it is called with the `sequence_region` as argument.
Report bugs to https://github.com/genometools/genometools/issues.
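The first mapping form in the help text above (a Lua table) can be generated rather than written by hand. The following is a minimal sketch, assuming one compressed FASTA file per sequence region, named <seqid>.fa.gz in a single directory. The function name make_region_mapping and that directory layout are hypothetical and were not used in these notes. Keys are written in Lua's ["key"] form so that seqids that are not valid Lua identifiers still work.

```shell
# Hypothetical helper: print a Lua "mapping" table with one entry per
# <seqid>.fa.gz file found in the given directory.
make_region_mapping() {
    dir=$1
    echo 'mapping = {'
    for f in "$dir"/*.fa.gz; do
        # Skip the unexpanded glob pattern when no file matches.
        [ -e "$f" ] || continue
        id=$(basename "$f" .fa.gz)
        # Lua allows a trailing comma after the last table entry.
        printf '  ["%s"] = "%s",\n' "$id" "$f"
    done
    echo '}'
}
```

Per the help text, the resulting file would be supplied through the -regionmapping option, for example after running make_region_mapping ./genome > mapping.lua. I have not tried this route here.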
Run
singularity exec /usr/local/biotools/g/genometools-genometools:1.6.2--py39h95ed972_0 gt gff3 -sortlines -tidy out.gff3 > out.sorted.gff3
singularity exec /usr/local/biotools/g/genometools-genometools:1.6.2--py39h95ed972_0 gt cds -startcodon -finalstopcodon -seqfile $GENOME -o out.sorted.orf.gff3 out.sorted.gff3
It failed with an error...
/usr/local/bin/gt cds: error: no mapping rule given and no MD5 tags present in the query seqid "chrX" -- no mapping can be defined
Re-run with the -matchdesc option added to the gt cds command
-matchdesc search the sequence descriptions from the input files for the
desired sequence IDs (in GFF3), reporting the first match
default: no
singularity exec /usr/local/biotools/g/genometools-genometools:1.6.2--py39h95ed972_0 gt gff3 -sortlines -tidy out.gff3 > out.sorted.gff3
singularity exec /usr/local/biotools/g/genometools-genometools:1.6.2--py39h95ed972_0 gt cds -matchdesc -startcodon -finalstopcodon -seqfile $GENOME -o out.sorted.orf.gff3 out.sorted.gff3
It worked!
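The error above says that gt cds had no rule for finding the seqid "chrX" in the sequence file. -matchdesc supplies one by searching the FASTA descriptions. Below is a minimal sketch for checking in advance whether every seqid in the GFF3 also appears as a FASTA header ID. The function names (gff3_seqids, fasta_ids, missing_seqids) are hypothetical helpers and not part of GenomeTools. Comparing against the first whitespace-delimited header token is an assumption, and it is stricter than the description search described in the -matchdesc help text.

```shell
# List the unique sequence IDs (column 1) of the feature lines in a GFF3 file.
# Comment/directive lines and any non-feature lines (fewer than 9 columns)
# are skipped.
gff3_seqids() {
    grep -v '^#' "$1" | awk -F '\t' 'NF >= 9 { print $1 }' | sort -u
}

# List the first whitespace-delimited token of every FASTA header line.
fasta_ids() {
    grep '^>' "$1" | sed 's/^>//' | awk '{ print $1 }' | sort -u
}

# Print the GFF3 seqids that have no exactly matching FASTA header ID.
# Prints nothing when every seqid is found.
missing_seqids() {
    tmp=$(mktemp)
    fasta_ids "$2" > "$tmp"
    # -F: fixed strings, -x: whole-line match, -v: keep non-matching seqids
    gff3_seqids "$1" | grep -v -x -F -f "$tmp"
    rm -f "$tmp"
}
```

With the files from these notes, the call would be missing_seqids out.sorted.gff3 "$GENOME". Any seqid it prints is one whose name does not match a header ID exactly.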