bioinformatics
cwl
CWLDay 19

Trying out CWL User Guide 19: Custom Types

CWL User Guide 19: Custom Types

Common Workflow Language User Guide: Custom Types

The User Guide does not seem to have a section 18.

This section describes how to define your own type and use it in CWL.

Quoting the Key Points:

  • You can create your own custom types to load into descriptions.
  • These custom types allow the user to configure the behaviour of a tool without tinkering directly with the tool description.
  • Custom types are described in separate YAML files and imported as needed.

Main keywords that appear in this section

  • inputs:applications
  • requirements:SchemaDefRequirement
  • ResourceRequirement

The CWL file, custom-types.cwl

custom-types.cwl
cwlVersion: v1.0
class: CommandLineTool

label: "InterProScan: protein sequence classifier"

doc: |
      Version 5.21-60 can be downloaded here:
      https://github.com/ebi-pf-team/interproscan/wiki/HowToDownload

      Documentation on how to run InterProScan 5 can be found here:
      https://github.com/ebi-pf-team/interproscan/wiki/HowToRun

requirements:
  ResourceRequirement:
    ramMin: 10240
    coresMin: 3
  SchemaDefRequirement:
    types:
      - $import: InterProScan-apps.yml

hints:
  SoftwareRequirement:
    packages:
      interproscan:
        specs: [ "https://identifiers.org/rrid/RRID:SCR_005829" ]
        version: [ "5.21-60" ]

inputs:
  proteinFile:
    type: File
    inputBinding:
      prefix: --input
  applications:
    type: InterProScan-apps.yml#apps[]?
    inputBinding:
      itemSeparator: ','
      prefix: --applications

baseCommand: interproscan.sh

arguments:
 - valueFrom: $(inputs.proteinFile.nameroot).i5_annotations
   prefix: --outfile
 - valueFrom: TSV
   prefix: --formats
 - --disable-precalc
 - --goterms
 - --pathways
 - valueFrom: $(runtime.tmpdir)
   prefix: --tempdir


outputs:
  i5Annotations:
    type: File
    format: iana:text/tab-separated-values
    outputBinding:
      glob: $(inputs.proteinFile.nameroot).i5_annotations

$namespaces:
 iana: https://www.iana.org/assignments/media-types/
 s: http://schema.org/
$schemas:
 - https://schema.org/docs/schema_org_rdfa.html

s:license: "https://www.apache.org/licenses/LICENSE-2.0"
s:copyrightHolder: "EMBL - European Bioinformatics Institute"
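
The `applications` input above uses the shorthand `InterProScan-apps.yml#apps[]?`. As a rough sketch of what that shorthand stands for, assuming the usual CWL v1.0 type shorthand rules (`[]` for an array, `?` for optional), the same input could be written out in long form as follows. This fragment is for illustration only and is not part of the User Guide's file.

```yaml
# Long-form equivalent of `type: InterProScan-apps.yml#apps[]?`
# (illustrative fragment; it stands in for the `applications` entry under `inputs`)
inputs:
  applications:
    type:
      - "null"                              # `?`  : the input may be omitted
      - type: array                         # `[]` : a list of values
        items: InterProScan-apps.yml#apps   # each value must be a symbol of the imported enum
    inputBinding:
      itemSeparator: ','
      prefix: --applications
```

Because the input is optional, a parameter file may leave `applications` out entirely, which is why the command line in the run log has no `--applications` argument.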

Parameter file

custom-types.yml
proteinFile:
    class: File
    path: test_proteins.fasta

Required file

InterProScan-apps.yml
type: enum
name: apps
symbols:
 - TIGRFAM
 - SFLD
 - SUPERFAMILY
 - Gene3D
 - Hamap
 - Coils
 - ProSiteProfiles
 - SMART
 - CDD
 - PRINTS
 - PIRSF
 - ProSitePatterns
 - Pfam
 - ProDom
 - MobiDBLite
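
Since `applications` is typed as an optional array of this enum, a parameter file can select analyses by listing symbols from the file above. The following is a minimal sketch; the file name `custom-types-apps.yml` and the choice of `Pfam` and `SMART` are assumptions made for illustration and do not come from the User Guide.

```yaml
# custom-types-apps.yml (hypothetical file name)
proteinFile:
  class: File
  path: test_proteins.fasta
# Each entry must be one of the symbols listed in InterProScan-apps.yml
applications:
  - Pfam
  - SMART
```

With `itemSeparator: ','` in the input binding, the two values should be joined into one argument, so the generated command is expected to include `--applications Pfam,SMART`. A value outside the enum's symbol list should be rejected when the inputs are validated, before the tool itself is run.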

Obtaining test_proteins.fasta

Run

How to run

Result

As things stand, the run fails because interproscan.sh cannot be found.
I plan to ask about this.

$ cwltool custom-types.cwl custom-types.yml
/usr/local/bin/cwltool 1.0.20171107133715
Resolved 'custom-types.cwl' to 'file:///home/vagrant/cwl_user_guide_work/19-custom-types/custom-types.cwl'
custom-types.cwl:1:1: unrecognized extension field `http://schema.org/license`.  Did you include a $schemas section?
custom-types.cwl:1:1: unrecognized extension field `http://schema.org/copyrightHolder`.  Did you include a $schemas section?
[job custom-types.cwl] /tmp/tmpNxr2ff$ interproscan.sh \
    --outfile \
    test_proteins.i5_annotations \
    --formats \
    TSV \
    --disable-precalc \
    --goterms \
    --pathways \
    --tempdir \
    /tmp/tmph91Fgl \
    --input \
    /tmp/tmpJFLLEi/stg9a4b13fb-77ed-4d7c-991b-401d4dc78a1b/test_proteins.fasta
'interproscan.sh' not found
[job custom-types.cwl] completed permanentFail
{}
Final process status is permanentFail

Files used this time

cwl_user_guide_work/19-custom-types at master · manabuishii/cwl_user_guide_work