What is Zatsu method?
Zatsu (雑 in Japanese) is translated into the following English words:
- rough
- crude
- sloppy
- messy
The Zatsu method is a way to achieve an objective (writing a CWL definition, in this article) while knowing and thinking about the details as little as possible.
What is this article?
If you are a serious person, this article is not for you. Go to the User Guide and the specification instead; they will help you write graceful CWL.
This article is for people who complain: "I do not want to care about the details of the CWL syntax! I just want to put my tool into CWL workflows!"
It describes how to write a tool definition in Common Workflow Language (CWL) without knowing the details of the specification.
This is an English version of 雑に始める CWL!, which was originally written in Japanese.
Notice
This article does not describe how to handle tool options gracefully. For that, go to the User Guide.
A tool to be CWLized in this article
The following command prints the first n lines of a file to the standard output.
n is specified with the -n option.
$ head -n5 foobar.txt
- I want to give 5 in -n5 and foobar.txt as input parameters of the CWL definition.
- I want to capture the standard output of the head command as an output object.
- I do not care about other options!
Running example of head command (without CWL)
$ man head > manhead.txt
$ head -n5 manhead.txt
HEAD(1) BSD General Commands Manual HEAD(1)
NAME
head -- display first lines of a file
Let's get cooking!
- First, save the following as head.cwl.
cwlVersion: v1.0
class: CommandLineTool
baseCommand: []
arguments: []
inputs: []
outputs: []
- Add the first part of the command to baseCommand and the rest to arguments (comma separated).
cwlVersion: v1.0
class: CommandLineTool
baseCommand: [head]
arguments: [-n5, foobar.txt]
inputs: []
outputs: []
- Add input parameters to the inputs field. In this case, introduce source for the input file and nlines for the number of lines to print.
cwlVersion: v1.0
class: CommandLineTool
baseCommand: [head]
arguments: [-n5, foobar.txt]
inputs:
  - id: source
  - id: nlines
outputs: []
- Add a type for each input parameter. Specify the File type for source because it takes a file, and int for nlines because it takes an integer.
cwlVersion: v1.0
class: CommandLineTool
baseCommand: [head]
arguments: [-n5, foobar.txt]
inputs:
  - id: source
    type: File
  - id: nlines
    type: int
outputs: []
- Replace 5 and foobar.txt with the values of nlines and source. This can be done with the following notation: $(inputs.<id>), where <id> is the value of the parameter's id field.
cwlVersion: v1.0
class: CommandLineTool
baseCommand: [head]
arguments: [-n$(inputs.nlines), $(inputs.source)]
inputs:
  - id: source
    type: File
  - id: nlines
    type: int
outputs: []
- Add an output parameter. In this case I call it out.
cwlVersion: v1.0
class: CommandLineTool
baseCommand: [head]
arguments: [-n$(inputs.nlines), $(inputs.source)]
inputs:
  - id: source
    type: File
  - id: nlines
    type: int
outputs:
  - id: out
- As with the input parameters, add a type for each output parameter. You can use the stdout type to capture the standard output.
cwlVersion: v1.0
class: CommandLineTool
baseCommand: [head]
arguments: [-n$(inputs.nlines), $(inputs.source)]
inputs:
  - id: source
    type: File
  - id: nlines
    type: int
outputs:
  - id: out
    type: stdout
- Well done!
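If typing id and type for every parameter still feels like too much detail, CWL also accepts a shorter map form for inputs and outputs. The following is a sketch of the same head.cwl in that form. It should behave the same as the list form above, but check it with your workflow engine before relying on it.

```yaml
cwlVersion: v1.0
class: CommandLineTool
baseCommand: [head]
arguments: [-n$(inputs.nlines), $(inputs.source)]
# Map form: each key is the parameter id, each value is its type.
inputs:
  source: File
  nlines: int
outputs:
  out: stdout
```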
Does it work?
Let's run it with source=manhead.txt and nlines=5 as input parameters.
$ man head > manhead.txt
$ cwltool head.cwl --source manhead.txt --nlines 5
...
[job head.cwl] completed success
{
    "out": {
        "location": "file:///Users/tom-tan/147d6ad946430f35c2aafbff6c5604d43b30aa14",
        "basename": "147d6ad946430f35c2aafbff6c5604d43b30aa14",
        "class": "File",
        "checksum": "sha1$30c2a20de147871db76e237705ac274277504428",
        "size": 145,
        "path": "/Users/tom-tan/147d6ad946430f35c2aafbff6c5604d43b30aa14"
    }
}
Final process status is success
$ cat 147d6ad946430f35c2aafbff6c5604d43b30aa14
HEAD(1) BSD General Commands Manual HEAD(1)
NAME
head -- display first lines of a file
- You can see the output object after [job head.cwl] completed success.
  - You can see the file object named 147d6ad9464... for the out parameter.
- You can verify with the cat command that it is the same output as in "Running example of head command (without CWL)".
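Instead of command-line flags, the input parameters can also be written in a YAML input object file. A minimal sketch, assuming a hypothetical file name head-job.yml placed next to manhead.txt:

```yaml
# head-job.yml (hypothetical file name, not part of the original example)
source:
  class: File
  path: manhead.txt   # resolved relative to this job file
nlines: 5
```

It would then be run as cwltool head.cwl head-job.yml, and should produce the same kind of output object as the run above.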
Next steps!
How terrible the output file name is...
You can specify the file name with the stdout field of the tool definition.
...
outputs:
  - id: out
    type: stdout
stdout: output.txt # Add this line
I want to name the output file manhead-head.txt when the source file name is manhead.txt.
Use the $() notation. You can get manhead from manhead.txt by using the nameroot field of the File object.
...
outputs:
  - id: out
    type: stdout
stdout: $(inputs.source.nameroot)-head.txt # Add this line
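Besides nameroot, a File object also carries basename and nameext fields. As a sketch, the following variant keeps whatever extension the source file has instead of hard-coding .txt (the values in the comments assume source is manhead.txt):

```yaml
# For source = manhead.txt:
#   inputs.source.basename is manhead.txt
#   inputs.source.nameroot is manhead
#   inputs.source.nameext  is .txt (the leading dot is included)
stdout: $(inputs.source.nameroot)-head$(inputs.source.nameext)
```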
Can I use Docker?
Yes, use DockerRequirement. You can specify the Docker image with the dockerPull field.
...
outputs:
  - id: out
    type: stdout
stdout: $(inputs.source.nameroot)-head.txt
# Add the following
requirements:
  - class: DockerRequirement
    dockerPull: debian:latest
You do not have to specify container details such as volume mounts.
CWL workflow engines handle them for you.
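Put together, the snippets in this section give a complete head.cwl along the following lines. This is an assembled sketch of the pieces shown above, not a new recipe. debian:latest is simply the image used in the example, and any image that provides head should do.

```yaml
cwlVersion: v1.0
class: CommandLineTool
baseCommand: [head]
arguments: [-n$(inputs.nlines), $(inputs.source)]
inputs:
  - id: source
    type: File
  - id: nlines
    type: int
outputs:
  - id: out
    type: stdout
# Name the captured standard output after the source file.
stdout: $(inputs.source.nameroot)-head.txt
# Run the command inside a container.
requirements:
  - class: DockerRequirement
    dockerPull: debian:latest
```

It is run the same way as before: cwltool head.cwl --source manhead.txt --nlines 5.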
Conclusion
Now you can write tool definitions in CWL!
Do it!