What is the Zatsu method?
Zatsu (雑 in Japanese) translates into the following English words:
- rough
- crude
- sloppy
- messy
The Zatsu method is a way to achieve an objective (writing a CWL tool definition, in this article) while knowing and considering as few details as possible.
What is this article?
If you are a serious person, this article is not for you. You should go to the User Guide and the specification, which will help you write graceful CWL.
This article is for people who complain: "I do not want to care about the details of the CWL syntax! I just want to put my tool into CWL workflows!"
This article describes how to write a tool definition in Common Workflow Language (CWL) without knowing the details of the specification.
This is an English version of 雑に始める CWL!, an article originally written in Japanese.
Notice
This article does not describe how to handle tool options gracefully. For that, go to the User Guide.
A tool to be CWLized in this article
The following command prints the first n lines of a file to the standard output. n is specified with the -n option.
$ head -n5 foobar.txt
- I want to give 5 in -n5 and foobar.txt as input parameters of the CWL definition.
- I want to capture the standard output of the head command as an output object.
- I do not care about other options!
Running example of the head command (without CWL)
$ man head > manhead.txt
$ head -n5 manhead.txt
HEAD(1) BSD General Commands Manual HEAD(1)
NAME
head -- display first lines of a file
Let's get cooking!
- First, copy the following as head.cwl.
cwlVersion: v1.0
class: CommandLineTool
baseCommand: []
arguments: []
inputs: []
outputs: []
- Add the first part of the command to baseCommand and the rest to arguments (comma separated).
cwlVersion: v1.0
class: CommandLineTool
baseCommand: [head]
arguments: [-n5, foobar.txt]
inputs: []
outputs: []
- Add input parameters to the inputs field. In this case, you introduce source for the input file and nlines to specify the first n lines.
cwlVersion: v1.0
class: CommandLineTool
baseCommand: [head]
arguments: [-n5, foobar.txt]
inputs:
  - id: source
  - id: nlines
outputs: []
- Add a type for each input parameter. Specify the File type for source because it takes a file, and int for nlines because it takes an integer.
cwlVersion: v1.0
class: CommandLineTool
baseCommand: [head]
arguments: [-n5, foobar.txt]
inputs:
  - id: source
    type: File
  - id: nlines
    type: int
outputs: []
- Replace 5 and foobar.txt with the values of nlines and source. This can be done with the following notation: $(inputs.ID), where ID is the id field of the input parameter.
cwlVersion: v1.0
class: CommandLineTool
baseCommand: [head]
arguments: [-n$(inputs.nlines), $(inputs.source)]
inputs:
  - id: source
    type: File
  - id: nlines
    type: int
outputs: []
- Add an output parameter. In this case I call it out.
cwlVersion: v1.0
class: CommandLineTool
baseCommand: [head]
arguments: [-n$(inputs.nlines), $(inputs.source)]
inputs:
  - id: source
    type: File
  - id: nlines
    type: int
outputs:
  - id: out
- As with the input parameters, add a type for each output parameter. You can use the stdout type to capture the standard output.
cwlVersion: v1.0
class: CommandLineTool
baseCommand: [head]
arguments: [-n$(inputs.nlines), $(inputs.source)]
inputs:
  - id: source
    type: File
  - id: nlines
    type: int
outputs:
  - id: out
    type: stdout
- Well done!
Does it work?
Let's run it with source=manhead.txt and nlines=5 as input parameters.
$ man head > manhead.txt
$ cwltool head.cwl --source manhead.txt --nlines 5
...
[job head.cwl] completed success
{
    "out": {
        "location": "file:///Users/tom-tan/147d6ad946430f35c2aafbff6c5604d43b30aa14",
        "basename": "147d6ad946430f35c2aafbff6c5604d43b30aa14",
        "class": "File",
        "checksum": "sha1$30c2a20de147871db76e237705ac274277504428",
        "size": 145,
        "path": "/Users/tom-tan/147d6ad946430f35c2aafbff6c5604d43b30aa14"
    }
}
Final process status is success
$ cat 147d6ad946430f35c2aafbff6c5604d43b30aa14
HEAD(1) BSD General Commands Manual HEAD(1)
NAME
head -- display first lines of a file
- You can see the output object after [job head.cwl] completed success.
  - You can see the file object named 147d6ad9464... for the out parameter.
- By using the cat command, you can verify that the file has the same content as the output of the head command run without CWL above.
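Instead of passing the inputs as command-line options, the same values can be written in a job file. The following is a minimal sketch; the file name head-job.yml is only an assumption for illustration, while the keys source and nlines are the input ids defined in head.cwl.

```yaml
# head-job.yml (hypothetical file name)
# Each top-level key matches an id in the inputs field of head.cwl.
source:
  class: File        # File inputs are given as File objects
  path: manhead.txt  # resolved relative to this job file
nlines: 5
```

It can then be run with cwltool head.cwl head-job.yml, which should produce the same output object as the command-line form above.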
Next steps!
How terrible the output file name is...
You can specify the file name by using the stdout field.
...
outputs:
  - id: out
    type: stdout
stdout: output.txt # Add this line
I want to set the output file name to manhead-head.txt when the source file name is manhead.txt.
Use the $() notation. You can get manhead from manhead.txt by using the nameroot field of the File object.
...
outputs:
  - id: out
    type: stdout
stdout: $(inputs.source.nameroot)-head.txt # Add this line
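To make nameroot less mysterious, here is a rough sketch of the name-related fields a CWL runner fills in for the File object when source is manhead.txt. It is an illustration of the File object rather than something you write yourself, and other fields (such as location and size) are omitted.

```yaml
# Name-related fields of the File object for the input manhead.txt
class: File
basename: manhead.txt  # file name without the directory part
nameroot: manhead      # basename without its final extension
nameext: .txt          # final extension, including the leading dot
```

With these values, $(inputs.source.nameroot)-head.txt evaluates to manhead-head.txt.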
Can I use Docker?
Yes, use DockerRequirement. You can specify the Docker image by using the dockerPull field.
...
outputs:
  - id: out
    type: stdout
stdout: $(inputs.source.nameroot)-head.txt
# Add the following
requirements:
  - class: DockerRequirement
    dockerPull: debian:latest
You do not have to specify container details such as volume mounts.
CWL workflow engines handle them for you.
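For reference, putting all the fragments from this article together, the whole head.cwl would look roughly like the following. Nothing new is introduced here; it is only the pieces shown so far assembled into one file.

```yaml
cwlVersion: v1.0
class: CommandLineTool
baseCommand: [head]
# -n5 and foobar.txt, replaced by references to the input parameters
arguments: [-n$(inputs.nlines), $(inputs.source)]
inputs:
  - id: source
    type: File
  - id: nlines
    type: int
outputs:
  - id: out
    type: stdout
# Name of the file that captures the standard output
stdout: $(inputs.source.nameroot)-head.txt
requirements:
  - class: DockerRequirement
    dockerPull: debian:latest
```

It runs the same way as before: cwltool head.cwl --source manhead.txt --nlines 5.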
Conclusion
Now you can write tool definitions in CWL!
Do it!