
docs: revise best practices and contributing guide #293

Open
a-frantz wants to merge 5 commits into main from docs/best-practices

Conversation

@a-frantz
Member

@a-frantz a-frantz commented Feb 3, 2026

Describe the problem or feature in addition to a link to the issues.

closes #277

Before submitting this PR, please make sure:

  • You have added a few sentences describing the PR here.
  • The code passes all CI tests without any errors or warnings.
  • You have added tests (when appropriate).
  • You have added an entry in any relevant CHANGELOGs (when appropriate).
  • If you have made any changes to the scripts/ or docker/ directories, please ensure any image versions have been incremented accordingly!
  • You have updated the README or other documentation to account for these changes (when appropriate).

@a-frantz a-frantz self-assigned this Feb 3, 2026
@a-frantz a-frantz requested a review from adthrasher February 3, 2026 20:59
@a-frantz a-frantz marked this pull request as ready for review February 3, 2026 21:59
@a-frantz a-frantz changed the title from WIP[docs: revise best practices and contributing guide] to docs: revise best practices and contributing guide Feb 3, 2026
Comment on lines +38 to +39
- See `template/common-parameter-meta.txt` for common description strings.
- If applicable, use the same parameter name and help text as the underlying tool called by the task.
Member

Not at all related to this PR, but I wonder if there is a way to tell Copilot about this file. It'd be great if its autocomplete suggestions would consider that file as a set of options when writing `parameter_meta`.
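
For anyone reading along, a task following that convention might look like this (a sketch only; the names and description strings below are illustrative, not taken from the actual `template/common-parameter-meta.txt`):

```wdl
version 1.2

task example_task {
    meta {
        description: "Hypothetical task illustrating the parameter_meta convention"
    }

    parameter_meta {
        # Description strings would be reused from the shared template file
        bam: "Input BAM format file to process"
        ncpu: "Number of cores to allocate for task"
    }

    input {
        File bam
        Int ncpu = 1
    }

    command <<<
        echo "~{basename(bam)}"
    >>>

    output {
        String name = read_string(stdout())
    }
}
```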

- Contributors can mix and match the available templates, copy and pasting subsections as appropriate.
- It is allowed to have one resource allocated dynamically, and another allocated statically in the same task.
- Multi-core tasks should *always* follow the conventions laid out in the `use_all_cores_task` example (see `template/task-examples.wdl`).
- This is catering to cloud users, who may be allocated a machine with more cores than are specified by the `ncpu` parameter.
Member

We may want to revisit this paradigm in the future. With Planetary + AKS, the container only gets allocated the requested number of cores within the VM.

Member Author

A question you may not have the answer to: how does `nproc` work in those environments? I think that `nproc` on the cluster is "smart" and only returns what was allocated.

Member

It does. With Planetary + AKS, the container is allocated n CPUs, and it can only see that many. I'm not actually confident that the idea behind `nproc` is sound. As long as the task is running in a container, the Docker engine will have provided some number of cores to the container. Presumably, whatever engine is queueing the Docker container will pass the number of cores as the CPU argument to Docker. The `use_all_cores` idea is probably better implemented at the engine level: if the engine allocates one VM per task, then it should `use_all_cores`; if it is capable of sharing VMs between tasks, then it shouldn't.

Member Author

The point of `use_all_cores` is so that you can tell called tools what level of parallelism should be used. Most bioinformatics tools will only spin up as many threads as you ask for (i.e. without specifying something like `--threads <value greater than 1>`, everything will run in one thread regardless of how many cores the task/engine/backend/VM allocates).

Member

Yeah, I meant more that the engine itself should set `use_all_cores` if it isn't capable of scheduling multiple tasks to a node.

Given our experience with LSF, though, I'm curious how other HPC systems function (e.g. Slurm). LSF does something so that the underlying task only sees the allocated cores. If scheduling systems reliably did that, then we could just always `use_all_cores`.
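
For context, the convention under discussion looks roughly like this (a sketch; the actual `use_all_cores_task` example in `template/task-examples.wdl` may differ, and the image tag below is a hypothetical placeholder):

```wdl
version 1.2

task use_all_cores_sketch {
    input {
        File bam
        Int ncpu = 4
        Boolean use_all_cores = false
    }

    command <<<
        n_cores=~{ncpu}
        if [ "~{use_all_cores}" = "true" ]; then
            # On a dedicated cloud VM, `nproc` may report more cores
            # than the engine reserved via `ncpu`.
            n_cores=$(nproc)
        fi
        samtools sort -@ "$n_cores" -o sorted.bam ~{bam}
    >>>

    output {
        File sorted_bam = "sorted.bam"
    }

    runtime {
        cpu: ncpu
        # hypothetical pinned image; never `latest`
        docker: "quay.io/biocontainers/samtools:1.19"
    }
}
```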

- e.g. do not use `latest` tags for Docker images.
- This ensures reproducibility across time and environments.
- Check all assumptions made about workflow inputs before beginning long running executions.
- Common examples of assumptions that should be checked: valid `String` choice, mutually exclusive parameters, missing optional file for selected parameters, filename extensions
Member

Valid `String` choice should be replaced with `enum` in new code. We should probably update this to note that parenthetically.

Suggested change
- Common examples of assumptions that should be checked: valid `String` choice, mutually exclusive parameters, missing optional file for selected parameters, filename extensions
- Common examples of assumptions that should be checked: valid `String` choice (post-WDL 1.3, an `enum` should be used in place of `String`s with a fixed set of valid options), mutually exclusive parameters, missing optional file for selected parameters, filename extensions

Member Author

Related question, should this document be applicable to all versions of WDL (which may have slightly different best practices), or should we simplify things and assume any "best practice conforming" WDL is going to be in the latest revision of the specification (which for now is 1.3)?

This would be easier to write if we just assumed everything was 1.3, but I also don't have enough experience (nil) writing 1.3 to feel very confident staking my flag in this document if we go that direction 🤔

Member

I'm not ready to push it to 1.3, since we aren't using it and nothing else supports it. That said, I think the document should be written to some form of "latest" (which, for now, I'd leave at 1.2). I think it's OK to note that something will change in the next version, if nothing else, as a reminder to ourselves to update it. We may want to consider having a section at the end along the lines of "pre-1.3"/"pre-1.2"/etc. that we move old best practices to when they no longer apply to the version we are targeting.
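
To make the pre-1.3 practice concrete, a fail-fast `String` check could look something like this (a sketch; the `strandedness` parameter and its options are hypothetical):

```wdl
version 1.2

task check_strandedness {
    input {
        # Pre-enum: a String with a documented, fixed set of valid options
        String strandedness
    }

    command <<<
        case "~{strandedness}" in
            unstranded|forward|reverse) ;;
            *)
                >&2 echo "invalid strandedness: ~{strandedness}"
                exit 1
                ;;
        esac
    >>>

    output {
        String validated = strandedness
    }
}
```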

- Use `after` clauses in workflows to ensure that all these assumptions are valid before beginning tasks with heavy computation.
- If the _contents_ of a `File` are not read or do not need to be localized for a task, try to coerce the `File` variable to a `Boolean` (with `defined()`) or a `String` (with `basename()`) to avoid unnecessary disk space usage and networking.
- All requirement values are overridable at runtime. However, tasks should have easily configurable memory and disk space allocations.
- TODO should this be here?
Member

I'd leave it for now. If we're sharing this in the OpenWDL channels, I assume we'll get feedback if people disagree.

Member Author

Makes sense. I will rephrase it a bit to be clearer in intention (and agnostic of our specific naming conventions).
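
A sketch of the two practices quoted above (hypothetical task names; the `basename()` coercion means the check never localizes the file):

```wdl
version 1.2

task check_extension {
    input {
        # Coerced with basename() at the call site, so the File
        # itself never needs to be localized for this check.
        String bam_name
    }

    command <<<
        case "~{bam_name}" in
            *.bam) ;;
            *) >&2 echo "expected a .bam file: ~{bam_name}"; exit 1 ;;
        esac
    >>>
}

task heavy_step {
    input {
        File bam
    }

    command <<<
        echo "processing ~{basename(bam)}"
    >>>
}

workflow after_example {
    input {
        File bam
    }

    call check_extension { input: bam_name = basename(bam) }

    # `after` orders the calls even though heavy_step consumes
    # no output from check_extension.
    call heavy_step after check_extension { input: bam = bam }
}
```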

- TODO should this be here?
- multi-core tasks should *always* follow the conventions laid out in the `use_all_cores_task` example (see `template/task-examples.wdl`)
- this is catering to cloud users, who may be allocated a machine with more cores than are specified by the `ncpu` parameter
- TODO should this be here? Will at least need to be rephrased.
Member

I think we could delete this as a global best practice.

Comment on lines +29 to +31
- The non-empty qualifier (`+`) of arrays and maps should be avoided.
- TODO this is really just our opinion after trying to use it and finding the resulting WDL ugly. I'm not sure it can really be called a "best practice".
- This is because the non-empty qualifier can be cumbersome to deal with in WDL source code.
Member

I don't actually recall what led to writing this. I'm not sure we should recommend skipping use of a WDL language feature though as a best practice. If anything, the spec should be improved around the feature to improve the user experience.

Member Author

We briefly had `Array[*]+` on main before we had some issue with sprocket + miniwdl compatibility that made us go back and revert them all (and write this entry), although I now can't remember the specifics. It should be in our git history, though it will take some digging to find.
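
For reference, the qualifier in question (hypothetical task):

```wdl
version 1.2

task merge_bams {
    input {
        # `+` asserts the array is non-empty; engines reject an
        # empty value at runtime rather than running the command.
        Array[File]+ bams
    }

    command <<<
        samtools merge merged.bam ~{sep(" ", bams)}
    >>>

    output {
        File merged = "merged.bam"
    }
}
```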

Co-authored-by: Andrew Thrasher <adthrasher@gmail.com>


Development

Successfully merging this pull request may close these issues.

revisit best-practices.md

2 participants