Pull request overview
This PR adds filtering of sex chromosome probes to the UMAP generation pipeline and generates lists of probes that are affected by SNPs or do not map to the genome. The changes enhance the methylation workflow by providing more granular control over probe filtering and making filtered probe lists available as outputs.
Key changes:
- Added sex chromosome probe filtering capability to the UMAP generation
- Generated and output lists of SNP-affected probes and non-genomic probes
- Implemented a batched concatenation mechanism for large probe lists
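For readers unfamiliar with the pattern, batched concatenation can be sketched in Python as below. This is only an illustration under assumptions: the function names are hypothetical (`concat_and_uniq` is borrowed from the WDL task name, but its real implementation is not shown in this PR view), and probe lists are modelled as in-memory lists of IDs rather than files.

```python
from __future__ import annotations

from typing import Iterable, Iterator


def batched(items: list, size: int) -> Iterator[list]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def concat_and_uniq(lists: Iterable[Iterable[str]]) -> list[str]:
    """Merge probe ID lists into one sorted, de-duplicated list (hypothetical helper)."""
    merged: set[str] = set()
    for probes in lists:
        merged.update(probes)
    return sorted(merged)


def batched_concat(per_sample: list[list[str]], batch_size: int) -> list[str]:
    """Merge each batch first, then merge the per-batch results."""
    partial = [concat_and_uniq(batch) for batch in batched(per_sample, batch_size)]
    return concat_and_uniq(partial)


per_sample = [["cg02", "cg01"], ["cg01", "cg03"], ["cg04"]]
result = batched_concat(per_sample, batch_size=2)
```

Because de-duplication is idempotent, the two-stage merge returns the same probes as a single merge over all inputs, while each individual step handles at most `batch_size` lists.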
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| workflows/methylation/methylation-standard.wdl | Added new outputs and batched probe list concatenation logic |
| workflows/methylation/methylation-preprocess.wdl | Added task to list sex chromosome probes and updated outputs |
| workflows/methylation/methylation-cohort.wdl | Integrated sex probe filtering into the cohort workflow |
| workflows/methylation/CHANGELOG.md | Documented new probe list outputs |
| scripts/methylation/methylation-preprocess.R | Added logic to identify and output SNP-affected and non-genomic probes |
| scripts/methylation/list-sex-probes.R | New script to generate sex chromosome probe list |
| scripts/methylation/filter.py | Added support for excluding probes from additional file sources |
| scripts/CHANGELOG.md | Documented script changes |
| docker/pandas/package.json | Incremented revision for pandas container |
| docker/minfi/package.json | Incremented revision for minfi container |
| docker/minfi/Dockerfile | Added new list-sex-probes.R script to container |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Ari Frantz <ari.frantz@stjude.org>
Pull request overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

                File probe_list = probe_files[num]
            }
        }
    }
    scatter (iter_index in range(length(probe_list))){
        call concat_and_uniq { input:
            files_to_combine = select_all(probe_list[iter_index]),

The variable 'probe_list' is used here but refers to a 2D array of optional Files from the nested scatter. Consider renaming to 'probe_list_batches' or 'probe_file_batches' to clarify it contains batches of probe files.
Suggested change:

    -            File probe_list = probe_files[num]
    -        }
    -    }
    -}
    -scatter (iter_index in range(length(probe_list))){
    -    call concat_and_uniq { input:
    -        files_to_combine = select_all(probe_list[iter_index]),
    +            File probe_file_batches = probe_files[num]
    +        }
    +    }
    +}
    +scatter (iter_index in range(length(probe_file_batches))){
    +    call concat_and_uniq { input:
    +        files_to_combine = select_all(probe_file_batches[iter_index]),

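To make the "2D array of optional Files" remark concrete, here is a rough Python analogue of what a nested scatter with an inner conditional produces. It assumes the elided WDL guards the index with an `if`, which is not visible in the excerpt; `make_batches` is a hypothetical name, and `select_all` only imitates the WDL standard library function of the same name.

```python
from __future__ import annotations

from typing import Optional


def make_batches(files: list[str], batch_size: int) -> list[list[Optional[str]]]:
    """Imitate a nested scatter: one row per batch, None where the index is out of range."""
    n_batches = -(-len(files) // batch_size)  # ceiling division
    batches: list[list[Optional[str]]] = []
    for outer in range(n_batches):
        row: list[Optional[str]] = []
        for inner in range(batch_size):
            num = outer * batch_size + inner
            # A declaration inside a conditional is optional once gathered by the scatter.
            row.append(files[num] if num < len(files) else None)
        batches.append(row)
    return batches


def select_all(row: list[Optional[str]]) -> list[str]:
    """Drop undefined entries, like WDL's select_all."""
    return [item for item in row if item is not None]


probe_file_batches = make_batches(["a.txt", "b.txt", "c.txt"], batch_size=2)
last_batch = select_all(probe_file_batches[-1])
```

Seen this way, the value holds batches (rows) of files rather than a single probe list, which is the motivation for the suggested rename.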

                File probe_list_non_genomic = non_genomic_probe_list[num_ng]
            }
        }
    }
    scatter (iter_index in range(length(probe_list_non_genomic))){
        call concat_and_uniq as non_genomic_concat { input:
            files_to_combine = select_all(probe_list_non_genomic[iter_index]),

The variable 'probe_list_non_genomic' refers to a 2D array of optional Files from the nested scatter. Consider renaming to 'non_genomic_probe_batches' or 'non_genomic_file_batches' to clarify it contains batches of probe files.
Suggested change:

    -            File probe_list_non_genomic = non_genomic_probe_list[num_ng]
    -        }
    -    }
    -}
    -scatter (iter_index in range(length(probe_list_non_genomic))){
    -    call concat_and_uniq as non_genomic_concat { input:
    -        files_to_combine = select_all(probe_list_non_genomic[iter_index]),
    +            File non_genomic_probe_batches = non_genomic_probe_list[num_ng]
    +        }
    +    }
    +}
    +scatter (iter_index in range(length(non_genomic_probe_batches))){
    +    call concat_and_uniq as non_genomic_concat { input:
    +        files_to_combine = select_all(non_genomic_probe_batches[iter_index]),

Add filtering of sex chromosomes to the UMAP generation. Also generate a list of probes that have SNPs.
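As a rough illustration of the filtering step, the sketch below drops any probe that appears in one or more exclusion lists (for example sex chromosome probes or SNP-affected probes) before the data would be passed on to UMAP. It is an assumption-laden sketch, not the interface of `filter.py`: the function names and the dictionary-of-beta-values layout are hypothetical.

```python
from __future__ import annotations

from typing import Iterable


def load_probe_ids(lines: Iterable[str]) -> set[str]:
    """Parse one probe ID per line, ignoring blank lines."""
    return {line.strip() for line in lines if line.strip()}


def filter_probes(
    beta: dict[str, list[float]],
    exclude_lists: Iterable[set[str]],
) -> dict[str, list[float]]:
    """Return only the probes that are absent from every exclusion list."""
    excluded: set[str] = set().union(*exclude_lists)
    return {probe: values for probe, values in beta.items() if probe not in excluded}


# Toy beta values keyed by probe ID (made-up numbers for illustration only).
beta = {"cg01": [0.1, 0.2], "cg02": [0.5, 0.6], "cg03": [0.9, 0.8]}
sex_probes = load_probe_ids(["cg02\n", "\n"])
snp_probes = load_probe_ids(["cg03"])
kept = filter_probes(beta, [sex_probes, snp_probes])
```

Taking the exclusion lists as a sequence keeps the function open to further sources without changing its signature.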
Before submitting this PR, please make sure:
- If this PR changes the `scripts/` or `docker/` directories, please ensure any image versions have been incremented accordingly!