Create datasets for Sarek-called somatic or germline variant results

Organize variant call files from Nextflow Sarek into 3-4 datasets, grouping files by variant type and workflow. Titles have the format "type Genomic Variants - workflow Pipeline", e.g. "Somatic Genomic Variants - Strelka Pipeline". This assumes that you want datasets that segregate Somatic and Germline calls, which makes sense for NF because Germline calls can be treated differently. The function uses the latest version of all files and creates a Draft version of each dataset.

Usage

nf_sarek_datasets(
  output_map,
  parent,
  workflow = c("FreeBayes", "Mutect2", "Strelka", "DeepVariant"),
  verbose = TRUE,
  dry_run = TRUE
)

Arguments

output_map: The data.table returned from map_sample_output_sarek. See details for alternatives.
parent: Synapse id of parent project where the dataset will live.
workflow: One of the workflows used.
verbose: Optional, whether to be verbose. Defaults to TRUE.
dry_run: If TRUE, don't actually store dataset, just return the data object for inspection or further modification.

Value

A list of dataset objects.

Details

Grouping the files requires only the Synapse entity id, variant type, and workflow. Instead of getting this info by running map_* as in the example, you may prefer using a fileview. In that case, download a table from the fileview that has the id column (renamed to output_id) plus the dataType and workflow annotations. The fileview can only be used after the files are annotated. If you want to create datasets before files are annotated, you have to use map_*.
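The fileview alternative can be sketched in base R as follows. This is only an illustration under stated assumptions: the `fileview` data frame stands in for a table downloaded from a fileview, and its ids and annotation values are made up. The grouping step only shows how files fall into type and workflow groups, and is not the package's internal implementation.

```r
# Hypothetical fileview export; ids and annotation values are placeholders.
fileview <- data.frame(
  id = c("syn0000001", "syn0000002", "syn0000003", "syn0000004"),
  dataType = c("SomaticVariants", "SomaticVariants",
               "GermlineVariants", "GermlineVariants"),
  workflow = c("Strelka", "Strelka", "Strelka", "DeepVariant"),
  stringsAsFactors = FALSE
)

# Rename id to output_id so the table has the expected column name.
output_map <- fileview
names(output_map)[names(output_map) == "id"] <- "output_id"

# Illustrative only: list the file ids in each variant type + workflow group,
# dropping combinations that have no files.
groups <- split(
  output_map$output_id,
  list(output_map$dataType, output_map$workflow),
  drop = TRUE
)
print(groups)
```

Since the `output_map` argument is documented as a data.table, a plain data frame like this one may need converting first, e.g. with `data.table::as.data.table()`.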

Finally, datasets cannot use the same name if stored in the same project, so if there are multiple batches, the names will have to be made unique by adding the batch number, source data id, processing date, or whatever makes sense.
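One way to keep names unique across batches is to append a batch label to the standard title. The helper below is a hypothetical sketch, not a function from this package, and the batch labels are invented. How the resulting title is set on a dataset object is not shown here.

```r
# Hypothetical helper: build a title in the documented format and
# optionally append a batch label to make it unique within a project.
dataset_title <- function(type, workflow, batch = NULL) {
  title <- sprintf("%s Genomic Variants - %s Pipeline", type, workflow)
  if (!is.null(batch)) {
    title <- paste0(title, " (", batch, ")")
  }
  title
}

# Two batches of the same variant type and workflow get distinct names.
titles <- c(
  dataset_title("Somatic", "Strelka", batch = "Batch 1"),
  dataset_title("Somatic", "Strelka", batch = "Batch 2")
)
print(titles)
```

Because `dry_run = TRUE` returns the dataset objects without storing them, a renaming step like this would fit between the dry run and the actual storing of the datasets.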

Examples

if (FALSE) { # \dontrun{
syn_out <- "syn26648589"
m <- map_sample_output_sarek(syn_out)
datasets <- nf_sarek_datasets(m, parent = "syn26462036", dry_run = FALSE) # use a test project
} # }