Annotate processed aligned reads — annotate_aligned

Put together annotation components for nextflow star-salmon outputs. Annotations come from several sources:

1. Inherit some annotations from the original input files. This requires a reference mapping of which input files to use. Most property values can be inherited by the derived files, e.g. assay type, but "comments" and "entityId" cannot. Ideally, the data model itself would include inheritance rules; since that isn't currently possible, the rules are hard-coded here, which makes this hard to generalize to other data models.
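The inheritance rule described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the data frames, the column names `input_id` and `output_id`, the helper `inherit_annotations()`, and all IDs are hypothetical. Only the exclusion of "comments" and "entityId" comes from the description.

```r
# Hypothetical annotations on two input files (not real Synapse data).
input_annotations <- data.frame(
  entityId = c("syn001", "syn002"),
  assay = c("rnaSeq", "rnaSeq"),
  specimenID = c("S1", "S2"),
  comments = c("re-uploaded", NA),
  stringsAsFactors = FALSE
)

# Hypothetical mapping of input files to derived output files.
sample_io <- data.frame(
  input_id = c("syn001", "syn002"),
  output_id = c("syn101", "syn102"),
  stringsAsFactors = FALSE
)

# Properties that are not inherited; hard-coded, as the description notes.
not_inherited <- c("comments", "entityId")

# Hypothetical helper: copy every property except the excluded ones
# from each input file to the output file derived from it.
inherit_annotations <- function(sample_io, input_annotations, exclude) {
  keep <- setdiff(names(input_annotations), exclude)
  idx <- match(sample_io$input_id, input_annotations$entityId)
  inherited <- input_annotations[idx, keep, drop = FALSE]
  out <- data.frame(
    entityId = sample_io$output_id,
    inherited,
    stringsAsFactors = FALSE
  )
  rownames(out) <- NULL
  out
}

derived <- inherit_annotations(sample_io, input_annotations, not_inherited)
print(derived)
```

Each derived file gets its own `entityId` and the inherited `assay` and `specimenID` values, while `comments` stays with the input file.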

Usage

annotate_aligned_reads(
  sample_io,
  samtools_stats_file = NULL,
  picard_stats_file = NULL,
  template = "bts:ProcessedAlignedReadsTemplate",
  schema =
    "https://raw.githubusercontent.com/nf-osi/nf-metadata-dictionary/main/NF.jsonld",
  verbose = TRUE,
  dry_run = TRUE
)

Arguments

sample_io: Table mapping inputs to outputs.
samtools_stats_file: Local path or Synapse ID of the file with samtools stats produced by the workflow.
picard_stats_file: Local path or Synapse ID of the file with picard stats produced by the workflow.
template: (Optional) URI of the template in the data model to use, prefixed if needed. A different model or version can be specified, though it may not work well in some cases.
schema: Path (URL or local) to file from which schema will be read, or schema as list object.
verbose: Give verbose reports for what's happening.
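dry_run: Whether to skip applying annotations. If TRUE (the default), annotations are not applied and only the manifest is returned; set to FALSE to apply them.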

Details

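The remaining annotation sources are:

2. Extract metrics from auxiliary files to surface as annotations. See annotate_with_tool_stats.
3. Manually add annotations that can't (yet?) be derived from the input files (source 1) or the auxiliary files (source 2). This has to be done outside of this util.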

Always returns a "partial" manifest, which can be adjusted as needed, for example when default values such as the linked workflow version are out of date. The dry_run parameter controls whether the annotations are also applied.
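A minimal sketch of adjusting a returned manifest before applying it by other means. The manifest here is a stand-in data frame so the example runs without Synapse access; the column name `workflowLink`, the URLs, and the IDs are assumptions for illustration and may not match what the function actually returns.

```r
# In practice the manifest would come from a dry run, e.g.:
# manifest <- annotate_aligned_reads(sample_io, dry_run = TRUE)

# Stand-in for a returned partial manifest (hypothetical columns and values).
manifest <- data.frame(
  entityId = c("syn101", "syn102"),
  workflowLink = rep("https://example.org/workflow/1.0", 2),
  stringsAsFactors = FALSE
)

# Replace an out-of-date default workflow version with the one actually used.
manifest$workflowLink <- sub(
  "/1.0", "/2.0", manifest$workflowLink,
  fixed = TRUE
)

print(manifest)
```

Keeping dry_run = TRUE while inspecting and correcting the manifest avoids writing stale default values to the files.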