R wrapper for the schematic validation workflow. Because schematic has no validation-only service endpoint, metadata is moved twice: a manifest is generated from the server and then submitted back to the server. A validation-only endpoint in schematic would make this much more efficient. A dataset in this context is a folder, usually tagged with contentType = "dataset".

Usage

meta_qc_dataset(
  dataset_id,
  data_type = NULL,
  asset_view = "syn16787123",
  schema_url =
    "https://raw.githubusercontent.com/nf-osi/nf-metadata-dictionary/main/NF.jsonld",
  cleanup = TRUE,
  depth = 1L
)

Arguments

dataset_id

ID of the folder that represents the dataset, not an actual Synapse dataset entity; see Details.

data_type

A specific data type to validate against; if NULL, the function tries to infer it from annotations. See Details.

asset_view

A reference view, defaults to the main NF portal fileview.

schema_url

Schema URL. Defaults to the 'latest' main NF schema; change it to use a specific released version.

cleanup

Whether to automatically remove reconstituted manifests once done. Default TRUE.

depth

How much deeper to go when there appear to be no files in the immediate scope. Defaults to 1L.
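
As a rough illustration of changing schema_url to a specific version, the sketch below builds a raw GitHub URL for NF.jsonld at a given git ref. The helper nf_schema_url() and the tag "vX.Y.Z" are hypothetical and not part of this package; check the nf-metadata-dictionary repository for actual release tags.

```r
# Hypothetical helper (not part of this package): build the raw GitHub URL
# for NF.jsonld at a given git ref (branch name or release tag).
nf_schema_url <- function(ref = "main") {
  paste0(
    "https://raw.githubusercontent.com/nf-osi/nf-metadata-dictionary/",
    ref,
    "/NF.jsonld"
  )
}

# With the default ref this reproduces the documented default schema_url.
latest <- nf_schema_url()

# "vX.Y.Z" is a placeholder, not a real tag; substitute an actual release.
pinned <- nf_schema_url("vX.Y.Z")
```

The pinned URL could then be passed as meta_qc_dataset(dataset_id, schema_url = pinned), so that validation results do not shift when the main branch of the schema changes.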

Value

A list of the form list(result = result, notes = notes), where result indicates whether validation passed, or is NA if there is no data or the dataset couldn't be validated for other reasons.
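
One way to consume this return value is sketched below. It assumes result is a length-one logical (TRUE, FALSE, or NA), which this page does not state explicitly; describe_qc() is a hypothetical helper, and the input lists are mocked stand-ins for real calls to meta_qc_dataset().

```r
# Hypothetical helper: turn a list(result =, notes =) into a one-line summary.
# Assumes result is a length-one logical: TRUE, FALSE, or NA.
describe_qc <- function(qc) {
  status <- if (is.na(qc$result)) {
    "not validated"  # no data, or validation could not be run
  } else if (isTRUE(qc$result)) {
    "pass"
  } else {
    "fail"
  }
  paste0(status, ": ", paste(qc$notes, collapse = "; "))
}

# Mocked results; the notes text is invented for illustration only.
passed <- describe_qc(list(result = TRUE, notes = "No errors"))
skipped <- describe_qc(list(result = NA, notes = "No data files found"))
```

Checking is.na() first matters because NA is neither TRUE nor FALSE, so a plain if (qc$result) would raise an error for datasets that could not be validated.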

Details

Note that we prefer to wrap the schematic web API over a local installation because:

  • It does not require the user to go through local schematic setup for this to be functional

  • The API is more likely to reflect an up-to-date version of schematic and to be consistent with the current DCA deployment

When data_type can't be inferred based on annotations, this is treated as a fail.

Status: alpha and likely to change based on changes in schematic.