Table Utils

Main PORTAL table data update and management

Augment and update one of the main portal tables (e.g. Portal - Studies)

add_publication_from_pubmed()
Add a publication to the publication table
add_publication_from_unpaywall()
Add a publication or preprint to the publication table via the Unpaywall API.
add_publications_from_file()
Add a batch of publications from a spreadsheet
assign_study_data_types()
Summarize data types for the study
calculate_related_studies()
Calculate and add related studies to study table
register_study()
Register a NEW project for the NF Data Portal in Portal - Project View
add_people_from_table()
Update the People table from a source Table or View column
register_study_files()
Register a project's files in Portal - Files
summarize_attribute()
Helper summarization util

Lower-level table maintenance

adjust_view()
Adjust view
swap_col()
Swap out old column for a new column in a schema
byte_budget()
Calculate byte budget for a schema

Project Configuration and Management

Create or retrofit an NF project to expected default structure and assets

new_project()
Create a new project
add_default_fileview()
Create default project fileview
add_default_folders()
Create default folders
make_admin()
Make a user or group full admin of a Synapse entity

Metadata Utils

General annotations

Add and manage annotations on Synapse entities

set_annotations()
Wrapper around the Python set_annotations that pulls the entity's current annotations, then adds the given annotations or replaces the values of annotations whose keys already exist on the entity.
update_study_annotations()
Updates a set of files with project-level annotations.
annotate_with_manifest()
Set annotations from a manifest
copy_annotations()
Copy annotations
.modify_annotation()
Modify a single annotation on a single file
meta_qc_dataset()
QC dataset metadata with pass/fail result
meta_qc_project()
QC metadata at the project level with pass/fail result
manifest_generate()
Generate manifest via schematic service
manifest_validate()
Validate manifest via schematic service
manifest_validate_wrapper()
Validate with stated data_type in manifest
manifest_passed()
Provide a pass/fail summary result
precheck_manifest()
Precheck a manifest
remanifest()
Reconstitute a manifest
infer_data_type()
Infer data type of a dataset folder

Special annotation of Nextflow processed data

Map and annotate outputs of Nextflow workflows (e.g. nf-rnaseq, nf-sarek)

map_reports_sarek()
Map out Sarek report files
map_sample_input_ss()
Parse Nextflow samplesheet for sample inputs
map_sample_io()
Map sample input-output
map_sample_output_rnaseq()
Map sample to output from nf-rnaseq with path
map_sample_output_sarek()
Map sample to output from nf-sarek
annotate_processed()
Annotate processed data
annotate_aligned_reads()
Annotate processed aligned reads
annotate_called_variants()
Annotate somatic or germline variants output
annotate_quantified_expression()
Annotate quantified expression output
annotate_reports_sarek()
Annotate Sarek reports
annotate_with_samtools_stats()
Make annotations from samtools stats
processed_meta()
Metadata for processed products
nf_workflow_version()
Return workflow version according to workflow meta

Dataset Creation and Management

General dataset creation and citation

Create datasets in general

new_dataset()
Create new dataset with given items

Working with dataset collections to manage datasets after creation

add_to_collection()
Add to collection
use_latest_in_collection()
Update item versions to "latest" in a collection
update_items()
INTERNAL - apply updates to a collection of items

Data Model Utils

Query the JSON-LD data model that underlies the portal data (i.e. NF-metadata-dictionary)

get_by_prop_from_json_schema()
Look up connected nodes by specified property in JSON-LD schema
get_dependency_from_json_schema()
Get dependencies for node in JSON-LD schema
get_valid_values_from_json_schema()
Retrieve valid subclasses of a value in a JSON-LD schema
key_label_to_id()
Query for schema key id given label
schema_max_str_len()
Consult schema about max string length

Governance Utils

Analyze and manage data access/restrictions

make_public_viewable()
Set public access to VIEW (READ) only for an entity
make_public()
Make public
check_access()
Check access
summarize_file_access()
Summarize file access for files within some view
grant_specific_file_access()
Provide access to a specific set of files using a query result.

Search Utils

Help locate Synapse accessions, etc.

find_child()
Find id of a child entity in a container
find_child_type()
Find children of type
find_data_root()
Find data folder
find_in()
Find in path
find_nf_asset()
Find a standard Nextflow workflow output asset
find_parent()
Find parent

Provenance Utils

Manage provenance metadata

add_activity()
Add activity to entity
add_activity_batch()
Add activity to multiple entities
delete_provenance()
Remove provenance info

Content Utils

Create and manage content for projects and pages

add_default_wiki()
Add default wiki
wiki_mod()
Add markup to a project wiki
remove_wiki_subpage()
Remove a subpage from a project wiki
data_curator_app_subpage()
Create NF Data Curator App subpage
get_project_wiki()
Get wiki content of synapse project(s)
check_wiki_links()
Check wiki links
remove_button()
Remove button from a project wiki

Figures and diagrams

Supplemental figures and diagrams that go into Wikis or other places

processing_flowchart()
Wrapper to create data-driven flowchart with pretty processing provenance mermaid template
dsp_dataset_mapping()
Wrapper to create Data Sharing Plan to project dataset comparison chart
bipartite_mmd_template()
Simple bipartite representation in mermaid charts

Export Data to Other Platforms

Helpers to export/release NF data to other platforms/databases.

cBioPortal

Export data as a cBioPortal study

cbp_new_study()
Initialize a new cBioPortal study dataset
cbp_new_cancer_type()
Create reference file for new cancer type
cbp_add_maf()
Export and add mutations data to cBioPortal dataset
cbp_add_clinical()
Export and add clinical data to cBioPortal dataset
cbp_add_expression()
Export and add expression data to cBioPortal dataset
cbp_add_cna()
Export and add CNA (seg) data to cBioPortal dataset

Quality Control and Testing Utils

QC data

check_readpair_validity()
Check that the FASTQ read pair matches the samplesheet read pair assignment.
identify_read_pair()
Identify read pair from string
test_failed()
Format a test fail message.
test_passed()
Format a test passed message.

Basic Utils

Low-level functions

syn_login()
Logs into Synapse.
table_query()
Generic table query
as_table_schema()
Transform table data to target schema for Synapse storage
make_folder()
Create project folders
bind_schema()
Wrapper for JSON schema binding
add_to_scope()
Add to scope
new_view()
Create a view
list_project_datasets()
List datasets in project
latest_version()
Get the latest version
walk()
Walk through a directory
copy()
Create copy of entity
get_path()
Get path for a Synapse id
convert_to_stringlist()
Convert a delimited string to a stringlist annotation
bare_syn_id()
Extract synapse id from URI or other string
bad_url()
Helper function to check urls
.update_table_data()
Replace/update table contents. Input data must have ROW_ID and ROW_VERSION columns to update; otherwise the data is appended.
.update_view_data()
Replace/update view contents. Input data must have ROW_ID, ROW_VERSION, and ETAG columns to update.
from_pubmed()
Get publication metadata from PubMed

Internal/experimental

Functions mostly meant for internal or experimental use

.delim_string_to_vector()
Convert a delimited string to vector, utility function.
.dict_to_list()
Convert a flat Python Dict to R list
.replace_string_column_with_stringlist_column()
Replace string column with stringlist column
.store_rows()
Adds a row to a table.
missing_annotation_email()
Convert a delimited string to a stringlist annotation
get_doi_meta()
Get DOI metadata if it exists
cite_dataset()
Generate example dataset citation