Glossary | ELITE Portal Help Docs

Refer back to this page if you’re ever unsure about what something means or the difference between terms.

Annotations

Annotations are essentially extra pieces of information included with data in a project, file, folder, table, or view. This additional information, in the form of controlled vocabulary, helps to surface the data in a structured way. If you are a data contributor uploading data into Synapse, you can find more information on how to assign and edit annotations here.

As a user, annotations are what allow you to systematically search for and find specific data of interest. If you haven’t already, learn how to filter and find data on the ELITE Portal here.

Looking for descriptions of annotations? Visit the metadata dictionary and learn more about using the metadata dictionary here.

Controlled Access Data

While all data uploaded to Synapse falls under the principles of Open Data, individual-level human data is Controlled Access Data and requires the submission of a Data Use Certificate (DUC). You can find more detailed information on this here. Individual-level data in any form (raw, processed, derived) cannot be shared outside the ELITE Portal.

Controlled Value

A pre-formatted value that must be used as defined. Ex: True instead of yes; female instead of woman
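
As a rough illustration of the idea, the sketch below maps free-text entries onto controlled values. The mapping table and the function name to_controlled_value are hypothetical and built only from the two examples above; the actual vocabulary is defined in the metadata dictionary.

```python
# Hypothetical mapping from free-text entries to controlled values.
# Only the two pairs from the glossary examples are included; the real
# vocabulary is defined in the metadata dictionary.
FREE_TEXT_TO_CONTROLLED = {
    "yes": "True",
    "woman": "female",
}

CONTROLLED_VALUES = set(FREE_TEXT_TO_CONTROLLED.values())


def to_controlled_value(raw: str) -> str:
    """Return the controlled value for `raw`, or raise if none is known."""
    cleaned = raw.strip()
    if cleaned in CONTROLLED_VALUES:
        return cleaned  # already a controlled value, kept exactly as defined
    key = cleaned.lower()
    if key in FREE_TEXT_TO_CONTROLLED:
        return FREE_TEXT_TO_CONTROLLED[key]
    raise ValueError(f"{raw!r} is not a recognized controlled value")
```

Rejecting unknown entries, rather than passing them through, reflects the "must be used as defined" part of the definition.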

Data

In our context, this refers to the data generated across studies. Explore all data on the portal here.

Data Subtype

Data Subtype (also referred to as dataSubtype) is a file annotation that indicates if data in the file are raw, processed, or normalized, or if the file contains metadata.

Individual ID

An individual ID is the identifier for a specific individual (human subject or single animal).

Individual Level Data

Individual level data refers to any file that has values for an individual, as opposed to aggregate data, which are data combined from several individuals.

File Schema

The JSON schema associated with the Synapse File Entity.

Governance

Due to the open-access nature of the platform, Synapse operates under comprehensive governance policies that define the rights and responsibilities of Synapse users. This includes our standard operating procedures (SOPs), privacy policy, code of conduct, community standards, and more.

Grant

A grant is represented by a contract number assigned to an NIH-funded project.

HIPAA-Limited Data

Data that excludes all PHI (as defined by HIPAA) except for at least one of the following:

dates such as admission, discharge, service, DOB, DOD
city, state, five-digit or longer ZIP code
ages in years, months, days, or hours

Note: HIPAA-limited data should always be categorized in the Controlled Access Data Tier.

Manifest

A list of files and their metadata. There are several different types of manifests used throughout Synapse:

Upload manifest: This is a .tsv file used to upload metadata—more details, along with a template, are provided here.
Download manifest: This is used when downloading data programmatically—the template is provided by Synapse Python Client.
File Schema Driven Manifest: This is based on the new File Schema.
Portals Manifest: This is currently provided when exporting data.
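
To make the "list of files and their metadata" idea concrete, here is a minimal sketch that reads tab-separated manifest text with Python's standard csv module. The column names (path, dataSubtype), the file names, and the helper read_manifest are placeholders for illustration and do not reproduce the official upload template.

```python
import csv
import io

# Hypothetical manifest content: the column names and file names below
# are placeholders, not the official template.
EXAMPLE_MANIFEST = (
    "path\tdataSubtype\n"
    "counts_raw.csv\traw\n"
    "counts_norm.csv\tnormalized\n"
)


def read_manifest(text: str) -> list[dict[str, str]]:
    """Parse tab-separated manifest text into one dict per file row."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [dict(row) for row in reader]


rows = read_manifest(EXAMPLE_MANIFEST)
for row in rows:
    # Each row pairs a file with its metadata values.
    print(row["path"], row["dataSubtype"])
```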

Metadata

Metadata is additional, standardized information included alongside the data to give it context—data about the data, if you will. Metadata is what allows data in the portal to be searchable, discoverable, accessible, re-usable, and understandable to others, including those who were not involved in the data generation process.

Metadata can be descriptive (e.g., the name of the file), administrative (e.g., provenance information), or research-based (e.g., information about the sampling and handling of data).

Find everything you need to know about using metadata here.

NIA/NIH (National Institute on Aging/National Institutes of Health)

The NIA leads a broad scientific effort to understand the nature of aging and to extend the healthy, active years of life. NIA is a division of the NIH. Both are American institutes.

Open Data/Open Science

Open data represents transparent and accessible knowledge that is shared and developed through collaborative networks, based on the principles of open science. The goal of open science is to make scientific research—including publications, data, physical samples, and software—and its dissemination accessible to all levels of an inquiring society, whether amateur or professional.

The general driving idea behind open science and data is that scientific research can and should be accessible to anyone—because, well, why not? This system benefits all parties involved—the researchers gain wider-reaching recognition and appreciation for their work, the study subjects get to witness the palpable value of providing their personal data, scientists and other professionals are able to use properly funded research to aid in their own research/work, and the general public gains helpful information and knowledge from trusted sources. This is truly a win-win—collective consciousness is a global good!

People

These are the researchers and supporting stakeholders who contribute to the portal and make up the consortium. In the People section of the portal, you can query for people based on their institutions, programs, and the Grants which fund their work.

Explore all people on the portal here.

Clicking on a name will take you to their Synapse profile and provide basic information about the person, including their Synapse email.

Program

In our context, a program represents a group of scientists working together towards a common research goal. In this way, “Consortia” is often used interchangeably with “program.” An exception is the Community Data Contribution Program, where researchers outside of the funded programs contribute data and other content to the portal.

Project

In our context, a project is typically associated with an NIH grant. So, the terms project and grant are often used interchangeably. Explore all projects on the portal here.

Publications

Publications are a core output of research studies—many of them are Open Access and can be directly accessed by anyone. Explore all publications on the portal here.

Results

Data analyses that surface on the portal through biological and computational tools and are enhanced by information available on the portal, such as metadata and provenance.

From a reusability perspective, data is the most useful to future users. Both results and data can be shared, but “data” is more important for reproducibility and reuse.

We consider data to be raw or partially processed information, depending on the type of experiment. Results are generally post-analysis information or manuscript figures. For example, if you are sharing gene expression information, raw data would be the raw, zipped, fastq.gz files, while differential expression analysis and volcano plots would be considered results. This distinction is well defined for many types of data, but for assays that we encounter less often this may be less clear. Results might also be acceptable for assays that do not lend themselves to re-analysis, such as western blotting. We can work with you to help figure this out.

Schema

A concept that overlaps with the data model, a metadata schema provides further rules and standardization for a data model. It outlines additional rules governing the management of metadata through constraints such as the optionality or valid values of attributes.
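
The two kinds of constraint mentioned above, optionality and valid values, can be sketched in a few lines of plain Python. Everything in this example is an assumption made for illustration: the attribute names, the required flags, the value lists, and the validate helper are not the portal's actual schema, which is expressed as JSON schema.

```python
# Hypothetical schema: attribute names, required flags, and valid values
# are placeholders for illustration only.
EXAMPLE_SCHEMA = {
    "dataSubtype": {
        "required": True,
        "valid_values": {"raw", "processed", "normalized", "metadata"},
    },
    "specimenID": {
        "required": False,
        "valid_values": None,  # None means any value is accepted
    },
}


def validate(record: dict, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for name, rule in schema.items():
        if name not in record:
            if rule["required"]:
                problems.append(f"missing required attribute: {name}")
            continue
        allowed = rule["valid_values"]
        if allowed is not None and record[name] not in allowed:
            problems.append(f"invalid value for {name}: {record[name]!r}")
    return problems
```

Returning a list of problems instead of stopping at the first one lets a contributor fix every issue in a record in a single pass.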

Sensitive Data

Data that must be protected from unauthorized access to safeguard the privacy or security of an individual or organization. This includes human data at risk of re-identification.

Note: “De-identified” data (maintained in a way that does not allow association with a specific person) is not considered sensitive.

Specimen ID

A specimen ID is the identifier for a sample from a specific individual – for example, a brain sample from a specific region or a blood sample.

Study

A study is the primary unit of data organization in the portal. Essentially, each study represents an individual research project with specific objectives and focus (one project can operate multiple studies). A study can represent data generated from a specific human cohort, data from experiments on a model system, cross-consortium data processing and analysis efforts, or data associated with a specific publication. Explore all studies on the portal here.

Synapse ID

Every object in Synapse (file, folder, project, table, view, user, etc.) is designated a unique Synapse ID (also known as synID) that is readable by programmatic clients.
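
Because synIDs are meant to be handled by programs, a quick format check can catch typos before an ID is passed to a client. The sketch below assumes the common form of a synID, the prefix "syn" followed by digits; the helper name is_syn_id and the sample IDs are made up for illustration.

```python
import re

# Assumption: a synID is the prefix "syn" followed by one or more digits.
SYN_ID_PATTERN = re.compile(r"syn[0-9]+")


def is_syn_id(value: str) -> bool:
    """Return True if `value` looks like a Synapse ID under that assumption."""
    return SYN_ID_PATTERN.fullmatch(value) is not None


# Sample values are invented, not real Synapse objects.
for candidate in ["syn123", "123", "syn12a"]:
    print(candidate, is_syn_id(candidate))
```

This only checks the shape of the string; it does not confirm that the object exists in Synapse or that you have access to it.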

Tools

The portal houses tools available to the research community to assist with data research and analysis. Find computational tools here.