Step 1.4: How to Assemble Metadata

Metadata is standardized information provided along with data that helps with data organization and description. It is, essentially, data about the data.

Metadata is useful to you and to anyone else who needs to discover, access, or use the corresponding data or the repository where the data is stored. All data uploaded to Synapse is curated by the EL DMCC to ensure that the metadata supports data usability.

In most cases, we require three things for every set of data:

  • The data itself

  • A description of the study and methods used to generate the data

  • The metadata, captured as annotations on the data files themselves AND as a separate set of files (.csv) about the individuals, specimens from the individuals, and the assay(s) performed on the specimens


Assemble a Study Description, Methods Description, and Acknowledgement Statement

Submit the study description, methods description, and acknowledgement statement with your initial AD + EL Service Desk Request. You can modify them at any time before your data is released by messaging the EL DMCC in your Service Desk ticket. See Step 1.1: How to Submit a Service Desk Request (Study description) for a full description and examples of each of these items.

Assemble Required Metadata Templates

Metadata standards are created in collaboration with the EL consortia and define the minimum required variables expected for all submissions of the same data type. These standards are translated into spreadsheet templates (.csv) that you will need to fill out.

In most cases, the required metadata will be expected in four files:

  1. Individual metadata template: .csv file describing each individual in the study

  2. Biospecimen metadata template: .csv file describing the specimens collected in the study

  3. Assay metadata template: .csv file(s) describing the assay(s) performed

  4. File annotation template: a .tsv (tab-delimited text) file listing each file that will be uploaded
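Before the file annotation template arrives, it can help to draft an inventory of the files you plan to upload. The following is a minimal sketch that writes such a list as tab-delimited text. The column names `fileName` and `path` are hypothetical placeholders, not the columns of the official template; the template supplied by the EL DMCC defines the actual columns.

```python
import csv
import io
from pathlib import PurePath


def file_listing_tsv(paths):
    """Return tab-delimited text with one row per file path.

    The header names below are illustrative assumptions only; use the
    columns from the template you receive from the EL DMCC.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["fileName", "path"])  # hypothetical column names
    for path in sorted(paths):
        # PurePath(...).name extracts the final path component (the file name).
        writer.writerow([PurePath(path).name, path])
    return buffer.getvalue()


if __name__ == "__main__":
    # Example paths are made up for illustration.
    print(file_listing_tsv(["data/b.csv", "data/a.bam"]), end="")
```

The resulting text can be pasted into a spreadsheet as a starting point and then reconciled with the official template.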


1. Generate the metadata template(s)

Contact the EL DMCC via your AD + EL Service Desk ticket to request metadata template(s). In the future, metadata templates will be downloadable from the ELITE Portal Metadata Dictionary.

Metadata templates are subject to change over time as our data standards develop. We recommend that you download templates close to the time when you plan to submit them for validation to ensure that you have the latest version available.

2. Fill in the template(s)

For this step, you may need to reference our full metadata dictionary and definitions. You can browse and search the full dictionary here: ELITE Portal Data Dictionary

Fill in the templates by selecting a value from each cell’s dropdown menu. Continue until all required columns (highlighted in blue) are filled in.

  • Every cell in a required column must contain either a value or one of the options “Unknown”, “Not collected”, “Not applicable”, or “Not specified”. If you do not see a value you need and require a custom term, please contact us and we can add new value(s) to the template.

  • Columns without a dropdown do not have controlled vocabulary. Please enter your own values.

  • Delete any blank rows at the bottom of your completed template.

Once you have filled your template with the necessary metadata, save your file as a CSV (comma-separated values).
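The checks above can be sketched in a few lines of Python before you save the final file. This is only an illustrative pre-check under stated assumptions, not the validation performed by the EL DMCC: the column names `individualID`, `species`, and `sex` are hypothetical, and the list of required columns must come from your own template. A cell holding "Unknown", "Not collected", "Not applicable", or "Not specified" counts as filled because it is not empty.

```python
import csv
import io


def clean_rows(rows):
    """Drop rows in which every cell is empty or whitespace (blank rows)."""
    return [
        row for row in rows
        if any((value or "").strip() for value in row.values())
    ]


def find_empty_required(rows, required_columns):
    """Return (row_number, column) pairs where a required cell is empty.

    Row numbers are 1-based and count data rows only (header excluded).
    """
    problems = []
    for number, row in enumerate(rows, start=1):
        for column in required_columns:
            if not (row.get(column) or "").strip():
                problems.append((number, column))
    return problems


def write_csv(rows, fieldnames):
    """Serialize rows as comma-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical template content: the last row is blank and row 2
    # is missing a value in the (assumed) required column "species".
    sample = "individualID,species,sex\nind-1,Human,Unknown\nind-2,,Female\n,,\n"
    rows = clean_rows(list(csv.DictReader(io.StringIO(sample))))
    for number, column in find_empty_required(rows, ["individualID", "species"]):
        print(f"Row {number}: required column '{column}' is empty")
    print(write_csv(rows, ["individualID", "species", "sex"]), end="")
```

Reading from and writing to real files would replace the `io.StringIO` buffers with `open(...)` calls that pass `newline=""`, as the `csv` module documentation recommends.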

At this point, your preparation work is done. Hold on to these completed templates until you reach Step 4: How to Annotate Data, during which you will submit them for validation.


Assemble Additional Metadata Files

Data contributors may provide additional documentation to help others understand and reuse the data.

Additional metadata can be added as optional columns in the provided metadata templates OR uploaded as separate files. Additional metadata that are uploaded as separate files do not need to conform to the same format as the provided metadata templates.

The following table outlines examples of additional documentation that are commonly submitted.

Documentation Type | Required to Upload?
------------------ | -------------------
Data dictionary (e.g., for demographic variables, phenotype variables, metabolite or peptide names) | Required
Inclusion/exclusion criteria for study participants; sample preparation; data generation or data processing protocols (PDFs, protocols.io links, etc.) | Recommended
Processing code or links to code | Recommended
Configuration files or lists of arguments used with processing software | Recommended
QC files | Recommended
Data generation reports from labs, sequencing facilities, etc. | Recommended
Electronic lab notebook summaries | Recommended
