
Step 4: How to Annotate Data

Once data is uploaded to Synapse, you will use the Data Curator App (DCA) to annotate your data files. The DCA is a web-based metadata upload application developed and managed by Sage Bionetworks.

Annotating your data is the process of labeling files so that users can:

  • get a general understanding of what information is contained in the file

  • determine what type of experiment was performed

  • determine what type of biospecimen was used

  • assess whether the data is the type of data that they are looking for


Annotating your data involves the following actions, to be completed in order:

1. Ensure your data is properly uploaded

If you haven’t done so already, follow the instructions at Step 3: How to Upload Data.

2. Navigate to the Data Curator app

To access the app, you must be logged in to Synapse. If you are not already logged in when you land on this page, you will be prompted to do so. Click the Synapse link provided on the page to log in, then refresh the app page. You should now be able to access it.

3. Find the data you want to annotate

Once in the app, follow the prompts to select the project, folder, and the type of assay you are annotating.

If the app is idle for too long, it will disconnect to reduce the load on the server. Simply refresh the page to restart the app.

4. Generate metadata template(s)

If you already completed your metadata templates in Step 1.4: How to Assemble Metadata, please:

  • Verify that the version of the template you filled in matches the latest version available.

  • Once verified, skip ahead to 6. Submit your template.

Note that metadata templates are subject to change over time as our data standards develop. We recommend that you download templates close to the time when you plan to submit them for validation to ensure that you have the latest version available.
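One rough way to spot an outdated template is to compare its column headers with those of a freshly downloaded blank template. The sketch below is only an illustration and assumes that both templates have been saved as CSV; the column names and the helper functions are hypothetical and not part of the DCA.

```python
import csv
import io

def read_header(csv_text):
    """Return the header row (column names) of a CSV given as text."""
    reader = csv.reader(io.StringIO(csv_text))
    return next(reader, [])

def compare_headers(filled_text, latest_text):
    """Report columns that differ between a filled template and the latest blank one."""
    filled = read_header(filled_text)
    latest = read_header(latest_text)
    return {
        "missing_from_filled": [c for c in latest if c not in filled],
        "no_longer_in_latest": [c for c in filled if c not in latest],
    }

# Hypothetical column names, for illustration only.
filled_csv = "individualID,species,sex\nA1,Human,Female\n"
latest_csv = "individualID,species,sex,cohort\n"

report = compare_headers(filled_csv, latest_csv)
print(report)
```

A header comparison only catches added or removed columns. It does not detect changes to the allowed values in a column, so validation in the DCA remains the definitive check.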

Click Download template to download a blank template. Click on the generated file link to open the template.

At minimum, the EL DMCC requires data contributors to submit:

  • One Individual metadata file

  • One Biospecimen metadata file

  • One Assay metadata file for each type of assay (e.g., one for RNAseq, one for ChIPseq)

  • One File annotation template for each type of data (e.g., one for RNAseq, one for ChIPseq)

5. Fill in the template

For this step, you may need to reference our full metadata dictionary and definitions. You can browse and search the full dictionary here: https://eliteportal.github.io/data-models/

Fill in the templates by selecting the available variables in each cell’s dropdown menu. Continue filling in the columns until all required columns (highlighted in blue) are filled in.

  • All required columns must contain either a value or an “Unknown, Not collected, Not applicable, Not specified” option. If you do not see the value you need and require a custom term, please contact us so that we can add the new value(s) to the template.

  • Columns without a dropdown do not have controlled vocabulary. Please enter your own values.

  • Delete any blank rows at the bottom of your completed template.

Once you have filled your template with the necessary metadata, save your file as a CSV (comma-separated values) and navigate back to DCA.
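Before uploading, the checks described above (no empty required cells, no blank rows) can be approximated locally. The following is a minimal sketch, assuming the template has already been exported to CSV; the column names and the list of required columns are hypothetical, and the DCA's own validation remains the authoritative check.

```python
import csv
import io

def _text(value):
    """Return a stripped string, treating missing cells as empty."""
    return value.strip() if isinstance(value, str) else ""

def drop_blank_rows(rows):
    """Remove rows whose cells are all empty or whitespace."""
    return [r for r in rows if any(_text(v) for v in r.values())]

def find_empty_required(rows, required):
    """Return (row_number, column) pairs where a required cell is empty."""
    problems = []
    for i, row in enumerate(rows, start=2):  # row 1 is the header
        for col in required:
            if not _text(row.get(col)):
                problems.append((i, col))
    return problems

def to_csv_text(rows, fieldnames):
    """Serialize the cleaned rows back to CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

# Hypothetical template content with two blank rows at the bottom.
template_csv = (
    "individualID,species,notes\n"
    "A1,Human,first visit\n"
    "A2,,\n"
    ",,\n"
    ",,\n"
)

fieldnames = ["individualID", "species", "notes"]
rows = drop_blank_rows(list(csv.DictReader(io.StringIO(template_csv))))
problems = find_empty_required(rows, required=["individualID", "species"])
cleaned_csv = to_csv_text(rows, fieldnames)
print(len(rows), problems)
```

Row numbers in the report refer to the cleaned file, counting the header as row 1. Any cell it lists would need a value or one of the "unknown"-style options before submission.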

6. Submit your template

Upload your CSV file using the Browse button. Preview your completed metadata and, once ready, click Validate Filled Metadata.

[Figure: Error message example]

If you receive an error after clicking the Validate Metadata button, the template cells causing the error will be highlighted, along with a corresponding list of error details. You can correct the errors in your local CSV file or in your original spreadsheet. Once you’ve resolved the errors, resave your CSV, upload it again in the Validate & Submit Metadata screen, and click Validate Metadata.

Once you receive the No Errors Found message, you are ready to click Submit data.

[Figure: Success notification example]

You should receive a Success! notification that your metadata has been submitted. You can either visit the Synapse project using the link, or click OK to navigate back to the validation screen.

After submission, be sure to review your template and/or your table in Synapse to ensure your data was submitted properly.

7. (Optional) Updating Annotations

If you need to update or add to your metadata templates, go to the DCA, select the same project, team, and staging metadata folder where the previous metadata templates are saved, and refill the template with your additions. The DCA will automatically download the existing information from the previous file. Once the template is updated, complete 6. Submit your template to apply your changes.

You can reuse existing templates as long as your dataset has not changed (i.e., no files have been added or deleted). If files were deleted, please ensure that the templates no longer contain the records associated with those files.
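To check whether a reused template still matches the dataset, you could compare the file names it lists against the files currently in your folder. This is only a sketch: the "Filename" column name, the example file names, and both helper functions are assumptions made for illustration, and the current file list would need to come from your own records or from Synapse.

```python
def find_stale_records(template_rows, current_files, filename_column="Filename"):
    """Return file names listed in the template that are no longer in the dataset."""
    current = set(current_files)
    return [
        row[filename_column]
        for row in template_rows
        if row[filename_column] not in current
    ]

def find_unannotated_files(template_rows, current_files, filename_column="Filename"):
    """Return dataset files that have no record in the template."""
    annotated = {row[filename_column] for row in template_rows}
    return [name for name in current_files if name not in annotated]

# Hypothetical template records and current folder contents.
template_rows = [
    {"Filename": "sample_01.fastq.gz", "assay": "RNAseq"},
    {"Filename": "sample_02.fastq.gz", "assay": "RNAseq"},
]
current_files = ["sample_01.fastq.gz", "sample_03.fastq.gz"]

stale = find_stale_records(template_rows, current_files)
unannotated = find_unannotated_files(template_rows, current_files)
print("Remove from template:", stale)
print("Add to template:", unannotated)
```

If either list is non-empty, the dataset has changed since the template was filled in, and the template should be updated before it is resubmitted.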

If you need to update or add to your metadata templates after data has been released to the ELITE Portal, please contact the EL DMCC team for assistance.

8. Release your data

Once you have finished annotating the data, you are ready to complete the final checks before releasing your data to the public. See Step 5: How to Release Data.


Step-by-Step Tutorial

Click on the image below for a step-by-step guide to annotating data in the DCA.

[Image: step-by-step tutorial screenshot]
