Admin Guide: S3 Bucket File Structure

Dataset

  • bucket_name: dataset_id prefixed with S3_BUCKET_PREFIX

    S3_BUCKET_PREFIX-dataset_id

    Example: for the dataset https://rdms.cottagelabs.com/concern/datasets/2v23vt540, with S3_BUCKET_PREFIX set to cl2, the bucket name in S3 will be

    cl2-2v23vt540

  • All of the files will be inside bucket_name/

  • The metadata will be saved as metadata.json within the bucket: bucket_name/metadata.json
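
The naming scheme above can be sketched in a few lines of Python. This is a minimal illustration, not the application's actual code: the function names are hypothetical, and the prefix value cl2 is taken from the example above.

```python
# Hypothetical helpers illustrating the Dataset naming scheme described above.

def dataset_bucket_name(prefix: str, dataset_id: str) -> str:
    """Join S3_BUCKET_PREFIX and the dataset id with a hyphen."""
    return f"{prefix}-{dataset_id}"


def dataset_metadata_path(bucket_name: str) -> str:
    """Location of metadata.json, written as bucket_name/metadata.json."""
    return f"{bucket_name}/metadata.json"


bucket = dataset_bucket_name("cl2", "2v23vt540")
print(bucket)                         # cl2-2v23vt540
print(dataset_metadata_path(bucket))  # cl2-2v23vt540/metadata.json
```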

CrcDataset

  • bucket_name: experiment id prefixed with S3_BUCKET_PREFIX

    S3_BUCKET_PREFIX-crc_dataset_id

    Example: for the experiment https://rdms.cottagelabs.com/concern/crc_datasets/xs55mc178?locale=en, with S3_BUCKET_PREFIX set to cl2, the bucket name in S3 will be

    cl2-xs55mc178

  • Experiment

    • all of the experiment files will be inside bucket_name/
    • The metadata will be saved as metadata.json within the bucket: bucket_name/metadata.json
  • Subject

    • sanitised_subject_name:

      • The subject title will be sanitised to follow S3 bucket naming rules.

      • After sanitising, a subject title must be unique within the experiment.

        For example, Sub-001, sub-001 and sub--001 are all considered the same, because each sanitises to sub-001.

    • All of the subject files will be inside

      bucket_name/sanitised_subject_name/

    • The metadata will be saved as metadata.json within the subject folder: bucket_name/sanitised_subject_name/metadata.json

  • Session

    • sanitised_session_name:

      • The session title will be sanitised to follow S3 bucket naming rules.

      • After sanitising, a session title must be unique within the subject.

        For example, Ses-001, ses-001 and ses--001 are all considered the same, because each sanitises to ses-001.

    • All of the session files will be inside

      bucket_name/sanitised_subject_name/sanitised_session_name/

    • The metadata will be saved as metadata.json within the session folder: bucket_name/sanitised_subject_name/sanitised_session_name/metadata.json

  • Modality

    • sanitised_modality_name:

      • The first modality value will be used as the modality title.

      • The modality title will be sanitised to follow S3 bucket naming rules.

      • After sanitising, a modality title must be unique within the session.

        For example, Mod-001, mod-001 and mod--001 are all considered the same, because each sanitises to mod-001.

    • All of the modality files will be inside

      bucket_name/sanitised_subject_name/sanitised_session_name/sanitised_modality_name/

    • The metadata will be saved as metadata.json within the modality folder: bucket_name/sanitised_subject_name/sanitised_session_name/sanitised_modality_name/metadata.json
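
The nesting of experiment, subject, session and modality folders can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the names passed in are assumed to be already sanitised, and the example values sub-001, ses-001 and mod-001 are borrowed from the examples above.

```python
from typing import Optional

# Hypothetical helpers illustrating the CrcDataset folder layout described above.


def crc_folder(
    bucket_name: str,
    subject: Optional[str] = None,
    session: Optional[str] = None,
    modality: Optional[str] = None,
) -> str:
    """Build the folder path for an experiment, subject, session or modality.

    Levels must be supplied from the top down: a session needs a subject,
    and a modality needs both a subject and a session.
    """
    parts = [bucket_name]
    seen_gap = False
    for level in (subject, session, modality):
        if level is None:
            seen_gap = True
        elif seen_gap:
            raise ValueError("a deeper level was given without its parent")
        else:
            parts.append(level)
    return "/".join(parts) + "/"


def metadata_path(folder: str) -> str:
    """metadata.json sits directly inside the folder for its level."""
    return folder + "metadata.json"


bucket = "cl2-xs55mc178"
print(metadata_path(crc_folder(bucket)))
# cl2-xs55mc178/metadata.json
print(metadata_path(crc_folder(bucket, "sub-001", "ses-001", "mod-001")))
# cl2-xs55mc178/sub-001/ses-001/mod-001/metadata.json
```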

S3 bucket naming rules

Amazon S3 defines rules for naming general purpose buckets and directory buckets, as stated in the Amazon S3 documentation.

We have implemented the following rules:

  • Check for length (between 3 and 63 characters long)

  • Bucket names can consist only of lowercase letters, numbers, dots (.), and hyphens (-).

    • Convert title to lowercase
    • Remove any characters other than a-z, 0-9, . and -
  • Bucket names must begin and end with a letter or number.

    • Remove . or - from beginning and end
  • Bucket names must not contain two adjacent periods.

    • Replace .. with .

    • Also, replace -- with - for consistency

  • Bucket names must not start with the prefix xn--

    • Replacing -- with - will already have changed this to xn-
  • Bucket names must not start with the prefix sthree-

  • Bucket names must not end with the suffix -s3alias

  • Bucket names must not end with the suffix --ol-s3

    • Replacing -- with - will already have changed this to -ol-s3
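
The rules above can be sketched as a single Python function. This is a minimal illustration rather than the implemented code, and it rests on several assumptions: the function name is hypothetical, the order of the steps is a guess, runs of three or more dots or hyphens are collapsed to one, and a title that fails the length, sthree- or -s3alias checks is rejected with a ValueError (the guide does not say how such titles are handled).

```python
import re

# Hypothetical constants and function; the real implementation may differ.
MIN_LENGTH = 3
MAX_LENGTH = 63


def sanitise_title(title: str) -> str:
    """Sanitise a title following the rules listed above."""
    name = title.lower()                     # convert to lowercase
    name = re.sub(r"[^a-z0-9.-]", "", name)  # keep only a-z, 0-9, . and -
    name = re.sub(r"\.{2,}", ".", name)      # replace .. with . (assumption: whole runs)
    name = re.sub(r"-{2,}", "-", name)       # replace -- with - (assumption: whole runs)
    name = name.strip(".-")                  # remove . or - from beginning and end

    # Assumption: titles that still break a rule are rejected rather than altered.
    if not MIN_LENGTH <= len(name) <= MAX_LENGTH:
        raise ValueError(f"{name!r} must be between 3 and 63 characters long")
    if name.startswith("sthree-"):
        raise ValueError(f"{name!r} must not start with sthree-")
    if name.endswith("-s3alias"):
        raise ValueError(f"{name!r} must not end with -s3alias")
    return name


# The three example titles collapse to one name, so they count as duplicates.
print({sanitise_title(t) for t in ("Sub-001", "sub-001", "sub--001")})
# {'sub-001'}
```

Because -- is always collapsed to -, a sanitised title can never start with xn-- or end with --ol-s3, so those two rules need no separate check in this sketch.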