Using Amazon Web Services S3
Data for our projects is stored on the Amazon Web Services (AWS) Simple Storage Service (S3).
Accessing data from HCP
The data from the Human Connectome Project is provided as part of the AWS Open Data program. The HCP dataset entry in the program can be found here.
To access the processed Human Connectome Project data, follow the instructions provided here.
To add your HCP credentials to your AWS configuration, use the command-line interface:
aws configure --profile hcp
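The command will prompt you for the access key ID and secret access key that you obtained through the HCP credentials instructions linked above (the default region and output format prompts can be left blank), and will store them under an hcp profile. The resulting ~/.aws/credentials file should contain an entry along these lines (the values below are placeholders):
[hcp]
aws_access_key_id = <your HCP access key ID>
aws_secret_access_key = <your HCP secret access key>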
We also have code in pyAFQ that automatically fetches/reads HCP data from S3.
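For example, once the hcp profile is configured, you should be able to list the contents of the HCP bucket directly from the command line (the bucket and prefix names below are assumptions based on the public HCP Open Access layout at the time of writing, and may change):
aws s3 ls s3://hcp-openaccess/HCP_1200/ --profile hcp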
Uploading data to S3
Before uploading data to our S3 storage, please organize it in a 'BIDSish' format on your local hard drive. The layout should look something like this:
<study>
└── derivatives
    └── <pipeline>
        ├── sub01
        │   ├── ses01
        │   │   ├── anat
        │   │   │   ├── sub-01_ses-01_aparc+aseg.nii.gz
        │   │   │   └── sub-01_ses-01_T1w.nii.gz
        │   │   └── dwi
        │   │       ├── sub-01_ses-01_dwi.bvals
        │   │       ├── sub-01_ses-01_dwi.bvecs
        │   │       └── sub-01_ses-01_dwi.nii.gz
        │   └── ses02
        │       ├── anat
        │       │   ├── sub-01_ses-02_aparc+aseg.nii.gz
        │       │   └── sub-01_ses-02_T1w.nii.gz
        │       └── dwi
        │           ├── sub-01_ses-02_dwi.bvals
        │           ├── sub-01_ses-02_dwi.bvecs
        │           └── sub-01_ses-02_dwi.nii.gz
        └── sub02
            ├── ses01
            │   ├── anat
            │   │   ├── sub-02_ses-01_aparc+aseg.nii.gz
            │   │   └── sub-02_ses-01_T1w.nii.gz
            │   └── dwi
            │       ├── sub-02_ses-01_dwi.bvals
            │       ├── sub-02_ses-01_dwi.bvecs
            │       └── sub-02_ses-01_dwi.nii.gz
            └── ses02
                ├── anat
                │   ├── sub-02_ses-02_aparc+aseg.nii.gz
                │   └── sub-02_ses-02_T1w.nii.gz
                └── dwi
                    ├── sub-02_ses-02_dwi.bvals
                    ├── sub-02_ses-02_dwi.bvecs
                    └── sub-02_ses-02_dwi.nii.gz
Here, <study> is the name of the study, which will also serve as the name of the bucket on S3. Instead of <pipeline>, use the name of the preprocessing pipeline that you used to process the data. For example, you might use vista if you processed the data with the vistalab tools, or dmriprep if you used dmriprep.
Here, we will use the command-line interface to upload the data (see how to get started with that).
The command reference for S3 sub-commands is here.
Specifically, to create a bucket on S3, you will use the mb sub-command:
aws s3 mb s3://study
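Note that S3 bucket names are globally unique across all AWS accounts, so you may need to adjust the name if the study name is already taken. To confirm that the bucket was created, you can list the buckets in your account:
aws s3 ls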
This command only needs to be executed once. Then, upload the data as a sync operation:
aws s3 sync path/to/study s3://study
The use of sync (rather than aws s3 cp) means that only new or changed files will be uploaded when the command is called repeatedly.
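If you would like to preview which files would be transferred without actually uploading anything, you can add the --dryrun flag:
aws s3 sync path/to/study s3://study --dryrun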