# S3 Session Downloader

This Python script enables efficient bulk searching and downloading of Empatica data files from AWS S3. Designed for participant data structures, it allows filtering by session ID and supports parallel downloads with a visual progress bar.
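A parallel fetch of this kind can be sketched with the standard library alone. This is a minimal illustration, not the script's actual implementation: `fetch` is a stand-in for the real S3 download call, and the real script may use a different worker pool or progress-bar library.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def download_all(keys, fetch, max_workers=8):
    """Run `fetch(key)` for every key in parallel and collect the results.

    `fetch` is a placeholder for the real per-file S3 download.
    """
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch, key): key for key in keys}
        for future in as_completed(futures):
            # Results arrive in completion order, not submission order.
            results.append(future.result())
    return results
```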
## Quick Start
This project uses `uv` for package management.

1. Clone the repository:

   ```shell
   git clone https://dev.b2s.club/josu/s3_data_downloader.git
   cd s3_data_downloader
   ```

2. Sync the environment:

   ```shell
   uv sync
   ```

3. Configure credentials by creating a `.env` file in the project root:

   ```
   AWS_ACCESS_KEY_ID=your_access_key
   AWS_SECRET_ACCESS_KEY=your_secret_key
   AWS_DEFAULT_REGION=us-east-1
   ```
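Loading a `.env` file of this shape can be sketched with the standard library alone. The helper below is hypothetical; the script itself may rely on a package such as python-dotenv instead.

```python
import os

def load_env(path=".env"):
    """Hypothetical minimal .env loader: reads KEY=value lines into os.environ.

    Uses setdefault so real environment variables are never overwritten.
    """
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments, and malformed lines
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```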
## Usage

The script offers two main modes of operation:

### 1. Download a single session

Pass the session ID directly as an argument:

```shell
uv run main.py id_123
```

### 2. Download multiple sessions from a file

Create a `sessions.txt` file with one ID per line and run:

```shell
uv run main.py -f sessions.txt
```
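Reading such a sessions file can be sketched as follows; the helper name is an assumption, not taken from `main.py`.

```python
def read_session_ids(path):
    """Hypothetical helper: return one session ID per non-blank line."""
    with open(path) as fh:
        return [line.strip() for line in fh if line.strip()]
```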
### Available Options

| Argument | Description |
|---|---|
| `session_id` | Unique session ID to download. |
| `-f, --file` | Path to a `.txt` file containing multiple session IDs. |
| `-v, --verbose` | Show DEBUG-level information. |
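The options in the table can be sketched with `argparse`. Argument names are taken from the table above; the script's actual parser may be structured differently.

```python
import argparse

def build_parser():
    """Sketch of the CLI described in the options table (assumed, not verbatim)."""
    parser = argparse.ArgumentParser(description="S3 Session Downloader")
    # Positional ID is optional so that -f/--file can be used instead.
    parser.add_argument("session_id", nargs="?",
                        help="Unique session ID to download")
    parser.add_argument("-f", "--file",
                        help="Path to a .txt file containing multiple session IDs")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Show DEBUG information")
    return parser
```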
## Output Structure

The script recreates the S3 folder hierarchy locally while stripping the `participant_data` prefix to keep the directory tree clean:

```
sessions/
└── id_123/
    └── 2024-01-23/
        └── digital_biomarkers/
            └── accelerometer.csv
```
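Mapping an S3 key to its local path under this scheme can be sketched as below; the helper name and the exact prefix handling are assumptions.

```python
from pathlib import Path

def local_path(key, root="sessions"):
    """Map an S3 object key to a local path, dropping the participant_data prefix."""
    parts = key.split("/")
    if parts and parts[0] == "participant_data":
        parts = parts[1:]  # strip the assumed top-level prefix
    return Path(root, *parts)
```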
## Security

- **Do not upload the `.env` file:** ensure your `.gitignore` is configured to ignore credentials.
- **IAM permissions:** verify that your AWS credentials have `s3:ListBucket` and `s3:GetObject` permissions.
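For reference, a minimal IAM policy granting just these two actions might look like the following. The bucket name is a placeholder; substitute your own.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::your-bucket-name"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::your-bucket-name/*"
    }
  ]
}
```

Note that `s3:ListBucket` applies to the bucket ARN itself, while `s3:GetObject` applies to the objects inside it (`/*`).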
## License

This project is licensed under the GNU GPL v3 License - see the LICENSE file for details.