2 Commits

Author SHA1 Message Date
5f554ea769 refactor: environment variable configuration for all pipeline settings
- Add config.py with dataclass-based configuration from env vars
- Remove hardcoded RunPod endpoint and credentials
- Consolidate duplicate training components into single reusable function
- Add .env.example with all configurable options
- Update README with environment variable documentation
- Add Kubernetes secrets example for production deployments
- Add timeout and error handling improvements

BREAKING: Pipeline parameters are now read from environment variables by default.
Set RUNPOD_API_KEY, RUNPOD_ENDPOINT, S3_BUCKET, and AWS credentials.
2026-02-03 20:47:27 +00:00
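
The dataclass-based configuration described in this commit might look roughly like the sketch below. It is not the actual config.py: the class name `PipelineConfig`, the `from_env` helper, and the optional `PIPELINE_TIMEOUT_S` variable are assumptions made for illustration. Only RUNPOD_API_KEY, RUNPOD_ENDPOINT, and S3_BUCKET come from the commit message. AWS credentials are left out because the commit does not name the variables it expects.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class PipelineConfig:
    """Hypothetical settings container; field names are illustrative."""

    runpod_api_key: str
    runpod_endpoint: str
    s3_bucket: str
    # Assumed optional setting; the real variable name is not in the commit.
    request_timeout_s: float = 300.0

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "PipelineConfig":
        """Build the config from a mapping, defaulting to os.environ."""
        source = os.environ if env is None else env
        required = ("RUNPOD_API_KEY", "RUNPOD_ENDPOINT", "S3_BUCKET")
        # Collect every missing name so the error reports all of them at once.
        missing = [name for name in required if not source.get(name)]
        if missing:
            raise RuntimeError(
                "Missing required environment variables: " + ", ".join(missing)
            )
        return cls(
            runpod_api_key=source["RUNPOD_API_KEY"],
            runpod_endpoint=source["RUNPOD_ENDPOINT"],
            s3_bucket=source["S3_BUCKET"],
            request_timeout_s=float(source.get("PIPELINE_TIMEOUT_S", "300")),
        )
```

Passing a plain dict to `from_env` instead of relying on the process environment keeps the loader easy to test, and failing fast on missing variables surfaces the breaking change at startup rather than mid-run.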
0bf3837e78 feat: add ADE, Triage, and Symptom-Disease training pipelines
New tasks supported:
- task=ade: Adverse Drug Event classification (ADE Corpus V2, 30K samples)
- task=triage: Medical Triage classification (urgency levels)
- task=symptom_disease: Symptom-to-Disease prediction (40+ diseases)

All three tasks use Hugging Face datasets, Bio_ClinicalBERT, and S3 model storage.
2026-02-03 16:20:55 +00:00
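
The `task=` values listed in this commit suggest a single dispatch point that maps a task name to its settings. A minimal sketch of that idea follows. The `TaskSpec` type, the `TASKS` table, and `resolve_task` are hypothetical names, not taken from the repository, and dataset identifiers are omitted because the commit does not give them.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class TaskSpec:
    """Hypothetical per-task record; the real pipeline may store more."""

    name: str
    description: str


# Task keys and descriptions are copied from the commit message.
TASKS: Dict[str, TaskSpec] = {
    "ade": TaskSpec("ade", "Adverse Drug Event classification"),
    "triage": TaskSpec("triage", "Medical Triage classification"),
    "symptom_disease": TaskSpec("symptom_disease", "Symptom-to-Disease prediction"),
}


def resolve_task(task: str) -> TaskSpec:
    """Return the spec for a task name, rejecting unknown names early."""
    try:
        return TASKS[task]
    except KeyError:
        valid = ", ".join(sorted(TASKS))
        raise ValueError(f"Unknown task {task!r}; expected one of: {valid}") from None
```

Validating the task name before any dataset download or GPU job is submitted keeps a typo in `task=` from costing a full pipeline run.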