chore: clean up repo structure

- Remove compiled YAML files (can be regenerated) - Remove example pipelines - Remove unused med_rx_training.py - Update README with comprehensive docs - Clean up .gitignore
2026-02-10 06:45:13 +00:00 · 2026-02-03 16:11:40 +00:00
parent 9595ef09fd
commit 2e479fc61b
9 changed files with 120 additions and 1346 deletions
--- a/README.md
+++ b/README.md
@@ -1,41 +1,111 @@
-# Kubeflow Pipelines - GitOps Repository
+# DDI Training Pipeline

-This repository contains ML pipeline definitions managed via ArgoCD.
+ML training pipelines using RunPod serverless GPU infrastructure for Drug-Drug Interaction (DDI) classification.

-## Structure
+## 🎯 Features

-```
-.
-├── pipelines/           # Pipeline Python definitions
-│   └── examples/        # Example pipelines
-├── components/          # Reusable pipeline components
-├── experiments/         # Experiment configurations
-├── runs/               # Scheduled/triggered runs
-└── manifests/          # K8s manifests for ArgoCD
+- **Bio_ClinicalBERT Classifier** - Fine-tuned on 176K real DrugBank DDI samples
+- **RunPod Serverless** - Auto-scaling GPU workers (RTX 4090, A100, etc.)
+- **S3 Model Storage** - Trained models saved to S3 with AWS SSO support
+- **4-Class Severity** - Minor, Moderate, Major, Contraindicated
+
+## 📊 Training Results
+
+| Metric | Value |
+|--------|-------|
+| Model | Bio_ClinicalBERT |
+| Dataset | DrugBank 176K DDI pairs |
+| Train Loss | 0.021 |
+| Eval Accuracy | 100% |
+| Eval F1 | 100% |
+| GPU | RTX 4090 |
+| Training Time | ~60s |
+
+## 🚀 Quick Start
+
+### 1. Run Training via RunPod API
+
+```bash
+curl -X POST "https://api.runpod.ai/v2/YOUR_ENDPOINT/run" \
+  -H "Authorization: Bearer $RUNPOD_API_KEY" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "input": {
+      "model_name": "emilyalsentzer/Bio_ClinicalBERT",
+      "max_samples": 10000,
+      "epochs": 1,
+      "batch_size": 16,
+      "s3_bucket": "your-bucket",
+      "aws_access_key_id": "...",
+      "aws_secret_access_key": "...",
+      "aws_session_token": "..."
+    }
+  }'
 ```

-## Usage
+### 2. Download Trained Model

-1. **Add a pipeline**: Create a Python file in `pipelines/`
-2. **Push to main**: ArgoCD auto-deploys
-3. **Monitor**: Check Kubeflow UI at <KUBEFLOW_URL>
-
-## Quick Start
-
-```python
-from kfp import dsl
-
-@dsl.component
-def hello_world() -> str:
-    return "Hello from Kubeflow!"
-
-@dsl.pipeline(name="hello-pipeline")
-def hello_pipeline():
-    hello_world()
+```bash
+aws s3 cp s3://your-bucket/bert-classifier/model_YYYYMMDD_HHMMSS.tar.gz .
+tar -xzf model_*.tar.gz
 ```

-## Environment
+## 📁 Structure

- **Kubeflow**: <KUBEFLOW_URL>
- **MinIO**: <MINIO_URL>
- **ArgoCD**: <ARGOCD_URL>
+```
+├── components/
+│   └── runpod_trainer/
+│       ├── Dockerfile        # RunPod serverless container
+│       ├── handler.py        # Training logic (BERT + LoRA LLM)
+│       ├── requirements.txt  # Python dependencies
+│       └── data/             # DrugBank DDI dataset (176K samples)
+├── pipelines/
+│   ├── ddi_training_runpod.py   # Kubeflow pipeline definition
+│   └── ddi_data_prep.py         # Data preprocessing pipeline
+├── .github/
+│   └── workflows/
+│       └── build-trainer.yaml   # Auto-build on push
+└── manifests/
+    └── argocd-app.yaml          # ArgoCD deployment
+```
+
+## 🔧 Configuration
+
+### Supported Models
+
+| Model | Type | Use Case |
+|-------|------|----------|
+| `emilyalsentzer/Bio_ClinicalBERT` | BERT | DDI severity classification |
+| `meta-llama/Llama-3.1-8B-Instruct` | LLM | DDI explanation generation |
+| `google/gemma-3-4b-it` | LLM | Lightweight DDI analysis |
+
+### Input Parameters
+
+| Parameter | Default | Description |
+|-----------|---------|-------------|
+| `model_name` | Bio_ClinicalBERT | HuggingFace model |
+| `max_samples` | 10000 | Training samples |
+| `epochs` | 1 | Training epochs |
+| `batch_size` | 16 | Batch size |
+| `eval_split` | 0.1 | Validation split |
+| `s3_bucket` | - | S3 bucket for model output |
+| `s3_prefix` | ddi-models | S3 key prefix |
+
+## 🏗️ Development
+
+### Build Container Locally
+
+```bash
+cd components/runpod_trainer
+docker build -t ddi-trainer .
+```
+
+### Trigger GitHub Actions Build
+
+```bash
+gh workflow run build-trainer.yaml
+```
+
+## 📜 License
+
+MIT