diff --git a/USE_CASES.md b/USE_CASES.md
new file mode 100644
index 0000000..13aff5b
--- /dev/null
+++ b/USE_CASES.md
@@ -0,0 +1,158 @@
+# Healthcare ML Use Cases & Datasets
+
+Curated list of healthcare/biomedical use cases similar to the current DDI classification task, each with a publicly available dataset suitable for training on RunPod.
+
+---
+
+## 🔥 Priority 1: Ready to Train
+
+### 1. Adverse Drug Event Classification
+**Dataset:** `Lots-of-LoRAs/task1495_adverse_drug_event_classification`
+- **Task:** Classify text for presence of adverse drug events
+- **Size:** ~10K samples
+- **Labels:** Binary (adverse event / no adverse event)
+- **Use Case:** Pharmacovigilance, FDA reporting automation
+- **Model:** Bio_ClinicalBERT
+
+```python
+from datasets import load_dataset
+ds = load_dataset("Lots-of-LoRAs/task1495_adverse_drug_event_classification")
+```
+
+### 2. PubMed Multi-Label Classification (MeSH)
+**Dataset:** `owaiskha9654/PubMed_MultiLabel_Text_Classification_Dataset_MeSH`
+- **Task:** Assign MeSH medical subject headings to research articles
+- **Size:** ~50K articles
+- **Labels:** Multi-label (medical topics)
+- **Use Case:** Literature categorization, research discovery
+- **Model:** PubMedBERT
+
+```python
+from datasets import load_dataset
+ds = load_dataset("owaiskha9654/PubMed_MultiLabel_Text_Classification_Dataset_MeSH")
+```
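+
+Multi-label training usually needs each article's label set turned into a fixed-length 0/1 vector. A minimal sketch, assuming the labels have already been extracted into Python lists; the toy label names and both helper functions are hypothetical, not part of the dataset or of any library:
+
+```python
+def build_label_index(label_lists):
+    """Map each distinct label to a stable column index (alphabetical order)."""
+    distinct = sorted({label for labels in label_lists for label in labels})
+    return {label: i for i, label in enumerate(distinct)}
+
+
+def to_multi_hot(labels, label_index):
+    """Encode one article's labels as a 0.0/1.0 vector (one slot per label)."""
+    vec = [0.0] * len(label_index)
+    for label in labels:
+        vec[label_index[label]] = 1.0
+    return vec
+
+
+# Hypothetical toy labels, not taken from the dataset.
+example_labels = [["Anatomy", "Diseases"], ["Diseases"], ["Organisms", "Anatomy"]]
+label_index = build_label_index(example_labels)
+vectors = [to_multi_hot(labels, label_index) for labels in example_labels]
+```
+
+Float vectors are used here because sigmoid-plus-binary-cross-entropy losses generally expect float targets.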
+
+### 3. Symptom-to-Disease Prediction
+**Dataset:** `shanover/disease_symptoms_prec_full`
+- **Task:** Predict disease from symptom descriptions
+- **Size:** Variable
+- **Labels:** Disease categories
+- **Use Case:** Triage, symptom checker apps
+- **Model:** Bio_ClinicalBERT
+
+```python
+from datasets import load_dataset
+ds = load_dataset("shanover/disease_symptoms_prec_full")
+```
+
+### 4. Medical Triage Classification
+**Dataset:** `shubham212/Medical_Triage_Classification`
+- **Task:** Classify urgency level of medical cases
+- **Size:** ~5K samples per the comparison table below (~500 Hub downloads)
+- **Labels:** Triage levels (Emergency, Urgent, Standard)
+- **Use Case:** ER automation, telemedicine routing
+- **Model:** Bio_ClinicalBERT
+
+---
+
+## 📚 Priority 2: QA & Reasoning
+
+### 5. MedMCQA - Medical Exam Questions
+**Dataset:** `openlifescienceai/medmcqa` (24K downloads!)
+- **Task:** Answer medical entrance exam questions
+- **Size:** 194K MCQs covering 2.4K healthcare topics
+- **Labels:** Multiple choice (A/B/C/D)
+- **Use Case:** Medical education, knowledge testing
+- **Model:** Llama-3 or Gemma (LLM fine-tuning)
+
+```python
+from datasets import load_dataset
+ds = load_dataset("openlifescienceai/medmcqa")
+```
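+
+For LLM fine-tuning, each record has to be rendered as a prompt with a lettered answer. A minimal sketch, assuming the record fields are named `question`, `opa` to `opd`, and `cop` (a 0-based index of the correct option); verify these against the loaded dataset before relying on them. The `format_mcq` helper and the sample record are hypothetical:
+
+```python
+OPTION_KEYS = ("opa", "opb", "opc", "opd")  # assumed field names
+LETTERS = "ABCD"
+
+
+def format_mcq(record):
+    """Return (prompt, answer_letter) for one multiple-choice record."""
+    cop = record["cop"]
+    if cop not in range(len(LETTERS)):
+        # Unlabeled records (for example a held-out test split) cannot be used for training.
+        raise ValueError(f"record has no usable answer index: {cop!r}")
+    lines = [f"Question: {record['question']}"]
+    for letter, key in zip(LETTERS, OPTION_KEYS):
+        lines.append(f"{letter}. {record[key]}")
+    lines.append("Answer:")
+    return "\n".join(lines), LETTERS[cop]
+
+
+# Hypothetical record for illustration only.
+sample = {
+    "question": "Which vitamin deficiency causes scurvy?",
+    "opa": "Vitamin A",
+    "opb": "Vitamin B12",
+    "opc": "Vitamin C",
+    "opd": "Vitamin D",
+    "cop": 2,
+}
+prompt, answer = format_mcq(sample)
+```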
+
+### 6. PubMedQA - Research Question Answering
+**Dataset:** `qiaojin/PubMedQA` (18K downloads!)
+- **Task:** Answer yes/no/maybe questions from abstracts
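+- **Size:** ~274K samples in total across three subsets: 1K expert-labeled (`pqa_labeled`), 61.2K unlabeled (`pqa_unlabeled`), 211.3K artificially generated (`pqa_artificial`)
+- **Labels:** yes / no / maybe in `pqa_labeled`; `pqa_artificial` has automatically generated yes/no labels; `pqa_unlabeled` has none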
+- **Use Case:** Evidence-based medicine, literature review
+- **Model:** PubMedBERT or Bio_ClinicalBERT
+
+```python
+from datasets import load_dataset
+# A config name is required: "pqa_labeled", "pqa_unlabeled", or "pqa_artificial".
+ds = load_dataset("qiaojin/PubMedQA", "pqa_labeled")
+```
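+
+To train a three-way classifier, each record needs a (question, abstract, label id) triple. A minimal sketch, assuming the labeled records expose `question`, `final_decision`, and a `context` dict whose `contexts` entry is a list of abstract passages; check the field names on the loaded dataset. The `to_example` helper and the sample record are hypothetical:
+
+```python
+LABEL2ID = {"yes": 0, "no": 1, "maybe": 2}
+
+
+def to_example(record):
+    """Return (question, abstract, label_id) for one labeled record."""
+    decision = record["final_decision"].strip().lower()
+    if decision not in LABEL2ID:
+        raise ValueError(f"unexpected final_decision: {decision!r}")
+    # Join the abstract passages into one string for the second tokenizer segment.
+    abstract = " ".join(record["context"]["contexts"])
+    return record["question"], abstract, LABEL2ID[decision]
+
+
+# Hypothetical record for illustration only.
+sample = {
+    "question": "Does drug X reduce symptom Y?",
+    "context": {"contexts": ["Background text.", "Results text."]},
+    "final_decision": "Yes",
+}
+question, abstract, label_id = to_example(sample)
+```
+
+Returning the question and abstract separately lets a BERT-style tokenizer encode them as a sentence pair.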
+
+---
+
+## 🧬 Priority 3: Specialized NLP
+
+### 7. Medical Abbreviation Disambiguation (MeDAL)
+**Dataset:** `McGill-NLP/medal`
+- **Task:** Disambiguate medical abbreviations in context
+- **Size:** 14GB → curated to 4GB
+- **Labels:** Abbreviation meanings
+- **Use Case:** Clinical note processing, EHR parsing
+- **Model:** Bio_ClinicalBERT
+
+### 8. BioInstruct - Instruction Following
+**Dataset:** `bio-nlp-umass/bioinstruct`
+- **Task:** Instruction-tuned biomedical tasks
+- **Size:** 25K instructions
+- **Labels:** Various biomedical tasks
+- **Use Case:** General biomedical assistant
+- **Model:** Llama-3 or Mistral (LoRA fine-tuning)
+
+---
+
+## 🛠️ Implementation Roadmap
+
+### Week 1: Adverse Drug Events
+1. Download ADE dataset
+2. Add to handler.py as new training mode
+3. Train classifier → S3
+4. Build inference endpoint
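+
+Step 2 could be structured as a small mode registry, so later weeks only register another trainer. A minimal sketch, assuming the handler receives a job dict with an `input` key; the `mode` and `params` keys, the `register` decorator, and `train_ade` are hypothetical and do not reflect the existing handler.py:
+
+```python
+TRAINERS = {}
+
+
+def register(mode):
+    """Decorator that records a training function under a mode name."""
+    def decorator(fn):
+        TRAINERS[mode] = fn
+        return fn
+    return decorator
+
+
+@register("ade")
+def train_ade(params):
+    # Placeholder: real code would load the dataset, train, and upload to S3.
+    return {"mode": "ade", "status": "not implemented", "params": params}
+
+
+def handler(job):
+    """Dispatch a job to the trainer named by job["input"]["mode"]."""
+    job_input = job.get("input", {})
+    mode = job_input.get("mode")
+    if mode not in TRAINERS:
+        return {"error": f"unknown mode: {mode!r}", "available": sorted(TRAINERS)}
+    return TRAINERS[mode](job_input.get("params", {}))
+
+
+result = handler({"input": {"mode": "ade", "params": {"epochs": 1}}})
+```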
+
+### Week 2: PubMed Classification
+1. Download PubMed MeSH dataset
+2. Add a multi-label classification head
+3. Train → S3
+4. Literature search API
+
+### Week 3: Medical QA
+1. Download MedMCQA
+2. LLM fine-tuning with LoRA
+3. Deploy QA endpoint
+
+### Week 4: Symptom Checker
+1. Download symptom-disease dataset
+2. Train classifier
+3. Build symptom input → disease prediction API
+
+---
+
+## 📊 Dataset Comparison
+
+| Dataset | Size | Task | Difficulty | Business Value |
+|---------|------|------|------------|----------------|
+| DDI (current) | 176K | Classification | Medium | ⭐⭐⭐⭐⭐ |
+| Adverse Events | ~10K | Binary | Easy | ⭐⭐⭐⭐⭐ |
+| PubMed MeSH | ~50K | Multi-label | Medium | ⭐⭐⭐⭐ |
+| MedMCQA | 194K | MCQ | Hard | ⭐⭐⭐⭐ |
+| PubMedQA | 274K | Yes/No/Maybe | Medium | ⭐⭐⭐⭐ |
+| Symptom→Disease | Varies | Classification | Easy | ⭐⭐⭐⭐⭐ |
+| Triage | ~5K | Classification | Easy | ⭐⭐⭐⭐⭐ |
+
+---
+
+## 🔗 Additional Resources
+
+- **MIMIC-III/IV:** ICU clinical data (requires PhysioNet access)
+- **n2c2 Challenges:** Clinical NLP shared tasks
+- **i2b2:** De-identified clinical records
+- **ChemProt:** Chemical-protein interactions
+- **BC5CDR:** Chemical-disease relations
+
+---
+
+*Generated: 2026-02-03*