Mirror of https://github.com/ghndrx/kubeflow-pipelines.git (synced 2026-02-10 06:45:13 +00:00)
feat: Add 176K real DrugBank DDI samples with drug names

- Downloaded 191K DDI pairs from TDC DrugBank
- Fetched 1,634 drug names from PubChem API (96% hit rate)
- Created complete training dataset with:
  - Real drug names (not just IDs)
  - 86 interaction type descriptions
  - Severity labels (minor/moderate/major/contraindicated)
- Bundled 34MB data file in Docker image
- Handler loads real data instead of curated samples
@@ -8,11 +8,13 @@ COPY requirements.txt /app/requirements.txt
 # Install dependencies
 RUN pip install --no-cache-dir -r requirements.txt
 
-# Copy handler
+# Copy handler and data
 COPY handler.py /app/handler.py
+COPY data/ /app/data/
 
 # Set environment variables
 ENV PYTHONUNBUFFERED=1
 ENV HF_HOME=/tmp/huggingface
+ENV DDI_DATA_PATH=/app/data/drugbank_ddi_complete.jsonl
 
 CMD ["python", "-u", "handler.py"]
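
Note: handler.py is not part of this diff, so how it consumes DDI_DATA_PATH is not shown here. As a minimal sketch only, assuming the bundled file holds one JSON object per line (as the .jsonl extension suggests), a loader could look like the following. The function name load_ddi_records and the constant DEFAULT_DDI_DATA_PATH are hypothetical, and no record field names are assumed.

```python
import json
import os
from pathlib import Path

# Default mirrors the ENV line in the Dockerfile above.
# This constant and the function name are hypothetical, not taken from handler.py.
DEFAULT_DDI_DATA_PATH = "/app/data/drugbank_ddi_complete.jsonl"


def load_ddi_records(path=None):
    """Read one JSON object per non-empty line from the DDI data file."""
    data_path = Path(path or os.environ.get("DDI_DATA_PATH", DEFAULT_DDI_DATA_PATH))
    records = []
    with data_path.open(encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines in the file
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as exc:
                # Report the file and line so a bad record is easy to locate.
                raise ValueError(f"{data_path}:{line_no}: invalid JSON") from exc
    return records
```

Reading the path from the environment with a default keeps the handler runnable both inside the image, where ENV sets the variable, and in local tests that point DDI_DATA_PATH at a smaller file.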