Summit Health Data

ML Model Training Management

Training Pipeline

1

Data Extraction

Initializing...

2

Data Selection

Select Batches

3

Training Config

Configure Parameters

4

Teraq Sync

Synchronize Data

5

Training

Model Training

6

ML Model Performance Evaluation

MedHELM LLM-Jury Evaluation

7

Deployment

Model Ready

Available Batches

-

Selected Batches

0

Total Documents

-

Active Training Jobs

-
Data Selection
Training Configuration
Training Process Control
Stored Models
Training Jobs
Model Evaluation
Fact Extraction
Model Results
Billing & API
Quantum ML Training

Select Datasets for Training

Select datasets and batches from batch review to use for model training. Data will be synchronized with Teraq.

Public Medical Datasets

MIMIC-III

Medical Information Mart for Intensive Care III

Click to select

MIMIC-IV

Medical Information Mart for Intensive Care IV

Click to select

Your Batch Review Data

Loading available batches...

Training Configuration

Enter a name for your trained model
Select base model for training
Classical training uses LoRA/QLoRA for efficiency
Select the compute instance for training
Model size reduction (2x-5x)

No batches selected. Go to Data Selection tab to select batches.

TensorBoard will show real-time training metrics, loss curves, and more. Available for all instance types.

Training Process Control

TinyLlama-1.1B Medical Pre-training with Nemotron 49B medical data extraction

Phase 1 - Causal Language Modeling (CLM)

Checking status...
Loading training process information...
Current Step
--
of 5,000
Progress
--
Completion
Training Loss
--
Latest Value
Time per Step
--
Average
Estimated Time
--
Remaining
Memory Usage
--
RAM Used
CPU Usage
--
Utilization
Runtime
--
Since Start
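The "Estimated Time" figure above can be derived from the step counters: steps remaining multiplied by the average time per step. A minimal sketch (the step and timing values are illustrative, not from the live dashboard):

```python
def estimate_remaining_seconds(current_step, total_steps, avg_seconds_per_step):
    """Remaining wall-clock time = steps left x average time per step."""
    steps_left = max(total_steps - current_step, 0)
    return steps_left * avg_seconds_per_step

# Example: step 1,200 of 5,000 at 2.5 s/step -> 9,500 s remaining (~2.6 h)
remaining = estimate_remaining_seconds(1200, 5000, 2.5)
```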

Instance & Resource Monitoring

Active Training Processes

Loading active processes...

System Resources - Summit Backend

CPU Usage
--
System-wide
Memory Usage
--
RAM
Disk Usage
--
Storage
Running Processes
--
Training Jobs
📊 Overall Training Progress 0%
Datasets: MIMIC-III + MIMIC-IV (Combined)

Model Configuration

Model: TinyLlama-1.1B-Chat-v1.0
Parameters: 1.10B
Batch Size: 32
Sequence Length: 256 tokens
Dataset: MIMIC-III (59,569 docs)

Training Details

Start Time: --
Last Update: --
Checkpoints: 0
Output Directory: /home/ec2-user/Training_Data/models/tinyllama-1b-medical-phase1
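Given the configuration above (batch size 32, sequence length 256, 59,569 MIMIC-III documents), the per-epoch step count and per-step token budget follow directly. A quick check, assuming one document per training sample and no dropped last batch:

```python
import math

docs = 59_569        # MIMIC-III document count from the configuration panel
batch_size = 32
seq_len = 256

steps_per_epoch = math.ceil(docs / batch_size)  # 1,862 optimizer steps
tokens_per_step = batch_size * seq_len          # 8,192 tokens per batch
```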

Teraq.ai Synchronization

Status: Not synced
Active Jobs: 0
Last Sync: --
Model Reduction: --

📺 Training Terminal Output

--
Loading terminal output...
Log file: -- | Lines: -- | Size: --

📊 TensorBoard Visualization

Real-time training metrics, loss curves, and performance visualization

Checking TensorBoard status...
Loading TensorBoard information...

Stored Training Models

Overview of all trained models with metadata from database.

Loading stored models...

Active Training Jobs

Monitor training jobs synchronized with Teraq platform.

Job ID | Model Name | Status | Progress | Started | Duration | Actions

Loading training jobs...

ML Model Performance Evaluation

Evaluate trained models using MedHELM LLM-jury evaluation protocol. Based on MedHELM: Holistic Evaluation of Large Language Models for Medical Tasks (Bedi et al., 2025).

Select Model for Evaluation

Evaluation Metrics (MedHELM LLM-Jury)

Based on MedHELM (Bedi et al., 2025) - arXiv:2505.23802. Uses LLM-jury evaluation with three criteria. Final score is the mean of all three metrics (equal weighting).

Accuracy (33.3%)

Factual correctness and adherence to medical guidelines

Completeness (33.3%)

Thoroughness in addressing all aspects of the query

Clarity (33.3%)

Organization, readability, and easy-to-understand language
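The final MedHELM score described above is the unweighted mean of the three jury criteria. A minimal sketch (the scores and 1-5 scale are illustrative):

```python
def medhelm_jury_score(accuracy, completeness, clarity):
    """Equal-weight mean of the three LLM-jury criteria."""
    return (accuracy + completeness + clarity) / 3

# e.g. jury ratings on a 1-5 scale
score = medhelm_jury_score(4.0, 3.5, 4.5)  # -> 4.0
```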

📚 Available Human-Annotated Benchmarks

Standard medical Q&A benchmarks available for evaluation. These datasets are human-annotated and clinically validated.

MedQA

Available

USMLE medical licensing exam questions

Questions: ~20,000
Format: Multiple-choice
Path: /Training_Data/datasets/medical_benchmarks/medqa/

MedMCQA

Available

Large-scale multi-subject medical Q&A

Questions: 193,155
Format: Multiple-choice
Path: /Training_Data/datasets/medical_benchmarks/medmcqa/

PubMedQA

Available

Biomedical research question answering

Questions: ~214,269
Format: Yes/No/Maybe
Path: /Training_Data/datasets/medical_benchmarks/pubmedqa/

âš ī¸ Note: These benchmarks use different formats (multiple-choice/Yes-No) than your MIMIC-III dataset (open-ended Q&A). Use MedHELM LLM-jury for evaluation - it works with any format and doesn't require ground truth!

Run Evaluation

⚡ Inference Configuration

Trainium Instance: ec2-54-159-165-80.compute-1.amazonaws.com:8500
Inference will use Trainium API endpoint for fast model evaluation.

Enhanced Fact Extraction

Extract medical facts from clinical notes using enhanced rule-based extraction with optional AWS Comprehend Medical and John Snow Labs integration. Supports 500+ medications, enhanced negation detection, and multi-source extraction.
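The negation detection described above can be approximated with pattern matching over the text preceding a candidate term. A simplified sketch (the cue list and matching window are illustrative, not the production rule set of 20+ patterns):

```python
import re

# A small illustrative subset of negation cues the extractor might use
NEGATION_CUES = [r"\bno\b", r"\bdenies\b", r"\bwithout\b", r"\bnot on\b"]

def is_negated(sentence: str, term: str) -> bool:
    """True if a negation cue precedes the term within the sentence."""
    match = re.search(re.escape(term), sentence, re.IGNORECASE)
    if not match:
        return False
    prefix = sentence[:match.start()]
    return any(re.search(cue, prefix, re.IGNORECASE) for cue in NEGATION_CUES)

print(is_negated("Patient denies chest pain.", "chest pain"))   # True
print(is_negated("Patient reports chest pain.", "chest pain"))  # False
```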

Extraction Method

Clinical Note Input

📊 Create FactEHR Dataset from PhysioNet

Generate FactEHR-style datasets from PhysioNet clinical notes. Select a dataset source and extraction method.

✨ Enhanced Features

📚 500+ Medications

Comprehensive drug dictionary covering all major medication categories

🚫 Enhanced Negation

20+ negation patterns to filter out negated conditions and medications

â˜ī¸ AWS Integration

Optional AWS Comprehend Medical for cloud-based high-quality extraction

🔬 John Snow Labs

Optional state-of-the-art medical NLP with 95% precision

Model Training Results

View metrics and performance of trained models.

Select a completed training job to view results.

💰 Billing & API Documentation

Complete API documentation for external parties to integrate with Summit Health ML Training API, including billing and cost allocation.

📚 Quick Links

📖 Full API Documentation

Complete HTML documentation with all endpoints, examples, and billing information

🔑 Authentication

API Key or OAuth 2.0 required for all requests

💰 Billing

Automatic cost allocation to user_id and billing_account

🚀 Quick Start

Base URL: https://your-backend-server.com

API Version: v1

Content-Type: application/json

📋 Key Endpoints

POST /api/training/start

Start a new training job with billing allocation

Parameters: instance_type, datasets, user_id, billing_account

GET /api/training/process-status

Check training job status and progress

Parameters: job_id

GET /api/training/cost-tracking/user-costs

Get cost breakdown for user or billing account

Parameters: user_id, billing_account, start_date, end_date

💰 Pricing Structure

Resource Type | Pricing | Description
Classical CPU | $5.00/hour | Standard CPU training instances
48 vCPU | $7.50/hour | High-performance 48-core instances (3x faster)
Trainium | $15.00/hour | AWS Trainium instances for accelerated training
Base Cost | $10.00/job | One-time setup cost per training job
Storage | $0.10/GB/month | Model storage cost
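A job's total charge under the rates above is the one-time base cost plus hourly compute plus monthly storage. A sketch of the arithmetic (the job parameters and the `48vcpu` rate key are hypothetical, chosen to mirror the table):

```python
RATES = {"classical_cpu": 5.00, "48vcpu": 7.50, "trainium": 15.00}  # $/hour
BASE_COST = 10.00            # one-time setup cost per job
STORAGE_PER_GB_MONTH = 0.10  # $/GB/month

def estimate_job_cost(instance_type, hours, model_gb=0.0, months=0.0):
    """Base cost + hourly compute + model storage."""
    compute = RATES[instance_type] * hours
    storage = STORAGE_PER_GB_MONTH * model_gb * months
    return BASE_COST + compute + storage

# 4 hours on 48 vCPU, storing a 2 GB model for 1 month:
# 10.00 + 4 * 7.50 + 0.10 * 2 * 1 = 40.20
cost = estimate_job_cost("48vcpu", 4, model_gb=2, months=1)
```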

💻 Code Example

import requests

API_BASE_URL = "https://your-backend-server.com"
API_KEY = "YOUR_API_KEY"

# Start a training job; compute options go in the query string,
# billing identity in the JSON request body.
response = requests.post(
    f"{API_BASE_URL}/api/training/start",
    params={
        "instance_type": "48vcpu",
        "datasets": "MIMICIII,MIMIC4"
    },
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    },
    json={
        "user_id": "external_user_123",
        "billing_account": "account_abc"
    }
)
response.raise_for_status()  # surface auth or validation errors early

result = response.json()
job_id = result["job_id"]
print(f"Training started: {job_id}")
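The same pattern extends to the other documented endpoints. A sketch of status polling and cost lookup using the parameters listed under Key Endpoints (the response fields depend on the live API, so the parsed JSON is returned as-is):

```python
import requests

API_BASE_URL = "https://your-backend-server.com"
API_KEY = "YOUR_API_KEY"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

def get_job_status(job_id):
    """Poll /api/training/process-status for one job."""
    r = requests.get(
        f"{API_BASE_URL}/api/training/process-status",
        params={"job_id": job_id},
        headers=HEADERS,
    )
    r.raise_for_status()
    return r.json()

def get_user_costs(user_id, billing_account, start_date, end_date):
    """Cost breakdown from /api/training/cost-tracking/user-costs."""
    r = requests.get(
        f"{API_BASE_URL}/api/training/cost-tracking/user-costs",
        params={
            "user_id": user_id,
            "billing_account": billing_account,
            "start_date": start_date,
            "end_date": end_date,
        },
        headers=HEADERS,
    )
    r.raise_for_status()
    return r.json()
```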

💡 Note: All costs are automatically tracked and allocated to the provided user_id and billing_account. You can query costs at any time using the billing endpoints.

📞 Support

For API access, billing questions, or technical support: