ML Model Training Management
Training Pipeline
Data Extraction
Initializing...
Data Selection
Select Batches
Training Config
Configure Parameters
Teraq Sync
Synchronize Data
Training
Model Training
ML Model Performance Evaluation
MedHELM LLM-Jury Evaluation
Deployment
Model Ready
Available Batches
Selected Batches
Total Documents
Active Training Jobs
Select Datasets for Training
Select datasets and batches from batch review to use for model training. Data will be synchronized with Teraq.
Public Medical Datasets
MIMIC-III
Medical Information Mart for Intensive Care III
MIMIC-IV
Medical Information Mart for Intensive Care IV
Your Batch Review Data
Loading available batches...
Training Configuration
No batches selected. Go to Data Selection tab to select batches.
Training Process Control
TinyLlama-1.1B Medical Pre-training with Nemotron 49B medical data extraction
Phase 1 - Causal Language Modeling (CLM)
Instance & Resource Monitoring
Active Training Processes
Loading active processes...
System Resources - Summit Backend
Model Configuration
Training Details
Teraq.ai Synchronization
Training Terminal Output
TensorBoard Visualization
Real-time training metrics, loss curves, and performance visualization
Stored Training Models
Overview of all trained models, with metadata from the database.
Loading stored models...
Active Training Jobs
Monitor training jobs synchronized with Teraq platform.
| Job ID | Model Name | Status | Progress | Started | Duration | Actions |
|---|---|---|---|---|---|---|
| Loading training jobs... | | | | | | |
ML Model Performance Evaluation
Evaluate trained models using MedHELM LLM-jury evaluation protocol. Based on MedHELM: Holistic Evaluation of Large Language Models for Medical Tasks (Bedi et al., 2025).
Select Model for Evaluation
Evaluation Metrics (MedHELM LLM-Jury)
Based on MedHELM (Bedi et al., 2025) - arXiv:2505.23802. Uses LLM-jury evaluation with three criteria. Final score is the mean of all three metrics (equal weighting).
Accuracy (33.3%)
Factual correctness and adherence to medical guidelines
Completeness (33.3%)
Thoroughness in addressing all aspects of the query
Clarity (33.3%)
Organization, readability, and easy-to-understand language
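Since the final MedHELM score is simply the equal-weight mean of the three jury criteria above, it can be sketched in a few lines (the helper name is illustrative, not part of the MedHELM protocol):

```python
# Final MedHELM LLM-jury score: mean of the three criteria (equal weighting),
# per the evaluation description above. Function name is a hypothetical helper.
def jury_score(accuracy: float, completeness: float, clarity: float) -> float:
    return (accuracy + completeness + clarity) / 3

print(round(jury_score(0.9, 0.8, 0.7), 2))  # -> 0.8
```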
Available Human-Annotated Benchmarks
Standard medical Q&A benchmarks available for evaluation. These datasets are human-annotated and clinically validated.
MedQA
Available: USMLE medical licensing exam questions
MedMCQA
Available: Large-scale multi-subject medical Q&A
PubMedQA
Available: Biomedical research question answering
Note: These benchmarks use different formats (multiple-choice/Yes-No) than your MIMIC-III dataset (open-ended Q&A). Use MedHELM LLM-jury for evaluation - it works with any format and doesn't require ground truth!
Run Evaluation
⥠Inference Configuration
Trainium Instance: ec2-54-159-165-80.compute-1.amazonaws.com:8500
Inference will use Trainium API endpoint for fast model evaluation.
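As a sketch only: the inference route and payload schema on the Trainium endpoint are not documented here, so the "/generate" path and "prompt" field below are placeholder assumptions for illustration, built with the standard library without sending the request.

```python
import json
import urllib.request

# Host and port come from the Trainium Instance line above; the route and
# payload schema are NOT documented here and are assumptions for illustration.
TRAINIUM_ENDPOINT = "http://ec2-54-159-165-80.compute-1.amazonaws.com:8500"

def build_inference_request(prompt: str) -> urllib.request.Request:
    payload = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{TRAINIUM_ENDPOINT}/generate",  # hypothetical route
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_inference_request("Summarize the discharge note.")
print(req.full_url)
# Send with urllib.request.urlopen(req) against the live instance.
```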
Enhanced Fact Extraction
Extract medical facts from clinical notes using enhanced rule-based extraction with optional AWS Comprehend Medical and John Snow Labs integration. Supports 500+ medications, enhanced negation detection, and multi-source extraction.
Extraction Method
Clinical Note Input
Create FactEHR Dataset from PhysioNet
Generate FactEHR-style datasets from PhysioNet clinical notes. Select a dataset source and extraction method.
Enhanced Features
500+ Medications
Comprehensive drug dictionary covering all major medication categories
Enhanced Negation
20+ negation patterns to filter out negated conditions and medications
AWS Integration
Optional AWS Comprehend Medical for cloud-based high-quality extraction
John Snow Labs
Optional state-of-the-art medical NLP with 95% precision
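To make the negation feature concrete, here is a minimal, illustrative sketch of pattern-based negation filtering. The production extractor's 20+ patterns and scoping rules are not shown in this document, so the cue list and scope logic below are simplified assumptions:

```python
import re

# Simplified negation cues; the real extractor uses 20+ patterns with more
# careful scoping. This is an illustration, not the production rule set.
NEGATION_PATTERNS = [
    r"\bno\b", r"\bdenies\b", r"\bwithout\b",
    r"\bnegative for\b", r"\bnot on\b", r"\bdiscontinued\b",
]

def is_negated(sentence: str, term: str) -> bool:
    """Return True if `term` appears in a negated context in `sentence`."""
    lowered = sentence.lower()
    if term.lower() not in lowered:
        return False
    # Naive scope: any negation cue anywhere before the term in the sentence.
    prefix = lowered[: lowered.index(term.lower())]
    return any(re.search(pattern, prefix) for pattern in NEGATION_PATTERNS)

print(is_negated("Patient denies chest pain.", "chest pain"))   # True
print(is_negated("Patient reports chest pain.", "chest pain"))  # False
```

A mention such as "denies chest pain" would thus be filtered out of the extracted facts rather than reported as a positive finding.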
Model Training Results
View metrics and performance of trained models.
Select a completed training job to view results.
Billing & API Documentation
Complete API documentation for external parties to integrate with Summit Health ML Training API, including billing and cost allocation.
Quick Links
Complete HTML documentation with all endpoints, examples, and billing information
API Key or OAuth 2.0 required for all requests
Automatic cost allocation to user_id and billing_account
Quick Start
Base URL: https://your-backend-server.com
API Version: v1
Content-Type: application/json
Key Endpoints
/api/training/start
Start a new training job with billing allocation
Parameters: instance_type, datasets, user_id, billing_account
/api/training/process-status
Check training job status and progress
Parameters: job_id
/api/training/cost-tracking/user-costs
Get cost breakdown for user or billing account
Parameters: user_id, billing_account, start_date, end_date
Pricing Structure
Code Example
import requests

API_BASE_URL = "https://your-backend-server.com"
API_KEY = "YOUR_API_KEY"

# Start training with billing allocation
response = requests.post(
    f"{API_BASE_URL}/api/training/start",
    params={
        "instance_type": "48vcpu",
        "datasets": "MIMICIII,MIMIC4",
    },
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json={
        "user_id": "external_user_123",
        "billing_account": "account_abc",
    },
)

result = response.json()
job_id = result["job_id"]
print(f"Training started: {job_id}")
Note: All costs are automatically tracked and allocated to the provided user_id and billing_account.
You can query costs at any time using the billing endpoints.
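Following the same pattern, a cost query against the /api/training/cost-tracking/user-costs endpoint can be sketched with the standard library. Only the parameter names come from the endpoint list above; the date values, credentials, and response shape are illustrative:

```python
import urllib.parse
import urllib.request

API_BASE_URL = "https://your-backend-server.com"
API_KEY = "YOUR_API_KEY"

# Build (but do not send) a cost-tracking query. Parameter names follow the
# endpoint documentation above; the concrete values are placeholders.
def build_cost_request(user_id: str, billing_account: str,
                       start_date: str, end_date: str) -> urllib.request.Request:
    query = urllib.parse.urlencode({
        "user_id": user_id,
        "billing_account": billing_account,
        "start_date": start_date,
        "end_date": end_date,
    })
    url = f"{API_BASE_URL}/api/training/cost-tracking/user-costs?{query}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {API_KEY}"})

req = build_cost_request("external_user_123", "account_abc",
                         "2025-01-01", "2025-01-31")
print(req.full_url)
# Send with urllib.request.urlopen(req) once you have valid credentials.
```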
Support
For API access, billing questions, or technical support:
- Email: api-support@summithealth.ai
- Documentation: View Full Documentation
Quantum ML Training
Advanced quantum-enhanced machine learning solutions for clinical data analysis and survival prediction.