VSF-Med

A Vulnerability Scoring Framework for Medical Vision-Language Models

Overview

VSF-Med is a comprehensive framework designed to systematically evaluate the safety, reliability, and adversarial robustness of Vision-Language Models (Vision LLMs) in clinical imaging applications.

Vulnerability Dimensions

Safety Dimensions

  • Prompt injection effectiveness
  • Jailbreak resilience
  • Potential confidentiality breach
  • Risk of misinformation

Reliability Dimensions

  • Denial of service resilience
  • Persistence of attack effects
  • Safety bypass success
  • Impact on medical decision support

We apply this framework to ten clinically motivated adversarial scenarios, ranging from contextual prompt injections to image perturbations, using the MIMIC-CXR dataset.

Repository Structure

VSF-Med/
├── src/                         # Source code
│   ├── config/                  # Configuration files
│   │   └── default_config.yaml  # Default configuration
│   ├── database/                # Database schemas
│   │   └── dbschema.sql         # PostgreSQL database schema
│   ├── models/                  # Model implementations
│   │   └── evaluation/          # Evaluation models
│   │       └── vulnerability_scoring.py  # VSF-Med scoring framework
│   └── utils/                   # Utility functions
│       ├── database/            # Database utilities
│       │   └── database_utils.py  # Database operations
│       ├── perturbations/       # Perturbation utilities
│       │   ├── image_perturbations.py  # Visual perturbation methods
│       │   └── text_perturbations.py   # Text attack methods
│       └── visualization/       # Visualization utilities
│           └── image_utils.py   # Image analysis tools
├── notebooks/                   # Main experiment notebooks
│   ├── 01_data_preparation_adversarial_samples.ipynb   # Data preparation
│   ├── 02_model_evaluation_chexagent_baseline.ipynb    # CheXagent baseline
│   ├── 03_model_evaluation_chexagent_perturbed.ipynb   # CheXagent perturbation tests
│   ├── 04_model_evaluation_gpt_baseline.ipynb          # GPT-4o baseline
│   ├── 05_vulnerability_scoring_framework.ipynb        # Applying VSF-Med
│   ├── 06_model_evaluation_claude.ipynb                # Claude model tests
│   ├── 07_benchmarking_models.ipynb                    # Model comparison
│   └── 08_analysis_radiologist_comparison.ipynb        # Radiologist comparison
├── templates/                   # Templates for experiments
│   ├── text_attack_templates.txt          # Text attack patterns
│   ├── visual_perturbation_methods.txt    # Visual attack methods
│   └── vsf_scoring_rubric.txt             # Scoring rubric

Evaluation Workflow

  1. Data Preparation: Prepare a diverse selection of 5,000 frontal chest X-ray studies from MIMIC-CXR, stratified by patient demographics and key pathologies.
  2. Adversarial Variant Generation:
    • Text attacks: 18 attack categories with 2-4 expert-curated prompt templates each
    • Visual attacks: 6 perturbation methods (Gaussian noise, checkerboard, random arrow overlay, Moiré pattern, steganographic hide, LSB extraction)
  3. Model Evaluation: Test multiple vision-language models on both standard and adversarial inputs:
    • CheXagent-8b: Specialized medical imaging model
    • GPT-4o: General-purpose multimodal model
    • Claude: General-purpose multimodal model
  4. Vulnerability Scoring: Apply the VSF-Med framework to score model outputs across vulnerability dimensions.
  5. Benchmarking: Compare performance across models to identify strengths and weaknesses.
  6. Clinical Comparison: Compare model outputs with radiologist interpretations to assess clinical impact.

Attack Categories

Text Attack Categories

Our framework formalizes 18 different attack categories including:

  • Prompt Injection
  • Jailbreak Attempts
  • Confidentiality Breach
  • Misinformation Generation
  • Denial-of-Service
  • Persistence Attacks
  • Safety Bypass
  • Semantic Shift
  • Omission Attacks
  • Over-Confidence Induction

Visual Perturbation Methods

We apply six visual perturbation techniques to test model robustness:

  • Gaussian noise at various levels
  • Checkerboard overlays (single or tiled)
  • Moiré patterns with adjustable frequency
  • Random arrow artifacts
  • Steganographic information hiding
  • LSB-plane extraction

Perturbation parameters are optimized via grid search to balance imperceptibility (SSIM ≥ 0.85) with attack potency.

Jupyter Notebooks

The experimental workflow is organized in sequential Jupyter notebooks:

Data Preparation & Sample Generation

  • 01_data_preparation_adversarial_samples.ipynb: Prepares datasets and generates adversarial samples using 18 attack categories and 6 perturbation methods.

Model Evaluation Notebooks

  • 02_model_evaluation_chexagent_baseline.ipynb: Evaluates the baseline performance of StanfordAIMI's CheXagent-8b model on unperturbed images.
  • 03_model_evaluation_chexagent_perturbed.ipynb: Tests CheXagent against visually perturbed images to assess robustness.
  • 04_model_evaluation_gpt_baseline.ipynb: Evaluates the baseline performance of GPT-4o Vision on standard inputs.
  • 06_model_evaluation_claude.ipynb: Tests Anthropic's Claude model on both standard and adversarial inputs.

Analysis Notebooks

  • 05_vulnerability_scoring_framework.ipynb: Applies the VSF-Med framework to score model responses across 8 vulnerability dimensions.
  • 07_benchmarking_models.ipynb: Performs comprehensive cross-model comparison and benchmark analysis.
  • 08_analysis_radiologist_comparison.ipynb: Compares model performance with radiologist ground truth to assess clinical impact.

Installation

# Clone the repository
git clone https://github.com/UNHSAILLab/VSF-Med.git
cd VSF-Med

# Create and activate virtual environment (optional but recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Requirements

  • Python 3.8+
  • API keys:
    • OpenAI API key (for GPT-4o access)
    • Anthropic API key (for Claude access)
  • MIMIC-CXR dataset access
  • PostgreSQL database
  • Required Python libraries:
    • pandas, numpy
    • sqlalchemy, psycopg2-binary
    • openai, anthropic
    • PIL, cv2, matplotlib, scikit-image
    • seaborn, plotly, nltk

Usage

Configuration

  1. Copy and customize the default configuration:
    cp src/config/default_config.yaml src/config/my_config.yaml
  2. Edit my_config.yaml to set database credentials, API keys, and data paths.

Running Experiments

The experimental workflow is organized in sequential notebooks:

# 1. Data Preparation and Adversarial Sample Generation
jupyter notebook notebooks/01_data_preparation_adversarial_samples.ipynb

# 2. Model Baseline Evaluations
jupyter notebook notebooks/02_model_evaluation_chexagent_baseline.ipynb
jupyter notebook notebooks/04_model_evaluation_gpt_baseline.ipynb
jupyter notebook notebooks/06_model_evaluation_claude.ipynb

# 3. Adversarial Testing
jupyter notebook notebooks/03_model_evaluation_chexagent_perturbed.ipynb

# 4. Vulnerability Scoring and Analysis
jupyter notebook notebooks/05_vulnerability_scoring_framework.ipynb
jupyter notebook notebooks/07_benchmarking_models.ipynb
jupyter notebook notebooks/08_analysis_radiologist_comparison.ipynb

Citation

@article{vsf-med2024,
  title={VSF-Med: A Vulnerability Scoring Framework for Medical Vision-Language Models},
  author={[Author names]},
  journal={[Journal]},
  year={2024}
}