OpenWaves

A Large-Scale Anatomically Realistic Ultrasound-CT Dataset
for Benchmarking Neural Wave Equation Solvers

Welcome to OpenWaves

OpenWaves is a large-scale wave equation dataset designed to bridge the gap between theoretical equations and practical imaging applications. The collection contains over 45 million frequency-domain wave simulations derived from anatomically realistic human phantoms encompassing three anatomical regions: breast, leg, and arm. OpenWaves establishes the first open-access repository for wave physics simulations in medical imaging applications and is developed by the PKU Computational Scientific Imaging Lab.

Breast Dataset

  • 7,520 phantoms with 15,400,960 wavefields
[Figure: breast sample]

Leg Dataset

  • 7,001 phantoms with 14,338,048 wavefields
[Figure: leg sample]

Arm Dataset

  • 7,526 phantoms with 15,413,248 wavefields
[Figure: arm sample]
[Figure: USCT framework]
[Figure: ultrasound computed tomography equipment]
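
The per-region counts above can be cross-checked with a little arithmetic. The sketch below is written in Python purely for illustration (the page itself ships no Python code). It divides each region's wavefield count by its phantom count and sums the totals. The constant ratio it finds is inferred from the published numbers, not stated by the authors.

```python
# Phantom and wavefield counts copied from the dataset summary above.
REGION_COUNTS = {
    "breast": (7_520, 15_400_960),
    "leg": (7_001, 14_338_048),
    "arm": (7_526, 15_413_248),
}

def wavefields_per_phantom(counts):
    """Return {region: wavefields // phantoms}, requiring an exact integer ratio."""
    ratios = {}
    for region, (phantoms, wavefields) in counts.items():
        ratio, remainder = divmod(wavefields, phantoms)
        if remainder:
            raise ValueError(f"{region}: wavefields is not a multiple of phantoms")
        ratios[region] = ratio
    return ratios

ratios = wavefields_per_phantom(REGION_COUNTS)
total_wavefields = sum(w for _, w in REGION_COUNTS.values())

print(ratios)            # every region gives 2048 wavefields per phantom
print(total_wavefields)  # 45152256, consistent with "over 45 million"
```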

Dataset Specifications

Breast Dataset Details

Data Source       Breast Type             Frequency (MHz)   Phantoms   Storage
VICTRE            Heterogeneous (HET)     0.25-0.60         880        773 GB
VICTRE            Fibroglandular (FIB)    0.25-0.60         880        773 GB
VICTRE            Fatty (FAT)             0.25-0.60         880        773 GB
VICTRE            Extremely Dense (EXD)   0.25-0.60         880        773 GB
Stable Diffusion  Heterogeneous (HET)     0.25-0.60         1,000      879 GB
Stable Diffusion  Fibroglandular (FIB)    0.25-0.60         1,000      879 GB
Stable Diffusion  Fatty (FAT)             0.25-0.60         1,000      879 GB
Stable Diffusion  Extremely Dense (EXD)   0.25-0.60         1,000      879 GB

Leg Dataset Details

Data Source          Frequency (MHz)   Phantoms   Storage
X-ray CT Conversion  0.25-0.60         1,001      880 GB
Stable Diffusion     0.25-0.60         6,000      5.15 TB

Arm Dataset Details

Data Source          Frequency (MHz)   Phantoms   Storage
X-ray CT Conversion  0.25-0.60         809        711 GB
Stable Diffusion     0.25-0.60         6,717      5.77 TB
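
The tables above can be reconciled with the per-region phantom totals and used to estimate storage per phantom, which helps when budgeting disk space for a partial download. A minimal Python sketch follows. It assumes 1 TB = 1024 GB when converting the two entries listed in TB, which is a guess about the page's unit convention rather than something the source states.

```python
# (phantoms, storage in GB) for each table row, copied from the tables above.
# Assumption: 1 TB = 1024 GB for the two entries listed in TB.
GB_PER_TB = 1024
TABLE_ROWS = {
    "breast": [(880, 773)] * 4 + [(1000, 879)] * 4,
    "leg": [(1001, 880), (6000, 5.15 * GB_PER_TB)],
    "arm": [(809, 711), (6717, 5.77 * GB_PER_TB)],
}

def summarize(rows):
    """Return (total phantoms, total storage in GB, average GB per phantom)."""
    phantoms = sum(p for p, _ in rows)
    storage_gb = sum(s for _, s in rows)
    return phantoms, storage_gb, storage_gb / phantoms

summary = {region: summarize(rows) for region, rows in TABLE_ROWS.items()}
for region, (phantoms, storage_gb, per_phantom) in summary.items():
    print(f"{region}: {phantoms} phantoms, {storage_gb:.0f} GB total, "
          f"{per_phantom:.3f} GB per phantom")
```

Under that unit assumption, the phantom totals match the 7,520 / 7,001 / 7,526 figures quoted for the three regions, and every region averages roughly 0.88 GB per phantom.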

Featured Challenge

AI4S Cup - Prediction of Wavefield in Ultrasonic Computed Tomography

Use deep neural networks to accelerate the solution of wave equations, reducing computation time while preserving accuracy, to enable efficient ultrasound CT imaging.


AI4S Cup - Reconstruction of Wave Velocity in Ultrasonic Computed Tomography

Train neural networks to perform the inversion directly, inferring the wave velocity distribution of the imaged object from wavefield observation data.


Recommended configuration

16-core CPU
MATLAB 2020b+
64GB RAM

Get Started in 3 Steps

1. Download source data from Hugging Face

2. Prepare speed data

>> run split_data.m

output:
📂 your_project_path/
└── 📂 organ_speed/
    ├── 📂 train/train_xx.mat
    └── 📂 test/test_xx.mat
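
Before moving on, it can help to confirm that split_data.m produced the layout shown above. The Python sketch below is one hypothetical way to do that. The helper name `check_speed_layout` is illustrative, the `organ_speed` folder name is taken literally from the listing, and the `train_*.mat` / `test_*.mat` patterns are an assumption about how the `xx` placeholders are filled in.

```python
from pathlib import Path

def check_speed_layout(project_path):
    """Count .mat files under organ_speed/train and organ_speed/test.

    Hypothetical helper: raises FileNotFoundError if a split folder is missing.
    """
    speed_dir = Path(project_path) / "organ_speed"
    counts = {}
    for split in ("train", "test"):
        split_dir = speed_dir / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing folder: {split_dir}")
        # Assumed naming: train_xx.mat / test_xx.mat, with xx replaced by an index.
        counts[split] = len(list(split_dir.glob(f"{split}_*.mat")))
    return counts
```

Calling `check_speed_layout("your_project_path")` returns a dictionary with the number of files found in each split, so an empty or missing folder is caught before launching the generator.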

3. Launch OpenWaves.exe

The OpenWaves.exe runtime interface is shown as an example. Configure its parameters to control the data generation process, and set the speed path to the output directory produced in the previous step. The system will generate the data and detailed log files in your specified output directory.

BibTeX Citation

@misc{zeng2025openwaves,
title={OpenWaves: A Large-Scale Anatomically Realistic Ultrasound-{CT} Dataset for Benchmarking Neural Wave Equation Solvers},
author={Zhijun Zeng and Youjia Zheng and Hao Hu and Zeyuan Dong and Yihang Zheng and Xinliang Liu and Jinzhuo Wang and Zuoqiang Shi and Linfeng Zhang and Yubing Li and He Sun},
year={2025},
url={https://openreview.net/forum?id=u14Y236LwX}
}