Computational Chemistry · Molecular Dynamics

GPU Molecular Dynamics — Local Pipeline Demo

A full OpenMM pipeline stood up on a single workstation GPU: install → simulate → analyze → visualize. Explicit- and implicit-solvent runs of small systems, with GPU-vs-CPU timing. Written to show collaborators which simulations we ran and how they performed.

2026-07-01 · GPU: NVIDIA GT 1030, 2 GB · CUDA 12.6 · OpenMM 8.4 · AMBER ff14SB, TIP3P · Windows 11

01 What we simulated

Three representative systems spanning explicit solvent (full water + PME) and implicit solvent (Generalized Born). All run on the CUDA platform with a Langevin Middle integrator, 2 fs timestep, 300 K.

System                             Solvent          Atoms  Force field      Length  Wall time  Throughput
TIP3P water box (2 nm cube)        explicit · PME     774  AMBER14 / TIP3P   10 ps      1.5 s  633 ns/day
Trp-cage (TC5b; PDB 1L2Y, 20 res)  implicit · GBn2    304  ff14SB + GBn2    500 ps     44.7 s  967 ns/day
Alanine dipeptide (Ace-Ala-Nme)    implicit · OBC      22  AMBER / OBC        2 ns     91.8 s  1882 ns/day

Read this as a proof of concept. The GT 1030 is an entry-level display card (2 GB, 384 CUDA cores). The point is that the whole pipeline runs end-to-end on commodity hardware; the same scripts run unchanged on a production GPU (A100 / RTX 4090) or an HPC cluster, where we expect throughput to be roughly 10–50× higher.
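The throughput column follows from run length and wall time. The sketch below is a minimal illustration of that conversion, not the script used for the table; the function name `ns_per_day` is our own. For the two longer runs it agrees with the tabulated values to within rounding of the reported wall times.

```python
SECONDS_PER_DAY = 86_400


def ns_per_day(simulated_ns: float, wall_seconds: float) -> float:
    """Throughput in ns/day from simulated time (ns) and wall-clock time (s)."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return simulated_ns / wall_seconds * SECONDS_PER_DAY


# Run lengths (converted to ns) and wall times copied from the table.
runs = {
    "Trp-cage, 500 ps": (0.5, 44.7),
    "Alanine dipeptide, 2 ns": (2.0, 91.8),
}

for name, (length_ns, wall_s) in runs.items():
    print(f"{name}: {ns_per_day(length_ns, wall_s):.0f} ns/day")
```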

02 Live trajectory

Trp-cage over 500 ps of implicit-solvent dynamics (100 frames, backbone aligned). Cartoon representation, coloured from N- to C-terminus. The mini-protein stays folded: the motion is thermal breathing, not unfolding.

Trp-cage (1L2Y) · implicit solvent · 500 ps @ 300 K · 100 frames · CUDA

03 Analysis

Every run is fully reproducible — trajectories analyzed with MDTraj, figures generated from the raw output.

Figure: Alanine dipeptide Ramachandran plot
Ramachandran map for alanine dipeptide, 2000 frames. Density concentrates in the β/C5 and αR basins; the left-handed αL region is essentially empty. This is textbook backbone conformational sampling.
Figure: Trp-cage RMSD, radius of gyration and energy
Trp-cage stability: mean backbone RMSD 2.56 Å, radius of gyration 7.59 ± 0.15 Å (stays compact), and a near-Gaussian potential-energy distribution. The protein is well equilibrated and folded throughout.
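As a rough illustration of how these quantities can be obtained with MDTraj, the sketch below computes backbone RMSD, radius of gyration, and φ/ψ angles. It is not the analysis script behind the figures. The file paths are placeholders, and using frame 0 as the RMSD reference is an assumption, since the reference structure is not stated here.

```python
import math

NM_TO_ANGSTROM = 10.0  # MDTraj reports lengths in nanometres


def analyze(traj_path: str, top_path: str) -> dict:
    """Backbone RMSD, radius of gyration and phi/psi angles for one trajectory."""
    # Imported lazily so this module loads even where MDTraj is not installed.
    import mdtraj as md

    traj = md.load(traj_path, top=top_path)
    backbone = traj.topology.select("backbone")

    # Align all frames on the backbone of frame 0 (useful for visualization).
    traj.superpose(traj, frame=0, atom_indices=backbone)

    # md.rmsd does its own optimal superposition; reference = frame 0 (assumed).
    rmsd_A = md.rmsd(traj, traj, frame=0, atom_indices=backbone) * NM_TO_ANGSTROM
    rg_A = md.compute_rg(traj) * NM_TO_ANGSTROM

    # compute_phi / compute_psi return (atom_indices, angles_in_radians).
    _, phi = md.compute_phi(traj)
    _, psi = md.compute_psi(traj)

    return {
        "rmsd_mean_A": float(rmsd_A.mean()),
        "rg_mean_A": float(rg_A.mean()),
        "rg_std_A": float(rg_A.std()),
        "phi_deg": phi * 180.0 / math.pi,  # shape (n_frames, n_phi)
        "psi_deg": psi * 180.0 / math.pi,  # shape (n_frames, n_psi)
    }
```

A call such as `analyze("trpcage.dcd", "trpcage.pdb")` (hypothetical file names) would return the summary statistics, and the `phi_deg` / `psi_deg` arrays can be passed to a 2D histogram to draw a Ramachandran map.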

04 Performance — GPU vs CPU

Same system (2661-atom TIP3P water box, PME, 3000 production steps), three OpenMM compute platforms. CUDA > OpenCL > CPU, as expected.

Platform  Throughput     Speed-up vs CPU
CUDA      218.7 ns/day   2.3×
OpenCL    195.5 ns/day   2.0×
CPU        95.5 ns/day   1.0×
On this entry-level GPU the CUDA speed-up over multi-core CPU is a modest 2.3×. On a production GPU the same benchmark typically separates by one to two orders of magnitude, and the GPU advantage widens as system size grows.
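A platform comparison of this kind can be sketched as follows. This is a minimal illustration under assumptions, not the benchmark script itself: the function names, the 100-step warm-up, and the minimization before timing are our own choices. The caller is assumed to supply an already-built `system`, `topology`, and `positions` for the water box.

```python
import time


def speedups(ns_per_day: dict, baseline: str = "CPU") -> dict:
    """Throughput of each platform relative to the baseline platform."""
    base = ns_per_day[baseline]
    return {name: value / base for name, value in ns_per_day.items()}


def time_platform(system, topology, positions, platform_name: str,
                  n_steps: int = 3000) -> float:
    """Return ns/day for n_steps of 2 fs dynamics on one OpenMM platform."""
    # Imported lazily so speedups() can be used without OpenMM installed.
    from openmm import LangevinMiddleIntegrator, Platform
    from openmm.app import Simulation
    from openmm.unit import kelvin, picosecond, picoseconds

    # A fresh integrator is needed for every Simulation.
    integrator = LangevinMiddleIntegrator(300 * kelvin, 1 / picosecond,
                                          0.002 * picoseconds)
    platform = Platform.getPlatformByName(platform_name)  # "CUDA", "OpenCL", "CPU"
    sim = Simulation(topology, system, integrator, platform)
    sim.context.setPositions(positions)
    sim.minimizeEnergy()
    sim.step(100)  # warm-up so one-time setup cost is not timed (assumed choice)

    start = time.perf_counter()
    sim.step(n_steps)
    # Retrieving a state forces pending GPU work to finish before the clock stops.
    sim.context.getState(getEnergy=True)
    elapsed = time.perf_counter() - start

    simulated_ns = n_steps * 0.002 / 1000.0  # 2 fs per step, converted to ns
    return simulated_ns / elapsed * 86_400


# Relative speed-ups from the throughputs reported above.
reported = {"CUDA": 218.7, "OpenCL": 195.5, "CPU": 95.5}
for name, factor in speedups(reported).items():
    print(f"{name}: {factor:.1f}x")
```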

05 Methods & honest notes

  • Engine: OpenMM 8.4, CUDA platform, mixed precision.
  • Integrator: Langevin Middle, 2 fs, 300 K, 1 ps⁻¹ friction.
  • Explicit: TIP3P water, PME electrostatics, 0.9 nm cutoff, H-bond constraints.
  • Implicit: Generalized Born (GBn2 / OBC), no periodic box — fast conformational sampling.
  • Prep: PDBFixer (missing atoms, protonation at pH 7); analysis with MDTraj.
What implicit solvent buys and costs. Replacing thousands of explicit waters with a GB continuum makes small-protein sampling fast and cheap — but it drops explicit water structure (bridging waters, discrete H-bonds, viscosity). We use it for exploration; explicit solvent remains the reference for quantitative work.
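The implicit-solvent settings in the list above map onto OpenMM roughly as in the sketch below. Treat it as an illustration under assumptions, not the project's run script. The function names are hypothetical, and the input PDB is assumed to be already prepared (hydrogens added, for example with PDBFixer). H-bond constraints are applied here as well, on the assumption that the 2 fs timestep relies on them in implicit runs too.

```python
TIMESTEP_FS = 2.0  # from the Methods list


def steps_for(length_ps: float, timestep_fs: float = TIMESTEP_FS) -> int:
    """Number of integration steps needed to cover length_ps."""
    return round(length_ps * 1000.0 / timestep_fs)


def build_implicit_simulation(pdb_path: str):
    """GBn2 implicit-solvent Simulation on CUDA in mixed precision."""
    # Imported lazily so steps_for() can be used without OpenMM installed.
    from openmm import LangevinMiddleIntegrator, Platform
    from openmm.app import PDBFile, ForceField, Simulation, NoCutoff, HBonds
    from openmm.unit import kelvin, picosecond, femtoseconds

    pdb = PDBFile(pdb_path)
    # amber14-all.xml provides ff14SB for proteins; gbn2.xml adds the GB model.
    forcefield = ForceField("amber14-all.xml", "implicit/gbn2.xml")
    system = forcefield.createSystem(
        pdb.topology,
        nonbondedMethod=NoCutoff,  # no periodic box in implicit solvent
        constraints=HBonds,        # assumption: constraints also used here
    )
    integrator = LangevinMiddleIntegrator(
        300 * kelvin, 1 / picosecond, TIMESTEP_FS * femtoseconds
    )
    platform = Platform.getPlatformByName("CUDA")
    sim = Simulation(pdb.topology, system, integrator, platform,
                     {"Precision": "mixed"})
    sim.context.setPositions(pdb.positions)
    return sim


def run(sim, length_ps: float, n_frames: int, dcd_path: str) -> None:
    """Minimize, then run dynamics while saving n_frames evenly spaced frames."""
    from openmm.app import DCDReporter

    n_steps = steps_for(length_ps)
    sim.reporters.append(DCDReporter(dcd_path, n_steps // n_frames))
    sim.minimizeEnergy()
    sim.step(n_steps)
```

For a 500 ps run saved as 100 frames, `steps_for(500.0)` gives 250,000 steps and a reporting interval of 2,500 steps.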