Scalable and Reproducible Virtual Screening through an API-Integrated Workflow

Tiffany Huff, Austin Darrow, Joshua Medina, Erik Ferlanti, James Carson, John Fonner, Sal Tijerina, Stanley J. Watowich, William J. Allen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Virtual screening is a key step of the drug discovery process that uses computational resources to simulate the behavior of small molecules in the binding site of a target protein [13]. Researchers often test millions of molecules when searching for an early hit compound, which requires significant CPU hours. An accessible, convenient, fast, and computationally efficient means of virtual screening is therefore desirable so that researchers can conserve resources in the early phase of drug discovery. We developed an application programming interface (API) integrated workflow that allows researchers to submit virtual screening batch jobs to the Lonestar6 supercomputer through a web portal. The containerized [7] workflow employs parallelized Python scripting with mpi4py [2] to efficiently distribute molecular docking tasks performed by AutoDock Vina [3]. The Texas Advanced Computing Center (TACC) API (TAPIS) framework [16], a REST API framework for research computing, was used to integrate the workflow into the University of Texas System Research Cyberinfrastructure (UTRC) web portal [12]. Five large libraries representing commercially available small molecules or fragments were prepared and are available for screening. Here, we discuss our experience developing this service and report the results of extensive internal benchmarks that determined the most efficient parallelization scheme for each molecule library when submitting batch jobs. Regardless of the chosen ligand library, the core, node, and parallel task specifications allow the user to run a virtual drug screening and receive the resulting top docking scores in 24 hours. The service is available to registered academic users, and more information can be found at the Drug Discovery at TACC website.
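
The distribution of docking tasks described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the round-robin split, the function names (partition_round_robin, vina_command, top_scores), the file names, and the box.txt config file are all assumptions made for illustration. The sketch only prints the AutoDock Vina commands each rank would run (a dry run) and falls back to a single rank when mpi4py is not installed.

```python
from typing import Dict, List, Sequence, Tuple


def partition_round_robin(items: Sequence[str], rank: int, size: int) -> List[str]:
    """Return the items that one rank out of `size` ranks would process."""
    if size < 1 or not 0 <= rank < size:
        raise ValueError("rank must satisfy 0 <= rank < size")
    return list(items[rank::size])


def vina_command(receptor: str, ligand: str, config: str, out: str) -> List[str]:
    """Build an AutoDock Vina command line for one ligand (hypothetical paths).

    The search box is assumed to be defined in the config file.
    """
    return [
        "vina",
        "--receptor", receptor,
        "--ligand", ligand,
        "--config", config,
        "--out", out,
        "--cpu", "1",  # one core per docking task, so ranks do not oversubscribe
    ]


def top_scores(scores: Dict[str, float], n: int) -> List[Tuple[str, float]]:
    """Return the n best ligands; a more negative score (kcal/mol) is better."""
    return sorted(scores.items(), key=lambda kv: kv[1])[:n]


def get_rank_and_size() -> Tuple[int, int]:
    """Use MPI if mpi4py is available, otherwise behave as a single rank."""
    try:
        from mpi4py import MPI
    except ImportError:
        return 0, 1
    comm = MPI.COMM_WORLD
    return comm.Get_rank(), comm.Get_size()


def main() -> None:
    rank, size = get_rank_and_size()
    # Hypothetical ligand library; a real library would hold millions of files.
    ligands = [f"ligand_{i:04d}.pdbqt" for i in range(8)]
    for ligand in partition_round_robin(ligands, rank, size):
        cmd = vina_command("receptor.pdbqt", ligand, "box.txt", f"docked_{ligand}")
        # Dry run: print instead of executing, since Vina may not be installed.
        print(f"[rank {rank}/{size}] " + " ".join(cmd))


if __name__ == "__main__":
    main()
```

Under these assumptions, each rank selects every size-th ligand starting at its own rank index, so the ranks cover the library without overlap. A real run would execute the commands, parse the reported affinities, and gather them on one rank before ranking them with a helper such as top_scores.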

Original language: English (US)
Title of host publication: PEARC 2023 - Computing for the common good
Subtitle of host publication: Practice and Experience in Advanced Research Computing
Publisher: Association for Computing Machinery, Inc
Number of pages: 4
ISBN (Electronic): 9781450399852
State: Published - Jul 23 2023
Externally published: Yes
Event: 2023 Practice and Experience in Advanced Research Computing, PEARC 2023 - Portland, United States
Duration: Jul 23 2023 → Jul 27 2023

Publication series
Name: PEARC 2023 - Computing for the common good: Practice and Experience in Advanced Research Computing

Conference: 2023 Practice and Experience in Advanced Research Computing, PEARC 2023
Country/Territory: United States


  • API
  • Apptainer
  • Bioinformatics
  • Containers
  • Datasets
  • High-performance computing
  • MPI
  • Molecular Docking
  • Visualization

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Networks and Communications
  • Computer Science Applications
  • Software
  • Theoretical Computer Science

