Efficient data sharing combined with access to computational resources is becoming essential across the sciences. Large-scale simulations on HPC platforms are now as important for scientific advances as the associated experimental and theoretical studies. Sharing these computationally expensive results enables comparison with experiments and accelerates discovery.
In this talk, we introduce OpenCosmo, a project designed to make large cosmological simulation datasets widely available and to enable flexible data access and analysis. The architecture is built on Globus services: Globus Auth for federated identity management across facilities, Globus Flows for orchestrating multi-step workflows, and Globus Compute for executing analysis tasks on HPC resources. Users can explore data interactively through a web portal, while MCP (Model Context Protocol) servers expose analysis workflows to AI agents for natural-language-driven data discovery.
A key component is OpenCosmo's client-server model: users submit interactive queries from lightweight environments such as login nodes or ALCF Jupyter notebooks, while the server distributes execution across multiple compute nodes. This architecture decouples user interaction from compute-intensive operations, enabling responsive exploration of large datasets. The framework is designed to be extensible to other scientific domains seeking to couple data sharing with distributed computational capability.
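The client-server split described above can be illustrated with a minimal sketch. It does not use the OpenCosmo or Globus APIs: the names `Query`, `Server`, `run_on_partition`, and `gather` are hypothetical, and a standard-library thread pool stands in for the compute nodes. It assumes only that the server holds the data in partitions and that a query can be evaluated on each partition independently.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """Hypothetical client request: keep objects at or above a mass threshold."""
    min_mass: float


def run_on_partition(partition, query):
    """Work done by one stand-in 'compute node' on its share of the data."""
    return [mass for mass in partition if mass >= query.min_mass]


class Server:
    """Hypothetical server: owns the partitioned data and fans queries out."""

    def __init__(self, partitions, n_workers=4):
        self.partitions = partitions
        # The thread pool is an assumption standing in for multiple compute nodes.
        self.pool = ThreadPoolExecutor(max_workers=n_workers)

    def submit(self, query):
        # One task per partition; returning futures keeps the client responsive
        # while the heavy work runs elsewhere.
        return [self.pool.submit(run_on_partition, p, query) for p in self.partitions]


def gather(futures):
    """Client-side helper: collect partial results in partition order."""
    results = []
    for future in futures:
        results.extend(future.result())
    return results


if __name__ == "__main__":
    server = Server([[1.0, 5.0], [2.0, 8.0], [0.5]])
    handles = server.submit(Query(min_mass=2.0))  # lightweight client call
    print(gather(handles))
```

The client only builds a small `Query` object and waits on handles, so it can run in a lightweight environment; in a real deployment the partition-level work would be dispatched to remote nodes rather than local threads.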