By Erica Yee • June 15, 2018
The Science IT Profiles series highlights how the Scientific Computing Group supports the work of Berkeley Lab researchers spanning various disciplines.
Professor David Romps spends his days making shapes out of clouds — only these clouds are enormous simulations often running on high performance computing (HPC) machines. As an associate professor in UC Berkeley’s Department of Earth and Planetary Science and a faculty scientist in Berkeley Lab’s Climate and Ecosystem Sciences Division, Romps leverages supercomputing systems to work with large, complex datasets, from atmospheric to oceanographic records, in order to better understand Earth’s climate.
Romps uses computing in three broad aspects of his research: global climate models, small-domain high-resolution models and observational analysis.
“In all three cases, the computational workload is a lot more than you can manage on a laptop, or even a high performance desktop,” said Romps. “You need hundreds of cores and lots of memory to get these things done.”
That’s where Berkeley Lab’s newly formed Science IT organization and its Scientific Computing Group (SCG) come in. The SCG (also known as High Performance Computing Services) maintains the Lawrencium cluster, an institutional HPC resource for Berkeley Lab researchers to use.
For running big global climate models, whose grid cells can be 100 kilometers on a side, Romps’ group uses Lawrencium resources to get code up and running, debug it and do initial hypothesis testing. He can then take advantage of the synergies between Lawrencium and the larger supercomputing resources at Berkeley Lab’s National Energy Research Scientific Computing Center (NERSC) for his big production runs.
“That’s extremely valuable, being able to have quick turnaround time and have local resources to do all of that development and initial testing,” Romps said of the Lawrencium cluster.
Romps also uses computing for running smaller-domain simulations of the atmosphere, called cloud-resolving models or large eddy simulations. These comparatively small simulations use grid cells that can be one kilometer or even 10 meters on a side in order to resolve the dominant turbulent flows in clouds and storms, giving more confidence that the atmosphere is represented accurately.
“Then we can use the results of those simulations to fine tune the algorithms that we use in the global climate models to represent those unresolved turbulent convective stormy motions,” said Romps.
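These grid spacings hint at why the computational demands differ so sharply between the two kinds of model. As a rough, purely illustrative calculation (not drawn from Romps’ actual codes), refining the grid multiplies the number of horizontal cells by the square of the refinement factor, and the shorter time step that numerical stability requires adds roughly one more factor on top:

```python
# Back-of-the-envelope scaling: how much more work a finer grid demands.
# Assumes cost ~ (number of horizontal cells) x (number of time steps),
# with the time step shrinking in proportion to the grid spacing (a
# CFL-type constraint). Illustrative only; real models differ in many details.

def relative_cost(dx_coarse_km: float, dx_fine_km: float) -> float:
    """Cost of covering the same area on the fine grid, relative to the coarse grid."""
    refinement = dx_coarse_km / dx_fine_km
    cells = refinement ** 2      # 2D horizontal refinement
    timesteps = refinement       # smaller grid spacing forces a smaller time step
    return cells * timesteps

print(relative_cost(100, 1))     # 100 km global-model cells vs. 1 km cloud-resolving cells: 1,000,000x
print(relative_cost(1, 0.01))    # 1 km vs. 10 m large-eddy-simulation cells: another 1,000,000x
```

By this crude measure, covering the same area at one kilometer instead of 100 kilometers costs about a million times more per simulated day, which is why cloud-resolving runs are confined to small domains and why the largest jobs move to bigger machines.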
Lastly, his group uses computing resources to analyze and process observational data, such as rainfall, satellite data or ground-based measurements of lightning.
A global climate model. (Credit: David Romps)
Searching for the Lightning and Wildfire Link
Romps has employed HPC in the past to understand what sets the distribution and frequency of lightning strikes. Lightning triggers wildfires and is consequently responsible for half of the burned area in the U.S. As the planet warms, drier vegetation will likely be available to fuel wildfires, and the frequency of lightning strikes will likely increase as well. Understanding how much that frequency increases, and how the strikes are distributed in space, is therefore important to the whole picture of how much wildfires will increase. Romps used Lawrencium resources for some of the initial data analysis on lightning observations, high-resolution precipitation data and calculations of atmospheric properties. He is currently bringing in observations of wildfires to try to find the link between changes in the frequency of lightning and changes in the frequency of wildfires.
“It’s not an obvious link,” said Romps, explaining that if lightning increases by a certain percentage, lightning-triggered wildfires will not necessarily increase by the same percentage. The lightning rate could double, but if every additional strike hits ground already charred by an earlier fire, it may not trigger another wildfire. And if it does, the new wildfire may start so close to an existing one that the two fires merge and the additional lightning contributes no additional burned area.
“So one has to be careful and look at these observations and try to tease out the relationship between frequency of lightning strikes and frequency of fire ignition or area burned by lightning-triggered fire,” explained Romps.
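A toy calculation makes that non-linearity concrete. The sketch below is purely illustrative and is not Romps’ analysis; the landscape size, strike counts and ignition rule are invented. It drops random lightning strikes on a gridded landscape and assumes a strike only adds burned area if it lands on a cell that has not already burned, so doubling the strikes increases the burned area by less than a factor of two.

```python
# Toy illustration of why doubling lightning need not double burned area.
# Hypothetical setup: a 100x100 landscape, each strike "burns" only the cell
# it hits (no fire spread), and strikes on already-burned cells add nothing.
import random

def burned_cells(n_strikes: int, size: int = 100, seed: int = 0) -> int:
    random.seed(seed)
    burned = set()
    for _ in range(n_strikes):
        cell = (random.randrange(size), random.randrange(size))
        burned.add(cell)         # striking an already-burned cell changes nothing
    return len(burned)

base = burned_cells(5_000)
doubled = burned_cells(10_000)
print(base, doubled, doubled / base)   # the ratio comes out well below 2
```

Real fire spread, fuel conditions and fire merging only strengthen this saturating behavior, which is why the relationship has to be teased out of observations rather than assumed.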
Different Sized Computing, Different Results
Running climate simulations can also produce unexpected results, as evidenced by a bizarre phenomenon known as convective aggregation.
Climate scientists run cloud-resolving simulations to observe the fluid flows and clouds popping up. Since these simulations are computationally expensive, early tests were only run on small domains. As computational resources improved over the years, researchers kept modeling bigger and bigger domains. However, when the domains reached a certain size — about 300 to 400 km on a side — the results changed unexpectedly.
“All the convection grouped up into one spot, it rained there, and everywhere else became dry and desert-like” even though it was over an ocean, Romps explained.
This unrealistic-looking convective aggregation in the simulations induced much hand-wringing among scientists because they did not see it happening in the real-life tropics. Yet they did know that storms and clouds tend to organize in certain locations before being broken apart by winds and then trying to grow again.
“The thought is: is what’s happening in our models the end point that the tropics are trying to get to, but there are large-scale winds that keep blowing these things apart?” said Romps, adding that the mystery of convective aggregation is still an unresolved question in climate science.
“Because in the models you see a dependence on the size of the domain, we need to explore large domain sizes,” he said. “With today’s computing, you can explore small- to medium-sized domains with this behavior on a cluster like Lawrencium. But if you want to explore the large domains, you’ve got to move to NERSC.”
Romps reiterated that Lawrencium’s rapid turnaround times are critical for initial testing stages.
“The small simulations are going to give me a really good look at what the answer is. The science and the learning often happens when you run the small simulations on small clusters,” he said. “You’re debugging your code, you’re trying different diagnostics and you can’t wait hours for the thing to run again because it disrupts your whole flow.”
Leveraging Condo Computing
Romps is also currently working on a Laboratory Directed Research and Development project for which he is adding compute nodes to the Lawrencium resources. Lawrencium is set up as a Condo cluster to which principal investigators can add their own purchased compute nodes. Researchers then take advantage of Lawrencium’s high-speed connections and storage in exchange for letting other users make use of any free time on their nodes.
“Many Lab scientists use Condo computing because it is a cost-effective way to take advantage of state-of-the-art resources and a dedicated team managing workflows, while providing the scientists flexibility to scale,” said SCG manager Gary Jung.
With the new compute nodes, Romps aims to figure out how to take large-scale metrics from climate models and infer what’s happening at the smaller scale for unresolved weather phenomena such as hail, severe winds and lightning. He is also working to advance next-generation algorithms and techniques for stereophotography, or stitching together images from different cameras and various vantage points.
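For readers curious what that stitching involves at its simplest, the following sketch uses OpenCV’s off-the-shelf stitching pipeline to combine overlapping photos into one composite. It is a generic starting point rather than the group’s actual stereophotography code, and the file names are placeholders.

```python
# Minimal image-stitching sketch using OpenCV's built-in Stitcher.
# The input file names are hypothetical stand-ins for photos of the same
# scene taken from different vantage points with overlapping fields of view.
import cv2

image_files = ["camera_a.jpg", "camera_b.jpg", "camera_c.jpg"]  # placeholders
images = [cv2.imread(path) for path in image_files]

stitcher = cv2.Stitcher_create()           # detects features, matches, warps and blends
status, composite = stitcher.stitch(images)

if status == cv2.Stitcher_OK:
    cv2.imwrite("stitched.jpg", composite)
else:
    print(f"Stitching failed with status code {status}")
```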
“The SCG has standardized protocols to have PIs procure nodes and storage. It’s a well-run machine,” said Romps. “When I’ve got problems, the group is extremely responsive and fixes things right away. It’s been just a great experience.”