Richard Barnes

Scientist. Developer. Tinkerer.

GitHub LinkedIn Stack Overflow Google Scholar ResearchGate Blog

I develop high-performance algorithms for investigating big data problems in society and the environment.

Lactose content of dairy products

Lactose intolerance is the global default. Roughly two-thirds of the world's adults stop producing lactase past childhood, and what counts as a tolerable dose varies hugely between people. The NIH puts the typical threshold at about 12 g of lactose — roughly one cup of milk — without symptoms or with only mild ones [1]. Below that, most people are fine; above it, dose and individual sensitivity start to matter.

What makes the topic interesting is that the lactose content of dairy products spans nearly five orders of magnitude. A wedge of aged Cheddar has essentially none. A spoonful of Brunost (Norwegian whey cheese) has more than half its mass in lactose. The reason: lactose is water-soluble and lives in the whey, not the curd. Anything that drains whey (cheesemaking, Greek-style yogurt straining) removes lactose. Anything that ferments (aging, live yogurt cultures) converts it to lactic acid. Anything that concentrates whey (Brunost, ricotta, dry milk solids) goes the other way.

The table below is my attempt at a unified reference. All values are grams of lactose per 100 g of product, so they're directly comparable. Each value is footnoted to a single source. Where US (USDA / NIDDK) numbers and non-US databases differ systematically — Feta is the standout — I've gone with the US value.

For mental conversion: 100 g is about 1 cup of milk (which weighs ~244 g, but the lactose-per-100g and per-100mL values are nearly identical for milk), about 3.5 oz of cheese, or roughly 7 tablespoons of butter.

The Safe? column estimates whether a lactose-intolerant adult can comfortably consume a typical US serving (1 cup milk/yogurt, 1 oz cheese, 1 tbsp butter, 2 tbsp cream, ½ cup ice cream, ½ cup cottage cheese / ricotta) with a 3× margin of safety — most people eat more than one serving at a meal, so the rule asks whether 3 servings still fit well under the ~12 g/day NIDDK threshold [1]:

  • Yes — < 1 g per typical serving. Three servings still well under threshold; safe for nearly all lactose-intolerant adults.
  • Maybe — 1–4 g per typical serving. One serving is fine for most people; 3 servings approach or hit the threshold.
  • No — > 4 g per typical serving. Three servings clearly exceed the threshold; even one serving may cause symptoms in sensitive individuals.
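In code, the rule above works out to a short classifier. This is a sketch of my own: `SafetyRating` is a hypothetical helper, the thresholds mirror the bullets above, and the serving weights in the examples are approximations.

```cpp
#include <cassert>
#include <string>

// Sketch only: applies the article's rule (< 1 g per serving = "Yes",
// 1-4 g = "Maybe", > 4 g = "No"). Serving weight is supplied by the caller.
std::string SafetyRating(double LactosePer100g, double ServingGrams) {
  const double GramsPerServing = LactosePer100g * ServingGrams / 100.0;
  if (GramsPerServing < 1.0) return "Yes";    // three servings stay well under ~12 g
  if (GramsPerServing <= 4.0) return "Maybe"; // three servings approach the threshold
  return "No";                                // even one serving may cause symptoms
}
```

For example, whole milk at 4.81 g/100 g in a 244 g cup gives ~11.7 g per serving, a clear "No"; butter at 0.1 g/100 g in a 14 g tablespoon gives ~0.01 g, a clear "Yes".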

The table

Milks

Product | Lactose (g/100g) | Safe? | Notes
Whole cow milk (3.25% fat) | 4.81 [2] | No | ≈ 12 g per 244 g cup [1]; one cup alone hits the threshold
Reduced-fat (2%) cow milk | 4.69 [2] | No |
Lowfat (1%) cow milk | 4.86 [2] | No |
Skim cow milk | 5.05 [2] | No | Removing fat slightly raises lactose-by-mass
Buttermilk, lowfat (cultured) | 4.0 [3] | No | Despite the name, US buttermilk is fermented lowfat milk
Goat milk | 4.27 [2] | No |
Sheep milk | 4.4 [4] | No |
Lactose-free milk (Lactaid-style) | <0.1 [4] | Yes | Lactase enzyme has hydrolyzed the lactose into glucose + galactose
Evaporated whole milk | ~10 [5] | Maybe | Used in small amounts (2 tbsp ≈ 3 g); a ½ cup recipe portion is "No"
Sweetened condensed milk | ~12 [5] | Maybe | A 2 tbsp drizzle is fine; ¼ cup in a recipe is not

Yogurts and fermented milks

Product | Lactose (g/100g) | Safe? | Notes
Plain whole-milk yogurt | 4.7 [3] | No | A 6 oz container has ~8 g; live cultures pre-digest some lactose but most remains
Plain low-fat yogurt | 4.0 [3] | No | ~6.8 g per 6 oz container
Greek yogurt (strained, plain) | ~3.0 [6] | Maybe | Straining removes ~70% of the lactose into the acid whey [6]
Greek yogurt (nonfat, strained) | ≤0.7 [6] | Yes |
Kefir, plain | 4.0 [3] | Maybe | High raw lactose, but live yeast/bacteria reduce symptoms in lactose-intolerant people by 54–71% [7]; the effective dose is much lower than the raw number suggests
Skyr | 2.5 [3] | Maybe | Strained, like Greek yogurt
Sour cream | 2.0 [3] | Yes | A 2 tbsp dollop has ~0.6 g
Crème fraîche | ~2.4 [4] | Yes | A 2 tbsp serving has ~0.7 g

Cream and butter

Product | Lactose (g/100g) | Safe? | Notes
Butter, salted | 0.1 [8] | Yes | An entire stick (113 g) has ~0.1 g; almost all the whey has been churned out
Heavy / whipping cream | 2.5 [3] | Yes | A 2 tbsp serving has ~0.75 g; more fat, less aqueous phase, less lactose
Half-and-half | ~4.0 [5] | Maybe | A 2 tbsp coffee splash is fine (~1.2 g); 3 splashes approach the threshold

Fresh and unripened cheeses

Product | Lactose (g/100g) | Safe? | Notes
Cottage cheese, low-fat | 1.8 [3] | Maybe | A ½ cup serving has ~2 g; 3 servings = 6 g
Cream cheese | 3.6 [3] | Maybe | Highest of any common cheese; a 1 oz schmear has ~1 g
Ricotta (whey-based) | 2.75 [3] | Maybe | Made from the whey drained off other cheeses; ½ cup has ~3.4 g
Mascarpone | 3.0 [3] | Yes | A 1 oz serving has ~0.84 g
Chèvre, fresh (goat) | 0.9 [3] | Yes | A 1 oz serving has ~0.25 g
Paneer | 0.002 [3] | Yes | Acid-set; the whey carries off nearly all of it
Burrata | ~1.5 [3] | Maybe | A mozzarella shell wrapped around a stracciatella cream filling; lactose is dominated by the filling. ~1 g per typical 1 oz serving
Stracciatella (burrata cream filling) | 1.8 [3] | Maybe | Sold as a stand-alone topping; eaten by the spoonful
Cheese curds, fresh | 3.0 [3] | Maybe | Wisconsin specialty. Unaged and unpressed, so much of the lactose remains; a small handful (~30 g) has ~1 g
Queso fresco | 2.3 [9] | Maybe | Fresh, unaged Mexican cheese. Crumbled on tacos at typical portions (~30 g) gives <1 g

Soft-ripened and washed-rind cheeses

Product | Lactose (g/100g) | Safe? | Notes
Brie | <0.0024 [3] | Yes | Mould ripening fully ferments residual lactose
Camembert | <0.0024 [3] | Yes |
Limburger | <0.0024 [3] | Yes |
Taleggio | <0.001 [3] | Yes |

Pasta filata (stretched-curd) cheeses

Product | Lactose (g/100g) | Safe? | Notes
Mozzarella, commercial low-moisture | 0.74 [3] | Yes | The kind on US pizza
Mozzarella di bufala | 0.35 [3] | Yes |
Bocconcini (mini fresh mozzarella) | <0.001 [3] | Yes | Fresh; lower lactose than commercial mozzarella because there's less moisture to retain whey
String cheese | 0.74 [3] | Yes | Mechanically the same product as low-moisture mozzarella; one stick (~28 g) has ~0.2 g
Oaxaca (queso Oaxaca) | <0.5 [3] | Yes | Mexican pasta filata cheese; USDA reports ~0 carbohydrates per serving
Provolone, dolce or piccante | <0.001 [3] | Yes |

Hard, semi-hard, and pressed cheeses

These are all functionally lactose-free. The Cheese Scientist database reports values below the analytical detection limit (≈ 1 mg/100 g) for all of them [3]. Two mechanisms combine: pressing the curds drains nearly all the whey (and with it, the lactose) before aging even begins; for cheeses aged >2 months, residual lactose is further fermented to lactic acid, as confirmed in the Stelios et al. PDO survey [10]. Even young pressed cheeses like Havarti and Pepper Jack therefore come in at trace levels.

read more →
Richard's Picks

My current and previous choices for the tools I use.

read more →
How to estimate a QR code redundancy level?

An answer to this question on Stack Overflow.

Question

I'm not a specialist, but as far as I know, a bit of information in a QR code is encoded more than once; this is known as the redundancy level.

How can I estimate a QR code's redundancy level? Is there a mobile app or a website where I can easily test my QR code's redundancy level? If not, is there an easy algorithm that I can implement?

Redundancy is sorted into different categories according to this website, but I'd like to have the direct percentage value if possible.

Answer

QR codes contain a couple of bits which indicate the error correction level, as depicted below (source):

QR code redundancy level bits
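Once those two bits are extracted, converting them to a percentage is a lookup. A minimal sketch (`RecoveryPercent` is a hypothetical helper; the two-bit indicators and the approximate recovery capacities come from the QR specification):

```cpp
#include <cassert>

// Sketch only: maps the two error-correction indicator bits from a QR code's
// format information to the approximate percentage of codewords that can be
// restored. Per the QR spec: M=00, L=01, H=10, Q=11.
int RecoveryPercent(int EcBits) {
  switch (EcBits) {
    case 0b01: return 7;   // Level L: ~7% of codewords recoverable
    case 0b00: return 15;  // Level M: ~15%
    case 0b11: return 25;  // Level Q: ~25%
    case 0b10: return 30;  // Level H: ~30%
    default:   return -1;  // not a valid 2-bit indicator
  }
}
```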

read more →
Fast sigmoid algorithm

An answer to this question on Stack Overflow.

Question

The sigmoid function is defined as


S(t) = 1 / (1 + e^(-t)) (where ^ is pow)

I found that using the C standard library function exp() to calculate the value of S(t) is slow. Is there any faster algorithm to calculate the value of S(t)?

Answer

This question has significantly more detail including these benchmarked results:

                                    name      rms_error        maxdiff        time_us        speedup        samples
                      logistic_with_tanh     5.9496e-02     1.5014e-01         0.0393         0.5076      200000001
                      logistic_with_atan     3.9051e-02     9.6934e-02         0.0321         0.6211      200000001
                       logistic_with_erf     6.5068e-02     1.6581e-01         0.0299         0.6676      200000001
            logistic_fexp_quintic_approx     1.2921e-07     5.9050e-07         0.0246         0.8118      200000001
        logistic_product_approx_float128     8.7032e-04     1.7217e-03         0.0209         0.9523      200000001
           logistic_with_exp_no_overflow     4.7660e-17     1.6653e-16         0.0198         1.0084      200000001
              logistic_product_approx128     8.7032e-04     1.7211e-03         0.0164         1.2187      200000001
         log_w_approx_exp_no_overflow128     8.7193e-04     1.7211e-03         0.0158         1.2640      200000001
                      logistic_with_sqrt     8.3414e-02     1.1086e-01         0.0146         1.3662      200000001
          log_w_approx_exp_no_overflow16     6.9726e-03     1.4074e-02         0.0141         1.4114      200000001
  log_w_approx_exp_no_overflow16_clamped     6.9726e-03     1.4074e-02         0.0141         1.4153      200000001
             logistic_schraudolph_approx     1.5661e-03     8.9906e-03         0.0138         1.4497      200000001
                       logistic_with_abs     6.0968e-02     8.2289e-02         0.0134         1.4936      200000001
                           logistic_orig     0.0000e+00     0.0000e+00         0.0199         ------      200000001
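For a concrete sense of the speed/accuracy trade-off, here is a sketch of an abs-based approximation in the spirit of the logistic_with_abs row (my own reconstruction, not the benchmarked code):

```cpp
#include <cassert>
#include <cmath>

// Sketch only: avoids exp() entirely -- one fabs, one divide. It matches the
// exact logistic at x = 0 and at the asymptotes, but sags in between.
double FastSigmoid(double x) {
  return 0.5 * x / (1.0 + std::fabs(x)) + 0.5;
}

// Reference implementation for comparison.
double ExactSigmoid(double x) {
  return 1.0 / (1.0 + std::exp(-x));
}
```

As the maxdiff column above suggests, this class of approximation is only good to about one decimal place, so it suits workloads (e.g. some neural-network activations) where the sigmoid's shape matters more than its exact value.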
read more →
Efficient algorithm for minimum Euclidean distance between points in non-overlapping regions in a 2D array

An answer to this question on Stack Overflow.

Question

Problem background

  • I have a 2D array Map[Height][Width] that stores a bool to represent each cell.

  • There exist two non-overlapping regions RegionA and RegionB.

  • Each region is a list of unique adjacent integer co-ordinates.

  • The numbers of co-ordinates in RegionA and RegionB are m and n, respectively.

  • A co-ordinate in RegionA will never be horizontally or vertically adjacent to a coordinate in RegionB (i.e. the regions do not touch)

  • The bounding box of RegionA could overlap with the bounding box of RegionB

  • RegionA and RegionB may have holes

  • RegionA may or may not surround RegionB (and vice versa)

  • If a co-ordinate (X, Y) is part of a region then Map[Y][X] == 1; otherwise it's zero.

Problem

I'm looking for an algorithm to determine the two co-ordinates A and B that have the minimum Euclidean distance, where A is a member of RegionA and B is a member of RegionB.

The brute force method has a time complexity of O(mn).

Is there a more time-efficient algorithm (preferably better than O(mn)) for solving this problem?

The name + link, code, or description of one instance of such an algorithm would be greatly appreciated and will be accepted as an answer.

C++ code to create two random regions A and B

The region generation code is immaterial to the problem being asked; I am only interested in how close the regions are to each other. In the project I am working on I used cellular automata to create the regions. I can include the code for this if necessary. The region construction procedure exists only to create relevant examples.

#include <cstdlib>
#include <cstring>
#include <vector>
#include <cstdio>
#include <map>
#include <ctime>
#include <cmath>
constexpr int Width = 150;
constexpr int Height = 37;
bool Map[Height][Width];
struct coord {
	int X, Y;
	coord(int X, int Y): X(X), Y(Y) {}
	bool operator==(const coord& Other) {
		return Other.X == X && Other.Y == Y;
	}
	bool IsAdjacentTo(const coord& Other){
		int ManhattanDistance = abs(X - Other.X) + abs(Y - Other.Y);
		return ManhattanDistance == 1;
	}
	bool IsOnMapEdge(){
		return X == 0 || Y == 0 || X == Width - 1 || Y == Height - 1;
	}
};
using region = std::vector<coord>;
void ClearMap(){
	std::memset(Map, 0, sizeof Map);
}
bool& GetMapBool(coord Coord) {
	return Map[Coord.Y][Coord.X];
}
int RandInt(int LowerBound, int UpperBound){
	return std::rand() % (UpperBound - LowerBound + 1) + LowerBound;
}
region CreateRegion(){
	std::puts("Creating Region");
	ClearMap();
	region Region;
	int RegionSize = RandInt(50, Width * Height / 2);
	Region.reserve(RegionSize);
	while (true){
		Region.emplace_back(
			RandInt(1, Width - 1), 
			RandInt(1, Height - 1)
		);
		if (!Region[0].IsOnMapEdge())
			break;
		else 
			Region.pop_back();
	}
	GetMapBool(Region[0]) = 1;
	while (Region.size() != RegionSize){
		coord Member = Region[RandInt(0, Region.size() - 1)];
		coord NeighbourToMakeAMember {
			RandInt(-1, 1) + Member.X,
			RandInt(-1, 1) + Member.Y
		};
		if (!Member.IsOnMapEdge() && !GetMapBool(NeighbourToMakeAMember) && Member.IsAdjacentTo(NeighbourToMakeAMember)){
			GetMapBool(NeighbourToMakeAMember) = 1;
			Region.push_back(NeighbourToMakeAMember);
		};
	}
	std::puts("Created Region");
	return Region;		
}
std::pair<region, region> CreateTwoRegions(){
	bool RegionsOverlap;
	std::pair<region, region> Regions;
	do {
		Regions.first = CreateRegion();
		Regions.second = CreateRegion();
		ClearMap();
		for (coord Member : Regions.first){
			GetMapBool(Member) = 1;
		}
		RegionsOverlap = 0;
		for (coord Member : Regions.second){
			if (GetMapBool(Member)){
				ClearMap();
				std::puts("Regions Overlap/Touch");
				RegionsOverlap = 1;
				break;
			} else {
				GetMapBool(Member) = 1;
			}
		}
	} while (RegionsOverlap);
	return Regions;
}
void DisplayMap(){
	for (int Y = 0; Y < Height; ++Y){
		for (int X = 0; X < Width; ++X)
			std::printf("%c", (Map[Y][X] ? '1' : '-'));
		std::puts("");
	}
}
int main(){
	int Seed = time(NULL);
	std::srand(Seed);
	ClearMap();
	auto[RegionA, RegionB] = CreateTwoRegions();
	DisplayMap();
}

Problem illustration

What are the co-ordinates of the points A and B that form the minimum Euclidean distance between the two regions?

Answer

Notes on finding the nearest points between two regions

As others have mentioned, you can improve on finding the closest points between the two regions of n and m points by first filtering each region's points to only those which border unassigned cells and then doing an all-pairs search between the boundaries. For a roughly rectangular region this would take O(m + n + 16 √m √n) time: two linear filters and then the all-pairs search. I've implemented an example of this search below.
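A minimal sketch of that filter-then-search approach (hypothetical helper names, not the implementation referenced above):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

struct Cell { int X, Y; };

// Sketch only: a cell is on the boundary if any of its 4-connected
// neighbours is outside the region (or off the map).
std::vector<Cell> BoundaryOf(const std::vector<Cell>& Region,
                             const std::vector<std::vector<bool>>& InRegion) {
  const int H = InRegion.size(), W = InRegion[0].size();
  const int dx[] = {1, -1, 0, 0}, dy[] = {0, 0, 1, -1};
  std::vector<Cell> Boundary;
  for (const Cell& c : Region) {
    for (int i = 0; i < 4; ++i) {
      const int nx = c.X + dx[i], ny = c.Y + dy[i];
      if (nx < 0 || ny < 0 || nx >= W || ny >= H || !InRegion[ny][nx]) {
        Boundary.push_back(c);
        break;
      }
    }
  }
  return Boundary;
}

// All-pairs minimum Euclidean distance between two (boundary) point sets.
double MinDistance(const std::vector<Cell>& A, const std::vector<Cell>& B) {
  double Best = std::numeric_limits<double>::infinity();
  for (const Cell& a : A)
    for (const Cell& b : B)
      Best = std::min(Best, std::hypot(a.X - b.X, a.Y - b.Y));
  return Best;
}
```

Run `BoundaryOf` once per region, then `MinDistance` on the two boundaries; the interior cells never enter the quadratic phase.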

But this will not save you any time if your regions are shaped like this:

########
#      #
# #### #
# #  # #
# # ## #
# #    #
# ######

If you are using larger datasets and can tolerate preprocessing you can get further acceleration using a 2D k-d tree.

Construction takes O(n log n) time, after which each of the m queries takes O(log n) time. The total time to find the closest points is then O(m log n + n log n), or O(n log m + m log m) if you build the tree over the other region.

You can do some benchmarking to determine if it is better to build the k-d tree for the larger or the smaller of the regions. Remember that this answer may change depending on how many times you're able to reuse the same k-d tree for queries against different regions.

For small datasets you'll probably find that the all-pairs comparison is faster because it has much greater cache locality.

Note that implementing a k-d tree is tricky, so you'll probably want to use an existing library or definitely use test cases if you're rolling your own.

Another algorithm

If we happen to know that the regions are within a distance d of each other we can get an even faster algorithm.

For one of the regions we can build a hash table where a reference to each member cell is stored at a hash location (c.y // d) * height + (c.x // d). To find the nearest neighbor to a query point we then do an all-pairs check against the points referenced in the associated "meta-cell" as well as the meta-cell's neighbors.
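A sketch of that meta-cell table (hypothetical names; assumes non-negative coordinates and an assumed bucket-key packing; squared distances avoid the sqrt):

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

struct Pt { int X, Y; };

// Sketch only: points of one region are bucketed by (x / d, y / d); a query
// inspects its own bucket plus the 8 neighbouring buckets, so only points
// within roughly d of the query are ever compared.
struct HashGrid {
  int D;
  std::unordered_map<long long, std::vector<Pt>> Buckets;

  HashGrid(const std::vector<Pt>& Points, int d) : D(d) {
    for (const Pt& p : Points)
      Buckets[Key(p.X / D, p.Y / D)].push_back(p);
  }

  // Pack a meta-cell coordinate into one key (assumes < ~1e6 meta-columns).
  static long long Key(long long cx, long long cy) {
    return cy * 1000003LL + cx;
  }

  // Squared distance to the nearest stored point in the 3x3 meta-cell
  // neighbourhood of q, or -1 if no point is that close.
  long long NearestWithin(const Pt& q) const {
    long long Best = -1;
    for (int oy = -1; oy <= 1; ++oy)
      for (int ox = -1; ox <= 1; ++ox) {
        const auto it = Buckets.find(Key(q.X / D + ox, q.Y / D + oy));
        if (it == Buckets.end()) continue;
        for (const Pt& p : it->second) {
          const long long dx = p.X - q.X, dy = p.Y - q.Y;
          const long long d2 = dx * dx + dy * dy;
          if (Best < 0 || d2 < Best) Best = d2;
        }
      }
    return Best;
  }
};
```

Querying every cell of the other region and keeping the smallest non-negative result gives the minimum distance whenever the regions are within d of each other.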

Notes on region building

On the off-chance you're trying to grow the regions quickly, but their boundaries can overlap, this should do the trick.

Note that your question can be improved by stating why you are trying to calculate something. If you want to know the closest cell so you have a primitive to build regions more quickly, then reconsidering how region building works might be a better use of your time, and that's what I address here.

If region generation is immaterial and you're only interested in how close they are to each other, then focus on defining, conceptually, what a region is and be very clear that your region construction procedure exists only to create relevant examples.

The fundamental trick we'll use to accelerate region construction is to make a space-time tradeoff. Rather than storing the map as a set of boolean values, we'll store it as a set of 8-bit integers. We reserve a few of these integers for special purposes:

  • 0 - An unassigned space that a region may be built on
  • 1 - A buffer cell that isn't part of any region, but close to one. This is not available for building on.
  • 2, 3, 4, ... - valid region labels.

Thus, the algorithm for building a region becomes simple:

  • Find a valid seed cell
  • Grow outward from the seed cell via 4-connected cells until the desired number of cells is reached
    • A new Region Cell can only be placed in an Unassigned Space
  • Iterate over Region Cells marking their 8-connected neighbors as Buffer cells.
    • Note that, if we add the new Buffer cells to the list of Region Cells and repeat the last step, we can grow arbitrarily large buffers between regions.
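The steps above can be sketched as follows (hypothetical names; growth here is a deterministic BFS stand-in for the question's random cellular-automata growth):

```cpp
#include <cassert>
#include <cstdint>
#include <queue>
#include <utility>
#include <vector>

// Sketch only: 0 = unassigned, 1 = buffer (not buildable), 2+ = region labels.
using LabelMap = std::vector<std::vector<std::uint8_t>>;

void GrowRegion(LabelMap& M, int SeedX, int SeedY, std::uint8_t Label, int Size) {
  const int H = M.size(), W = M[0].size();
  if (M[SeedY][SeedX] != 0) return;  // seed must be an unassigned space
  std::queue<std::pair<int, int>> Frontier;
  std::vector<std::pair<int, int>> Members;
  M[SeedY][SeedX] = Label;
  Frontier.push({SeedX, SeedY});
  Members.push_back({SeedX, SeedY});
  const int dx[] = {1, -1, 0, 0}, dy[] = {0, 0, 1, -1};
  // Grow outward via 4-connected cells until the desired size is reached.
  while (!Frontier.empty() && (int)Members.size() < Size) {
    const auto [x, y] = Frontier.front();
    Frontier.pop();
    for (int i = 0; i < 4 && (int)Members.size() < Size; ++i) {
      const int nx = x + dx[i], ny = y + dy[i];
      if (nx < 0 || ny < 0 || nx >= W || ny >= H || M[ny][nx] != 0) continue;
      M[ny][nx] = Label;  // new region cells only go on unassigned spaces
      Frontier.push({nx, ny});
      Members.push_back({nx, ny});
    }
  }
  // Mark 8-connected neighbours as buffer cells so regions can never touch.
  for (const auto [x, y] : Members)
    for (int oy = -1; oy <= 1; ++oy)
      for (int ox = -1; ox <= 1; ++ox) {
        const int nx = x + ox, ny = y + oy;
        if (nx >= 0 && ny >= 0 && nx < W && ny < H && M[ny][nx] == 0)
          M[ny][nx] = 1;
      }
}
```

Because every finished region is ringed by buffer cells, a second call with a different label can never produce touching regions, and no overlap check or retry loop is needed.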

A few other parts of your code needed cleaning. Types are typically defined using CamelCase and variables using snake_case; I prefer snake_case for functions as well. See a style guide for more opinions.

Do not use global variables: they are bad and, in the fullness of time, they will bring you nothing but sorrow.

You're using C++, so avoid headers such as <cstdio>, <cstdlib>, and <cstring>. They provide the C way of doing things, which often comes with disadvantages such as diminished type safety.

Any time your if statements have a lot of conditions, ask yourself whether they can be simplified by inverting one or more of the conditions (that is, switching is_good() to !is_good()) and doing an early exit. This can make your code much easier to read and reason about.
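For illustration (hypothetical functions; both compute the same predicate, but the second inverts each condition and exits early):

```cpp
#include <cassert>

// Nested form: all conditions crammed into one test.
bool CanPlaceNested(bool InBounds, bool Unoccupied, bool Adjacent) {
  if (InBounds && Unoccupied && Adjacent) {
    return true;
  }
  return false;
}

// Early-exit form: each requirement is inverted and rejected on its own line.
bool CanPlaceEarlyExit(bool InBounds, bool Unoccupied, bool Adjacent) {
  if (!InBounds) return false;   // off the map? bail out immediately
  if (!Unoccupied) return false; // already taken? bail out
  if (!Adjacent) return false;   // not 4-connected? bail out
  return true;                   // every requirement held
}
```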

read more →

See all posts →