If you care about #Asia’s #rivers, or are interested in its #paleoclimate, I have a data set for you: 813 years of annual discharge (streamflow) at 62 stations on 41 rivers in 16 countries, from 1200 to 2012. This thread explains how we created it and describes all the data underlying it.
To estimate discharge hundreds of years ago, we need three things: (1) modern discharge observations, (2) a proxy of past climate, and (3) a model that links those two.
Most of our modern observations came from GSIM, a massive effort by Hong Do, @LukGudmundsson, Michael Leonard, and @sethwestra. In addition, we emailed everyone we knew asking for more data, and our colleagues kindly shared theirs. https://www.earth-syst-sci-data.net/10/765/2018/
We were careful to remove stations that have large reservoirs upstream, since reservoir operations could alter the annual water budget. We used reservoir data from GRanD. https://esajournals.onlinelibrary.wiley.com/doi/abs/10.1890/100125
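A minimal sketch of this kind of filter, in Python. The field names, the capacity values, and the threshold are all hypothetical and only illustrate the idea; the actual criterion we used is described in the paper.

```python
def drop_regulated_stations(stations, capacity_threshold):
    """Keep only stations with no upstream reservoir above capacity_threshold.

    `stations` is a list of dicts with a hypothetical "upstream_capacities"
    field listing the storage capacity of each reservoir upstream.
    """
    return [
        s for s in stations
        if all(c <= capacity_threshold for c in s["upstream_capacities"])
    ]

# Hypothetical stations and capacities, for illustration only.
stations = [
    {"id": "st_1", "upstream_capacities": []},
    {"id": "st_2", "upstream_capacities": [5.0, 1200.0]},
    {"id": "st_3", "upstream_capacities": [20.0]},
]

kept = drop_regulated_stations(stations, capacity_threshold=100.0)
kept_ids = [s["id"] for s in kept]
print(kept_ids)
```

In practice the reservoir capacities would come from a database such as GRanD, matched to each station's upstream catchment.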
Our paleoclimate proxy is the Monsoon Asia Drought Atlas (MADA), a long-term record of soil moisture, reconstructed from tree rings by Ed Cook, @thirstygecko, Brendan Buckley, and others. The MADA was first published in 2010, and we use version 2 of it. http://www.sciencemag.org/cgi/doi/10.1126/science.1185188
Why didn’t we use #treerings directly? Processing tree ring data takes enormous effort, and that effort has already gone into making the MADA. We leverage that work (standing on the shoulders of giants). With some caveats, tree rings, soil moisture, and streamflow are all related.
The MADA is a huge grid. We need to select a subregion as predictors for each station. Rather than relying on correlations as usual, we use climate similarity as our physical basis for selection. But the popular Köppen-Geiger climate classification didn’t fit: it’s discrete.
Then we found the clever classification by Wouter Knoben, @rosswoodskiwi, and @FreerJim, which I dubbed the KWF system. Every point on Earth has an RGB colour, and climate similarity can be defined on a continuous spectrum.
https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2018WR022913
Using the KWF system, we were able to select sensible subregions of the MADA, and the selection agrees well with the one you would get using correlations.
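To make the idea concrete, here is a minimal Python sketch of selection by climate similarity. It assumes each location's climate is a point in a three-dimensional RGB space and measures similarity by Euclidean distance. The colour values, the cell names, and the distance threshold are hypothetical, and this is an illustration rather than our exact procedure.

```python
from math import dist

def select_similar_cells(station_rgb, grid_rgb, max_distance):
    """Return ids of grid cells whose climate colour lies within
    max_distance (Euclidean distance in RGB space) of the station's colour."""
    return [
        cell_id
        for cell_id, rgb in grid_rgb.items()
        if dist(station_rgb, rgb) <= max_distance
    ]

# Hypothetical climate colours, for illustration only.
station = (0.60, 0.30, 0.10)
grid = {
    "cell_a": (0.62, 0.28, 0.10),  # very similar climate
    "cell_b": (0.10, 0.90, 0.50),  # very different climate
    "cell_c": (0.55, 0.35, 0.15),  # moderately similar climate
}

selected = select_similar_cells(station, grid, max_distance=0.1)
print(selected)
```

Because the distance is continuous, the threshold can be tightened or relaxed smoothly, which is not possible with a discrete classification.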
Finally, the reconstruction model we used is the one that I published with @GalelliStefano two years ago, which is available in the #rstats package ldsr. https://agupubs.onlinelibrary.wiley.com/doi/full/10.1002/2017WR022114
That’s (most of) it. For more details, check out our paper. Amazing coauthors @GalelliStefano, Brendan Buckley, and @sean_turner. https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1029/2020WR027883
Also, read @KHwave’s wonderful review paper on the world’s big rivers. I found it while looking for ideas to write the introduction. As a result, my first sentence is “Of the world’s 30 biggest rivers, ten are located within Monsoon Asia…” https://www.nature.com/articles/s41561-018-0262-x
Science is a collective endeavour. We benefited a lot from open data and open source software: #QGIS, #LaTeX, #rstats, #RColorBrewer, #ggplot2, #patchwork, and #cowplot. Big thanks to my heroes @hadleywickham, @thomasp85, @ClausWilke and the open source software community.
In turn, we make our data, code, and results public here: https://github.com/ntthung/paleo-asia Very soon, you will be able to explore this data set interactively, together with other reconstructions, on @stagge_hydro's awesome PaleoFlow site. http://www.paleoflow.org/