With all digital geospatial data sets, users must be aware of certain characteristics of the data, such as resolution, accuracy, method of production and any resulting artifacts, in order to better judge its suitability for a specific application. A characteristic of the data that renders it unsuitable for one application may have no relevance as a limiting factor for its use in a different application (NASA/JPL 2005).
This section provides an overview of the applied processing steps for the generation of HydroSHEDS and discusses some key technical specifications in order to allow the user to better estimate the suitability of the data set for a specific application. Additional data validation details are addressed in section 4. Please also refer to the flowchart of Appendix A in the technical documentation.
3.1 Combination of unfinished SRTM-3 and finished DTED-1 data
3.3 Sink identification
3.4 Hydrologic conditioning
3.5 Manual corrections
3.7 Derived products
For the generation of HydroSHEDS, the performance of the publicly available SRTM-3 and DTED-1 versions of SRTM at 3 arc-second resolution has been tested. Due to their specific characteristics, each data set showed both advantages and disadvantages for hydrological applications.
As stated earlier, SRTM-3 has been derived through averaging of 1 arc-second SRTM data, as opposed to the subsampling method of DTED-1. As averaging reduces the high frequency “noise” that is characteristic of radar-derived elevation data, it is the method generally preferred by the research community (NASA/JPL 2005).
On the other hand, SRTM-3 data does not represent open water surfaces and shorelines well. DTED-1 has been specifically corrected to represent these features. However, the correction protocol introduced some artifacts that are critical for hydrological applications. For example, when large rivers were identified and monotonically stepped down in height towards the ocean, it was ensured that the surface of each river pixel was lower than that of the directly adjacent land pixels. Yet a slightly elevated riverbank, say due to a levee or simply caused by the interpretation of riparian vegetation in the radar image, may allow a river reach to lie somewhat higher than the floodplain behind the riverbank. Since the original processing was performed at 1 arc-second resolution, the elevated riverbank can disappear in the aggregated 3 arc-second version if it is only one pixel wide. The resulting effect in the derived flow direction map is a possible breakout of the river course into the floodplain.
For the above reasons, and after conducting a series of local tests, it was decided to apply both SRTM-3 and DTED-1 data in combination. For each pixel, the minimum value found in either SRTM-3 or DTED-1 was used to generate an initial HydroSHEDS elevation model. The minimum criterion preserves the lower of the two surfaces in the combined elevation data, which is considered desirable for the later identification of drainage directions.
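The per-pixel minimum rule can be illustrated with a minimal sketch. The grids, the function name `combine_min`, and the handling of void cells (falling back to whichever value exists) are assumptions made for illustration only; plain Python lists are used to keep the example self-contained.

```python
from typing import List, Optional

Grid = List[List[Optional[int]]]  # None marks a no-data (void) cell


def combine_min(srtm3: Grid, dted1: Grid) -> Grid:
    """Per-pixel minimum of two equally sized elevation grids."""
    combined = []
    for row_a, row_b in zip(srtm3, dted1):
        out_row = []
        for a, b in zip(row_a, row_b):
            if a is None or b is None:
                # Assumption: use whichever value exists (None if both are void).
                out_row.append(b if a is None else a)
            else:
                out_row.append(min(a, b))
        combined.append(out_row)
    return combined


# Hypothetical elevation values in meters.
srtm3 = [[105, 98, None], [101, 97, 96]]
dted1 = [[104, 99, 95], [101, 95, None]]
initial_dem = combine_min(srtm3, dted1)
print(initial_dem)  # [[104, 98, 95], [101, 95, 96]]
```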
At the ocean surface, the combined data initially shows elevation values of 0 (from DTED-1) or negative (from SRTM-3). Since land close to the shoreline can also be 0 or even negative, using elevation alone as a criterion does not allow for a clean identification of the ocean shoreline. Thus to aid in the shoreline delineation, SWBD was employed as ancillary data: where SWBD indicates “ocean”, the values of the HydroSHEDS elevation model were reclassified to no-data. The resulting shoreline was then slightly generalized in order to remove small artifacts: land was first extended by a one-pixel rim into the ocean and the boundary was then smoothed using a local cell filter. All detached ocean surfaces (e.g. small estuaries entirely surrounded by land cells) were treated as land and their elevation values were retained rather than set to no-data. Some larger rivers are defined in SWBD to extend relatively far into the ocean. In these cases, the shoreline was modified based on the shoreline of DCW. Finally, some small errors were detected in SWBD in visual inspections (e.g. some incomplete island boundaries) and were individually corrected. Note that in all up-scaled HydroSHEDS layers, each cell that contains at least one land cell at 3 arc-second resolution is defined as land.
Both SRTM-3 and DTED-1 original 1-degree by 1-degree data tiles are defined via the coordinates of the center of their lower-left pixel (see 2.4). This characteristic leads to overlapping edges of adjacent tiles and to some artifacts when aggregating a tile to coarser resolutions: either all adjacent tiles have to be included in the aggregation process, or overlapping edges may have to be eliminated in the result. As the processing steps for the generation of HydroSHEDS are rather complex and aggregation (scaling) plays an important role, it was decided to shift the original SRTM data by 1.5 arc-seconds to the north and east, and to remove each tile’s overlapping right column and top row. This shift leads to a 3 arc-second HydroSHEDS tile having 1200 rows and 1200 columns at an extent of exactly 1-degree by 1-degree without overlaps to adjacent tiles. All other HydroSHEDS resolutions are based on the initial 3 arc-second data and, therefore, include this shift. With respect to deriving river networks, the effect of the shift on the accuracy of the data can be considered negligible, particularly when compared to the subsequently applied data manipulations as discussed below. Note, however, that the shift may lead to significant anomalies when directly comparing HydroSHEDS elevation data and original SRTM elevation data.
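The removal of the overlapping row and column can be sketched as follows. The sketch assumes an original tile of 1201 by 1201 samples, with row 0 as the northern edge and the last column as the eastern edge; the function name `trim_tile` is hypothetical. The 1.5 arc-second shift itself only changes the georeferencing of the tile and is therefore not visible in the array.

```python
def trim_tile(tile):
    """Remove the top row and the right column of a tile.

    Assumes row 0 is the northern edge and the last column the eastern edge.
    """
    return [row[:-1] for row in tile[1:]]


# Assumed original tile size: 1201 x 1201 samples (edges shared with neighbours).
original = [[0] * 1201 for _ in range(1201)]
trimmed = trim_tile(original)
print(len(trimmed), len(trimmed[0]))  # 1200 1200

# Tile extent after trimming, in degrees: 1200 cells of 3 arc-seconds each.
extent_deg = 1200 * 3 / 3600
print(extent_deg)  # 1.0

# Small labelled example showing which cells survive.
small = [["NW", "N", "NE"],
         ["W", "C", "E"],
         ["SW", "S", "SE"]]
print(trim_tile(small))  # [['W', 'C'], ['SW', 'S']]
```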
In its original release, SRTM data contains regions of no-data (voids), specifically over large water bodies, such as lakes and rivers, and in areas where radar-specific problems prevented the production of reliable elevation data. These areas include mountainous regions where the radar shadow effect is pronounced, such as the Himalayas and Andes, as well as certain land surfaces, such as bare sand or rock conditions as found in the Sahara Desert. The existence of no-data in the DEM causes significant problems for deriving hydrological products, which require continuous flow surfaces. Therefore, a void-filling procedure has been applied to provide a continuous DEM for HydroSHEDS.
Numerous methods have been developed for void-filling of SRTM data (see e.g., Gamache 2004a), but they rarely focus on specific hydrological requirements. For HydroSHEDS, two different void-filling algorithms have been applied in combination. The first has been developed by CIAT (see Jarvis 2004; in collaboration with R. Hijmans and A. Nelson). The second has been specifically developed for HydroSHEDS. While the CIAT algorithm delivers smooth interpolation surfaces, the HydroSHEDS algorithm focuses on low and flat water surfaces. Both methods and their combination are summarized below.
The CIAT algorithm fills the no-data voids by applying an interpolative technique. The original SRTM elevation data are used to produce contours at an interval of 10 meters. The contours are interpolated using the TOPOGRID algorithm in Arc/Info. TOPOGRID, based upon the established algorithms of Hutchinson (1988, 1989), is designed to use contour and point elevation data along with mapped hydrography to produce hydrologically sound DEMs. This method produces a smooth elevation surface within the no-data regions. While micro-scale topographic variation is likely to be underrepresented, most macro-scale features are captured well in small to intermediate sized voids. Jarvis et al. (2004) performed a detailed analysis of the accuracy of the interpolated elevation data for a region in Colombia and found little difference when compared to a cartographic DEM, particularly for hydrological applications. Gamache (2004b and personal communication) also analyzed the CIAT results and concluded that the void-filling algorithm is quite successful in representing broad scale patterns in topography. The void-filled elevation data (CIAT 2004) is available from the CGIAR-CSI SRTM 90m Database at http://srtm.csi.cgiar.org.
The HydroSHEDS algorithm fills the no-data voids by means of an iterative neighborhood analysis. The first step fills the outermost pixel-rim of a no-data void using a combination of a 3x3 minimum and a 5x5 mean filter (the minimum filter dominates the mean filter by a factor of 3:1). Then, the next pixel-rim is filled until the entire no-data void is processed. The no-data area is finally smoothed using a 9x9 mean filter. Particularly in the case of lakes and large river surfaces the emphasis of the minimum filter results in rather low elevation values inside the voids and a relatively flat relief as small peaks are successively filtered out.
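The rim-by-rim procedure can be approximated with the following sketch. It is not the original implementation: the exact weighting is assumed to be `(3 * minimum + mean) / 4`, filters are assumed to ignore cells that are still void, and the names `window` and `fill_voids` are hypothetical.

```python
def window(grid, r, c, radius):
    """Valid (non-void) values in the square window of the given radius around (r, c)."""
    rows, cols = len(grid), len(grid[0])
    return [grid[i][j]
            for i in range(max(0, r - radius), min(rows, r + radius + 1))
            for j in range(max(0, c - radius), min(cols, c + radius + 1))
            if grid[i][j] is not None]


def fill_voids(dem):
    """Fill None cells rim by rim, then smooth the filled area with a 9x9 mean."""
    grid = [row[:] for row in dem]
    void_cells = {(r, c) for r, row in enumerate(grid)
                  for c, v in enumerate(row) if v is None}
    remaining = set(void_cells)
    while remaining:
        # Outermost rim: void cells with at least one valid cell in their 3x3 window.
        rim = [(r, c) for (r, c) in remaining if window(grid, r, c, 1)]
        if not rim:
            return grid  # no valid data anywhere; nothing can be filled
        updates = {}
        for r, c in rim:
            low = min(window(grid, r, c, 1))      # 3x3 minimum filter
            values = window(grid, r, c, 2)        # 5x5 mean filter
            mean = sum(values) / len(values)
            updates[(r, c)] = (3 * low + mean) / 4  # assumed reading of "3:1"
        for (r, c), value in updates.items():
            grid[r][c] = value
        remaining -= set(rim)
    # Final smoothing of the former void area with a 9x9 mean filter.
    smoothed = [row[:] for row in grid]
    for r, c in void_cells:
        values = window(grid, r, c, 4)
        smoothed[r][c] = sum(values) / len(values)
    return smoothed


# Hypothetical 5x5 tile with a 3x3 void in the center.
dem = [[52, 51, 50, 51, 52],
       [51, None, None, None, 51],
       [50, None, None, None, 50],
       [51, None, None, None, 51],
       [52, 51, 50, 51, 52]]
filled = fill_voids(dem)
for row in filled:
    print([round(v, 1) for v in row])
```

Because the minimum filter carries three times the weight of the mean, each filled value lies closer to the lowest neighbouring elevation than to the local average, which reflects the flattening and lowering effect described above.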
The lowering effect of the HydroSHEDS algorithm for open water surfaces seems desirable for hydrological applications, as it tends to force the flow course to stay within river channels and lakes. In mountainous regions, however, the CIAT results are expected to better represent the general topography. To optimize results, both algorithms were combined. For each pixel the minimum value of either the CIAT or HydroSHEDS algorithm was used. If, however, the HydroSHEDS algorithm computed values more than 30 meters lower than CIAT, CIAT values minus 30 meters were used.
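The combination rule reduces to a single expression per pixel, shown here as a small sketch. The function and parameter names are hypothetical; the 30-meter limit is taken from the text.

```python
def combine_fill(ciat, hydrosheds, max_drop=30.0):
    """Use the lower of both fills, but never more than max_drop below CIAT."""
    return max(min(ciat, hydrosheds), ciat - max_drop)


print(combine_fill(120.0, 100.0))  # 100.0 (HydroSHEDS value is lower and within 30 m)
print(combine_fill(120.0, 80.0))   # 90.0  (limited to CIAT minus 30 m)
print(combine_fill(120.0, 130.0))  # 120.0 (CIAT value is lower)
```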
In some large no-data voids entire mountains are lost using either of the two filling methods. Therefore, starting at a distance of 0.03 degrees (approximately 3 km at the equator) from the rim of large voids, elevation values were inserted from GTOPO30, a global DEM at 30 arc-second (approximately 1 km at the equator) resolution (Gesch et al. 1999). To avoid cliff effects, the inserted values were smoothed or “feathered” in a 0.03-degree wide transition zone.
The filled voids were then merged into the initial HydroSHEDS elevation data to provide a continuous elevation surface with no void regions. The entire process was performed for each 1-degree by 1-degree tile with a 0.25-degree overlap to the eight adjacent tiles, thus ensuring seamless transitions of topography even in areas with large voids.
The final result of steps 3.1 and 3.2 is the HydroSHEDS gap-filled elevation model, termed HydroSHEDS_NoGap, at 3 arc-second (90 meter) resolution.
Typically, an original DEM will show a large number of sinks or depressions. These are single or multiple pixels which are entirely surrounded by higher elevation pixels. Some of these sinks are naturally occurring on the landscape, representing endorheic (inland) basins with no outlet to the ocean. In most cases, however, the sinks are considered spurious, often caused by random and mostly small deviations in the elevation surface. These anomalies occur even in high quality DEMs and high resolutions due to DEM production methods. The spurious sinks are critical problems in hydrological applications as they interrupt continuous flow across the DEM surface. Therefore, sinks are typically removed from the DEM before deriving a river network. Standard GIS procedures have been developed to remove spurious sinks, and a common approach is to raise the elevation values within the sinks until an outflow point is encountered. Natural sinks can be forced to remain in the DEM through “seeding”, e.g. by putting a no-data cell into their center.
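The common approach of raising sinks up to their outflow level, together with the seeding of natural sinks, can be sketched with a priority-queue based flooding scheme. This is a generic technique chosen for illustration, not necessarily the GIS procedure used for HydroSHEDS. Grid edges and seeded cells are assumed to act as outlets; the name `fill_sinks` is hypothetical.

```python
import heapq


def fill_sinks(dem, seeds=()):
    """Raise every depression to its spill level; seeded cells are kept as outlets."""
    rows, cols = len(dem), len(dem[0])
    filled = [row[:] for row in dem]
    visited = [[False] * cols for _ in range(rows)]
    heap = []

    def push(r, c):
        visited[r][c] = True
        heapq.heappush(heap, (filled[r][c], r, c))

    # Start from all outlets: the grid edge and any seeded natural sink.
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1) or (r, c) in seeds:
                push(r, c)

    # Always grow from the lowest known cell; neighbours cannot end up below it.
    while heap:
        elev, r, c = heapq.heappop(heap)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and not visited[nr][nc]:
                    filled[nr][nc] = max(filled[nr][nc], elev)
                    push(nr, nc)
    return filled


# Hypothetical elevations in meters; the interior cells form a depression.
dem = [[9, 9, 9, 9, 9],
       [9, 5, 8, 4, 9],
       [9, 9, 9, 9, 9]]
unseeded = fill_sinks(dem)
seeded = fill_sinks(dem, seeds={(1, 3)})
print(unseeded[1])  # [9, 9, 9, 9, 9]
print(seeded[1])    # [9, 8, 8, 4, 9]
```

Without a seed, the whole depression is raised to the level of its rim. With the seed at the lowest cell, that cell keeps its elevation and only the small upstream pit is raised to its own spill level.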
As for HydroSHEDS, the definition of natural vs. spurious sinks has been accomplished using a GIS-assisted manual process. All sinks of the void-filled elevation model were identified in a standard GIS procedure, and their maximum depth and extent were calculated. Sinks deeper than 10 meters and larger than 10 km² were highlighted as "potential" natural sinks. All regions of potential natural sinks were then inspected visually and were either seeded or rejected. The decision was based on information derived from DCW, ArcWorld, GLWD, and additional atlases and maps. For example, a mapped "salt lake" with no obvious river draining from it is considered a strong indication for an endorheic basin. The visual inspections were performed at a zoom to 1-degree by 1-degree windows, and several thousand naturally occurring sinks were identified globally.
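The screening step can be sketched by comparing a DEM with its sink-filled version. The sketch assumes that a filled DEM is already available, that connected groups of raised cells form one sink, and that a 3 arc-second cell covers roughly 0.0086 km² at the equator; all names are hypothetical. Only the two thresholds (10 meters, 10 km²) are taken from the text.

```python
from collections import deque

CELL_AREA_KM2 = 0.0086  # assumed area of a 3 arc-second cell at the equator


def sink_stats(dem, filled):
    """Return (max_depth_m, area_km2) for each connected group of raised cells."""
    rows, cols = len(dem), len(dem[0])
    seen, stats = set(), []
    for r in range(rows):
        for c in range(cols):
            if (r, c) in seen or filled[r][c] <= dem[r][c]:
                continue
            queue, depth, count = deque([(r, c)]), 0.0, 0
            seen.add((r, c))
            while queue:
                i, j = queue.popleft()
                count += 1
                depth = max(depth, filled[i][j] - dem[i][j])
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        ni, nj = i + di, j + dj
                        if (0 <= ni < rows and 0 <= nj < cols
                                and (ni, nj) not in seen
                                and filled[ni][nj] > dem[ni][nj]):
                            seen.add((ni, nj))
                            queue.append((ni, nj))
            stats.append((depth, count * CELL_AREA_KM2))
    return stats


def is_potential_natural_sink(depth_m, area_km2):
    """Thresholds from the text: deeper than 10 m and larger than 10 km2."""
    return depth_m > 10 and area_km2 > 10


dem = [[9, 9, 9, 9], [9, 5, 7, 9], [9, 9, 9, 9]]
filled = [[9, 9, 9, 9], [9, 9, 9, 9], [9, 9, 9, 9]]
stats = sink_stats(dem, filled)
print(stats)
print(is_potential_natural_sink(12, 15))  # True
print(is_potential_natural_sink(12, 5))   # False
```

Candidates that pass the thresholds would still require the visual inspection described above; the sketch only covers the automated highlighting.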
Obviously, the manual sink identification process is subjective, and in many cases the definition of natural sinks is difficult and ambiguous. Some depressions overflow periodically, following seasonal flooding cycles, others spill only occasionally. Some large, relatively dry areas may show numerous small depressions within a generally sloped surface, and flow paths are poorly developed if at all (e.g. the Argentinean Pampas and many desert areas). These depressions may or may not overflow in a rain event. In some areas of no obvious drainage only some “structural” sinks were placed at strategic locations. They do not terminate the flow at all single depressions but at a final one to indicate the endorheic character of the region. In karstic areas, rivers may disappear in surface depressions, yet they can be closely connected to a larger basin via underground pathways. In cases of large karstic depressions, sinks were introduced, as it seemed easier for a user to later remove the sinks and restore connectivity than to introduce them from scratch. Artificial sinks, however, like large pits in surface mining areas, were rejected.
Besides sinks, original DEMs show a series of other characteristics, artifacts and anomalies that can cause significant problems or errors in hydrological applications. Some types of problems that are typical for the SRTM elevation model are discussed in section 4. The most significant characteristic is likely the fact that the elevation values of SRTM, being a radar-derived product, are influenced by the vegetation cover. In areas of low relief, these small deviations from the true surface elevation can cause significant errors in the derived river courses and flow directions.
In order to improve the performance of a DEM for hydrological applications, a series of GIS processes and procedures exist and are routinely applied. Yet due to the individual characteristics of different DEMs and, on a global scale, due to the regional variations in the type of errors, no one method exists that addresses all possible problems. For HydroSHEDS, a sequence of hydrologic conditioning procedures has been implemented, either adapted from standard GIS functionality, newly developed, or customized. The general focus was to strike a compromise between forcing the DEM to produce correct river network topology, particularly for the largest of rivers, while preserving as much original SRTM information as possible. Note that in any case the conditioning process alters the original elevation data and may render it unusable for other applications.
The following hydrologic conditioning procedures have been applied for the HydroSHEDS elevation data:
All rivers and lakes as identified in SWBD were deepened by 10 meters in order to force the derived flow to stay within these objects. As no-data voids in the original SRTM elevation data may also indicate open water surfaces (see 3.2), all void areas were lowered by 10 meters as well. The 10-meter depth was chosen as it imparts a strong enough effect in flat areas (where the identification of river channels and lakes is particularly difficult), while producing only insignificant changes in areas with steeper slopes (where no-data voids are probably caused by radar shadow rather than open water).
In the radar-derived elevation model, mangrove or coastal vegetation belts may be interpreted as a low but continuous embankment blocking any direct outflow to the ocean. In the derived river network model, these barriers can cause significant backwater effects. To reduce this effect, the coastal zone, i.e. a 0.02-degree wide buffer (approximately 2 km at the equator) along the ocean shoreline, was “weeded” by lowering roughly every third cell, selected at random, by 5 meters. This subtle change, in combination with the following filters, forces occasional breakthroughs into the slightly elevated coastal embankments.
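The weeding step can be sketched as a random lowering of cells inside a coastal mask. The sketch assumes that the coastal buffer is already available as a boolean mask and that each cell in the buffer is selected independently with a probability of one third; the function name and the fixed random seed are hypothetical.

```python
import random


def weed_coast(dem, coastal_mask, drop=5, fraction=1 / 3, seed=0):
    """Lower a random share of the cells inside the coastal buffer by `drop` meters."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    out = [row[:] for row in dem]
    for r, mask_row in enumerate(coastal_mask):
        for c, in_buffer in enumerate(mask_row):
            if in_buffer and rng.random() < fraction:
                out[r][c] -= drop
    return out


# Hypothetical elevations in meters; the first two rows lie in the coastal buffer.
dem = [[3, 3, 3, 3], [4, 4, 4, 4], [20, 20, 20, 20]]
coastal_mask = [[True] * 4, [True] * 4, [False] * 4]
weeded = weed_coast(dem, coastal_mask)
for row in weeded:
    print(row)
```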
The most extensive conditioning process in the generation of HydroSHEDS has been the so-called “stream burning” procedure. Stream burning is an often-used process to enforce known river courses into an elevation surface. The elevation values along the rivers, as depicted e.g. in an existing vector layer, are lowered by a certain value, thus “burning” deep gorges into the elevation surface. The burning can be extended to include a buffer around the river lines in order to shape a smoother transition between the original surface and the gorge. For HydroSHEDS, only large rivers and lakes were burned into the elevation surface in order to avoid excessive alterations of the SRTM surface. All perennial and intermittent rivers and lakes of ArcWorld, as well as all rivers and lakes of GLWD were used, while the higher resolution but unclassified DCW data was omitted. Since the accuracy of the existing global maps is unknown, attempts were made to minimize the impact of these datasets on the SRTM data. After multiple tests, the burning depth for rivers was set to 12 meters, with a buffer of 0.005 degrees (approximately 500 meters at the equator) around the river courses. The burning depth was reduced, in a stepwise manner, from 12 meters at the thalweg to 2 meters at the edge of the buffer. Lakes were burned with a depth of 14 meters and a buffer distance of 0.0025 degrees. The parameter setting aimed for a noticeable forcing of the main rivers in flat areas, where otherwise the correct delineation of rivers is difficult. In steep regions, the small burning depth results in rather insignificant changes of the elevation surface, hence the SRTM data remains the dominant information for deriving drainage directions.
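The stepwise reduction of the river burning depth can be illustrated with a short sketch. It assumes equal steps per cell across the buffer, which at 3 arc-second resolution is 6 cells wide (0.005 degrees divided by 3 arc-seconds); the actual step sizes are not stated in the text, and the names are hypothetical.

```python
BUFFER_CELLS = 6  # 0.005 degrees / 3 arc-seconds per cell


def river_burn_depth(distance_cells, centre_depth=12.0, edge_depth=2.0):
    """Burning depth in meters at a given cell distance from the river line."""
    if distance_cells > BUFFER_CELLS:
        return 0.0  # outside the buffer the surface is left unchanged
    # Assumption: equal steps from 12 m at the thalweg to 2 m at the buffer edge.
    return centre_depth - (centre_depth - edge_depth) * distance_cells / BUFFER_CELLS


# Hypothetical flat cross-section at 100 m with the river in column 7.
river_col = 7
cross_section = [100.0] * 15
burned = [z - river_burn_depth(abs(i - river_col))
          for i, z in enumerate(cross_section)]
print(burned[0], burned[1], burned[4], burned[7])  # 100.0 98.0 93.0 88.0
```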
The entire elevation surface was then filtered by applying a directional 3x3 neighborhood analysis. The elevation values of all possible straight and obtuse angle flow paths in a 3x3 kernel were averaged and the minimum value was assigned to the center cell. This filter aims to remove remaining spikes and wells while preserving and enforcing linear river courses and valley bottoms. In particular, single pixels that can block a continuous flow path are removed.
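One possible reading of this directional filter is sketched below. It assumes that a flow path consists of one neighbour, the center cell, and a second neighbour, that the two neighbours enclose an angle of 135 or 180 degrees at the center, and that edge cells are left unchanged. Whether the center cell is part of the average is not stated in the text, so this detail is an assumption.

```python
# The eight neighbours in clockwise order, starting in the north (row, column offsets).
NEIGHBOURS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]
# Neighbour pairs separated by 3, 4 or 5 steps of 45 degrees: obtuse or straight paths.
PATHS = [(i, j) for i in range(8) for j in range(i + 1, 8) if j - i in (3, 4, 5)]


def directional_filter(dem):
    """Assign to each interior cell the lowest mean found along any path through it."""
    rows, cols = len(dem), len(dem[0])
    out = [row[:] for row in dem]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            means = []
            for i, j in PATHS:
                a = dem[r + NEIGHBOURS[i][0]][c + NEIGHBOURS[i][1]]
                b = dem[r + NEIGHBOURS[j][0]][c + NEIGHBOURS[j][1]]
                means.append((a + dem[r][c] + b) / 3)
            out[r][c] = min(means)
    return out


# Hypothetical west-east valley (5 m) blocked by a single higher cell (8 m).
dem = [[10, 10, 10],
       [5, 8, 5],
       [10, 10, 10]]
filtered = directional_filter(dem)
print(len(PATHS))      # 12
print(filtered[1][1])  # 6.0
```

In the example, the blocking cell is lowered towards the valley level because the straight west-east path yields the lowest mean.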
Next, valley courses were depicted through a neighborhood terrain analysis and were deepened by 3 meters. The valleys were identified through a 5x5 kernel median analysis combined with a grid-thinning algorithm to detect linear features. This procedure of valley “molding” has been specifically developed to improve river delineations in tropical lowland areas by removing small obstacles in shallow valleys. Due to the small deepening of 3 meters, no significant changes occur in areas with stronger relief.
In a standard process, all spurious sinks in the elevation surface were filled. Natural sinks were seeded in order to exclude them from removal (see 3.3).
After sink filling, a river map was produced from the conditioned elevation surface. All main river courses, defined as rivers with an upstream catchment area of more than 1000 cells (approximately 8 km² at the equator), were depicted. The main rivers were then projected onto the initial HydroSHEDS elevation model, and all elevation rises along the rivers when moving downstream were identified. These rising reaches in the original elevation surface, which have obviously been removed through filtering or sink-filling in the conditioning process, may represent dams, bridges, embankments of any kind, or narrow gorges that block the flow path. In many of these cases, the sink-filling effect (i.e. the lifting and implicit flattening of the dammed area) may not be desirable as any existing relief information within the filled area is lost. To minimize this effect, a second conditioning iteration was performed: first, all rising reaches along the main river courses were leveled out in the initial elevation data by appropriately lowering their respective heights, thus effectively “carving” through the barriers. After this process, all other conditioning steps (3.4.1 to 3.4.6) were repeated.
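The carving of rising reaches can be sketched for a single river profile. The sketch assumes that "leveling out" means lowering each rising cell to the lowest elevation encountered upstream (a running minimum); the function name and the sample profile are hypothetical.

```python
def carve_profile(elevations):
    """Lower rising cells so that the profile never increases in downstream direction."""
    carved, lowest = [], float("inf")
    for z in elevations:
        lowest = min(lowest, z)
        carved.append(lowest)
    return carved


# Hypothetical river profile in meters, ordered from upstream to downstream.
profile = [120, 118, 121, 119, 115, 116, 110]
print(carve_profile(profile))  # [120, 118, 118, 118, 115, 115, 110]
```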
During the entire conditioning process, hard- and software limitations were reached due to the very large data size at 3 arc-second resolution. All steps have therefore been performed on a tile-by-tile basis, with extents between 1-degree by 1-degree and 5-degree by 5-degree. In order to avoid edge effects, appropriate overlaps to the adjacent tiles were added. In particular the sink-filling algorithm proved highly susceptible to tile sizes and edge effects and had to be implemented in an iterative approach. The processing was performed with an overlap of up to 5 degrees (approximately 500 km at the equator) to adjacent tiles to ensure seamless results without edge effects.
The result of section 3.4 is a hydrologically conditioned elevation surface at 3 arc-second resolution. From this elevation surface, a new river network was derived and used for error checking. Because computation of the river network at 3 arc-second resolution is very time intensive, the data was first upscaled to 15 arc-second resolution (approximately 500 meters at the equator; for upscaling see 3.6 below). The derived river network was then compared visually to the rivers of DCW, ArcWorld, and various atlases and paper maps.
Errors occurred particularly in flat areas with varying vegetation cover (see section 4), such as floodplains and coastal zones. If the actual rivers could be visually detected in the raw elevation data, their courses were traced or adopted from the existing DCW river layer. These rivers were then added to the stream layer used in the river burning procedure of 3.4. In some areas, the given elevation values significantly misrepresented the actual flow conditions (e.g. blocked pathways due to narrow gorges, or inadequate filling of the no-data voids of the original data). In these cases, the burning depth was individually adjusted. Some other topological problems (e.g. diversions into canals or multiple spillways of reservoirs) were treated in a similar manner through introduction and adjustment of main pathways. Actual flow channels of braided rivers and large river deltas could not be topologically resolved due to the constraint of allowing only one drainage direction per cell (the single flow direction algorithm does not allow for river bifurcations). These zones have only been “cleaned” to represent the main channel properly.
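The single flow direction constraint mentioned above can be illustrated with a minimal steepest-descent sketch, in which each cell drains to exactly one of its eight neighbours. The power-of-two direction codes follow a convention common in GIS software and are an assumption here; they are not specified in this section. The function name is hypothetical.

```python
import math

# (row offset, column offset, direction code); codes are an assumed GIS convention:
# 1 = east, 2 = southeast, 4 = south, 8 = southwest,
# 16 = west, 32 = northwest, 64 = north, 128 = northeast.
D8 = [(0, 1, 1), (1, 1, 2), (1, 0, 4), (1, -1, 8),
      (0, -1, 16), (-1, -1, 32), (-1, 0, 64), (-1, 1, 128)]


def d8_direction(dem, r, c):
    """Code of the steepest downhill neighbour, or 0 if the cell has none (a sink)."""
    best_code, best_slope = 0, 0.0
    for dr, dc, code in D8:
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(dem) and 0 <= nc < len(dem[0]):
            # Diagonal neighbours are farther away, so their drop is divided by sqrt(2).
            slope = (dem[r][c] - dem[nr][nc]) / math.hypot(dr, dc)
            if slope > best_slope:
                best_code, best_slope = code, slope
    return best_code


dem = [[9, 9, 9],
       [9, 5, 4],
       [9, 3, 9]]
print(d8_direction(dem, 1, 1))  # 4 (south: a drop of 2 m beats the 1 m drop to the east)
print(d8_direction(dem, 2, 1))  # 0 (no lower neighbour)
```

Because only one code is stored per cell, a channel that splits into two branches cannot be represented, which is why braided rivers and deltas are reduced to their main channel.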
After detecting the errors and preparing the corresponding correction data, all steps of 3.4 were repeated. In some areas, several iterations of manual corrections were performed. As with the sink identification process, the manual correction process is highly subjective. The visual inspections were performed at a zoom to one-degree by one-degree windows, and corrections were applied for several thousand locations globally.
The final results of steps 3.4 and 3.5 are (1) the HydroSHEDS hydrologically conditioned elevation model (CON), and (2) the HydroSHEDS drainage direction map (DIR) at 3 arc-second resolution.
All procedures described in sections 3.1 to 3.5 were performed at 3 arc-second resolution. Yet for many applications, in particular continental or global assessments, coarser resolutions are desirable as they may significantly reduce calculation times while providing acceptable accuracy. HydroSHEDS therefore delivers various resolutions, from 3 arc-seconds to 5 minutes. The coarser resolutions are all derived from the 3 arc-second data through upscaling.
Upscaling drainage directions is not a straightforward process, as typical aggregation methods, such as averaging of neighborhood kernels, are not appropriate for directional values. A frequently applied upscaling method is to first upscale the elevation data, and then derive a new drainage direction map from this coarser DEM. This method is generally fast and easy to perform, but it often delivers low-quality results with respect to river network topology, due to the loss of significant information in the aggregation process. An alternative option is to first derive the river network at high resolution, and then to upscale this network. This option preserves the network information, which is most important for hydrological applications. However, it requires complex procedures, which are difficult to realize at a global scale and for the desired high resolutions. As a compromise, a combined method has been developed and applied for generating HydroSHEDS. The main steps in the upscaling process are as follows:
1. The void-filled DEM is upscaled from the original 3 arc-second to the desired resolution. For this process, an algorithm was applied that calculates both the mean and minimum value found within the aggregation kernel and then takes the average. The minimum value is included in the calculation to emphasize valleys. Natural sinks were preserved in the upscaling process.
2. A network of main rivers was calculated at 3 arc-second resolution. Main rivers were defined as those having an upstream catchment area of more than 1000 cells (approximately 8 km² at the equator). The river network was derived for five-degree by five-degree tiles with a one-degree overlap to adjacent tiles to avoid edge effects.
3. The main rivers were then burned into the upscaled elevation surface. The burning depth was defined as the sum of a constant (500 meters) and a value dependent on the size of the respective river reach (0-400 meters, proportional to the logarithm of upstream cells). The relatively large burning depth assured that the river channels were preserved in the new elevation surface. No buffering was applied.
4. Sinks were filled in the upscaled and burned elevation surface, and finally new drainage directions were calculated. Note that due to the strong burning, the elevation surface does not represent natural conditions any more. It is appropriate only for deriving drainage directions. To avoid confusion with true DEMs, the upscaled elevation surface is not offered as a standard HydroSHEDS product.
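Steps 1 and 3 above can be sketched as two small functions. The aggregation follows the description directly (average of kernel mean and kernel minimum). For the burning depth, the text only states a range of 0 to 400 meters proportional to the logarithm of upstream cells; the scaling bounds `min_cells` and `max_cells` are therefore assumptions, as are all names.

```python
import math


def aggregate(dem, factor):
    """Upscale by `factor`: average of the mean and the minimum of each kernel."""
    rows, cols = len(dem) // factor, len(dem[0]) // factor
    out = []
    for big_r in range(rows):
        out_row = []
        for big_c in range(cols):
            block = [dem[r][c]
                     for r in range(big_r * factor, (big_r + 1) * factor)
                     for c in range(big_c * factor, (big_c + 1) * factor)]
            out_row.append((sum(block) / len(block) + min(block)) / 2)
        out.append(out_row)
    return out


def upscale_burn_depth(upstream_cells, min_cells=1000, max_cells=10**9):
    """500 m constant plus 0-400 m growing with the logarithm of upstream cells."""
    # Assumption: linear scaling in log space between min_cells and max_cells.
    span = math.log(max_cells) - math.log(min_cells)
    fraction = (math.log(upstream_cells) - math.log(min_cells)) / span
    return 500.0 + 400.0 * min(1.0, max(0.0, fraction))


# Hypothetical 2x4 grid aggregated with a 2x2 kernel (3 to 15 arc-seconds would use 5).
dem = [[10, 12, 20, 20],
       [14, 8, 20, 16]]
coarse = aggregate(dem, 2)
print(coarse)  # [[9.5, 17.5]]
print(upscale_burn_depth(1000))  # 500.0
```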
The upscaling process delivers a new drainage direction map (DIR) from which a new river network can be derived. Due to the applied stream burning, all main rivers (as defined in the upscaling process) should be in very good alignment with the original river network. Only where two nearby rivers drain through the same or adjacent upscaled cells may they be incorrectly merged into one flow channel. Smaller rivers, for which no burning occurs, are based solely on the upscaled elevation surface. Their quality may thus differ from the river network at 3 arc-second resolution.
The final results of step 3.6 are upscaled HydroSHEDS drainage direction maps (DIR) at resolutions of 15 arc-second and 30 arc-second. Also, a 5-minute product is in preparation.
Ancillary HydroSHEDS products can be derived from the individual drainage direction maps at their respective resolutions. These products include flow accumulations, flow distances, river networks, and watershed boundaries. A list of available HydroSHEDS data sets is provided in section 5.