Multicore Computational Center (MC2)

Brown Bag Seminar



Date:      March 24, 2010
Time:      12 PM
Location:  ITE 325
Speaker:   Curt Tilmes
Title:     Data Provenance Management for Earth Science Reproducibility


Abstract:

 A fundamental aspect of all science is reproducibility.  In the past
few decades, Earth Science has been increasingly based on remote
sensing instruments (on aircraft, satellites, ocean buoys, etc.) that
have produced tremendous volumes of data.  There is often a long chain
of complex processing steps that ultimately leads to published science.
Understanding the processing chain and maintaining the scientific
reproducibility of results is a major challenge.

 We are constructing a model of scientific data processing that captures
and maintains the provenance of all of the artifacts of processing.
These include the data transformation algorithms and all data in the
system, both inputs from external sources and data produced within the
system.  Other artifacts include the hardware and software of the
processing framework, the source instruments and satellites,
scientific literature and documentation, and people and
organizations.  The origin of any data or algorithm is recorded, and
the entire history of each processing chain is stored so that a
researcher can understand the entire data flow.  Provenance is
captured in a form suitable for the system to provide basic scientific
reproducibility of any data product it distributes, even in cases where
the physical data products themselves have been deleted due to space
constraints.
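
 As a rough illustration of the idea in the paragraph above, the following
Python sketch records, for each data product, the algorithm and inputs
that produced it, so that a product whose physical data has been deleted
can be regenerated by replaying its processing chain.  It is not the
speaker's system: the names Product, ProvenanceStore, calibrate, and
average, and the product ids, are hypothetical and chosen only for this
example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Product:
    """A data product plus the provenance needed to rebuild it (hypothetical model)."""
    product_id: str
    algorithm: Optional[str] = None   # recorded algorithm name; None for external inputs
    inputs: Tuple[str, ...] = ()      # ids of the products this one was derived from
    source: Optional[str] = None      # origin of an external input, e.g. an instrument
    data: Optional[list] = None       # physical data; may be deleted to save space


class ProvenanceStore:
    def __init__(self) -> None:
        self.products: Dict[str, Product] = {}
        self.algorithms: Dict[str, Callable[..., list]] = {}

    def register_algorithm(self, name: str, func: Callable[..., list]) -> None:
        self.algorithms[name] = func

    def add_external(self, product_id: str, source: str, data: list) -> None:
        # External inputs cannot be regenerated, so only their origin is recorded.
        self.products[product_id] = Product(product_id, source=source, data=data)

    def derive(self, product_id: str, algorithm: str, inputs: List[str]) -> list:
        # Run the algorithm and record which algorithm and inputs produced the result.
        data = self.algorithms[algorithm](*[self.get(i) for i in inputs])
        self.products[product_id] = Product(product_id, algorithm, tuple(inputs), data=data)
        return data

    def delete_data(self, product_id: str) -> None:
        # Drop the physical data but keep the provenance record.
        self.products[product_id].data = None

    def get(self, product_id: str) -> list:
        # Return the data, replaying the recorded chain if the data was deleted.
        product = self.products[product_id]
        if product.data is None:
            if product.algorithm is None:
                raise LookupError(f"external input {product_id!r} is no longer available")
            args = [self.get(i) for i in product.inputs]
            product.data = self.algorithms[product.algorithm](*args)
        return product.data

    def lineage(self, product_id: str) -> List[str]:
        # List product ids from the original sources down to the requested product.
        order: List[str] = []

        def visit(pid: str) -> None:
            for parent in self.products[pid].inputs:
                visit(parent)
            if pid not in order:
                order.append(pid)

        visit(product_id)
        return order


def calibrate(raw: list) -> list:
    return [2 * x for x in raw]


def average(values: list) -> list:
    return [sum(values) / len(values)]


store = ProvenanceStore()
store.register_algorithm("calibrate", calibrate)
store.register_algorithm("average", average)
store.add_external("raw", source="hypothetical satellite sensor", data=[1, 2, 3])
store.derive("level1", "calibrate", ["raw"])
store.derive("level2", "average", ["level1"])

# Delete the derived data; the recorded provenance is enough to rebuild it.
store.delete_data("level1")
store.delete_data("level2")
rebuilt = store.get("level2")
```

 In this sketch only derived products can be rebuilt; external inputs must
be retained or re-fetched from their recorded source.  A real system would
also need to record algorithm versions and the hardware and software
environment, as the abstract notes.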

