American Statistical Association (ASA)
Last Modified: 2003-Feb-16
NOTE: Downloads of some of the Presentation material are now available here.
A three-day conference entitled "Spatial Statistics: Integrating Statistics, GIS, and Statistical Graphics," to be held October 17-19, 2002, in Seattle, Washington, is being organized by the Statistics and Environment Section of the American Statistical Association (ASA) and the National Research Center for Statistics and the Environment. There will be a one-day short course on October 17th. A workshop will begin Friday, October 18th, and extend until noon, Saturday, October 19th. Papers will be given on recent advances in the analysis and display of environmental spatial data.
Short Course, Thursday, October 17th:
Integrating Geostatistics and GIS by Jay Ver Hoef and Konstantin Krivoruchko
Note: The short course has filled up and registrations are closed as of September 16.
This course will consist of a morning lecture and a PC lab in the afternoon. All participants will receive a CD with data sets and Powerpoint presentations.
The morning will consist of an introduction to the ideas of geostatistics. The fundamentals of geostatistics will be demonstrated using the Geostatistical Analyst (GA), which is an extension to ArcInfo/ArcMap. Topics to be covered include,
The afternoon will consist of a "hands-on" lab. Participants are encouraged to bring their own data sets. Data sets should consist of DBF or ASCII files with a minimum of 3 columns: x-coordinate, y-coordinate, data value. Those familiar with ArcInfo/ArcMap can bring shape files of their data. A variety of data sets will also be provided. There will be 6 instructors to assist with the lab:
The instructors will work with the participants and answer questions as they use the geostatistical methods to create maps from their data.
Workshop, Friday, October 18th, to Saturday, October 19th:
Spatial Statistics: Integrating Statistics, GIS, and Statistical Graphics
A series of fourteen 30-minute presentations will be given. Most titles and abstracts are given below as well as in the attached Word file. Ample time will be provided for a thorough discussion of the topics.
The registration fee for the workshop includes all sessions on Friday, October 18 and Saturday, October 19, workshop materials and refreshment breaks.
Student Workshop Fees, Friday, October 18, 2002-Saturday, October 19, 2002
Student registration fees for the workshop include all sessions on Friday, October 18 and Saturday, October 19, workshop materials and refreshment breaks. Students must register by fax or mail and include a supporting letter from their faculty supervisor.
You may register on-line at http://www.engr.washington.edu/~uw-epp/gis/reginfo.html. This site also has hotel information. Please note that the deadline for making hotel reservations at the room block rate is September 16th. For questions on the technical program, please contact Linda J. Young by e-mail at LJYoung@unl.edu or telephone at (402)483-2392.
Schedule for GIS Workshop, October 18th and 19th:
Friday, October 18:
8:15 a.m. Welcome and Announcements
8:30 a.m. Spatial Statistics and GIS
GIS and Spatial Statistics: One World View or Two?
Michael F. Goodchild
University of California, Santa Barbara
GIS began in the mid 1960s as a computer application performing operations on geographic information that were too tedious, inaccurate, or expensive to do by hand. Since then it has evolved into an integrated software application supporting a variety of representations of phenomena on the Earth's surface, together with tools to perform virtually any conceivable operation on such representations, along with the means to visualize the data and results of analysis. Most recently, much emphasis has been placed on the value of GIS as a medium for communicating geographic information. The GIS software industry is driven by a range of applications, some of which place more emphasis on analytic tools than others. The geography-as-continuum emphasis of GIS can be contrasted with the geography-as-attribute emphasis, and I review examples where the distinction is particularly clear. The two world views are converging, aided by developments in technology and extensive dialog between the respective communities.
Spatial Statistics in the Presence of Location Error
Noel Cressie, Director, Program in Spatial Statistics and Environmental Science
Department of Statistics, The Ohio State University, Columbus OH 43210
Geographic information scientists are well aware that spatial databases contain both attribute error and location error, but spatial statistics has tended to concentrate on attribute error and ignore location error. This talk considers methods for adjusting spatial inference in the presence of data-location error, particularly for data that have a continuous spatial index (i.e., geostatistical data). Classical, empirical Bayesian, and Bayesian techniques are presented. This research is joint with John Kornak and John Gabrosek.
9:50 a.m. Break
10:20 a.m. Environmental Applications
Spatial Statistical Analysis of Georeferenced Environmental Data Utilizing GIS: Three Case Studies
Daniel A. Griffith
Department of Geography, Syracuse University
This presentation emphasizes interfaces between statistics and GIS with: (1) environmental applications involving auto-normal, auto-Poisson, and auto-logistic models; (2) an integration of spatial statistics (geostatistics and spatial autoregression), GIS, and statistical graphics; and (3) visualization of georeferenced data. The first case study is of Arsenic (As) contamination across the Murray superfund site and neighboring residential areas, using the auto-normal model. The second case study is of counts of cholera deaths recorded by Snow in the Broad Street pump neighborhood and vicinity of London, using the auto-Poisson model. The third case study is of the percentage of sample composite silt clay content in the Chesapeake Bay recorded for the USEPA EMAP pilot project, using the auto-binomial model. Software illustrated in this presentation includes SAS and ESRI's ArcGIS, employing the Geostatistical Analyst, and ArcView, employing the Map Stat and Thiessen Polygon scripts. Some of the statistical graphics were constructed using MINITAB, while some of the numerically intensive computations were completed using FORTRAN code.
Using GIS to Improve Analysis Weights for Environmental Surveys
Sarah M. Nusser
Department of Statistics, Iowa State University
To estimate population totals using sample survey data, an analysis weight or expansion factor is constructed for each sample unit. The weight is the number of population units represented by the data associated with the sampled unit. As part of the weighting process, the population is often partitioned into groups called post-strata. External control information for post-strata, such as Census Bureau figures on the number of households in a geographic area, is incorporated in the analysis weights to improve the precision of estimates. For environmental surveys, the study region may be post-stratified into political units, watersheds, or other types of geographic areas, and surface areas for the post-strata are used as control totals in the weighting process. GIS data can be used to improve post-stratification weights by creating a tighter link between the spatial distribution of control variables and the weights assigned to sample units. We discuss methods used in the National Resources Inventory (NRI) to adjust weights using GIS information on federal land parcels and large water bodies within polygons defined by the intersection of counties and 4-digit hydrologic units. GIS data are used to create imputed points that represent change observed in area segments and post-strata, and to assign surface areas for specific federal and water polygons to associated sample and imputed points during the weighting process. The goal is to improve the precision of estimates, especially for changes in land cover/use for smaller regions.
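As a rough illustration of the post-stratification step described above, the following Python sketch rescales base weights so that they sum to a known control total (for example, a surface area) within each post-stratum. The function name, the sample data, and the simple ratio adjustment are assumptions made for illustration; this is not the NRI weighting procedure, which involves imputed points and GIS-derived polygon areas.

```python
from collections import defaultdict


def poststratify(base_weights, strata, control_totals):
    """Scale base weights so they sum to the control total in each post-stratum.

    base_weights   : list of initial expansion factors, one per sample unit
    strata         : list of post-stratum labels, one per sample unit
    control_totals : dict mapping each post-stratum label to its known total
    """
    # Sum the base weights within each post-stratum.
    weight_sums = defaultdict(float)
    for w, s in zip(base_weights, strata):
        weight_sums[s] += w

    # Ratio adjustment: multiply each weight by (control total / weighted sum).
    return [w * control_totals[s] / weight_sums[s]
            for w, s in zip(base_weights, strata)]


# Hypothetical sample: five units in two post-strata, "A" and "B".
base_weights = [100.0, 100.0, 200.0, 50.0, 50.0]
strata = ["A", "A", "A", "B", "B"]
# Hypothetical control totals, e.g. surface areas taken from a GIS layer.
control_totals = {"A": 480.0, "B": 90.0}

adjusted = poststratify(base_weights, strata, control_totals)
print(adjusted)
```

After the adjustment, the weights in each post-stratum add up to that post-stratum's control total, which is the property that ties the estimates to the external information.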
Use of GIS in Wildlife Observational Studies
Lyman L. McDonald, Senior Biometrician
WEST, Inc., 2003 Central Avenue, Cheyenne, Wyoming 82001
I will review applications that WEST, Inc. has made of data from Geographical Information Systems (GIS) in our consulting and contract work for government and industry. These statistical analyses and modeling exercises were conducted to help meet the objectives of: understanding habitat and other resource selection by animals, prediction of population sizes and wildlife habitat under different land management strategies, determining sampling strategies to best design long term monitoring projects, and study of statistical properties of analysis procedures in the face of essentially infinite sample sizes from GIS. Specific applications will include: habitat selection by birds and moose on the Innoko Wildlife Reserve in Alaska, ice habitat selection by polar bear in the Beaufort Sea, roost site selection by whooping crane on the Platte River in Nebraska, and design of long term biological monitoring plans.
12:20 p.m. Lunch
1:45 p.m. Visualization
New Graphics For Geospatially-Indexed Statistics: Interactive Linked Micromap Plots And Dynamically Conditioned Choropleth Maps
Daniel B. Carr
George Mason University
This talk introduces interactive extensions to two recently developed templates for displaying geospatially-indexed estimates. The primary purpose of the first template, linked micromap plots, is to communicate statistical summaries. The template links small generalized maps with statistical panels that describe regions. Research centered at the National Cancer Institute addressed the task of communicating state and county cancer statistics and tailored this template to show estimates, confidence intervals, and Healthy People 2010 target values. The research also integrated the following interactive options into a Java applet: variable selection; sorting; fixed header scrolling; mouse tips; a popup containing a three-class map (below, chosen region, and above) with a color-linked cumulative distribution; and drill down. Substantial usability tests led to refinements. The tested interactive template should be of interest to numerous agencies and institutions wanting to communicate geospatially-indexed statistics.
The second template, called conditioned choropleth maps, is available as Java shareware called CCmaps. The purpose of CCmaps is to improve hypothesis generation about spatial patterns of a dependent variable appearing in a classed choropleth map. CCmaps uses a partitioning slider to dynamically define the class boundaries for the dependent variable. The application uses two similar sliders to partition regions into a 3 x 3 layout of maps based on values of two related variables. This design promotes comparison of distributions across strata and study of spatial variation restricted to regions with similar values for the two related variables. Dynamically-updated population-weighted means and a 3 x 3 layout of dynamic QQplots facilitate distribution comparisons. Examples focus on health and environmental studies.
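The 3 x 3 conditioning idea can be sketched in a few lines of Python. The sketch below is not CCmaps code; the function names, the cut points (standing in for slider positions), and the region data are hypothetical. Each region is assigned to one of nine panels according to whether each of its two conditioning variables falls in the low, middle, or high range.

```python
from bisect import bisect_right


def panel_index(value, cuts):
    """Return 0 (low), 1 (middle), or 2 (high) given two sorted cut points."""
    return bisect_right(cuts, value)


def condition_regions(regions, cuts_x, cuts_y):
    """Assign each region to a cell of a 3 x 3 layout.

    regions : iterable of (name, x, y) with two conditioning variables
    cuts_x  : two sorted cut points for the column variable
    cuts_y  : two sorted cut points for the row variable
    """
    panels = {(row, col): [] for row in range(3) for col in range(3)}
    for name, x, y in regions:
        panels[(panel_index(y, cuts_y), panel_index(x, cuts_x))].append(name)
    return panels


# Hypothetical regions with two conditioning variables each.
regions = [
    ("R1", 5.0, 0.2),
    ("R2", 15.0, 0.2),
    ("R3", 25.0, 0.9),
    ("R4", 12.0, 0.5),
]
panels = condition_regions(regions, cuts_x=[10.0, 20.0], cuts_y=[0.3, 0.7])

for cell, names in sorted(panels.items()):
    if names:
        print(cell, names)
```

Moving a slider corresponds to changing a cut point and recomputing the assignment, after which each panel's map would be redrawn with only its own regions.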
From ArcView/XGobi to R/GGobi: Recent Developments in Exploratory Spatial Data Analysis
Juergen Symanzik1, Deborah F. Swayne2, Duncan Temple Lang3, Dianne Cook4
1Utah State University; 2AT&T Labs - Research; 3Bell Labs, Lucent Technologies; 4Iowa State University
In this talk, we will present the recent evolution from the linked ArcView/XGobi software environment to R/GGobi for exploratory spatial data analysis (ESDA). In previous work, we extended the ArcView GIS by linking it to XGobi, general purpose interactive graphics software for multivariate data. More recently, a new pairing has emerged: the statistics environment R can be extended by integrating it with GGobi, an updated version of XGobi.
We will describe the goals of the ArcView/XGobi link, and discuss its capabilities and limitations, and then demonstrate how some of those limitations can be overcome in the new integrated environment. A major part of this talk will be a demonstration of the R/GGobi capabilities related to ESDA. R and GGobi can be downloaded for free from http://www.r-project.org/ and http://ggobi.org/, respectively.
3:05 p.m. Break
3:35 p.m. Spatial Sampling Design
Spatial Survey Designs for Aquatic Resources
Anthony R. Olsen1, Denis White1, Richard Remington1, Don L. Stevens, Jr.2, Barbara Rosenbaum3, and David Cassell4
1USEPA NHEERL Western Ecology Division, Corvallis, OR; 2Department of Statistics, Oregon State University; 3INDUS, Corvallis, OR; 4CSC, Corvallis, OR
Federal and state agencies have an interest in monitoring the water quality and biological condition of all the aquatic resources within their jurisdictions. Given the impossibility of actually sampling all the aquatic resources, the agencies must have some process for selecting a set of sites to monitor and an inferential process for generalizing from this set of sites to the entire aquatic resource. During the past ten years, a number of agencies have chosen to use statistical survey designs as the basis for sampling. The objective of this presentation is to describe a class of spatially-balanced survey designs that have been successfully applied in a number of monitoring programs for lakes, streams, and estuaries across the United States. Geographic information system (GIS) coverages are an integral component in monitoring aquatic resources. Although GIS coverages of lakes, streams and estuaries are not perfect, they are sufficiently accurate to be used as a spatial sample frame. The sample frames are characterized as GIS coverages of points, linear networks, and areas, each requiring a different survey design approach. One desirable characteristic for a spatial survey design is to have every realization of a design be spatially-balanced. Spatially-balanced means that every replication of the sample exhibits a spatial density pattern that closely mimics the spatial density pattern of the resource. Typically, the GIS coverages result in sample frames in which many "small" portions of the resource dominate a few "large" portions of the resource. For example, typically, 60% of stream length in a state is associated with headwater streams while major rivers contribute less than 10% of the length. Also some regions of a state have a greater spatial density of streams than other regions. A simple spatially-balanced sample would reflect these variations in spatial density pattern.
The generalized random tessellation stratified (GRTS) survey design results in spatially-balanced samples while allowing for unequal probability of selection, stratification, and frame imperfections. GRTS survey design procedures have been implemented using a combination of a C-program, ArcInfo, and SAS. A new implementation is being developed and will be available as an R software library.
Evaluating and designing environmental monitoring networks
Douglas Nychka and Eric Gilleland
Geophysical Statistics Project, National Center for Atmospheric Research
An important problem in spatial statistics is determining where to make measurements. For example, in monitoring environmental pollutants one would like to know how to place a network of measuring instruments to make most efficient use of resources. Given a network that is in place it is also of interest to determine its efficiency in extrapolating to spatial locations where measurements are not taken. This talk discusses some spatial methods for determining the spatial predictive power of a network and the use of a space-filling criterion for network thinning. For motivation we will focus on the AMS/SLAMS network for monitoring ozone and also consider the spatial properties of the nonlinear statistic directly related to the EPA standard (three year averages of the third highest daily values). We have found cross-validation to be a useful strategy for calibrating the standard errors from spatial prediction and determining the validity of different covariance models.
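To give a feel for what thinning a network under a space-filling goal can look like, here is a minimal Python sketch. It uses a greedy farthest-point heuristic as a simple stand-in; it is not the space-filling criterion used by the speakers, and the station coordinates and function name are hypothetical.

```python
import math


def thin_network(stations, keep):
    """Greedily choose `keep` stations that are spread out in space.

    Starts from the first station (an arbitrary choice) and repeatedly adds
    the station whose distance to the nearest already-kept station is largest.
    Returns the indices of the kept stations in the order they were chosen.
    """
    if not 1 <= keep <= len(stations):
        raise ValueError("keep must be between 1 and the number of stations")

    kept = [0]
    # Distance from every station to its nearest kept station.
    min_dist = [math.dist(stations[0], s) for s in stations]

    while len(kept) < keep:
        nxt = max(range(len(stations)), key=lambda i: min_dist[i])
        kept.append(nxt)
        for i, s in enumerate(stations):
            min_dist[i] = min(min_dist[i], math.dist(stations[nxt], s))
    return kept


# Hypothetical station coordinates; station 2 nearly duplicates station 0.
stations = [(0.0, 0.0), (1.0, 0.0), (0.1, 0.0), (0.0, 1.0), (1.0, 1.0)]
kept = thin_network(stations, keep=3)
print(kept)
```

In this toy example the near-duplicate station is dropped, since it adds little spatial coverage. A real evaluation would also compare prediction standard errors before and after thinning, for instance by cross-validation as the abstract describes.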
4:55 p.m. End
Saturday, October 19
9:10 a.m. Change of Support
Expanding the "S" in GIS: Statistics and Spatial Support
Carol A. Gotway Crawford1 and Linda J. Young2
1National Center for Environmental Health, Centers for Disease Control and Prevention;
2Department of Biometry, University of Nebraska-Lincoln
One of the most powerful functions in geographic information systems is the ability to synthesize spatial data from a variety of sources. Through functions such as aggregation, buffering, overlay and spatial query, GIS users can easily merge data on different units and change scales in a way that is relatively easy and transparent. However, in performing these functions, the most meaningful aspect of spatial data, its support (e.g., shape and orientation), is ignored or compromised. As digital spatial data has become more plentiful, many researchers in a variety of disciplines have developed more sophisticated solutions to the problem of combining incompatible spatial data. In this presentation, we synthesize these solutions and discuss their utility for solving change of support problems and their potential for implementation within a GIS.
Inference for Misaligned Spatial Data Settings: The "Change of Support" and "Modifiable Areal Unit" Problems
Alan E. Gelfand
ISDS, Duke University
An established problem in working with spatial data is the matter of having acquired spatial data at one scale of resolution and seeking to infer about what would be expected at a different scale. The so-called change of support problem is usually concerned with data obtained at the point-referenced level with interest in inference at block/areal unit level or perhaps vice versa. The modifiable areal unit problem is usually concerned with data obtained for one set of areal units with inference regarding expectations for an alternative set of units. With increased collection of spatial data, a related problem arises. One routinely finds spatial data layers which are obtained at different scales of resolution with interest in building regression models to enable the use of one layer to explain another. A variety of ad hoc algorithmic approaches can be proposed to address these two problems. However, they may not be satisfying from an inferential perspective. In this presentation, we will review some of these ad hoc procedures but the focus will be on fully model-based methods. Through a range of environmental and ecological examples we will describe how stochastic modeling can be brought to these problems, how we can fit these models and what sorts of inference can be extracted.
10:20 a.m. Break
11:00 a.m. Agile GIS
Component-based development of geospatial visualization and analysis applications with GeoVISTA Studio
Alan M. MacEachren
GeoVISTA Center, Geography, Penn State
GeoVISTA Studio (subsequently referred to as Studio) is a software environment that supports construction of component-based Java applications. It is distributed with a suite of components developed to support geovisualization and related computational and statistical analysis. The presentation will provide an introduction to Studio and to the potential of integrated visual, statistical, and computational data analysis tools. Example software tools developed with Studio and typical applications to environmental data analysis will be provided.
The goal for Studio is to support the fusing of diverse visual and analytical capabilities into custom analysis tools that enable a multi-perspective approach to knowledge construction and dissemination. Studio provides a visual programming environment that allows an analyst to package assembled functionality into a working program (in the form of a cross-platform JavaBeans component, an applet, or an application). The result can be easily disseminated or deployed on the Internet. Like commercial visual programming environments for scientific visualization, Studio allows users to quickly combine components into flexible applications using a visual, direct manipulation design canvas. However, unlike other visual programming environments, components are written in pure Java (thus do not rely on commercial software tools) and the available components address a range of activities that span statistical analysis, visualization and machine learning. In addition, since the environment supports integration of any Java components that can be encapsulated as JavaBeans, Studio has the potential to support a distributed community of developers who can work independently of one another while sharing resources easily.
Linking visualisation and analysis tools
Thomas Lumley
University of Washington
Tools for statistical analysis, high-dimensional visualisation, and GIS are still very separate. I will discuss the need for links between these systems that allow them to be used as an integrated whole. As examples, the Orca project is working on dynamic graphics for visualisation of high-dimensional space and time data and linking this to the statistics package R, and R has also been linked to the GRASS GIS.
"Agile GIS": Building application-specific spatial analytic software from freely available software tools
Lance A. Waller and Andrew Barclay
Department of Biostatistics, Rollins School of Public Health, Emory University
Current geographic information system (GIS) and statistical software packages offer much in the way of flexibility within their own purview but little in the way of cross-functionality. On the other hand, many environmental impact assessments require both GIS operations (e.g., layering or buffering) and non-trivial statistical calculations (e.g., calculation and comparison of distribution functions). As a result, routine analyses often require individuals to master two domains, resulting in an awkward process especially for repeated tasks arising from multiple assessments (e.g., reviews of site proposals for new sources of environmental pollution). We consider development of "agile" spatial analytic software, i.e., tools built in concert with users providing functions necessary for completing their task, but little else. We construct software products from statistical and GIS open-source toolboxes, and provide sound statistical and GIS functionality, tightly constrained by user-defined boundaries of application. We illustrate the concept on two environmental applications, the first based on assessments of "environmental justice" (providing a measure of disparate proximity to proposed waste locations for different sociodemographic groups), and the second based on assessing spatial patterns in sea turtle nesting behavior with respect to a new fishing pier.
1:00 p.m. End
For further information, please contact Linda J. Young at LJYoung@UNL.edu or check the Registration Information web site at http://www.engr.washington.edu/~uw-epp/gis/reginfo.html.