Data Acquisition and Documentation
Obtaining high-quality spatial data for decision making is frequently daunting for small organizations with limited access to technical data and tools. The developers of the Atlas worked with targeted end-users and advisory committees to identify and acquire relevant data and information. The data were sourced from multiple organizations, including the Natural Resources Research Institute (NRRI), academic units, and government agencies that produce and deliver data.
All of the data are available for viewing within the Mapping Tool; most datasets can be downloaded for external use. The Atlas provides download access to the data via the originator of the data (e.g., the Natural Resources Research Institute, other academic entities, or local, state, and federal agencies). Each institution has its own method for distributing its data, and we provide users direct access to these distribution sites.
Two sites provide access to the majority of the data available in the Atlas. NRRI data and metadata are provided through our Comprehensive Knowledge Archive Network (CKAN) site. Access to much of the agency data and metadata is provided through Minnesota’s CKAN site, the Minnesota Geospatial Commons.
The remaining data have varying degrees of accessibility. Some are available directly through organizational websites. A limited number are available only in a tabular format. Some, often very large files that are impractical to distribute, are not available for download. We do our best to make all of the Atlas data available for external use, but we are not always able to do so.
Data Management Plan
The University of Minnesota devotes significant resources to ensuring appropriate data quality, documentation, storage, preservation, distribution, and, when needed, confidentiality. The project PIs and co-PIs will be responsible for maintaining data integrity and for ensuring compliance with the data management plan.
The Natural Resources Research Institute (NRRI) has developed, derived, and compiled substantial amounts of data that will be made available through the Natural Resource Atlas. In addition, the Atlas will make available data from other reputable sources including, but not necessarily limited to, the Minnesota Geospatial Commons, the Environmental Protection Agency EnviroAtlas, and Minnesota Pollution Control Agency databases. The data management policies of external data sources will be reviewed to ensure that those sources provide accurate, current, and representative data. The PIs will not be responsible for the integrity and maintenance of these external data; that will be the responsibility of the providing organization.
Expected Data Types and Formats
Atlas data will consist of different types of spatial data, the associated metadata, and other relevant documentation. Spatial data will include shapefiles, images, grids, GeoTIFFs, GeoJSON, and tabular data associated with specific locations or sample points. Spatial metadata will follow the Minnesota Geographic Metadata Guidelines (http://www.mngeo.state.mn.us/committee/standards/mgmg/metadata.htm) – a state version of the Federal Geographic Data Committee standards. All metadata will be stored in human-readable and XML formats.
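Because metadata will be stored as XML, a record can be screened automatically before it is reviewed by a person. The following is a minimal sketch only, not the project's actual procedure. It assumes the XML uses FGDC-style element names (MGMG is described above as a state version of the FGDC standard), and the list of required elements is an illustrative assumption rather than the official MGMG requirement list.

```python
import xml.etree.ElementTree as ET

# Illustrative assumption: a few FGDC-style element paths treated as required.
# The real MGMG guidelines define their own set of mandatory elements.
REQUIRED_PATHS = (
    "idinfo/citation/citeinfo/title",  # dataset title
    "idinfo/descript/abstract",        # dataset abstract
    "metainfo/metd",                   # metadata date
)


def missing_metadata_fields(xml_text, required_paths=REQUIRED_PATHS):
    """Return the required element paths that are absent or empty."""
    root = ET.fromstring(xml_text)
    missing = []
    for path in required_paths:
        value = root.findtext(path)
        if value is None or not value.strip():
            missing.append(path)
    return missing
```

An empty list means every screened element is present; any returned path identifies a field to complete before the record moves on to human evaluation.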
Data Storage and Preservation
NRRI maintains a federated data server based on the CKAN open-source platform (www.ckan.org). The CKAN server at NRRI is a seven-terabyte system with nightly backup to a secondary seven-terabyte file server. Both servers use RAID-5 file storage, which provides protection against the failure of a single drive. Data developed or derived by NRRI will be stored on this system. External data from other providers will be stored in the provider’s database.
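Because the data server is built on CKAN, datasets stored there can in principle be discovered programmatically through CKAN's action API. The sketch below is illustrative only: it assumes the API is enabled on the site being queried, and the base URL is a hypothetical placeholder, not the address of the NRRI server. It builds a package_search request and extracts download links from the JSON that CKAN returns.

```python
import json
from urllib.parse import urlencode

# Hypothetical placeholder; substitute the address of the CKAN site being queried.
CKAN_BASE_URL = "https://ckan.example.org"


def build_search_url(query, rows=10, base_url=CKAN_BASE_URL):
    """Build a URL for CKAN's package_search action."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url}/api/3/action/package_search?{params}"


def list_downloads(response_text):
    """Return (dataset title, resource format, resource URL) tuples
    from the JSON text of a package_search response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    downloads = []
    for dataset in payload["result"]["results"]:
        title = dataset.get("title") or dataset.get("name", "")
        for resource in dataset.get("resources", []):
            downloads.append(
                (title, resource.get("format", ""), resource.get("url", ""))
            )
    return downloads
```

The response text could be fetched with any HTTP client, for example urllib.request.urlopen(build_search_url("wetlands")). Keeping the parsing step separate from the network call lets it be checked against a saved response.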
Data Sharing and Public Access
The spatial data used in this project will be made publicly available for viewing within the Atlas and for download via the data source. Data developed or derived by NRRI will be available for download through the NRRI CKAN. For data developed or derived by other sources, direct links will be provided to the platforms from which they can be downloaded. Exceptions will be made for information that data originators designate as sensitive, such as infrastructure and biological data. Sensitive data will not be served in response to external requests, even authenticated ones. They will be stored and used internally for analysis, and only the results will be passed along to public users, in the form of generic summaries that omit sensitive details (e.g., location, attributes).
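The idea of releasing generic summaries in place of sensitive records can be sketched as follows. This is a hedged illustration, not the Atlas implementation: the record fields (county, latitude, longitude, species) and the choice of county as the reporting unit are assumptions made for the example.

```python
from collections import Counter


def summarize_by_region(records, region_key="county"):
    """Reduce sensitive point records to a count per region.

    Only the region name and a count leave this function; coordinates
    and other attributes of the individual records are not returned.
    """
    counts = Counter(record[region_key] for record in records)
    return dict(sorted(counts.items()))


# Hypothetical internal records; none of these values come from the Atlas.
internal_records = [
    {"county": "Lake", "latitude": 47.5, "longitude": -91.4, "species": "A"},
    {"county": "Lake", "latitude": 47.6, "longitude": -91.3, "species": "B"},
    {"county": "Cook", "latitude": 47.8, "longitude": -90.5, "species": "A"},
]

public_summary = summarize_by_region(internal_records)
```

Here public_summary holds one count per county and nothing else, so it can be shared while the underlying records stay on the internal system.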
When data to be hosted on NRRI’s servers are acquired, each dataset will be recorded as either static or dynamic. For dynamic datasets, the metadata will be reviewed to determine the expected update interval. A calendar will be populated according to these intervals, and the source of each dataset will be checked for updates at the indicated times.
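The review calendar described above could be generated along the following lines. This is a minimal sketch under stated assumptions: update intervals are expressed in days, static datasets are marked with an interval of None, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class HostedDataset:
    name: str
    acquired: date
    update_interval_days: Optional[int]  # None marks a static dataset


def review_dates(dataset, through):
    """Return the dates, up to and including `through`, on which the
    dataset's source should be checked for an update."""
    if dataset.update_interval_days is None:
        return []  # static data need no scheduled review
    if dataset.update_interval_days <= 0:
        raise ValueError("update interval must be a positive number of days")
    step = timedelta(days=dataset.update_interval_days)
    dates = []
    next_check = dataset.acquired + step
    while next_check <= through:
        dates.append(next_check)
        next_check += step
    return dates


def build_calendar(datasets, through):
    """Map each review date to the names of the datasets due that day."""
    calendar = {}
    for dataset in datasets:
        for review_date in review_dates(dataset, through):
            calendar.setdefault(review_date, []).append(dataset.name)
    return dict(sorted(calendar.items()))
```

Calling build_calendar with the hosted datasets and a planning horizon yields a date-ordered schedule; static datasets never appear in it.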
Access to the server will be limited by its firewall and available only through a finite number of open ports. These open ports will be accessible only through certain avenues. Access to the data on the server will initially be secured through authorization enforced by authentication levels. There will be both public and private access to the data. Access to private and/or sensitive data will be secured using username/password authentication. To protect the data, these credentials will be communicated through an encrypted proxy. Password Policy, Account Lockout, and Kerberos Policy are configured and managed by OIT through Active Directory GPO. Successes and failures are audited for use, access, and change status.
Roles and Responsibilities
Project PIs and co-PIs will be responsible for coordinating all data management. All project staff involved with data in the Atlas will be informed of, and will agree to, the protocols of the data management plan. Co-PIs will periodically audit the project to determine whether the data are being managed properly. The data management plan will be reviewed annually and updated as needed by project investigators. PIs Johnson and Host will be responsible for ensuring that the plan for data storage, preservation, sharing, and public access is implemented. No data will be transferred to a data archive or made available for public release until metadata have been created and evaluated.