Table of Contents | Director's Message | Executive Summary | SCD Achievements
Education and Outreach | Community Service | Awards | Publications | People | ASR 2004 Home

Executive Summary

SCD's mission is to serve the computing, data management and research needs of the atmospheric and related sciences. Consistent with this mission, SCD supports a robust, advanced computational infrastructure at NCAR that allows the scientific community to address both national and international research agendas. In anticipation of future needs, SCD also conducts applied research in partnership with NCAR's scientific community in the areas of visualization, networking, computer science, numerical methods and computational science.

Here we discuss our major FY 2004 activities and program highlights that collectively contribute to the advancement of NCAR science across many fronts.

High Performance Computing

In FY2004, Phase III of the current Advanced Research Computing System (ARCS) was delivered to NCAR. This expansion added fourteen 32-way p690 Symmetric Multi-Processor (SMP) servers to the IBM 1600 cluster (Bluesky), a net increase of two teraflops in computing capacity. Bluesky now comprises 50 POWER4 Regatta-H Turbo frames, making it the single largest system of this type in the world.

In support of the Intergovernmental Panel on Climate Change (IPCC), Bluesky contributed over 25 centuries of simulated climate—more than half of all IPCC computing during this campaign. At the conclusion of the IPCC campaign in late FY 2004, fourteen p690 nodes were released to the community to augment SCD's computing capacity for all users. SCD's current aggregate peak capacity is 12.1 teraflops distributed across six SMP computers.

In July we procured a Linux cluster to determine the feasibility of deploying such systems as viable, cost-effective supercomputing alternatives. The cluster (an IBM e1350) has 256 processors, each running at 2.2 GHz, and can achieve a peak of 1.1 teraflops. The system is being used to build, test, and evaluate models at NCAR, such as the Community Climate System Model (CCSM) and the Weather Research and Forecasting (WRF) model. Since many of NCAR's university partners who use these models have Linux-based systems, we anticipate that testing and developing these models on a similar system at NCAR will accelerate their distribution to the university community.

This diagram illustrates the configuration of the machine room, as of September 30, 2004.

Finally, we note briefly that SCD's supercomputing resources comprise two separate computational facilities: the Climate Simulation Laboratory (CSL) and the Community Computing facilities. Some of our systems (e.g., Bluesky and Blackforest) are shared between these two facilities. This distribution of resources continues to be a major feature of our commitment to both long-running, numerically intensive climate models and highly focused research objectives from our university constituents.

Computing Security and Divisional Threat Response

We collectively manage for NCAR a diverse computational and data storage environment containing high-end computers, mass storage subsystems, data archives, visualization, E-mail, DNS, authentication and web servers, and networks (including IP telephony). Aside from their significant monetary value, they comprise vital scientific research tools and business continuation systems used by the UCAR/NCAR organization and university communities.

In response to a major cybersecurity incident in early 2004, SCD expeditiously developed and deployed a long-term solution for the protection of the supercomputing and mass storage systems at NCAR. The use of one-time password tokens, arbitrated via encryption cards issued to all users, is now required to access these systems. Security procedures were updated and published to provide all users with guidelines and instructions on how to work within the secure supercomputing environment.

Further, with funding from the National Science Foundation (NSF), we convened and hosted a two-day Cybersecurity Summit in late September in Arlington, VA. The summit, attended by over 120 people, aimed to increase cooperation on cybersecurity among the nation's research and high-performance computing institutions.

Data Archiving and Management: The Mass Storage System (MSS)

Often referred to as one of the "crown jewels" at NCAR, the MSS exceeded two petabytes (PB) of data storage in July 2004. The net growth rate of unique data in the MSS is approximately 30 terabytes per month. The MSS continues to provide a scalable and robust data storage system that meets the data storage requirements of our users.

Our MSS Storage Manager (a new file management server developed in-house that provides connectivity to fiber-channel devices) was enhanced to support StorageTek's 9940B tape drives. The 9940B cartridges provide more than a threefold increase in capacity over the current 9940A cartridges (from 60 GB to 200 GB per cartridge), increasing the total capacity of the existing StorageTek robotic tape libraries (silos). The 9940B drives also provide a threefold increase in data transfer rates (from 10 MB/sec to 30 MB/sec).
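As a quick check, the capacity and transfer-rate gains above work out as follows. The cartridge and drive figures come from the text; the silo slot count is a hypothetical illustration, not the actual silo configuration:

```python
# Cartridge capacity and drive throughput figures from the text.
CAP_9940A_GB = 60    # GB per 9940A cartridge
CAP_9940B_GB = 200   # GB per 9940B cartridge
RATE_9940A_MB = 10   # MB/sec per 9940A drive
RATE_9940B_MB = 30   # MB/sec per 9940B drive

capacity_factor = CAP_9940B_GB / CAP_9940A_GB   # ~3.33x: "more than threefold"
rate_factor = RATE_9940B_MB / RATE_9940A_MB     # exactly 3x

# Hypothetical silo with 5,000 cartridge slots (slot count is illustrative only).
slots = 5000
tb_before = slots * CAP_9940A_GB / 1000   # 300 TB with 9940A media
tb_after = slots * CAP_9940B_GB / 1000    # 1,000 TB (1 PB) with 9940B media

print(f"capacity x{capacity_factor:.2f}, transfer rate x{rate_factor:.0f}")
print(f"example silo: {tb_before:.0f} TB -> {tb_after:.0f} TB")
```

The same robotic libraries thus hold over three times as much data without any change to their physical footprint, which is why the media conversion alone expands total MSS capacity.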

To assist with capacity planning and performance tuning of the MSS, we developed a simulator that includes all the major hardware and software components of the MSS. The simulator enables us to consider different design alternatives for new software and hardware components and to estimate how the different designs will perform before the components are added to the actual system. Simulation studies were conducted in mid-2004 using an earlier version of this simulator to aid in configuring and sizing the Storage Manager disk cache system.
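A capacity-planning simulator of this kind can be sketched, in heavily simplified form, as a model of a disk cache in front of tape: requests that hit the cache are served from disk, misses go to tape. All parameters and the request distribution below are illustrative assumptions, not the MSS's actual configuration or workload:

```python
import random
from collections import OrderedDict

def simulate_cache(n_requests=10_000, n_files=2_000, cache_slots=500, seed=42):
    """Toy model: file requests hit an LRU disk cache; misses go to tape.
    Returns the observed cache hit rate for the given cache size."""
    rng = random.Random(seed)
    cache = OrderedDict()   # file_id -> None, ordered by recency of use
    hits = 0
    for _ in range(n_requests):
        # Skewed popularity: a minority of files receives most requests.
        file_id = min(int(rng.paretovariate(1.2)), n_files)
        if file_id in cache:
            hits += 1
            cache.move_to_end(file_id)       # mark as most recently used
        else:
            cache[file_id] = None            # stage the file onto disk
            if len(cache) > cache_slots:
                cache.popitem(last=False)    # evict least recently used
    return hits / n_requests

# Compare candidate cache sizes on the same request stream before buying disks.
for size in (250, 500, 1000):
    print(f"{size} slots: hit rate {simulate_cache(cache_slots=size):.2%}")
```

Running the same synthetic workload against several candidate cache sizes shows the diminishing returns of added disk, which is the kind of question the sizing studies for the Storage Manager disk cache addressed.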

Initial deployment of 20 StorageTek 9940B tape drives was completed in FY2004. Managed by our new Storage Manager, these drives are servicing the files off-loaded from the disk cache and local system backup files. An additional twenty 9940B tape drives will be installed in early FY2005, along with an expansion of the disk cache. Migration of data (termed "data ooze") will begin in FY2005 to complete the replacement of the 9940A technology.

Computational Science Research

Our computational science staff conducts basic and applied mathematical research aimed at developing software for novel, accurate, efficient, and scalable numerical dynamical cores for climate, weather, and turbulence applications.

In FY 2004 we participated in several software infrastructure development projects, each with strong research components:

1. We issued the first software release of the Spectral Tool Kit (STK) and began packaging STK functions with the NCAR Command Language (NCL).

2. An NSF-ITR-funded project, the Inverse Ocean Modeling (IOM) system for modular ocean data assimilation, comprised these principal activities: continued development and deployment of the parallel communication infrastructure needed for IOM and the PEZ ocean model; design and implementation of an observation class for IOM; testing of different parallel execution modes for the IOM system component architecture; and collaboration with Martin Erwig's team at Oregon State University, which is using software derived from the Haskell language.

In addition, SCD and the Pittsburgh Supercomputing Center (PSC) collaborated on the Network Path and Application Diagnosis project to help address next-generation Internet performance issues. We will investigate underlying causes of bulk transport protocol limitations and develop and test new mechanisms in high-performance networking environments such as the Extensible Terascale Facility and the Abilene Network of the Future.

Work began this year on a new NSF ITR award for extending progressive data access techniques to handle time-varying and irregular data. Staff have started to investigate Open Source user tools based on promising technologies explored under this award.

Development Activities

Work continued on the collaborative Earth System Modeling Framework (ESMF). A much-anticipated release of ESMF software Version 2.0 occurred in July 2004. The ESMF Version 2.0 release includes software for representing and manipulating components, states, fields, grids, and arrays, as well as a number of utilities such as time management, configuration, and logging. It runs on a wide variety of computing platforms, including SGI, IBM, Compaq, and Linux variants.

The Grid-BGC project completed a top-level user interface design; selected a GIS technology for handling maps and geographical information; began implementing Globus protocols; made implementation decisions regarding the software framework; completed "look-and-feel" designs for static and dynamic visualization tools; and performed data transfer and computational capacity testing on existing parallel hardware.

The Earth System Grid (ESG) moved into production mode for climate model research data, with dedicated services for IPCC coupled climate model data.

Finally, we completed work on the Web100 and NET100 projects.

Computing Center Operations and Infrastructure

Our Operations and Infrastructure Systems (OIS) section is committed to delivering secure, reliable, high-quality, customer-focused services and infrastructure around the clock, 365 days per year.

As part of this commitment, in FY2004 we released new versions of the MySCD portal that provide, for the first time, customizable GAU charging information directly to our users. In addition, the SKIL database was upgraded to add "modify-and-delete" functionality.

Two collaborative projects are ongoing with the University of Colorado at Boulder:

1. The METIS event-based workflow system evaluation is nearly complete.

2. A group of CU students is modernizing the UCAR room reservation system.

The continued increase in electric power consumption (and the corresponding need to dissipate heat in the machine room) is beginning to affect the computing facility's infrastructure: the computer room has reached its maximum cooling capacity. To address this situation, in FY2004 we met near-term needs by augmenting the chilled water capability to match a maximum electrical load of 1.2 MW. We also developed short- and long-term plans to meet the demands of future computing systems. Multiple planning options were articulated, including expanding the data center. The plans for a data center expansion have been presented to the NCAR directorate, the UCAR President's Council, and SCD's Advisory Panel. Presentations will also be made to the UCAR Board of Trustees in October 2004. This work will continue into FY2005.
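The link between electrical load and cooling demand is direct: essentially all power drawn by machine-room equipment is released as heat and must be removed by the chilled water plant. Using the standard conversion of 1 ton of refrigeration = 3.517 kW, the stated 1.2 MW load implies roughly the following cooling requirement (a back-of-the-envelope estimate, not the facility's engineering specification):

```python
# Machine-room equipment converts nearly all electrical input to heat,
# so cooling capacity must track electrical load.
# Standard conversion: 1 ton of refrigeration = 3.517 kW (12,000 BTU/h).
KW_PER_TON = 3.517

load_kw = 1200                 # 1.2 MW maximum electrical load (from the text)
cooling_tons = load_kw / KW_PER_TON

print(f"{load_kw} kW load requires about {cooling_tons:.0f} tons of cooling")
```

This is why each new supercomputer procurement must be evaluated against chilled water capacity as well as floor space and power.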

Additionally, the Mesa Lab standby generators were commissioned and put into service. The twin generators provide an eight-hour window of emergency power to the NCAR Computer Room as well as the Mesa Lab's life-safety systems in case of electrical outages. Some field modifications have been made to simplify the generators' operation.

Other activities of note during FY2004:

  • We have assumed after-hours support, setup, and distribution of the secure, one-time password tokens (CRYPTOCards) required under our strengthened security procedures.
  • Tape media conversions continue with the move to 200 GB tapes.
  • Early in FY2004, computer operators transitioned to a new rotating schedule format to more effectively cover the 7x24 requirements of the data center. This rotating format offers equitable opportunities for each operator to interact more fully with the rest of the division.
  • Significant additional responsibilities were added throughout the year, including developing fully documented procedures for managing major SCD services, e.g., e-mail, DNS, Web and VPN services.

Network Engineering and Telecommunications

Our Network Engineering and Telecommunication Section (NETS) built the local-area networking infrastructure in the new Center Green Campus, bringing that campus into compliance with UCAR networking infrastructure standards. This work included building a fiber link along the bike path from Center Green to the Foothills Lab to interconnect the campuses with UCAR-owned fiber.

We also upgraded local, metropolitan, and wide area network (LAN, MAN, and WAN) links to 10 Gbps as required. Staff completed wireless service deployment throughout all office areas on all UCAR campuses and installed fiber optic links between our Jeffco and Mesa Lab campuses, replacing leased land lines.

Finally, we continue to be an active participant in important regional, state, and municipal projects such as the Front Range GigaPOP, Quilt and Westnet projects.

Assistance and Support for NCAR's Diverse Research Community

Our User Support Section (USS) continued to provide high-level technical support to NCAR's research community via web-based resources and consultant, phone, and E-mail contacts. A total of 1,529 researchers representing 169 institutions used SCD computing resources during FY2004. Of these 169 institutions, 102 are universities in the U.S.


We helped users transition models and large codes from the SGI Origin 3000 (Chinook) to the IBM 1600 cluster (Bluesky), and transition data analysis jobs from the SGI Origin 2000 (Dataproc) to our new SGI Origin 3800 (Tempest). We worked collaboratively with other divisions to test, document, and resolve user issues for the new IBM Linux cluster (Lightning) that arrived in July 2004. There has also been an increase in training for specific groups of users, starting with Advanced Study Program (ASP) fellows and SOARS protégés. We continue to provide ongoing training on use of the TotalView debugger with a complex code, namely the Community Climate System Model (CCSM).

Revisions to current on-line user documentation as well as new guides for Lightning and Tempest were created and published on the SCD web site. Staff worked to create four new websites in support of SCD, NCAR and UCAR activities, including a secure site for the Cybersecurity Summit 2004. We also participated in a broad-based, UCAR-wide effort to redesign and deploy a new, umbrella web site for NCAR, UCAR, and UOP. The efforts of this Web Outreach, Redesign and Development (WORD) project created a database-driven infrastructure that will continue to transform our overall web presence into a more dynamic, interactive experience.

Finally, we contributed to the successful inauguration of new security procedures for over 1,000 users—following a security intrusion in mid-2004—by issuing one-time password token cards (CRYPTOCards) and linking them to individual users' accounts.

Visualization and Enabling Technologies

We were pleased, via our Web Engineering Group (WEG), to be an integral part of UCAR's Web Outreach, Redesign and Development (WORD) Project. This project involved over 25 staff members from throughout the organization (five from SCD) and was tasked with developing a new UCAR, NCAR, and UOP umbrella web site bringing together news and resources from all three institutions into a unified web presence. The site was designed to engage a wide audience of scientists, educators, students, and the public alike. The WORD group created a database-driven infrastructure that will continue to transform our web presence into a more dynamic, interactive experience. The new web site was launched in May 2004.

During 2004, we saw major progress in both content and infrastructure for the Community Data Portal (CDP) effort. The portal's metadata architecture was re-engineered to accommodate the new Thematic Realtime Environmental Distributed Data Services (THREDDS) v1.0 specification, which includes information for collection-level search and discovery of datasets. Accompanying that, the portal was equipped with a powerful new user interface for initiating complex searches on distributed metadata catalogs.

Our Earth System Grid web portal was released in the summer of 2004. Designed for general use by the climate modeling community, the portal allows easy access to the latest CCSM (Community Climate System Model) v3 data. Users may browse the data catalogs hierarchically, perform searches on metadata, download full files, or subset the virtual aggregated datasets.

This summer, SCD helped two students in the Significant Opportunities in Atmospheric Research and Science (SOARS) program give face-to-face presentations to their peers and mentors — even though they were thousands of miles away.

We continue to maintain and operate several Access Grid (AG) nodes in the division, including the main node in the Vislab as well as a portable node that can be moved, on short notice, to other local meeting facilities. We continued to support AG technology both on site and with our university partners, completing the acquisition and testing of an AG system at Howard University and successfully integrating an AG system into SCD's conference room.

Our Web Engineering Group (WEG) continued its mission to consolidate web hosting at UCAR in 2004, adding four high-profile NCAR divisions and programs (ESIG—now ISSE, HIAPER, MMM, and HAO). The group completed its configuration of a new back-end cluster that provides core services and hosts dynamic websites. The WEG was instrumental in the WORD group launch of the new UCAR/NCAR/UOP website, creating a metadata management system called VAVOOM that automates the posting of news and other information to the site.

Research Data Support and Services

Our Data Support Section (DSS) manages and curates important research data archives (RDA) containing observational and analysis reference datasets that support a broad range of meteorological and oceanographic research. The archive content continues to grow through systematic additions to the modern collections, data received from special projects, and data rescued for historical time periods. The archive—currently over 30 terabytes (TB) in size—is also enhanced by capturing and improving documentation and associated metadata, enforcing a systematic organization, applying data quality assurance and verification checks, and developing access software for multiple computing platforms.

Research efforts have been made easier by providing on-line access to more current data. For example, three main data streams—the Unidata Internet Data Distribution (IDD), the National Centers for Environmental Prediction (NCEP) operational observations and final global analysis, and the NCEP/NCAR Reanalysis—have their most recent 90 days, as well as annual time periods, maintained on-line. These new services will aid global and mesoscale modeling and time series studies that need readily available, up-to-date data.

The International Comprehensive Ocean-Atmosphere Data Set (ICOADS) issued an update adding 1998-2002 data to the archive, which begins in 1784. ICOADS is a two-decade-long collaboration with the NOAA Climate Diagnostics Center and the NOAA National Climatic Data Center. The observations and monthly summary statistics are now available online and are freely distributed worldwide.

We continue to provide a crucial service in the area of atmospheric data re-analyses. The National Centers for Environmental Prediction (NCEP) now run the NCEP/NCAR and NCEP/DOE re-analyses in an operational mode, and these products will continue to be archived here within SCD. The European Centre for Medium-Range Weather Forecasts (ECMWF) Re-analysis 40 (ERA-40) computation is finished; it provides a T159 (approximately 1.1° resolution) global analysis, four times per day, on 60 vertical levels for the years 1957-2002. The complete archive will be available both on-line and from the MSS through data servers and web interfaces. We also received, processed, and delivered the new NCEP North American Regional Reanalysis (NARR) in FY 2004. NARR covers the North American domain at 32 km horizontal resolution, eight times per day, with 45 vertical levels for the years 1979-2003.

We have also been capturing and storing dataset and data file metadata in systematically organized ASCII files for the past 20 years. The content of the RDA has been made more visible by extracting basic metadata from the ASCII files and writing it in a Dublin Core-compliant form that has been compiled into Thematic Realtime Environmental Distributed Data Services (THREDDS) catalogs and hosted on the UCAR Community Data Portal (CDP). This is a first step toward improved data discovery of the RDA within the context of all UCAR data holdings.
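The extraction step described above can be sketched as a small transformation from key-value ASCII metadata to Dublin Core XML. The ASCII field names and the mapping below are hypothetical illustrations; the report does not specify the RDA's actual field layout:

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical mapping from ASCII metadata keys to Dublin Core elements;
# the real RDA field names are not given in this report.
FIELD_MAP = {"TITLE": "title", "SUMMARY": "description",
             "DATE_CREATED": "date", "CONTACT": "creator"}

def ascii_to_dublin_core(text):
    """Parse simple 'KEY: value' ASCII metadata lines into a
    Dublin Core-compliant XML record."""
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")
    for line in text.splitlines():
        if ":" not in line:
            continue                       # skip blank or malformed lines
        key, _, value = line.partition(":")
        dc_name = FIELD_MAP.get(key.strip().upper())
        if dc_name:                        # keep only mapped fields
            elem = ET.SubElement(record, f"{{{DC_NS}}}{dc_name}")
            elem.text = value.strip()
    return ET.tostring(record, encoding="unicode")

sample = """TITLE: NCEP/NCAR Reanalysis Subset
SUMMARY: Global analyses, 4x daily
DATE_CREATED: 2004-06-01"""
print(ascii_to_dublin_core(sample))
```

Records in this form can then be aggregated into THREDDS catalogs, which is what makes collection-level search and discovery on the CDP possible.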
