
SCD Achievements

High Performance Computing

Maintaining NCAR's production supercomputing environment

The production supercomputer environment managed by SCD for NCAR has evolved over the years. During the last 20 years, SCD has brought NCAR's science into the multiprocessing supercomputer world. Prior to the introduction of the 4-CPU Cray X-MP in October 1986, all modeling was performed with serial codes. Since then, the focus has been on redeveloping codes to harness the power of multiple CPUs in a single system and, most recently, in multiple systems.


Supercomputing systems deployed at NCAR

During the last 20 years, SCD has deployed a series of Parallel Vector Processor (PVP) systems ranging from a 2-CPU Cray Y-MP to a pair of 24-CPU Cray J90se systems. Massively Parallel Processing (MPP) systems included the Cray T3D with 128 processors and the Thinking Machines CM2 and CM5 systems. Most recently, Distributed Shared Memory (DSM) systems have been deployed; these include the Hewlett-Packard SPP-2000, SGI Origin2000, Compaq ES40 cluster, SGI Origin3800, IBM SP POWER3 and POWER4 systems, and now Linux clusters.

The accompanying diagram shows the systems that SCD has deployed for NCAR's use since the division's inception. Systems shown with blue bars were deployed for production purposes; those shown in red were, or still are, considered experimental systems.

In 1986, with the first multiprocessor system (the Cray X-MP/4) on NCAR's floor, SCD could deliver on average approximately 0.25 GFLOPS of sustainable computing capacity to NCAR's science. In the roughly 20 years since, that sustained computing capacity has grown to over 587 GFLOPS, with a peak capacity of 12.1 teraflops (TFLOPS). The accompanying figure illustrates this trend.

FY2004 production system overview

In FY2004, Phase III of the current Advanced Research Computing System (ARCS) was delivered to NCAR. This phase expanded the IBM cluster (bluesky) by 14 32-way p690 SMP servers, each based on the POWER4 microprocessor operating at a clock frequency of 1.3 GHz.

Each server included 64 GB of memory. The expansion also included 10.5 TB of formatted disk storage, which was added to the existing disk subsystem, increasing bluesky's total disk capacity to 31 TB. Of the 14 servers, only 12 were added to bluesky; the remaining two are temporarily being used for a special SCD testbed project. At the end of FY2004, bluesky comprised 50 32-way POWER4 p690 Regatta-H Turbo frames, making it the single largest system of this type in the world.

The 12 additional 32-way p690 SMP servers were used to support CCSM runs contributing to the IPCC process, as reported in SCD's Annual Budget Review. The installation of the bluesky system and its subsequent augmentation have doubled the capacities of both the Climate Simulation Laboratory and Community computing.

In addition, several major system software upgrades were performed on all supercomputers.

Supercomputer systems maintained during FY2004

Distributed Shared Memory (DSM) systems:

  • SGI Origin2100 (chinookfe), with 8 processors, was used in the Climate Simulation Laboratory.
  • SGI Origin3800 (chinook), with 128 processors, was used in the Climate Simulation Laboratory.
  • SGI Origin2000 (dataproc), with 16 processors, was used by both Climate Simulation Laboratory and Community users.
  • SGI Origin2000 (mouache), with 4 processors, was used as a test platform by SCD for evaluation of new Irix systems, libraries, and compilers prior to their installation on the production SGI platforms; all interested users now have access to mouache.
  • IBM SP (babyblue), with 64 processors, was shared by the Climate Simulation Laboratory and the Community.
  • IBM SP (blackforest), with 1,308 processors, was shared by the Climate Simulation Laboratory and the Community.
  • IBM NightHawk2 (dave), with 16 processors, was shared by the Climate Simulation Laboratory and the Community.
  • IBM p690 Regatta (bluedawn), with 16 processors, was used as a test and development platform for the integration of the IBM POWER4 Cluster 1600.
  • IBM Cluster 1600 (bluesky), with 1,600 processors, was shared by the Climate Simulation Laboratory and the Community.

New supercomputer systems added during FY2004

Cluster systems:

As an element of its five-year strategic plan to aggressively evaluate and deploy potentially more cost-effective new computing technologies, SCD acquired a large-scale Linux-based supercomputer cluster.  Following a competitive procurement process, IBM was selected to deliver a 256-processor e1350 Linux cluster. The system, called lightning, was delivered in July 2004 and uses 2.2 GHz AMD Opteron processors, has a peak computational capacity of 1.14 teraflops, 0.5 terabytes of memory, and 7 terabytes of disk.  The CAM and POP benchmarks demonstrated that lightning will outperform bluesky by a factor of 1.3 or more on a per-processor basis.  This system brought the total computational capacity at NCAR to 12.1 teraflops.
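
As a rough check on the quoted peak (an illustrative calculation assuming two floating-point operations per processor per clock cycle, not a figure from the procurement):

    \[ 256\ \mathrm{processors} \times 2.2\ \mathrm{GHz} \times 2\ \mathrm{flops/cycle} \approx 1.13\ \mathrm{TFLOPS}, \]

which is in line with the stated peak of 1.14 teraflops.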

Production system performance and utilization statistics

At the end of FY2004, the production supercomputer environment managed by SCD for NCAR included five IBM supercomputers and four SGI supercomputers. The following tables provide average utilization and performance statistics for the production supercomputer systems SCD operated in FY2004.

In addition, SCD publishes monthly usage reports at http://www.scd.ucar.edu/dbsg/dbs/. These reports provide summary information on system usage, project allocations, and General Accounting Unit (GAU) use.

End-of-FY2004 production supercomputer systems

The SCD supercomputer resources comprise two separate computational facilities: the Climate Simulation Laboratory (CSL) and Community Computing facilities. Some systems, such as the IBM SP systems, the dave system, and the dataproc system, are shared between these two facilities. The following sections describe the supercomputing systems available in these two facilities.

CSL facility:

The Climate Simulation Laboratory facility provided the following supercomputing resources at the end of FY2004:


Climate Simulation Lab facility, FY2004 configuration

System                                 | # CPUs | Memory (GB) | Peak GFLOPS | Notes
Dedicated: IBM SP (blackforest)        | 560    | 280         | 840.0       | 1,120 total system batch CPUs; 560 dedicated to CSL
Dedicated: IBM SP (bluesky)            | 704    | 1,408       | 3,660.8     | 1,408 total system batch CPUs; 704 dedicated to CSL
Dedicated: SGI Origin3800 (chinook)    | 124    | 64          | 124.0       | 124 CPUs dedicated to CSL
Dedicated: SGI Origin2100 (chinookfe)  | 8      | 8           | 4.0         | Front-end system for chinook
Shared: IBM SP (babyblue)              | 48     | 24          | 72.0        | Shared new-release test platform; available for user use
Shared: SGI Origin2000 (dataproc)      | 16     | 32          | 8.0         | Shared with the Community for data analysis and post-processing applications
Shared: IBM NightHawk2 (dave)          | 16     | 32          | 24.0        | Shared with the Community for data analysis and post-processing applications


Community Computing facility:

The Community Computing facility provided the following supercomputing resources available at the end of FY2004:


Community Computing facility, FY2004 configuration

System                                 | # CPUs | Memory (GB) | Peak GFLOPS | Notes
Dedicated: IBM SP (blackforest)        | 560    | 280         | 840.0       | 1,120 total system batch CPUs; 560 dedicated to Community computing
Dedicated: IBM SP (bluesky)            | 704    | 1,408       | 3,660.8     | 1,408 total system batch CPUs; 704 dedicated to Community computing
Shared: IBM SP (babyblue)              | 48     | 24          | 72.0        | Shared new-release test platform; available for user use
Shared: SGI Origin2000 (dataproc)      | 16     | 16          | 8.0         | Shared with the CSL for data analysis and post-processing applications
Shared: IBM NightHawk2 (dave)          | 16     | 32          | 24.0        | Shared with the CSL for data analysis and post-processing applications


Key maintenance activities

During FY2004, SCD provided ongoing maintenance activities to ensure the integrity and reliability of existing computational systems and improved the quality of service to the NCAR user community. Some of the key areas were:

Maintain supercomputer operating systems
SCD stayed abreast of major software releases from IBM and carefully scheduled upgrades to the production systems and product-set software based on the judged stability of those upgrades in the NCAR production environment. SCD also continued to provide major system support for the SGI Origin3800 and Origin2000 systems.

Maintain stability and reliability of systems
One of the most significant attributes of the NCAR computational environment is its overall stability and reliability. For instance, the NCAR Mass Storage System has a reputation for reliability, and SCD has in the last year deployed a number of high-availability fileserver systems. This reliability and stability do not come easily; they stem from a combination of choosing reliable, stable vendor products and using proven, fail-safe system administration and maintenance techniques. SCD will continue to focus on ensuring, in whatever ways possible, highly stable and reliable systems and systems operations.

System monitoring
Over the years, SCD has developed a large number of system monitoring procedures, techniques, and tools. SCD continued to enhance and utilize its collective experience to maintain the stability of the existing production systems through this proactive monitoring. In addition, SCD continued to enhance its monitoring tools, techniques, and procedures, and SCD automated a number of procedures for detecting system failure or trouble. This automation was integrated with commercial alphanumeric paging technology to provide more rapid alert mechanisms to SCD operations and systems staff and thus reduce the amount of time that systems are unavailable to the NCAR user community when they do fail.
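
A minimal sketch of this style of automated availability check with paging escalation is shown below. The host names, port, and paging command are illustrative stand-ins, not SCD's actual monitoring tools or procedures.

    # Periodically probe production hosts and page the operators when one stops responding.
    # Hosts, port, and the paging mechanism are hypothetical placeholders.
    import socket
    import subprocess
    import time

    HOSTS = ["bluesky", "blackforest", "chinook"]   # example production systems named in this report

    def reachable(host, port=22, timeout=5):
        """Return True if the host accepts a TCP connection on the given port."""
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    def page_operators(message):
        """Stand-in for the alphanumeric-paging step; here it simply writes to the system log."""
        subprocess.run(["logger", "-t", "scd-monitor", message], check=False)

    while True:
        for host in HOSTS:
            if not reachable(host):
                page_operators(f"{host} failed its availability check")
        time.sleep(300)   # re-check every five minutes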

Computer Security and Divisional Threat Response

SCD manages a diverse computational and data storage environment containing high-end computers, mass storage subsystems, data archives, visualization, e-mail, DNS, authentication and web servers, and networks (including IP telephony). Not only are these systems valuable monetarily, but they also comprise vital scientific research tools and business-continuation systems used by the UCAR/NCAR organization and university communities.

In response to a major cybersecurity incident that involved multiple high-performance computing sites in March 2004, SCD rapidly developed and deployed a long-term solution for protecting the supercomputing and mass storage systems at NCAR. SCD now requires one-time passwords, generated by encryption token devices issued to all users, for access to these systems. Security procedures were updated and published to provide all users with guidelines and instructions for working within the secure supercomputing environment.

One of the problems encountered during the March 2004 incidents was a lack of effective communication among the affected institutions. SCD proposed a conference to bring together stakeholders from the nation's research and high-performance computing centers to prepare a coordinated response for future incidents.

With funding from the National Science Foundation (NSF), SCD planned, organized, and hosted a two-day Cybersecurity Summit near Washington, D.C. Attended by over 120 cybersecurity experts from some of the nation's leading research institutions, the summit explored the tension between maintaining an open, collaborative research environment and protecting the security and integrity of computing and data assets.

Sites participating in the Cybersecurity Summit

The map shows the locations of the sites participating in the Cybersecurity Summit. This broad-based collaboration aims to coordinate strong response plans for threats against research computing and data.

Cybersecurity Summit 2004 was the first step in laying the foundation for responding to future large-scale security breaches and reducing the disruptive impact of such incidents on the nation's research agenda. These research institutions are increasing their cooperation on security policies, procedures, and incident response to better protect the nation's scientific computing and data resources.

Data Archiving and Management: The Mass Storage System (MSS)

The NCAR Mass Storage System (MSS) is a large-scale data archive that stores data used and generated by climate models and other programs executed on NCAR's supercomputers and compute servers. At the end of FY2004, the NCAR MSS managed more than 25 million files containing over 1,247 unique terabytes (TB) of data, and total holdings exceeded 2,149 TB (2.1 petabytes) when duplicate copies are included. The net growth rate of unique data in the MSS was approximately 30 TB per month.

On average, 160,000 cartridges were mounted each month, approximately 1% (about 1,000) of them by operators and the remainder in the StorageTek Powderhorn Automated Cartridge Subsystems (ACSes). The StorageTek Powderhorn ACS systems (also called "silos") use robotics to mount and dismount cartridges. On a daily basis, the MSS handled approximately 41,000 requests resulting in the movement of over 3,900 GB of data. During FY2004, data transfers servicing user requests to and from the MSS exceeded 1,400 TB.
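
As a rough consistency check (an illustrative calculation, not a figure from the report), the daily transfer volume accounts for the annual total:

    \[ 3900\ \mathrm{GB/day} \times 365\ \mathrm{days} \approx 1.4 \times 10^{6}\ \mathrm{GB} \approx 1400\ \mathrm{TB/year}, \]

which is consistent with the stated FY2004 user-transfer total of over 1,400 TB.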

While some of the data stored on the NCAR MSS originate from field experiments and observations, the bulk of the data is generated by global climate-simulation models and other earth-science models that run on supercomputers. SCD therefore faces an increasing demand to archive the data generated by increasingly powerful supercomputers: as supercomputers become larger and faster, they generate more data to be archived. Ever-greater demands for archiving data will also result from the growing use of coupled atmospheric/oceanic simulation models.

MSS history

The NCAR Mass Storage System has evolved over the last 18 years. Prior to late 1989, mass storage at NCAR consisted strictly of offline, manual-mount media. In November 1989, the first STK Powderhorn "silo" was acquired, commencing a new era of mass storage at NCAR. The following figure illustrates the various technologies that have been used to store critical datasets throughout NCAR's history.


During FY2004, the NCAR Mass Storage System grew from 20,340,049 files totaling 880 unique TB to 25,121,621 files totaling 1,247 unique TB. Total holdings grew from 1,500 TB (1.5 PB) to over 2,149 TB (2.1 PB). This was an average net growth rate of 30 unique TB (60 total TB) per month during FY2004.

The MSS Today

MSS Access Methods

During FY2004, the technology used to access MSS data continued to undergo substantial change. A migration is underway from the use of the older, non-commodity, High Performance Parallel Interface (HiPPI) technology to the use of Gigabit Ethernet (GigE) and Fibre Channel (FC) technologies.

The HiPPI technology provides direct storage-device access via the High-Performance Data Fabric (HPDF). The data fabric consists of HiPPI channel interfaces to host computers, non-blocking HiPPI switches capable of supporting multiple bi-directional 100 MB/sec data transfers, and protocol converters that connect the HiPPI data fabric to the IBM-style device control units. To utilize the HPDF, SCD staff wrote a file-transfer-style interface that enables users to copy files between their host systems and the MSS. At the end of FY2004, the HPDF supported 12 independent file transfer operations between the tape devices and the compute servers, sustaining 10 MB/sec each for an aggregate total of 120 MB/sec.

HiPPI technology continues to be deployed only in a niche market. It has not shown signs of spreading into the commodity marketplace, and as a result the cost of HiPPI technology has remained high and the number of HiPPI vendors is small. The lack of availability of and support for HiPPI technology is becoming a critical issue to the continued operation of the MSS.

To alleviate these issues, SCD staff wrote the UNIX-based Storage Manager (STMGR), which replaces the HPDF as the method used to access data by host systems. STMGR isolates the client host systems from directly accessing the storage devices, simplifying the code SCD has to write and maintain for each type of host operating system. It also eliminates the need for HiPPI channel interfaces and device drivers on the client hosts. In place of HiPPI, commodity TCP/IP networking is used to access STMGR from the client host systems. Client host systems can use any available network interface at any speed to access files on the MSS. Currently, when using GigE, data rates in the range of 30-60 MB/sec are easily achievable with recent computer hardware. Using high-speed Ethernet as the client system interconnect means that future deployment of higher-speed GigE will automatically raise the capacity of the client system interconnect.

The use of UNIX systems for STMGR allows SCD to deploy the latest storage hardware and software technologies to manage MSS data. STMGR server systems initially use an FC Storage Area Network (SAN) to access RAID and tape drives via a high-reliability switch. Fibre Channel is currently available in versions that support either approximately 100 MB/sec or 200 MB/sec bidirectionally. Multiple FC connections may be made between STMGR servers and storage devices, and aggregate I/O rates approaching 1 GB/sec are possible with commodity components on a single STMGR server. The use of FC RAID plus journaling file systems allows STMGR to improve the robustness and flexibility of the disk cache. Also, MSS administrators can have STMGR reallocate resources between disk cache partitions or add space to disk cache partitions on the fly without interruption to MSS clients.

Near the end of FY2003, STMGR was placed into production as a replacement for the old IBM 3390 disk farm. The old disk farm could store approximately 180 GB, was used to buffer files smaller than 15,000,000 bytes, and supported an aggregate transfer rate of 12 MB/sec. During the initial deployment, the STMGR disk cache stored approximately 500 GB and supported an aggregate transfer rate approaching 120 MB/sec. During FY2004, the STMGR disk cache was increased to approximately 8 TB to buffer files up to 50 MB in size. In FY2005, the STMGR disk cache will grow to approximately 60 TB, will buffer files of all sizes, and will support an aggregate transfer rate approaching 400 MB/sec. A disk cache of this size will permit newly written files to reside in the cache longer, which will reduce the number of tape mounts and tape I/O. STMGR will also, with further improvements in MSS software, allow better tape utilization by allowing files with differing storage requirements to be segregated on separate tape media. Both of these improvements will reduce the total number of tape drives required to support the aggregate data rates between the MSS and the client host systems.

Also during FY2004, the use of HiPPI was reduced for newly written tape files when STMGR assumed the role of providing tape access. HiPPI can then be decommissioned in FY2005 once all data has oozed off the StorageTek 9840A media. New tape devices, such as the StorageTek T9940B Fibre-Channel-attached drive, store up to 200 GB and support I/O rates in the neighborhood of 30 MB/sec. This will be an improvement of 3 times in both storage density and transfer rate over the current tape devices. These improvements will allow the MSS to expand into the multi-petabyte range while reducing the latency to access MSS files.

MSS Storage Hierarchy

The NCAR MSS currently uses two levels of storage: online and offline. The most frequently accessed data are kept on the fastest storage media, the online storage devices: 8 TB of Fibre Channel RAID storage and five StorageTek Powderhorn ACSes. The Powderhorn ACSes use StorageTek 9840A and 9940A, as well as StorageTek 9940B, technology. Currently, the five ACSes provide a total online capacity of approximately 2 petabytes; utilizing 9940B 200-GB cartridges, the total capacity of the online archive will exceed 6 petabytes.

Expansion of the MSS storage hierarchy is planned over the next five years with the introduction of new tape technologies, new ACSes, and with the integration of a multi-terabyte disk farm cache. Simulations of the MSS workload indicated that a 60-TB disk farm cache can reduce the amount of tape readback activity by as much as 60%. The disk farm cache would not only reduce the number of tape drives required in the system but also provide a much-improved response time to read and write requests. In addition, the MSS Group will continue to evaluate hardware and software solutions being developed by vendors throughout FY2005 and how they might be integrated into the NCAR MSS.

MSS Import/Export Capability

Another important capability of the NCAR MSS is the ability to import and export data to and from external portable media. Importing data involves copying data from portable media to the MSS data archive, while exporting data involves copying data from the MSS data archive to portable media. Import/export allow users to bring data to NCAR with them, as well as take data away. Import also allows data from field experiments to be copied to the NCAR MSS archive.

Options to exchange data with smaller satellite storage systems are being investigated. Using this technique, data generated at NCAR could be transferred to remote sites for further analysis. The NCAR SCD storage model would thus be geographically distributed, rather than centrally located and administered.

In addition to 3480 and 3490E cartridge tapes, the NCAR MSS also offers import/export to single and double-density 8mm Exabyte cartridge tapes. The deployment of an MSS-IV Import/Export server in FY2000 provided the ability to support many more device types, such as CD-ROM, DAT, and newer Exabyte media, to name a few.

MSS Accomplishments for 2004

Disk farm cache simulator

To aid capacity planning and performance tuning of the MSS, a simulator that includes all the major hardware and software components of the MSS was developed in 2003. The simulator enables the MSS group to consider different design alternatives for new software and hardware components and estimate how the different designs will perform before the components are added to the actual system. Simulation studies were conducted in 2004 using an earlier version of this simulator (that only simulated the disk cache component of the MSS) to aid in configuring and sizing the STMGR disk cache system.
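
A minimal sketch of this kind of cache simulation is shown below: replay a trace of read requests against a least-recently-used (LRU) cache of fixed capacity and report the fraction of reads that would still go to tape. The trace and capacity below are synthetic stand-ins, not SCD's actual workload data or simulator.

    from collections import OrderedDict

    def tape_readback_fraction(trace, capacity_bytes):
        """Replay (file_id, size) read requests against an LRU disk cache; return the miss fraction."""
        cache, used, tape_reads = OrderedDict(), 0, 0
        for file_id, size in trace:
            if file_id in cache:
                cache.move_to_end(file_id)           # cache hit: refresh LRU position
                continue
            tape_reads += 1                          # miss: the file must be read back from tape
            while cache and used + size > capacity_bytes:
                _, evicted_size = cache.popitem(last=False)
                used -= evicted_size
            if size <= capacity_bytes:
                cache[file_id] = size
                used += size
        return tape_reads / len(trace)

    # Example: a synthetic trace with heavy re-reads of 2-GB files and a 60-TB cache.
    trace = [("file%d" % (i % 200), 2 * 10**9) for i in range(10000)]
    print(tape_readback_fraction(trace, 60 * 10**12))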

In addition, simulator output was combined with MSS warehouse information to help measure the effectiveness of external data caches, which were deployed to avoid rereading data from the MSS and thus avoid the misuse of a data archive as a file server. The external cache deployment resulted in as much as a 60% drop in such re-reads.

StorageTek 9940B Technology

Initial deployment of 20 StorageTek 9940B tape drives was completed in FY2004. Managed by the STMGR, these drives are servicing the files offloaded from the disk cache and local system backup files. An additional 20 9940B tape drives will be installed in early FY2005, and with the expansion of the disk cache, a data ooze will be started in FY2005 to replace the 9940A technology.

User Education

As a result of the SCD-held user forum on computing issues, MSSG compiled for SCD's Consulting Office a short list of do's and don'ts regarding the NCAR Mass Store to help guide users toward efficient and proper use of the MSS.

New MSS Hosts

The IBM eSeries Linux Cluster, named lightning, was provided with Mass Store connectivity in 2004.

MSS Growth

NCAR Mass Storage System growth during FY2004 increased over FY2003. The average net growth rate during FY2003 was 27 TB per month, whereas the average net growth rate during FY2004 was 30 TB per month. This increase in the growth rate can be attributed to several factors, such as new MSS hosts coming online, increased amounts of local disk storage on several machines (which increases the size and number of MSS backup files), the IPCC initiative, and the 14 computing nodes added to the IBM POWER4 cluster (bluesky). Further increases in the net growth rate are expected in FY2005 with the addition of two Linux clusters early in that year. Projecting this growth into the future makes clear that new storage paradigms and user education will be required; without them, the growth over just the next three to five years will be untenable.
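
As a rough illustration (an extrapolation for perspective, not a projection from the report), continuing at the FY2004 net growth rate alone would add

    \[ 30\ \mathrm{TB/month} \times 36\ \mathrm{months} \approx 1.1\ \mathrm{PB} \quad\text{to}\quad 30\ \mathrm{TB/month} \times 60\ \mathrm{months} \approx 1.8\ \mathrm{PB} \]

of unique data over three to five years, roughly doubling, or more, the current unique holdings of 1,247 TB even before any further acceleration in the growth rate.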

The following table compares year-end statistics for FY2000 - FY2004.

Mass Storage System Growth Statistics

Metric (eFY = end of fiscal year)            | eFY2000 | eFY2001 | eFY2002 | eFY2003 | eFY2004
Total storage, unique bytes (TB)             | 273     | 379     | 519     | 880     | 1,500
Total files (millions)                       | 8.3     | 10.9    | 14.4    | 20.3    | 25
Net growth at eFY (unique TB per month)      | 5.2     | 10      | 13.3    | 27      | 30
Data read/written (TB per month)             | 25      | 37      | 49      | 83      | 118
Data migrated internally (TB per month)      | 25      | 74      | 68      | 83      | 90
Manual tape mounts (per month)               | 18,000  | 10,000  | 11,000  | 6,800   | 1,000
Robotic tape mounts (per month)              | 54,000  | 95,000  | 110,000 | 123,200 | 160,000
Offline cartridge count                      | 142,000 | 126,000 | 70,000  | 20,000  | 24,000
Sustained GFLOPS on NCAR computing floor     | ~75     | ~75     | ~140    | ~388    | ~500

Future Plans for the MSS

Key issues to be addressed over the next four years include:

  • Managing data growth and integrating new storage technologies to keep pace with the projected growth in computing power, and finding ways to reduce MSS growth.
  • Providing web-based MSS tools and interfaces to handle the unique problems of large-scale MSS file management, and folding at least some of these into the SCD Portal.
  • Developing Quality of Service metrics from warehoused and other data to measure and report system performance.
  • Exploring possibilities for collaborative research topics pertinent to the management and performance of large-scale data systems.
  • Implementing a multi-terabyte "internal" disk farm cache positioned in front of the MSS tape archive to improve overall response time and substantially reduce tape traffic.
  • Integrating multiple "external" disk caches to further reduce the load on the MSS and pave the way for a global, shared front-end fileserver.
  • Deploying a new Metadata Server to replace the Master File Directory (MFD). The new Metadata Server will use a commercial database capable of scaling beyond the limitations of the current MFD.
  • Warehousing ongoing MSS performance data for subsequent accounting, analysis, and reporting.
  • Investigating solutions to address disaster recovery.
  • Using the newly developed MSS simulator to aid in capacity planning and performance tuning of the system.

Computational Science Research

The mission of the Computational Science Section (CSS) is to help realize the end-to-end scientific simulation environment envisioned by the NCAR Strategic Plan. To this end, CSS's role is to benchmark and evaluate computer technology, learn to extract performance from it, pioneer new and efficient numerical methods, create software frameworks that facilitate scientific advancement (particularly through interdisciplinary collaborations), and share the resultant software and findings with the community through open source software, publications, talks, and websites.

Applied Computer Science Research Activities

In 2004, CSS's applied computer science efforts centered on three activities: studying experimental, massively parallel architectures such as Blue Gene/L; benchmarking and evaluating Linux clusters as part of an SCD procurement; and porting applications to Linux-Itanium systems as part of the Gelato Federation.

Blue Gene/L Application Research

IBM has developed a novel, low power, densely packaged, massively parallel computer system called Blue Gene/L. Each node of Blue Gene/L consists of dual PowerPC 440 cores running at 700 MHz. Each core is capable of two floating multiply-adds per clock cycle, and 1,024 nodes can be packed into a single 19-inch rack. Thus a single rack of Blue Gene/L processors has a peak speed of 5.6 teraflops. This is achieved while consuming about 15 kW of electrical power, a tiny fraction of that consumed by conventional massively parallel systems. Apart from its low power and dense packaging, Blue Gene/L has several interesting architectural characteristics, for example a dedicated tree reduction and synchronization network, as well as a toroidal interconnect.
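
As a rough accounting of the quoted figures (an illustrative calculation; the exact rack peak depends on how the node count and clock are rounded):

    \[ 2\ \mathrm{cores} \times 2\ \mathrm{FMA/cycle} \times 2\ \mathrm{flops/FMA} \times 0.7\ \mathrm{GHz} = 5.6\ \mathrm{GFLOPS\ per\ node}, \qquad 1024 \times 5.6\ \mathrm{GFLOPS} \approx 5.7\ \mathrm{TFLOPS\ per\ rack}, \]

consistent with the approximately 5.6-teraflop rack peak quoted above.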

IBM Blue Gene/L

Figure 1: 512 nodes (1,024 processors) of an IBM BlueGene/L system (photograph courtesy of IBM Research).

In 2004, scientists in CSS, in collaboration with researchers from CU-Denver and CU-Boulder, submitted a proposal to the NSF's Major Research Infrastructure program. The objective was to acquire a 1,024-node Blue Gene/L system to study the performance of scalable applications on it, and to evaluate its production capabilities. This proposal was recently funded by the NSF, and SCD is currently in negotiations with IBM to obtain a Blue Gene/L supercomputer for evaluation. The system will be used for high-resolution studies of moist physical processes, employing the cloud-resolving convection parameterization (CRCP). Because of its scalability, the primitive equations dynamical core for these studies of CRCP physics will be the section's prototype spectral element model, HOMME. Throughout the past year, members of CSS, working closely with IBM computer scientists, have been benchmarking a 512-node Blue Gene/L prototype located at IBM's T.J. Watson Research Center. The benchmarks have been chosen to measure the system's performance on key algorithms drawn from our proposed atmospheric science projects. All-to-all, point-to-point, and global reduction communication benchmarks have been used to measure the capabilities of the Blue Gene/L's networks, and prototype CRCP physics packages have been ported and optimized.

Benchmarking, Porting and Performance Modeling Activities

CSS has also been extensively involved in evaluating and benchmarking clusters for the recent procurement by SCD of a 256-processor Linux-based system. This procurement resulted in the acquisition of an Opteron/Myrinet Linux cluster, which achieved performance levels 1.3-1.4 times higher than an equivalent number of IBM 1.3-GHz POWER4 processors. In 2004, CSS performed extensive testing and evaluation of the IBM "Federation" interconnect.

CSS has continued to expand its engagement with computer science students. Dr. Henry Tufo in CSS has played a key role in exploiting this opportunity by leveraging his joint appointment as a Computer Science professor at the University of Colorado to involve four graduate students in NCAR research problems. Students of Dr. Tufo are working in the areas of application porting and tuning, Linux cluster system administration, and Grid computing applications. CSS staff also provided technical support to computer science students in a course taught by John Halley at the University of San Diego, in which NCAR applications were ported to a variety of platforms.

Gelato Membership

The Itanium very long instruction word (VLIW) architecture represents an important departure from the traditional superscalar RISC microprocessor and CISC-like Pentium architectures used in the geosciences departments at most universities today. Since the VLIW architecture relies on the compiler rather than on-chip circuitry to extract parallelism from the instruction stream, developing robust optimizing compilers for Itanium is critical. As Itanium microprocessors become plentiful in the geosciences community, access to reliable compilers, ported modeling applications, and open-source high-performance mathematical libraries optimized for this architecture enables scientific progress on Itanium Linux systems.

In the past year, CSS's role as a member of the Gelato Federation, an organization devoted to the advancement of the Linux-Itanium technical solution, has been to "beta test" the Intel Fortran and C++ compilers on the Intel Itanium and Itanium-2 processors by porting and tuning a variety of applications, such as CAM2 and MM5, to this platform. In this capacity, CSS has closely collaborated with SGI to port CCSM to the Altix (Itanium-based) shared-memory architecture. As a result, CCSM has recently been validated and has successfully demonstrated exact restart capability on this platform.

Applied Mathematics Research Activities

The research activities of the Computational Science Section (CSS) at NCAR are focused on three broad goals. First, work sponsored by the Department of Energy's Climate Change Prediction Program (CCPP) is developing a new generation of accurate, efficient, and scalable general circulation models, based on high-order methods and suitable for use by the atmospheric research community. To this end, CSS has conducted applied mathematical research, tested novel numerical algorithms using the standard test cases of the atmospheric science research community, and has created efficient software implementations of these algorithms.

CSS has also been working to integrate two physics packages into these models: the physics in the Community Atmosphere Model Version 2 (CAM2), recently used for IPCC simulations as a component of the Community Climate System Model (CCSM) (Blackmon et al. 2001), and a Cloud Resolving Convective Parameterization (CRCP) sub-grid-scale physics scheme acquired through a collaboration with the Cloud Dynamics Group in the MMM division at NCAR.

New Semi-Implicit Implementation

In 2004, CSS completed re-implementing a semi-implicit time step for the spectral element primitive equations. As before, the 3D governing primitive equations were specified in curvilinear coordinates on the cubed sphere, combined with a hybrid pressure vertical coordinate. The new non-staggered formulation eliminates the interpolation for nonlinear terms that caused problems for the staggered semi-implicit formulation during year seven of our research. The new dry dynamical core, based on a non-staggered weak formulation, has been validated using the standard 1,200-day Held-Suarez test problem.

The semi-implicit solver of this model is based on vertical eigenmode decomposition and an iterative conjugate-gradient elliptic solver. In tests, the performance of the solver has been greatly improved using a simple preconditioner proportional to the determinant of the metric tensor. The vertical eigenmode with the largest velocity is the last to converge and effectively controls the rate of integration. To be useful, the longer time-step allowed by the semi-implicit method must overcome the additional cost of the Helmholtz solver. Preliminary tests indicate that the semi-implicit integration rate is at least three times faster than the explicit spectral element dynamical core on a single processor. Scalability tests of the new formulation are planned for later in 2004.
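
The general technique behind such a solver is preconditioned conjugate-gradient (PCG) iteration. The sketch below illustrates PCG with a generic symmetric positive-definite test matrix and a simple diagonal preconditioner standing in for HOMME's Helmholtz operator and metric-tensor-based preconditioner; it is illustrative only.

    import numpy as np

    def pcg(A, b, apply_Minv, tol=1e-10, max_iter=500):
        """Preconditioned conjugate gradient: solve A x = b, applying the preconditioner via apply_Minv."""
        x = np.zeros_like(b)
        r = b - A @ x
        z = apply_Minv(r)
        p = z.copy()
        rz = r @ z
        for _ in range(max_iter):
            Ap = A @ p
            alpha = rz / (p @ Ap)
            x += alpha * p
            r -= alpha * Ap
            if np.linalg.norm(r) < tol:
                break
            z = apply_Minv(r)
            rz_new = r @ z
            p = z + (rz_new / rz) * p
            rz = rz_new
        return x

    # Example: a symmetric positive-definite test matrix with a diagonal (Jacobi) preconditioner,
    # analogous in spirit to scaling by a quantity proportional to the metric-tensor determinant.
    n = 100
    A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    d = np.diag(A)
    x = pcg(A, np.ones(n), lambda r: r / d)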

Adaptive Mesh Refinement of Non-conforming Spectral Elements

Year 2's success rests on the outstanding work of Amik St-Cyr and John Dennis, two very promising young scientists at NCAR. In the past year they successfully developed and implemented a multi-level AMR version of the section's spectral element dynamical core. This is a fully parallel code based on the geometrically non-conforming SEM of Kruse and Fischer, combined with a novel tree management strategy for AMR on the cubed sphere called HAMR (HOMME AMR). Time-stepping restrictions caused by refinement are partially alleviated by employing the novel nonlinear operator integration factor splitting (OIFS) scheme of Thomas and St-Cyr. (As an added benefit, the resulting 3D equations are well-posed under AMR because OIFS does not require local time-stepping.) Our refinement/de-refinement technology is based on the error-estimator work for spectral element methods of C. Mavriplis. Though validation testing is not complete, the current release of the code has been validated on several of the shallow water test cases of Williamson, in particular test case 5. Other highlights of year 2 include numerous journal publications, conference presentations, and the involvement of several CU graduate students in the project.

Test output

Figure 2: Adaptively refined non-conformal spectral elements tracking a cosine bell test shape in shallow water equations.

After investigating the currently available packages to support AMR, the decision was made in September 2003 to build our own package to support AMR on the cubed sphere. Using the static non-conforming code developed in year 1 as a guide, an entirely new AMR implementation was developed for HOMME. The HOMME AMR implementation (HAMR) is based on the TFS communication library of Tufo. TFS is a scalable direct stiffness summation package with low setup cost. It uses unique global IDs to pair shared degrees of freedom in a distributed environment. HAMR is designed around the concept of a distributed graph, while a lightweight bit-shifting tree algorithm is used to maintain inheritance properties among the spectral elements. The topology of the cubed sphere necessitated that a minimum of six separate trees, one for each face of the cube, be maintained along with the connectivity information between the trees. Because of the unavoidable need for graph management, it was decided that all spectral element connectivity information be maintained in graph form (versus tree form). This decision allows for an arbitrary selection of the underlying base grid. The distributed graph is updated each time a spectral element is refined or coarsened. Local graph query functions are used to set the proper global degrees of freedom for the TFS library. HAMR has been demonstrated in parallel to support both refinement and coarsening for multiple levels of refinement, and it achieves load balancing via element migration.
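
The bit-shifting tree idea can be sketched as follows for a quadtree on a single cube face. The encoding is illustrative only; HAMR's actual element IDs also carry the distributed-graph connectivity described above.

    def root_id(face):
        """One tree per cube face (faces 0-5); a leading 1 bit keeps IDs unambiguous."""
        return (1 << 3) | face

    def children(element_id):
        """Refinement: each element splits into four children, labeled 0-3 in the low-order bits."""
        return [(element_id << 2) | k for k in range(4)]

    def parent(element_id):
        """Coarsening: drop the two low-order bits to recover the parent element."""
        return element_id >> 2

    def level(element_id):
        """Refinement level relative to the face root (root IDs occupy four bits)."""
        return (element_id.bit_length() - 4) // 2

    root = root_id(4)                 # the tree root for cube face 4
    kids = children(root)             # four child elements created by one refinement
    assert all(parent(k) == root for k in kids)
    assert level(root) == 0 and level(kids[0]) == 1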

In 2003, St-Cyr and Thomas developed a novel time-stepping scheme to ameliorate the time-step restrictions encountered under AMR and to maintain well-posedness of the 3D equation set. Merging the OIFS time-stepping required major revisions to the Krylov solvers, as generating the preconditioning matrices on the fly is nontrivial. In addition, the HOMME implementation was generalized to remove unnecessary edge rotations. In the earlier version, a special treatment of vector quantities was necessary on the edges of the cubed sphere. This change was necessary to use the TFS library for the direct stiffness summations. The inter-element trace matching is generalized, and the masks necessary to eliminate doubled corner contributions are generated automatically.

As stated earlier, the OIFS time-stepping approach needs more aggressive preconditioning techniques. Martin J. Gander is collaborating with the team to determine whether an optimized Schwarz preconditioner can be used in the P_N - P_N (non-staggered) version of HOMME. Recent results obtained by Gander and St-Cyr include a proof that changing the preconditioning matrices in the Dryja-Widlund form of the additive Schwarz procedure leads to the optimized iterates. This result will help the community accept these novel preconditioning techniques.

Integration of Spectral Element Dynamics with CAM Physics

In 2004, CSS began integrating HOMME explicit dynamics with CAM physics from version cam_2_0_2_dev69. The API between the dynamics and the CAM physics was identified, along with the necessary CAM program management units. Inconsistencies and incompatibilities with respect to the grid structures were identified and resolved. Most issues related to the initialization of an "Aqua Planet" [Hyashi86] experiment have also been resolved.

Integration of Cloud Resolving Convective Parameterization (CRCP) with CAM Physics

In FY2004, work began on interfacing a Cloud-Resolving Convection Parameterization (CRCP; a.k.a. super-parameterization; Grabowski and Smolarkiewicz 1999; Grabowski 2001, 2003) with the HOMME dynamics. CRCP is a novel technique for representing clouds in atmospheric models. The idea is to embed a 2D cloud-resolving model in each column of a large-scale model to represent small-scale and mesoscale processes. Khairoutdinov and Randall (2001) have tested this approach in the Community Climate System Model (CCSM). A stretched vertical coordinate has recently been implemented in the CRCP code, facilitating direct coupling to a pressure vertical coordinate.

Conservative Advection using Discontinuous Galerkin Method

The Discontinuous Galerkin (DG) method is a hybrid of finite-element and finite-volume methods, and it provides a class of high-order accurate conservative algorithms for solving nonlinear hyperbolic systems. The method is known for being highly parallelizable and for being able to capture discontinuities in the exact solution without producing spurious oscillations.

In FY2004, a DG conservative transport scheme was developed on the cubed sphere (Nair 2004). This scheme has been further extended to a nonlinear flux-form shallow water (SW) model in curvilinear coordinates on the cubed sphere. The spatial discretization employs a modal basis set consisting of Legendre polynomials. Fluxes along the element boundaries (internal interfaces) are approximated by a Lax-Friedrichs scheme. A third-order total variation diminishing (TVD) Runge-Kutta scheme is applied for time integration, without any filter or limiter. The model has been evaluated using the standard SW test suite proposed by Williamson et al. (1992). The DG scheme shows exponential convergence for shallow water test case 2 (steady-state geostrophic flow). The DG solutions to the shallow water test cases are comparable to those of a standard spectral-element model. Even with high-order spatial discretization, the solutions do not exhibit spurious oscillations for the flow-over-a-mountain test case, whereas a spectral-element model or a global spectral model produces spurious oscillations for this particular test.

The model conserves mass to machine precision. Although the scheme does not formally conserve global invariants such as total energy and potential enstrophy, these quantities are better preserved than in existing finite-volume models. Currently, the DG transport scheme is being implemented in the NCAR/SCD High-Order Multiscale Modeling Environment (HOMME).
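
Two of the ingredients named above, the Lax-Friedrichs interface flux and third-order TVD (strong-stability-preserving) Runge-Kutta time stepping, can be illustrated with a toy one-dimensional finite-volume advection problem. This is a simplified analogue for illustration only, not the cubed-sphere DG shallow water code.

    import numpy as np

    a = 1.0                                     # constant advection speed

    def rhs(u, dx):
        """Conservative tendency -d/dx f(u) using a Lax-Friedrichs (Rusanov) interface flux."""
        uL = u                                  # left state at interface i+1/2
        uR = np.roll(u, -1)                     # right state (periodic domain)
        flux = 0.5 * (a * uL + a * uR) - 0.5 * abs(a) * (uR - uL)
        return -(flux - np.roll(flux, 1)) / dx

    def ssp_rk3_step(u, dt, dx):
        """Third-order strong-stability-preserving (TVD) Runge-Kutta step."""
        u1 = u + dt * rhs(u, dx)
        u2 = 0.75 * u + 0.25 * (u1 + dt * rhs(u1, dx))
        return u / 3.0 + 2.0 / 3.0 * (u2 + dt * rhs(u2, dx))

    n = 200
    x = np.linspace(0.0, 1.0, n, endpoint=False)
    dx = 1.0 / n
    u = np.exp(-200.0 * (x - 0.5) ** 2)         # smooth initial pulse
    dt = 0.4 * dx / abs(a)                      # CFL-limited time step
    for _ in range(int(1.0 / dt)):              # advect once around the periodic domain
        u = ssp_rk3_step(u, dt, dx)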

Radial Basis Functions (RBFs)

CSS has been focusing on two areas of research within RBFs. The first is examining the interpolation properties of oscillatory Bessel RBFs, an entirely new group of RBFs with interesting properties. For example, it has been shown very recently that pseudospectral (PS) approximations are just a subclass of RBF approximations in the flat basis function limit, i.e., as the parameter that controls the shape of the RBF goes to zero. Not only do oscillatory Bessel RBFs possess unconditional nonsingularity of the interpolation matrix for any scattered node distribution, but they are the only class of RBFs immune to divergence of the interpolant in the limit that the shape parameter goes to zero. To further explore the relationship between PS and RBF approximations, it is important to understand the accuracy of oscillatory Bessel RBF interpolation in multiple dimensions. Dr. Natasha Flyer in CSS has proven in one-dimensional space that an oscillatory Bessel RBF expansion on an infinite lattice will exactly reproduce a polynomial of any order, and she has gone on to extend the proof to arbitrary n-dimensional space. This is a great leap forward in RBF theory, as this is the only class of RBFs known to possess this property. Dr. Flyer is working with Dr. Elisabeth Larsson of Uppsala University to extend this result to scattered node locations rather than a lattice.
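
The basic RBF interpolation setting referred to here can be sketched as follows. The Gaussian basis below is a common stand-in used for illustration, not the oscillatory Bessel RBFs under study, and the shape parameter eps plays the role discussed above.

    import numpy as np

    def rbf_interpolant(nodes, values, eps):
        """Build a 1D RBF interpolant: solve A c = f with A_ij = phi(|x_i - x_j|), phi(r) = exp(-(eps r)^2)."""
        r = np.abs(nodes[:, None] - nodes[None, :])
        A = np.exp(-(eps * r) ** 2)
        coeffs = np.linalg.solve(A, values)
        def s(x):
            rx = np.abs(np.atleast_1d(x)[:, None] - nodes[None, :])
            return np.exp(-(eps * rx) ** 2) @ coeffs
        return s

    nodes = np.sort(np.random.rand(30))          # scattered node set
    f = np.sin(2.0 * np.pi * nodes)
    s = rbf_interpolant(nodes, f, eps=3.0)
    print(np.max(np.abs(s(nodes) - f)))          # interpolation conditions hold at the nodes
    # As eps -> 0 (the "flat" limit discussed above) the matrix A becomes badly conditioned,
    # which is exactly the regime where the choice of basis matters.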

The second RBF research area is to develop a theory applicable to spherical geometries. The importance of this research is to develop a new grid-free approach using RBFs to solve time-dependent PDEs in spherical domains. Such an approach is singularity free (due to its independence of any surface-based coordinate system), spectrally accurate for arbitrary node locations on the sphere, and naturally permits local mesh refinement. No other discretization method currently in use in spherical geometries can attest to these properties.

Modeling Solar Coronal Mass Ejections

The CSS collaboration with HAO studying coronal mass ejections involves three related efforts. The first project is to extend recent results related to magnetic-field confinement in the solar corona. The second project, with Mei Zhang of the Chinese Astronomical Observatory, is to show that there is an upper bound on the amount by which the total magnetic energy in the force-free field for a dipole field configuration can exceed the Aly limit, which is defined by the amount of energy needed to completely open the solar magnetic field (i.e., have one end of a line of force anchored to the sun and the other running out to infinity). Dr. Flyer in CSS has been able to show numerically that not only does such a bound exist, but that it is 8.33%. This number had been conjectured by some physicists in the field but never before verified either numerically or analytically. The last project is to solve the hydromagnetic equations describing magnetic fields in realistic three-dimensional geometry, both in the force-free state and in force balance with plasma pressure and gravity. The general 3D case is far more demanding computationally, featuring four coupled PDEs in three space dimensions, and is the subject of a recent CSS proposal submitted to NASA. This will be a cross-collaborative effort with HAO, CU-Boulder, and the University of Wisconsin-Madison.

Shallow Water Flows Develop Singularities on the Sphere

In 2004, research demonstrating that certain shallow water test cases on a non-rotating sphere develop singularities was completed, and a paper on this topic has been accepted for publication [Swarztrauber 2004].

Development Activities

CSS development activities are aimed at providing modeling frameworks and mathematical libraries that support the research community's efforts to create portable and efficient models and scalable and efficient post-processing tools.

Earth System Modeling Framework

The Earth System Modeling Framework (ESMF) is building software infrastructure for climate, weather, and data assimilation applications. Collaborators include NCAR SCD, CGD, and MMM, NOAA GFDL, NOAA NCEP, MIT, the University of Michigan, DOE ANL, DOE LANL, and NASA/GSFC GMAO. The project is organized around a series of 11 milestones, the first five of which were submitted during FY2002 and FY2003.

The sixth and seventh ESMF milestones, submitted during FY2004, marked the public release of Version 2.0 of the ESMF software, the demonstration of three interoperability experiments using the framework, and the Third ESMF Community Meeting held at NCAR in Boulder, Colorado on July 15, 2004. The day-long Community meeting included a discussion of features in the release, a brief tutorial on adopting ESMF, and presentations describing how ESMF has been used to create applications from existing components developed at GFDL, MIT, NCEP, NASA and NCAR. ESMF Version 2.0 code and documentation can be downloaded from the ESMF website, http://www.esmf.ucar.edu/

The ESMF Partners and active collaborators list expanded to include groups at the DOD Naval Research Laboratory and the DOD Air Force Weather Agency, as well as existing partners at the Goddard Institute for Space Studies, UCLA, the Center for Ocean-Land-Atmosphere Studies, and the NASA GSFC Land Information Systems project. ESMF continues to coordinate with the European Programme for Integrated Earth System Modeling (PRISM) and the DOE Common Component Architecture (CCA) projects.

Spectral Toolkit Development

Work on developing high-performance, portable, highly efficient, open-source numerical libraries for use by the mathematical and geosciences communities made steady progress in the first part of FY2004, but this effort slowed later in the year because staff hours in this area were redirected to support CRCP physics integration activities.

In particular, development of the Spectral Toolkit library continued with the completion of the multithreaded spherical harmonic transform. Also developed were new distributed-memory (MPI-based) 2D and 3D FFTs, using a generic pairwise transpose algorithm developed for the library; a conceptual sketch of this transpose-based approach appears after the component list below. To date, completed components of the Spectral Toolkit include:

  • General mixed radix real and complex FFTs that are highly portable and among the fastest available.
  • Complex number class and operators that enable very efficient native complex arithmetic comparable to Fortran.
  • Multithreaded 2D and 3D real and complex FFTs scalable on both dual processor workstations and large enterprise-class servers.
  • Distributed 2D and 3D real and complex FFTs that enable scalable transforms on clusters using unique generic pair-wise transpose.
  • Multithreaded spherical harmonic transform that uses recurrence relations for spherical harmonics for small memory footprint, along with cache-blocked matrix multiplication for efficient multiple-instance transforms.
  • Accurate evaluation of associated Legendre functions at arbitrary points, for generating spherical harmonic or spectral element reference points.
  • Very accurate Gauss-Legendre quadrature that provides high-resolution grids without requiring quad precision; Gauss-Lobatto points and weights for spectral element grids are also included.
  • Multidimensional generic array allocation that provides very efficient multidimensional array support comparable to Fortran, including triangular arrays for storing spherical harmonic coefficients.
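
A conceptual sketch of the transpose-based approach used by the distributed FFTs is shown below. Serial NumPy stands in here for the MPI pairwise transpose; this is an illustration of the idea, not the Spectral Toolkit's C++ implementation.

    # Transpose-based 2D FFT: transform the locally held rows, transpose so the other
    # dimension becomes local, transform again, then transpose back. In the distributed
    # version the transpose is realized with pairwise MPI exchanges.
    import numpy as np

    def fft2_via_transpose(a):
        a = np.fft.fft(a, axis=1)   # 1D FFTs along the rows each process would hold
        a = a.T                     # global transpose (pairwise exchanges in the MPI version)
        a = np.fft.fft(a, axis=1)   # 1D FFTs along what were the columns
        return a.T                  # transpose back to the original data layout

    a = np.random.rand(64, 64)
    assert np.allclose(fft2_via_transpose(a), np.fft.fft2(a))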

Development Activities

Work continued on the collaborative Earth System Modeling Framework (ESMF). A much anticipated release of ESMF software Version 2.0 occurred in July 2004. The ESMF Version 2.0 release includes software for representing and manipulating components, states, fields, grids, and arrays, as well as a number of utilities such as time management, configuration, and logging. It runs on a wide variety of computing platforms, including SGI, IBM, Compaq, and Linux variants.

The Grid-BGC project completed a top-level user interface design, selected a GIS technology for handling maps and geographical information, began implementing Globus protocols, made implementation decisions regarding the software framework, completed "look-and-feel" designs for static and dynamic visualization tools, and performed data transfer and computational capacity testing on existing parallel hardware.

The Earth System Grid (ESG) moved into production mode for climate model research data, with dedicated services for IPCC coupled climate model data.

SCD completed work on the Web100 and NET100 projects.

Computing Center Operations and Infrastructure

Applications

Released new versions of the MySCD portal, which provides, for the first time, customizable GAU charging information directly to users. SKIL was upgraded to include modify/delete functionality. Two collaborative projects are ongoing with the University of Colorado: the METIS event-based workflow system evaluation is nearly complete, and a group of students is modernizing the room reservation system.

MySCD Portal version 2.0 Release

The new version of the MySCD portal was released. This particular release was delayed as a result of security changes, which required that the portal system be retooled to support one-time passwords. The September release adds significant functionality that, for the first time, allows users of SCD's computational resources to get a summary view of their allocation usage, including total percentage charges for computational and mass storage usage.

GAU usage graphs





Additional accomplishments for applications:

  • Remedy 5.1.12 upgrade
  • SKIL release 2.0: addition of modify/delete functionality
  • Metis Workflow System: completed analysis and evaluation of the final Metis system in the NCAR environment
  • Room Reservation System: a senior capstone project with the University of Colorado to develop a next-generation reservation system

Computer Room Infrastructure

SCD is developing both short- and long-term plans to meet the demands of future computing systems. Multiple options are being developed, including building a second data center for expansion; this work will continue into FY2005. The Mesa Lab standby generators were commissioned and put into service, and some field modifications will be made to simplify their operation. The computer room has reached its maximum cooling capacity, so FY2004 focused on a design and procurement process to upgrade the chilled water systems.

The standby power generation system was installed and commissioned in March 2004. The shakedown and familiarization with the systems continued well into the summer with some modifications to the control sequence as the result of lessons learned.

standby power generator

Additional accomplishments for infrastructure:

  • Chilled water expansion project: design was completed and the construction phase will begin in early FY2005
  • Data center expansion investigation: an extensive investigation is in progress to best meet the infrastructure needs of the organization that are being driven by the scientific demand for simulation capabilities. Options being investigated include:
    • Lease
    • Build
    • Collocate
    • Upgrade air handling units
    • Install Linux cluster

Computer Room Operations

Operations supported, set up, and distributed CryptoCards as part of a new responsibility associated with strengthened security requirements. Media conversions continue with the move to 200-GB media. Rotating schedules have been a success, giving all operators much more exposure to the rest of the SCD staff.

Operations stepped into a new support role that came about as part of new security requirements. During the March timeframe, the implementation of one-time passwords resulted in a new need to distribute and support CryptoCards.

Enterprise Services

Network and system security dominated the year. Several significant changes, including the introduction of one-time passwords, were completed very quickly to secure the supercomputing assets. Storage area networks were investigated along with several significant upgrades to production systems for data provisioning and web access.

During March, a new security perimeter was quickly established, and the Distributed Systems Group (DSG) designed, implemented, and rolled out a one-time password solution to protect supercomputing assets from intrusion attempts. These efforts were instrumental in returning the supercomputer systems to the network for community use within a three-week period. Since then, a number of intrusion attempts have been successfully turned away.

In addition to the security work, a Storage Area Network (SAN) testbed was put together. A cost-effective solution is under investigation that will utilize Serial ATA technologies and a SAN solution that works in a heterogeneous environment. At this early point, Linux and Sun systems have been successfully integrated into the SAN.

Additional accomplishments for enterprise services:

  • Server upgrades and additions:
    • Web cluster
    • DSS (huron)
    • Database server
  • Windows desktops:
    • AD migration to CIT AD or SCDAD
    • Automated virus updates and patches
    • Veritas for backup solution

Network Engineering and Telecommunications

The Network Engineering and Telecommunications Section (NETS) is responsible for all engineering, installation, operation, maintenance, strategy, planning, and research regarding the state-of-the-art data networking and telecommunications facilities for NCAR/UCAR. Support of these facilities requires NETS staff to:

  • Troubleshoot network hardware and software
  • Update network configurations
  • Monitor network performance, load, and errors
  • Expand networks to meet increased numbers of users, increased bandwidth requirements, and new standards
  • Design, engineer, and construct new cabling plant and wireless networking systems
  • Support over 167 logical networks, approximately 200 monitored network devices, and over 4,400 network-attached devices at NCAR/UCAR
  • Support Remote Access System (RAS)
  • Design, engineer, and maintain UPS infrastructure
  • Provide network-based security
  • Support telecommunications and voicemail
  • Provide telephone operator and directory services
  • Support all SCD internal networking needs
  • Maintain network documentation and databases
  • Track and coordinate all network project activities via a work-request/work-tracking system
  • Manage networking equipment inventory
  • Develop and manage network budget
  • Manage and engineer the Front Range GigaPOP (FRGP) and the Boulder Point of Presence (BPOP)
  • Help manage the Boulder Research and Administration network (BRAN)
  • Lead, participate in, and support the networking portion of advanced projects in high-performance network research and development funded by outside or interagency sources, including international networking project support, Web100, NPAD, the Cisco Large Packet Grant, Net100, National LambdaRail (NLR), and the Quilt
  • Evaluate new networking technologies and equipment
  • Consult with network users about configuration and performance issues with network applications, network hosts, and network connections
  • Provide high-performance networking testbeds
  • Provide high-performance networking measurement tools
  • Tune network performance

In summary, NETS provides a vital service to the atmospheric research communities by linking scientists and supercomputing resources (including mass storage systems and other data processing resources) at NCAR to other resources and scientists throughout the university research community. Such high-performance networking activities are essential to the effective use of NCAR/UCAR's scientific resources, and they foster the overall advancement of scientific inquiry.

The primary NETS accomplishments in 2004 include:

  • Participating in the National LambdaRail (NLR) Project
  • FL0 and CG1 design
  • Dark Fiber Expansion
  • Participating in The Quilt Project
  • CyRDAS Report

These projects along with the rest of the NETS 2004 accomplishments are described in this Annual Scientific Report.

Networking research projects and technology tracking

Networking research projects
NETS is a principal collaborator in several nationally recognized research and development projects in networking and data communications technology. NETS hosts and presents at national and regional meetings on a variety of networking projects. NETS was awarded an STI award for the Network Path and Applications Diagnosis (NPAD) project proposal, in collaboration with the Pittsburgh Supercomputing Center (PSC), and NETS contributed to the NIH BRIN Lariat project. NETS also assisted in the submittal of the NSF Chronopolis and NSF IRNC proposals.

Steering Committee for Cyberinfrastructure Research and Development in the Atmospheric Sciences (CyRDAS)
Marla Meehl served on the NSF Steering Committee (SC) for Cyberinfrastructure Research and Development in the Atmospheric Sciences (CyRDAS). This committee was responsible for assessing: the opportunities for advances in atmospheric science research made possible by current or anticipated advances in information technology and computer science; the opportunities for advances in science that might result from collaborative research between atmospheric scientists and computer scientists; the ways in which cyberinfrastructure can contribute to formal and informal education in atmospheric science; and the cyberinfrastructure needs of the NSF-funded atmospheric science research community. The committee also made recommendations on strategies for NSF that will help the academic research community exploit the opportunities identified in these assessments. Finally, the CyRDAS Steering Committee developed an implementation plan for a distributed cyberinfrastructure that will meet the needs of the academic atmospheric science research community and that has the flexibility to grow smoothly as that research advances and CI needs grow.

The Hybrid Optical and Packet Infrastructure Project (HOPI)
When Internet2 was first organized in October 1996, one defining mission was to provide scalable, sustainable, high-performance networking in support of the research universities of the United States. The resulting infrastructure, composed of campus, regional, and national components, is a successful and robust packet-switched network. In the next few years, however, it must evolve to take advantage of new infrastructures, and the HOPI testbed will examine future network architectures. In planning the Internet2 network architecture needed beginning in 2006, HOPI is considering a hybrid of shared IP packet switching and dynamically provisioned optical lambdas. The term HOPI (hybrid optical and packet infrastructure) denotes both the effort to plan this future hybrid and a testbed facility that will be built to test various aspects of candidate hybrid designs. A white paper describes a plan for the HOPI testbed facility, whose goal is "to provide a facility for experimenting with future network infrastructures leading to the next generation Internet2 architecture." The eventual hybrid will require a rich set of wide-area lambdas together with switches capable of very high capacity and dynamic provisioning, all at the national backbone level.

Web100 project
The Web100 project is funded by the National Science Foundation (NSF) as a collaborative project developing end-host TCP performance measurement and enhancement tools, which help end-hosts automatically and transparently achieve high TCP data rates (>100 Mbps) over the high-performance research networks. Software and tools have been developed for the Linux operating system in an open manner so they can readily be ported to other operating systems. Such ports are currently in progress. The Web100 principal partners are the National Center for Atmospheric Research (NCAR), the Pittsburgh Supercomputing Center (PSC), and the National Center for Supercomputing Applications (NCSA).

The Web100 project achieved significant progress on several of its key milestones during the past year. Since the release of the Web100 software to the general user community, numerous individuals and groups have incorporated it into a wide and diverse set of useful applications. In addition, the Web100 TCP Extensions MIB is on the Internet Engineering Task Force (IETF) standards track and is expected to be submitted for last call by the end of this calendar year. Most importantly, the Web100 software is being officially incorporated into the Linux 2.6 kernel development release for use as library functions in all Linux distributions. Microsoft has reported incorporating much of the Web100 software functionality into the next release of its .NET server software, and a BSD port of the Web100 software is also underway. This project was completed in August 2004.

Net100 project
The Net100 project is creating software extensions that allow computer operating systems to dynamically adjust to the available network bandwidth for large data flows. Net100 has just completed the final year of a three-year grant from the Mathematical, Information, and Computational Sciences (MICS) program in the Office of Science at the U.S. Department of Energy (DOE). The project is a collaboration between the Pittsburgh Supercomputing Center (PSC), the National Center for Atmospheric Research (NCAR), Lawrence Berkeley National Laboratory (LBNL), and Oak Ridge National Laboratory (ORNL). During the past year, considerable additions were made to the wide-area daemon (WAD) extension algorithms. The WAD code has also been extensively used and tested by several groups of external users, providing useful feedback to the Net100 team. The Net100 project has also implemented algorithms previously run only in simulations, providing useful feedback to the algorithms' author for additional modifications along their Internet Engineering Task Force (IETF) standardization process.
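
The core idea behind WAD-style tuning is to match TCP's socket buffers to the path's bandwidth-delay product so a single large flow can keep a long, fast path full. The following minimal Python sketch is not Net100 code; the link speed and round-trip time are illustrative assumptions used only to show the arithmetic and the standard socket options involved.

    import socket

    def tune_socket_for_path(sock, bandwidth_bps, rtt_s):
        """Size the send/receive buffers to the path's bandwidth-delay
        product so TCP can keep the pipe full; the operating system may
        clamp the request to its configured maximums."""
        bdp_bytes = int(bandwidth_bps / 8 * rtt_s)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, bdp_bytes)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, bdp_bytes)
        return bdp_bytes

    # Illustrative values: a 622 Mbps path with a 50 ms round-trip time
    # needs roughly 3.9 MB of buffering to run at full rate.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print(tune_socket_for_path(s, bandwidth_bps=622e6, rtt_s=0.050))

The WAD automates this kind of adjustment transparently, per flow, without requiring changes to the applications themselves.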

Network Path and Applications Diagnosis (NPAD)
A key missing piece of the end-to-end performance puzzle is that the current set of diagnostic strategies does not adequately account for the effects of path delay. The project team is developing extensions to existing diagnostic tools that take path delay into consideration, compensate for a variety of delay times, and test the effects of these new diagnostic tools with network users and operators, using actual high-performance applications. Insights gained from the Web100 and Net100 projects show that the symptoms of most application and network defects scale with increasing path delay. For example, a minor defect in a campus LAN might have an insignificant or negligible effect on an application running on a 1-ms path across campus, yet that same defect has a much greater impact on performance over a long path across the continent. In this project, we are developing new diagnostic techniques, including a suite of diagnostic tools and strategies for their use: applications are first tested locally with a 100-ms virtual path, and then each successive segment of the actual path is tested with a virtual-path extension bringing the total path delay to 100 ms. Such testing is expected to expose otherwise hidden flaws and impediments, since each component of the path and application can be tested in a context equivalent to an ideal end-to-end path while other potential flaws are ruled out.
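
One common way to see why defect symptoms scale with path delay (offered here as background, not as part of the NPAD project description) is the widely cited steady-state TCP throughput model of Mathis et al.:

    \[
      \text{Throughput} \;\lesssim\; \frac{\mathrm{MSS}}{\mathrm{RTT}} \cdot \frac{C}{\sqrt{p}}
    \]

where MSS is the segment size, RTT the round-trip time, p the packet-loss rate introduced by the defect, and C a constant near one. For a fixed loss rate, achievable throughput falls in proportion to RTT, so a flaw that is invisible on a 1-ms campus path can cost roughly two orders of magnitude in throughput on a 100-ms continental path.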

Cisco University Research Program Funding - Investigating Large Maximum Transmission Units (MTU)
Over the last decade, we have witnessed a tremendous increase in raw network capacity. Today, we are seeing the ubiquitous deployment of cross-country 10 Gbps optical networks and early standards efforts in support of 40 Gbps and 100 Gbps networking technology. As a result, long fat pipes (elephants) [RFC1323] are no longer a rarity -- in fact, they are becoming the norm. For example, NSF funded the 40 Gbps Distributed Terascale Facility (DTF) [RB02], UCAID has built a SONET-based 9.6 Gbps Abilene Network of the Future [Int02], and contractual negotiations have been completed on a dark-fiber lambda network called the National LambdaRail. However, it is not clear whether these network capacity increases will actually result in comparable increases in application performance, due to a number of specific underlying technical issues and limitations. It is well known that bulk transport application performance has not kept up with network capacity increases; we believe that application performance has fallen by about two orders of magnitude relative to the raw performance of the underlying technologies. In this proposal, we intend to identify, investigate, and address technical issues specifically associated with the underlying network infrastructure and application performance for next-generation Internet networks.
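
A quick packet-rate calculation makes the motivation for larger MTUs concrete. The sketch below is illustrative only: it ignores framing overhead, and the 64,000-byte value is a hypothetical large MTU rather than a deployed standard.

    def packets_per_second(line_rate_bps, mtu_bytes):
        """Approximate packet rate needed to fill a link at a given MTU,
        ignoring framing overhead for simplicity."""
        return line_rate_bps / 8 / mtu_bytes

    for mtu in (1500, 9000, 64000):
        pps = packets_per_second(10e9, mtu)
        print(f"MTU {mtu:>6} bytes: ~{pps:,.0f} packets/s to fill a 10 Gbps link")

At the standard 1500-byte MTU, an end host must process over 800,000 packets per second to fill a 10 Gbps link; 9000-byte jumbo frames cut that per-packet overhead by a factor of six, and still larger MTUs would reduce it further.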

NOAA High Performance Computing and Communications (HPCC) Program
NOAA has asked NCAR to participate in its proposal to the NOAA High Performance Computing and Communications (HPCC) Program, with no funds proposed for NCAR. NOAA and NCAR are making excellent advances in supercomputing and planning for the development of enterprise network architecture. However, NOAA and NCAR currently do not have a distributed network test facility for evaluating new network technologies, other than in local environments, although the benefits of such a facility are numerous. NOAA and NCAR would be well served by developing a way to test and integrate new network technologies to gain vital experience for promoting NOAA and NCAR research and operations. We propose to develop an optical network testbed using Dense Wavelength Division Multiplexing (DWDM) by leveraging NOAA and NCAR's existing investment in a metropolitan fiber optic network. This network testbed will provide a platform for integrating DWDM technology and for determining how DWDM will best serve NOAA and NCAR research and operations, but it will also address other important challenges. The optimization of data flows and server input/output at 10 Gbps will require study, testing, and tuning, and the question of how to secure data flows at this rate will also be addressed. The proposed DWDM network testbed will include a pair of optical network switches with Gigabit Ethernet (GbE), 10 GbE, and DWDM components to be procured and located at NOAA Boulder and at NCAR. An optical tap will also be part of the testbed to allow passive monitoring of all data flows. In addition, four 64-bit computer systems will be procured to transmit, receive, monitor, and secure the data flows. Two systems will be located at NOAA and two at NCAR. NCAR will cosponsor 5% of Pete Sakosky's time and provide a rack of computer room space for this project. This proposal was submitted and is pending approval.

Access Grid
NETS participated in the network configuration, testing, and optimization of the SCD Access Grid. NETS will continue to participate in the deployment of operational Access Grids.

Earth System Grid project
NETS provided network-engineering support to the DOE ESG II project.

Network technology tracking and transfer
NETS tracks networking technology via networking conferences, training classes, vendor meetings, beta tests, technology demonstration testing, email lists, networking journals, attending user conferences, and by meeting and exchanging information with universities and other laboratories. NETS staff continued to utilize all of these technology-tracking avenues during FY2004. New technologies are integrated into the production UCAR networks on an ongoing basis after a technology has met our capacity, performance, connectivity, reliability, usability, and other maintainability requirements.

Local Area Network (LAN) projects

NETS supports both NCAR/UCAR network needs as well as the special networking needs of SCD itself. Therefore, all LAN projects are further subdivided as being either NCAR/UCAR LAN projects or SCD LAN projects.

NCAR/UCAR LAN projects

UCAR network infrastructure recabling projects
The common goal of all UCAR recabling projects is to provide each workspace with a standard set of dedicated data communications links. The overall plan calls for each workspace to be provisioned with a standard Telecommunications Outlet (TO) that connects with four Category 6 (CAT6) twisted-pair cables and two pairs of multi-mode optical fiber. Additionally, intra-building (trunk) cabling must be installed to concentrate all workspace cables to intermediate and central locations.

Concurrent with recabling, each network device is delivered 100 Mbps of dedicated bandwidth via a dedicated Ethernet packet-switch port. Such dedicated-port access offers substantial networking performance improvement over shared-media Ethernet access.

NETS designed permanent network infrastructure for the CG1 and FL0 buildings. NETS assisted in the extensive relocation and related cabling for FL4 staff. NETS recabled the FL2-FL3 interbuilding cabling due to FL0 construction.

NETS also participated in the following projects: CG bike path design, Jeffco hangar networking design and implementation, UNAVCO move and lease completion, and the Nextel cellular repeater design and implementation.

Network monitoring project

NETS continues to use HP OpenView, FlowScan, Prognosis, and Cricket as its principal tools for network monitoring and statistics gathering.

Additionally, NETS has installed specialized network monitors at the request of two national network-measurement organizations, NLANR/MOAT and Internet2. NLANR's MOAT organization has placed an OC3MON monitor at NCAR's Mesa Lab and installed an OC12MON in the Front Range GigaPOP equipment racks located in CU Denver's computer room. MOAT has also placed an AMP monitor at both the Mesa Lab and the FRGP. On behalf of UCAID's Abilene (Internet2) network, ANS has placed a Surveyor network monitor at both the Mesa Lab and the FRGP.

Local serial-access project
NETS supports several terminal servers for providing serial console access to various computer and networking equipment. Serial support is also provided for the very few serial terminals remaining at UCAR.

NETS CSAC support project
The NCAR/UCAR Computer Security Advisory Committee (CSAC) is chartered by the SCD Director to assess the state of computer and network security at NCAR/UCAR, and to make recommendations to assist NCAR and UCAR management in setting policies related to the security of computers and other devices attached to the NCAR/UCAR network. CSAC membership is composed of technical representatives located throughout the various NCAR/UCAR organizations.

NETS is involved with CSAC because nearly all security policies involve various types of network-connected devices located between the networks belonging to the external world and the UCAR networks being protected from it. These network-attached devices can operate as filters and/or authentication devices at one or more OSI (Open Systems Interconnection) layers, usually at the Network/Router Layer (Layer 3) and higher. Based on CSAC recommendations, NETS continues to implement significant new gateway router filters to improve network security for UCAR. Extensive testing and coordination throughout UCAR is required to implement the recommended security filters. NETS also cooperates on wireless, RAS, and VPN security measures.

VLAN Splitting Project/Layer 2/3 Design
NETS is in the process of re-engineering our backbone to provide higher reliability and redundancy to network-based services. Previously, NETS allowed single subnets and VLANs to span across all campuses. This was a manageable design when NCAR was located in only two main campuses. The design offered increased convenience for users in being able to request any subnet to be activated anywhere at the Mesa Lab or Foothills Lab campuses.

NETS was driven to reevaluate this design by the addition of a third major campus, Center Green, and by the desire for increased reliability for VoIP and business continuity. The old design did not include any redundant links, because such links would create network loops that must be handled by the spanning tree protocol. With the addition of the CG campus, however, NETS has built redundant links so that the ML, FL, and CG campuses are connected in a triangle. While the spanning tree protocol works adequately in a simple, loop-free network, it handles redundant links and the loops they form quite poorly. To take full advantage of the entire mesh of links between the campuses, it is necessary to use a more intelligent protocol at a higher layer.

At the IP layer, NETS has been using OSPF for a number of years. It is fully capable of handling the current topology, not only detecting and routing around link failures within a few seconds, but also routing traffic over the shortest path between campuses. NETS now has the new backbone fully deployed and is in the process of restricting subnets to a single campus. Completion of the project's final stage is expected before the end of the year.
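
The advantage of moving the redundancy problem up to the IP layer is that a link-state protocol such as OSPF computes shortest paths over the whole mesh rather than simply blocking redundant links. The sketch below is not OSPF itself, only the shortest-path computation at its core, applied to a hypothetical three-campus triangle with equal link costs.

    import heapq

    def shortest_paths(graph, source):
        """Dijkstra's algorithm over a weighted graph -- the same idea a
        link-state protocol applies to the router topology it has learned."""
        dist = {source: 0}
        heap = [(0, source)]
        while heap:
            d, node = heapq.heappop(heap)
            if d > dist.get(node, float("inf")):
                continue
            for neighbor, weight in graph[node]:
                nd = d + weight
                if nd < dist.get(neighbor, float("inf")):
                    dist[neighbor] = nd
                    heapq.heappush(heap, (nd, neighbor))
        return dist

    # Hypothetical campus triangle with equal-cost links.
    campus_links = {
        "ML": [("FL", 1), ("CG", 1)],
        "FL": [("ML", 1), ("CG", 1)],
        "CG": [("ML", 1), ("FL", 1)],
    }
    print(shortest_paths(campus_links, "ML"))   # {'ML': 0, 'FL': 1, 'CG': 1}

When any one link fails, rerunning the computation on the reduced topology still reaches every campus over the remaining two links, which is exactly the failover behavior described above.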

Multicast support activities project
Multicasting is a technology in which a single outbound stream of data can be made to arrive at multiple destinations. The data stream is multiplied in a tree-wise fashion using both software and hardware to effect the multiplication. Multicasting technology is particularly useful for video conferencing and audio conferencing applications. NETS continues to support and enhance multicast services for UCAR.
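
At the host level, receiving a multicast stream amounts to joining a group address so that the network begins delivering copies of the stream. A minimal Python sketch of a receiver follows; the group address and port are hypothetical.

    import socket
    import struct

    GROUP = "239.1.2.3"   # hypothetical administratively scoped group address
    PORT = 5004

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))

    # Ask the kernel (and, via IGMP, the local routers) to deliver traffic
    # sent to the group address to this host.
    mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

    data, sender = sock.recvfrom(2048)
    print(f"received {len(data)} bytes from {sender}")

The replication itself happens in the network, so a conference sender transmits one copy of its stream regardless of how many participants have joined the group.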

UPS project
NETS has continued installing UPS (Uninterruptible Power Supply) units in all new telecommunication closets so that all networking equipment will receive short-term standby power in the event of a power failure. UPS units also help filter out damaging power spikes. Upgrading, expanding, and maintaining these devices is an ongoing process.

In addition, SCD installed a generator at the Mesa Lab, and NETS has tied its equipment into it in the computer room and in areas where safety and security support is critical. NETS is also in the process of tying its UPS units at FL into the facilities generator to provide additional business continuity.

Grounding
NETS is in the process of grounding all communications closets and NETS hardware to eliminate static issues causing hardware and phone failures. This is a technically difficult and time-consuming project.

Wireless
NETS supports wireless network access in all public areas and conference rooms, and is in the process of deploying wireless access in all office areas as well. NETS also designed, tested, and installed long-distance, high-speed wireless links between CG2 and FL4 and between the Mesa Lab and the Foothills Lab. These links provide backup to the wired links.

Voice over IP (VoIP)
VoIP in its full implementation is the complete merging of data, voice and video networks. Traditional separate voice systems such as PBXs, telephone cabling infrastructure, and proprietary telephone handsets are replaced with computer-based call managers, IP telephones, and IP-based voicemail devices that are Ethernet attached just like any other networked device. With a VoIP system, the same cabling infrastructure that is used for traditional network devices can also be used for VoIP devices. Such a combination simplifies the overall cabling infrastructure and lowers total costs of installation and maintenance. Additional, ongoing cost savings are realized through simplified moves, adds, and changes made possible through a VoIP-based system. The same network monitoring system can be used to monitor both VoIP and data network devices.

The VoIP project was completed January 2004.

SCD LAN projects

In addition to its overall NCAR/UCAR networking responsibilities, NETS has additional special support responsibilities specific to SCD. NETS handles or consults on most of the host-based SCD networking, including all supercomputing networking. NETS is also responsible within SCD for several other tasks including:

  • Special cabling fabrication and installation
  • Networking and system testbeds
  • External and local conference networking and system support
  • Special-grant project networking needs
  • Security

Ongoing SCD network support project
NETS is responsible for most aspects of daily operation of the SCD LANs and host-based networking. Among these responsibilities are monitoring, managing, tuning, troubleshooting, upgrading, reconfiguring, and expanding SCD LANs and host-based networking. NETS works closely with the system administrators of all SCD network-connected systems.

Supercomputing network support project
NETS supports almost all aspects of networking for all SCD supercomputers. This includes hardware, software, and routing configuration support for Gigabit Ethernet and Fast Ethernet interfaces. IP routing configuration support is also provided for the supercomputer connections.

ML 29 move and upgrade
NETS has been asked to relocate its equipment in the SCD Computer Room 29. In this process, NETS will also upgrade the 6509 Ethernet switches to Sup720 supervisor engines to provide full backplane capability and to support 10 Gbps in the future. The design and planning for this project are complete.

Metropolitan Area Network (MAN) projects

Boulder Point-Of-Presence (BPOP)
The Boulder Point-of-Presence, or BPOP, is a collaboration of institutions in the Boulder area. BPOP members cooperate to provide access to wide-area network services. The BPOP members are NCAR, the University of Colorado at Boulder (UCB), NOAA, Colorado State University, the City of Boulder, and the County of Boulder. BPOP equipment consists of a router, an ATM switch, and an Ethernet switch, all located at the UCB Telecommunications Center.

The BPOP offers two primary services to its members: access to ICG Dedicated Internet Service and access to the FRGP. Access to ICG is provided via an OC12 link (upgraded this year from OC3) connected to BRAN fiber. A Gigabit Ethernet connection to the FRGP is provided over dark fiber; this link is shared by NCAR, NOAA, CSU, and the City/County of Boulder. UCB has its own OC12 link to the FRGP, and CSU has its own OC3 to the FRGP. UCB and the BPOP also back up each other's links to the FRGP in Denver.

  • BEAR Project - BPOP Enhancement And Relocation project or Moving the BPOP to CU
    Scot Colburn spearheaded the BEAR project to enhance robustness and redundancy in the UCAR and Boulder networks. NCAR's ML and FL sites are at remote ends of the BRAN fiber path, making these sites vulnerable to isolation should a BRAN fiber cut occur. A wireless link is in place to connect the two sites in case of such a cut, but NCAR's Internet connection depended on a border router at Mesa Lab not being isolated. Moving NCAR's border router to a location on the ring portion of BRAN created an opportunity for redundant Internet connections and a design where Internet access is invulnerable to BRAN fiber cuts. Co-locating NCAR network hardware on the BRAN ring involved finding and negotiating appropriate space in a building located on the ring, and it turned out that CU's Telecom building best fit the bill with a combination of well-secured and well-organized space having good power and cooling capacity. Fortunately, NCAR and CU have a history of working successfully together on projects such as BRAN. In addition to negotiating equipment space with CU, co-locating the necessary hardware involved scheduling, cutting into and re-terminating the NCAR portion of the BRAN fiber inside CU Telecom, which was done without incident. With BEAR complete, the BPOP members that use NCAR for Internet service or Internet backup connect to a network designed to be invulnerable to fiber cuts. Drivers for involved parties included:

    • NCAR sites receive reliable and robust Internet access. NOAA receives reliable and robust Internet access over NCAR's ML-FL wireless link. CSU GigE becomes less vulnerable to BRAN cuts. CoB gets more reliable Internet access and is invulnerable to BRAN cuts with a second GigE.
    • CU already has 622 Mbps to the FRGP, but its backup FRGP access over NCAR's fiber path becomes more reliable via the collocated switch.

    The BEAR project was completed this year.

  • BRAN project
    BRAN is the Boulder Research and Administration Network, which is an eleven-mile fiber network in Boulder built and operated by the four BRAN partners to privately interconnect their Boulder-area facilities. The BRAN partners include:

    • City of Boulder
    • University of Colorado at Boulder
    • National Center for Atmospheric Research
    • NOAA-Boulder and NIST-Boulder

    UCAR has greatly reduced intra-Boulder circuit costs by utilizing BRAN fiber. Active BRAN circuits involving UCAR facilities include ML-FL, ML-FL Security, City-UCB, NOAA-ML, CU-ICG, CSU-UCB, CU-ML, UCB-FRGP (via Level3), CG-ML. BRAN carries voice and data traffic. NETS continues to actively participate in the BRAN management committee and the BRAN technical committee.

Remote-working and home-access project
Remote access continues to be provided by four digital T1 PRI lines connected to a Cisco AS5300 Remote Access Server (RAS). Each T1 PRI line provides twenty-three 56-Kbps channels that can support analog or ISDN dial-in access, for a total of 92 dial-in channels. Long-distance access via direct 1-800 lines is overlaid on one of the PRI lines. Telnet, PPP, and ARAP access are supported on the Cisco RAS device.

Dark fiber efforts
NETS led the effort to interconnect BRAN to ICG fiber at the Boulder ICG POP. This was a long-term goal of BRAN, and it has allowed added flexibility in planning and expanding BRAN to include fiber paths from Boulder to Denver and Boulder to Fort Collins, the addition of the Center Green campus, the OC12 (622 Mbps) ICG path for Commodity Internet services, and the ICG Primary Rate Interface (PRI) telecommunication and RAS services provided to UCAR. These efforts allow future strategic directions to be taken for the expansion of dark fiber, additional cost-effective ICG services, distributed storage area networks (SANs), and business continuity goals.

NETS is in the process of installing fiber to the Jeffco site. NETS also worked with the University of Wyoming to secure fiber from Laramie to the FRGP and to provide a northern protect fiber ring in Colorado and Wyoming. NETS is working to secure a private, diverse fiber ring in downtown Denver to protect the FRGP access locations, and is working on the Western Lights project to investigate fiber paths in the western U.S., including Utah, to provide a protect path to the FRGP. NETS has also held discussions about extending fiber to other FRGP members such as the Colorado School of Mines, the University of Colorado at Colorado Springs, the University of Northern Colorado, and Denver University.

Wide Area Network (WAN) projects

Front Range GigaPOP (FRGP)
The Front Range GigaPOP (FRGP) is a consortium of universities, non-profit corporations, and government agencies that cooperate in an aggregation point called the FRGP to share Wide Area Networking (WAN) services connecting to the Commodity Internet, Abilene/Internet2, and to each other. UCAR operates the Front Range GigaPOP under contract to the other members. The primary FRGP networking facilities are located in the Auraria Campus computer room of the University of Colorado at Denver and in the Level3 Denver POP located at 1850 Pearl Street in Denver. The current partners are (* = new this year):

  • CSM - Colorado School of Mines
  • CSU - Colorado State University system
  • DU - Denver University
  • FLC - Fort Lewis College
  • *Ithaka (provider of online scientific journal articles)
  • NCAR - National Center for Atmospheric Research
  • NOAA - National Oceanic and Atmospheric Administration
  • State - State of Colorado
  • UCB - University of Colorado at Boulder
  • UCCS - University of Colorado at Colorado Springs
  • UCD - University of Colorado at Denver
  • UCHSC - University of Colorado Health Sciences Center
  • *University of Utah
  • UW - University of Wyoming

There are several GigaPOPs similar to the FRGP throughout the U.S., and a number of advantages are gained by sharing services through such GigaPOPs:

  • Costs for WAN services are reduced for each partner
  • Expertise among partners can be shared
  • A higher level of services can be purchased than the individual institutions could afford
  • There is more buying power among a consortium
  • There are great economies of scale
  • Quilt membership and their negotiated discounts with Commodity Internet providers are available only to GigaPOPs

UCAR has provided the engineering and Network Operations Center (NOC) support for the FRGP, with the service costs incurred by UCAR shared by all FRGP members. NETS believes that the greater service and bandwidth obtained through the FRGP justify UCAR's role in operating it, and the FRGP members agree that NETS has the most qualified engineering and NOC staff to provide these services for the FRGP.

The FRGP is a critical service for UCAR staff as well as all of the other FRGP partners, and the FRGP has proved to be an extremely successful technical project as well as an excellent collaboration with the Colorado, Wyoming, and Utah research and education community.

NETS made many significant improvements to the FRGP during 2004. These include:

  1. WilTel Commodity Internet Service at GigE.
  2. Qwest Commodity Internet Service at GigE.
  3. A peering agreement was created and approved, and we signed Comcast and the FRII as our peering partners at no cost to UCAR or the FRGP.
  4. FRGP formed a consortium with Utah and signed NLR, LLC Participation Agreement.
  5. NETS is developing an FRGP/BPOP redundancy plan called DREAM.
  6. The FRGP is developing a northern fiber ring for redundancy.
  7. New 5-year FRGP agreements were developed and executed.

The Quilt
The Quilt is a project whose participants are nonprofit advanced regional network organizations dedicated to advancing research and education in the United States by:

  1. Providing a broad range of advanced networking services to their constituents, including network engineering, management, and operation; regional connectivity and exchange; and promotion and coordination of regional activities

  2. Facilitating innovative and successful projects and productive working relationships

The Quilt's specific purposes and objectives are to:

  1. Provide advanced network services to the broadest possible research and educational community

  2. Promote end-to-end continuity, consistency, reliability, interoperability, efficiency and cost-effectiveness in the development and delivery of advanced network services by means which, at the same time, foster innovation and reflect the diversity of its members

  3. Represent our common interests to backbone network service providers, industry, government, standard-setting organizations, and other organizations involved in or influencing the development and delivery of advanced network services

The FRGP joined The Quilt in June 2001, and NETS represents the FRGP on the Quilt Steering Committee and the Quilt Executive Committee. NETS actively serves on the Network Facilities Project, the Regional Fiber Project, the Peering Project and the hiring committee. The FRGP continued to take advantage of Quilt discounted pricing for Commodity Internet services, which significantly decreased prices paid by FRGP for Commodity Internet services this past year.

National LambdaRail (NLR)
National LambdaRail, Inc. (NLR) is a consortium of leading U.S. research universities and private-sector technology companies building and operating a national optical network of multiple 10 Gbps LAN-PHY wavelengths. NLR's fundamental mission is to provide an enabling network infrastructure for new forms and methods of research in science, engineering, health care, and education, as well as for research and development of new Internet technologies, protocols, applications, and services. As evidence of its commitment to this mission, NLR will devote 50 percent of its wave allocation to network research. NLR puts the control, the power, and the promise of experimental network infrastructure in the hands of the nation's scientists and researchers, and it aims to re-energize innovative research and development into next-generation network technologies, protocols, services, and applications.

NETS has played a leading role in developing NLR throughout the year. NETS also worked to build an FRGP/Utah consortium to join NLR.

vBNS+
The vBNS+ network is a production ATM network owned and operated by MCI that interconnects the NSF supercomputing centers, universities, and other customers. UCAR remains connected to the vBNS+, and NETS will continue to support the connection as long as MCI continues to provide this service to UCAR.

Internet2/Abilene
Internet2 operates the Abilene network, a SONET OC-192 (9.6 Gbps) national backbone supporting high-performance connectivity and Internet innovation within the U.S. research university community. The advanced services supported include IPv6 and multicast. As of October 2004, Abilene had 44 connected entities serving 228 institutions. UCAR joined UCAID and attached to Abilene via the FRGP in FY2000.

Other network projects

Projects listed in this section are ones that don't neatly fit into the Research, LAN, MAN, or WAN project classification scheme.

Telecommunications
NETS is responsible for all voice communications at UCAR. Responsibilities include engineering, maintenance, configuration, and operation of the telecommunications cabling plant, the VoIP system, and telephone handsets. Telephone directories and UCAR telephone reception are also NETS responsibilities. The PBX was decommissioned and removed this year. See the VoIP subsection of the LAN section above for more information on work in this area.

Westnet
Westnet is an affinity group that grew out of the NSFnet regional network called Westnet. NETS' long-term participation continued, and NETS remains in a leadership role. NETS is a member of the Westnet Steering Committee and leads the effort to plan and run twice-yearly meetings that include technical presentations from members and vendors. Westnet provides powerful political and technical contacts with Rocky Mountain area universities that are UCAR members and that share common information technology concerns. The current Westnet members include:

  • CU-Boulder
  • CU-Denver
  • Colorado State University
  • University of Denver
  • University of Wyoming
  • University of Utah
  • Utah State University
  • Brigham Young University
  • Arizona State University
  • University of Arizona
  • University of New Mexico
  • New Mexico State University
  • New Mexico TechNet
  • Idaho State University
  • Boise State University
  • South Dakota School of Mines and Technology
  • UCAR

Project-tracking system
The project-tracking system continued in full production in FY2004. NETS work requests and projects are opened, tracked, and closed with this project-tracking system. The use of project-tracking tools is necessary because of the vast number and variety of projects. It would be unwieldy to manually track even just the personnel assignments for these hundreds of projects, much less track progress details of so many projects. In 2004, NETS completed more than 934 work requests and 30 projects.

Property-tracking system
NETS used the Remedy-based asset manager system to track its small property equipment. NETS is also responsible for tracking all large NETS equipment, including the VoIP phone system. Presently, NETS tracks approximately 360 equipment items. (See: NETS Inventory).

Business continuity
It is UCAR's policy to develop and maintain a corporate-wide contingency program, known as the Business Continuity Plan (BCP). This plan aims to provide the corporation with every opportunity to withstand a catastrophic event, be it accidental, man-made, or natural, and resume total operations in an efficient and effective manner. It is expected that, with the BCP in place, management will be able to provide the swift and decisive leadership that will be necessary for a successful recovery. Also, because of the guidance provided by the BCP, it is expected that employees will be able to carry out their tasks and responsibilities efficiently and effectively. NETS actively contributes related material to this document and process.

Documentation
NETS has an ongoing effort to create and maintain documentation in support of project management and network support.

NETS NCAB project
The Network Coordination and Advisory Board (NCAB) consists of appointed technical representatives from the NCAR divisions and UCAR. The purpose of NCAB is to advise NETS concerning network strategy, planning, policy, expansion, and management issues for all of NCAR and UCAR. The work of NCAB continues to be indispensable to the success of networking at UCAR. NCAB meets monthly.

Strategic plan
At the request of UCAR management, NETS formulated a comprehensive strategic plan outlining technical and budgetary requirements for UCAR networking for the next several years. This document was completed this year and received NCAB and ITC approval.

Conferences
NETS continues to provide networking support for classes, demonstrations, meetings, and conferences throughout NCAR/UCAR. This work involves the design, construction, configuration, and operation of the network components required for these activities. NETS also sponsored and hosted the following events in 2004:

  • January 2004 Westnet Meeting
  • June 2004 Westnet Meeting
  • October 2003 CyRDAS Focus Group Meeting
  • The Quilt Fiber Workshops
  • I2 IPv6 Reception
  • CyBerSecurity Summit Organizing Committee

Outreach support

NETS personnel attended and presented at several conferences, meetings, and training sessions, preparing numerous trip reports and presentations. See Education and Outreach for more detail.

Committee support

NETS representatives have attended and supported the following NCAR/UCAR committees:

  • The Network Coordination Advisory Board (NCAB)
  • The Computer Security Advisory Committee (CSAC)
  • Advisory Committee for Central Infrastructure Service (ACCIS), chaired by Jim Van Dyke
  • Information Technology Council (ITC)

SCD committees:

  • The SCD Executive Committee
  • The Computer Room Planning Committee (CRPC)
  • The Machine Dependencies Committee
  • The SCD Security Policy Committee (SSPC)

External committees:

  • The Quilt Executive Committee
  • The Quilt Steering Committee
  • The Westnet Steering Committee
  • The Front Range GigaPOP Management Committee (FMC)
  • The Front Range GigaPOP Technical Committee (FTC)
  • The BRAN Technical Committee
  • The BRAN Management Committee
  • The CyRDAS Committee
  • The NLR Board of Directors
  • NLR Engineering Committee
  • HOPI Design Team

Assistance and Support for NCAR's Diverse Research Community

SCD's User Support Section (USS) provides leading-edge software expertise for the climate, atmospheric, and oceanic research communities to facilitate their high-performance computing endeavors at NCAR. The section furnishes a variety of services to users, both local and remote, that enable them to pursue their research within SCD's end-to-end high-performance computing environment. USS has direct contact with NCAR's user community, and with a staff of 26, represents a continuing divisional investment in focused user support services tailored to their specific needs.

Consulting Services and User Support

Consulting staff assisted users in transitioning models and large codes from chinook to bluesky and in transitioning data analysis jobs from dataproc to tempest. This work occurred during the June-September timeframe. SCD also carried out two important computing campaigns:

  1. CGD's IPCC campaign running models 7x24 on 18 nodes of bluesky
  2. The High Resolution WRF Real Time campaign April 15-July 31

SCD also ran memory-affinity experiments on bluesky before the IPCC campaign that showed a 7-10% speedup in throughput with this feature, and accordingly decided to adopt it for the production environment.

Online Documentation

USS staff has developed hardware and software documentation for the following computers:

  • lightning - our new IBM Linux cluster
  • tempest - our 128-processor SGI 3800 data analysis server

User guides for these two machines have been developed and placed online. The lightning documentation will be augmented as SCD gains experience with the Linux cluster.

Staff also revised and redesigned the two primary guides to all user information about NCAR computers.

In addition, we produced online documentation for Fishpack90, a new version of our popular math software library for our users.

Security and related issues

To support new security requirements for the NCAR computing environment, staff worked with UCAR and NCAR security stakeholders and published new SCD supercomputing Security web pages, providing guidelines, instructions, and procedures for user passwords and for using the new CryptoCard one-time password tokens.

Web-based services and activities

The SCD website was instrumented with targeted search tools that give users new, more effective ways to access specific computing information.

We also designed and developed four new websites in support of SCD, NCAR, and UCAR activities:

COMSCI: Staff developed this site to support a UCAR-wide staff-based initiative to increase the communication skills of NCAR and UCAR scientists and technical staff.

CSAC: Staff redesigned this web interface and redeployed content for UCAR's Computer Security Advisory Committee.

SCICOMP12: Staff designed and launched a new website as part of SCD's successful bid to host the 12th annual meeting of SCICOMP, an international organization of scientific/technical users of IBM systems. (For further information on SCICOMP, please see: http://www.spscicomp.org/).

Cybersecurity Summit 2004 (restricted URL): Staff created a promotional web site to advertise an NSF-sponsored conference on security issues as they affect federal laboratories. SCD coordinated and managed all aspects of this very successful, two-day conference, held from September 27-28, 2004, in Washington D.C.

Staff also participated in the design and troubleshooting of the new MySCD Portal web interface. The portal is a user-customizable, database-driven web site that provides SCD users with information they require on running jobs, mass storage allocations, project status and more.

Finally, staff were part of a 25-member, cross-divisional effort to create a database-driven infrastructure that transformed our web presence into a more dynamic, interactive experience. (See: http://www.ucar.edu/) The Web Outreach, Redesign, and Development (WORD) project resulted in a new web presence for all of UCAR, NCAR, and UOP. The new site was launched in spring 2004.

Testing hardware and software

USS staff performed consistency testing for all operating system, compiler, and other major product upgrades. Consistency testing involves a set of programs and models that are run before and after an upgrade; both sets of results are then compared to verify the upgrade. Staff developed stress tests for Load Sharing Facility (LSF), the new batch system deployed on lightning, and ran various benchmark programs during lightning's Acceptance Test Period; the machine successfully passed the 168-hour mark in September. USS has also developed a portable, statistics-based node analyzer for locating slow nodes and processors on cluster computers.
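
In spirit, a consistency test is a before-and-after comparison of model output within an agreed tolerance. The sketch below only illustrates that comparison step, not SCD's actual test harness; the file names and tolerance are hypothetical.

    import math

    def read_values(path):
        """Read one floating-point value per line from a model output file."""
        with open(path) as f:
            return [float(line) for line in f if line.strip()]

    def consistent(before_path, after_path, rel_tol=1e-10):
        """True if results before and after an upgrade agree to within
        a small relative tolerance."""
        before = read_values(before_path)
        after = read_values(after_path)
        return len(before) == len(after) and all(
            math.isclose(a, b, rel_tol=rel_tol) for a, b in zip(before, after)
        )

    print(consistent("model_before_upgrade.txt", "model_after_upgrade.txt"))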

Training

We substantially increased our user training efforts in FY2004 by providing an introduction to Unix for the SOARS protégés and a class in supercomputer optimization for the Advanced Studies Program (ASP) Fellows; both classes were delivered in June. USS staff provided introductory training and discussion of supercomputing security following the intrusion incidents of March-April, when the security environment at UCAR radically changed. This training was delivered to the NCAR divisions ACD, ATD, CGD, HAO, and MMM, and also at COLA in Maryland, during May and June. In October, SCD provided user training for the lightning Linux cluster, including LSF training and an overview of the hardware and architecture.

Software applications development and maintenance

USS staff maintain the NCO and IDL products on the SCD data analysis machines dave and tempest; these products have been specifically requested by a large number of users. Staff also developed and released the Fishpack90 software library for solving elliptic partial differential equations, which corrects several Fortran language problems in the older Fishpack library that manifested themselves more severely on modern compilers.

Resource Allocations and Accounting

USS staff manage and maintain all users' resource and project accounting databases, implement allocations for use of computing resources, and assist users with mass storage file management.

A total of 1,529 people representing 169 institutions used SCD computing resources during FY2004. Of these 169 institutions, 102 are universities in the U.S. University projects used over 300,000 GAUs in FY2004. (One GAU is equivalent to 4 hours on one processor of the IBM POWER4 computer bluesky, or 10 hours on the IBM POWER3 computer blackforest.) (Click on the images for larger versions.)
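
The stated equivalence implies a simple per-machine charging rate; the sketch below shows that arithmetic only, and the actual SCD charging algorithm may include additional factors not described here.

    # Processor-hours per GAU implied by the definition above.
    HOURS_PER_GAU = {"bluesky": 4.0, "blackforest": 10.0}

    def gaus_used(machine, processors, wallclock_hours):
        """Convert a job's processor-hours into GAUs on the given machine."""
        return processors * wallclock_hours / HOURS_PER_GAU[machine]

    # Example: a 32-processor bluesky job running for 6 wall-clock hours
    # consumes 32 * 6 / 4 = 48 GAUs under this simple reading.
    print(gaus_used("bluesky", 32, 6))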

University Allocations

All of the large university allocations are for projects funded by NSF awards in the atmospheric and related sciences. SCD's Advisory Panel met in April 2004 and September 2004 to review these large requests and to advise the SCD and NCAR Directors. There were 39 large requests from university researchers during FY2004, totaling 1.3 million GAUs. The largest request approved was 100,000 GAUs, while the median request was 10,000 GAUs. At the September panel meeting, there were requests for 2.4 times the available resources; in the end, with so many high-quality proposals, the panel over-allocated university resources by 20%. The SCD job scheduler isolates the three allocation groups (university, NCAR, and CSL) so that overuse by one group will not slow turnaround for another. Allocations of up to 1,800 GAUs are made available to NSF-supported researchers without panel review, usually within three days of their application. During FY2004, NSF-supported projects from 62 institutions received SCD computational resources. (See Table 1.)

During FY2004 SCD provided over 11,000 GAUs of computational resources to 48 graduate students, and 40 postdocs in the atmospheric and related sciences who do not have NSF awards. Graduate students and postdocs may request allocations of up to 600 GAUs. Faculty members who have received their Ph.D. within the past five years and do not have an NSF grant may request allocations up to 1,500 GAUs. A list of the 29 institutions receiving graduate student and postdoctoral researcher support in FY2004 is given in Table 2.

Very small allocations (10-20 GAUs) are given to geoscience researchers at U.S. universities for data access.

In total, SCD made 216 allocations in FY2004 for data access, graduate students, postdocs, new faculty, and NSF-supported project requests of less than 1,800 GAUs. This is an increase of 20% over the last fiscal year.

Distribution of Research Areas

In FY2004, community use (defined as university and NCAR projects) was dominated by climate-related research. This included the completion of IPCC experiments run by NCAR using 12 upgraded nodes of bluesky (SCD's most powerful computer) that were funded for this research. Starting in October 2004, these nodes will be reallocated to the university, NCAR and CSL communities. (Click on image at left for a larger version).

University use is also dominated by climate (60%), with oceanography coming in a strong second (19%). (Click on image at right for larger view).

NCAR Allocations

NCAR researchers are allocated approximately 50% of the computational resources, the same amount as the university community. In FY2004, NCAR researchers used a larger share of the computing resources when there was no university work waiting to be run. (Click on image at left for a larger view.)

Climate Simulation Laboratory Allocations

Since 1994, in collaboration with the NSF Atmospheric Sciences, Ocean Sciences, and Mathematical and Physical Sciences Divisions, NCAR has operated a special-use dedicated climate system modeling computing facility known as the Climate Simulation Laboratory (CSL). The CSL provides high performance computing, data storage, and data analysis systems to support large, long-running simulations of the earth's climate system that need to be completed in a short calendar period.

The laboratory is open to all principal investigators performing research in support of the multi-agency Climate Change Science Program who are funded or supported by a U.S. university, a U.S. federal agency, or a U.S. private not-for-profit laboratory, including their international collaborators. Allocation of the CSL resources is done via a peer-review process that includes a panel of experts in climate and/or large, long-running simulations. Large, collective group efforts, preferably interdisciplinary teams that address broadly posed sets of questions, such as the Intergovernmental Panel on Climate Change (IPCC) issues, are particularly encouraged to apply.

A new Announcement of Opportunity for the Climate Simulation Laboratory (CSL) computational resources was released in April 2004. The CSL Allocation Panel (CSLAP) met in July 2004 to review these proposals. The CSLAP allocated over 2 million GAUs of computing resources effective September 1, 2004 and continuing for 16 months to 12 projects. The projects receiving CSL allocations are shown at: http://www.scd.ucar.edu/csl/alloc.2004.html

During FY2004, 209 CSL researchers used SCD's computational resources (see http://www.scd.ucar.edu/csl/cslcomp.html) and consumed over 1.45 million GAUs. The largest project was the Community Climate System Model, which used over 1.1 million GAUs. In addition, SCD provided many computer hours of standby time for CSL projects when extra time was available. CSL projects were used heavily by both university and NCAR scientists. The accomplishments of the CSL researchers with their use of SCD computing resources are detailed at: http://www.scd.ucar.edu/csl/cslcpuchg0404.html

Accounting activities

In FY2004 our Data Base Services Group (DBSG) supported the following six databases using Oracle software:

  • Database to track computer resource use and project contract information
  • Database used as a back end for the SCD trouble ticket system and MySCD portal web pages, for SCD asset management and GLOBE school data
  • Database for tracking mass store file information including mass store computer accounting
  • New database created to support Enact project management software
  • Test database for the portal project
  • Test database for the SCD trouble ticket system

During FY2004, database processes were moved to a larger database server with twice the processors and twice the memory. DBSG installed Oracle 9i and upgraded the Oracle database software for the Remedy trouble ticket system, the asset management system, and the GLOBE school data to 9i.

DBSG participated in the evaluation of the LSF accounting software and began working with SSG on the use of the LSF accounting software to streamline computer accounting processing on SCD's high performance computers, beginning with the SCD's new Linux cluster.

DBSG created and/or modified scripts, programs, and the database to:

  • Provide web reports and graphs of NCAR computer resource use for various time periods on a regular basis including job accounting information for each job run on bluesky and blackforest
  • Provide reports and graphs and generate web pages in preparation for the SCD Advisory Panel meetings twice a year and the CSL panel meeting
  • Track assignments of one-time password electronic cards used in securing access to NCAR high performance computers
  • From mid-March through June, DBSG focused almost 100% of their time on developing and implementing new security procedures for SCD's high-performance computers in concert with the other groups in SCD. This included:
    • Taking requests, assigning, and working with SCD's OIS section to distribute 1,149 CRYPTOCards to users of SCD's high performance computers and 78 CRYPTOCards to system administrators and others in the institution so they could investigate moving to the CRYPTOCard system of one-time passwords
    • Reassigning passwords for all of SCD's high-performance computers and associated systems for each user
    • Processing requests for central authentication passwords which all university users of SCD's high performance computers now need
    • Adding three phone lines for the Database Services contact number so that DBSG could respond to the very high volume of questions regarding CRYPTOCards, passwords, removal/reactivation of user accounts as well as other new security procedures and policies

Transitioning Users During Equipment Upgrades

USS typically has early access to SCD equipment upgrade plans and uses this information to inform users on what will change and how to move their work to other equipment. In general, and depending on the type of upgrade or decommissioning, we:

  • Announce equipment upgrades to our computing community as far in advance as advantageously possible. These changes are posted via SCD News, the SCD Daily Bulletin, and local SCD web pages. In certain cases, USS broadcasts email to its various user populations. Last spring, USS taught the security changes via seminars presented to selected UCAR divisions and COLA.
  • Develop and advertise strategies to help users anticipate change, outline the impact to their workflow, and show how to minimize the impact. USS uses the same advertisement mechanisms as above for this purpose. When new supercomputer upgrades are commissioned, SCD provides new user guides for the system.
  • Work one-on-one as necessary to assist with any code conversion required for the transition.
  • During a decommissioning process, USS archives user home directories to prevent loss of work and to expedite retrieval of files required for future work.

Visualization and Enabling Technologies

SCD's Visualization and Enabling Technologies Section (VETS) has a primary focus on advancing the knowledge development process. Our activities span the development and delivery of software tools for analysis and visualization, the provisioning of advanced visualization and collaboration environments, web engineering for all of UCAR, R&D in collaboratories, the development of a new generation of knowledge and data management and access systems, Grid R&D, novel visualization capabilities, and a sizable outreach effort.

VETS grew to 23 staff members in FY2004, including student contributors and new positions supported by external funds. We were awarded continued NCAR funding for the Cyberinfrastructure Strategic Initiative (CSI), which covers two additional staff positions. The CSI has been a success, with milestones reached across all areas of endeavor. The Web Outreach Redesign and Development (WORD) project has delivered a beautiful and highly usable new web environment for NCAR, UCAR, and UOP. The Community Data Portal (CDP), in conjunction with its companion project, the Earth System Grid (ESG), now supports internal science projects and serves a broad range of data to our community. We continued our efforts to deploy AccessGrid (AG) collaboration technology, integrated another SCD divisional system, and supported some exciting new applications of the technology.

We wrote and contributed to a number of proposals this year, particularly in the area of knowledge and data systems. We made a successful bid for a 2-year renewal of our Earth System Grid project, and the proposal was awarded high marks. ESG is now in production, serving CCSM, PCM, and IPCC data to a community of more than 100 registered users. Toward the end of FY2004, we also received notice that NSF intended to fully fund our proposal to its NSF Middleware Initiative (NMI) program, "A Virtual Solar-Terrestrial Observatory." We will be collaborating with PI Fox in HAO on this exciting foray into semantic knowledge systems.

Building on the excellent success and community impact of our NCAR Command Language (NCL) application, we made a new foray into frameworks for geoscientific analysis and visualization: PyNGL. PyNGL is a Python interface to a refactored instance of our software, and it opens up myriad opportunities for expanding capability and leveraging open source efforts. After a successful recruiting campaign, we also made excellent progress on our new NSF ITR-funded VAPoR effort, which promises to deliver new tools for visually exploring and analyzing very large turbulence datasets.

VETS continued a very strong outreach program, providing dozens of presentations in our Visualization Lab. We continued our program where UCAR's Public Visitors Program (PVP) prepares and delivers highly visual presentations to visiting educational groups, and the results thus far have been very positive. Through teamwork, we are able to accommodate a much greater number of visitors with only a modest impact on SCD technical staff. We had a strong presence at the SC2003 conference and showed off a number of our new projects, including ESMF. In addition, we are engaging in research on cognitive issues in visualization with ASP and DLESE via a shared staff member.

Responding to recent NSF reviews of SCD, NCAR, and UCAR, we have continued an aggressive effort to define and gather performance metrics. We are generating good data for web services, visualization services, visualization and analysis software, and ESG. In FY2005 we will bring metrics for the CDP online. Across all of these areas, we expect to gain a much more complete understanding of the value and impact of our many activities and to use that information to prioritize our efforts.

FY2005 promises to be an exciting year in which we begin building major bridges across a superb portfolio of related projects: CDP, ESG, the NASA-sponsored GridBGC project, our AccessGrid efforts, VSTO, NCL, PyNGL, VAPoR, and our VizServer project. Considered as a whole, they not only represent a substantial contribution to our community but also offer a splendid platform for building common collaboratory infrastructure.

Cyberinfrastructure Strategic Initiative (CSI)

CSI Overview

The Cyberinfrastructure Strategic Initiative (CSI) was funded to dramatically advance our organizational web presence, our institutional web-based data presence, and the integration of collaboration technologies into the fabric of our day-to-day interactions. In FY2003 we completed our initial statement of work on AccessGrid (AG) environments, and NCAR/UCAR now has a large collection of AG nodes spread throughout the organization. During FY2004, we made excellent progress on our other two goals, which are manifested in the Web Outreach Redesign and Development (WORD) and Community Data Portal (CDP) efforts. Across all of these areas, the Strategic Initiative has accomplished one of its core missions: to fundamentally change how we share information among ourselves and with the world, how we manage and share scientific data, how we collaborate, and the scope of our future opportunities. The accomplishments of the WORD and CDP efforts are detailed below.

Web Outreach, Redesign and Development (WORD) Project

The WORD group launched a new UCAR, NCAR, UOP umbrella website in May 2004, bringing together news, research, and resources from all three institutions into a unified website designed to engage scientists, educators, students, and the public alike. The 25-member team of employee volunteers from across divisions created a staffing strategy that will support the ongoing maintenance of a significantly more dynamic top-level web presence without the need to hire additional staff.

The new site organizes information thematically, presenting a more natural interface for non-employees who previously had to browse through our organizational hierarchy to find items of interest. The site more actively features NCAR's research and key resources with five automatically rotating items on the home page compared with the previous site's one feature. A brand new "Our Research" section consisting of over 50 pages was the project's largest content creation undertaking and provides an excellent introduction to the major issues in atmospheric science.

The site addresses the individualized needs of scientists, educators, students, the public, and UCAR members and employees through "especially for" pages for each of these audiences, as well as "Our Research" topic pages that link each audience to related information throughout our web presence.

Figure 1: Snapshot of the new WORD-developed "umbrella" site for our organization

Underpinning the new web presence is a completely new software architecture called VAVOOM that keeps the site current by storing metadata about news and other high-value information objects in a database and delivering them to the appropriate pages automatically. VAVOOM will continue to transform our overall web presence in the months and years to come with centralized catalogs of scientific publications, posters, instruments, applications, and other key information that has traditionally been scattered throughout individual division and program websites.
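
The report does not include implementation details for VAVOOM, but the idea of a metadata catalog driving page content can be sketched in a few lines of Python. The table, fields, query, and URLs below are hypothetical illustrations of the approach, not VAVOOM's actual schema.

    # Illustrative sketch of a metadata-driven page: items are cataloged once
    # in a database and rendered automatically wherever they are needed. The
    # schema, field names, and URLs here are hypothetical, not VAVOOM's design.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("""CREATE TABLE catalog
                    (title TEXT, item_type TEXT, url TEXT, pub_date TEXT)""")
    conn.executemany("INSERT INTO catalog VALUES (?, ?, ?, ?)", [
        ("New umbrella website launched", "news", "http://example.org/news/1", "2004-05-01"),
        ("Research feature of the week", "news", "http://example.org/news/2", "2004-05-15"),
    ])

    def render_latest(item_type, limit=5):
        """Return an HTML list of the newest cataloged items of one type."""
        rows = conn.execute(
            "SELECT title, url FROM catalog WHERE item_type = ? "
            "ORDER BY pub_date DESC LIMIT ?", (item_type, limit))
        return "\n".join('<li><a href="%s">%s</a></li>' % (url, title)
                         for title, url in rows)

    print("<ul>\n%s\n</ul>" % render_latest("news"))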

The WORD project and team are ongoing. The larger team continues as a UCAR web content advisory group that will reconvene occasionally to discuss institution-wide issues. A team of two WORD co-chairs and a Managing Editor steers the ongoing work of the WORD production team and convenes new subgroups when necessary to accomplish objectives. For example, a Laboratories subgroup was formed to launch five new NCAR laboratory websites in less than two months, bringing together all the appropriate divisional web developers for the task. The WORD team will continue to perform this integrative function across the three institutions and their divisions and programs.

You may visit the new umbrella website at http://www.ucar.edu/ and the WORD team's project intranet site at http://word.ucar.edu/

The Community Data Portal (CDP)

The NCAR Community Data Portal or CDP (http://cdp.ucar.edu/) is an NCAR Strategic Initiative aimed at developing a central institutional gateway to the large and diversified data holdings of UCAR, NCAR, and UOP. The ultimate goal is to provide a state-of-the-art data portal with a broad spectrum of functionality ranging from data search and discovery to catalogs and metadata browsing, and from high performance and reliable data download to analysis and visualization.

Building on a conversion to a new THREDDS schema, we added a substantial number of new datasets to the CDP during FY2004:

  • ACD/BAI MEGAN data
  • ACD model evaluation data: several ACD field campaigns including TRACE-P, SOLVE, ACCENT, CRYSTAL-FACE, and POLARIS
  • ATD IHOP 2002 field campaign
  • CGD/CAS top-level data catalogs: ECMWF data, satellite data, climate indices, NCEP data, and surface data
  • CU/ENLIL heliospheric model data
  • VEMAP data (Vegetation/Ecosystem Modeling and Analysis Project)

During 2004 the following major technical tasks were accomplished:

  • The underlying metadata architecture of the portal was upgraded to conform to the newly released THREDDS v1.0 specification, which includes geosciences-specific tags for search and discovery of datasets.

  • A new, powerful search functionality was implemented that allows the user to perform structured queries on the content of the enriched THREDDS metadata. The search interface is complemented by "power-browsing" capabilities.

  • High-level data services for aggregation and subsetting of NetCDF data were developed in collaboration with the Earth System Grid project and were deployed on the CDP web portal for the WACCM and MEGAN (Model of Emissions of Gases and Aerosols from Nature) model data. (A client-side sketch of the subsetting idea follows this list.)

  • Other important developments include a package for storing and analyzing detailed portal usage metrics, and for managing the portal data cache for connection to the NCAR MSS via the Storage Resource Manager (Grid-enabled middleware from the ESG collaborators at LBNL).
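
The aggregation and subsetting services themselves are server-side components of the portal. As a client-side illustration of the same idea, the short Python sketch below pulls a small subset of a remote NetCDF dataset over OPeNDAP using the modern netCDF4 package; the URL and variable name are placeholders, not actual CDP endpoints.

    # Client-side illustration of NetCDF subsetting over OPeNDAP: only the
    # requested slices cross the network, not the whole file. The URL and
    # variable name below are placeholders, not actual CDP endpoints.
    from netCDF4 import Dataset

    url = "http://example.org/thredds/dodsC/aggregated/dataset"  # hypothetical
    ds = Dataset(url)                   # open the remote dataset via OPeNDAP
    tas = ds.variables["tas"]           # e.g. surface air temperature (time, lat, lon)
    subset = tas[0:12, 20:40, 60:100]   # first 12 times, one lat/lon window
    print(subset.shape)
    ds.close()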

Two very important collaborations were initiated and are currently actively pursued:

  • CDP and other SCD staff are working to establish NCAR as an early DCPC (Data Collection or Production Center) contributing to the first prototype GISC (Global Information System Centre) developed as part of the long-term data and information strategy promoted by FWIS (Future WMO Information Systems).

  • The CDP and GIS NCAR strategic initiatives are collaborating to share technologies and data holdings between the two portals, and to provide high-level interoperability between the specific areas of functionality.

Finally, a new Advisory Panel was established for the CDP, and the first meeting and review of the effort was undertaken during summer 2004.

Data and Knowledge Systems R&D

The Earth System Grid

The Earth System Grid (ESG) is a DOE-funded project focused on building a DataGrid for climate research that facilitates management of and access to terascale climate model data across high-performance broadband networks. ESG is a collaboration of NCAR (SCD, CGD, and HAO), Argonne National Laboratory (ANL), Oak Ridge National Laboratory (ORNL), the Program for Climate Model Diagnosis and Intercomparison (PCMDI) at Lawrence Livermore National Laboratory (LLNL), the University of Southern California Information Sciences Institute (USC/ISI), and Lawrence Berkeley National Laboratory (LBNL). This broad collaboration builds on one of the more exciting recent developments in computational science: Grid technologies and the Data Grid. The Data Grid is a next-generation framework for distributed data access in the high-performance computing arena, and it addresses security, transport, cataloging, replica management, and access to secondary storage. The diagram below gives a general idea of the basic topology of our Grid and partner organizations.

Figure 2: Project and data topology for ESG

ESG realized enormous progress in FY2004 and released a production version of the Earth System Grid web portal in summer 2004, coincident with the release of CCSM V3 and the subsequent CCSM Workshop. When the CCSM V3 announcement was made, new datasets for control scenarios were available for use by the climate modeling community, and we have been publishing new model results steadily since then. The portal allows easy access to the latest CCSM (Community Climate System Model) v3 data, including simulation runs for IPCC scenarios, as well as to almost all of the PCM (Parallel Climate Model) data holdings. Users may browse the data catalogs hierarchically, perform searches on metadata, download full files, or subset virtual aggregated datasets. The portal currently indexes 49 TB of data located at NCAR and other national data centers (e.g., ORNL, NERSC), and it serves a community of approximately 100 registered scientists and researchers. More than 100 GB of data have been downloaded since the portal release in July 2004. The following image shows a snapshot of the ESG portal interface:

Figure 3: Earth System Grid Web Portal

Our multidisciplinary team had many accomplishments during the year, and we summarize a few of these below.

  • Authored a successful bid for two years of continuation funding, including an increment to support adding Los Alamos National Lab to ESG. LANL will provide high-resolution ocean simulations to our Grid.

  • Developed a robust, powerful dataset publication service and deployed it. CGD ESG contributors published 49 TB of distributed data composed of more than 100,000 files. We have supported a steady data publication process since our July release.

  • Developed a new authentication and authorization system that allows us to conveniently register users and assign permissions such that they can access data according to DOE and CCSM Scientific Steering Committee policy. This was a collaborative effort between Argonne and NCAR.

  • Developed and deployed a new metadata system based on the emerging U.K. OGSA-DAI (Open Grid Services Architecture - Data Access and Integration) package. We found that the early version available could not handle the tens of thousands of files in the CCSM holdings, so we had to construct a workaround service. This was a good learning experience and provided valuable feedback to the OGSA-DAI developers, and we'll test-drive a new version in FY2005.

  • Developed and released a production version of our Grid Web Portal, as per above.

  • Developed an alpha version of our Virtual Data Services architecture, which combines Aggregation Services and OPeNDAP-g. This will be fully tested and released early in FY2005.

  • Began design and development of our new DataMover-lite (DML) client.

  • Began development and integration work on the IPCC Working Group 1 ESG portal, which will be housed at LLNL/PCMDI to support its commitment to deliver these data to the world.

  • Provided various demonstrations at SC2003.

Our primary goals for FY2005 include continued data publication, usability studies and portal enhancement, improved metrics gathering and reporting, enhanced security infrastructure, and the production release of the IPCC WG1 site at LLNL/PCMDI.

CDP

See the Community Data Portal (CDP) section above.

GridBGC (Grid Biogeochemistry)

NCAR and CU were awarded a three-year grant through NASA's Advanced Information Science and Technology (AIST) program, one of 11 awards made nationally. The effort is aimed at developing a new modeling system that encompasses the remote execution of a biogeochemical model (a future component of CCSM), the Grid-mediated movement of initialization and output data, and associated web portals and analysis tools. This is a close collaboration among CGD (PI and scientist), SCD CSS and VETS, and CU's Computer Science Department.

Formal development work on the project began in 2004. During the first phase, the system and software requirements were developed using several techniques, chiefly interactions with the PI acting as a user surrogate. Several graphical user interface prototypes were developed to give the user experience a concrete form, and the requirements were refined, based on the feedback received, into a baseline document from which implementation could begin.

After the baseline requirements were established, implementation began with two primary focus areas. CU staff developed the Grid-based job management services and integrated DataMover for data transfer between systems. The second area of development was the graphical user interface, which is being built as a web-based portal using Java/Struts technology. Once a skeletal implementation of the GUI is complete, it will be released to selected beta testers for review, and the feedback gained from that evaluation will be integrated into the application as development continues. The project website is http://www.gridbgc.ucar.edu/

The Virtual Solar-Terrestrial Observatory

Toward the end of FY2004, we were pleased to receive notice from NSF that they had ranked our Virtual Solar-Terrestrial Observatory proposal very highly, and intended to fully fund the proposed three-year project. A collaborative effort led by HAO and to be funded by the NSF Middleware Initiative (NMI) program, this forward-looking effort is aimed at developing the next-generation knowledge environments for this realm of science. It builds on our earlier Knowledge Environment for the Geosciences (KEG) work, the Earth System Grid, and the Community Data Portal. HAO and SCD will be collaborating with a range of ongoing efforts like CISM and CEDAR and our partners at Stanford's Knowledge Systems Lab.

Proposals and Strategic Opportunity Development

In addition, we prepared a number of other proposals this year in the areas of data and knowledge systems, including Chronopolis, a joint endeavor with the San Diego Supercomputing Center (SDSC). Chronopolis is still under review at NSF.

Data Analysis and Visualization R&D

NCL: Community Software for Data Analysis and Visualization

The NCAR Command Language (NCL) is an internally developed scripting language that provides robust file handling, data analysis, and publication-quality two-dimensional visualizations. The visualizations of NCL are based on those produced by NCAR Graphics, a collection of low-level C and Fortran routines for generating a wide variety of two- and three-dimensional visualizations.

The use of NCL continues to grow in the scientific community, as evidenced by the number of people visiting the two main NCL websites, the increasing number and complexity of questions posted to the NCL email list, the unsolicited positive feedback we get from our users, and the growing interest in NCL's lab-based workshops. Seven of these workshops were held in FY2004: four were local, and three were held at the Stennis Space Center, Purdue University, and the National Ocean Service. Since 2000, 23 workshops have been held, with over 260 users in attendance. By other metrics, the estimated number of downloads of NCL from the web has increased 300% in the last four years, and the number of email subscribers has increased by 70% since 2003. As a side note, NCAR Graphics has been open source under the GNU General Public License (GPL) since 2000 and averages about 400-500 downloads a month.

Graduate students, professors, and researchers use NCL in areas of atmospheric science, mechanical engineering, geography, geology, geophysics, electrical engineering, earth sciences, biogeochemistry, and physics. It is used for data analysis and visualization of a wide variety of model runs, in teaching graduate-level atmospheric-science-related courses, for automatic generation of web pages, for use in real-time web sites, for converting file formats, and for publication-quality graphics in journals, presentations, and dissertations. NCL users hail from around the globe including Australia, Argentina, Brazil, Canada, China, Denmark, England, France, Germany, Greece, India, Italy, Japan, Korea, Norway, Portugal, Senegal, South Africa, Switzerland, Taiwan, United States, Vietnam, and Yugoslavia.

In FY2004, algorithms for contouring unique grids and triangular meshes were added to NCL. This functionality is considered by many scientists to be the most significant feature added to the software since its initial release in 1995. Some of these unique grids and meshes include:

  • A grid from the High-Order Multiscale Modeling Environment (HOMME)

  • A geodesic grid from Colorado State University's atmospheric general circulation model

  • An adaptive grid from the Department of Atmospheric, Oceanic, and Space Sciences at the University of Michigan

  • A grid from the International Satellite Cloud Climatology Project (ISCCP) in the World Climate Research Program at NASA

  • A global ocean mesh (ORCA) from an Ocean General Circulation modeling system at the Laboratoire d'Oceanographie Dynamique et de Climatologie

  • A triangular mesh from the Chesapeake Community Model Program Quoddy model

  • A triangular mesh from a shelf-scale model run by an oceanography division at the Naval Research Laboratory

  • A grid from the ARPEGE Atmospheric General Circulation Model developed by the French climate community

  • A Parallel Ocean Program (POP) grid from Los Alamos National Laboratory, being used in the Community Climate System Model

  • A satellite swath grid from UCLA

Figure 4: NCL/PyNGL rendering of the HOMME grid

Figure 5: Visualizations of finite element mesh

In addition to enhancements, bug fixes, and general code maintenance, many other major features and updates were added to NCL in FY2004. In the area of data I/O, many updates were made to NCL's GRIB reader, making it one of the most powerful GRIB readers available. Several functions for writing Vis5D files were added, and NCL's OPeNDAP capabilities were extended, giving users more file I/O capabilities and the ability to pull subsets of data served on the web.

NCL continues to be available on a wide variety of UNIX systems, and in FY2004, it was ported and tested on several new systems including FreeBSD, a variety of 64-bit Linux systems, and a MacOS X system using new compilers. These ports additionally provided invaluable testing and feedback for systems staff and users testing new platforms.

In FY2005, efforts are either underway or planned for adding vector and streamline visualization drivers for the special grids and meshes mentioned above, releasing a new command line interface for NCL, adding an HDF5/HDF-EOS5 reader, and researching a potential replacement for our outdated display model. A new display model will open the door for importing raster images into NCL, for saving visualizations to other raster formats (like PNG), and for getting new capabilities with colors that our current model can't handle.

Toward Frameworks for Analysis and Visualization - PyNGL

An alpha and a beta version of PyNGL (Python interface to the NCL Graphics Library) were released in FY2004. Python is a mainstream object-oriented programming language often compared to Tcl, Perl, and Java, and it is gaining popularity in the atmospheric sciences community. PyNGL is a Python module that gives Python users easy access to the same high-quality two-dimensional visualizations available in NCL. This endeavor will make the functionality of NCL available to a much wider audience and has opened the door to several potential collaborations and projects.
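
To give a flavor of the interface, the fragment below produces a simple filled contour plot. It is a minimal sketch that assumes the Ngl module naming used in PyNGL releases; the options available in the FY2004 alpha and beta may differ in detail.

    # Minimal PyNGL sketch: a filled contour plot of a small 2D array.
    # Assumes the Ngl module naming of PyNGL releases; the FY2004 alpha/beta
    # interface may differ in detail.
    import numpy
    import Ngl

    data = numpy.fromfunction(lambda j, i: numpy.sin(i / 8.0) * numpy.cos(j / 8.0),
                              (64, 128))

    wks = Ngl.open_wks("ps", "example")    # PostScript output: example.ps

    res = Ngl.Resources()                  # plot options set via resources, as in NCL
    res.cnFillOn = True                    # turn on color fill
    res.cnLinesOn = False                  # suppress contour lines

    plot = Ngl.contour(wks, data, res)
    Ngl.end()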

In FY2005, we plan to release the first version of PyNGL to the scientific user community. To supplement the visualization module, progress has been made toward creating a module of NCL's powerful I/O capabilities and providing Python wrappers to many of NCL's unique climatological functions. In addition, augmenting PyNGL with three-dimensional capabilities by coupling it with a Python module like VTK will be researched.

PyNGL is a pivotal step in our larger strategy to develop generalized frameworks for geoscientific analysis and visualization. By refactoring the computational and data I/O engines underneath NCL and providing Python integration, we not only offer a high-impact new software application, but we also position ourselves to be able to leverage the open source movement and couple in diverse functionality from other development efforts. These include data management, Grid environments, and interactive 3D visualization, and we have already begun to engage in some exploratory efforts.

VAPoR

The VAPoR project is an NSF ITR-funded, Open Source software development effort focused on improving the ability of earth sciences fluid flow researchers -- particularly those working in the area of numerically simulated turbulence -- to explore and analyze terascale-sized datasets. This project is one component of a larger time-varying data research-and-development effort whose collaborators include NCAR, U.C. Davis, and Ohio State University. With NCAR leading software development, and our university collaborators taking the lead on research, a mutually beneficial relationship has arisen; NCAR provides a steady supply of interesting research challenges to our university partners, who in turn provide solutions to some of our toughest visualization problems. One goal we are meeting with this partnership is reducing the lag between the time when visualization research results are first published and when they become available in end-user applications.

Much progress was made on the VAPoR development effort in FY2004. An international steering committee was assembled with representation from within and outside of NCAR. The steering committee worked actively with development staff to produce a rigorous Software Requirements Specification (SRS) detailing the needs of our target user group: working scientists. Following the creation of the SRS, developers crafted a graphical user interface design to accommodate the needs outlined in the SRS, and a working GUI mockup was developed thereafter. Advanced GUI features present in the prototype, such as undo/redo capability and context help, are examples of the steps taken to maximize the software's usability, a forefront goal of the project. We also hosted a U.C. Davis graduate student this summer, whose activities focused on integrating his research results on stretched-grid volume rendering into the application. Early in CY2005 we anticipate a first alpha release of the software to steering committee members. A general release of the software is expected later in the year.

Figure 6: Screen snapshot of prototype VAPoR application

Visualization Lab and Services

Computational Facilities

VETS operates a state-of-the-art visual supercomputing facility, offering powerful interactive visual computing platforms, batch platforms for data postprocessing tasks, and a shared, high-performance Storage Area Network (SAN) for containing large working sets of data. FY2004 brought few significant changes to the Vislab, with no major equipment upgrades performed. Much of the systems activity in FY2004 was undertaken in response to a heightened security status, following an NCAR-wide security breach. Though no Vislab systems were compromised, numerous proactive changes were effected in response to the threat. Measures taken included:

  • Revising and developing new security policies encompassing user authentication, inactive accounts, provision of services, post-upgrade/install procedures, and attack response, to name a few

  • Deployment of PAM-based user authentication for services and interactive logins

  • One Time Password (OTP) authentication for root access

  • OS and application revisions on all platforms

Not all Vislab activities were aimed at security improvements. Vislab technical staff fruitfully invested time in exploring, and in some instances deploying, commodity-based technologies offering significant price/performance advantages in a couple of key areas. RAID storage systems built from inexpensive ATA drives were investigated and successfully deployed in support of Vislab and non-Vislab projects. Much effort was also spent evaluating and experimenting with commodity-based graphics/visualization systems. This research led to the acquisition of two commodity clusters that will serve as replacements for aging, expensive SGI technology (see Experimental Visualization Cluster).

Access Grid Facilities

VETS continues to maintain and operate several Access Grid nodes in the division, including the main node in the Vislab and a portable node that can be moved on short notice to other local meeting facilities. Staff maintained a testing platform for new AG functionality and supported experiments with new videoconferencing tools and environments in the context of scientific collaboration and visualization. The AG staff continue to track and implement new Access Grid updates and functionality, and they provide consulting and tutorials to new users on a per-request basis. VETS also provided consultation and planning for the integration of the divisional Access Grid and the A/V systems in the SCD conference room, and they helped coordinate the remodel between the NCAR Multi-Media Group and a third-party contractor.

Events and Metrics

VETS continues to provide support for meetings and demonstrations in the Vislab, including the ongoing partnership with the Public Visitors Program, which provides outreach to public and school groups. VETS supports demos to government and scientific visitors, video conferencing for researchers and software developers, and meeting space for review teams and advisory panels.

The following table lists the number of events and participants that were recorded this fiscal year:

 

                         AG Meetings    Vislab Demos    Vislab Meetings
No. of Events                 66              88               13
No. of Participants          240           1,489              155

Experimental Visualization Cluster

The performance of commodity graphics chips has increased at a rate exceeding Moore's law for the last several years, and inexpensive gaming chips now offer performance far surpassing traditional high-end graphics systems. With the advent of programmable Graphics Processing Units (GPUs), the capabilities offered by these platforms are unprecedented. However, due to the limited memories of 32-bit PCs, tackling large data problems with these systems required clustering machines together, an exercise often fraught with frustration due to limited software availability for these distributed-memory systems.

The emergence of 64-bit PC processors in FY2004 has changed the visualization system landscape once again; it is now possible to build an inexpensive, powerful graphics platform with a large memory and a simple programming environment. Vislab staff have been evaluating commodity systems for several years and concluded that with these recent processor developments the time was right to enter the commodity visualization system space. A pair of inexpensive 64-bit graphics workstations was acquired at the end of FY2004 to replace the Vislab's aging SGI visual supercomputers. This marks a strategic step for the Vislab as we move away from costly proprietary systems.

Vizserver and Metrics

The VizServer project provides an advanced image-delivery service permitting users from inside and outside the NCAR campus to access centralized, visual supercomputing services from the convenience of their own office. This capability is essential to interactive visual analysis work as it enables the spontaneous access needed to support visual data exploration. This service, initiated in FY2003, entered into production mode in FY2004 with relatively modest activity. Two major software upgrades were performed and steps were taken to improve the security of the system, which supports users outside of the NCAR firewall. Vislab staff also began exploring alternatives to the proprietary, SGI-only image-delivery service currently employed, looking at promising new technologies with better platform portability, including support for our new 64-bit commodity graphics cluster.

For FY2004, monthly average usage metrics for the VizServer system include: 7 unique sessions and 160 total sessions.

Collaboration Environments R&D

VETS staff completed the deployment of a Collaboration Environment at Howard University and continued participation in the Scientific Workspaces of the Future project as a testing partner for new collaboration technologies.

Howard University AG

NCAR sponsored the acquisition, deployment, and testing of an Access Grid node at Howard University, a historically African-American university in the Washington, DC area with a growing and vibrant atmospheric sciences program. The installation was made in the new computer lab of the Howard University Program in Atmospheric Sciences (HUPAS) and was completed in early 2004, providing HU with new videoconferencing and research collaboration capabilities.

Vislab staff worked with IT personnel at Howard University and a third-party Access Grid supplier, InSORS, to coordinate the planning, purchase, and deployment of this system, which included projection systems, microphones, video cameras, a computer system, and audio gear.



Figure 7: Howard University's new AccessGrid facility

Scientific Workspaces of the Future

VETS continued to participate as an unfunded partner in the Scientific Workspaces of the Future (SWOF) project. Staff met biweekly over the Access Grid with partners from Boston University, ANL, EVL, NCSA, and LANL to discuss new tools and innovations for application in Access Grid environments.

VETS provided testing and feedback for new functionality including the ANL Shared Movie Player, ScView, Rasmol, and audio/video tracking software.

Enterprise Web Services

Web Facilities

The Web Engineering Group (WEG) completed its migration of numerous services to a new high-performance backend cluster in FY2004. The new cluster will meet the needs of our increasingly dynamic and computing-intensive websites, applications, and services. A new test cluster, originally expected to be finished in 2004, is still under development due to the impact of UCAR security issues this year. The test cluster will enable developers to test their sites in a production-like environment before going live. In FY2005, we will replace our front-end cluster and interactive sessions server and install a remote desktop administration server to increase system administrator productivity. This will complete our new architecture.

Web Tools and Services

The WEG was an integral part of the WORD group's launch of the new UCAR-NCAR-UOP umbrella website, creating the underlying software and hosting architecture for the site. This innovative architecture, called VAVOOM, applies the metadata cataloging approach that has been so successful for scientific datasets to create centralized catalogs of high-value information objects scattered throughout the UCAR web presence; an NSF review panel recommended that the WEG share it with other institutions. The Java-based architecture enables division and program web developers who host their sites with the WEG to use the metadata holdings in their own sites. It also caches data to maximize performance and minimize server load. The potential for this architecture to transform our overall web presence is significant, and we are just now beginning to see its effects.

The People Search web application developed by the WEG has been a resounding success and was enhanced this year with failover capability. The application now enables employees and system administrators to modify profile information as well.

A new system aggregates all our web logs for improved log management. We began beta testing a new web traffic analyzer service that provides web developers with the ability to drill down into statistics to get very specific metrics for their sites.

Web Outreach, Advocacy, and Community Building

The WEG gained several new high-profile customers in FY2004, continuing its mission to centralize web-hosting services for the institution. These included ESIG, GLOBE, HAO, HIAPER, and MMM. In the interest of improved customer service transparency, the WEG began publishing its work request queue on the WEG website.

The WEG was presented with the unique opportunity to host the Space Science Institute's MarsQuest Online website for the Mars rover missions. We worked with several institutions to put together a hosting solution to meet the high traffic and database replication needs of this site. The WEG cluster served over 44 million page views for MarsQuest this year, with over 20 million in July alone.

The Web Advisory Group, which advises the WEG, completed its implementation plan, identifying several key priorities for the overall institution. These are a test cluster (already under development), search engine optimization (already underway), next-generation authentication and access control, metadata directories and catalogs (already in production and under ongoing development), collaborative portals (Swiki pilot project underway), and a Content Management System.

WORD Project

See the Cyberinfrastructure Strategic Initiative section above.

Web Metrics

The WEG web cluster served over 158 million web pages in FY2004, a 50% increase over last year, and a total of 5.8 terabytes of content. UCAR websites hosted by the WEG received over 10 million visits. Peak load has reached over 3 million pages per day, nearly 4 times last year's peak.

Special Projects in Science, Technology, and Collaboration

VETS continued to collaborate on many different modeling efforts and research projects across the organization and the broader community. Staff members work directly with NCAR researchers and scientists inside and outside the organization to jointly develop new technologies and visualization solutions to their problems. The examples below summarize VETS FY2004 collaborative visualization efforts.

Middleton, Scheitlin, and Wilhelmson (NCSA) collaborated with editors Johnson and Hansen (University of Utah) to develop a new book for the visualization and science community: The Visualization Handbook. This team authored a chapter entitled Weather and Climate Visualization, and drew on examples from R&D projects in universities, centers, and commercial firms from around the world. Academic Press will publish the new text in November 2004.

Mendoza continued collaborations with HAO's Rast on the production of numerous visualizations of Compressible 3D Starting Plumes and Compressible Convection, utilizing over 5 TB of turbulence data.

Clyne worked with Jack Herring (NCAR, Emeritus) and Yoshi Kimura (Nagoya University, Japan) on exploration of "pancake" structures found in high-resolution, stratified, decaying turbulence simulations.

Clyne continued collaborations with HAO's Qian Wu and Tim Killeen on the production of numerous visualizations of TIMED Doppler Interferometer (TIDI), Global Scale Wave Model (GSWM), and Thermosphere-Ionosphere Electrodynamic General Circulation Model data.

Scheitlin worked with Frank Bryan (CGD) to generate several visualizations from the high-resolution 0.1-degree POP model. Visualizations show sea surface temperature and sea surface height animated with daily timesteps.

VETS provided several climate visualizations that were used in an American Museum of Natural History video feature in September. CGD scientist Jim Hurrell was interviewed about the North Atlantic Oscillation, and he used VETS animations as a backdrop for his discussion.

Serving on the IEEE Vis 04 Conference Committee, Middleton co-organized the first annual Visualization Challenge. A new WRF simulation of Hurricane Isabel was chosen for the competition, and the NCAR team of Kuo, Scheitlin, Wang, and Bruyere postprocessed the model data, converted it to a simple structure, and published it on the web. Middleton and Kuo served as judges for this new effort, which generated a lot of interest, participation, and exciting results.

Middleton served on the National Research Council "Committee to Develop a Long-Term Research Agenda for the Network for Earthquake Engineering Simulation (NEES)." With his fellow committee members and experts in the earthquake engineering community, he co-authored a new report entitled "Preventing Earthquake Disasters: A Research Agenda for the Network for Earthquake Engineering Simulation (NEES)," focusing on the information technology aspects of the endeavor.

Research Data Support and Services

Stewardship and Curation

SCD's Data Support Section (DSS) manages and curates an important research data archive (RDA) containing observational and analysis reference datasets that support a broad range of meteorological and oceanographic research. Careful curation of the archive extends existing datasets and adds new ones to the collection. Stewardship work improves the RDA by capturing and improving documentation, creating systematic organization, applying data quality assurance and verification checks, and developing access software for multiple computing platforms. All activities are founded in the principles of acquiring the necessary data and making it readily available for scientific research.

Routine Archive Growth

Routine curation builds existing datasets into longer time series. Most often this is simply done by adding files that contain the most current information available, but sometimes it includes extending a time series backward in time, filling data void periods, and replacing segments with reprocessed data. DSS has 79 datasets that receive routine curation at least annually, with most (48 datasets) receiving new data by network transfer. Some data (18 datasets) are still delivered by CDROM and tape. Slightly fewer than half are updated monthly or more frequently (26 monthly and 3 daily). These frequently updated sets are typically global observations and model outputs from NCEP, various SST analyses, and near-real-time observations and model data from the Internet Data Distribution system.

New datasets are added to the RDA for many reasons, but two major purposes are to preserve historical observational or analyzed collections against loss and to make new products available that are necessary for leading-edge research at NCAR and the university community. The 25 new datasets for the past year are shown in Table 1.

Table 1: New datasets added to the SCD Research Data Archive in FY2004

Dataset ID Number¹                    Dataset Title

ds117.0, ds117.1, ds117.2, ds117.3,   ERA-40 Output Analyses (13 different product datasets)
ds118.0, ds118.1, ds119.0, ds119.1,
ds119.2, ds119.3, ds120.0, ds120.1,
ds123.0

ds277.5                               Extended Reconstructed Sea Level Pressure, 1854-1997

ds335.0                               Historical Internet Data Distribution Gridded Model Data, beginning May 2003

ds336.0                               Historical Internet Data Distribution Global Observational Data, beginning May 2003

ds351.0                               NCEP ADP Global Upper Air Observations, BUFR format, beginning April 2000

ds435.0                               Monthly Mean Rawinsondes, from NCEP/NCAR Global Reanalysis, beginning 1948

ds461.0                               NCEP ADP Global Surface Observations, BUFR format, beginning April 2000

ds506.0                               U.S. Surface Hourly Data (1928-1948) from NCDC

ds530.0                               Supplementary (Add-on) Datasets

ds533.1                               Marine surface meteorological and actinometric observations from Russian RV

ds540.6                               GTS, SEAS, and keyed Marine Surface Data from NCDC

ds556.0                               Global Monthly Stream Flow Time Series

ds744.8                               QSCAT & ADEOS-II / NCEP Blended Ocean Winds

¹ The web address for each dataset is constructed by substituting the Dataset ID Number into the following template: http://dss.ucar.edu/datasets/dsxxx.x/
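
In code, the substitution described in the footnote is a one-line template fill (a trivial illustration of the stated URL pattern):

    def rda_dataset_url(dataset_id):
        # e.g. rda_dataset_url("ds117.0") -> "http://dss.ucar.edu/datasets/ds117.0/"
        return "http://dss.ucar.edu/datasets/%s/" % dataset_id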

Important New Research Data and Initiatives

ECMWF Re-Analysis

Curation and stewardship are often combined, and the recent acquisition of the ECMWF Re-Analysis (ERA-40) serves as a major example. The ERA-40 project has produced a comprehensive global analysis for the 45-year period from September 1957 to August 2002. The project utilized an atmospheric model with 60 vertical levels, a T159 spherical harmonic representation for the basic dynamic fields, and a reduced Gaussian grid with an approximately uniform spacing of 125 km for surface and other fields. The reanalysis model ingested multiple archives of in situ and satellite observations.

As a partner in the project, DSS is the sole distributor of the ERA-40 data products in the US and to UCAR members in North America. The products include atmospheric model resolution analyses and forecasts, ocean wave analyses and forecasts, many parameters at lower resolution (2.5 degree latitude by longitude grids), monthly mean analyses, vertical integrals of atmospheric energy, mass, and fluxes, analyses on isentropic and potential vorticity surfaces, and special products for research in chemical transport and radiative transfer.

The total archive is over 30 TB in size. The primary analyzed data and locally computed supplementary products (about 10 TB) will be maintained online and organized into smaller product subsets to promote easier access for the public research community. Currently, there is about 2.1 TB online. A web interface for each product provides direct access for both large and small data downloads initiated in real-time. Users may also design their own subsets for some products by selecting variables, time, and levels through a web interface. These custom requests are typically processed in a delayed mode with user notification by email when the data are ready. Users with SCD computing accounts have unlimited access to the data on the data server. For those users, all ERA-40 data are available from the MSS, and the product web pages need only be used to ascertain the MSS file names and acquire software to read the data. A comprehensive description of the ERA-40 archive and access to the data are available at http://dss.ucar.edu/pub/era40.

The stewardship work to prepare ERA-40 is not complete. Some software has been developed to decode, regrid, and reformat the data. More software and subset products are planned, and additional products will be computed to aid scientific research, e.g., processing the spectral harmonic data (T159) on both model and pressure levels to create T85 Gaussian grids (128 x 256).
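
The report does not describe that processing code, but the computation involved, truncating T159 spectral fields to T85 and synthesizing them on a 128 x 256 Gaussian grid, can be sketched with the pyspharm wrapper around SPHEREPACK. This is a minimal illustration under assumed grid shapes and library availability, not the ERA-40 production software.

    # Illustrative sketch only: take one 2D field on a full Gaussian grid at
    # roughly the 125 km spacing noted above (160 x 320 is assumed here),
    # truncate it spectrally at total wavenumber 85, and synthesize it on a
    # T85 Gaussian grid (128 latitudes x 256 longitudes). The real ERA-40
    # processing differs in detail.
    import numpy
    import spharm

    nlat_in, nlon_in = 160, 320          # assumed full Gaussian grid for the T159 data
    nlat_out, nlon_out = 128, 256        # T85 Gaussian grid

    field_in = numpy.random.rand(nlat_in, nlon_in)   # stand-in for a real ERA-40 field

    grid_in = spharm.Spharmt(nlon_in, nlat_in, gridtype="gaussian")
    grid_out = spharm.Spharmt(nlon_out, nlat_out, gridtype="gaussian")

    # Transform to spectral space, truncate at T85, and transform back on the
    # coarser Gaussian grid.
    field_t85 = spharm.regrid(grid_in, grid_out, field_in, ntrunc=85)
    print(field_t85.shape)               # (128, 256)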

User access to ERA-40 is just beginning, and early indications are it is useful for many research projects, with usage from both the MSS and the online data server being comparable. The users and data amount statistics for the first nine months and a linear prediction for the 2004 annual values are given in Table 2. It is likely that in the first year, ERA-40 will have 125 unique users accessing nearly 20 TB of data. It must be noted that these statistics reflect the minimum usage, because a single copy provided by SCD can be used by many in multiple-member research groups.

Table 2
ERA-40 Data distribution metrics for January-September 2004, and a 2004 annual estimate

 

                             Unique Users [2004 est.]    Data Amount (TB) [2004 est.]
Data Server (Web and FTP)            49 [61]                      8.8 [11.0]
MSS                                  51 [64]                      6.8 [8.5]
Total                               100 [125]                    15.6 [19.5]

North American Regional Reanalysis

NCEP completed the North American Regional Reanalysis (NARR) model run in April 2004. Temporally, it is three-hourly and spans 1979 through 2003. Spatially, it has 32-km grid spacing and 45 levels, with coverage extending from extreme northern South America to the Canadian Arctic and including vast regions of the eastern Pacific and western Atlantic Oceans. It is planned that the time series will be extended into 2004 and beyond as an ongoing near-operational product.

The scientific focus for NARR is high resolution with significant improvements in moisture fluxes (e.g. precipitation) and budgets in both the atmosphere and soil. It is largely believed to have achieved this goal and is therefore an important data product for the research community.

Our plan was originally to use magnetic tape to transfer the data from NCEP to NCAR, but circumstances and an unwillingness to delay further led us to adopt FTP network transfer for NARR. The transfer began in June 2004, and as of October, 2.8 TB covering 1987-2003 had been moved to the NCAR MSS. That transfer rate is in lockstep with the rate at which NCEP has been making the data available, and the transfer will continue back to 1979 in the coming months. In total, the NARR primary analyses will be about 5 TB. Currently, we are considering which selected parts of the forecast data, which are not included in the analyses total, might be important for our users and as a backup archive for NCEP.

We are developing plans to prepare the NARR archive, maintain it, and make it readily accessible. NARR is expected to be a very useful research dataset because of its high temporal and spatial resolution and because the model was expressly developed to capture the precipitation and atmospheric water budget. As an example, NARR will be a primary research dataset for the Global Energy and Water Cycle Experiment (GEWEX) Americas Prediction Project (GAPP).

Research Data Archive Database

Database technology has been implemented with the objectives of improving DSS internal operations and external RDA user support. In the broadest terms, the RDA Database (RDADB) combines metadata from DSS and SCD sources, supports applications that quantify how the RDA is used, monitors the integrity of the RDA, and will provide users with information that improves access and service. The system is shown schematically in Figure 1. Active and future implementations are distinguished by color and labels, data sources and applications are labeled, and arrows show the data flow. User utilities, an external provision of the RDADB that has not yet been implemented, will be developed once all the internal features are complete. The system is currently about 30% complete.

Figure 1
Data Support RDADB Schematic

Research Data Archive Metadata Development

The DSS has been capturing and storing dataset and data file metadata in systematically organized ASCII files for over 20 years. The content of the RDA has been made more visible by extracting basic metadata from the ASCII files and writing it in a Dublin Core-compliant form that has been compiled into THREDDS catalogs and hosted on the UCAR Community Data Portal (CDP). This is a first step toward improved data discovery of the RDA within the context of all UCAR data holdings.
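
As an illustration of that extraction step, the short sketch below writes one dataset's basic metadata as a Dublin Core record using only the Python standard library. The elements and values shown are examples; the actual DSS records and THREDDS catalogs carry much richer content.

    # Illustrative sketch: write one dataset's basic metadata as a simple
    # Dublin Core XML record. The fields and values are examples only.
    import xml.etree.ElementTree as ET

    DC = "http://purl.org/dc/elements/1.1/"
    ET.register_namespace("dc", DC)

    record = ET.Element("metadata")
    for name, value in [
        ("title", "ERA-40 Output Analyses"),
        ("creator", "European Centre for Medium-Range Weather Forecasts"),
        ("identifier", "http://dss.ucar.edu/datasets/ds117.0/"),
        ("coverage", "Global, September 1957 to August 2002"),
        ("subject", "Reanalysis"),
    ]:
        elem = ET.SubElement(record, "{%s}%s" % (DC, name))
        elem.text = value

    ET.ElementTree(record).write("ds117.0_dc.xml",
                                 encoding="utf-8", xml_declaration=True)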

Improvements on this first step will come through a DSS project to standardize the capture and storage of metadata across all datasets and data files. Standardized metadata will make it possible to create better dataset and data file descriptions, enhance data search capabilities, and ensure that the RDA metadata is easily mapped to standards used in the CDP and other data portals, e.g., the NASA Global Change Master Directory.

Work is underway to develop XML formats for storing dataset and data file metadata and to create applications that automatically extract and store the metadata. To begin the effort and prioritize the work on the large (more than 500-member) RDA, the datasets have been ranked by usage frequency over the past year. The datasets in the top 15% are being modified first, and all new datasets will be included. The expertise of all DSS staff is necessary to accomplish this goal, which has involved improvements to the dataset summary texts and will require verification and editing of discovery and use metadata in a system with a constrained standard vocabulary. The strategy is to factor improvements into our operational system as soon as they are ready, providing users with the earliest possible benefit. Moreover, as improvements are made, they will be represented on both the CDP and the RDA data server.

Continuing Research Data Development and Services

International Comprehensive Ocean-Atmosphere Data Set

The International Comprehensive Ocean-Atmosphere Data Set (ICOADS) was updated to include new data for 1998-2002. This is Release 2.1, which now covers 1784-2002. ICOADS is the largest available set of in situ marine observations. Observations from ships include instrument measurements and visual estimates, while data from moored and drifting buoys are exclusively instrumental. The ICOADS collection is constructed from many diverse data sources and is made inhomogeneous by the changes in observing systems and recording practices used throughout the more than two centuries of the period of record. Nevertheless, it is a key reference dataset that documents the long-term environmental state, provides input to a variety of critical climate and other research applications, and serves as a basis for many associated products and analyses.

The observational database is augmented with higher-level ICOADS data products. The observed data are synthesized into products by computing monthly statistical summaries for samples within 2-degree latitude x 2-degree longitude boxes (beginning in 1800) and 1-degree x 1-degree boxes (beginning in 1960). For each resolution the summaries are computed using two different data mixtures and quality control criteria; this partially controls for, and contrasts, the effects of changing observing systems and accounts for periods with greater climate variability. The ICOADS observations and products are freely distributed worldwide. The ICOADS project is a three-way cooperation between the National Oceanic and Atmospheric Administration (NOAA) -- its Climate Diagnostics Center (CDC) and National Climatic Data Center (NCDC) -- and the National Science Foundation (NSF) National Center for Atmospheric Research (NCAR).
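
The production summaries involve extensive quality control and multiple data mixtures; the fragment below is a bare-bones sketch, using made-up observations, of the underlying binning step: group observations by month and 2-degree box and compute a mean.

    # Bare-bones sketch of the binning behind a monthly 2-degree box summary:
    # group observations by (year, month, lat box, lon box) and average them.
    # Real ICOADS processing adds quality control, trimming, and multiple
    # data mixtures; the observations below are made up.
    from collections import defaultdict

    # (year, month, lat, lon, sst) -- hypothetical ship/buoy reports
    obs = [
        (2002, 1,  10.3, -140.2, 26.1),
        (2002, 1,  11.1, -141.7, 26.4),
        (2002, 2, -35.6,   18.9, 19.8),
    ]

    sums = defaultdict(lambda: [0.0, 0])
    for year, month, lat, lon, sst in obs:
        box = (year, month, int(lat // 2) * 2, int(lon // 2) * 2)  # SW corner of 2x2 box
        sums[box][0] += sst
        sums[box][1] += 1

    for box, (total, count) in sorted(sums.items()):
        print(box, "mean SST = %.2f from %d obs" % (total / count, count))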

NCEP/NCAR Global Atmospheric Reanalysis - Reanalysis 1

NCEP/NCAR Global Atmospheric Reanalysis (R1) is a very important component of the RDA. R1 covers the period 1948 to 2004 and is updated monthly, adding about 4 GB of data each month. The four-times-daily analyses provide products on various atmospheric reference frames (pressure and isentropic levels) and at the surface. Simple monthly means are computed, and related derived products (e.g., monthly mean rawinsondes from R1) are available to augment the research. CDROMs for each year -- 2003 is the most recent -- are created to aid in distributing the most-used data. All monthly mean data products, and the most recent 12 months of pressure-level analyses and fluxes at a 6-hourly interval, are available on the RDA data server. Table 3 shows the users and data amounts distributed for 2004. It is remarkable that this archive, which first became available with a limited number of years of data in 1995, is still very popular, with several hundred unique users and over 8 TB of data distributed in 2004.


Table 3
NCEP/NCAR Global Reanalysis data distribution metrics for January-September 2004 and a 2004 annual estimate

 

                             Unique Users [2004 est.]    Data Amount (GB) [2004 est.]
Data Server (Web and FTP)          n/a                          616 [770]
CDROM                               22 [28]                      148 [185]
MSS                                146 [183]                    5483 [6853]
Individual Request                  26 [33]                      458 [573]
Total                              194¹ [243]                   6705 [8381]

¹ Low estimate because number of unique users were not available for the Data Server

NCEP Final Global Analyses and Global Observational Files

The operational weather model and observational archives from NCEP are now more readily available to users. The past 12 months of data from the NCEP operational Global Final (FNL) Analyses at 1-degree x 1-degree resolution and the companion upper air and surface observations are now available online (dataset ID numbers ds083.2, ds353.4, and ds464.0). These online data have greatly assisted MMM and WRF model users. The FNL ranks first, and the surface observations rank third, in data volume downloaded from the RDA data server.

Real-time Data from the Internet Data Distribution System

The UCAR Unidata staff operates and maintains a software system as a primary node on the Internet Data Distribution (IDD) system, and SCD provides the site and administrative support for the system. As part of the collaboration, some of the real-time model and observational data are preserved in RDA datasets. The full period of record -- model data beginning December 2002 and observations beginning May 2003 -- is available from the MSS. Researchers who need near-real-time data benefit further because the most current 90 days of both data streams are kept online. The model data are used by some (approximately 320 unique users and 31 GB of data downloaded for January - September 2004), while the observational data, both upper air and surface, are more popular: they rank eighth among all datasets downloaded from the RDA data server, with about 1,700 unique users receiving 100 GB over the first three quarters of 2004.

Service

The RDA is an important collection of reference datasets. With simple metrics we quantify the number of users and the amount of data delivered. These measures provide assurance that we are delivering a valuable service in the most general sense, and they give one indication of where we should focus our stewardship resources.

Metrics for Users

RDA users can be separated into two categories: those who are known (by name, email, etc.) and those who are anonymous. Known users contact the DSS to submit individual data requests or use an SCD computing account to access data from the MSS. Anonymous users access the RDA from the data server, using web browsers, anonymous FTP, and scripts to gather the data they need. These two user categories are individually significant, but they cannot be jointly summarized on an equivalent scale.

Known users with SCD computing accounts have access to all files in the RDA from the MSS and can find them using the online documentation. Users without computing accounts who need parts of the RDA that are not available from the data server, or who have access limitations, must contact DSS to arrange for their requests to be filled. We have been tracking the number of unique users who make individual requests since 1983 and those accessing the RDA from the MSS since 1990 (Figure 3). Since 1997, about 800 unique users have been served annually by these methods.

Figure 3

Unique Users of the Research Data Archive

Known users identified through individual requests to the DSS or by SCD computing account information. The 2004 values are estimated by adding 25% to January - September measured amounts.

Anonymous users are identified only by the unique IP addresses captured in the web and FTP log files for the RDA data server. The log file information is filtered to a reasonable approximation of data access only. The filtering mechanisms are: all data reside in a controlled directory structure, only full data file downloads are tabulated, and automatic web engines and robots are excluded. Annually since 2002, about 10,000 unique IP addresses have connected to the RDA server and downloaded data files (Figure 4). Downloads of web pages, images, documentation, and other general information are not summarized here; these counts are typically two orders of magnitude greater than the data downloads and represent the ancillary service that accompanies the data itself.
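The filtering described above can be pictured with a short script. The sketch below assumes Apache-style combined access logs, a placeholder /datasets/ directory, and a simple robot list; it approximates a full download by a successful (status 200) GET and is not the actual DSS tabulation code.

    # Illustrative log filter: count unique IPs and bytes for data downloads only.
    # Log format, directory name, and robot list are assumptions, not the DSS setup.
    import re

    LOG_PATTERN = re.compile(
        r'(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "GET (?P<path>\S+) [^"]+" '
        r'(?P<status>\d{3}) (?P<bytes>\d+|-) "[^"]*" "(?P<agent>[^"]*)"'
    )
    ROBOT_AGENTS = ("bot", "crawler", "spider")   # exclude automatic web engines
    DATA_DIR = "/datasets/"                       # controlled data directory

    def summarize(log_path: str):
        """Return (unique IP count, total bytes) for complete data-file downloads."""
        ips, total_bytes = set(), 0
        with open(log_path) as log:
            for line in log:
                m = LOG_PATTERN.match(line)
                if not m:
                    continue                      # not a GET request in the expected format
                if any(r in m.group("agent").lower() for r in ROBOT_AGENTS):
                    continue                      # skip robots
                if not m.group("path").startswith(DATA_DIR):
                    continue                      # data directory only
                if m.group("status") != "200" or m.group("bytes") == "-":
                    continue                      # successful full transfers only
                ips.add(m.group("ip"))
                total_bytes += int(m.group("bytes"))
        return len(ips), total_bytes

    # unique_users, volume = summarize("access_log")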

Figure 4

Anonymous Users of the Research Data Archive

Anonymous users, identified by IP address, that have accessed data files from the RDA data server. The 2004 values are estimated by adding 25% to January - September measured amounts.

Metrics for Data Usage

RDA usage is quantified by the total amount of data read by users and by identifying the top datasets contributing to those amounts. Users gain access to the RDA in three ways: from the MSS, by individual requests to the DSS, and by online files from the RDA data server. Since 1997 the total amount of data delivered to all users has grown from 10 TB to 39 TB (Figure 5). The usage from the MSS for 2002 (20 TB) was anomalously high relative to 2001 and 2003, largely due to increased processing of the NCEP/NCAR Global Reanalysis by CGD staff. In 2003, data usage through individual requests grew from about 3 TB to 10 TB; this significant step increase was the result of several large data deliveries on DLT magnetic tape. That total grew to 13 TB in 2004, not because of further magnetic tape deliveries but because of a shift to ERA-40 data delivery (11 TB) over the network. Access from the RDA data server also showed large growth, from about 2 TB to 6 TB, between 2003 and 2004. The main component (over 3 TB) of this change was due to making more NCEP observational and operational model data available online for MMM and WRF model users. (See NCEP Final Global Analyses and Global Observational Files for details.)

Figure 5

RDA Data Accesses

Amount of RDA data accessed in three ways: from the MSS, by individual request to the DSS, and by online files from the RDA data server. The 2004 values are estimated by adding 25% to January-September measured amounts.

Finally, we look at which RDA datasets are the most significant contributors to the 2004 usage statistics, summarized for data taken from the RDA data server and from the MSS. The top ten datasets acquired from the RDA data server, ranked by the amount of data downloaded during January-September 2004, are shown in Table 4. By far the leading dataset is the NCEP Global Tropospheric Analyses, as discussed above, with over 3 TB downloaded. The ever-popular NCEP/NCAR Global Reanalysis ranks second, and the related NCEP ADP surface observations rank third. Note that datasets with restricted access (e.g. ERA-40) are handled as individual requests and are not tabulated here with the unrestricted data.

Table 4
Top 10 datasets downloaded, ranked by data amount, from the Research Data Archive data server for January-September 2004

Dataset ID¹   Download (GB)   Title
ds083.2       3114            NCEP Global Tropospheric Analyses (FNL), 1°x1°
ds090.0       560             NCEP/NCAR Global Reanalysis
ds464.0       302             NCEP ADP Global Surface Observations
ds744.4       318             QSCAT/NCEP Blended Ocean Winds from Colorado Research Associates
ds540.0       125             International Comprehensive Ocean Atmosphere Data Set (ICOADS), Global Marine Surface Observations
ds090.2       123             NCEP/NCAR Reanalysis Monthly Mean Subsets
ds759.3       102             NGDC, ETOPO2 Global 2' Elevations
ds336.0       100             Unidata Internet Data Distribution (IDD) Global Observational Data
ds010.0       58              Daily Northern Hemisphere Sea Level Pressure Grids
ds083.0       50              NCEP Global Tropospheric Analyses, 2.5°x2.5°

¹ The web address for each dataset is constructed by substituting the Dataset ID Number into this template: http://dss.ucar.edu/datasets/dsnnn.n
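As a small illustration of the footnote's template, the sketch below builds a dataset's web address from its ID number.

    # Build a dataset web address by substituting the dataset ID (e.g. "ds083.2")
    # into the template http://dss.ucar.edu/datasets/dsnnn.n
    def dataset_url(dataset_id: str) -> str:
        return "http://dss.ucar.edu/datasets/" + dataset_id

    print(dataset_url("ds083.2"))  # -> http://dss.ucar.edu/datasets/ds083.2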

The top ten datasets accessed from the MSS, ranked by the amount of data read during January-September 2004, are shown in Table 5. As expected, because model output files are larger than observational data files, the top four datasets are from reanalysis projects (at NCEP and ECMWF) and operational model output. The fourth-ranked dataset here is the same as the first-ranked dataset on the data server. The ERA-40 is showing growing strength, with four products in the top ten, and may well overtake the top spot in this tabulation in coming years. The number of unique users per dataset varies greatly, but the high numbers (greater than 80) for four different datasets indicate that many users access multiple datasets in their research (the total number of unique users across all RDA datasets on the MSS is 360 for this period).

Table 5
Top 10 Research Data Archive datasets accessed from the MSS, ranked by data amount, for January-September 2004

Dataset ID¹   Users   Access (GB)   Title
ds090.0       145     5458          NCEP/NCAR Global Reanalysis
ds118.1       21      3904          ERA40 2.5-Degree Upper Air Analysis on Pressure Surfaces
ds117.2       5       1391          ERA40 Model Resolution Upper Air Analysis on Model Levels
ds083.2       81      1065          NCEP Global Tropospheric Analyses
ds609.2       92      758           GCIP NCEP Eta operational model analyses
ds464.0       91      706           NCEP ADP Global Surface Observations
ds118.0       91      703           ERA40 2.5-Degree Surface and Single Level Analysis
ds117.0       9       539           ERA40 Model Resolution Surface, Vertical Integrals, and Other Single Level Fields
ds091.0       11      178           NCEP/DOE Global Reanalysis
ds111.2       43      177           ECMWF Operational 2.5-Degree Global Surface and Upper Air Analyses

¹ The web address for each dataset is constructed by substituting the Dataset ID Number into this template: http://dss.ucar.edu/datasets/dsnnn.n.

The breakdown of RDA users of MSS files by university, NCAR division, and UCAR program shows that our largest service goes to the universities as a whole (an estimated 317 users in 2004), followed by the NCAR divisions CGD, MMM, and ACD (Table 6). This is good support for university researchers, and we will continue to promote this type of service. As datasets become larger, the SCD storage and computing facilities will become increasingly important as the practical place to manipulate data.

Table 6
Research Data Archive usage from the MSS by user group: University, NCAR Divisions, and UCAR Programs. Values for January-September 2004 and a 2004 annual estimate.

User Group            Users [2004 est.]   Access (GB) [2004 est.]
University            254 [317]           8,822 [11,027]
CGD                   32 [40]             5,855 [7,319]
MMM                   30 [38]             307 [384]
ACD                   11 [14]             350 [438]
Others (10 entries)   33 [41]             544 [680]


Noteworthy Support

The skill and knowledge in DSS reach beyond routine RDA curation and stewardship to support projects within UCAR and at related data centers nationally and internationally. These activities can involve the mechanics of moving data, backing up data archives, or assisting with user support. Table 7 lists some noteworthy support provided during 2004.

Table 7
Noteworthy Support

Collaborating Organization             Support Activity
Max-Planck-Institut für Meteorologie   Courtesy updates of NCEP/NCAR Reanalysis
NCAR CGD                               Provide tape import device and server for 33 TB of IPCC data (315 SDLT tapes)
NOAA Climate Diagnostics Center        Full 2.5-degree ERA-40 archive set
COSMIC SuomiNet                        Data backup and 80 GB data recovery
CSU, Garrett Campbell                  Transfer of the ISCCP DX time series to the RDA
RAP, Greg Thompson                     User download of three years of RUC and Eta model data
NOAA Climate Diagnostics Center        NCEP/NCAR Reanalysis data (23 GB) for data recovery

 

 
