SCD Achievements

High Performance Computing

Maintaining NCAR's production supercomputing environment

The production supercomputer environment managed by SCD for NCAR has evolved over the years. During the last 20 years, SCD has brought NCAR's science into the multiprocessing supercomputer world. Prior to the introduction of the 4-CPU Cray X-MP in October 1986, all modeling was performed with serial codes. Since then, the focus has been on redeveloping codes to harness the power of multiple CPUs in a single system and, most recently, in multiple systems.

Supercomputing systems deployed at NCAR

During the last 20 years, SCD has deployed a series of Parallel Vector Processor (PVP) systems ranging from a 2-CPU Cray Y-MP to a pair of 24-CPU Cray J90se systems. Massively Parallel Processing (MPP) systems included the Cray T3D with 128 processors and the Thinking Machines CM2 and CM5 systems. Most recently, Distributed Shared Memory (DSM) systems have been deployed; these include the Hewlett-Packard SPP-2000, SGI Origin2000, Compaq ES40 cluster, SGI Origin3800, IBM SP POWER3 and POWER4 systems, and now Linux clusters.

The diagram at left shows the systems that SCD has deployed for NCAR's use since its inception. Systems shown with blue bars were deployed for production purposes; those shown in red were (or are) considered experimental.

In 1986, with the first multiprocessor system (the Cray X-MP/4) on NCAR's floor, SCD could deliver on average approximately 0.25 GFLOPS of sustainable computing capacity to NCAR's science. In the roughly 20 years since, that sustained computing capacity has grown to over 587 GFLOPS, with a peak capacity of 12.1 teraflops (TFLOPS). The image at right illustrates this trend.
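As a rough illustration of the growth these figures imply, the sketch below computes the number of capacity doublings and the average doubling time. The 18-year span (October 1986 to the end of FY2004) is an assumption made for this example, as is treating the two sustained figures as directly comparable.

```python
import math

# Sustained capacity figures quoted in the text (GFLOPS).
sustained_1986 = 0.25
sustained_2004 = 587.0

# Assumption for this sketch: about 18 years between the Cray X-MP/4
# (October 1986) and the end of FY2004.
years_elapsed = 18.0

growth_factor = sustained_2004 / sustained_1986
doublings = math.log2(growth_factor)
doubling_time_years = years_elapsed / doublings

print(f"growth factor: {growth_factor:.0f}x")
print(f"doublings: {doublings:.1f}")
print(f"average doubling time: {doubling_time_years:.1f} years")
```

Under these assumptions, sustained capacity grew by a factor of about 2,348, which corresponds to a doubling roughly every 1.6 years.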

FY2004 production system overview

In FY2004, Phase III of the current Advanced Research Computing System (ARCS) was delivered to NCAR. This phase expanded the IBM cluster (bluesky) with fourteen 32-way p690 SMP servers, each based on the POWER4 microprocessor operating at a clock frequency of 1.3 GHz.

Each server included 64 GB of memory. The system expansion also included 10.5 TB of formatted disk storage, which was added to the existing disk subsystem, increasing bluesky's total disk capacity to 31 TB. Of the 14 servers, only 12 were added to bluesky; the remaining two are temporarily being used for a special SCD testbed project. At the end of FY2004, bluesky comprised 50 POWER4 Regatta-H Turbo frames, making it the single largest system of this type in the world.

The 12 additional 32-way p690 SMP servers were used to support CCSM for contributions to the IPCC process, as reported in SCD's Annual Budget Review. The installation of the bluesky system and its subsequent augmentation have doubled the capacities of both the Climate Simulation Laboratory and Community computing.

In addition, several major system software upgrades were performed on all supercomputers.

Supercomputer systems maintained during FY2004

Distributed Shared Memory (DSM) systems:

  • SGI Origin2100 (chinookfe), with 8 processors, was used in the Climate Simulation Laboratory.
  • SGI Origin3800 (chinook), with 128 processors, was used in the Climate Simulation Laboratory.
  • SGI Origin2000 (dataproc), with 16 processors, was used by both Climate Simulation Laboratory and Community users.
  • SGI Origin2000 (mouache), with 4 processors, was used as a test platform by SCD for evaluation of new Irix systems, libraries, and compilers prior to their installation on the production SGI platforms; all interested users now have access to mouache.
  • IBM SP (babyblue), with 64 processors, was shared by the Climate Simulation Laboratory and the Community.
  • IBM SP (blackforest), with 1,308 processors, was shared by the Climate Simulation Laboratory and the Community.
  • IBM NightHawk2 (dave), with 16 processors, was shared by the Climate Simulation Laboratory and the Community.
  • IBM p690 Regatta (bluedawn), with 16 processors, was used as a test and development platform for the integration of the IBM POWER4 Cluster 1600.
  • IBM Cluster 1600 (bluesky), with 1,600 processors, was shared by the Climate Simulation Laboratory and the Community.

New supercomputer systems added during FY2004

DSM systems:

As an element of its five-year strategic plan to aggressively evaluate and deploy potentially more cost-effective new computing technologies, SCD acquired a large-scale Linux-based supercomputer cluster. Following a competitive procurement process, IBM was selected to deliver a 256-processor e1350 Linux cluster. The system, called lightning, was delivered in July 2004. It uses 2.2 GHz AMD Opteron processors and has a peak computational capacity of 1.14 teraflops, 0.5 terabytes of memory, and 7 terabytes of disk. The CAM and POP benchmarks demonstrated that lightning will outperform bluesky by a factor of 1.3 or more on a per-processor basis. This system brought the total computational capacity at NCAR to 12.1 teraflops.

Production system performance and utilization statistics

At the end of FY2004, the production supercomputer environment managed by SCD for NCAR included five IBM supercomputers and four SGI supercomputers. The following tables provide average utilization and performance statistics for the production supercomputer systems SCD operated in FY2004.

In addition, SCD publishes monthly usage reports at http://www.scd.ucar.edu/dbsg/dbs/. These reports provide summary information on system usage, project allocations, and General Accounting Unit (GAU) use.

End-FY2004 production supercomputer systems

The SCD supercomputer resources comprise two separate computational facilities: the Climate Simulation Laboratory (CSL) and Community Computing facilities. Some systems, such as the IBM SP systems, the dave system, and the dataproc system, are shared between these two facilities. The following sections describe the supercomputing systems available in these two facilities.

CSL facility:

The Climate Simulation Laboratory facility provided the following supercomputing resources at the end of FY2004:


Climate Simulation Lab facility, FY2004 configuration

| Allocation | System | # CPUs | GB memory | Peak GFLOPS | Notes |
|---|---|---|---|---|---|
| Dedicated | IBM SP (blackforest) | 560 | 280 | 840.0 | 1,120 total system batch CPUs; 560 dedicated to CSL |
| Dedicated | IBM SP (bluesky) | 704 | 1408 | 3660.8 | 1,408 total system batch CPUs; 704 dedicated to CSL |
| Dedicated | SGI Origin3800 (chinook) | 124 | 64 | 124.0 | 124 CPUs dedicated to CSL |
| Dedicated | SGI Origin2100 (chinookfe) | 8 | 8 | 4.0 | Front-end system for chinook |
| Shared | IBM SP (babyblue) | 48 | 24 | 72.0 | Shared new-release test platform; available for user use |
| Shared | SGI Origin2000 (dataproc) | 16 | 32 | 8.0 | Shared with Community for data analysis and post-processing applications |
| Shared | IBM NightHawk2 (dave) | 16 | 32 | 24.0 | Shared with Community for data analysis and post-processing applications |
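The peak figures in the CSL configuration can be read as CPU count multiplied by a per-CPU peak rate. The following minimal sketch derives that per-CPU rate from the listed values and sums the dedicated CSL peak. The short system keys are labels chosen for this example, and the figure of 4 floating-point operations per cycle for a POWER4 core is an assumption not stated in this report.

```python
# (CPUs, peak GFLOPS) as listed in the CSL facility configuration.
csl_systems = {
    "blackforest": (560, 840.0),
    "bluesky": (704, 3660.8),
    "chinook": (124, 124.0),
    "chinookfe": (8, 4.0),
    "babyblue": (48, 72.0),
    "dataproc": (16, 8.0),
    "dave": (16, 24.0),
}

# Per-CPU peak rate implied by each row.
per_cpu_gflops = {
    name: peak / cpus for name, (cpus, peak) in csl_systems.items()
}

# Assumption: a POWER4 core can retire 4 floating-point operations per
# cycle (two fused multiply-add units), so 1.3 GHz gives 5.2 GFLOPS.
bluesky_expected = 1.3 * 4

dedicated = ("blackforest", "bluesky", "chinook", "chinookfe")
dedicated_peak = sum(csl_systems[name][1] for name in dedicated)

for name, value in per_cpu_gflops.items():
    print(f"{name:12s} {value:.1f} GFLOPS per CPU")
print(f"dedicated CSL peak: {dedicated_peak:.1f} GFLOPS")
```

Under that assumption, the bluesky row is consistent with the 1.3 GHz clock frequency quoted for the p690 servers.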


Community Computing facility:

The Community Computing facility provided the following supercomputing resources at the end of FY2004:


Community Computing facility, FY2004 configuration

| Allocation | System | # CPUs | GB memory | Peak GFLOPS | Notes |
|---|---|---|---|---|---|
| Dedicated | IBM SP (blackforest) | 560 | 280 | 840.0 | 1,120 total system batch CPUs; 560 dedicated to Community |
| Dedicated | IBM SP (bluesky) | 704 | 1408 | 3660.8 | 1,408 total system batch CPUs; 704 dedicated to Community |
| Shared | IBM SP (babyblue) | 48 | 24 | 72.0 | Shared new-release test platform; available for user use |
| Shared | SGI Origin2000 (dataproc) | 16 | 16 | 8.0 | Shared with CSL for data analysis and post-processing applications |
| Shared | IBM NightHawk2 (dave) | 16 | 32 | 24.0 | Shared with CSL for data analysis and post-processing applications |


Key maintenance activities

During FY2004, SCD performed ongoing maintenance to ensure the integrity and reliability of existing computational systems and to improve the quality of service to the NCAR user community. Some of the key areas were:

Maintain supercomputer operating systems
SCD stayed apprised of major software releases from IBM and carefully scheduled upgrades to the production systems and product set software based on the judged stability of those upgrades in the NCAR production environment. SCD also continued to provide major system support for the SGI Origin3800 and Origin2000 systems.

Maintain stability and reliability of systems
One of the most significant attributes of the NCAR computational environment is its overall stability and reliability. For instance, the NCAR Mass Storage System has a reputation for reliability, and SCD has in the last year deployed a number of high-availability fileserver systems. This reliability and stability do not come easily; they stem from a combination of choosing reliable, stable vendor products and using proven, fail-safe system administration and maintenance techniques. SCD will continue to focus on ensuring, in whatever ways possible, highly stable and reliable systems and systems operations.

System monitoring
Over the years, SCD has developed a large number of system monitoring procedures, techniques, and tools. SCD continued to apply its collective experience to maintain the stability of the existing production systems through proactive monitoring. SCD also continued to enhance its monitoring tools, techniques, and procedures, and automated a number of procedures for detecting system failures or trouble. This automation was integrated with commercial alphanumeric paging technology to alert SCD operations and systems staff more rapidly and thus reduce the time that systems are unavailable to the NCAR user community when they do fail.

Computer Security and Divisional Threat Response

SCD manages a diverse computational and data storage environment containing high-end computers, mass storage subsystems, data archives, visualization, e-mail, DNS, authentication and web servers, and networks (including IP telephony). Not only are these systems valuable monetarily, they are also vital scientific research tools and business continuity systems used by the UCAR/NCAR organization and university communities.

In response to a major cybersecurity incident that involved multiple high-performance computing sites in March 2004, SCD rapidly developed and deployed a long-term solution for protecting the supercomputing and mass storage systems at NCAR. SCD now requires one-time password tokens, arbitrated via encryption devices issued to all users, to access these systems. Security procedures were updated and published to provide all users with guidelines and instructions for working within the secure supercomputing environment.

One of the problems encountered during the March 2004 incidents was a lack of effective communication among the affected institutions. SCD proposed a conference to bring together stakeholders from the nation's research and high-performance computing centers to prepare a coordinated response for future incidents.

With funding from the National Science Foundation (NSF), SCD planned, organized, and hosted a two-day Cybersecurity Summit near Washington, D.C. Attended by over 120 cybersecurity experts from some of the nation's leading research institutions, the summit explored the competing needs of maintaining an open, collaborative research environment and protecting the security and integrity of computing and data assets.

Sites participating in the Cybersecurity Summit

The map shows the locations of the sites participating in the Cybersecurity Summit. This broad-based collaboration aims to coordinate strong response plans for threats against research computing and data.

Cybersecurity Summit 2004 was the first step in laying the foundation for responding to future large-scale security breaches and reducing the disruptive impact of such incidents on the nation's research agenda. These research institutions are increasing their cooperation on security policies, procedures, and incident response to better protect the nation's scientific computing and data resources.

Data Archiving and Management: The Mass Storage System (MSS)

The NCAR Mass Storage System (MSS) is a large-scale data archive that stores data used and generated by climate models and other programs executed on NCAR's supercomputers and compute servers. At the end of FY2004, the NCAR MSS held more than 25 million files containing over 1,247 terabytes (TB) of unique data, and total holdings exceeded 2,149 TB (2.1 petabytes) when duplicate copies are included. The net growth rate of unique data in the MSS was approximately 30 TB per month.

On average, 160,000 cartridges are mounted each month, approximately 1,000 of these (under 1%) by operators and the rest (about 99%) in the StorageTek Powderhorn Automated Cartridge Subsystems (ACS). The StorageTek Powderhorn ACS systems (also called "silos") use robotics to mount and dismount cartridges. On a daily basis, the MSS handles approximately 41,000 requests resulting in the movement of over 3,900 GB of data. During FY2004, data transfers servicing user requests to and from the MSS exceeded 1,400 TB.
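The daily and annual transfer figures can be cross-checked with a little arithmetic. The sketch below is illustrative only; it assumes decimal units (1 TB = 1,000 GB, 1 GB = 1,000 MB) and a 365-day year, neither of which is stated in the report.

```python
# Daily figures quoted in the text.
requests_per_day = 41_000
gb_moved_per_day = 3_900

# Assumptions for this sketch: decimal units and a 365-day year.
avg_mb_per_request = gb_moved_per_day * 1_000 / requests_per_day
avg_mb_per_second = gb_moved_per_day * 1_000 / 86_400
tb_moved_per_year = gb_moved_per_day * 365 / 1_000

print(f"average request size: {avg_mb_per_request:.0f} MB")
print(f"average sustained rate: {avg_mb_per_second:.0f} MB/sec")
print(f"annual data movement: {tb_moved_per_year:.1f} TB")
```

With these assumptions, the daily rate extrapolates to about 1,423 TB per year, in line with the stated total of more than 1,400 TB, and the average request moves roughly 95 MB.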

While some of the data stored on the NCAR MSS originate from field experiments and observations, the bulk of the data is generated by global climate-simulation models and other earth-science models that run on supercomputers. As supercomputers become larger and faster, they generate more data, so SCD faces a steadily increasing demand for archiving. The growing use of coupled atmospheric/oceanic simulation models will increase that demand further.

MSS history

The NCAR Mass Storage System has evolved over the last 18 years. Prior to late 1989, mass storage at NCAR consisted strictly of offline, manual-mount media. In November 1989, the first STK Powderhorn "silo" was acquired, commencing a new era of mass storage at NCAR. The following figure illustrates the various technologies that have been used to store critical datasets throughout NCAR's history.

During FY2004, the NCAR Mass Storage System grew from 20,340,049 files with a total of 880 unique TB to 25,121,621 files with a total of 1,247 unique TB. Total holdings grew from 1,500 TB (1.5 PB) to over 2,149 TB (2.149 PB). This was an average net growth rate of 30 unique TB (60 total TB) per month during FY2004.
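The unique-data growth rate follows directly from the year-end totals. The sketch below reproduces it from the quoted file counts and unique-data volumes; it assumes a 12-month fiscal year and decimal units (1 TB = 1,000,000 MB) for the average file size.

```python
# Year-end figures quoted in the text.
files_start, files_end = 20_340_049, 25_121_621
unique_tb_start, unique_tb_end = 880, 1_247
months = 12

unique_tb_per_month = (unique_tb_end - unique_tb_start) / months
new_files_per_month = (files_end - files_start) / months

# Assumption: decimal units, 1 TB = 1,000,000 MB.
avg_mb_per_file = unique_tb_end * 1_000_000 / files_end

print(f"unique growth: {unique_tb_per_month:.1f} TB per month")
print(f"new files: {new_files_per_month:,.0f} per month")
print(f"average unique data per file: {avg_mb_per_file:.1f} MB")
```

The result is about 30.6 TB of unique data per month, which rounds to the quoted 30 TB, with roughly 400,000 net new files each month.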

The MSS Today

MSS Access Methods

During FY2004, the technology used to access MSS data continued to undergo substantial change. A migration is underway from the use of the older, non-commodity, High Performance Parallel Interface (HiPPI) technology to the use of Gigabit Ethernet (GigE) and Fibre Channel (FC) technologies.

The HiPPI technology provides direct storage-device access via the High-Performance Data Fabric (HPDF). The data fabric consists of HiPPI channel interfaces to host computers, non-blocking HiPPI switches capable of supporting multiple bi-directional 100 MB/sec data transfers, and protocol converters that connect the HiPPI data fabric to the IBM-style device control units. To utilize the HPDF, SCD staff wrote a file-transport interface that enables users to copy files between their host systems and the MSS. At the end of FY2004, the HPDF supported 12 independent file transfer operations between the tape devices and the compute servers, each sustaining 10 MB/sec, for an aggregate total of 120 MB/sec.

HiPPI technology continues to be deployed only in a niche market. It has not shown signs of spreading into the commodity marketplace, and as a result the cost of HiPPI technology has remained high and the number of HiPPI vendors is small. The lack of availability of and support for HiPPI technology is becoming a critical issue to the continued operation of the MSS.

To alleviate these issues, SCD staff wrote the UNIX-based Storage Manager (STMGR), which replaces the HPDF as the method used to access data by host systems. STMGR isolates the client host systems from directly accessing the storage devices, simplifying the code SCD has to write and maintain for each type of host operating system. It also eliminates the need for HiPPI channel interfaces and device drivers on the client hosts. In place of HiPPI, commodity TCP/IP networking is used to access STMGR from the client host systems. Client host systems can use any available network interface at any speed to access files on the MSS. Currently, when using GigE, data rates in the range of 30-60 MB/sec are easily achievable with recent computer hardware. Using high-speed Ethernet as the client system interconnect means that future deployment of higher-speed Ethernet will automatically raise the capacity of the client system interconnect.

The use of UNIX systems for STMGR allows SCD to deploy the latest storage hardware and software technologies to manage MSS data. STMGR server systems initially use a FC Storage Area Network (SAN) to access RAID and tape drives via a high-reliability switch. Fibre Channel is currently available in versions that support either approximately 100 MB/sec or 200 MB/sec bidirectionally. Multiple FC connections may be made between STMGR servers and storage devices, and aggregate I/O rates approaching 1 GB/sec are possible with commodity components on a single STMGR server. The use of FC RAID plus journalling file systems allows STMGR to improve the robustness and flexibility of the disk cache. Also, MSS administrators can have STMGR reallocate resources between disk cache partitions or add space to disk cache partitions on the fly without interruption to MSS clients.

Near the end of FY2003, STMGR was placed into production as a replacement for the old IBM 3390 disk farm. The old disk farm could store approximately 180 GB, was used to buffer files smaller than 15,000,000 bytes, and supported an aggregate transfer rate of 12 MB/sec. At initial deployment, the STMGR disk cache stored approximately 500 GB and supported an aggregate transfer rate approaching 120 MB/sec. During FY2004, the STMGR disk cache was increased to approximately 8 TB to buffer files up to 50 MB in size. In FY2005, the STMGR disk cache will grow to approximately 60 TB, will buffer files of all sizes, and will support an aggregate transfer rate approaching 400 MB/sec. A disk cache of this size will permit newly written files to reside in the cache longer, which will reduce the number of tape mounts and tape I/O. STMGR will also, with further improvements in MSS software, allow better tape utilization by allowing files with differing storage requirements to be segregated on separate tape media. Both of these improvements will reduce the total number of tape drives required to support the aggregate data rates between the MSS and the client host systems.

Also during FY2004, the use of HiPPI was reduced for newly written tape files when STMGR assumed the role of providing tape access. HiPPI can be decommissioned in FY2005 once all data has oozed off the StorageTek 9840A media. New tape devices, such as the StorageTek T9940B Fibre Channel-attached drive, store up to 200 GB and support I/O rates in the neighborhood of 30 MB/sec, a threefold improvement in both storage density and transfer rate over the current tape devices. These improvements will allow the MSS to expand into the multi-petabyte range while reducing the latency to access MSS files.

MSS Storage Hierarchy

The NCAR MSS currently uses two levels of storage: online and offline. The most frequently accessed data are kept on the fastest storage media, the online storage devices: 8 TB of Fibre Channel RAID storage and five StorageTek Powderhorn ACSes. The Powderhorn ACSes use StorageTek 9840A and 9940A, as well as StorageTek 9940B, technology. The five ACSes currently provide a total online capacity of approximately 2 petabytes. The total capacity of the online archive will exceed 6 petabytes using 9940B 200 GB cartridges.

Expansion of the MSS storage hierarchy is planned over the next five years with the introduction of new tape technologies, new ACSes, and with the integration of a multi-terabyte disk farm cache. Simulations of the MSS workload indicated that a 60-TB disk farm cache can reduce the amount of tape readback activity by as much as 60%. The disk farm cache would not only reduce the number of tape drives required in the system but also provide a much-improved response time to read and write requests. In addition, the MSS Group will continue to evaluate hardware and software solutions being developed by vendors throughout FY2005 and how they might be integrated into the NCAR MSS.

MSS Import/Export Capability

Another important capability of the NCAR MSS is the ability to import and export data to and from external portable media. Importing data involves copying data from portable media to the MSS data archive, while exporting data involves copying data from the MSS data archive to portable media. Import/export allows users to bring data to NCAR with them, as well as take data away. Import also allows data from field experiments to be copied to the NCAR MSS archive.

Options to exchange data with smaller satellite storage systems are being investigated. Using this technique, data generated at NCAR could be transferred to remote sites for further analysis. The NCAR SCD storage model would thus be geographically distributed, rather than centrally located and administered.

In addition to 3480 and 3490E cartridge tapes, the NCAR MSS also offers import/export to single and double-density 8mm Exabyte cartridge tapes. The deployment of an MSS-IV Import/Export server in FY2000 provided the ability to support many more device types, such as CD-ROM, DAT, and newer Exabyte media, to name a few.

MSS Accomplishments for 2004

Disk farm cache simulator

To aid capacity planning and performance tuning of the MSS, a simulator that includes all the major hardware and software components of the MSS was developed in 2003. The simulator enables the MSS group to consider different design alternatives for new software and hardware components and estimate how the different designs will perform before the components are added to the actual system. Simulation studies were conducted in 2004 using an earlier version of this simulator (that only simulated the disk cache component of the MSS) to aid in configuring and sizing the STMGR disk cache system.

In addition, simulator output was combined with MSS warehouse information to help measure the effectiveness of external data caches which were deployed to avoid rereading data from the MSS, thus avoiding the abuse of a data archive as a file server. The external cache deployment resulted in as much as a 60% drop in such re-reads.
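The report does not describe the simulator's internals. As a minimal sketch of the general idea, the following trace-driven simulation replays read and write requests through a disk cache placed in front of tape and counts how many reads must go back to tape. The least-recently-used (LRU) eviction policy, the `simulate_cache` function, the trace format, and the sample trace are all assumptions made for illustration.

```python
from collections import OrderedDict


def simulate_cache(trace, capacity, max_file_size=None):
    """Replay a request trace through an LRU disk cache in front of tape.

    trace: iterable of (op, name, size) with op in {"read", "write"}.
    Returns counts of reads served from cache and reads that went to tape.
    """
    cache = OrderedDict()  # file name -> size, least recently used first
    used = 0
    cache_hits = 0
    tape_reads = 0

    for op, name, size in trace:
        if op == "read":
            if name in cache:
                cache.move_to_end(name)  # mark as most recently used
                cache_hits += 1
                continue
            tape_reads += 1  # miss: the file must be read back from tape

        # Writes and read misses both try to place the file in the cache.
        if name in cache:  # a rewrite invalidates the cached copy
            used -= cache.pop(name)
        too_big = size > capacity or (
            max_file_size is not None and size > max_file_size
        )
        if too_big:
            continue
        while used + size > capacity:  # evict least recently used files
            _, evicted_size = cache.popitem(last=False)
            used -= evicted_size
        cache[name] = size
        used += size

    return {"cache_hits": cache_hits, "tape_reads": tape_reads}


# Hypothetical trace; sizes are in arbitrary units.
trace = [
    ("write", "a", 40), ("read", "a", 40), ("write", "b", 40),
    ("write", "c", 40), ("read", "a", 40), ("read", "c", 40),
]
with_cache = simulate_cache(trace, capacity=100)
without_cache = simulate_cache(trace, capacity=0)
print(with_cache, without_cache)
```

Comparing `tape_reads` with and without the cache for a recorded workload gives the kind of readback reduction estimate discussed above. The `max_file_size` parameter mimics a cache that only buffers files below a size threshold.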

StorageTek 9940B Technology

Initial deployment of 20 StorageTek 9940B tape drives was completed in FY2004. Managed by the STMGR, these drives are servicing the files offloaded from the disk cache and local system backup files. An additional 20 9940B tape drives will be installed in early FY2005, and with the expansion of the disk cache, a data ooze will be started in FY2005 to replace the 9940A technology.

User Education

As a result of the SCD-held user forum on computing issues, MSSG compiled for SCD's Consulting Office a short list of do's and don'ts regarding the NCAR Mass Store to help guide users toward efficient and proper use of the MSS.

New MSS Hosts

The IBM eSeries Linux Cluster, named lightning, was provided with Mass Store connectivity in 2004.

MSS Growth

NCAR Mass Storage System growth during FY2004 increased over FY2003. The average net growth rate during FY2003 was 27 TB per month, whereas the average net growth rate during FY2004 was 30 TB per month. This increase can be attributed to several factors, such as new MSS hosts coming online, increased amounts of local disk storage on several machines (which increases the size and number of MSS backup files), the IPCC initiative, and 14 computing nodes added to the IBM POWER4 cluster (bluesky). Further increases in the net growth rate are expected with the addition of two Linux clusters in early FY2005. Projecting this growth into the future, it is clear that new storage paradigms and user education will be required; without them, the growth will become untenable in just three to five years.

The following table compares year-end statistics for FY2000 - FY2004.

Mass Storage System Growth Statistics

| Statistic | eFY2000 | eFY2001 | eFY2002 | eFY2003 | eFY2004 |
|---|---|---|---|---|---|
| Total storage unique bytes (TB) | 273 | 379 | 519 | 880 | 1,500 |
| Total files (millions) | 8.3 | 10.9 | 14.4 | 20.3 | 25 |
| Net growth (unique bytes TB per month) at eFY | 5.2 | 10 | 13.3 | 27 | 30 |
| Data read/written (TB per month) | 25 | 37 | 49 | 83 | 118 |
| Data migrated internally (TB per month) | 25 | 74 | 68 | 83 | 90 |
| Manual tape mounts (number per month) | 18,000 | 10,000 | 11,000 | 6,800 | 1,000 |
| Robotic tape mounts (number per month) | 54,000 | 95,000 | 110,000 | 123,200 | 160,000 |
| Offline cartridge count | 142,000 | 126,000 | 70,000 | 20,000 | 24,000 |
| Sustained GFLOPS on NCAR computing floor | ~75 | ~75 | ~140 | ~388 | ~500 |

Future Plans for the MSS

Key issues to be addressed over the next four years include:

  • Managing data growth and integrating new storage technologies to keep pace with the projected growth in computing power, and finding ways to reduce MSS growth.
  • Providing web-based MSS tools and interfaces to handle the unique problems of large-scale MSS file management, and folding at least some of these into the SCD Portal.
  • Developing Quality of Service metrics from warehoused and other data to measure and report system performance.
  • Exploring possibilities for collaborative research topics pertinent to the management and performance of large-scale data systems.
  • Implementing a multi-terabyte "internal" disk farm cache to be positioned in front of the MSS tape archive to improve overall response time and reduce tape traffic substantially.
  • Integrating multiple "external" disk caches to further reduce the load on the MSS and pave the way for a global, shared front-end fileserver.
  • Deploying a new Metadata Server to replace the Master File Directory (MFD). The new Metadata Server will use a commercial database with the capability of scaling beyond the limitations of the current MFD.
  • Warehousing ongoing MSS performance data for subsequent accounting, analysis, and reporting.
  • Investigating solutions to address disaster recovery.
  • Using the newly developed MSS simulator to aid in capacity planning and performance tuning of the system.

Computational Science Research

The mission of the Computational Science Section (CSS) is to help realize the end-to-end scientific simulation environment envisioned by the NCAR Strategic Plan. To this end, CSS benchmarks and evaluates computer technology, learns to extract performance from it, pioneers new and efficient numerical methods, creates software frameworks to facilitate scientific advancement (particularly through interdisciplinary collaborations), and shares the resulting software and findings with the community through open source software, publications, talks, and websites.

Applied Computer Science Research Activities

In 2004, CSS's applied computer science efforts centered on three activities: studying experimental, massively parallel architectures such as Blue Gene/L; benchmarking and evaluating Linux clusters as part of an SCD procurement; and porting applications to Linux-Itanium systems as part of the Gelato Federation.

Blue Gene/L Application Research

IBM has developed a novel, low-power, densely packaged, massively parallel computer system called Blue Gene/L. Each node of Blue Gene/L consists of dual PowerPC 440 cores running at 700 MHz. Each core is capable of two floating-point multiply-adds per clock cycle, and 1,024 nodes can be packed into a single 19-inch rack. Thus a single rack of Blue Gene/L processors has a peak speed of 5.6 teraflops. This is achieved while consuming about 15 kW of electrical power, a tiny fraction of that consumed by conventional massively parallel systems. Apart from its low power and dense packaging, Blue Gene/L has several interesting architectural characteristics, for example a dedicated tree reduction and synchronization network, as well as a toroidal interconnect.
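The per-rack peak follows from the figures just given. The sketch below works through the arithmetic, counting each multiply-add as two floating-point operations; the variable names are chosen for this example only.

```python
# Figures quoted in the text.
clock_ghz = 0.7            # 700 MHz PowerPC 440 core
cores_per_node = 2
fma_per_cycle = 2          # floating-point multiply-adds per core per cycle
nodes_per_rack = 1_024
rack_power_kw = 15.0

flops_per_fma = 2          # one multiply plus one add

core_gflops = clock_ghz * fma_per_cycle * flops_per_fma
node_gflops = core_gflops * cores_per_node
rack_gflops = node_gflops * nodes_per_rack
gflops_per_watt = rack_gflops / (rack_power_kw * 1_000)

print(f"per core: {core_gflops:.1f} GFLOPS")
print(f"per node: {node_gflops:.1f} GFLOPS")
print(f"per rack: {rack_gflops:.1f} GFLOPS")
print(f"efficiency: {gflops_per_watt:.2f} GFLOPS per watt")
```

Each core peaks at 2.8 GFLOPS and each node at 5.6 GFLOPS, so a rack reaches 5,734.4 GFLOPS. That is about 5.7 teraflops with 1,000 GFLOPS per teraflop, or 5.6 if the factor of 1,024 nodes is treated as the GFLOPS-to-teraflops conversion, which matches the quoted figure.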

IBM Blue Gene/L

Figure 1: 512 nodes (1,024 processors) of an IBM BlueGene/L system (photograph courtesy of IBM Research).

In 2004, scientists in CSS, in collaboration with researchers from CU-Denver and CU-Boulder, submitted a proposal to the NSF's Major Research Instrumentation program. The objective was to acquire a 1,024-node Blue Gene/L system, study the performance of scalable applications on it, and evaluate its production capabilities. This proposal was recently funded by the NSF, and SCD is currently in negotiations with IBM to obtain a Blue Gene/L supercomputer for evaluation. The system will be used for high-resolution studies of moist physical processes, employing the cloud-resolving convection parameterization (CRCP). Because of its scalability, the primitive equations dynamical core for these studies of CRCP physics will be the section's prototype spectral element model, HOMME.

Throughout the past year, members of CSS, working closely with IBM computer scientists, have been benchmarking a 512-node Blue Gene/L prototype located at IBM's T.J. Watson Research Center. The benchmarks were chosen to measure the system's performance on key algorithms drawn from our proposed atmospheric science projects. All-to-all, point-to-point, and global reduction communication benchmarks have been used to measure the capabilities of the Blue Gene/L's networks, and prototype CRCP physics packages have been ported and optimized.

Benchmarking, Porting and Performance Modeling Activities

CSS has also been extensively involved in evaluating and benchmarking clusters for the recent procurement by SCD of a 256-processor Linux-based system. This procurement resulted in the acquisition of an Opteron/Myrinet Linux cluster, which achieved performance levels 1.3-1.4 times higher than an equivalent number of IBM 1.3-GHz POWER4 processors. In 2004, CSS performed extensive testing and evaluation of the IBM "Federation" interconnect.

CSS has continued to expand its engagement with computer science students. Dr. Henry Tufo in CSS has played a key role in exploiting this opportunity by leveraging his joint appointment as a Computer Science professor at the University of Colorado to involve four graduate students in NCAR research problems. Students of Dr. Tufo are working in the areas of application porting and tuning, Linux cluster system administration, and Grid computing applications. CSS staff also provided technical support to computer science students in a course taught by John Halley at the University of San Diego, in which NCAR applications were ported to a variety of platforms.

Gelato Membership

The Itanium very long instruction word (VLIW) architecture represents an important departure from the traditional superscalar RISC microprocessor and CISC-like Pentium architectures used in the geosciences departments at most universities today. Since VLIW relies on the compiler rather than on-chip circuitry to extract parallelism from the instruction stream, developing robust optimizing compilers for Itanium is critical. As Itanium microprocessors become plentiful in the geosciences community, access to reliable compilers, ported modeling applications, and open-source high-performance mathematical libraries optimized for this architecture will enable scientific progress on Itanium Linux systems.

In the past year, CSS's role as a member of the Gelato Federation, an organization devoted to the advancement of the Linux-Itanium technical solution, has been to "beta test" the Intel Fortran and C++ compilers on the Intel Itanium and Itanium-2 processors by porting and tuning a variety of applications, such as CAM2 and MM5, to this platform. In this capacity, CSS has closely collaborated with SGI to port CCSM to the Altix (Itanium-based) shared-memory architecture. As a result, CCSM has recently been validated and has successfully demonstrated exact restart capability on this platform.

Applied Mathematics Research Activities

The research activities of the Computational Science Section (CSS) at NCAR are focused on three broad goals. First, work sponsored by the Department of Energy's Climate Change Prediction Program (CCPP) is developing a new generation of accurate, efficient, and scalable general circulation models, based on high-order methods and suitable for use by the atmospheric research community. To this end, CSS has conducted applied mathematical research, tested novel numerical algorithms using the standard test cases of the atmospheric science research community, and has created efficient software implementations of these algorithms.

CSS has also been working to integrate two physics packages into these models: the physics in the Community Atmospheric Model Version 2 (CAM2), recently used for IPCC simulations as a component of the Community Climate System Model (CCSM) (Blackmon, et al. 2001), and a Cloud Resolving Convective Parameterization (CRCP) sub-grid scale physics scheme acquired through a collaboration with the Cloud Dynamics Group in the MMM division at NCAR.

New Semi-Implicit Implementation

In 2004, CSS completed re-implementing a semi-implicit time step for the spectral element primitive equations. As before, the 3D governing primitive equations were specified in curvilinear coordinates on the cubed sphere combined with a hybrid pressure vertical coordinate. The new non-staggered formulation eliminates the interpolation for nonlinear terms that caused problems for the staggered semi-implicit formulation during year seven of our research. The new dry dynamical core, based on a non-staggered weak formulation, has been validated using the standard 1,200-day Held-Suarez test problem.

The semi-implicit solver of this model is based on vertical eigenmode decomposition and an iterative conjugate-gradient elliptic solver. In tests, the performance of the solver has been greatly improved using a simple preconditioner proportional to the determinant of the metric tensor. The vertical eigenmode with the largest velocity is the last to converge and effectively controls the rate of integration. To be useful, the longer time-step allowed by the semi-implicit method must overcome the additional cost of the Helmholtz solver. Preliminary tests indicate that the semi-implicit integration rate is at least three times faster than the explicit spectral element dynamical core on a single processor. Scalability tests of the new formulation are planned for later in 2004.
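
The role of a simple diagonal preconditioner in a conjugate-gradient Helmholtz solve can be illustrated with a minimal sketch. This is not the HOMME solver: the 1D model operator, the function names (`pcg`, `make_helmholtz`), and the use of a spatially varying weight as a stand-in for the determinant of the metric tensor are all assumptions made for illustration.

```python
def pcg(apply_a, b, diag, tol=1e-10, max_iter=500):
    """Solve A x = b for a symmetric positive definite A using conjugate
    gradients with a diagonal preconditioner M (so M^-1 r = r / diag).

    apply_a: function returning the product A v for a list v (hypothetical API)
    Returns (x, iterations used).
    """
    x = [0.0] * len(b)
    r = list(b)                                   # residual b - A x for x = 0
    z = [ri / di for ri, di in zip(r, diag)]      # preconditioned residual
    p = list(z)
    rz = sum(ri * zi for ri, zi in zip(r, z))
    b_norm = sum(bi * bi for bi in b) ** 0.5 or 1.0
    for it in range(1, max_iter + 1):
        ap = apply_a(p)
        alpha = rz / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        if sum(ri * ri for ri in r) ** 0.5 <= tol * b_norm:
            return x, it
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = sum(ri * zi for ri, zi in zip(r, z))
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


def make_helmholtz(weights, alpha):
    """Model operator A = diag(weights) + alpha * L, where L is the 1D
    second-difference matrix with zero (Dirichlet) boundary values."""
    n = len(weights)

    def apply_a(v):
        out = []
        for i in range(n):
            left = v[i - 1] if i > 0 else 0.0
            right = v[i + 1] if i < n - 1 else 0.0
            out.append(weights[i] * v[i] + alpha * (2.0 * v[i] - left - right))
        return out

    return apply_a
```

Passing `diag = [w + 2 * alpha for w in weights]` applies the preconditioner, while `diag = [1.0] * n` recovers plain conjugate gradients, so iteration counts for the two can be compared on the same right-hand side.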

Adaptive Mesh Refinement of Non-conforming Spectral Elements

Year 2's success rests on the outstanding work of Amik St-Cyr and John Dennis, two very promising young scientists at NCAR. In the past year they successfully developed and implemented a multi-level AMR version of the section's spectral element dynamical core. This is a fully parallel code based on the geometrically non-conforming SEM of Kruse and Fischer combined with a novel tree management strategy for AMR on the cubed-sphere called HAMR (HOMME AMR). Time-stepping restrictions caused by refinement are partially alleviated by employing the novel nonlinear operator integration factor splitting (OIFS) scheme of Thomas and St-Cyr. (As an added benefit, the resulting 3D equations are well-posed under AMR, as OIFS does not require local time-stepping.) Our refinement/de-refinement technology is based on the error estimator work for spectral element methods of C. Mavriplis. Though validation testing is not complete, the current release of the code has been validated on several of the shallow water test cases of Williamson, in particular test case 5. Other highlights of year 2 include numerous journal publications, conference presentations, and the involvement of several CU graduate students in the project.

Test output

Figure 2: Adaptively refined non-conformal spectral elements tracking a cosine bell test shape in shallow water equations.

After investigating the currently available packages to support AMR, the decision was made in September 2003 to build our own package to support AMR on the cubed-sphere. Using the static non-conforming code developed in year 1 as a guide, an entirely new AMR implementation was developed for HOMME. The HOMME AMR implementation (HAMR) is based on the TFS communication library of Tufo. TFS is a scalable direct stiffness summation package with low setup cost. It uses unique global IDs to pair shared degrees of freedom in a distributed environment. HAMR is designed around the concept of a distributed graph, while a lightweight bit-shifting tree algorithm is used to maintain inheritance properties among the spectral elements. The topology of the cubed-sphere necessitated that a minimum of six separate trees, one tree for each face of the cube, be maintained along with the connectivity information between each tree. Because of the unavoidable need for graph management, it was decided that all spectral element connectivity information be maintained in graph form (versus tree form). This decision allows for an arbitrary selection of the underlying base grid. The distributed graph is updated each time a spectral element is refined or coarsened. Local graph query functions are used to set the proper global degrees of freedom for the TFS library. HAMR has been demonstrated in parallel to support both refinement and coarsening for multiple levels of refinement and achieves load-balancing via element migration.
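
A bit-shifting tree of the kind mentioned above can be sketched with integer keys. The encoding below (a leading sentinel bit followed by two bits per refinement level) is a common quadtree convention and an assumption here, not a description of HAMR's actual data layout; all function names are hypothetical.

```python
ROOT = 1  # sentinel leading bit; the root element of one cube face


def child_key(parent, quadrant):
    """Key of child `quadrant` (0..3): shift the parent key left by two
    bits and store the quadrant in the freed low bits."""
    if not 0 <= quadrant <= 3:
        raise ValueError("quadrant must be 0, 1, 2 or 3")
    return (parent << 2) | quadrant


def parent_key(key):
    """Key of the parent: drop the two low bits."""
    if key <= ROOT:
        raise ValueError("the root has no parent")
    return key >> 2


def level(key):
    """Refinement level: number of two-bit groups after the sentinel bit."""
    return (key.bit_length() - 1) // 2


def is_ancestor(a, b):
    """True if element `a` is a strict ancestor of element `b`."""
    shift = 2 * (level(b) - level(a))
    return shift > 0 and (b >> shift) == a
```

With one such tree per cube face, an element could be identified by a `(face, key)` pair, with `face` in 0..5. Inheritance queries then reduce to shifts and comparisons, while neighbor connectivity would still live in a separate graph structure, as the text describes.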

In 2003, St-Cyr and Thomas developed a novel time-stepping scheme to ameliorate the time-step restrictions encountered under AMR and to maintain well-posedness of the 3D equation set. Merging the OIFS time-stepping required major revisions to the Krylov solvers, as generation of the preconditioning matrices on the fly is nontrivial. In addition, the HOMME implementation was generalized to remove unnecessary edge rotations. In the earlier version, a special treatment of vector quantities was necessary on the edges of the cubed sphere. This change was necessary to use the TFS library for the direct stiffness summations. The inter-element trace matching is generalized, and the masks necessary to eliminate doubled corner contributions are generated automatically.
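
Direct stiffness summation keyed on global IDs, together with an automatically generated mask for shared nodes, can be illustrated serially. This sketch does not use or reproduce the TFS library's interface. The function names are hypothetical, and the inverse-multiplicity mask is one common way to avoid double counting, assumed here for illustration.

```python
from collections import defaultdict


def direct_stiffness_sum(global_ids, values):
    """Sum all contributions that share a global ID and write the sum back
    to every local copy. Both arguments are lists (one per element) of
    lists (one entry per local node)."""
    totals = defaultdict(float)
    for ids, vals in zip(global_ids, values):
        for gid, v in zip(ids, vals):
            totals[gid] += v
    return [[totals[gid] for gid in ids] for ids in global_ids]


def multiplicity_mask(global_ids):
    """Weights 1/count per local node, so that a node shared by several
    elements contributes exactly once to a global sum."""
    counts = defaultdict(int)
    for ids in global_ids:
        for gid in ids:
            counts[gid] += 1
    return [[1.0 / counts[gid] for gid in ids] for ids in global_ids]
```

For two 1D elements sharing one endpoint, `global_ids = [[0, 1, 2], [2, 3, 4]]`, the shared node with ID 2 receives the sum of both local values, and the mask weights its two copies by one half each.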

As stated earlier, the OIFS time-stepping approach needs more aggressive preconditioning techniques. Martin J. Gander is collaborating with the team to determine whether an optimized Schwarz preconditioner can be used in the P_N - P_N (non-staggered) version of HOMME. Recent results obtained by Gander and St-Cyr include a proof that changing the preconditioning matrices in the Dryja-Widlund form of the additive Schwarz procedure leads to the optimized iterates. This result will help the community in accepting these novel preconditioning techniques.

Integration of Spectral Element Dynamics with CAM Physics

In 2004, CSS began integrating HOMME explicit dynamics with CAM physics from version cam_2_0_2_dev69. The API between the dynamics and the CAM physics and the necessary CAM program management units were identified. Inconsistencies and incompatibilities with respect to the grid structures were identified and resolved. Most issues related to the initialization of an "Aqua Planet" [Hyashi86] experiment have also been resolved.

Integration of Cloud Resolving Convective Parameterization (CRCP) with CAM Physics

In FY2004, work began on interfacing a Cloud-Resolving Convection Parameterization (CRCP; a.k.a. super parameterization; Grabowski and Smolarkiewicz 1999; Grabowski 2001, 2003) with the HOMME dynamics. CRCP is a novel technique for representing clouds in atmospheric models. The idea is to embed a 2D cloud-resolving model in each column of a large-scale model to represent small-scale and mesoscale processes. Khairoutdinov and Randall (2001) have tested this approach in the Community Climate System Model (CCSM). A stretched vertical coordinate has recently been implemented in the CRCP code, facilitating direct coupling to a pressure vertical coordinate.

Conservative Advection using Discontinuous Galerkin Method

The Discontinuous Galerkin (DG) Method is a hybrid of finite-element and finite-volume methods, and it provides a class of high-order accurate conservative algorithms for solving nonlinear hyperbolic systems. This method is known for being highly parallelizable and for being able to capture discontinuities in the exact solution without producing spurious oscillations.

In FY2004, a DG conservative transport scheme was developed on the cubed-sphere (Nair 2004). This scheme has been further extended to a nonlinear flux-form shallow water (SW) model in curvilinear coordinates on the cubed-sphere. The spatial discretization employs a modal basis set consisting of Legendre polynomials. Fluxes along the element boundaries (internal interfaces) are approximated by a Lax-Friedrichs scheme. A third-order total variation diminishing Runge-Kutta scheme is applied for time integration, without any filter or limiter. The model has been evaluated using the standard SW test suite proposed by Williamson et al. (1992). The DG scheme shows exponential convergence for shallow water test case 2 (steady-state geostrophic flow). The DG solutions to the shallow water test cases are comparable to those of a standard spectral-element model. Even with high-order spatial discretization, the solutions do not exhibit spurious oscillations for the flow over a mountain test case (test case 5). In contrast, a spectral-element model or a global spectral model produces spurious oscillations for this particular test.

The model conserves mass to machine precision. Although the scheme does not formally conserve global invariants such as total energy and potential enstrophy, these quantities are better preserved than in existing finite-volume models. Currently, the DG transport scheme is being implemented in the NCAR/SCD High-Order Multiscale Modeling Environment (HOMME).
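
The two building blocks named above, a Lax-Friedrichs interface flux and a third-order TVD Runge-Kutta step, can be sketched in one dimension. This is a deliberately reduced setting: it uses the lowest-order (piecewise-constant) DG discretization on a periodic 1D grid rather than the Legendre modal basis on the cubed-sphere, and all function names are hypothetical.

```python
def lax_friedrichs_flux(u_left, u_right, flux, max_speed):
    """Lax-Friedrichs numerical flux: central average of the physical flux
    plus dissipation proportional to the jump across the interface."""
    return (0.5 * (flux(u_left) + flux(u_right))
            - 0.5 * max_speed * (u_right - u_left))


def rhs(u, dx, flux, max_speed):
    """Flux-form tendency on a periodic grid. face[i] is the numerical flux
    through the interface between cell i-1 and cell i."""
    n = len(u)
    face = [lax_friedrichs_flux(u[i - 1], u[i], flux, max_speed)
            for i in range(n)]
    return [-(face[(i + 1) % n] - face[i]) / dx for i in range(n)]


def tvd_rk3_step(u, dt, dx, flux, max_speed):
    """One step of the third-order TVD (Shu-Osher) Runge-Kutta scheme."""
    l0 = rhs(u, dx, flux, max_speed)
    u1 = [a + dt * b for a, b in zip(u, l0)]
    l1 = rhs(u1, dx, flux, max_speed)
    u2 = [0.75 * a + 0.25 * (b + dt * c) for a, b, c in zip(u, u1, l1)]
    l2 = rhs(u2, dx, flux, max_speed)
    return [a / 3.0 + (2.0 / 3.0) * (b + dt * c)
            for a, b, c in zip(u, u2, l2)]
```

Because each interface flux is added to one cell and subtracted from its neighbor, the sum of `u` changes only by rounding error from step to step, which is the discrete analogue of the mass conservation noted above.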

Radial Basis Functions (RBFs)

CSS has been focusing on two areas of research within RBFs. The first is examining the interpolation properties of oscillatory Bessel RBFs, an entirely new group of RBFs with interesting properties. For example, it has been shown very recently that pseudospectral (PS) approximations are just a subclass of RBF approximations in the flat basis function limit, i.e., as the parameter that controls the shape of the RBF goes to zero. Not only do oscillatory Bessel RBFs possess unconditional nonsingularity of the interpolation matrix for any scattered node distribution, but they are the only class of RBFs immune to divergence of the interpolant in the limit that the shape parameter goes to zero. To further explore the relationship between PS and RBF approximations, it is important to understand the accuracy of oscillatory Bessel RBF interpolation in multiple dimensions. Dr. Natasha Flyer in CSS has proven in one-dimensional space that an oscillatory Bessel RBF expansion on an infinite lattice will exactly reproduce a polynomial of any order. She has gone on to extend the proof to arbitrary n-dimensional space. This is a great leap forward in RBF theory, as this is the only class of RBFs known to possess this property. Dr. Flyer is working with Dr. Elisabeth Larsson of Uppsala University to extend this result to scattered node locations rather than a lattice.

The second RBF research area is to develop a theory applicable to spherical geometries. The importance of this research is to develop a new grid-free approach using RBFs to solve time-dependent PDEs in spherical domains. Such an approach is singularity free (due to its independence of any surface-based coordinate system), spectrally accurate for arbitrary node locations on the sphere, and naturally permits local mesh refinement. No other discretization method currently in use in spherical geometries can attest to these properties.
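
The coordinate-free character of RBF interpolation on a sphere can be seen in a small sketch. It swaps in a Gaussian RBF for simplicity (not the oscillatory Bessel RBFs discussed above), uses a golden-spiral node layout as an arbitrary choice of scattered nodes, and solves the interpolation system with plain Gaussian elimination. All function names and the shape parameter value are assumptions for illustration.

```python
import math


def sphere_nodes(n):
    """Roughly uniform scattered nodes on the unit sphere (golden spiral)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    nodes = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n
        r = math.sqrt(1.0 - z * z)
        nodes.append((r * math.cos(i * golden_angle),
                      r * math.sin(i * golden_angle), z))
    return nodes


def gaussian_rbf(p, q, eps):
    """phi(r) = exp(-(eps r)^2), with r the straight-line distance between
    p and q. No latitude or longitude appears, so the poles are not special."""
    r2 = sum((a - b) ** 2 for a, b in zip(p, q))
    return math.exp(-eps * eps * r2)


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(col + 1, n):
            factor = m[k][col] / m[col][col]
            for c in range(col, n + 1):
                m[k][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def rbf_fit(nodes, values, eps):
    """Expansion coefficients that make the interpolant match `values`."""
    a = [[gaussian_rbf(p, q, eps) for q in nodes] for p in nodes]
    return solve(a, values)


def rbf_eval(point, nodes, coeffs, eps):
    """Evaluate the RBF interpolant at an arbitrary point."""
    return sum(c * gaussian_rbf(point, q, eps) for c, q in zip(coeffs, nodes))
```

The shape parameter `eps` controls how flat the basis functions are. Small values approach the flat limit discussed above and make this direct solve increasingly ill-conditioned, so the sketch is only suitable for moderate `eps`.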

Modeling Solar Coronal Mass Ejections

The CSS collaboration with HAO studying coronal mass ejections involves three related efforts. The first project is to extend recent results related to magnetic-field confinement in the solar corona. The second project, with Mei Zhang of the Chinese Astronomical Observatory, is to show that there is an upper bound on the amount by which the total magnetic energy in the force-free field for a dipole field configuration can exceed the Aly limit, which is defined by the amount of energy needed to completely open the solar magnetic field (i.e., have one end of a line of force anchored to the sun and the other running out to infinity). Dr. Flyer in CSS has been able to show numerically that not only does such a bound exist, but that it is 8.33%. This number has been guessed by some physicists in the field, but never before verified either numerically or analytically. The last project is to solve the hydromagnetic equations describing magnetic fields in realistic three-dimensional geometry, both in the force-free state and in force balance with plasma pressure and gravity. The general 3-D case is far more demanding computationally, featuring four coupled PDEs in three space dimensions, and is the subject of a recent CSS proposal submitted to NASA. This will be a collaborative effort with HAO, CU-Boulder, and the University of Wisconsin-Madison.

Shallow Water Flows Develop Singularities on the Sphere

In 2004, research demonstrating that certain shallow water test cases on a non-rotating sphere develop singularities was completed, and a paper on this topic has been accepted for publication [Swarztrauber 2004].

Development Activities

CSS development activities are aimed at providing modeling frameworks and mathematical libraries that support the research community's efforts to create portable and efficient models and scalable and efficient post-processing tools.

Earth System Modeling Framework

The Earth System Modeling Framework (ESMF) is building software infrastructure for climate, weather, and data assimilation applications. Collaborators include NCAR SCD, CGD, and MMM, NOAA GFDL, NOAA NCEP, MIT, the University of Michigan, DOE ANL, DOE LANL, and NASA/GSFC GMAO. The project is organized around a series of 11 milestones, the first five of which were submitted during FY2002 and FY2003.

The sixth and seventh ESMF milestones, submitted during FY2004, marked the public release of Version 2.0 of the ESMF software, the demonstration of three interoperability experiments using the framework, and the Third ESMF Community Meeting held at NCAR in Boulder, Colorado on July 15, 2004. The day-long Community meeting included a discussion of features in the release, a brief tutorial on adopting ESMF, and presentations describing how ESMF has been used to create applications from existing components developed at GFDL, MIT, NCEP, NASA and NCAR. ESMF Version 2.0 code and documentation can be downloaded from the ESMF website, http://www.esmf.ucar.edu/

The ESMF Partners and active collaborators list expanded to include groups at the DOD Naval Research Laboratory and the DOD Air Force Weather Agency, as well as existing partners at the Goddard Institute for Space Studies, UCLA, the Center for Ocean-Land-Atmosphere Studies, and the NASA GSFC Land Information Systems project. ESMF continues to coordinate with the European Programme for Integrated Earth System Modeling (PRISM) and the DOE Common Component Architecture (CCA) projects.

Spectral Toolkit Development

Work on developing high-performance, portable, highly efficient, open source numerical libraries for use by the mathematical and geosciences communities made steady progress in the first part of FY2004, but this effort slowed later in the year because staff hours in this area were cut back to support CRCP physics integration activities.

In particular, development of the Spectral Toolkit library continued with the completion of the multithreaded spherical harmonic transform. Also developed were new distributed memory (MPI-based) 2D and 3D FFTs, using a generic pairwise transpose algorithm developed for the library. To date, completed components of the Spectral Toolkit include:

  • General mixed radix real and complex FFTs that are highly portable and among the fastest available.
  • Complex number class and operators that enable very efficient native complex arithmetic comparable to Fortran.
  • Multithreaded 2D and 3D real and complex FFTs scalable on both dual processor workstations and large enterprise-class servers.
  • Distributed 2D and 3D real and complex FFTs that enable scalable transforms on clusters using unique generic pair-wise transpose.
  • Multithreaded spherical harmonic transform that uses recurrence relations for spherical harmonics for small memory footprint, along with cache-blocked matrix multiplication for efficient multiple-instance transforms.
  • Associated Legendre functions: accurate evaluation at arbitrary points for generating spherical harmonic or spectral element reference points.
  • Gaussian quadrature: very accurate Gauss-Legendre quadrature that provides high-resolution grids without requiring quad precision; also includes Gauss-Lobatto points and weights for spectral element grids.
  • Multidimensional generic array allocation that provides very efficient multidimensional array support comparable to Fortran, including triangular arrays for storing spherical harmonic coefficients.
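
As an illustration of the Gaussian quadrature component listed above, Gauss-Legendre points and weights can be computed with Newton's method on the Legendre three-term recurrence. This is a textbook sketch in Python, not the Spectral Toolkit's implementation or interface; the function name and tolerances are assumptions.

```python
import math


def gauss_legendre(n, tol=1e-14, max_iter=100):
    """Nodes and weights of the n-point Gauss-Legendre rule on [-1, 1].
    Nodes are returned in decreasing order."""
    nodes, weights = [], []
    for k in range(1, n + 1):
        # Standard asymptotic initial guess for the k-th root of P_n.
        x = math.cos(math.pi * (k - 0.25) / (n + 0.5))
        for _ in range(max_iter):
            # Recurrence: (j + 1) P_{j+1} = (2j + 1) x P_j - j P_{j-1}
            p_prev, p = 1.0, x
            for j in range(1, n):
                p_prev, p = p, ((2 * j + 1) * x * p - j * p_prev) / (j + 1)
            # Derivative: P_n'(x) = n (x P_n - P_{n-1}) / (x^2 - 1)
            dp = n * (x * p - p_prev) / (x * x - 1.0)
            step = p / dp
            x -= step
            if abs(step) < tol:
                break
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights
```

An n-point rule integrates polynomials up to degree 2n - 1 exactly (to rounding), so `gauss_legendre(3)` already integrates x^4 over [-1, 1] correctly.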

Development Activities

Work continued on the collaborative Earth System Modeling Framework (ESMF). A much anticipated release of ESMF software Version 2.0 occurred in July 2004. The ESMF Version 2.0 release includes software for representing and manipulating components, states, fields, grids, and arrays, as well as a number of utilities such as time management, configuration, and logging. It runs on a wide variety of computing platforms, including SGI, IBM, Compaq, and Linux variants.

The Grid-BGC project completed a top-level user interface design, selected a GIS technology for handling maps and geographical information, began implementing Globus protocols, made implementation decisions regarding the software framework, completed "look-and-feel" designs for static and dynamic visualization tools, and performed data transfer and computational capacity testing on existing parallel hardware.

The Earth System Grid (ESG) moved into production mode for climate model research data, with dedicated service for the IPCC in the area of coupled climate model data.

SCD completed work on the Web100 and NET100 projects.

Computing Center Operations and Infrastructure

Applications

Released new versions of the MySCD portal, which for the first time provides customizable GAU charging information directly to users. SKIL was upgraded to include modify/delete functionality. Two collaborative projects are ongoing with the University of Colorado: the METIS event-based workflow system evaluation is nearly complete, and a group of students is modernizing the room reservation system.

MySCD Portal version 2.0 Release

The new version of the MySCD portal was released. This particular release was delayed as a result of security changes. These security changes required that the Portal system be retooled to support one-time passwords. The September release adds significant functionality that for the first time allows users of SCD's computational resources to get a summary view of their allocation usage including total percentage charges for computational and mass storage usage.

GAU usage graphs

Additional accomplishments for applications:

  • Remedy 5.1.12 upgrade
  • SKIL release 2.0: addition of modify/delete functionality
  • Metis Workflow System: completed analysis and evaluation of the final Metis system in the NCAR environment
  • Room Reservation System: a senior capstone project with the University to develop a next-generation reservation system

Computer Room Infrastructure

SCD is developing both short- and long-term plans to meet the demands of future computing systems. Multiple options are being developed, including building a second data center for expansion; this work will continue into FY2005. The Mesa Lab standby generators were commissioned and put into service, and some field modifications will be made to simplify their operation. The computer room has reached its maximum cooling capacity; FY2004 focused on a design and procurement process to upgrade the chilled water systems.

The standby power generation system was installed and commissioned in March 2004. The shakedown and familiarization with the systems continued well into the summer with some modifications to the control sequence as the result of lessons learned.

standby power generator

Additional accomplishments for infrastructure:

  • Chilled water expansion project: design was completed and the construction phase will begin in early FY2005
  • Data center expansion investigation: an extensive investigation is in progress to best meet the infrastructure needs of the organization that are being driven by the scientific demand for simulation capabilities. Options being investigated include:
    • Lease
    • Build
    • Collocate
    • Upgrade air handling units
    • Install Linux cluster

Computer Room Operations

Supported, set up, and distributed CryptoCards as part of a new responsibility associated with strengthened security requirements. Media conversions continue with the move to 200-GB media. Rotating schedules have been a success with much more exposure of all operators to the rest of SCD staff.

Operations stepped into a new support role that came about as part of new security requirements. During the March timeframe, the implementation of one-time passwords resulted in a new need to distribute and support CryptoCards.

Enterprise Services

Network and system security dominated the year. Several significant changes, including the introduction of one-time passwords, were completed very quickly to secure the supercomputing assets. Storage area networks were investigated along with several significant upgrades to production systems for data provisioning and web access.

During March a new security perimeter was quickly established, and the Distributed Systems Group (DSG) designed, implemented, and rolled out a one-time password solution to protect supercomputing assets from intrusion attempts. These efforts were instrumental in returning the supercomputer systems to the network for community use in a three-week period. Since then, a number of intrusion attempts have been successfully turned away.

In addition to the security work, a Storage Area Network (SAN) testbed was put together. A cost-effective solution is under investigation that will utilize Serial ATA technologies and a SAN solution that works in a heterogeneous environment. At this early point, Linux and Sun systems have been successfully integrated into the SAN.

Additional accomplishments for enterprise services:

  • Server upgrades and additions:
    • Web cluster
    • DSS (huron)
    • Database server
  • Windows desktops:
    • AD migration to CIT AD or SCDAD
    • Automated virus updates and patches
    • Veritas for backup solution

Network Engineering and Telecommunications

The Network Engineering and Telecommunications Section (NETS) is responsible for all engineering, installation, operation, maintenance, strategy, planning, and research regarding the state-of-the-art data networking and telecommunications facilities for NCAR/UCAR. Support of these facilities requires NETS staff to:

  • Troubleshoot network hardware and software
  • Update network configurations
  • Monitor network performance, load, and errors
  • Expand networks to meet increased numbers of users, increased bandwidth requirements, and new standards
  • Design, engineer, and construct new cabling plant and wireless networking systems
  • Support over 167 logical networks, approximately 200 monitored network devices, and over 4,400 network-attached devices at NCAR/UCAR
  • Support Remote Access System (RAS)
  • Design, engineer, and maintain UPS infrastructure
  • Provide network-based security
  • Support telecommunications and voicemail
  • Provide telephone operator and directory services
  • Support all SCD internal networking needs
  • Maintain network documentation and databases
  • Track and coordinate all network project activities via a work-request/work-tracking system
  • Manage networking equipment inventory
  • Develop and manage network budget
  • Manage and engineer the Front Range GigaPOP (FRGP) and the Boulder Point of Presence (BPOP)
  • Help manage the Boulder Research and Administration network (BRAN)
  • Lead, participate in, and support the networking portion of advanced projects in high-performance network research and development that are a product of outside or interagency funding, including international networking project support, Web100, NPAD, Cisco Large Packet Grant, Net100, National LambdaRail (NLR), and the Quilt
  • Evaluate new networking technologies and equipment
  • Consult with network users about configuration and performance issues with network applications, network hosts, and network connections
  • Provide high-performance networking testbeds
  • Provide high-performance networking measurement tools
  • Tune network performance

In summary, NETS provides a vital service to the atmospheric research communities by linking scientists and supercomputing resources (including mass storage systems and other data processing resources) at NCAR to other resources and scientists throughout the university research community. Such high-performance networking activities are essential to the effective use of NCAR/UCAR's scientific resources, and they foster the overall advancement of scientific inquiry.

The primary NETS accomplishments in 2004 include:

  • Participating in the National LambdaRail (NLR) Project
  • FL0 and CG1 design
  • Dark Fiber Expansion
  • Participating in The Quilt Project
  • CyRDAS Report

These projects along with the rest of the NETS 2004 accomplishments are described in this Annual Scientific Report.

Networking research projects and technology tracking

Networking research projects
NETS is a principal collaborator in several nationally recognized research and development networking and data communications technology projects. NETS hosts and presents at national and regional meetings on a variety of networking projects. NETS was awarded an STI award for the Network Path and Applications Diagnosis (NPAD) project proposals in collaboration with the Pittsburgh Supercomputing Center (PSC), and NETS contributed to the NIH BRIN Lariat project. NETS assisted in the submittal of the NSF Chronopolis proposal and NSF IRNC proposal.

Steering Committee for Cyberinfrastructure Research and Development in the Atmospheric Sciences (CyRDAS)
Marla Meehl served on NSF's Steering Committee (SC) for Cyberinfrastructure Research and Development in the Atmospheric Sciences (CyRDAS). This committee was responsible for assessing the opportunities for advances in atmospheric science research that are made possible by current or anticipated advances in information technology and computer science, the opportunities for advances in science that might result from collaborative research between atmospheric scientists and computer scientists, ways in which cyberinfrastructure can contribute to formal and informal education in atmospheric science, and the cyberinfrastructure needs of the NSF-funded atmospheric science research community. The committee also made recommendations on strategies for NSF that will help the academic research community exploit the opportunities identified in the assessments above. Finally, the CyRDAS Steering Committee developed an implementation plan for a distributed cyberinfrastructure that will meet the needs of the academic atmospheric science research community and that includes the flexibility to grow smoothly as that research advances and CI needs grow.

The Hybrid Optical and Packet Infrastructure Project (HOPI)
When Internet2 was first organized in October 1996, one defining mission was to provide scalable, sustainable, high-performance networking in support of the research universities of the United States. The resulting infrastructure, composed of campus, regional, and national components, is a successful and robust packet switched network. In the next few years, however, it must evolve to take advantage of new infrastructures, and the HOPI testbed will examine future network architectures. In planning the Internet2 networking architecture needed beginning in 2006, HOPI is considering a hybrid of shared IP packet switching and dynamically provisioned optical lambdas. The term HOPI (for hybrid optical and packet infrastructure) is used to denote both the effort to plan this future hybrid and a testbed facility that will be built to test various aspects of candidate hybrid designs. A white paper describes a plan for the HOPI testbed facility. The goal of that facility is "to provide a facility for experimenting with future network infrastructures leading to the next generation Internet2 architecture." The eventual hybrid will require a rich set of wide-area lambdas together with switches capable of very high capacity and dynamic provisioning, all at the national backbone level.

Web100 project
The Web100 project is funded by the National Science Foundation (NSF) as a collaborative project developing end-host TCP performance measurement and enhancement tools, which help end-hosts automatically and transparently achieve high TCP data rates (>100 Mbps) over the high-performance research networks. Software and tools have been developed for the Linux operating system in an open manner so they can readily be ported to other operating systems. Such ports are currently in progress. The Web100 principal partners are the National Center for Atmospheric Research (NCAR), the Pittsburgh Supercomputing Center (PSC), and the National Center for Supercomputing Applications (NCSA).

The Web100 project has achieved significant progress on several of its key project milestones during the past year. Since the Web100 software was released to the general user community, numerous individuals and groups have incorporated it into a wide and diverse set of useful applications. In addition, the Web100 TCP Extensions MIB is on the Internet Engineering Task Force (IETF) standards track and is expected to be submitted for last call by the end of this calendar year. Most importantly, the Web100 software is currently being officially incorporated into the Linux 2.6 kernel development release for use as library functions in all Linux distributions. Microsoft has reported incorporating much of the Web100 software functionality into the next release of their .NET server software, and a BSD port of the Web100 software is also underway. This project was completed in August 2004.

Net100 project
The Net100 project is creating software extensions that allow computer operating systems to dynamically adjust to the available network bandwidth for large data flows. Net100 has just completed the final year of a three-year grant from the Mathematical, Information, and Computational Sciences (MICS) program in the Office of Science at the U.S. Department of Energy (DOE). The project is a collaboration between the Pittsburgh Supercomputing Center (PSC), the National Center for Atmospheric Research (NCAR), Lawrence Berkeley National Laboratory (LBNL), and Oak Ridge National Laboratory (ORNL). During this past year considerable additions have been made to the wide-area daemon (WAD) extension algorithms. The WAD code has also been extensively used and tested by several groups of external users, providing useful feedback to the Net100 team. The Net100 project has also implemented algorithms previously only run in simulations, providing useful theoretical feedback to the algorithm's author for additional modifications along its Internet Engineering Task Force (IETF) standardization process.

Network Path and Applications Diagnosis (NPAD)
A key missing piece of the end-to-end performance puzzle is that the current set of diagnostic strategies does not adequately account for the effects of path delay. The project team is developing extensions to existing diagnostic tools that take path delay into consideration and compensate for a variety of delay times, and is testing these new diagnostic tools with network users and operators, using actual high-performance applications. Recent insights gained from the Web100 and Net100 projects show that the symptoms of most application and network defects scale with increasing path delay. For example, a minor defect in a campus LAN might have a negligible effect on an application running on a 1-ms path across campus. However, that same defect has a much greater impact on performance when the application runs on a long path across the continent. In this project, we are developing new diagnostic techniques, which include a suite of diagnostic tools and strategies for their use: first test applications locally with a 100-ms virtual path, and then test each successive segment of the actual path, extended with a virtual path to a total path delay of 100 ms. Such testing is expected to expose otherwise hidden flaws and impediments that contribute to delay, since each component of the path and application can be tested in a context equivalent to an ideal end-to-end path, while ruling out other potential flaws.
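
The way a fixed defect grows with path delay can be illustrated with the widely used Mathis approximation for steady-state TCP throughput, rate <= C * MSS / (RTT * sqrt(p)). The sketch below is only an illustration of that scaling and is not the NPAD tooling; the function name, the 1460-byte segment size, and the 0.001% loss rate are assumptions chosen for the example.

```python
import math

def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate, c=math.sqrt(1.5)):
    """Approximate upper bound on steady-state TCP throughput, in bits/s.

    Mathis approximation: rate <= C * MSS / (RTT * sqrt(p)).
    C = sqrt(3/2) corresponds to periodic loss without delayed ACKs.
    """
    return c * mss_bytes * 8 / (rtt_s * math.sqrt(loss_rate))

# Hypothetical defect: a LAN segment that drops 1 packet in 100,000.
MSS_BYTES = 1460      # assumed segment size on a 1500-byte MTU path
LOSS_RATE = 1e-5      # assumed, for illustration only

for label, rtt_s in [("1-ms campus path", 0.001), ("100-ms long path", 0.100)]:
    bound = mathis_throughput_bps(MSS_BYTES, rtt_s, LOSS_RATE)
    print(f"{label}: throughput bound is about {bound / 1e6:,.0f} Mbps")
```

Under these assumed numbers, the bound on the 1-ms path is well above the speed of a Gigabit Ethernet link, so the defect goes unnoticed, while on the 100-ms path the same loss rate caps throughput at a few tens of Mbps. Testing locally against a 100-ms virtual path exposes the defect without needing the long path itself.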

Cisco University Research Program Funding - Investigating Large Maximum Transmission Units (MTU)
Over the last decade, we have witnessed a tremendous increase in raw network capacity. Today, we are seeing the ubiquitous deployment of cross-country 10 Gbps optical networks and early standards efforts in support of 40 Gbps and 100 Gbps networking technology. As a result, long fat pipes (elephants) [RFC1323] are no longer a rarity -- in fact they are becoming the norm. For example, NSF funded the 40 Gbps Distributed Terascale Facility (DTF) [RB02], UCAID has built a SONET-based 9.6 Gbps Abilene Network of the Future [Int02], and contractual negotiations have been completed on a dark fiber lambda network called the National LambdaRail. However, because of a number of underlying technical issues and limitations, it is not clear whether these network capacity increases will actually result in comparable increases in application performance. It is well known that bulk transport application performance has not kept up with network capacity increases. We believe that application performance has fallen by about two orders of magnitude relative to the raw performance of the underlying technologies. In this proposal, we intend to identify, investigate, and address technical issues specifically associated with the underlying network infrastructure and application performance for next-generation Internet networks.
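
Two back-of-the-envelope quantities help show why long fat pipes and MTU size matter: the bandwidth-delay product (the amount of data that must be in flight to keep the path full) and the packet rate that end systems must sustain at a given MTU. The following is a minimal sketch; the 70-ms round-trip time is an assumed cross-country figure, the 9000-byte value is the common jumbo-frame size rather than a figure from this proposal, and framing overhead is ignored.

```python
def bandwidth_delay_product_bytes(bandwidth_bps, rtt_s):
    """Bytes that must be in flight to keep a path of this speed and delay full."""
    return bandwidth_bps * rtt_s / 8

def packets_per_second(bandwidth_bps, mtu_bytes):
    """Packets per second needed to fill the link (framing overhead ignored)."""
    return bandwidth_bps / (mtu_bytes * 8)

LINK_BPS = 10e9   # 10 Gbps, as in the cross-country optical networks above
RTT_S = 0.070     # assumed cross-country round-trip time

bdp = bandwidth_delay_product_bytes(LINK_BPS, RTT_S)
print(f"Bandwidth-delay product: {bdp / 1e6:.1f} MB in flight")

for mtu in (1500, 9000):  # standard Ethernet MTU vs. a common jumbo-frame size
    pps = packets_per_second(LINK_BPS, mtu)
    print(f"MTU {mtu} bytes: about {pps:,.0f} packets per second")
```

With these assumptions, a single flow needs tens of megabytes of buffering to fill the pipe, and moving from a 1500-byte to a 9000-byte MTU cuts the per-packet work on hosts and routers by a factor of six.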

NOAA High Performance Computing and Communications (HPCC) Program
NOAA has asked NCAR to participate in its proposal to the NOAA High Performance Computing and Communications (HPCC) Program, with no funds proposed for NCAR. NOAA and NCAR are making excellent advances in supercomputing and planning for the development of enterprise network architecture. However, NOAA and NCAR currently do not have a distributed network test facility for evaluating new network technologies, other than in local environments, although the benefits of such a facility are numerous. NOAA and NCAR would be well served by developing a way to test and integrate new network technologies to gain vital experience for promoting NOAA and NCAR research and operations. We propose to develop an optical network testbed using Dense Wavelength Division Multiplexing (DWDM) by leveraging NOAA and NCAR's existing investment in a metropolitan fiber optic network. This network testbed will provide a platform for integrating DWDM technology and for determining how DWDM will best serve NOAA and NCAR research and operations. It will also address other important challenges. The optimization of data flows and server input/output at 10 Gbps will require study, testing, and tuning, and the question of how to secure data flows at this rate will also be addressed. The proposed DWDM network testbed will include a pair of optical network switches with Gigabit Ethernet (GbE), 10 GbE, and DWDM components to be procured and located at NOAA Boulder and at NCAR. An optical tap will also be part of the testbed to allow passive monitoring of all data flows. In addition, four 64-bit computer systems will be procured to transmit, receive, monitor, and secure the data flows. Two systems will be located at NOAA and two at NCAR. NCAR will cosponsor 5% of Pete Sakosky's time and provide a rack of computer room space for this project. This proposal was submitted and is pending approval.

Access Grid
NETS participated in the network configuration, testing, and optimization of the SCD Access Grid. NETS will continue to participate in the deployment of operational Access Grids.

Earth System Grid project
NETS provided network-engineering support to the DOE ESG II project.

Network technology tracking and transfer
NETS tracks networking technology via networking conferences, training classes, vendor meetings, beta tests, technology demonstration testing, email lists, networking journals, and user conferences, and by meeting and exchanging information with universities and other laboratories. NETS staff continued to utilize all of these technology-tracking avenues during FY2004. New technologies are integrated into the production UCAR networks on an ongoing basis after a technology has met our capacity, performance, connectivity, reliability, usability, and other maintainability requirements.

Local Area Network (LAN) projects

NETS supports both NCAR/UCAR network needs and the special networking needs of SCD itself. Therefore, all LAN projects are further subdivided as being either NCAR/UCAR LAN projects or SCD LAN projects.

NCAR/UCAR LAN projects

UCAR network infrastructure recabling projects
The common goal of all UCAR recabling projects is to provide each workspace with a standard set of dedicated data communications links. The overall plan calls for each workspace to be provisioned with a standard Telecommunications Outlet (TO) that connects with four Category 6 (CAT6) twisted-pair cables and two pairs of multi-mode optical fiber. Additionally, intra-building (trunk) cabling must be installed to concentrate all workspace cables to intermediate and central locations.

Concurrent with recabling, each network device is delivered 100 Mbps of dedicated bandwidth via a dedicated Ethernet packet-switch port. Such dedicated-port access offers substantial networking performance improvement over shared-media Ethernet access.

NETS designed permanent network infrastructure for the CG1 and FL0 buildings. NETS assisted in the extensive relocation and related cabling for FL4 staff. NETS recabled the FL2-FL3 interbuilding cabling due to FL0 construction.

NETS also participated in the following projects: CG bike path design, Jeffco hangar networking design and implementation, UNAVCO move and lease completion, and the Nextel cellular repeater design and implementation.

Network monitoring project

NETS continues to use HP OpenView, FlowScan, Prognosis, and Cricket as its principal tools for network monitoring and statistics gathering.

Additionally, NETS has installed certain specialized network monitors at the request of two national network-measuring organizations, namely MOAT and Internet2. NLANR's MOAT organization has placed an OC3MON monitor at NCAR's Mesa Lab and installed an OC12MON in the Front Range GigaPOP (FRGP) equipment racks located in CU Denver's computer room. MOAT has also placed an AMP monitor at both the Mesa Lab and the FRGP. On behalf of UCAID's Abilene network, Internet2 (ANS) has placed a Surveyor network monitor at both the Mesa Lab and the FRGP.

Local serial-access project
NETS supports several terminal servers for providing serial console access to various computer and networking equipment. Serial support is also provided for the very few serial terminals remaining at UCAR.

NETS CSAC support project
The NCAR/UCAR Computer Security Advisory Committee (CSAC) is chartered by the SCD Director to assess the state of computer and network security at NCAR/UCAR, and to make recommendations to assist NCAR and UCAR management in setting policies related to the security of computers and other devices attached to the NCAR/UCAR network. CSAC membership is composed of technical representatives located throughout the various NCAR/UCAR organizations.

NETS is involved with CSAC since nearly all security policies involve various types of network-connected devices located between the networks belonging to the external world and the UCAR networks that are being protected from the external world. These network-attached devices can operate as filters and/or authentication devices operating at one or more OSI (Open Systems Interconnection) layers, usually at the Network/Router Layer (Layer 3) and higher. Based on CSAC recommendations, NETS continues to implement significant new gateway router filters to improve network security for UCAR. Extensive testing and coordination throughout UCAR are required to implement the recommended security filters. NETS also cooperates on wireless, RAS, and VPN security measures.

VLAN Splitting Project/Layer 2/3 Design
NETS is in the process of re-engineering our backbone to provide higher reliability and redundancy to network-based services. Previously, NETS allowed single subnets and VLANs to span all campuses. This was a manageable design when NCAR was located on only two main campuses. The design offered increased convenience for users, who could request any subnet to be activated anywhere at the Mesa Lab or Foothills Lab campuses.

NETS was driven to reevaluate this design by the addition of the third major campus, Center Green, and by a desire for increased reliability for VoIP and business continuity. The old design did not include any redundant links, because such links would have created network loops that had to be dealt with using the spanning tree protocol. With the addition of the CG campus, however, NETS has built redundant links so the ML, FL, and CG campuses are all connected in a triangle. While the spanning tree protocol works adequately in a simple, loop-free network, it is quite poor at handling redundant links and the loops they form, since it simply blocks the redundant links rather than using them. To take full advantage of the entire mesh of links between the campuses, it is necessary to use a more intelligent protocol at a higher layer.

At the IP layer, NETS has been using OSPF for a number of years. It is fully capable of handling the current topology, not only detecting and routing around link failures in a matter of a few seconds, but properly routing traffic over the shortest path between campuses. Presently, NETS has the new backbone fully deployed and is in the process of restricting subnets to a single campus. Completion of the project's final stage is expected before the end of the year.
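
The behavior described above can be sketched with Dijkstra's shortest-path algorithm, which OSPF routers run over the link-state database. This is an illustrative model only, not the NETS router configuration: the equal link costs of 10 and the `shortest_path` helper are assumptions made for the example.

```python
import heapq

def shortest_path(links, src, dst):
    """Dijkstra's algorithm over undirected links given as {(a, b): cost}.

    Returns (total_cost, [nodes on path]), or None if dst is unreachable.
    """
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    queue = [(0, src, [src])]   # (cost so far, current node, path taken)
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, link_cost in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbor, path + [neighbor]))
    return None

# Triangle of campuses; the costs are hypothetical and equal on every link.
links = {("ML", "FL"): 10, ("FL", "CG"): 10, ("ML", "CG"): 10}
print("All links up:   ", shortest_path(links, "ML", "CG"))

# Simulate failure of the direct ML-CG link: traffic reroutes through FL.
links_after_failure = {k: v for k, v in links.items() if k != ("ML", "CG")}
print("ML-CG link down:", shortest_path(links_after_failure, "ML", "CG"))
```

With all three links up, every campus pair uses its direct link, so all sides of the triangle carry traffic. When one link is removed, recomputing the shortest paths sends traffic around the remaining two sides.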

Multicast support activities project
Multicasting is a technology in which a single outbound stream of data can be made to arrive at multiple destinations. The data stream is multiplied in a tree-wise fashion using both software and hardware to effect the multiplication. Multicasting technology is particularly useful for video conferencing and audio conferencing applications. NETS continues to support and enhance multicast services for UCAR.
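
The saving from tree-wise replication can be made concrete by counting how many copies of each packet cross network links. The sketch below uses a small hypothetical distribution tree (one source, one core router, two edge routers, four receivers); the topology, node names, and `link_transmissions` helper are assumptions for illustration and do not describe the UCAR network.

```python
def link_transmissions(parent, source, receivers):
    """Count per-packet link transmissions for unicast versus multicast.

    parent maps each node to its upstream neighbor on the path to the source.
    Unicast sends one copy per receiver over every link on that receiver's path;
    multicast sends one copy over each distinct link in the distribution tree.
    """
    unicast_copies = 0
    tree_links = set()
    for receiver in receivers:
        node = receiver
        while node != source:
            upstream = parent[node]
            unicast_copies += 1
            tree_links.add((upstream, node))
            node = upstream
    return unicast_copies, len(tree_links)

# Hypothetical tree: src -> core -> {edgeA, edgeB} -> two hosts each.
parent = {
    "core": "src",
    "edgeA": "core", "edgeB": "core",
    "h1": "edgeA", "h2": "edgeA",
    "h3": "edgeB", "h4": "edgeB",
}
receivers = ["h1", "h2", "h3", "h4"]

unicast, multicast = link_transmissions(parent, "src", receivers)
print(f"Unicast:   {unicast} link transmissions per packet")
print(f"Multicast: {multicast} link transmissions per packet")
```

In this example the source sends a single copy under multicast instead of one per receiver, and the routers duplicate the stream only where the tree branches.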

UPS project
NETS has continued installing UPS (Uninterruptible Power Supply) units into all new telecommunication closets so all networking equipment will receive short-term standby power in the event of any short-term power failure. UPS units also help filter out damaging power spikes. Upgrading, expanding and maintaining these devices is an ongoing process.

In addition, SCD installed a generator at the Mesa Lab, and NETS has tied its equipment into this generator in the computer room and in areas where safety and security support is critical. NETS is also in the process of tying its UPSs at FL into the facilities generator to provide additional business continuity.

Grounding
NETS is in the process of grounding all communications closets and NETS hardware to eliminate static issues causing hardware and phone failures. This is a technically difficult and time-consuming project.

Wireless
NETS supports wireless network access in all public areas and conference rooms, and is in the process of deploying wireless access in all office areas as well. NETS also designed, tested and installed a long-distance, high-speed, wireless link between CG2 and FL4 a