Enabling Researcher-Driven Innovation and Exploration
Announcement
09/30/2008 ITS has scheduled maintenance on the router for the Internet 2 connection for Oct. 10 from 6:00 to 6:30 am. This should have no impact on the cluster. However, you may notice
reduced bandwidth if you are attempting to communicate with an I2 site, as you will be routed over our regular internet connection.
09/26/2008 Update on storage array firmware issue: We have restored our storage arrays to full functionality by taking
advantage of a feature in GPFS which allowed us to move data off of
the affected storage arrays without requiring a downtime. We
rebooted the storage arrays, which then came up with both
controllers enabled and in their default fully redundant
configuration. We are currently using GPFS to restripe the data in
both the /home and /scratch filesystems back across all storage
arrays. Upon completion of the restripe (which is occurring in the
background and only when there is no other I/O to be done) we will
be back to the way things were prior to the discovery of the
firmware bug.
As previously announced on 09/04/2008: We have discovered a bug in the firmware
on the storage arrays we use for /home and /scratch on both the production and test
clusters. Currently this bug is only affecting one storage array in
each cluster.
This bug has disabled the primary controller on these two storage
arrays. Therefore, if the secondary controller in either of these
arrays were to fail, there would not be a redundant controller
available to take over for it, which would cause a temporary loss of
access to any data which happened to reside on the affected storage
array. Please note, the data itself would not be lost, but the
access to it would be lost until the issue with the storage array
could be resolved.
We are working with the vendor of our storage arrays to implement a
fix for this issue. At present we do not have an ETA on that fix.
More information will be posted to the web page as it becomes
available.