Work procedures for UCAR web systems

Non-Emergency Activities

User-visible activity - any activity that involves a change in service that directly affects users.

Down time - any time that a system is not capable of carrying out its normal, scheduled production workload, service, or function.

Scheduled work activity must adhere to the following guidelines:

  1. Notice of the work that has been performed should be posted on the webdoc site (or in the web forum, or on the web mailing list) as soon as is practical.
  2. If the activity involves changes to service daemons (such as httpd binary changes), a test reboot must be performed within a week after the change has taken place. The test reboot must be announced separately as a system down time.
  3. If the activity is user-visible, advance notice will be given to all web users at least 24 hours before the anticipated activation of the change. Use the groupmail script as described here. If the change is to the Real service, notification goes to the rmserv notify mailman list. If the change is to the Tomcat service, notification goes to the tomcat notify mailman list.
  4. If the activity involves down time, work will be performed during non-peak hours. Peak hours are 0600-1800 Mountain Time (MT), Monday through Friday (inclusive of the noon hour). The preferred time for down time is 1800-2300 MT on weekdays, or any time of day on weekends or holidays. Morning time may be used in cases where scheduling conflicts prevent use of the evening time.
  5. The work must be announced in the following manner:
  • If the activity is user-visible, an announcement should be sent to web users with the following information: the system(s) affected, the time of the anticipated activation of the change, and a description of the activity and its impact on users and system performance.
  • If the activity involves down time, a user downtime notice should be emailed to the appropriate mail alias and contain the following information: the system(s) affected by the downtime, the start and end date and time of the downtime, the staff involved, and the purpose of the downtime.
  • If the work involves a permanent change to a system, a Change Control notice should also be sent. Note that Change Control should not be used to announce routine reboots, diagnostic work, or things that do not involve a permanent change.
  • In case of a major system change, announcements must be sent to ncab@ucar.edu, nsag@ucar.edu, wag@ucar.edu, and published in "This week at UCAR" announcements.
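
The timing rules in guidelines 3 and 4 can be checked mechanically before work is scheduled. The following is a minimal sketch only, not an official UCAR tool. The function names, the holidays argument, and the use of naive datetimes already expressed in Mountain Time are all assumptions made for illustration. It also assumes that a holiday falling on a weekday counts as non-peak, and that 1800 itself is the first non-peak minute.

```python
from datetime import date, datetime, time, timedelta

# Values taken from the guidelines above.
PEAK_START = time(6, 0)             # 0600 MT
PEAK_END = time(18, 0)              # 1800 MT
PREFERRED_END = time(23, 0)         # 2300 MT
MIN_NOTICE = timedelta(hours=24)    # advance notice for user-visible changes


def is_day_off(when, holidays=frozenset()):
    """True on Saturday, Sunday, or a date listed in `holidays` (hypothetical input)."""
    return when.weekday() >= 5 or when.date() in holidays


def is_peak(when, holidays=frozenset()):
    """True if `when` (naive datetime, assumed to be Mountain Time) is in peak hours."""
    if is_day_off(when, holidays):
        return False  # assumption: weekday holidays are treated as non-peak
    return PEAK_START <= when.time() < PEAK_END


def is_preferred_window(when, holidays=frozenset()):
    """True if `when` is in the preferred down time window."""
    if is_day_off(when, holidays):
        return True  # any time of day on weekends or holidays
    return PEAK_END <= when.time() < PREFERRED_END


def notice_is_timely(notice_sent, activation):
    """True if the notice was sent at least 24 hours before activation."""
    return activation - notice_sent >= MIN_NOTICE
```

For example, is_peak(datetime(2024, 1, 3, 10, 0)) is true for a Wednesday morning, while a start time of 1900 on the same day falls in the preferred window. Early-morning weekday slots are non-peak but not preferred, which matches the guideline that morning time is a fallback.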

Emergency Activities

This section applies to any emergency work that requires down time on any system and has a direct, user-visible impact.

Emergency user-visible system down time must be reported as follows:

  1. Notice of any emergency down time should be given to users as soon after the incident causing the downtime as is practical.
  2. Emergency down time must be announced in the following manner:
  • The announcement should contain the following information: the system(s) affected by the downtime, the start and end date and time of the downtime, the staff involved, and the cause of, or reason for, the downtime and the remedial action(s) taken.
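
The required contents of an emergency announcement can be captured as a simple checklist. The sketch below is illustrative only: the EmergencyNotice class, its field names, and the plain-text layout it renders are hypothetical and are not part of any existing UCAR script. It only shows one way to make sure no required item is left out before a notice is sent.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class EmergencyNotice:
    """Hypothetical container for the items an emergency announcement must include."""
    systems: List[str]        # system(s) affected by the downtime
    start: datetime           # start date and time of the downtime
    end: datetime             # end date and time of the downtime
    staff: List[str]          # staff involved
    cause: str                # cause of, or reason for, the downtime
    remedial_actions: str     # remedial action(s) taken

    def missing_fields(self):
        """Return the names of required items that are still empty."""
        missing = []
        if not self.systems:
            missing.append("systems")
        if not self.staff:
            missing.append("staff")
        if not self.cause.strip():
            missing.append("cause")
        if not self.remedial_actions.strip():
            missing.append("remedial_actions")
        return missing

    def render(self):
        """Format the notice as plain text; refuse if anything required is missing."""
        missing = self.missing_fields()
        if missing:
            raise ValueError("notice is incomplete: " + ", ".join(missing))
        fmt = "%Y-%m-%d %H:%M MT"
        return "\n".join([
            "Systems affected: " + ", ".join(self.systems),
            "Start: " + self.start.strftime(fmt),
            "End: " + self.end.strftime(fmt),
            "Staff involved: " + ", ".join(self.staff),
            "Cause: " + self.cause,
            "Remedial action(s) taken: " + self.remedial_actions,
        ])
```

Calling render() on an incomplete notice raises ValueError, so a missing cause or remedial action is caught before anything is mailed.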