Historical status reports for 2006

09:40    AAPT seem to have stuffed up the BGP announcements, resulting
         in routes not being advertised. Fixed.

08:29    There seems to be a widespread connectivity issue.
         Seems to have started at 0300, most significantly affecting our DSL
         customers. Problem appears to be upstream of our upstream provider.
         Reports from other ISPs suggest a Connect or Optus problem. No ETR yet.

13:45    Due to fire and exploding gas cylinders in nearby buildings, the ALI
         offices have been evacuated. Due to the short notice, no phone diversion
         or other arrangements have been made for customer support.
06:30	 Intermittent but severe packet loss between Telstra Albury and
	 Melbourne routers. This doesn't seem to be affecting everything,
	 and that which is affected is affected in unpredictable ways.
	 Reported to Telstra, waiting on response.

05:00    Again, Glebe is off-air, resulting in loss of all DSL services.
	 Engineer has been sent to the site, updates as they come to hand.
	 Updated: 07:30. All equipment is powered down, both primary and
	 secondary Cisco routers are dead.
	 Updated: 11:00. Servers are running, Cisco routers are terminal.
	 Notwithstanding that it's a public holiday in NSW, a replacement big
	 router has been sourced and is en route to the datacentre now.
	 Anticipating restoration of services around 13:00.

10:00    Techs have gained access to the site and determined that the same
	 interface card on both the primary AND BACKUP routers has failed.
	 A temporary fix has been achieved, but replacement parts from Cisco
	 may not be available until early next week, so there may be some
	 small outage in order to bring new equipment on-line then.
09:00    Something has happened at the data-centre in Glebe, loss of all
	 connectivity for ADSL services. Engineers are at the site, but
	 their security cards are not letting them in, and the staff at
	 the security desk can't get in either. Waiting on an update.

00:23    One of the big APC UPSs has died, shutting down all the servers.
         UPS replaced with a hot spare, servers back up. Just bedding things
         in, making sure all services are running properly etc.
         (First time all the servers have been off for 4 years)
23:56    Major loss at our Albury office. Remote tests indicate most (but not
	        all) primary servers failed. Tech recalled to site.
04:25    At 04:25 (approx) there was a widespread ADSL outage. The cause is still
	        being investigated. Although many services re-established within a few
         minutes, a significant proportion required a power-cycle of the routers.

12:30	 Strange delays and packet loss from Albury to and beyond Melbourne.
121265350 Telstra advise "there are no outages, anywhere" and told me to reset
	 my ADSL modem, and seemed perplexed at what a "megalink" was. They have
	 taken details and promised a call back shortly.
	 Traceroute to www.abc.net.au:
	  2  albury-core.albury.NET.AU (  3.829 ms
	  3  Serial2-6.alb3.Albury.telstra.net (  9.529 ms
	  4  *
	  5  *
	  6  *
	  7  *
	 Before that:
	  2  albury-core.albury.NET.AU (  4.631 ms
	  3  Serial2-6.alb3.Albury.telstra.net (  16.222 ms
	  4  ATM6-0-0-4.lon-core2.Melbourne.telstra.net (  97.309 ms
	  5  TenGigabitEthernet8-2.lon55.Melbourne.telstra.net (  57.639 ms
	  6  *
	  7  *
	  8  *
	 Same results to various national and international sites.

09:25    Multilink group bundles were out of sync from end to end, and it seemed
	 they could not recover. Required a "reset" of the Albury Tigris to
	 get things back in sync. Seems to have only been the Corryong link
	 that was affected, although a small number of users who were dialed
	 into the Albury Tigris at the time will have needed to redial.

04:19    Loss of connectivity to both Corryong and Wangaratta sites.
	 Appears to be a Telstra ISDN problem affecting some links only.
	 Corryong has dialed back in after 11 seconds, but is not passing traffic.
	 Wangaratta came back on-line 08:05 after a call to the NAS prompted
	 it to bring up the link.
	 Still trying to resolve the issue with Corryong. Routing, call state
	 etc all look perfect from the Albury end.

07:44    ADSL Services all restored. The exact cause is still being isolated, but
	        it appears the border router (Cisco 7200) lost power to both supplies,
         which were plugged into separate rails on separate circuits over two
         phases... and for reasons still being investigated, when power returned
         the router had lost its entire config, which required additional people
         and equipment to attend the site to restore. More information as it is
17:41    Loss of all DSL connectivity affecting all our ADSL customers and
	 some office services on IP addresses serviced by our own ADSL link.
	 Due to a system error at AAPT wholesale, disconnection requests were
	 sent to Telstra during internal churn of all our services. Line codes
	 on customers' lines are being restored as quickly as possible.

04:28    Loss of radio connectivity from office to Springdale AP resulting in
	 all radio sites down. Tech called to site. All equipment operating
         correctly but not passing traffic. Interface administratively marked
         as down, then up, everything working again at 04:46
18:05    Connectivity to Albury seems to have been restored through all services
	 although Telstra are still advising it may be up to midnight before all
         services through the affected areas are back on-line. Damage to two
         different fibre-optic cables in two separate locations caused the fault.
11:03    A major fibre cable cut is affecting all our Albury services.
	 AAPT, Telstra, Comindico/Soul are all down, data and voice.
         No restoration time advised yet, although apparently both Albury
         and Griffith are affected (quite likely elsewhere too; parts of
         Wodonga are known to be down also).
14:12	 Strange memory corruption on nameserver has required a
	 rare reboot. Brief interruption (about 2 mins)

11:00	 Just discovered that today is the day that all the default
	 timezone files on all our servers said daylight saving
	 ended. Alas, it was changed due to the Commonwealth Games
	 and now ends 2nd April. Main server fixed now, and I am
	 rolling out the changes to all other servers.

21:00	 Power restored to safe levels, site again operational.
	 Problem believed to be AVR (regulator) at Shelley,
	 which has now been forced into Manual mode until
	 repairs can be effected, hopefully tomorrow morning.

19:55	 UPSs have shut down, mains still at dangerous levels
	 but with horrible waveforms.
	 Still no word from TXU. At this stage, all customers
	 using our Corryong POP are affected.

19:17	 Major power problems in Corryong. Mains voltage at 280V.
	 TXU called but have no idea of the fault or restoration
	 time. Site running on UPS for now.

12:10	 Brief interruption to service on all Albury digital services
	 while Telstra re-configured the Onramp lines. (This had been
	 scheduled for 1/Feb/2006 but Telstra let us down, again!)
	 (13:45, Wangaratta also re-configured, no users affected)

11:50	 Wangaratta site required urgent chassis change due to
	 catastrophic failure of 2 cooling fans. Chassis changed
	 and back on-air 12:05

12:32	 Power restored. Genset shut down, refuelled, ready again.

11:33	 Power failure in head office. Moderately widespread, much
	 of the immediate area was completely out, or mostly out.
	 We lost 2 of 3 phases, running computer room on genset.

08:20	 Power glitch, loss of temporary power supply in the Albury
	 wireless AP basestation. Proper 18V supply sourced and now
	 installed, site back to full operation 10am.

07:26	 Power restored to Corryong.

06:50	 Power loss in Corryong at 6:03, running on UPS. Batteries
	 finally gave out, site down.

12:50	 Power loss in much of Albury, believed to be caused by
	 an "incident" by Abigroup with the freeway construction.
	 Power restored to most areas by 13:10, but some wireless
	 customers still out due to loss of our albury-central AP.
	 Took some time to gain access to the site because of no
	 keyholder. Finally gained access about 16:45 and was able
	 to identify the cause was a fried switchmode power supply
	 for the radio equipment. Replaced and site back online in
	 a matter of 3-4 minutes.

17:59	 Wangaratta restored.
17:05	 Lost Wangaratta site. Strong winds and storms, a
	 power failure is believed to be the problem. Trying to
	 get someone to attend the site.
	 Sites affected will be Yarrawonga, Benalla, Wangaratta.