Historical status reports for 1997
08:25 Router reloaded and services restored. TIS are initiating a full
investigation into the repeated ISDN failures on this router.
08:15 Update: Marcello from TIS is calling David Woodgate (TIS 'GOD')
because the router has completely failed again. Restoration of
service is expected shortly.
??:?? Unknown link failure. A call has been placed to Telstra; we're
3408316 both working to identify the problem and restore service.
It does, however, look like their router upgrade of 18/Dec/97 has not
fully prevented the lockup condition it was supposed to fix!
08:55 Mains restored at MtBeauty. All services operational.
Less than 5 minutes service interruption resulted.
08:50 Mains failure in MtBeauty at 0800. All services continued without
interruption until the UPS finally ran out of steam.
12:40 Router upgraded and reloaded, seems to be fully operational.
Telstra engineers advise the upgraded IOS should prevent future
dropouts of the interface, and some other minor bugs.
12:00 Loss of all ISDN links to Telstra. The fault has been reported.
3408074 Further discussion indicates their core router has failed AGAIN
and, due to the serious nature of the router failure, an upgrade
of the IOS (the router "operating system") is being performed
immediately. This will result in a slightly longer outage, but
should mean more reliable service in future.
21:27 I've got the main UPS back on-line. For reasons as yet unknown,
the UPS which didn't blink at all the horrible failures and partial
failures during the afternoon, went off-line without notice.
It's taken a little longer to restore all systems, as some servers
have not been down for 18 months, and they needed a reasonable
time to come back up and fsck their (large) disks!
More investigation as to why the UPS went offline is under way.
21:02 Total loss of power to our computer room.
ALL systems affected. ALL sites affected.
06:55 Links restored. Problem has been isolated to a faulty router at
the Telstra point of presence. It has been an ongoing fault, and
the fault has been escalated to priority fix/replace.
00:15 Total loss of our ISDN links to the Australian Backbone.
3407193 Telstra have been notified. We have no estimated restoration time.
20:15 Severe disruption to international traffic, seemingly caused
3406955 by route flapping or router problems, apparently at the MCI end
of the Melbourne-Bloomington international link. Engineers have
already been recalled, and are apparently about an hour away.
3406829 International traffic increasingly slow. Telstra advise that
due to the failure of the Perth 8Mbps link, additional traffic
is being routed via the east coast, and some network degradation
is being experienced. MCI are working on the problem.
12:30 Problem has disappeared. Telstra are investigating.
12:05 Loss of contact with Telstra core router.
3406765 Problem has been reported; it appears to be an ISDN fault.
01:00 Problem appears to have been a Denial-of-Service attack.
We are working with Telstra engineers in an attempt to
isolate the source of the attack.
23:30 Core router seems to have become uncontactable.
3406758 Telstra engineers have been recalled to investigate.
Telstra have advised that their main news server will be
offline for major upgrades from the following times:
Outage-Start: 6/11/97 00:00 (Aust Eastern Time)
Outage-End: 6/11/97 17:00 (Aust Eastern Time)
11:45 Service restored. Cause was a failure of power to the
core router in the Lonsdale exchange. Power has been
restored and upgraded (moved) to a more reliable feed.
11:30 Interruption to service. Cause not currently known,
3406520 but has caused loss of connectivity to all national and
international sites. Appears to be at Lonsdale exchange.
22:00 Service technicians have been recalled to the exchange for
immediate repair (after all, it WAS their fault). Full service
has been restored (better late than never).
08:45 Sometime in the last 18 hours, Telstra managed to stuff up
a number-change order to our Culcairn facilities, resulting
in a total loss of service to all Culcairn numbers.
The matter has been raised to supervisor/manager level to
effect repairs earlier than the "sometime Monday" quote
from Business faults. (This fault only affected indial numbers.)
17:25 The core router we connect to had crashed and re-set.
Telstra engineers are investigating.
17:17 Loss of connection to the internet.
3406342 Our main links, while connected, are not passing any traffic.
16:00 The problem at Sydney PAD8 router has been corrected by Telstra
Internet engineers. Excessive processor load due to a configuration
error has been blamed for the problems.
13:00 Intermittent packet loss, particularly on international links.
340628 The problem is believed to be the main international router,
and is being worked on at the moment. No specific cause has been
identified, and no repair time has been given.
The problem is intermittent and varied, but seems to manifest
itself as alternately fast and slow international access.
15:58 Macrolink failure was caused by a faulty link in the exchange wiring.
It has been repaired/replaced and is not expected to cause further problems.
14:57 Total link failure (again!). This time, it's the Macrolink, not the
3406114 core router. Telstra ISDN are working on the fault and will restore
service ASAP. No estimate yet except "hopefully this afternoon or
evening". (Great help guys!).
10:55 The problem has been traced to a total failure on the core router
at our connection to the backbone, along with collapse of the ISDN
lines. ISDN restored, router re-loaded and everything operational.
09:30 There appears to be a router problem with our connection into the
3406027 Australian Backbone. Engineers have been called to the exchange
and are working on it.
10:48 A compound problem (Why do these things always go wrong together?)
1. A main service fuse blew in the Lonsdale exchange, taking out
a large part of the Telstra backbone including international
links and many domestic circuits.
2. The new router we are connected to at the Telstra exchange had
a faulty ethernet cable causing intermittent loss of connection.
08:50 More problems with Telstra's new router.
3405197 A fault has been reported and is being worked on.
19:50 It appears some residual routes may have existed in the Lonsdale
router resulting in all packets being routed to the wrong router.
We're hopeful Telstra have fixed the routing properly this time!
18:39 The wheels fell off. All external connectivity past the Melbourne
3405191 core router disappeared. Telstra technicians being recalled to
the exchange to work on the problem.
18:06 Routers re-configured and back on-line. Normal service resumed,
but additional bandwidth now available to subscribers.
18:00 Added additional 256K ISDN link, which resulted in a brief
period where service was unavailable.
17:52 ISDN services restored.
16:21 Major exchange problem at the Lonsdale exchange saw a total
failure of all ISDN services. Technicians have been called to
the exchange to attend the problem.
12:00am Telstra have problems with their news server and have taken it
off-line for repairs. We have not been advised of an estimated
repair time. This problem only impacts news.
09:40am Power restored 10:10am.
09:40am A power-fail at Mt Beauty, approximately 9:20am, expected to last
at least an hour, saw us take down the MtBeauty server in order to
conserve UPS batteries, and determine how long the UPS will hold
up in real power-fail conditions. The system will be restored as
soon as power is returned.
12:35pm Finally, everything has been completed. The delays turned out
to be various problems with the exchange and new software that
the Telstra technicians use to alter line configurations.
11:00am The upgrade work scheduled for 9am today has run into problems
which Telstra are working on at the moment. We have no estimated
time of completion, but are assured they are working on the
problem as quickly as they can. It seems some of the exchanges
need upgrades as a result of the work being done.
09:00am Several of our out-of-town facilities are being upgraded,
resulting in temporary unavailability of lines. The work
is expected to take 2 hours.
15:56 ISDN links re-established. The fault was traced to
a central router failure in Melbourne, and affected
*all* ISDN links to Telstra/Melbourne.
15:40 Our ISDN links all failed.
Telstra is currently working on the problem.
A restoration time is not known at this stage.
12:24pm Reduced International Capacity
Telstra is currently experiencing problems on the Lonsdale - LCA link.
Capacity has been reduced by 16M.
A restoration time is not known at this stage.
12:28pm Connectivity restored for the Melbourne - Los Angeles 32M systems.
10:01am Subject: Link Failure - Melbourne - LA
The 32Mbps circuit between Telstra Internet Melbourne and MCI
Los Angeles failed earlier this morning. Traffic is being rerouted
through the other circuits, but some increased level of congestion
will be visible to customers.
The problem appears to be communications controller equipment failure -
urgent remedial work to restore the circuit is underway.
16:00 Our newsfeed appears to have failed during the changeover.
Telstra Internet are working on the problem, but do not
expect it to be resolved until Tuesday morning.
14:00 We are pleased to report we have changed our connection into the
internet to be one router closer to the backbone. We now route
directly to the Telstra Melbourne PAD. This is expected to make
a marginal improvement in performance, but is in readiness for
another increase to our bandwidth.
12:00 We are pleased to report several performance increases today!
1. We've added a further 20 modems to our pool. The busy tones
some people experienced, particularly mid-evening, should be a
thing of the past.
2. We've commenced upgrading modems. Our primary pool is now half
33K6 modems. If you call in on our primary number (060-40-3000)
and connect to a line starting with ttyA0, you will be on a 33K6
modem. Lines starting ttyA1 are currently 28K8.
3. Additional bandwidth is on order!
4. Additional international bandwidth came on-line this morning!
(See note following)
10:00 Date: Thu, 06 Feb 1997 09:54:43 +1100
From: Geoff Huston
Subject: Additional 32Mbps of capacity installed
I am happy to announce the installation of a further 32Mbps of
International capacity has been switched on at 9:30 this morning.
The additional capacity is configured as a logical connection between
Melbourne and MCI in Los Angeles.
The current configuration is a dynamic load balancing of traffic
between the 2 x 32M systems in Sydney and Melbourne, so that while
outgoing traffic will take the closest exit path, incoming
traffic will take either path, depending on the US data
path used. At this stage we are still completing the internal
routing structure so some routing path changes should be expected.
Further refinements of the routing design may be considered in the
coming weeks in order to ensure the best possible service delivery
setup for Australian customers of Telstra Internet.
Our current off-shore connection configuration is:
32Mbps - Sydney - San Francisco
32Mbps - Melbourne - Los Angeles
8Mbps - Perth - Los Angeles
6Mbps - Brisbane - Los Angeles
4Mbps - Adelaide - Los Angeles
4Mbps - Sydney - Auckland
14:33 Telstra have advised that pad11 at their Paddington exchange has
become inoperable. They are currently re-loading the configuration
and expect it to be back on-line within 15 minutes (15:00).
14:56 It is fixed and back on-line.
03:00 Scheduled Service Interruption: pad11.Sydney
Major upgrade work was performed on the router
pad11.Sydney.telstra.net at 0300hrs AEDT on
Tuesday 21 January 1997. This work took the expected 1 hour.