Historical status reports for 2000
10:00 Transformer has been replaced, power being restored to it,
instead of backfeeding from other substations on the ringmain.
18:00 Power restored, transformer is being replaced overnight.
No disruption to any of our services resulted, although
one office UPS has been destroyed (no customer equipment
was attached to, or serviced by, that unit).
16:00 Mains phase failure, multiple times. The 500 kVA transformer at
the substation behind us has failed and an extensive outage is
being indicated by the power authority. We are running on our
own diesel generator.
10:00 Completed re-building our main proxy and doubled its memory while
we were at it, checked and returned to active service, backup proxy
now released and DNS returned to primary server. Will monitor to
ensure optimum operation and no unexpected side effects.
04:00 Done all I can, need parts. Going home for sleep, will return with spares.
02:20 Proxy server appears to have a major memory fault, services switched
to our backup proxy while repairs can be effected.
02:00 Our main proxy server has failed. Reason unknown. Tech recalled to site.
11:00 Seems the outage was caused by telstra ceasing to announce 203.0/10
to the world, and several hundred ISPs were effectively isolated
from the rest of the world.
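To give a sense of how much address space dropped out of the global routing table, here is a minimal sketch. It assumes "203.0/10" above is shorthand for the CIDR block 203.0.0.0/10, which the log does not spell out.

```python
import ipaddress

# Assumption: "203.0/10" in the log is shorthand for 203.0.0.0/10.
block = ipaddress.ip_network("203.0.0.0/10")

first = block.network_address      # lowest address in the block
last = block.broadcast_address     # highest address in the block
count = block.num_addresses        # total addresses covered by the prefix

print(f"{block} spans {first} to {last} ({count} addresses)")
```

Any ISP numbered out of that range would have been unreachable from networks that no longer saw the announcement.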
09:30 International connectivity lost, telstra informed (but were, as
12518659 usual, unaware of the problem). Looking into it now.
12:50 UPS batteries nearly exhausted, shutdown while reserve exists.
Will monitor for startup when power is restored (whenever that is!)
12:00 Mains powerfail in Corryong. Running on UPS.
23:30 Telstra have had a major stuffup with an exchange upgrade.
SI27190 the albury AXE central processor was being upgraded, but at
about 23:20 it all turned to crap and they lost the entire
exchange, requiring major work. Over 30,000 telstra customers
were affected. Their outage report says severity: MAJOR and
indicates full exchange restoration at 03:54 am, 24-Nov-2000.
Impact to us was that all our indial lines including ISDN
links to remote POPs were down, but core connectivity to
the internet was not affected.
16:00 Boy, it doesn't get much worse than this.
1. There's a break in the main international fibre link just
out of singapore. Telstra are trying to route around it,
but since over 50% of international capacity is gone, there
is significant congestion.
2. There's been an explosion and fire at a substation that powers
the telstra main paddington exchange. 20,000 homes without power
and serious routing problems there too.
3. Telstra's NSW news server is broken (again) since 3pm.
12476814 Routing problems (yet again). Telstra upper-echelons were aware
14:25 but thought they could fix it before people noticed. (they didn't)
Problem still evident 15:35.... who knows!
08:40 Call from telstra melbourne to advise that the nsw server will be
offline for an indeterminate period of time and offering to switch
us to the WA news servers, before he'd realised we had already made
arrangements to switch to the Vic servers last night.
22:00 Call from someone at telstra to confirm the NSW news server was
"broken" and they were contacting someone to try to fix it.
No ETA though.
20:40 News seems to be not responding. Our server is ok, the telstra
12464701 server nsw.nnrp.telstra.net which feeds it seems to be dead.
Tried calling in a fault with telstra, but can't get them to answer
their phones still....
13:55 We were a little late in starting the maintenance, but all went well
and everything was back on-line in 12 minutes.
12:00 We intend taking our MtBeauty facility offline for about 20 mins
late-morning thursday, 09-Nov-2000 for routine maintenance.
10:02 Long delays (>1100mS) to the USA via perth-paloalto links yet again.
12438488 Reported to Mauro at telstra for immediate action.
13:30 USA Connectivity is better, but BADLY congested, resulting in
>2 seconds delays. Reported to telstra and they are working on
it. Seems it's route flaps and route dampening, telstra have
"resolved" the routing issues by routing around AT&T but that
causes congestion on other routes. I expect some instability
for at least another couple of hours.
10:30 USA Connectivity is flaky at best. Reported to telstra
12422439 and finally convinced them to investigate. Awaiting reply.
19:30 High ping times again, same as before.
12416781 Logged again, this time with Jim Welsh
1.10pm High response times Perth -> Palo Alto again.
12414911 Fault logged with Mick.
9.45 Appears there is a major link down between Perth and
12400998 Palo Alto, USA. Traffic is being routed via Sydney,
which has become overloaded.
18:10 Update: telstra have just discovered where the break is
(just out of Bright), and are working on it, but at this
stage they still cannot offer any ETR. At last count,
there are 9 mobile phone bases off air, over 3000 phone
customers at bright/myrtleford/porepunkah off the air
and all "special services" (ISDN, data etc).
13:30 Mount Beauty has "disappeared" off the face of the
12398706 earth. Telstra have now determined that it is a
complete failure of the 144 megabit main bearer
somewhere between Wangaratta and MtBeauty they think.
Absolutely no idea of ETR.
14:24 New equipment installed in Corryong, a 6-minute
interruption to service was necessary to complete
the changeover. Full service resumed 14:30
19:29 Our server is back on-line. Significant damage has
been sustained to peripheral equipment (UPS etc) and
replacements are being freighted in ASAP, however
all services are back up and running.
19:00 The situation as it unfolds seems to be that widespread
damage has been suffered in and around Corryong, and
our site has not escaped. Replacement parts have been
sought, and are being installed. Hopefully, we'll be
back on-line within an hour.
09:38 Either there have been other failures during the
night, or the UPS batteries are getting weak.
UPS gave out after only 20 minutes. Connectivity
lost until power returns.
09:20 Mains power failure in Corryong. Running on UPS.
Called Martin Dorman at Corryong Police station,
no idea of restoration time.
10:00 Software uploads took longer than anticipated, the
outage commenced at 10:00 and was completed 3 minutes
later. Everything looks great now.
9:30 We hope to perform upgrades to our Wangaratta POP
thursday morning. This will require a brief outage
hopefully no more than 5 minutes duration.
20:27 Mains failure on two phases - one phase 240V, two at 87V
UPSs holding, but under serious stress due to strange
power conditions. First equipment failed 20:40.
Technician on-site 20:50 to find everything in disarray.
Generator failed to start due to the mains conditions,
further mains degradation saw modem pools die shortly
afterwards. Main servers continued to run, except for
mail/web/auth which was on a separate circuit, to isolate
it from other problems, and as Murphy would have it,
failed first. REVISION of power systems ordered!
Power systems rerouted 21:05, power restored to site
15:30 Just when we thought it was behaving, the Annex died
again and required a reboot. Corowa/Henty/Holbrook
callers will have experienced a brief period when calls
were not being answered.
21:30 Slow international transit between AUS/USA, again via
12312780 telstra perth and palo-alto routers, but also this time
some impact on Sydney-Auckland routes too. Reported,
awaiting response. Chris.
12:10 Slow international transit between AUS/USA, again via
12303228 telstra perth and palo-alto routers. Reported, waiting
17:20 Slow international transit between AUS/USA. Reported to Zac,
12297424 being raised with the networking people.
08:24 It appears telstra have done an exchange reset, dropping ALL
users currently logged in on our 605170xx numbers. No warning
and as yet no explanation.
02:00- Due to errors in documentation, changes to the timezone
10:00 files on several of our servers were incorrectly interpreted
resulting in a premature entry to daylight savings time.
Calls and e-mail between 02:00 and 10:00 today will have
been attributed as though they had happened an hour later.
No call DURATION information is wrong however, as all call
information is stored as ending time and duration in seconds,
rather than start and finish times.
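The point about durations can be shown with a minimal sketch. The record layout, function names and sample times below are hypothetical and are not taken from our accounting system; only the "ending time plus duration in seconds" idea comes from the note above.

```python
from datetime import datetime, timedelta

def call_start(end_time, duration_seconds):
    """Derive a call's start from its stored end time and duration."""
    return end_time - timedelta(seconds=duration_seconds)

def duration_from_stamps(start, finish):
    """Alternative design: duration computed from two wall-clock stamps."""
    return int((finish - start).total_seconds())

skew = timedelta(hours=1)              # clock running an hour fast
true_end = datetime(2000, 1, 1, 3, 15) # hypothetical call, date is arbitrary
duration = 600                         # seconds, measured independently of the clock

logged_end = true_end + skew
# Stored end time is an hour late, so the derived start is too,
# but the stored duration is untouched.
start_shift = call_start(logged_end, duration) - call_start(true_end, duration)

# Had start and finish been stamped separately, a 10-minute call that
# straddled the clock change would appear to last 70 minutes.
straddle_start = datetime(2000, 1, 1, 1, 55)
straddle_finish = datetime(2000, 1, 1, 2, 5) + skew
wrong_duration = duration_from_stamps(straddle_start, straddle_finish)

print(start_shift, wrong_duration)
```

Storing the duration directly means a clock error moves a call in time without changing how long it is billed for.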
11:40 Peculiar DoS attack directed at our servers. Not causing
12182639 any problems at this stage, but telstra have been advised.
12:40- Widespread power fail at 12:05 affected our Corryong site
13:45 when our UPS ran out of battery. Power is now restored,
and the site is fully operational again.
08:00 alb-anx1 again required a reset.
00:50 lon-core2 (melbourne) router had died, required telstra
staff to physically attend to and reset the router. This
affected most of victoria, not just Albury.
23:20 The entire telstra network seems to have gone! Reported to
12133578 Dennis. They are recalling someone.
22:00 alb-anx1 not responding. Technician recalled to site.
Re-started and operational, accepting calls again.
14:50 International access via Perth has some performance issues.
12985385 Response has dropped from about 300mS to 500mS. Telstra have
been advised, and are investigating.
08:15 Albury Annex was locked, failed to respond to a remote restart,
so technicians attended on-site. Service restored.
22:00 We're back, but still no word from telstra as to the cause. Apparently it was QUITE
widespread. Took out half of Melbourne too.
21:35 Telstra's main Albury router has failed. They are recalling someone now to attend to it.
12018094 (Reported to Mick, immediate recall initiated, he's confirmed no connection to Albury)
Further information: Mobile services in albury also appear disrupted (112 only), and
large chunks of Australian IP space are being "black-holed".... it's a widespread fault.
09:01 Customer reported Corowa modem pool down.
Annex was indeed down, technician despatched to site
and service restored 9:37am. Also re-enabled the
fault-monitoring system on that site which had been
17:44 Once again, major packet loss Sydney -> San Francisco.
11984858 Fault reported, now we wait...
07:30 Power is restored to Corryong, although no indication of
the cause at this stage.
01:00 Approximately 1am we lost connectivity to Corryong. It looks
like a power failure at midnight, but we've been unable to
get anyone in Corryong to confirm what's what but will try
again first thing in the morning.
15:00 Albury Analog modem pool was rebooted to resolve
operational issues. Outage duration approx 8 minutes.
22:00 Telstra have advised there will be a brief interruption to
our ISDN services to our Corryong facilities between 10pm
and 11pm on Saturday evening. The interruption is listed
as being for only 1 minute. Sorry, there's nothing we can
do about this.
21:30 FINALLY telstra have fixed the Sydney problem. No details
yet, except it was "congestion" on that link.
11948798 International transit via SYDNEY now stuffed!
Telstra advised, awaiting a call back.
07:30 After several more phone calls, still no ETR or call back,
but melbourne seems to be restored now. It apparently was
a router failure/problem, but no more detail yet.
Pop42 in lonsdale exchange core router collection config
had become corrupted and they needed a backup. Took far
too long obviously. (Finally got the call back at 7:58)
05:15 Major connectivity problems at Melbourne. Telstra were to
11948368 have done a replacement of a critical router, with minimum
downtime expected, but still down at 6:50am. Reported but
no call back from telstra at this time and no ETR.
14:40 The problem was (again) one of two 45 meg links from the USA
to Perth failed (and again, un-detected by telstra). Thanks
again to James Deane for his quick response to fix it.
14:00 Slow international transit via Perth. Telstra engineers have
11946962 been advised and are investigating the cause.
22:20 James Deane has again worked his magic. Seems one of the two
45 megabit links from Paloalto to Perth was down, at the
Paloalto end of the link. This has caused some bizarre routing
issues which have only affected some address ranges. All looks
fine again now, James is looking into the issue further.
15:30 Significant delays to SOME international transit.
Cause unknown at this stage.
11:15 The problems with sydney will probably continue for another
2-3 weeks, until a further 3 x 155 megabit links between
Aust and the USA are brought on-line. Telstra are attempting
to load-balance their existing links better in the interim.
09:00 Problems with Sydney -> San Francisco link again. Fault
11919548 logged with telstra.
12:00 Several hours of connectivity problems while RW was in Perth
11900147 Not sure the exact timings, but it was first via Sydney, then
Perth, and lasted several hours on and off.
05:00 Telstra advise they will be disrupting our primary digital
indial ISDN services for an exchange upgrade. They expect
a one minute interruption to services, during which time
existing calls will probably be dropped. This will affect
all ALI subscribers dialing our 6051-7000 number range.
11900226 Telstra appear to have issues in Melbourne. All traffic
currently being routed to Hobart! International connectivity
14:00 Significant packet loss Sydney-San Francisco. Telstra have recalled
11899781 an engineer.
08:10 Telstra have confirmed they had a router fault, now fixed.
07:50 International transit via Sydney has become very slow. Reported
to Telstra who are working on it now.
01:15 International transit via Sydney very poor (>2sec)
11865098 Unable to get telstra to answer their phones to report until 9am
but it's being actioned now.
02:35 International transit now restored. Sydney router problems.
01:05 Telstra have lost all international connectivity, and their
11850279 fault reporting system (so can't even give me a fault no!)
Widespread problem, they are recalling people now.
14:15 Delays on international ingress, and domestically to sydney.
11850110 Reported to telstra, they're looking into it.
15:05 5 minutes ago, most national, and all international transit
11849635 has slowed down significantly. Telstra have been advised and
are recalling technical staff to resolve the issue. Our early
testing shows a problem at the lonsdale routers in Melbourne.
05:00 Telstra advise of one 1-minute interruption to services
affecting our Corryong ISDN links. This should have
minimal effect on our subscribers.
09:50 Terminal server back on-line.
09:25 Alb-anx1 isn't responding. Tech recalled to site.
10:45 We've now got everything back and operational. No further
problems are expected, especially now that everything is
delivered over SDH/VCTS/Fibre.
08:30 Bruce has arrived from Telstra to do more diagnostic work
in an effort to resolve the second OR30 problem.
22:30 Still no luck, despite changing the port at the exchange.
Telstra staff are expected on-site tomorrow morning.
16:30 The line issues below have continued, and one half of one
of our OR30 links isn't operating properly. Callers may
experience calls answering, but failing to connect. We've
got telstra exchange staff working on the fault. In the
mean time, we've got that half of the OR30 locked out until
we can resolve the problem.
15:00 Seems this morning's efforts from telstra have caused some
11805624 serious issues, notably they have in effect cut our line
capacity in half due to some configuration error. Telstra
have recalled technicians to work on the problem, we have
no idea yet when it may be resolved, but it's being worked on.
06:30 Apart from some exchange difficulties, the transition went
reasonably smoothly. Most calls were automatically passed
off to our overflow equipment, so users will not have
experienced any problems, although MtBeauty was off-air
until 6am, and Corryong didn't return until 6:30am
05:00 PLEASE NOTE: We are anticipating a transition of two
of our ISDN-PRI interfaces from copper to fibre this
morning, which could take up to 1 hour. This will affect
all users dialing into our "60517000" albury number, and
will also affect MtBeauty and Corryong sites.
05:00 Telstra advise a one minute interruption to our Corryong ISDN
links to perform network upgrades. We expect minimal impact
to our customers, as this is an internal link and does not
directly affect dial-in users.
18:00 The Annex failed to re-boot properly, multiple times.
Complete disconnect/reconnect from internal power supply
has returned modem control signals, and a re-boot has
restored the server to operation. Investigation to cause
and long-term prevention has begun.
17:15 Our Albury Annex server has stopped responding.
Engineers have been recalled to the site. No ETA as yet.
00:00 Telstra advise one 30 minute disruption to some services
for an exchange software update. This will affect only
our Wangaratta facilities, but is NOT expected to cause
any disruption to services for our subscribers.
04:15 Larry at telstra has taken my report re slow traffic back
11767396 through Sydney again! (Reported 8:35)
17:15 Telstra performance through Sydney is again horrible, has
11765760 been reported to Dan at telstra, awaiting feedback.
19:20 Telstra advise the sydney problem was caused by a faulty
11760868 interface on one of the paddington cisco routers, which
was re-set and is now operating normally. Telstra also
confirmed my long-held belief that they do NOT have any
pro-active monitoring, which explains why they take so
long to recognise a fault exists. (Graham)
11:40 International transit via Sydney is now slow.
11760868 Telstra have been advised and are recalling a tech to
investigate and work on the problem. (Dan)
10:30 International transit via Perth restored. Telstra report
11755319 there were actually two faults, one corrected reasonably
quickly last night, the other which had escaped their notice
was a 45 megabit bearer was down, resulting in massive
overloading of that link. (Vince)
18:20 Slow international transit on SOME addresses, appears to
11755319 be only traffic routed via Perth-Paloalto. Seems the perth
to Paloalto link is way overloaded, telstra are trying to
re-route some traffic via Sydney to alleviate the issues.
-TBA- There will be a brief interruption to all services calling
our primary Albury digital pool, and all users calling into
our MtBeauty and Corryong sites, during an ISDN service
upgrade scheduled for Friday. Interruption should be for
only about 5 minutes, but could take up to 30 minutes while
telstra reconfigure exchange equipment.
16:00 Scheduled upgrade of our primary DNS. Should not cause any
interruption to services as the new server should instantly
take over from the old.
12:27 There will be a brief interruption to some hosting services
today to facilitate a system upgrade. 10mins maximum.
15:13 There MAY be brief interruption to traffic during some router
reconfiguration by telstra, necessary within the next 15 minutes.
12:00 International connectivity seems poor, although telstra claim
11692774 to be unaware of any problems. Others throughout Australia are
reporting similar problems, so perhaps telstra will pay some
attention to, and fix, the problem.
09:15 The analog terminal server had become quite upset, and required
remedial attention. It's now operational again, investigation
into the cause is continuing.
09:00 For reasons yet undetermined, our entire Analog pool for Albury
has become isolated from the network.
00:00 Telstra advise one 10-minute disruption to services in Wangaratta,
and Mount Beauty, for Exchange software upgrades. We are unsure
what effects this will have on ALI customers/callers.
09:52 Our main proxy is back on-line and operating normally. The
problem was traced to small files causing
too much memory to be needed to maintain tables. (The proxy
was maintaining over 30 million files). Measures are being
taken to prevent a recurrence.
08:20 The proxy is again operating outside normal parameters.
We've put our hot-backup server into service while the main
server is thoroughly investigated.
06:30 Our proxy is operating outside normal parameters. The system
is swapping heavily, causing reduction in performance. Techs
have re-started the proxy, performance is nominal, but full
investigation into the cause has commenced.
15:30 International transit problem. Telstra routing issue.
07:45 Telstra advise they will be briefly interrupting our backbone
connection for up to 30 mins for power and network upgrades.
19:10 Wangaratta ISDN links were up but not passing traffic.
Technicians have re-initialized the link, now ok.
10:35 Wangaratta links have been re-routed through our Albury
Tigris. This resulted in a short interruption to service for
Wangaratta-connected customers, but should restore full
link speed and capacity to our Wangaratta facilities.
17:07 Funny routing, reported to telstra, they're working on it now.
18:00 Replacement parts should be here tomorrow, but in the mean time,
we've re-routed Wang traffic via alternate links back to Albury.
Wangaratta traffic will be marginally slower during this time.
16:45 Argh! A total failure of our OnRamp services to the Albury Tigris
has left that modem pool, and all Wangaratta services isolated.
13:00 Looks like telstra have broken their news server again!
11496815 Fault has been reported, awaiting an update.
23:03 Our Corryong modem server had a runaway process that in the
space of 3 minutes, took disk space from 23% used to 108% used
which resulted in technical staff performing an emergency
shutdown of the server to clean up and remedy the problem.
Everything is now operational although performance degradation
will have been experienced by Corryong users for about 30 mins.
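A runaway like this can be caught by watching how fast usage grows rather than waiting for the disk to fill. This is a minimal sketch only: the function names and the 5%-per-minute limit are hypothetical, not what our monitoring actually does, and the sample figures are the ones reported above.

```python
def growth_per_minute(samples):
    """samples: list of (minute, percent_used) pairs, oldest first.

    Returns the average growth in percentage points per minute
    between the first and last sample.
    """
    (t0, p0), (t1, p1) = samples[0], samples[-1]
    return (p1 - p0) / (t1 - t0)

def is_runaway(samples, limit_per_minute=5.0):
    """Flag growth faster than the (hypothetical) limit."""
    return growth_per_minute(samples) > limit_per_minute

# Figures from the Corryong incident: 23% to 108% used in 3 minutes.
corryong = [(0, 23.0), (3, 108.0)]
rate = growth_per_minute(corryong)

print(f"{rate:.1f} points/minute, runaway={is_runaway(corryong)}")
```

At roughly 28 points per minute, even a generous limit would have tripped well before the disk was full.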
10:20 We're updating our Radius Accounting server. During this time,
on-line account and whitepages enquiries may not work.
8:47 Wangaratta link restored. Fault analysis commenced by Ericsson.
08:35 Multilink/Compressed ISDN link to Wangaratta stopped responding.
Ericsson technicians have been called to analyse the failure.
10:30 International traffic routing is now complete.
No network disruption was experienced.
Sometime Due to Telstra's ongoing pathetic network performance, we are
expecting some major network routing changes during the day
today, and are expecting a significant improvement in our
international transit performance. Unfortunately, there MAY
be some instability during the transition. Please bear with us.
15:30 We're in the process of moving our albury mail server.
If you haven't got mail recently, please check your POP server is
set to mail.albury.net.au and try again.
15:40 There are international connectivity latency problems, telstra
11428679 have been advised. We have no further information.
18:55 Admin staff have re-started the daemon, and are investigating
the cause of the problem. During this outage, customers from
outside Albury/Wodonga will have been unable to log in.
18:35 Our secondary RADIUS daemon has stopped responding.
08:20 5 minute scheduled outage of our main mail and DNS servers
for a re-boot to clear a memory leakage accumulated over
the last 4 months.