Historical status reports for 2004
00:15 Problem seems "fixed" although still no word from
114053568 telstra. Slight increase in latency to everywhere,
suspect they've changed things significantly.
15:30 Packet loss between Albury and Melbourne on the
114053568 telstra network, causing major dramas for incoming
mail and DNS.
Tried to report for hours but telstra are not even
answering their helpdesk.
Response by 19:25 tonight.
Numerous calls to chase up what's happening, they
are claiming it is a ribbon cable that has failed
that connects the ATM interface.
19:00 Had to kill sendmail as the packet loss was
crippling the box (load avgs >40 due to 600+ active
mail processes). Tried restarting it several times
but problem instantly re-appeared. Once packet loss
was down to about 4% it was acceptable and mail
was working fine from about 23:30.
09:00 TXU advise interruption to power in Corryong from
09:00 to 15:00 (removal of overhead power lines
in main street). We're trying to arrange genset
for this period.
13:00 After many hours frantic work, Mark and Ross have
rebuilt an access point, re-located the whole
site and got it back on-line and running.
Matter to be escalated to the AFP for action.
15:13 Complete loss of our microwave network.
Criminal damage (police are pursuing) to our main
access point, approx $10K of wanton destruction.
All microwave access to clients lost. Skycam gone.
16:30 Brief interruption in service to Mount Beauty POP
while a UPS was exchanged.
18:00 Power supply fan in secondary DNS has failed,
resulting in a few unexpected reboots due to
overtemperature. Replacement scheduled ASAP.
11:20 Loss of ISDN to Corryong. Widespread outage, has
disrupted telephone, ISDN and mobile services to
the area. Seems to be telstra only. Also affected:
Lavington mobile out, Jindera out. No response or
information from telstra (surprise!).
Restored (without notice) 12:30
21:10 Loss of all Albury microwave sites.
Tech recalled. Main storage cell flat, replaced.
(Due to lack of solar radiation, steps being taken
to provide early-warning in future)
04:15 Loss of all Albury microwave sites.
Tech recalled, problem found and fixed - loss of power
at CPE2 access point. PSU replaced, services restored.
17:24 Loss of all ihug satellite services, affecting Falls,
112311197 Hotham and Beauty. Traceroute dies at ken-core4 in
Sydney. Reported to telstra. (Seems ok from our
servers in the USA)
12:33 Loss of DSL connectivity and some strange DSL type issues.
Eventually found the problem - some idiot has re-assigned
OUR fixed IP address to another user in Sydney, thus causing
all manner of weird routing issues.
10:30 Loss of wireless communications from CPE1.
Communications restored after a few minutes, investigations
into the cause continuing.
08:30 A new, dual-redundant power supply/battery/power shelf
has arrived and is scheduled to be installed. Probable
interruption for up to 10 minutes while re-wiring of
the communications rack is undertaken. Will affect all
users in Corryong, Wangaratta, those calling 604946xx
and diversion facilities in Corowa, Culcairn and Holbrook.
06:23 Lost power to Microwave Access Point. Tech called to identify
the cause, and take whatever actions are required.
17:42 Jury-rigged supply is running, all services back on line.
Interruption affected Corryong, Wangaratta, some diversion
sites and people calling the Albury Tigris. MtBeauty,
Albury Max and some diversion sites unaffected.
Replacement redundant supply due tomorrow.
17:01 Bang! from the back of the office area.
A quick sniff in the computer room confirms "something" is wrong.
Tigris Power supply has let out the magic smoke.
14:52 90 minutes later, still no response from telstra (how typical!).
Our tech attended the site, reset the telstra equipment, unplugged
and re-plugged the cables, and error lights have gone off, site now
back on-line. Waiting for explanation and investigation by helstra.
13:22 Mount Beauty ISDN services all seem to be down.
111855889 Taking forever to get someone at telstra to accept
and action the fault. No ETR yet.
Almost impossible to understand "Roshni" in faults
centre who seems to have no idea of what "onramp 30" is.
Response *REQUIRED* within an hour.
Interruption to Construction-Cam expected while
it is relocated. Expected duration 30-60 minutes.
Brief interruption (approx 2 minutes) expected for
our Albury Microwave Network while new power circuits
are brought on-line. This work should increase the
long-term reliability of the site.
19:41 Mains restored in Corryong, everything back up.
18:12 Mains failure in Corryong, UPS batteries have given up,
Corryong site is currently off-line.
08:40 Changeover completed, less than 2 seconds downtime.
Don't you love it when a "best case" scenario works out?!
08:30 Telstra are changing routers at the Albury exchange. Our
primary link will be off-line for an estimated 2 minutes
during this work.
17:50 We have been advised of a failure of the upstream News
server which feeds us. No ETR given at this stage.
20:30 Major latency between Sydney and Perth to all international
15739226 destinations. Reported to Paul at telstra wholesale who
was going to take it up with reach. No ETR. Snip of trace:
8 Pos2-0.ken-core4.Sydney.telstra.net (126.96.36.199) 43.915 ms
9 10GigabitEthernet3-0.pad-core4.Sydney.telstra.net (188.8.131.52) 44.134 ms
10 GigabitEthernet0-0.syd-core01.Sydney.net.reach.com (184.108.40.206) 41.247 ms
11 i-13-1.sjc-core01.net.reach.com (220.127.116.11) 1723.123 ms
12 qwest.sjc-core01.net.reach.com (18.104.22.168) 1750.298 ms
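The trace above puts the problem between two adjacent hops. A minimal sketch of how that could be picked out automatically, assuming the hop lines keep the "hop, hostname, (address), time ms" layout shown above; the function names parse_hops and largest_jump are made up for illustration and are not part of any tool we run:

```python
import re

# Hop lines copied from the trace above (addresses left exactly as logged).
TRACE = """\
8 Pos2-0.ken-core4.Sydney.telstra.net (126.96.36.199) 43.915 ms
9 10GigabitEthernet3-0.pad-core4.Sydney.telstra.net (188.8.131.52) 44.134 ms
10 GigabitEthernet0-0.syd-core01.Sydney.net.reach.com (184.108.40.206) 41.247 ms
11 i-13-1.sjc-core01.net.reach.com (220.127.116.11) 1723.123 ms
12 qwest.sjc-core01.net.reach.com (18.104.22.168) 1750.298 ms
"""

# Assumed layout: hop number, hostname, bracketed address, one time in ms.
HOP_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+\((\S+)\)\s+([\d.]+)\s+ms")


def parse_hops(text):
    """Return a list of (hop_number, hostname, latency_ms) tuples."""
    hops = []
    for line in text.splitlines():
        m = HOP_RE.match(line)
        if m:
            hops.append((int(m.group(1)), m.group(2), float(m.group(4))))
    return hops


def largest_jump(hops):
    """Return (from_host, to_host, delta_ms) for the biggest latency increase
    between consecutive hops, or None if there are fewer than two hops."""
    best = None
    for prev, cur in zip(hops, hops[1:]):
        delta = cur[2] - prev[2]
        if best is None or delta > best[2]:
            best = (prev[1], cur[1], delta)
    return best


if __name__ == "__main__":
    src, dst, delta = largest_jump(parse_hops(TRACE))
    print(f"largest jump: {delta:.3f} ms between {src} and {dst}")
```

On the snippet above this singles out the step from syd-core01 to sjc-core01 on the reach network. It only compares one sample per hop, so it is a rough pointer for a fault report, not a diagnosis.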
04:30 Telstra advise an interruption to "ALL-Albury" routers
for scheduled maintenance.
18:45 Loss of signal to MtBeauty radio AP.
Tech called to site, found flat battery. It had
not been replaced when it was supposed to have been
and was exhausted. Replaced, service interval fixed.
Service resumed 19:20.
10:50 Unusually high latency on peering link to Sydney.
Appears to be a problem within the telstra adsl
network introducing 5000 ms latency. Reported to telstra
and they're working on it. Will not be affecting any
of our local users, but may cause problems for roaming
customers because of slow RADIUS authentication.
15:30 Power failure in Wangaratta CBD. Site running on UPS.
UPS gave out about 16:30, power restored 17:55.
09:04 Barry Keenan from Telstra is on-site to do BERT tests on the
service, after further checks since 30/Jan/2004 showed more
error-seconds accumulating on this service. All calls to
60494600 were interrupted briefly while that line was taken
out of service for testing, but calls were still being accepted
on the same number via alternate service channels.
15:05 Several calls connected to Albury pool 60494600 dropped out
simultaneously. Our equipment recorded a loss of sync on one
Onramp30 service. Fault reported to telstra, awaiting action.