Historical status reports for 2003
17:48 Fault repaired, no reason or information from telstra yet.
16:30 Yes, another major telstra fault, in Sydney this time. Telstra are
working on it already, it has affected most of telstra, their peers,
international links etc. No ETR but it is a major problem and they
will be throwing everything at it to get it fixed.
16:45 We have just installed an amplifier at our "Wheelers Peak"
access point, increasing EIRP to the legal limit (up 6 dB)
but more importantly, adding 14 dB receive gain. All links
are showing a significant increase in quality.
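
For readers who want the decibel figures above as plain ratios, here is a minimal sketch of the arithmetic. It assumes the quoted 6 dB and 14 dB are power gains and ignores the noise the amplifier itself adds, so the receive figure is an upper bound on the real improvement. The function name is illustrative only.

```python
def db_to_power_ratio(db: float) -> float:
    """Convert a gain in decibels to a linear power ratio."""
    return 10 ** (db / 10)

tx_gain_db = 6    # EIRP increase quoted in the entry
rx_gain_db = 14   # receive gain quoted in the entry

tx_ratio = db_to_power_ratio(tx_gain_db)
rx_ratio = db_to_power_ratio(rx_gain_db)

print(f"+{tx_gain_db} dB transmit = about {tx_ratio:.1f}x radiated power")
print(f"+{rx_gain_db} dB receive  = about {rx_ratio:.1f}x received signal power")
```

So 6 dB is roughly a fourfold increase in radiated power, and 14 dB is roughly a 25-fold increase on the receive side.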
06:48 Services restored, awaiting word from telstra.
06:03 Loss of connectivity between telstra alb1 and alb3
15371749 albury routers. Telstra have been advised and called back
to confirm their alb2 router is dead and are calling a
tech to their site immediately. No ETR at this stage.
Update: 06:35. Barry Keenan en route to fix, ETR 07:10
18:00 Loss of international connectivity. Traceroute stops dead
15336627 at pos2-0.ken-core4.sydney.telstra.net. Telstra "say" there
is no problem, but are getting someone to look and call back.
11:26 Power finally back in Wangaratta. Our server is back up, we
have no indication at this stage what the cause was, but it
is the 3rd time in a week!
07:52 Power fail in central Wangaratta, the whole CBD lost power at
about 7:32. Our UPS lasted until 7:52. It seems at least some
power was restored about 8:35, but our site is still down, we
are attempting to contact the site but so far no success.
16:15 Well, it's finally happened. I never wanted to do this, and I
have resisted for 8 years, but finally I've had to stick in a
filter for ICMP echo-request packets sent from outside our
network to our customers' dial-up IPs. Customers can
still ping out to addresses outside, and replies will get back
in, although people outside will be unable to ping you.
This is in direct response to the huge traffic overhead being
generated by the blaster worm and its variants.
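
The rule described above can be sketched as a simple decision function. This is an illustration of the logic only, not the actual router configuration: the prefixes are hypothetical (taken from the RFC 5737 documentation ranges) and the names OUR_NETWORK, DIALUP_POOL and permit_icmp are made up for the example.

```python
import ipaddress

# Hypothetical prefixes (RFC 5737 documentation ranges), not the real ones.
OUR_NETWORK = ipaddress.ip_network("192.0.2.0/24")
DIALUP_POOL = ipaddress.ip_network("192.0.2.128/25")

ICMP_ECHO_REQUEST = 8  # ICMP type for "ping" requests
ICMP_ECHO_REPLY = 0    # ICMP type for "ping" replies

def permit_icmp(src: str, dst: str, icmp_type: int) -> bool:
    """Return False only for echo-requests from outside aimed at dial-up IPs."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    from_outside = src_ip not in OUR_NETWORK
    to_dialup = dst_ip in DIALUP_POOL
    if icmp_type == ICMP_ECHO_REQUEST and from_outside and to_dialup:
        return False
    # Everything else passes, including echo-replies coming back in.
    return True

# Outside host pinging a dial-up customer: dropped.
print(permit_icmp("198.51.100.7", "192.0.2.200", ICMP_ECHO_REQUEST))
# Dial-up customer pinging out, and the reply coming back: both permitted.
print(permit_icmp("192.0.2.200", "198.51.100.7", ICMP_ECHO_REQUEST))
print(permit_icmp("198.51.100.7", "192.0.2.200", ICMP_ECHO_REPLY))
```

Matching on the ICMP type rather than blocking ICMP outright is what keeps outbound pings working, since the returning echo-reply is a different type from the blocked echo-request.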
04:25 Albury Annex has stopped responding.
08:51 Replaced with a newer unit with more memory.
13:15 Major power failure in Lavington. Power authority eventually
responded, only to tell us that a hot air balloon had crashed
into the powerlines and "most of" Lavington was out.
We're running on generator and UPS.
(Power restored about 14:00)
12:30 We are under a MASSIVE Denial of Service attack from what seems
initially to be thousands of addresses in the 220.127.116.11/8 address
block, apparently directed against our primary webserver.
Able to get 18.104.22.168/8 blocked at the upstream router.
DoS effectively mitigated shortly after 13:00
14:21 Added filter for port 135 on both TCP and UDP at the border router
in an attempt to slow down or stop spread of the "Blaster" worm.
18:17 Have replaced alb-anx1
14:00 Suspected fault with either Corowa or Indigo exchanges.
15022351 Have spoken to Bev Heiler from telstra faults, given all
relevant details. Seems to be only affecting clients in
Corowa, calling into our Indigo facilities.
09:35 Slow international connectivity again, reported to telstra.
15016555 Traceroute extract:
7 Pos2-0.ken-core4.Sydney.telstra.net (22.214.171.124) 42.517 ms
8 10GigabitEthernet3-0.pad-core4.Sydney.telstra.net (126.96.36.199) 42.025 ms
9 GigabitEthernet0-1.syd-core01.Sydney.net.reach.com (188.8.131.52) 42.401 ms
10 i-13-2.sjc-core01.net.reach.com (184.108.40.206) 729.563 ms
11 qwest.sjc-core01.net.reach.com (220.127.116.11) 1080.123 ms
12 svl-core-01.inet.qwest.net (18.104.22.168) 1169.474 ms
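
A rough way to pick out where the delay starts in an extract like the one above is to compare round-trip times between consecutive hops. The sketch below is one possible approach, assuming one RTT per line in the "hop host (address) time ms" layout shown; the helper names parse_hops and largest_jump are invented for the example, and the sample text is copied from the extract as it appears here.

```python
import re

# Matches lines such as: "10 host.example.net (address) 729.563 ms"
HOP_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+\(([^)]*)\)\s+([\d.]+)\s+ms")

def parse_hops(text):
    """Return a list of (hop_number, hostname, rtt_ms) tuples."""
    hops = []
    for line in text.splitlines():
        m = HOP_RE.match(line)
        if m:
            hops.append((int(m.group(1)), m.group(2), float(m.group(4))))
    return hops

def largest_jump(hops):
    """Return (from_hop, to_hop, delta_ms) for the biggest RTT increase."""
    best = None
    for prev, cur in zip(hops, hops[1:]):
        delta = cur[2] - prev[2]
        if best is None or delta > best[2]:
            best = (prev[0], cur[0], delta)
    return best

EXTRACT = """\
7 Pos2-0.ken-core4.Sydney.telstra.net (22.214.171.124) 42.517 ms
8 10GigabitEthernet3-0.pad-core4.Sydney.telstra.net (126.96.36.199) 42.025 ms
9 GigabitEthernet0-1.syd-core01.Sydney.net.reach.com (188.8.131.52) 42.401 ms
10 i-13-2.sjc-core01.net.reach.com (184.108.40.206) 729.563 ms
11 qwest.sjc-core01.net.reach.com (220.127.116.11) 1080.123 ms
12 svl-core-01.inet.qwest.net (18.104.22.168) 1169.474 ms
"""

hops = parse_hops(EXTRACT)
jump = largest_jump(hops)
print(f"Largest increase: hop {jump[0]} -> hop {jump[1]} (+{jump[2]:.1f} ms)")
```

On this extract the biggest step is between hops 9 and 10, i.e. on the Sydney to San Jose leg of the reach.com network. A single traceroute sample is only a hint, since routers may deprioritise replies to probes, so repeated runs give a more reliable picture.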
12:20 Upgrade progressed smoothly. Total downtime 4 minutes, although
we spent another 15 minutes tidying up cabling.
09:26 We are planning a short (10 minute) interruption to upgrade our
Albury Analogue terminal server. This will affect users calling
our facilities in Indigo, Gerogery, Bullioh, Yackandandah and
Wymah. This is scheduled to occur at midday.
18:00 Over the last two hours, have made significant re-arrangements of
equipment in the main computer room. Consolidation of modem pools
has let me put all remaining analogue services into a single modem
rack, now located in the primary communications rack.
17:30 After 10 days of continuous hassle, being bounced from department
to department, we spat the dummy with telstra and have had part 1
of the disconnection order completed. This has now seen virtually
all local analogue lines decommissioned, with a little more work
left to be done on intermediate distance analogue lines next week.
16:20 Telstra have now raised a line-level fault to try to track down and
14975849 the megalink problem. 3 flaps yesterday, 1 more today.
13:00 Several unexplained short-duration (1 second) link flaps on Megalink
14965835 to telstra. Reported again, but being intermittent it may take time.
02:05 Everything that will come back up is back and running. Zen is dead.
RIP Zen. Born Feb 1996, died Jun 7th, 2003. 7 yrs 4 months 24/7.
23:50 Power has been restored earlier than expected. Machines being brought
22:50 There goes the last of the UPSs. Site is "dark". Nothing to do but
wait for power to be restored. Generator to be turned into fishfood.
22:20 Power fail in Lavington, again. Country Energy advised that at least
one transformer in Dick Rd exploded, indications of feeders down.
Said power would be out all night, it was too dangerous to work in
the current weather conditions.
The generator, tested a week ago, picked tonight to decide to only
produce 100V AC output instead of 240. Snarl. Running on UPS for now.
23:40 Power restored, site returned to mains. The last of the equipment now
back on-line (office cameras and incidental equipment).
Casualties so far: One 3KVA UPS killed, replacement installed ok.
22:07 The Albury area is currently suffering extensive power failures
since a lightning strike at 21:37:50 tonight. We're running on
backup generators but no word on restoration times. Some equipment
may still be without power, we're still checking and restoring
systems that were lightning-affected.
15:56 International connectivity issues yet again.
14886314 Reported to telstra helpdesk; of course, they were unaware of it.
9 22.214.171.124 (126.96.36.199) 35.120 ms
10 i-2-0.wil-core02.net.reach.com (188.8.131.52) 40.436 ms
11 i-2-1.syd-core02.net.reach.com (184.108.40.206) 1838.137 ms
12 GigabitEthernet1-1.pad-core5.Sydney.telstra.net (220.127.116.11) 1850.077 ms
13:30 Excessive international latency. Reported to several telstra sources
14832507 including the ISP fault centre (hopeless gits!) and the old number we
are supposed to not use any more at least accepted the fault.
(Also e-mailed to the reach NOC)
8 dap-brdr-01.inet.qwest.net (18.104.22.168) 9.555 ms
9 22.214.171.124 (126.96.36.199) 35.648 ms
10 i-2-0.wil-core02.net.reach.com (188.8.131.52) 49.873 ms
11 i-4-1.syd-core02.net.reach.com (184.108.40.206) 1394.873 ms
12 GigabitEthernet1-2.pad-core5.Sydney.telstra.net (220.127.116.11) 1393.732 ms
Fault has existed since 11:45 and is still there at 14:10
23:00 Corryong power restored (finally).
19:00 Corryong power failure yet again (number 5 for the day?)
Batteries about to die, TXU are not expecting restoration
until at least 20:00 tonight, so have done a clean shutdown
while battery power remains.
17:00 Albury Annex died again. Tech recalled to site.
Removed one of the 36-port RS232 cards, found a loose screw in there!
Back on-line, with any sort of luck, a permanent cure.
18:06 Hells bells. After nearly 9 hours telstra have finally found and
fixed the fault. Ultimately, and after sending someone to Corryong,
they found a faulty connection back at the AXE exchange in Albury.
09:37 9:37am without any warning, there seems to be a major failure on the
14730652 OnRamp-30 service in Corryong. Telstra have been called, we are still
waiting on any update or ETR.
10:45 From approx 21:30 last night, there has been an excessive amount of
14710823 undesirable traffic on port 445. Yet another Windows XP exploit
(worm). Have installed a port block on ingress to help protect clients.
10:03 Shut down all Corryong servers after prolonged power fail.
Lost power several times from 7:29 onwards, finally lost
power totally at 8:33, ran on UPS and extended UPS supplies
until 10:03 at which point batteries were exhausted.
TXU advise power expected to be restored about 11:30am
16:35 Terrible USA performance again over telstra/reach networks.
14685631 Telstra called and advised, waiting on action.
From USA end, partial traceroute:
9 18.104.22.168 (22.214.171.124) 10.663 ms
10 i-2-0.wil-core02.net.reach.com (126.96.36.199) 41.741 ms
11 i-2-0.syd-core02.net.reach.com (188.8.131.52) 2161.470 ms
12 GigabitEthernet1-1.pad-core5.Sydney.telstra.net (184.108.40.206) 2164.570 ms
07:55 Loss of communications to primary mail and authentication servers.
Tech called to site, regained control of machine, were suffering
from a DoS attack from an ADSL user. Null-routed source, server
22:00 Major international delays on telstra links.
Seeing 40 ms to the Pacific, and 2200 ms to the other side.
Telstra faults unresponsive as usual, called William Jamin
direct and have forwarded traceroutes.
16:30 MAJOR microsux MS-SQL worm has hit world-wide, resulting in major
disruption to various parts of the internet. While we have a block
on incoming packets to this port, it won't help problems elsewhere
on the internet whose bandwidth is being consumed. Blame mickysoft
again, we have no idea just how widespread it is, or how long its
effects will linger. It is affecting virtually every ISP in every
11:00 MtBeauty server relocation completed. Had a UPS failure when we
restarted everything, which required a second shutdown. Resulted in
two shutdowns, one at 10am for 12 mins, one about half past for 5 min.
09:26 Server usa.albury.net.au has been taken off-line for necessary upgrades
and should be back in service within 30 minutes.
10:00 Essential site works being done at our MtBeauty POP require
interruption to service for up to 30 minutes, currently scheduled
to commence at 10:00am on Friday, Jan 17th.