The word "borken" is an actual typo, but since it sounded cool, I have decided to keep it. This blog will be mainly about problems (system administration, open source software, coding in C++, C#, Ruby on Rails, Java, etc.) and how I go about solving them. Hopefully it will be useful to someone out there too :)
Friday, February 29, 2008
Server DOWN!!!
Initially, the site had a problem with PHP connecting to the database. The error was that PHP could not connect because the file XXX.MYD could not be located. I told my friend (who owns the hosting infrastructure) that if the problem was serious enough, the site owner would call him directly. That was yesterday afternoon, and nothing much happened; they only traded a couple of emails. I offhandedly advised that it was most probably a corrupted database (the MYD file holds a MyISAM table's data, which I found out after some googling).
Well, nothing happened until evening. This morning, my friend told me that he was called and sent SMSes at 12am, 1am, 2am... well, you know the drill: the owner is really jumping and reality has finally set in for him (the first few hours are usually denial, then requests to reboot the machine, then testing, usually by vigorously pressing the "refresh" button on the browser, as if that will solve all the world's problems).
The web server is down.
Ditto to the email server.
Next came the threats to remove the server and host it elsewhere. Because if the server is hosted with you, then it's your duty to ensure that it's up and running, despite the fact that we do not have any access to it (no passwords).
A blunt analogy is this: if you really have cancer, no matter how many doctors you go to, you still have it. Changing doctors does not solve the problem.
After offering him my blunt opinion, I was told that I need to chill down and we are not pushing blame... What the ****???!!!
And so... the negotiation begins: to provide emergency server rescue, at what cost? Can we guarantee 100% recovery (if there's no backup, how can I guarantee all the data will be back)? He "confirm don't have the root password", so we have to reset it for him too.
Wow... a totally unmaintained server can be online for so long, too! I set the server up for the owner a couple of years back and left it to him as he wasn't interested in managed services. I can't believe it can survive so long in today's world! The server hardening I put it through was helpful, after all... :P hahaha!!!
So next, the final nego... and going down for a site visit and recovery effort estimation... meanwhile, I'm preparing a couple of LiveCDs (Knoppix, Ubuntu), installers (CentOS 5.1), System Rescue CD 0.4.3, and mentally prepping for the tasks ahead. Hmm... what else do I need to bring along? Once I'm there, it's in the middle of nowhere, with not much chance to come out and buy anything I missed.
Friday night seems to be burnt for small change... I don't mind if someone else can do it for cheap actually, I need my rest... :|
*zzz*
Friday, February 15, 2008
World Class infrastructure for a World Class Event?
This is the headline from a Straits Times article:
Website booted him out three times
British Airways pilot Benterman takes 10 hours to get tickets for F1's first night race
To make it a double whammy, the permanent resident found, to his horror, that the seats he had reserved were lost when he was booted out of the website.
'It was absolutely frustrating and a disgrace,' said the exasperated 39-year-old.
'I cannot accept not getting through to the website because it crashed. There is also no customer service number to call.'
Apparently, the website was supposed to be capable of handling 20,000 transactions per hour, and the actual traffic was way over what was expected.
Questions:
- Did anyone take the last F1 race's figures for a comparison and benchmarking?
- Was there a big change in the way the tickets are sold?
- Did the system incorporate proper transactions handling, queueing and all that?
- Was the system properly load tested before going live?
- Was someone even monitoring the system after it went live? Why was no action taken? (cf. MRT down, buses were deployed)
- Was there a contingency plan in place? (obviously not)
- Could the launch have been scheduled in phases? (online sales first, then outlet sales?)
- Were corners being cut in the system hardware so someone could save a few bucks? Or was the organizer scammed by vendors who gave 3rd class hardware for 1st class prices? (which is normal)
So, we'll see... :)
Monday, February 11, 2008
Understanding a geek :P
http://www.randsinrepose.com/archives/2007/11/11/the_nerd_handbook.html
In summary, these are the traits:
- Understand your nerd’s relation to the computer.
- Your nerd has control issues.
- Your nerd has built himself a cave.
- Your nerd loves toys and puzzles.
- Nerds are f**king funny.
- Your nerd has an amazing appetite for information.
- Your nerd has built an annoyingly efficient relevancy engine in his head.
- Your nerd might come off as not liking people.
The best part is this paragraph:
Understand your nerd’s relation to the computer. It’s clichéd, but a nerd is defined by his computer, and you need to understand why.
First, a majority of the folks on the planet either have no idea how a computer works or they look at it and think “it’s magic”. Nerds know how a computer works. They intimately know how a computer works. When you ask a nerd, “When I click this, it takes awhile for the thing to show up. Do you know what’s wrong?” they know what’s wrong. A nerd has a mental model of the hardware and the software in his head. While the rest of the world sees magic, your nerd knows how the magic works, he knows the magic is a long series of ones and zeros moving across your screen with impressive speed, and he knows how to make those bits move faster.
The nerd has based his career, maybe his life, on the computer, and as we’ll see, this intimate relationship has altered his view of the world. He sees the world as a system which, given enough time and effort, is completely knowable. This is a fragile illusion that your nerd has adopted, but it’s a pleasant one that gets your nerd through the day. When the illusion is broken, you are going to discover that…
Actually, I would prefer the word geek to nerd :P ... Nerd sounds too... nerdy :P
Read the blog to find out more! :D
Enjoy!
Can? How much? How fast?
"Hi, can you do a SQL/web/intranet/(fill in with appropriate IT word) program?"
"How much har?"
"When can finish?"
This is very common and I am always very cautious when dealing with such customers because:
- They typically (>90% of the time) do not know how much effort is involved (most likely they learned of this requirement from an in-house IT guru, who might or might not have any experience in IT)
- They also do not know what they really want (they are just relaying someone else's words)
- They only look at the cheapest quote
For customers whose only concern is cost, I would happily give them a miss as they are the "don't care, don't know, don't bother me unless it is delivered and working" type. These companies are the type that hobble along on a barely functional and often broken IT infrastructure, going from vendor to vendor whenever the system is down, because they only look at the cost.
Vendor after vendor applies various patches, workarounds and upgrades to the original system until it is barely recognizable or maintainable. Most, if not all, of the time such companies do not have any documentation of the system and the database, resulting in tremendous effort spent tracing through the system and trying to figure out what it is supposed to be doing. And yes, most of the time this work is being performed on production systems too.
Despite the claims by the press, internet and local authorities, the majority of SME owners are rather IT illiterate and clueless (this is based on my limited experience). Most of the time, cost is the only concern. Some of them are happy with a halfway broken system because of the "I know it's broken but I have a staff doing it, fixing it will cost a lot of money lehhh..." way of life. They will devote a staff or two, or even three, to perform some of the functions that the system should be performing alone if it was not broken.
Even worse are those who are semi IT-literate, certified as literate after attending a 3 day course in IT conducted by instructors who barely have any experience in real life IT operations. They are adamant that their "IT Way" is correct and you should not attempt to advise them because they know better.
Let me provide an analogy. Suppose you need to buy shoes. You can either choose to buy a cheap pair, or a slightly more expensive but durable one. So, would you rather buy a $30 pair of shoes that spoils every 3 months, or a $150 one that can last you at least a year or more? Some companies would avoid paying the $150 like the plague because the perceived cost is "high". That's sad, limited, and yet very real.
Therefore, in order to secure the job and to fix the problems properly, a lot of communication and persuasion is necessary. The customers must be convinced of the value of the solution, and invest their own time into it to help shape the final solution. IT is central to a lot of their operations and can be made to provide more assistance to their business, and yet it is given very little priority and investment.
Having said that, there are a lot of moonlighters out there who overpromise and underdeliver, causing this vicious cycle to continue. I have personally seen some local e-commerce sites with extremely poor exception handling in their purchase and payment code. Yes, they have the usual certificates, logos and seals, but the certification process does not include the testing or validation of the source code itself.
Shop on local websites? Err... maybe not yet... let someone else be the guinea pig for these eBay wannabes :)
Maybe I will get a chance to help fix these borken sites once they get complained about :P haha!
Tuesday, February 5, 2008
Torvalds on Microsoft's patent bluff
I told him that yes, there may be nice, friendly peeps there at MS, but there are also people who have nothing to do but spread FUD (Fear, Uncertainty and Doubt) and people who play games on both sides.
In this article published by Linuxworld, Torvalds remarked that:
"I think there are people inside Microsoft who really want to improve interoperability and I also think there are people inside Microsoft who would much rather just try to stab their competition in the back," he said. "I think the latter class of people have usually been the one[s] who won out in the end, but -- so I wouldn't exactly trust them."
Me neither. Keep your friends close, but your enemies closer :P
NYT article on the cut cables
I... uh... guess the seas here are safer? :D
Telecommunications operators have been trying to diversify the routes used for transmissions, said Alan Mauldin, research director with TeleGeography Research, particularly since an earthquake in Taiwan in 2006 disrupted service in Asia.
The cable network contains “choke points” — like those off the coast of Egypt and Singapore where many cables run, Mr. Mauldin said.
Monday, February 4, 2008
a Ruby on Rails query with the LIKE syntax in :conditions
In most Rails apps, you would do either a simple .find(:all) or .find(id) or .find(params[:id]). I needed to query my database based on a simple condition, matching all values in a column that start with a particular letter.
select * from mytable where username LIKE 'a%';
(would return all usernames that start with a, like adrian, avin, etc).
To do that in RoR, I needed to add a :conditions option to the default query.
However, writing it this way didn't help:
:conditions => ['username LIKE ?%', params[:uname]],
The sanitizer quotes the bound value and converts that to username LIKE 'a'%, which gives an SQL exception.
After searching the web, scanning dozens of sites, including the rails API, the RoR forum, I accidentally stumbled upon the solution in the comments section of a blog (I can't remember where it was now, sorry).
I subsequently modified the query, and tested it, to be:
@results = User.find(:all, :conditions => ['username LIKE ?', params[:uname]+'%' ])
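One thing the query above does not handle is a prefix that itself contains % or _, which LIKE treats as wildcards. Below is a minimal sketch in plain Ruby of how the conditions array could be built with those characters escaped. The helper name prefix_like_conditions is hypothetical (it is not part of Rails), and the backslash escaping assumes MySQL's default LIKE escape character; other databases may need an explicit ESCAPE clause.

```ruby
# Hypothetical helper (not part of Rails): builds a :conditions array for a
# "starts with" match, escaping LIKE wildcards in the user-supplied prefix.
# Assumption: backslash is the LIKE escape character (the MySQL default).
# The column name is interpolated as-is, so it must never come from user input.
def prefix_like_conditions(column, prefix)
  # Put a backslash in front of each backslash, % and _ in the prefix.
  escaped = prefix.to_s.gsub(/[\\%_]/) { |ch| "\\#{ch}" }
  # The trailing % is the only wildcard left; it is part of the bound value,
  # so the sanitizer quotes the whole pattern as one string.
  ["#{column} LIKE ?", "#{escaped}%"]
end

conditions = prefix_like_conditions('username', 'a')
p conditions  # => ["username LIKE ?", "a%"]
```

Assuming the same User model as above, it would be used as User.find(:all, :conditions => prefix_like_conditions('username', params[:uname])).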
Sorry if this was an obvious thing to a lot of other people... :P
Friday, February 1, 2008
Elsewhere: the Internet is borken too
Internet outage hits business from Cairo to Colombo
There is another article here by Business Week:
Posted: 31 January 2008 2208 hrs
CAIRO: Damage to undersea Internet cables hit business across the Middle East and South Asia on Thursday, including the vital call centre industry, prompting calls for people to limit their surfing.
Around 70 per cent of Internet users in Egypt have been affected since two submarine cables in the Mediterranean Sea were damaged on Wednesday, also rupturing connections thousands of kilometres (miles) away.
...
India's Internet-dependent outsourcing industry was also severely disrupted, with businesses saying it may take up to 15 days to return to normal.
Damage to the Flag Europe-Asia and the SeaMeWe-4 cables have left only the older SeaMeWe-3 system to provide service between Europe and the Middle East, research firm TeleGeography said.
The two cables, with 620 Gbps in capacity, are the prime direct links between Europe, the Middle East and south Asia.
This looks very serious and the cause is as yet unknown. Imagine all the businesses relying on those 2 cables losing all access to free/cheap overseas calls and internet (email, websites, ecommerce, etc)!
I wonder what is causing the disruption. Underwater volcano eruptions due to continental movement or... playful whales? :P