Burned by filled up partitions…again.
May 22nd, 2006 - maxwell
Mondays. Everyone hates Mondays. I arrived at the office today and had my usual coffee…you have to have coffee before you even think about working, right? I started working on a Nagios project that I’ve been doing lately when a user contacted me with some e-mail issues. Given the particular e-mail errors the user was receiving, I just attributed it to either a misinterpretation by the third-party vendor software or a glitch in the e-mail (as my boss says, “A disturbance in the force.”).
I continued working on my Nagios project until I checked my own e-mail and found some very bizarre things. All the e-mails were being sent, but they had their subject headers stripped and their content ripped out. Shite, I think! Well, I start with the usual…check the logs, dummy. Hmmm, strange, nothing in any of the logs about any errors. I did a little experimenting with sending e-mail and got the same thing…the e-mails sent, but with the subject headers stripped and the content removed. I immediately restarted postfix, thinking perhaps something screwy had happened, to no avail. I then stopped postfix to think a bit; with postfix stopped, e-mail would queue up but not actually be processed.
Again I went back through all the various log files, pulling my hair out wondering what the strangeness was all about, to find absolutely nothing. Normally, I am fairly good at figuring out just about anything that is wrong given enough time. Well, seeing as this was our company’s e-mail (we absolutely have to have e-mail up and running), I decided to call the Alpha-Geek in.
Not realizing a ton of other services were also associated with e-mail, such as postgrey, dspam, and a few others, I headed down to the machine room to the rack. The phone rang, and it was my boss, saying never mind…
“You dumb ass!” I immediately thought after he told me the issue. The partition that our anti-virus software lives on had filled up. Damn it, damn it, damn it, I should have known better, especially since just a few weeks ago I had a drive fill up on a personal machine. The reason I didn’t even think about it, however, was that I never got any actual error messages. The drive that filled up weeks ago actually threw some crazy-ass errors to help point me in the direction of “hey, your drive is full.”
Now, to top it off: normally my network monitoring program would have caught a problem like this, and it would have already paged the hell out of me to make me aware. Well…wouldn’t you know it, I hadn’t set up this particular server to be watched for disk space. There’s always a little crack that something will inevitably find, isn’t there? I hadn’t set it up because, way back when, I wasn’t good at SNMP, and the setup required a bit of SNMP knowledge. Since that time, however, I’ve grown very comfortable with SNMP, but I had just forgotten to audit the services I was checking. Well, guess what, that’s done now!
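A disk space check like the one that was missing here does not have to be fancy. The following is only a rough sketch in Python, not the SNMP-based setup described above: the function names, the 80/90 percent thresholds, and the script name in the comment are all hypothetical. The one thing borrowed from the real world is the Nagios plugin convention that exit code 0 means OK, 1 means WARNING, and 2 means CRITICAL.

```python
import shutil

# Nagios plugin convention: exit code 0 = OK, 1 = WARNING, 2 = CRITICAL.
OK, WARNING, CRITICAL = 0, 1, 2
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def classify_usage(used_pct, warn_pct=80.0, crit_pct=90.0):
    """Map a used-space percentage to a Nagios-style status code.

    The default thresholds are illustrative assumptions only.
    """
    if used_pct >= crit_pct:
        return CRITICAL
    if used_pct >= warn_pct:
        return WARNING
    return OK


def check_path(path, warn_pct=80.0, crit_pct=90.0):
    """Return (status, message) for the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    # Guard against pseudo-filesystems that report a total size of zero.
    used_pct = 100.0 * usage.used / usage.total if usage.total else 0.0
    status = classify_usage(used_pct, warn_pct, crit_pct)
    return status, f"DISK {LABELS[status]} - {path} is {used_pct:.1f}% full"


def main(argv):
    """Check the path given as the first argument (default "/")."""
    target = argv[1] if len(argv) > 1 else "/"
    status, message = check_path(target)
    print(message)
    return status

# A real plugin (say, a hypothetical check_disk.py) would finish with:
#     import sys; sys.exit(main(sys.argv))
```

Hooked into a monitoring system, a script along these lines would have turned a silently full anti-virus partition into a page long before the e-mail started coming through blank.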
Anyways, back to the point. I was under the impression that if drives fill up, you always get some crazy errors in the logs to at least help you figure out the problem. My mistake, and a lesson that I won’t be burned by thrice. When you have a problem and some application isn’t doing what it’s supposed to be doing, without outputting any good, helpful error messages, the easiest thing to do is simply run the Linux command df -h. This prints each mounted filesystem along with its size, used and available space, and mount point, in human-readable units. Had I done this, I wouldn’t have needed to call the Alpha-Geek in. Ahhh, the joys of learning to be a network system administrator, but it won’t happen again.
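For the times when the same quick look is wanted from inside a script rather than at a shell prompt, here is a minimal Python sketch of a df -h style report. It is an illustration, not a replacement for df: the helper names and the path list are made up, and because shutil.disk_usage counts reserved blocks a little differently, the numbers may not match df exactly.

```python
import shutil


def human_size(num_bytes):
    """Format a byte count with binary units, roughly in the style of df -h."""
    size = float(num_bytes)
    for unit in ("B", "K", "M", "G", "T"):
        if size < 1024.0:
            return f"{size:.1f}{unit}"
        size /= 1024.0
    return f"{size:.1f}P"


def usage_report(paths):
    """Return one summary line per path: size, used, free, and percent used."""
    lines = []
    for path in paths:
        usage = shutil.disk_usage(path)
        pct = 100.0 * usage.used / usage.total if usage.total else 0.0
        lines.append(
            f"{path}: size {human_size(usage.total)}, "
            f"used {human_size(usage.used)}, "
            f"free {human_size(usage.free)} ({pct:.0f}% used)"
        )
    return lines


# Hypothetical path list; on a mail server you would add the mail spool,
# the log partition, and wherever the anti-virus software keeps its files.
for line in usage_report(["/"]):
    print(line)
```

Each path is resolved to the filesystem that contains it, so listing one directory per partition is enough to cover the whole machine.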





