Published on ONLamp.com (http://www.onlamp.com/)


Big Scary Daemons

Dealing with Full Disks

09/27/2001

So, your daily status mail shows that your partitions are getting full. (You do read your daily status mail, right? Of course you do.) While various desktop environments have nifty point-and-click interfaces that show you exactly where your disk space went, they don't help much when your GUI-less server starts having trouble. We're going to look at some basic disk-measuring tools, with the goal of finding those missing few gigabytes of space.

First off, you need an overview of how much space each partition has left. df(1) is our best tool for that. The output of a vanilla df command isn't that easy to read, however. When hard disks topped out at 10 MB or 40 MB, it wasn't so bad, but when a disk can easily hit 100 GB, you can go cross-eyed shifting decimal points. The -h and -H flags both tell df to generate human-readable output. The lowercase h uses base 2, counting a kilobyte as 1,024 bytes, while the uppercase H uses base 10 for a 1,000-byte kilobyte. Most FreeBSD tools do not give you the option to use base 10; base 2 is undoubtedly more correct in the computer world, so we'll use it in our examples.
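The difference is easy to see by running both flags against the same filesystem. These are ordinary df invocations; the sizes reported will of course vary from system to system:

```shell
# Base-2 units: each "G" is 1,073,741,824 bytes.
df -h /
# Base-10 units: the same partition shows slightly larger numbers,
# because each "G" here is only 1,000,000,000 bytes.
df -H /
```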

We should also check the available inodes on a partition. Having lots of disk space is utterly moot if you run out of inodes and cannot create any more files! The -i option gives us that information.

So, the current disk usage is:

# df -hi
Filesystem  Size Used Avail Capacity iused   ifree %iused Mounted on
/dev/ad0s1a  97M  43M   46M    48%    1368   23718    5%   /
/dev/ad0s1f 4.9G 2.7G  1.8G    60%  184468 1117034   14%   /usr
/dev/ad0s1e 194M  12M  166M     7%     794   49380    2%   /var
procfs      4.0K 4.0K    0B   100%      41    1003    4%   /proc
#

This would be plenty, if I didn't need to copy a 2-GB file onto the laptop. Not long ago, a 2-GB hard drive was more than adequate. Today, some large commercial software packages come as 2-GB tarballs. I have almost enough space.

The biggest problem is discovering where bloat lives. If your systems are like mine, disk usage somehow keeps growing for no apparent reason. You can use ls -l to identify individual large files on a directory-by-directory basis, but doing this on every directory in the system is impractical. The actual decision on what to keep and what to delete is highly personal, but there are more sophisticated tools to help you identify your biggest programs and directories.
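du, which we'll meet in a moment, works directory by directory; when you suspect a handful of enormous individual files instead, find(1) can sweep a whole tree in one pass. This sketch isn't from the column itself, and the /usr starting point and 10-MB threshold are arbitrary examples:

```shell
# Print every regular file larger than roughly 10 MB under /usr.
# -size +10240k means "more than 10,240 kilobytes"; complaints
# about unreadable directories are discarded.
find /usr -type f -size +10240k 2>/dev/null
```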

Your first tool is du(1), which displays disk usage. Its initial output is intimidating, however, and can scare off a new system administrator.

# cd $HOME
# du
1       ./bin/RCS
21459   ./bin/wp/shbin10
5786    ./bin/wp/shlib10/fonts
13011   ./bin/wp/shlib10
19      ./bin/wp/wpbin/.wprc
7922    ./bin/wp/wpbin
2       ./bin/wp/wpexpdocs
1       ./bin/wp/wpgraphics
2       ./bin/wp/wplearn
10123   ./bin/wp/wplib
673     ./bin/wp/wpmacros/us
681     ./bin/wp/wpmacros
53202   ./bin/wp
53336   ./bin
5       ./.kde/share/applnk/staroffice_52
6       ./.kde/share/applnk
...

This goes on and on, displaying every subdirectory and giving its size in blocks. On my system, $BLOCKSIZE is set to k, so these are 1 KB blocks. A total is given for each subdirectory -- for example, the contents of $HOME/bin total 53,336 KB, or roughly 52 MB, and the $HOME/bin/wp directory accounts for 53,202 KB of that. I could sit and let du list every directory and subdirectory, but then I'd have to dig through far more information than I really want. And blocks aren't that convenient a measurement, especially when they're printed left-justified. Let's clean this up. First, du supports a -h flag much like df's.

# du -h
1.0K    ./bin/RCS
 21M    ./bin/wp/shbin10
5.7M    ./bin/wp/shlib10/fonts
 13M    ./bin/wp/shlib10
 19K    ./bin/wp/wpbin/.wprc
7.7M    ./bin/wp/wpbin
2.0K    ./bin/wp/wpexpdocs
1.0K    ./bin/wp/wpgraphics
2.0K    ./bin/wp/wplearn
9.9M    ./bin/wp/wplib
673K    ./bin/wp/wpmacros/us
681K    ./bin/wp/wpmacros
 52M    ./bin/wp
 52M    ./bin
5.0K    ./.kde/share/applnk/staroffice_52
...

This is a little better, but I don't need to see the contents of each subdirectory; a total for everything in the current directory would be nice. We can control how many levels deep du descends in its display with the -d flag, which takes one argument: the number of directory levels to show. A depth of 0 gives you just a subtotal for the current directory.

# du -h -d0 $HOME
1.0G    /home/mwlucas
#

I have a GB in my home directory? Let's look a layer deeper and see where the heck it is.

# du -h -d 1
 52M    ./bin
1.4M    ./.kde
 24K    ./pr
 40K    ./.ssh
2.0K    ./.cvsup
812M    ./mp3
1.0K    ./.links
5.0K    ./.moonshine
...

The big offender here is the mp3 directory. Oh. Ahem, well, that can be copied to another machine if I must. This is a good opportunity to clean up my home directory anyway. I tried KDE for a week, and still hated it, so .kde can go. So can .moonshine and related stuff. When I'm done, the home directory is down about 200 KB. Much better.
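When a directory has dozens of children instead of a handful, sorting the output makes the offenders jump out. sort(1) doesn't understand the human-readable suffixes, so fall back to raw kilobyte counts (a sketch using the same du depth flag as above):

```shell
# Rank first-level subdirectories by size, biggest first.
# du -k reports 1 KB blocks so that sort -rn can order them numerically.
du -k -d 1 | sort -rn | head
```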

Now let's look at the main /usr and /var directories to see if anything unusually large is lurking there.

# cd /usr
# du -h -d1
 11M    ./bin
7.5M    ./include
 34M    ./lib
9.6M    ./libdata
 15M    ./libexec
571M    ./local
6.3M    ./sbin
 39M    ./share
289M    ./src
119M    ./ports
 57M    ./compat
1.5M    ./games
323M    ./obj
1.0K    ./tmp
234M    ./X11R6
1004M   ./home
 11M    ./sup
 36M    ./doc
2.7G    .
#

This output is pretty normal. There's 323 MB of stuff in /usr/obj that I can blow away easily enough, to gain another third of a GB. Just for reference, I'm attaching the output of a fairly empty /var filesystem. Depending on the purpose of your system, different /var directories can grow considerably. There's a surprisingly small amount of stuff in /var, however.

# du -h -d1
1.0K    ./account
3.0K    ./at
8.0K    ./backups
2.0K    ./crash
2.0K    ./cron
4.3M    ./db
434K    ./log
7.3M    ./mail
2.0K    ./msgs
1.0K    ./preserve
 54K    ./run
1.0K    ./rwho
 18K    ./spool
10.0K   ./tmp
 20K    ./yp
 62K    ./games
2.0K    ./lib
4.0K    ./ucd-snmp
1.0K    ./heimdal
 12M    .
#

The next time you find /var filling up, you can compare your directory structure to what you have here and at least have a good idea of what is normal on a small system.

I could use du(1) to browse through the entire filesystem and see where the main bloat is. The biggest causes of bloat in the rest of the system are installed software and user data. In the example above, /usr/local consumes over half a gigabyte. Deleting user data is not usually a good idea, but you can track down large packages easily enough with the -s flag to pkg_info(1).

# cd /var/db/pkg
# pkg_info -s *
Information for Hermes-1.3.2:

Package Size:
449     (1K-blocks)

Information for Mesa-3.4.1:

Package Size:
2507    (1K-blocks)
...

This can create huge amounts of output if your system has many packages installed. For example, my laptop has 134 of them. Scan through this looking for large packages.

Information for emacs-20.7:

Package Size:
43800   (1K-blocks)

Emacs is 43 MB? Yeah, yeah, I know, use vi.
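Rather than eyeballing a hundred-odd entries, you can let awk(1) pair each package name with its size and sort the result. This is a sketch that assumes the output format shown above, with the name on the "Information for" line and the size on the "(1K-blocks)" line:

```shell
# Rank installed packages by size, largest first.
# Sketch: assumes the pkg_info -s output format shown above.
( cd /var/db/pkg && pkg_info -s * ) > /tmp/pkgsizes 2>/dev/null || true
awk '
    /^Information for/ { pkg = $3; sub(/:$/, "", pkg) }
    /1K-blocks/        { printf "%8d  %s\n", $1, pkg }
' /tmp/pkgsizes | sort -rn | head
```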

While many of the ports are necessary, I find quite a few that aren't vital or that I can easily reinstall. For example, there's 100 MB of teTeX installed as a dependency of /usr/ports/textproc/docproj. That's simple enough to replace from a recent FreeBSD release -- teTeX does not change quickly enough to require the freshest possible build.

By removing teTeX and the contents of /usr/obj, I get enough space to copy this huge file to my laptop. du and pkg_info gave me the information I needed to choose files to delete safely -- without having to touch the user data in $HOME/mp3. Data is what's important, after all.

Michael W. Lucas



Copyright © 2009 O'Reilly Media, Inc.