TuxOnIce HOWTO
Florent Chabaud, Nigel Cunningham, Bernard Blackham

This HOWTO documents how to hibernate Linux kernels to disk using the TuxOnIce kernel patch and various scripts. The source SGML file for this HOWTO is available here. Please make sure all corrections are based upon this and sent to Bernard.
______________________________________________________________________

Table of Contents

1. Introduction
   1.1 What is TuxOnIce?
   1.2 Copyright (skip this if you're in a hurry)
   1.3 Why was this document written?

2. Overview
   2.1 Kernel support
      2.1.1 Kernel 2.2.X
      2.1.2 Kernel 2.4.x
      2.1.3 Kernel 2.6.x
   2.2 Requirements
      2.2.1 Requirements for the swapwriter
      2.2.2 Requirements for the filewriter
   2.3 Compiling the kernel
   2.4 Patching the kernel
      2.4.1 Initrd/Initramfs's with TuxOnIce
   2.5 Installing the new kernel

3. Testing hibernation
   3.1 Installing the hibernate script
   3.2 Hibernating in text mode
   3.3 Hibernating with X running

4. Tweaking
   4.1 Hibernate script configuration
      4.1.1 SwitchToTextMode
      4.1.2 UseDummyXServer
      4.1.3 Services (RestartServices/StopServices/StartServices)
      4.1.4 Modules (UnloadModules/UnloadAllModules/LoadModules)
      4.1.5 Unmounting Filesystems (Unmount)
      4.1.6 Network interfaces (DownInterfaces/UpInterfaces)
      4.1.7 Check for incompatible programs (IncompatiblePrograms)
      4.1.8 Save clock on hibernate (SaveClock)
   4.2 Enabling debugging and tuning the system (the sysfs interface)
      4.2.1 sysfs entry descriptions
      4.2.2 Obtaining debugging information
   4.3 TuxOnIce hotkeys
   4.4 Avoiding data loss
      4.4.1 Booting non-TuxOnIce aware kernels
      4.4.2 Windows touches your FAT partitions whilst Linux is hibernated
      4.4.3 An initrd mounts your filesystem before resuming

5. Troubleshooting
   5.1 It panics!
   5.2 It hangs!
   5.3 Hibernating in a minimalist environment
      5.3.1 Single user mode (init S)
      5.3.2 init=/bin/sh
   5.4 Asking for help

6. How it works

7. Other configurations
   7.1 TuxOnIce compiled as modules (after 2.2.9)
   7.2 Using a swapfile
      7.2.1 Superfluous details
   7.3 Swap on LVM/dm-crypt
   7.4 Using an initrd/initramfs but TuxOnIce compiled in
   7.5 Keep image mode
______________________________________________________________________

1. Introduction

1.1. What is TuxOnIce?

TuxOnIce is a patch which enhances the ability of Linux to take a snapshot of the state of a computer's memory, save it to disk, and restore that image later. This allows you to fully power down a computer at an arbitrary point in time without needing to reload programs and reopen documents on power-up.

In modern laptops, hibernation is not done by the BIOS. They use ACPI and ask the operating system to do most of the job. If standby mode (suspend to RAM) does not suit you because you leave your computer suspended for too long (or because you're not using a 2.5 or later kernel and therefore lack suspend-to-RAM support), then you have to hibernate.

1.2. Copyright (skip this if you're in a hurry)

This document is an update by Bernard Blackham and Nigel Cunningham of the original howto, Copyright (c) 2002 by Florent Chabaud. It is currently maintained by Bernard Blackham.

You may freely copy and distribute (sell or give away) this document in any format. Send any corrections and comments to the document maintainer. You may create a derivative work and distribute it provided that:

1. If it's not a translation, email a copy of your derivative work to the author(s) and maintainer (could be the same person).

2. License the derivative work in the spirit of this license or use GPL. Include a copyright notice and at least a pointer to the license used.

3. Give due credit to previous authors and major contributors.

1.3. Why was this document written?

TuxOnIce (previously known as Suspend2) has a long history, but for a long time it was considered alpha.
This is mainly because hibernation has to deal with many aspects of the kernel. It is therefore difficult to keep up with kernel development (for instance, the transition from 2.2 to 2.4 was not straightforward). The 2.6 patches are now considered stable.

Installing, configuring and using hibernation support is not straightforward, since it involves recompiling the kernel and setting some boot parameters.

TuxOnIce's website is at with mirrors around the globe. You can download scripts and updated patches at . You may want to peruse the FAQ also.

2. Overview

2.1. Kernel support

2.1.1. Kernel 2.2.X

Software Suspend was first developed by Gabor Kuti, who used to maintain a page for his patch, which is mirrored here for historical reference only. A patch named v7c is available against the 2.2.20 kernel. Since this patch is no longer maintained, this document will not provide much information on this version.

2.1.2. Kernel 2.4.x

Much of the development of hibernation support that has resulted in TuxOnIce was done on the 2.4 kernels, so historic patches can be found for that series. Patches for the 2.4 kernels are no longer maintained, however. All development is now focused on the 2.6 line.

2.1.3. Kernel 2.6.x

An early version of hibernation support was included directly in the main development kernel tree in 2.5.18. It was neglected for a long time, but then began to see further development in 2.6. In the meantime, Nigel Cunningham continued to work on the out-of-tree patch, adding many more features. This out-of-tree patch is what was known as Suspend2, and is now called TuxOnIce.

2.2. Requirements

TuxOnIce should run on any recent kernel. Like its in-kernel sibling, it supports 32 and 64 bit kernels, single or multi-processor systems and both PC and PPC architectures.

You will also need to decide where you want to store your hibernation image. There are currently two choices - on a swap partition, with the swapwriter, or on a normal partition, with the filewriter.
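
When weighing up the swapwriter, it helps to know how the machine's free swap compares with its physical RAM, since the rule of thumb given in the next section is based on exactly that comparison. The following is a minimal sketch, not part of TuxOnIce or the hibernate script: the function names meminfo_kb and swap_covers_ram are made up for this example, and it assumes a /proc/meminfo with the usual MemTotal and SwapFree fields. It ignores compression and your normal swap usage.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only (not shipped with TuxOnIce).

# Print the value (in kB) of one field from a meminfo-style file.
# $1 = field name (e.g. MemTotal), $2 = file (defaults to /proc/meminfo)
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Succeed when free swap ($2, in kB) is at least as large as RAM ($1, in kB).
swap_covers_ram() {
    [ "$2" -ge "$1" ]
}

# Only report when /proc/meminfo is actually readable on this system.
if [ -r /proc/meminfo ]; then
    ram=$(meminfo_kb MemTotal)
    swap=$(meminfo_kb SwapFree)
    if swap_covers_ram "$ram" "$swap"; then
        echo "free swap (${swap} kB) >= RAM (${ram} kB)"
    else
        echo "free swap (${swap} kB) < RAM (${ram} kB)"
    fi
fi
```

If the second message is printed, either more swap or the filewriter may be the more comfortable choice.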
2.2.1. Requirements for the swapwriter

The swapwriter supports writing your image to the available swap space on your machine. As a general rule, you should have as much spare swap space available as you have physical RAM (where spare swap does not include your typical swap usage). You can get away with a little less if you are using LZF compression - this usually attains around 30-50% compression, depending on the contents of RAM. If there isn't sufficient space, TuxOnIce will attempt to free caches until the image fits, and if that isn't possible, it will abort gracefully.

2.2.2. Requirements for the filewriter

The filewriter can write your image to any file on your hard disk. The file must already exist at the size that will be required (the hibernate script can create it for you). As for the size of the file, the same rules apply as for the swapwriter, except that you do not need to take normal swap usage into account. The file must contain the string "TuxOnIce" at the beginning - the hibernate script will do this for you when you set the option "FilewriterLocation /suspend_file 1000" where the 1000 is the size (in megabytes) of the hibernation file to create. It will create the file the first time you hibernate (which takes a rather long time, so do not be alarmed).

The file can also be a block device such as /dev/hdaN, in which case the entire raw device is used. This is only recommended for people who know what they are doing. It must still contain the "TuxOnIce" signature string at the beginning of the device.

2.3. Compiling the kernel

Compiling a kernel is not too hard but may be a little daunting for those doing it for the first time - the Kernel-HOWTO is a good reference for doing so. Be sure to save your running kernel as something like /boot/vmlinuz.old and add an entry in your lilo.conf file or GRUB's menu.lst so that you can still boot it.

You need to choose the kernel source tree you want to use as a basis.
The latest list of supported kernels is on the downloads page. You will need to download the tarball specific to your kernel version. Generally, the 2.6.X.Y kernels won't differ significantly from 2.6.X, so the same TuxOnIce patch can be used.

Before applying any patches, you should compile the original kernel and set all configuration options so that your computer works correctly. Save the configuration in /usr/src/myconfig or something like that. The installation instructions for a kernel, including LILO/GRUB setup, can be found in the Kernel-HOWTO.

2.4. Patching the kernel

Once you have succeeded in compiling your kernel, you can apply the TuxOnIce patches against your source tree. This is most easily done using a command along the lines of...

     cd /usr/src/linux       (the root directory of your tree)
     bzcat /path/to/patch | patch -p1

If the kernel versions match, then no rejects or offsets should occur. If your kernel is not the vanilla one from kernel.org, you will have to apply these patches manually and edit some of the rejected hunks.

Load your previously saved configuration settings and run make menuconfig (or equivalent) again. In the "Power management options" submenu, you will find a 'Hibernation' option. This is the generic, in-kernel option. Since TuxOnIce shares code with it, this option needs to be enabled. Having selected this option, a new sub-option will appear: "Enhanced Hibernation (TuxOnIce)", with its own submenu.

TuxOnIce can be built either directly into the kernel or as loadable modules. If the modular option is used, you will need to use an initrd or initramfs and ensure that it loads the modules and attempts to resume prior to mounting any device-backed filesystems (i.e. hard disk filesystems). Instructions on doing this are below.

If the 'Hibernation' option isn't available, this is probably because you haven't selected CPU hotplugging on an SMP system.
Although TuxOnIce can work without swap support, the generic option requires swap support, so a lack of swap support can also make the option disappear.

In the TuxOnIce submenu, you will need to select support for allocating storage from swapspace or a generic file. You will probably also want compression support (unless your hard drive is significantly faster than your processor(s)), and support for the userspace user interface ('userui' - nice interface).

Keep Image mode is primarily intended for kiosks. It allows you to use the same image again and again, and is therefore suited to configurations in which no writeable filesystems are mounted, or in which writeable filesystems are mounted post-resume and unmounted before powering down. (In this mode, after an image is stored, an attempt to hibernate results in a simple poweroff).

Replacing swsusp (the generic implementation) by default is a good idea if you previously had a working swsusp configuration, and just want to switch to TuxOnIce. After compiling and installing the new kernel, everything should 'just work'.

Cluster support is support for hibernating and resuming a number of networked computers in a synchronised fashion. At the time of writing (late November 2007), this support is incomplete.

Checksumming pageset 2: A TuxOnIce image is stored in two parts, and is not strictly an atomic copy. To ensure that we do get the equivalent of an atomic copy, you can enable checksumming of the first (and largest) part of the image. Checksums will be calculated prior to starting to save the pages in this image, and then again after stopping all other activity but prior to doing the atomic copy of remaining pages. If any pages have changed, they can be resaved in the atomic copy, or the cycle can be aborted. Most of the time, checksumming should not be needed. The debugging information available post-resume will tell you how many pages (if any) needed to be resaved.

2.4.1. Initrd/Initramfs's with TuxOnIce

Using an initrd/initramfs with TuxOnIce is possible, but you will have to trigger a resume yourself in your linuxrc/init. To do this, you MUST either modify your distro's initrd/ramfs generation routines or edit your linuxrc (or init) script yourself to contain the line

     echo 1 > /sys/power/tuxonice/do_resume

Either way, this call must come BEFORE your initrd/ramfs mounts your filesystem. If the line is missing, your system will not resume. If the line comes after mounting filesystems, you will most likely suffer from filesystem corruption. You have been warned.

This will work with most initrd setups produced by the various mkinitrd's. If, however, you use a kernel command line containing something like

     root=/dev/ram0 init=/linuxrc

as is described in linux/Documentation/initrd.txt, it will not work. You're probably better off porting your setup to use an initramfs instead.

You can now recompile your kernel with TuxOnIce...

2.5. Installing the new kernel

You must set the resume= kernel option to the swap partition or file location you want to use for hibernating.

For swap storage, if your swap partition is, for instance, the third primary partition on the first IDE disk (/dev/hda3), you have to append "resume=swap:/dev/hda3" as a kernel parameter in your lilo.conf file or GRUB's menu.lst. Note that this value only determines which swap header is used to indicate whether we have an image, and where it is. TuxOnIce can and will use any storage that is swapon'd at the time you start to hibernate. All of these devices need to be accessible when we come to resume. If you have encrypted swap or are using the device mapper support, all of your swap needs to be configured prior to the echo > do_resume in your initrd/ramfs. It should not be swapon'd, and filesystems on which swap files are located must not be mounted, but the steps prior to doing a mount or swapon should be completed.
(This accessibility requirement applies to the file allocator, too).

For the file allocator, you should first prepare your hibernation file - this can be done by configuring your hibernate.conf file with the "FilewriterLocation" option, and running hibernate --no-suspend. Then take a look in /sys/power/tuxonice/resume for what to pass to your kernel. You should see something like file:/dev/hda7:0x10011f, in which case you should append "resume=file:/dev/hda7:0x10011f" as a kernel parameter in your lilo.conf file or GRUB's menu.lst.

IMPORTANT: make this entry the default entry in your lilo.conf file. Otherwise, it is likely that you will boot a kernel without hibernation support on top of a resume image, which may lead to unpredictable results, including severely damaging your filesystem.

It may be useful to have another boot option with the kernel parameters "resume=swap:/dev/hda3 noresume" instead. This will allow you to skip resuming and boot normally. Once again, it is also advisable to keep an original version of the kernel in your boot options.

When you're happy with your file, don't forget to run lilo (if you use LILO) and reboot into the hibernation-enabled kernel. It should boot and run the same way as your original kernel. The only difference you should see immediately is that the boot messages (dmesg | less) should contain a note that TuxOnIce has started (e.g. "TuxOnIce 3.0-rc2: Normal swapspace found.").

3. Testing hibernation

3.1. Installing the hibernate script

The best way to use TuxOnIce is to use the hibernate script. This script should be available on the same site as the TuxOnIce patch. Download it to /tmp and type (replace 0.99 with the version you have downloaded):

     # cd /tmp
     # tar xzvf hibernate-script-0.99.tar.gz
     # cd hibernate-script-0.99
     # ./install.sh

This will install the script, a default configuration file, and some man pages.
You may wish to edit /etc/hibernate/hibernate.conf to taste, depending on how much of your hardware hibernates correctly, and how many little features you'd like.

If you are using an RPM-based system, RPMs are available from the same website. If you are using a Debian system, you may point your apt sources at

     deb http://cp.yi.org/apt/hibernate ./

and apt-get install hibernate. Gentoo packages are available in Portage.

Check that the directory containing hibernate is in your PATH variable before going further. If you want users to be able to hibernate, use visudo and add the following lines to your /etc/sudoers file, changing the path to hibernate to match where it was installed (/usr/local/sbin/hibernate, or /usr/sbin/hibernate in the Debian & RPM packages):

     Host_Alias LOCALHOST=localhost,
     ALL LOCALHOST=NOPASSWD: /usr/sbin/hibernate *

See sudoers(5) for more information. Users will then be able to run sudo /usr/local/sbin/hibernate.

3.2. Hibernating in text mode

To test hibernation, you should begin with text-only suspension. Switch to a console with Ctrl-Alt-F1 and log in as root. Under Red Hat, Fedora or Mandrake, stop the X server by typing init 3. Debian users will need to stop the xdm, gdm or kdm service by running /etc/init.d/xdm stop (substituting xdm with your display manager). Users of other distributions should do something similar.

Now try hibernate. Hibernation should take place, ending with the machine halting. Power on the computer and it should recover your session. If not, see the troubleshooting section.

3.3. Hibernating with X running

You can now bring your graphical session back (by typing init 5 for Red Hat/Mandrake users, /etc/init.d/xdm start or similar for Debian users). Log in under X. Try sudo /usr/local/sbin/hibernate and see if suspension works correctly.

4. Tweaking

4.1. Hibernate script configuration

The hibernate.conf file has several options that may help you to get hibernation to work correctly on your specific hardware. Most of them are fairly self-explanatory.
Some often used ones are mentioned below. Running hibernate -h will give you the full list of options and help.

4.1.1. SwitchToTextMode

On some hardware, the graphics chipset isn't properly restored by the BIOS upon resume, and the state remembered by the resumed kernel becomes inconsistent with that of the hardware. In this case, it may help to switch to a text console before hibernating and switch back to X only after resume.

4.1.2. UseDummyXServer

This option may also help if you experience strange things such as 3D not working upon resume. It launches a fake X server that should cause proper initialization of the chipset.

4.1.3. Services (RestartServices/StopServices/StartServices)

Before suspension, it is wise to stop all services that could be resumed in a different environment. This includes usb, pcmcia and also network services. If your suspension is aborted because a task cannot be stopped by TuxOnIce and this task corresponds to a service, you can tell the hibernate script to stop and start these services.

4.1.4. Modules (UnloadModules/UnloadAllModules/LoadModules)

In the same spirit, it is wise to unload all unused modules before suspension. You can also force the insertion of specific modules upon resume, but usually the kernel knows upon resume which modules it needs to reload.

4.1.5. Unmounting Filesystems (Unmount)

Hibernating while network filesystems or removable devices are mounted may lead to unpredictable results. For NFS, if the mount should disappear and you mounted with the "hard" option, processes attempting to access files on the mount will hang until the mount can be found again. If you mounted with "soft", you risk losing data when a mount unexpectedly disappears. Hence you should unmount any kind of remote shares before hibernating (this includes NFS, CIFS/smbfs, etc).

You can specify the mount points that must be unused before suspension.
If these cannot be unmounted successfully, the hibernate script will abort unless the --force and/or --kill option is used. The --force option will simply ignore those mount points. The --kill option will try to kill the processes that use these mount points.

4.1.6. Network interfaces (DownInterfaces/UpInterfaces)

In the same spirit, you can specify the network interfaces that must be shut down before suspension. Upon resume, you can ask the hibernate script to bring some interfaces up. You can also use the keyword auto to let the network interfaces be restarted in reverse order (the default).

4.1.7. Check for incompatible programs (IncompatiblePrograms)

Some programs may be incompatible with TuxOnIce because, for instance, they directly access some hardware that ignores power management events. You may ask the hibernate script to abort hibernating if specified processes are running.

4.1.8. Save clock on hibernate (SaveClock)

It is a feature that the resumed kernel recovers its entire memory image, including the system date, which therefore appears incorrect. The hibernate script calls hwclock on resume to reset the date according to the CMOS clock. You can ask the script to also save the system date to CMOS just before hibernating. Most of the time this is useless and will lead to your clock becoming increasingly wrong.

4.2. Enabling debugging and tuning the system (the sysfs interface)

TuxOnIce includes a /sys/power/tuxonice/ interface which allows you to tune & configure TuxOnIce according to your needs.

4.2.1. sysfs entry descriptions

The entries within /sys/power/tuxonice/ are:

debug_info
     (Requires debug enabled) Debug information that should be included with any debugging reports.

user_interface/debug_sections
     Which sections to show when viewing debug info. Only available if debugging is compiled in (see include/linux/suspend-debug.h in the kernel source for what each bit controls).

user_interface/default_console_level
     What to use as an initial console log level.
     (see ``Obtaining debugging information'' below)

compression/enabled
     Disables compression if it was compiled into the kernel (LZF compression is on by default when compiled in).

do_hibernate
     Echo anything into this to start a hibernation cycle.

do_resume
     Echo anything into this to start resuming.

user_interface/enable_escape
     Enables the possibility of aborting a hibernation cycle by pressing the Escape key during hibernation. (Note: some people consider this a security breach if, for example, you activate a hibernation, walk away and a malicious person aborts the hibernation).

compression/expected_compression
     A guesstimate of the compression achieved by the LZF compressor, in order to hibernate when there is less swap available than memory to be written.

swap/headerlocations
     Having made a swapfile and turned it on, this file will tell you what to place on your kernel command line (e.g. something along the lines of resume=swap:/dev/hda3:0x560)

image_size_limit
     A soft upper limit on the image size. If more than this amount of memory is allocated when TuxOnIce starts, it will attempt to free memory until this constraint is met. If it is unable to free sufficient memory and other constraints are met, it will still hibernate.

last_result
     Bit flags indicating the result of the last cycle (there are many more bits than just those given below - see the source code for a complete list).

     o  0x1: on if the last hibernate was unsuccessful. Additional bits specify the cause:

     o  0x2: hibernation aborted at user's request.

     o  0x4: hibernation aborted because there was no swap space enabled.

     o  0x8: hibernation aborted because there was insufficient swap available.

     o  0x10: hibernation aborted because all processes could not be frozen (dmesg will show which processes couldn't be frozen).

user_interface/log_everything
     Set to log output that's not normally logged.

user_interface/pause_between_steps
     Pause between the steps of the process (useful for debugging).
powerdown_method
     Use your machine's ACPI implementation to put the machine to sleep in one of 3 ways. Entering 3 here will make the machine enter S3 (suspend-to-RAM) after writing the image to disk. This allows for a faster resume under normal circumstances, while nothing is lost if your battery should run out or suspend-to-RAM should fail to wake up for any other reason. Entering 4 will use the machine's S4 state - this may result in a faster resume, as many of the usual BIOS checks may be skipped. Entering 5 will power down the machine as per usual.

reboot
     Set to reboot instead of powering off at the end of saving the image.

resume
     Allows you to set what would be on the kernel command line as resume= (saves a reboot if you forget). This tells the kernel where to write the hibernation image. You will still need to add it to your bootloader (LILO or GRUB) in order to resume from the image!

slow
     Slows down the hibernate process (for debugging).

swapwriter/swapfilename
     Set the name of the swapfile to be automatically enabled and disabled before and after hibernating. See ``Using a swapfile'' for more information.

version
     Read only. The version of the TuxOnIce patch you are using.

4.2.2. Obtaining debugging information

To get debugging information, you first need to have compiled TuxOnIce with the 'Compile in debugging output' option enabled. You will probably also want to compile in SysRq support (in the 'Kernel Hacking' section).

Having done this, during a hibernation cycle, you can press the 0-9 keys to change the console loglevel. As you do so, TuxOnIce varies the amount of output it produces. Level 0 is the normal 'nice' display. Level 1 enables a little more detail. Levels 3+ turn off the nice display and give more detailed information.

You can toggle whether TuxOnIce pauses between each step of the process by pressing the Pause or Break (useful if you have KDB enabled) keys. There are also proc entries to set initial values for these settings.
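
The last_result flags listed in section 4.2.1 can be turned into readable messages with a few lines of shell. The sketch below is only an illustration: the function name decode_last_result is made up for this example, only the five bits documented in this HOWTO are handled, and it assumes the sysfs file yields a decimal or 0x-prefixed hexadecimal number.

```shell
#!/bin/sh
# Hypothetical helper (not part of TuxOnIce or the hibernate script).
# Decodes the last_result bits documented in this HOWTO.
# Assumption: $1 is a decimal or 0x-prefixed hexadecimal integer.
decode_last_result() {
    val=$(( $1 ))

    # Bit 0x1 is set only when the last cycle was unsuccessful.
    if [ $(( val & 0x1 )) -eq 0 ]; then
        echo "last hibernate: no failure flagged"
        return 0
    fi

    echo "last hibernate: unsuccessful"
    # The remaining documented bits give the cause.
    [ $(( val & 0x2 ))  -ne 0 ] && echo "  aborted at user's request"
    [ $(( val & 0x4 ))  -ne 0 ] && echo "  no swap space enabled"
    [ $(( val & 0x8 ))  -ne 0 ] && echo "  insufficient swap available"
    [ $(( val & 0x10 )) -ne 0 ] && echo "  processes could not be frozen"
    return 0
}

# Decode the real value only when the sysfs entry exists.
if [ -r /sys/power/tuxonice/last_result ]; then
    decode_last_result "$(cat /sys/power/tuxonice/last_result)"
fi
```

Since many more bits exist than those handled here, a value that sets 0x1 without printing a cause simply means the cause is one this sketch does not know about.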
You may not want to view all the information TuxOnIce prints. In this case, you can set the debug_sections parameter to control what portions of output are printed (see include/linux/suspend-debug.h in the kernel source for what each bit controls).

4.3. TuxOnIce hotkeys

When a hibernation cycle is in progress there are a number of keys that can be used to toggle various settings.

o  Esc - if the enable_escape entry is set to 1, pressing Escape will abort a hibernation cycle and restore the machine back to the previous state. Perfect for those "Oh, but I just forgot to ..." situations.

o  R - toggles rebooting at the end of hibernating (same as the reboot /proc entry).

o  P (Requires debug enabled) - toggles pausing between major steps when log level > 1.

o  S (Requires debug enabled) - toggles pausing between minor steps.

o  ` (Requires debug enabled) - toggles slowing down the hibernate process.

o  Space - continue when paused.

o  L - toggles logging all messages to syslog (same as the log_everything /proc entry). This requires that debugging be enabled (compiled in).

o  0-7 - sets the console log level during hibernation and resume. This requires that debugging be enabled (compiled in).

4.4. Avoiding data loss

While much care has gone into TuxOnIce to ensure reliability and safety of your data, there are some unusual scenarios which may lead to filesystem corruption and data loss. With a modern journalled filesystem (e.g. ext3, reiserfs, xfs, jfs) a failed resume should not cause filesystem corruption, as this is the equivalent of simply losing power, which should be dealt with gracefully. Other more dangerous situations are described below, with ways to work around and avoid them:

4.4.1. Booting non-TuxOnIce aware kernels

Problem: Booting a second kernel that does not recognise the hibernated image in your swap partition.
Or, attempting to resume with the same kernel, but failing to put a resume= parameter in your bootloader, then putting one in and resuming without removing the image first. This is very dangerous as there is no warning.

For example, you have two kernels, Kernel A and Kernel B. Kernel A has TuxOnIce support, and Kernel B does not. The danger comes when, after hibernating with Kernel A, you decide to boot Kernel B, which does not recognize the swap image. It ignores the "bad" swap partition and runs without it as if nothing happened (except it may run low on memory). The problem for the hibernated Kernel A is that, since Kernel B mounted the filesystem read/write, the filesystem on disk no longer matches the state held in the hibernated image and will be corrupted on resume. Thus, severe damage to the filesystem will result once the disk is written to by the resumed kernel.

This problem has bitten unsuspecting users when, for example, a new kernel is released for their distribution and automatically installed, or when they boot a live CD (e.g. Knoppix) and try to mount filesystems from there.

Solution: Should this scenario occur, the hibernated kernel must be prevented from resuming. This can be accomplished either by running mkswap on the swap partition from Kernel B, or by booting Kernel A with noresume. To make sure this always happens, below is a snippet from SuSE's bootscripts (courtesy of Stefan Seyfried) that you can put into an init script that is started early on boot (before swapspace is enabled).

Another alternative is to modify your bootloader upon hibernating so that you are forced to boot the same kernel. The hibernate script has configuration options for doing this in both GRUB and LILO.
     #!/bin/sh

     # Print the device name of every partition fdisk reports as Linux swap.
     get_swap_id() {
         local line;
         fdisk -l | while read line; do
             case "$line" in
                 /*Linux\ [sS]wap*) echo "${line%% *}" ;;
             esac
         done
     }

     # For each swap entry in /etc/fstab that is a real swap partition,
     # re-run mkswap if it carries a hibernation signature.
     check_swap_sig () {
         local part="$(get_swap_id)"
         local where what type rest p c
         while read where what type rest ; do
             test "$type" = "swap" || continue
             # Skip fstab entries that fdisk did not list as swap.
             c=continue
             for p in $part ; do
                 test "$p" = "$where" && c=true
             done
             $c
             # The signature sits in the last bytes of the first 4096-byte page.
             case "$(dd if=$where bs=1 count=6 skip=4086 2>/dev/null)" in
                 S1SUSP|S2SUSP|pmdisk|[zZ]*)
                     mkswap $where ;;
             esac
         done < /etc/fstab
     }

     check_swap_sig

4.4.2. Windows touches your FAT partitions whilst Linux is hibernated

Problem: Similar to the scenario above, if you have mounted FAT drives or partitions when hibernating and you then boot into Windows, as soon as you boot back into Linux you will probably end up corrupting your FAT drives.

Solution: Unmount your FAT drives before hibernating. If you are using the Hibernate Script, you can use the following line in your hibernate.conf:

     UnmountFSTypes vfat fat msdos

4.4.3. An initrd mounts your filesystem before resuming

Problem: An incorrectly set up initrd can corrupt your filesystem if it tries to mount it before resuming. If your filesystem is touched in any way before resuming, your data will risk serious corruption. This even includes mounting the filesystem in read-only mode, as many journalling filesystems (ext3, reiserfs, xfs) will replay the journal.

Solution: Ensure your initrd is set up correctly - attempting to resume should be the first thing it tries to do after mounting /proc.

If you are using an initrd, enable the "Warn if possibility of filesystem corruption" option in the TuxOnIce kernel configuration. This will throw a BIG FAT WARNING if you try to resume after a filesystem has been mounted read/write. It won't catch all cases, but will probably get the most common ones.

You may find this will throw spurious BIG FAT WARNINGs at your initrd being mounted read-write (which is safe in itself, as it is not linked to the hard disk).
To avoid the warning here, you should mount your initrd as read-only whilst trying to resume, by putting the two mount lines below into your linuxrc script:

     mount -n -o remount,ro /
     echo 1 > /sys/power/tuxonice/do_resume
     mount -n -o remount,rw /

5. Troubleshooting

5.1. It panics!

D'oh. If you can capture the panic somehow (either over a serial console, or with a digital camera if need be), and send it to the list, somebody might hopefully be able to do something about it. Before you send it to the list, try hibernating from single user mode (described below) and see if the same thing happens.

Panics captured on 2.4 must be run through the ksymoops program before any useful information can be obtained from them.

5.2. It hangs!

The most common symptom of a non-functioning TuxOnIce is a hang either during hibernating or resuming. Hangs while resuming have two sources:

1. Foul ups by Nigel in the kernel code

2. Driver issues

Type 1 issues will generally affect areas other than the point where you see "Copying original kernel back".

Type 2 issues are more common, and affect particular hardware (and therefore some computers/configurations, but not others). There are usually workarounds for these issues, and the workarounds generally involve building functionality as modules, stopping related userspace daemons and unloading the modules prior to hibernating, and reloading/restarting them afterwards. Particularly common here are issues with USB, DRI/DRM, FireWire, cpufreq, sound card and network drivers. In the case of DRI/DRM, you can't simply disable it prior to resuming, so people (unfortunately) have to choose between 3D performance and the ability to hibernate (unless you're willing to stop X prior to hibernating and restart it afterwards). As driver writers improve their support for hibernating and resuming, it is expected that these issues will ultimately go away.

Finding which driver is causing hibernation to hang is sometimes less than obvious.
  In these cases, we recommend starting with a minimal environment
  (init S) and seeing if you can hibernate from there, or failing that,
  with init=/bin/sh. If you have success, add more and more components
  of your normal configuration until you find something that breaks
  hibernating.

  If you can't even hibernate from init S, it may be that you've
  compiled the driver that's causing problems into the kernel. In this
  case, look at your kernel configuration and compile as much as
  possible as modules. Then try with a minimal configuration again.

  5.3.  Hibernating in a minimalist environment

  To eliminate problems with drivers, people will often recommend
  hibernating from single user mode or using init=/bin/sh.

  5.3.1.  Single user mode (init S)

  Single user mode is a runlevel in most Linux distributions where many
  services are not enabled and, ideally, many modules are not loaded. It
  can be entered by adding the letter "S" to the kernel command line
  when you boot. Unfortunately some distributions will still load many
  modules in this runlevel, so if hibernation still fails here, you
  might need to try using init=/bin/sh.

  5.3.2.  init=/bin/sh

  In this mode, absolutely nothing else is done after loading the
  kernel. It requires a bit of manual intervention to get things into a
  usable state, documented below, but should help in isolating bad
  drivers causing TuxOnIce to fail.

  o  To enter this mode, as you boot your machine you need to catch your
     bootloader (eg, LILO or GRUB) before it boots your kernel. Edit the
     kernel command line to add the text "init=/bin/sh". In LILO this
     can be done by holding the Ctrl key down to bring up the menu,
     selecting your kernel with the arrow keys, then typing
     " init=/bin/sh" and pressing Enter. In GRUB, you can do this by
     highlighting your kernel in the menu, pressing E to edit,
     highlighting the line that begins "kernel ...", pressing E to edit
     that, typing " init=/bin/sh" on the end, pressing Enter to save,
     and pressing B to boot.
  o  You should be greeted with a screen of the kernel booting and a
     prompt that looks a little like

          sh-3.00#

  o  From here, enter the following commands:

     o  # mount -o remount,rw /

     o  # mount -t proc proc /proc

     o  # swapon -a

     You may need to start devfsd or udev too:

     o  # /etc/init.d/devfsd start

     Now try running the hibernate script to see if it succeeds. If it
     does, a module is to blame, and you can begin loading modules one
     by one until hibernation breaks.

  o  To cleanly reboot from this point, you will want to do the
     following:

     o  # mount -o remount,ro /

     o  # sync

     o  # reboot

  5.4.  Asking for help

  People on the suspend2-users and suspend2-devel mailing lists are
  generally pretty helpful, but they'll be better able to help you if
  you can provide more information. In particular, it will help if you
  provide information regarding what make/model of computer you have,
  what modules you have loaded while trying to hibernate, and a copy of
  your kernel configuration. Most of these details can be quickly and
  easily gathered using the 1.03 or later hibernate script with the
  --bug-report option.

  If you are sending a kernel panic from a 2.4 kernel, be sure to run
  the output through ksymoops and send the result. Otherwise the panic
  is of very little use to anybody.

  6.  How it works

  For the more curious users, briefly listed here are the steps of the
  kernel part of the hibernation process. This may also help in tracking
  down why it doesn't hibernate.

  1. Stop all processes. If some processes can't be stopped, abort.

  2. Eat memory. At this stage we check if we have enough memory to save
     the image and meet your specifications for maximum image size. If
     necessary, we unfreeze the processes, eat memory until we think we
     have enough and then try again from step 1.

  3. Suspend drivers. This accounts for the screen briefly going blank
     on some computers.

  4. Prepare a directory of pages and save it along with the image. This
     is done in two parts.
     First, all of the pages we know aren't needed for hibernation (ie,
     not buffer & page caches) are saved, being very careful not to have
     a lasting impact on the image we're making.

     Then, an atomic copy is made of the remaining pages (kernel &
     process space) using the memory just saved and any other free RAM.
     This copy is then saved.

     Finally, we save the page directory for the latter set of pages
     separately and also store the page directory's location in the swap
     header.

  We can then (if your computer supports it) power down.

  Resuming is essentially the reverse of this process. The pagedir is
  loaded, as is the copy (being careful not to use RAM we're about to
  overwrite). Then we copy the old kernel back and restore registers. We
  are now running with the original kernel image. Finally, we load the
  first set of pages we saved, clean up and exit.

  7.  Other configurations

  7.1.  TuxOnIce compiled as modules (after 2.2.9)

  Support for compiling TuxOnIce as modules was removed for quite a
  while, but has returned after the 2.2.9 release. Building TuxOnIce as
  modules lets you free the memory for other uses when not hibernating,
  which is particularly useful in embedded applications.

  7.2.  Using a swapfile

  1. Create and activate your swap file:

     o  # dd if=/dev/zero of=/path/to/swapfile bs=1M count=

     o  # mkswap /path/to/swapfile

     o  # swapon /path/to/swapfile

     o  Make sure you know what device your swapfile is on (eg,
        /dev/hda4). If you're not sure, run mount and find the longest
        mountpoint that is a prefix to the path of the swapfile (eg, it
        might be /usr/local, /usr or just /). Read the corresponding
        device to this mountpoint.

  2. Find out your header locations:

     o  # cat /sys/power/tuxonice/headerlocations

          For swapfile `/path/to/swapfile `, use resume=swap:/dev/:0x2da2b4.

     o  For example, if I know the swapfile is on /dev/hda1, then my
        "resume=" line becomes

          resume=swap:/dev/hda1:0x2da2b4

  3.
     Enter your resume= line:

     o  On the kernel command line, either in lilo.conf or GRUB's
        menu.lst, put the full resume=... parameter.

     o  If you use the hibernate script, in hibernate.conf put

          SuspendDevice swap:/dev/hda1:0x2da2b4

     o  If you use a custom hibernate script, enter

          echo swap:/dev/hda1:0x2da2b4 > /sys/power/tuxonice/resume

        somewhere before hibernating.

     Making one of the last two changes above is not required, but if
     you don't, you will have to reboot your machine before being able
     to hibernate to the swapfile.

  4. Get your swapfile automatically enabled and disabled for
     hibernating:

     o  If you enter the name of your swapfile into
        /sys/power/tuxonice/swapfile before hibernating, TuxOnIce will
        automatically enable and disable the swapfile for you. This is
        not required; it is merely a convenience.

  7.2.1.  Superfluous details

  Specifying the resume= device simply tells TuxOnIce where to write and
  find the header that indicates the machine is suspended and where to
  find the rest of the image. The image itself is written to any
  available swap space, whether it be a swapfile or a swap partition.
  The order of swap devices that data is written to turns out to be
  essentially the order in which the swap devices were enabled. Hence it
  is entirely plausible, if you have a normal swap partition (or file)
  and one turned on only for hibernating, that the entire image is
  contained in the first swap device, but the header alone is in the
  second.

  7.3.  Swap on LVM/dm-crypt

  This requires Suspend2 2.1.4.2 or later.

  You need to set up LVM/encrypted swap before TuxOnIce attempts to
  resume. This means you need an initrd or initramfs. The linuxrc/init
  script on such an initrd/initramfs roughly needs to do the following
  steps:

  1. Mount /sys and /proc.

  2. Load the modules you need.

  3. Make the device nodes you need, be it by starting udev or just
     mknod'ing them.

  4. Start LVM. For example by issuing

          vgscan
          vgchange -a y

     or by using dmsetup.

  5. ... or set up your encrypted swap.

  6.
     echo 1 > /sys/power/tuxonice/do_resume

  7. Only now is it safe to mount your rootfs (if required).

  8. Do other things init/linuxrc would normally do (mount --move /proc,
     /sys, /dev; pivot_root; etc)

  Of course you need to make sure that all the binaries and libraries
  you need are on the initrd/initramfs.

  7.4.  Using an initrd/initramfs but TuxOnIce compiled in

  If you are using an initrd, you MUST edit the linuxrc script to
  attempt to resume before filesystems are mounted. Do this by inserting
  the line:

     echo 1 > /sys/power/tuxonice/do_resume

  somewhere after mounting /proc but before mounting filesystems in your
  linuxrc script. If you are using an initramfs, you will need to do the
  same thing to your /sbin/init script, else you will never be able to
  resume. See the Modules FAQ for some distribution-specific tips.

  7.5.  Keep image mode

  The idea behind keep-image mode is that when you have resumed, the
  image is not invalidated, meaning you can resume from it several
  times. This is useful in kiosk-type situations, or perhaps in
  computing labs, where you want to boot quickly to a known state, every
  time.

  The main caveat with keep-image mode is that no filesystem that is
  backed on disk can be mounted read-write in the suspended image. Nor
  can such filesystems be modified without creating a new resume image.
  The two issues to keep in mind are that:

  1. the kernel's filesystem drivers keep a partial copy of the state of
     the filesystem in memory, which is saved in the image. Thus, if the
     filesystem on disk is modified without the kernel in the saved
     image knowing, you will almost certainly cause severe filesystem
     corruption when it resumes.

  2. the kernel also keeps a cache of files on the filesystem. If, for
     example, a file is read off disk from a read-only filesystem, and
     the file is subsequently modified without a new image being
     created, the kernel in the image will not be aware of this and
     you'll more than likely hit its cached copy.
  Memory-backed filesystems (eg, tmpfs) are safe - their state is
  restored to how things were at hibernation time. You may also mount
  filesystems after resuming without much concern.

  To set up a system for keep-image mode, configure your machine how you
  want it at resume time, then remount all your disk-backed filesystems
  read-only or unmount them. Then (with a kernel compiled with
  keep-image mode - CONFIG_SUSPEND2_KEEP_IMAGE),

     echo 1 > /sys/power/tuxonice/keep_image

  and hibernate in the usual manner. When you resume, the image is not
  invalidated, so you can resume from it as many times as you like. So
  long as the state of the filesystem on the hard disk does not change,
  you should be safe.
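
  The read-only requirement can be checked mechanically before enabling
  keep-image mode. The following is only a minimal sketch and is not
  part of TuxOnIce or the hibernate script: the function name
  list_rw_disk_mounts and the list of filesystem types treated as
  memory-backed are assumptions made for illustration, and you may need
  to extend that list for your system. The function reads a table in
  /proc/mounts format and prints every filesystem that is mounted
  read-write and is not of a memory-backed type.

```shell
#!/bin/sh
# Hypothetical helper, not part of TuxOnIce or the hibernate script.
# Prints "device mountpoint" for every filesystem that is mounted
# read-write and whose type is not in the memory-backed list below.
# The type list is an assumption; adjust it to match your system.
list_rw_disk_mounts() {
    # $1: file in /proc/mounts format (defaults to /proc/mounts)
    while read -r dev mnt fstype opts rest; do
        # Skip filesystem types that are not backed by a disk.
        case "$fstype" in
            tmpfs|ramfs|rootfs|proc|sysfs|devpts|devtmpfs|usbfs) continue ;;
        esac
        # Wrap the option list in commas so "rw" matches as a whole word.
        case ",$opts," in
            *,rw,*) echo "$dev $mnt" ;;
        esac
    done < "${1:-/proc/mounts}"
}
```

  A custom hibernate script could refuse to enable keep-image mode
  unless the function prints nothing, for example:

     if [ -z "$(list_rw_disk_mounts)" ]; then
         echo 1 > /sys/power/tuxonice/keep_image
     fi

  An empty result only shows that nothing disk-backed is mounted
  read-write at that moment; it cannot detect later changes to the disk
  made from outside the saved image.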