Posts Tagged ‘Linux’

Fixing an LVM in Ubuntu

Monday, November 7th, 2011

WARNING: There are steps, references, commands and/or instructions in this article that can be very dangerous to your filesystem and possibly cause irreversible damage to your data.  You are responsible for maintaining your data – use these commands only after you fully understand the implications of what they do and you’re comfortable and competent.  Any damage resulting from using any of these commands or guidance of this article is fully your responsibility – I can’t be responsible for your data.  By continuing, you agree to release the author, the hosting service for this site, anybody posting comments to this site, your dog, etc. free from liability should any damage occur, whether accidental or intentional.

I use Ubuntu quite a bit and have recently started using LVM on an internal FTP/TFTP/SCP “drop box” of sorts. The reason for LVM is that I needed to dynamically “grow” the amount of disk space available for the server (it was full). Being a VM, it’s easy to carve out another virtual disk and assign it to the VM. LVM takes over from there…

This is all fine and dandy if your storage is stable. However, we ran into an issue where the storage on the SAN was oversubscribed and completely full. This particular server was running on a NetApp SAN, and the SAN automatically took the volumes offline. After freeing up some space for the remaining volumes, I brought the volumes back online. Either this process itself or my shutting down the VM (I hadn’t noticed that the storage was gone when I shut the VM down) resulted in a corrupted filesystem.

Upon trying to boot the server back up, I was presented with a BusyBox screen and several error messages indicating that the filesystem was messed up. I received a message like “Target filesystem doesn’t have /sbin/init” on the screen (with the BusyBox prompt).

Here’s what I did to fix the filesystem and restore the system to full functionality, and a lesson I learned.

To restore the system:

  1. Download the SystemRescueCd ISO (www.sysresccd.org)
  2. Create a new virtual disk (somewhere to copy the data that I couldn’t live without) and assign it to the VM
  3. Mount the ISO in the VM on startup and boot off of the CD
  4. Set up the networking (assign an IP address, default gateway, etc.)
  5. SSH into the server
  6. Check the LVM for filesystem errors
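    1. e2fsck -n /dev/mapper/<LVM name> (the -n option opens the filesystem read-only and answers “no” to every prompt, so nothing is changed)
    2. If it shows errors, you’ll probably want to go on and fix them, if possible (see step 11)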
  7. Mount the old LVM
    1. mkdir /mnt/t (t for temp, or whatever directory name you desire)
    2. mount /dev/mapper/<LVM name> /mnt/t
  8. Create a partition on and format the new virtual disk
    1. Plenty of resources on this – Google for the filesystem you’re wanting to use (ext2, ext3, ext4, etc.)
  9. Mount the virtual disk
    1. mkdir /mnt/n (n for new – again, whatever you want)
    2. mount /dev/sdc1 /mnt/n
  10. Copy data that I couldn’t live without from the old partition to the new one (so from /mnt/t to /mnt/n)
  11. Unmount and fix errors on the old LVM partition
    1. umount /dev/mapper/<LVM name>
    2. e2fsck -v /dev/mapper/<LVM name>
      1. I know there are ways to have e2fsck automatically fix errors (the -p and -y options), but I wanted to see and approve each fix, so I went the somewhat slow path
  12. Shut down the VM, then remove the SystemRescueCd ISO and the new virtual disk
  13. Try booting the VM
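For reference, here is a rough sketch of steps 6 through 11 as a single sequence. It only prints the commands; it does not run them, since several of them can destroy data if pointed at the wrong device. The function name plan_rescue and the example names vg0-root and /dev/sdc1 are hypothetical. The vgchange -ay line (activating the volume group so that it shows up under /dev/mapper), the choice of ext4 and the blanket cp -a are my assumptions and not part of the steps above: pick your own filesystem and copy only what you need.

```shell
#!/bin/sh
# Print (do NOT run) the rescue commands for one logical volume.
# Hypothetical arguments:
#   $1 = the volume's name under /dev/mapper
#   $2 = the partition already created on the new virtual disk
plan_rescue() {
    lv="/dev/mapper/$1"
    new="$2"
    cat <<EOF
vgchange -ay
e2fsck -n $lv
mkdir -p /mnt/t /mnt/n
mount $lv /mnt/t
mkfs.ext4 $new
mount $new /mnt/n
cp -a /mnt/t/. /mnt/n/
umount /mnt/t
e2fsck -v $lv
EOF
}

# Example with made-up names: review the printed plan before typing anything.
plan_rescue vg0-root /dev/sdc1
```

Printing the plan first gives you a chance to double-check every device name before anything touches a disk.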

After following these steps (best as I can remember), the system was back working again.  In this system, I had a single LVM that was pretty much an “everything” partition – root, FTP/TFTP/SCP storage, etc.  This leads up to my lessons learned:

  • On file servers, particularly VMs where it’s so easy to carve out additional virtual disks, keep your root filesystem (boot loader, kernel, etc.) on one virtual disk that’s only used for base OS functionality.  Create one or more additional virtual disks for file storage, using LVMs if necessary.  This lets you easily get at the data by simply assigning the virtual disk to another “clean” Ubuntu install, bypassing the need to boot off of the SystemRescueCd ISO.  In my case, everything was combined.  Since it was on a SAN, I had assumed there was very little risk of data corruption (a messed up filesystem); however, oversubscription can cause problems.
  • Use oversubscription on your storage sparingly and in a planned fashion.  This bit me, and I’ve heard other IT professionals say “Oh, don’t worry about it – it won’t use it”.  While it might seem unlikely that a system would consume all of its allocated space, it is possible and should be guarded against.  What happens if it does consume all of the space?  Logs, updates, etc. can all contribute to filesystem growth.  Ensure that your core, mission-critical systems are NOT using oversubscribed storage.
  • When working with LVMs, don’t point to the physical disks (/dev/sdb) and partitions (/dev/sdb1) for troubleshooting – it’ll get you nowhere.  Using the pvdisplay, lvdisplay, etc. commands (for examining your LVMs), you’ll be able to see the LVM name, as well as the partitions that make up the LVMs.  Focus on the LVM, not the partition (at least in my case).  This isn’t to say that physical issues never occur (SAN stats or SMART errors on a local disk should help here).
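On the last point, it helps to know how the name under /dev/mapper is built: device-mapper joins the volume group name and the logical volume name with a single hyphen, and doubles any hyphen that appears inside either name. Here is a small sketch of that rule. The function name mapper_path and the names dropbox-vg and root are made up for illustration; your own names will come from the LVM tools.

```shell
#!/bin/sh
# Build the /dev/mapper path for a volume group / logical volume pair.
# device-mapper joins the two names with "-" and doubles any "-"
# that appears inside either name.
mapper_path() {
    vg=$(printf '%s' "$1" | sed 's/-/--/g')
    lv=$(printf '%s' "$2" | sed 's/-/--/g')
    printf '/dev/mapper/%s-%s\n' "$vg" "$lv"
}

# Hypothetical names, for illustration only.
mapper_path dropbox-vg root

# On a live system (as root), the real names can be listed read-only with:
#   pvs
#   lvs -o vg_name,lv_name,devices
```

The resulting path is what goes in place of <LVM name> in the e2fsck and mount commands above.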

So, your mileage may vary, but this is what I did to fix the LVM filesystem (and restore functionality of my system).

What do you consider best-practices for Linux VM creation as well as general filesystem tasks?  Do you have a different tip or trick than I’ve mentioned above?

Until next time…

Running two simultaneous instances of the Dynamips hypervisor

Tuesday, February 17th, 2009

Running Dynamips can really tax system resources.  Oftentimes, CPU utilization is the first thing considered when running several different routers.  While high CPU utilization can occur (it shouldn’t be constant if the idle-pc values are adjusted properly for the different IOS images), I’ve run into more problems with memory utilization.

Each operating system has certain limitations in terms of how much memory space can be accessed by a single user-mode process.  Many people use one or another variant of the Microsoft Windows operating system (OS).  According to Microsoft (http://msdn.microsoft.com/en-us/library/aa366778.aspx), the default limit for a 32-bit user-mode process is 2GB.  This can easily be exceeded with a large Dynamips/Dynagen lab.  The following is how I’ve overcome this limitation by using multiple Dynamips hypervisor instances.

(more…)

My journey with ioctl() on Linux

Wednesday, October 10th, 2007

Background

Lately, I’ve been working on an application that makes heavy use of the ioctl() function to get and set hardware device settings (NIC configuration such as IP, subnet, broadcast, MTU, etc.). This function is so versatile and has so many different uses that it can be difficult at times to understand exactly what is needed for it to work properly. (more…)

HOWTO dual-boot between Linux and Windows using GRUB

Tuesday, May 8th, 2007

When I first started out with Linux, I wasn’t aware of some of the issues that can arise with dual-booting Windows with another OS.  The following ramblings are some of my experiences/findings. (more…)