Logical Volume Management with RAID (mdadm)


Logical Volume Management


The root filesystem (/) has just run out of space. What do we do now? With a traditional partition setup, increasing the size of a partition is difficult because it can’t simply be resized in place: typically you would have to back up your files, delete the partitions, create a new partition layout, create new file systems, and copy all your files back on.

That’s a lot of work just to add some room to a partition that is running out of space. Graphical tools such as gparted make this easier, but the system would still have to be taken down for maintenance, which may not be an option for the server in question. This is where LVM comes in:

[root@localhost ~]# lvextend -L+1G /dev/new_volume/lv_root
  Size of logical volume new_volume/lv_root changed from 2.00 GiB (512 extents) to 3.00 GiB (768 extents).
  Logical volume lv_root successfully resized

[root@localhost ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/new_volume-lv_root
                      2.0G  3.0M  1.9G   1% /mnt

[root@localhost ~]# resize2fs /dev/mapper/new_volume-lv_root
resize2fs 1.41.12 (17-May-2010)
Filesystem at /dev/mapper/new_volume-lv_root is mounted on /mnt; on-line resizing required
old desc_blocks = 1, new_desc_blocks = 1
Performing an on-line resize of /dev/mapper/new_volume-lv_root to 786432 (4k) blocks.
The filesystem on /dev/mapper/new_volume-lv_root is now 786432 blocks long.

[root@localhost ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/new_volume-lv_root
                      2.9G  3.0M  2.8G   1% /mnt

There are no guarantees with partition resizing. Always keep full up-to-date backups of your data.

With minimal time and effort the logical volume has now been increased by 1GB, and the system didn’t need to be taken offline to do it; the resize was performed while the filesystem was mounted and in use.
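
As a side note (not shown in the run above), newer versions of lvextend can grow the filesystem in the same step with the -r/--resizefs option, which calls the appropriate filesystem resize tool once the volume has been extended. A minimal sketch, assuming the same volume as above:

# Grow the logical volume by 1GB and resize the ext4 filesystem on it in one go
lvextend -r -L +1G /dev/new_volume/lv_root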

LVM can be used on a single hard drive, but it can also take multiple hard drives and pool the available space into a single volume group. That space is then allocated to the logical volumes that will hold the Linux file systems. If space runs low, additional hard drives can be added to the volume group to provide room to extend the logical volumes; logical volumes can be extended and shrunk as required (a sketch of adding a disk follows the diagram below).

[Figure: lvg — multiple hard drives pooled into a single LVM volume group]
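
Growing the pool works as described above: a new disk is initialised as a physical volume, added to the volume group, and the extra extents are then handed to whichever logical volume needs them. A rough sketch, assuming the new disk appears as /dev/sdd (a hypothetical device name) and a volume group named new_volume:

# Initialise the new disk as an LVM physical volume
pvcreate /dev/sdd

# Add it to the existing volume group, enlarging the pool of free extents
vgextend new_volume /dev/sdd

# Hand some of the new space to a logical volume and grow its filesystem
lvextend -L +50G /dev/new_volume/lv_home
resize2fs /dev/mapper/new_volume-lv_home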

LVM provides the ability to dynamically allocate space, but a plain volume group does not provide redundancy (although newer LVM versions can create mirrored or RAID-type logical volumes). If a disk in the volume group died, every logical volume with data on that disk would be lost.
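
A hedged example of that newer mirrored option, which is not the approach taken in the rest of this post, and which requires a volume group spanning at least two physical disks:

# Create a 2GB logical volume mirrored across two physical volumes (RAID1)
# Assumes a volume group (here called new_volume) with at least two member disks
lvcreate --type raid1 -m 1 -L 2G -n lv_mirrored new_volume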

LVM on Linux software RAID array


The purpose of a Redundant Array of Independent Disks (RAID, at level 1 or higher) is to provide redundancy and keep the system up and running when a hard disk fails: the missing data is reconstructed on the fly from the redundant information on the other disks, so the system can continue uninterrupted until the failed drive is replaced and the missing data rebuilt. Building LVM on top of a RAID array therefore allows LVM to keep working even if one of the physical disks dies.
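
When a member disk does fail, mdadm handles the replacement while the array stays online. A minimal sketch, assuming the failed member is /dev/sdb1 and its replacement is /dev/sdd1 (both hypothetical device names):

# Mark the failed member as faulty and remove it from the array
mdadm --manage /dev/md0 --fail /dev/sdb1
mdadm --manage /dev/md0 --remove /dev/sdb1

# Add the replacement; the missing data is rebuilt onto it in the background
mdadm --manage /dev/md0 --add /dev/sdd1

# Watch the rebuild progress
cat /proc/mdstat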

Using a RAID array with LVM does have its disadvantages. RAID will only use the size of the smallest disk from each member when building the array, because the data has to be striped evenly across the drives. In the example above the smallest disk is 100GB, so only 100GB would be used from each of the other two drives despite their larger capacity.

In a RAID 5 array, the equivalent of one drive’s worth of space is used for parity, which makes it unavailable for storage. The proportion lost can be reduced by using more hard drives, but the more drives there are, the greater the chance that one of them will fail.

Total available space in a RAID 5 array = “size of smallest device” x (number of disks – 1)

100GB x (3 – 1) = 200GB. With LVM alone the entire 550GB would be available to the volume group, but with a RAID 5 array only 200GB of the 550GB can be used because of the mismatched device sizes. To make better use of the disk space, the drive sizes should be matched as closely as possible.

[Figure: lvg-mdadm — an LVM volume group built on top of an mdadm RAID array]

Example Setup


I have used a virtual machine with five partitions on a single disk to create an mdadm RAID 5 array. Normally the array would be built from separate physical disks, but the concept is the same for this example. The mdadm RAID array is then used as the physical volume for a volume group, from which the logical volumes can be created.

[root@localhost ~]# mdadm --create /dev/md0 --raid-devices=5 --level=5 /dev/xvdb5 \
> /dev/xvdb6 /dev/xvdb7 /dev/xvdb8 /dev/xvdb9
[...]
md0: WARNING: xvdb9 appears to be on the same physical disk as xvdb8.
[...]
md0: WARNING: xvdb5 appears to be on the same physical disk as xvdb8.
True protection against single-disk failure might be compromised.
md/raid:md0: device xvdb8 operational as raid disk 3
md/raid:md0: device xvdb7 operational as raid disk 2
md/raid:md0: device xvdb6 operational as raid disk 1
md/raid:md0: device xvdb5 operational as raid disk 0
md/raid:md0: allocated 0kB
md/raid:md0: raid level 5 active with 4 out of 5 devices, algorithm 2
md0: detected capacity change from 0 to 6476005376
[...]
mdadm: array /dev/md0 started.
 md0: unknown partition table
[root@localhost ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 xvdb9[5] xvdb8[3] xvdb7[2] xvdb6[1] xvdb5[0]
      6324224 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU]

unused devices: <none>

The RAID device is now created and ready to be used. Now to create the LVM volumes on top of it.

[root@localhost ~]# pvcreate /dev/md0
  Physical volume "/dev/md0" successfully created
[root@localhost ~]# vgcreate new_volume /dev/md0
  Volume group "new_volume" successfully created
[root@localhost ~]# lvcreate -L 500M -n lv_swap new_volume
  Logical volume "lv_swap" created.
[root@localhost ~]# lvcreate -L 2G -n lv_root new_volume
  Logical volume "lv_root" created.
[root@localhost ~]# lvcreate -l 100%FREE -n lv_home new_volume
  Logical volume "lv_home" created.
[root@localhost ~]# mkfs.ext4 /dev/mapper/new_volume-lv_root
[...]
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done

[root@localhost ~]# mkfs.ext4 /dev/mapper/new_volume-lv_home
[...]
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done
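
To round the example off (these steps are not shown in the output above), the swap volume would be formatted and enabled and the new filesystems mounted; a rough sketch, with the mount points purely illustrative:

# Format and enable the swap logical volume
mkswap /dev/mapper/new_volume-lv_swap
swapon /dev/mapper/new_volume-lv_swap

# Mount the new filesystems
mkdir -p /mnt/root /mnt/home
mount /dev/mapper/new_volume-lv_root /mnt/root
mount /dev/mapper/new_volume-lv_home /mnt/home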

Ansible Configuration Management

What is it?


Ansible is a configuration management tool designed to automate the deployment of defined configurations from a single host to many hosts. It allows one to organise hosts into groups and then run a specific set of tasks against those groups, so a web server or a mail server can be set up quickly and configured identically every time. These qualities are essential for productivity in a large server environment.

Ansible is similar to Puppet and Chef in terms of configuration management, but Ansible does not require a client agent to be installed on the target server in order to do its job. Instead, Ansible connects to each server over SSH and runs all of its commands through the command line, just as a regular user would, leaving as small a footprint on the system as possible. In fact, because its method of control is so unobtrusive, Ansible can be used alongside Puppet and Chef.

Installation


The first step is to get Ansible installed on the ‘master’ system (the system you plan to control all of your defined servers from). In this case, all of our servers will be CentOS 6.7 Minimal based. As root:

yum install epel-release
yum install ansible

Ansible is available from the ‘Extra Packages for Enterprise Linux’ repository (commonly known as EPEL), so that repository needs to be installed first. Ansible is now installed on the system and is ready to be configured.

Configuration


SSH keys are necessary to allow access from the ‘master’ server (the server one wishes to control the other machines from) to each ‘client’ machine without requiring a password. Passwords can be defined in Ansible, but keys make life a lot easier.

ssh-keygen

Then a quick loop copies the new SSH public key onto each of the target machines:

[root@server1 ~]# for i in {2..4}; do
> ssh-copy-id server$i
> done
The authenticity of host 'server2 (10.44.16.152)' can't be established.
RSA key fingerprint is 12:7e:90:4b:af:11:bb:56:40:84:4e:84:e6:78:b8:d3.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'server2,10.44.16.152' (RSA) to the list of known hosts.
root@server2's password: 
Now try logging into the machine, with "ssh 'server2'", and check in:

  .ssh/authorized_keys

to make sure we haven't added extra keys that you weren't expecting.

The authenticity of host 'server3 (10.44.16.153)' can't be established.
RSA key fingerprint is 12:7e:90:4b:af:11:bb:56:40:84:4e:84:e6:78:b8:d3.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'server3,10.44.16.153' (RSA) to the list of known hosts.
root@server3's password: 
Now try logging into the machine, with "ssh 'server3'", and check in:

  .ssh/authorized_keys

to make sure we haven't added extra keys that you weren't expecting.

The authenticity of host 'server4 (10.44.16.154)' can't be established.
RSA key fingerprint is 12:7e:90:4b:af:11:bb:56:40:84:4e:84:e6:78:b8:d3.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'server4,10.44.16.154' (RSA) to the list of known hosts.
root@server4's password: 
Now try logging into the machine, with "ssh 'server4'", and check in:

  .ssh/authorized_keys

to make sure we haven't added extra keys that you weren't expecting.

All of the required configuration files are available from /etc/ansible. The one that is the most important at this point is the hosts file; this allows hosts to be defined in groups in order to run particular sets of Ansible playbooks against them.

[root@server1 ~]# cat /etc/ansible/hosts
[group1]
server3

[group2]
server2
server4

[servers:children]
group1
group2

[servers:vars]
ansible_ssh_user=root

The group names are arbitrary and can be anything that the user chooses, such as ‘webservers’, ‘mailservers’, ‘dbservers’, etc.

CentOS ships with SELinux enabled, and Ansible’s file-related modules need the libselinux-python package on each client in order to manage files correctly on an SELinux system. It can be installed on every host in one go with an ad-hoc command:

ansible servers -m yum -a "name=libselinux-python state=latest"

Ansible should now be ready for full usage.
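
A quick way to confirm that the inventory and SSH keys are working before going any further is Ansible’s ping module, which simply connects to each host and reports back with ‘pong’:

# Check connectivity to every host in the 'servers' group
ansible servers -m ping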

Using Ansible


Now everything is in place to start doing interesting things with Ansible. The following are a couple of examples of what can be done:

[root@server1 ~]# ansible group2 -m command -a "df -h"
server2 | success | rc=0 >>
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_server-lv_root
                      8.3G  628M  7.3G   8% /
tmpfs                 499M     0  499M   0% /dev/shm
/dev/sda1             477M   30M  422M   7% /boot

server4 | success | rc=0 >>
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_server-lv_root
                      8.3G  627M  7.3G   8% /
tmpfs                 499M     0  499M   0% /dev/shm
/dev/sda1             477M   30M  422M   7% /boot

[root@server1 ~]# ansible servers -m service -a "name=ntpd state=restarted"
server3 | success >> {
    "changed": true, 
    "name": "ntpd", 
    "state": "started"
}

server4 | success >> {
    "changed": true, 
    "name": "ntpd", 
    "state": "started"
}

server2 | success >> {
    "changed": true, 
    "name": "ntpd", 
    "state": "started"
}

Referring back to the /etc/ansible/hosts file, one can see that Ansible executes the commands only on the servers in the group that was specified. You may also have noticed the parent group ‘servers’, of which all the other groups are children; this allows much tighter control over which commands are run on which servers.
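
For example, based on the hosts file shown earlier, the same ad-hoc command can be aimed at the whole estate or at a single group just by changing the host pattern:

# Runs on server2, server3 and server4 via the parent group
ansible servers -m command -a "uptime"

# Runs only on server3, the sole member of group1
ansible group1 -m command -a "uptime"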

Ansible Playbooks

The above doesn’t save much time if we have 40 packages to install and a bunch of configuration files to copy across. Sure, it saves us ‘time of configuration’ x ‘number of servers to configure’, but what if new servers are constantly added that require identical configurations? We’d be back to square one, typing things out command by command for every server. This is where the power of playbooks comes in.

To demonstrate this, let’s start with a simple playbook that creates an MOTD on our servers. In this example, I’ve created a group in /etc/ansible/hosts called maintenance that contains all the servers that will suffer downtime due to planned maintenance.

[root@server1 playbooks]# cat /etc/ansible/hosts | grep -A4 maintenance
[maintenance]
server2
server3

[servers:children]
[root@server1 playbooks]# cat test_playbook.yml 
---
- hosts: maintenance
  user: root
  vars:
    motd_warning: '\nWARNING! The system will be shut down for maintenance at 17:00 on 16/11/15\n\n'
  tasks:
    - name: setup a MOTD
      copy: dest=/etc/motd content="{{ motd_warning }}"

[root@server1 playbooks]# ansible-playbook test_playbook.yml 

PLAY [maintenance] ************************************************************ 

GATHERING FACTS *************************************************************** 
ok: [server3]
ok: [server2]

TASK: [setup a MOTD] ********************************************************** 
changed: [server3]
changed: [server2]

PLAY RECAP ******************************************************************** 
server2                    : ok=2    changed=1    unreachable=0    failed=0   
server3                    : ok=2    changed=1    unreachable=0    failed=0   

[root@server1 playbooks]# ssh server3
Last login: Thu Nov 12 01:40:01 2015 from 10.44.16.151

WARNING! The system will be shut down for maintenance at 17:00 on 16/11/15

[root@server3 ~]#

If more servers require maintenance, just add them to the group and re-run the playbook.

[root@server1 playbooks]# cat /etc/ansible/hosts | grep -A4 maintenance
[maintenance]
server2
server3
server4

[root@server1 playbooks]# ansible-playbook test_playbook.yml 

PLAY [maintenance] ************************************************************ 

GATHERING FACTS *************************************************************** 
ok: [server3]
ok: [server2]
ok: [server4]

TASK: [setup a MOTD] ********************************************************** 
ok: [server3]
ok: [server2]
changed: [server4]

PLAY RECAP ******************************************************************** 
server2                    : ok=2    changed=0    unreachable=0    failed=0   
server3                    : ok=2    changed=0    unreachable=0    failed=0   
server4                    : ok=2    changed=1    unreachable=0    failed=0

More Complex Ansible Playbooks

Now that the essence of Ansible playbooks has been established, we’ll design a playbook to take care of all of the common tasks that are necessary when a new server is set up.

In this example, there is a series of templates for Ansible to copy to the target systems in order to configure things the way we want them. Here is one such template:

[root@server1 playbooks]# cat ssh_banner.j2 

****************************************************************************

 WARNING! This is a private server. The use of this system is restricted to
 authorized users only. Unauthorized access is forbidden. All information
 and communications on this system are monitored.

****************************************************************************

Now for the playbook:

[root@server1 playbooks]# cat common_setup.yml 
---
- hosts: servers
  user: root
  tasks:
  - name: Install latest version of libselinux-python
    yum: name=libselinux-python state=latest

  - name: Update server to latest package versions
    yum: name=* state=latest

  - name: Add hosts file
    template: src=hosts.j2 dest=/etc/hosts mode=644 owner=root group=root

  - name: Add SSHD config file
    template: src=sshd_config.j2 dest=/etc/ssh/sshd_config mode=644 owner=root group=root

  - name: Add SSH banner
    template: src=ssh_banner.j2 dest=/etc/ssh/banner.txt mode=644 owner=root group=root

  - name: Restart SSH Service
    service: name=sshd state=restarted

  - name: Add motd.sh file
    template: src=motd.sh.j2 dest=/etc/motd.sh mode=755 owner=root group=root

  - name: Update /etc/profile
    template: src=profile.j2 dest=/etc/profile mode=644 owner=root group=root

[root@server1 playbooks]# ansible-playbook common_setup.yml 

PLAY [servers] **************************************************************** 

GATHERING FACTS *************************************************************** 
ok: [server3]
ok: [server4]
ok: [server2]

TASK: [Install latest version of libselinux-python] *************************** 
ok: [server3]
ok: [server4]
ok: [server2]

TASK: [Update server to latest package versions] ****************************** 
ok: [server4]
ok: [server2]
ok: [server3]

TASK: [Add hosts file] ******************************************************** 
changed: [server2]
changed: [server3]
changed: [server4]

TASK: [Add SSHD config file] ************************************************** 
changed: [server4]
changed: [server2]
changed: [server3]

TASK: [Add SSH banner] ******************************************************** 
changed: [server2]
changed: [server4]
changed: [server3]

TASK: [Restart SSH Service] *************************************************** 
changed: [server4]
changed: [server3]
changed: [server2]

TASK: [Add motd.sh file] ****************************************************** 
changed: [server3]
changed: [server4]
changed: [server2]

TASK: [Update /etc/profile] *************************************************** 
changed: [server2]
changed: [server3]
changed: [server4]

PLAY RECAP ******************************************************************** 
server2                    : ok=9    changed=6    unreachable=0    failed=0   
server3                    : ok=9    changed=6    unreachable=0    failed=0   
server4                    : ok=9    changed=6    unreachable=0    failed=0

And finally, check that the new configuration has done what you expected it to do. Once you know that a playbook does everything you expect, you can trust it on future runs.

[root@server1 playbooks]# ssh server2

****************************************************************************

 WARNING! This is a private server. The use of this system is restricted to
 authorized users only. Unauthorized access is forbidden. All information
 and communications on this system are monitored.

****************************************************************************

Last login: Thu Nov 12 12:13:17 2015 from 10.44.16.151

WARNING! The system will be shut down for maintenance at 17:00 on 16/11/15

      Host: server2.hostname.com
    Uptime: 12:25:19 up 4:29, 1 user, load average: 0.08, 0.02, 0.01
    Memory: 105MB used / 890MB free
      Disk: 791M used / 7.1G free

[root@server2 ~]# logout
Connection to server2 closed.
[root@server1 playbooks]# ssh server3

****************************************************************************

 WARNING! This is a private server. The use of this system is restricted to
 authorized users only. Unauthorized access is forbidden. All information
 and communications on this system are monitored.

****************************************************************************

Last login: Thu Nov 12 03:41:27 2015 from 10.44.16.151

WARNING! The system will be shut down for maintenance at 17:00 on 16/11/15

      Host: server3.hostname.com
    Uptime: 03:53:35 up 4:30, 2 users, load average: 0.00, 0.00, 0.00
    Memory: 105MB used / 890MB free
      Disk: 887M used / 7.0G free