TroyGrosfield.com

MongoDB with RAID 10 on Ubuntu 11.04

by Troy Grosfield
November 8th, 2011
Developer

I recently set up RAID 10 on our MongoDB server and thought I would share how and why I did it.  We’re using Amazon Web Services (AWS) and running Ubuntu 11.04.

  1. What is RAID 10
  2. Why We Chose RAID 10
  3. Setup RAID 10 with MongoDB on Ubuntu 11.04
  4. Monitor the RAID
  5. Troubleshooting
    1. What to do When a Device in An Array Goes Bad
    2. How to Replace the Faulty Device
    3. What to do if the RAID fails
    4. Replacing an old MongoDB Instance with a new MongoDB RAID Instance
  6. Resources

What is RAID 10

RAID is an acronym for Redundant Array of Inexpensive/Independent Disks.

RAID 1+0: (a.k.a. RAID 10) mirrored sets in a striped set (minimum four drives; even number of drives) provides fault tolerance and improved performance but increases complexity.

The key difference from RAID 0+1 is that RAID 1+0 creates a striped set from a series of mirrored drives. The array can sustain multiple drive losses so long as no mirror loses all its drives.
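
As an illustration with four hypothetical drives, RAID 10 first pairs the drives into mirrors and then stripes data across those mirrored pairs:

                     RAID 0 (stripe)
                    /               \
           RAID 1 (mirror)     RAID 1 (mirror)
             /       \           /       \
         drive 1   drive 2   drive 3   drive 4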

RAID 10 works very well with MongoDB, both for data replication and for handling cases where EBS drives go bad.  Some of you may be asking, “why not just use MongoDB replica sets?”  Well, we are.  See below for why we chose to use RAID as well.

Why We Chose RAID 10

The single point of failure we kept hitting in our development stack was an attached EBS (drive) going bad on our mongo master instance.  We would incur downtime until we took that instance out of the replica set.  The MongoDB replica set failover only kicks in when the actual hardware goes bad, not when the EBS drive itself does.  The EBS would fail (it has happened about 3 times so far this year), mongo wouldn’t fail over to either of the two other mongo instances we had in the replica set, and we would get some nasty PROD error emails.

Setting up RAID 10 removed that single point of failure, and mongo now correctly fails over to one of the other mongo instances in the replica set if the RAID itself fails.

RAID 10 is also the RAID level recommended by MongoDB.

Setup RAID 10 with MongoDB on Ubuntu 11.04

Below are the steps to set up the RAID for Mongo.

  1. (AWS specific) In the AWS console, create your EC2 instance that will become your MongoDB server
  2. (AWS specific) Create 4 EBS Volumes of the same size in the same zone as your EC2 instance
  3. (AWS specific) Attach the 4 newly created volumes to your MongoDB server.  When you attach the devices it will ask you where you want to attach the drives.  Attach the devices at /dev/sdp4, /dev/sdp5, /dev/sdp6 and /dev/sdp7. The numbering isn’t super important.  Just make sure they are different.
    • NOTE: with the Ubuntu 11.04 AMI, whatever location you specify, the device will actually show up at /dev/xvdp# instead of /dev/sdp#.  See https://forums.aws.amazon.com/thread.jspa?messageID=205549
  4. ssh into your EC2 instance.
  5. Look to see the newly attached drives:
    $ ls /dev
    // The output here should show you all of your drives on this
    // box. Make note that you don't have md0 (this is where 
    // we will create the RAID) and see where your newly
    // attached EBS volumes are located. You should see something like:
    ...
    xvdp4
    xvdp5
    xvdp6
    xvdp7
  6. We’re going to use mdadm to manage our RAID.  It’s a Linux utility for managing software RAID devices.  So let’s install mdadm:
    $ sudo apt-get install mdadm
  7. Now that mdadm is installed, let’s create the RAID with our EBS drives:
    $ sudo mdadm --create --verbose /dev/md0 --level=10 --chunk=256 --raid-devices=4 /dev/xvdp4 /dev/xvdp5 /dev/xvdp6 /dev/xvdp7
    • --create is the command to create the RAID
    • --verbose prints the output
    • /dev/md0 is where we create the array.  If md0 is taken, feel free to use md1.
    • --level is the RAID level we intend to set up
    • --chunk is the chunk size
    • --raid-devices is the number of devices in your RAID
    • /dev/xvdp4 /dev/xvdp5 /dev/xvdp6 /dev/xvdp7 are the EBS devices we attached and the drives the RAID will consist of.
  8. Create the filesystem for the RAID:
    $ sudo mke2fs -t ext4 -F /dev/md0

    MongoDB recommends using either ext4 or xfs

  9. Create the directory where we’ll mount the Mongo EBS RAID:
    $ sudo mkdir -p /ebs
  10. Append the following line to your filesystem configuration file so the RAID is mounted at boot (see the note after these steps about having the array itself reassemble on reboot):
    $ sudo vi /etc/fstab
    // Add the following line at the bottom of the file
    /dev/md0        /ebs     auto    defaults,nobootwait,noatime     0       0
  11. Mount the RAID to the ebs folder we just created:
    $ sudo mount /dev/md0 /ebs
  12. Check the status to make sure the RAID was successfully created and is working properly:
    $ sudo mdadm --detail /dev/md0
    // Sample healthy RAID output:
    
    ...[omitted]...
    
     Active Devices : 4
    Working Devices : 4
     Failed Devices : 0
    
    ...[omitted]...
    
     Rebuild Status : 6% complete
    
    ...[omitted]...   
    
    Number   Major   Minor   RaidDevice State
       0     202      244        0      active sync   /dev/xvdp4
       1     202      245        1      active sync   /dev/xvdp5
       2     202      246        2      active sync   /dev/xvdp6
       3     202      247        3      active sync   /dev/xvdp7

    The rebuild can take a very long time depending on the size of your volumes. It took about 2.5 hours for the sync to complete for us with 4 EBS volumes at 120 GB each.

  13. Once the sync is complete your RAID is set up.  Now you can install MongoDB and add this EC2 instance to your MongoDB replica set.

That’s it, you’re done!  You now have RAID set up.

NOTE: We’re using AWS, but the RAID isn’t dependent upon AWS in any way.  You simply need to set up your server and attach the devices you want in your array accordingly.
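
If you also want the array itself to reassemble automatically after a reboot (the fstab line above only handles mounting), a minimal sketch, assuming Ubuntu’s stock mdadm package paths, is to record the array definition in /etc/mdadm/mdadm.conf and rebuild the initramfs:

    # Append the array definition (an ARRAY line) to mdadm's config file.
    $ sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
    # Rebuild the initramfs so the array is assembled early at boot.
    $ sudo update-initramfs -u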

Monitor the RAID

mdadm has a monitor mode (--monitor) that will watch your RAID for you, which is fine (a sketch of it follows these steps).  However, we wanted a cron job that runs every 10 minutes, checks the RAID and emails us if a device in the RAID is faulty.

  1. Create the script to check the RAID status and send an email if it’s faulty:
    #!/bin/sh
    
    STATUS="$(grep '(F)' /proc/mdstat)"
    
    if [ "$STATUS" != "" ]
    then
        DETAILED_STATUS="$(sudo mdadm --detail /dev/md0)"
        MESSAGE="Failed device on $STATUS\n\n$DETAILED_STATUS"
        sendEmail -f some_from_email@gmail.com \
                  -t someadminemail@gmail.com \
                  -u 'Failed device on RAID' \
                  -m "$MESSAGE"
    fi
  2. Create a cron job that will run the script every 10 minutes:
    # Job to monitor the mongo RAID array to make sure no devices went bad.
    # If a device is indeed bad, an email will be sent to the admins.
    # This task runs every 10 minutes.
    */10 * * * * sudo sh /location/to/monitor/script/RaidMonitor
  3. Start the cron job:
    $ crontab /location/to/crontab/file/jobs.crontab
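
As an alternative to rolling your own script, mdadm’s built-in monitor mode mentioned above can send the email for you.  A minimal sketch, assuming a working mail setup on the box for the --mail option:

    # Watch all arrays in the background and email on failure events.
    $ sudo mdadm --monitor --scan --daemonise --mail=someadminemail@gmail.com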

Troubleshooting

What to do When a Device in An Array Goes Bad

These are the steps to take to see whether a device in the array has gone bad and needs to be replaced.  Check the status of the RAID array. There are two ways of doing this:

  1. $ cat /proc/mdstat

    When running this command, a healthy array output will show the RAID name and
    all the devices in the array:

    md0 : active raid10 xvdp7[3] xvdp6[2] xvdp5[1] xvdp4[0]

    A RAID with problems will show output like the following:

    md0 : active raid10 xvdp7[3] xvdp6[2] xvdp5[1] xvdp4[0](F)

    You’ll notice the (F) near device xvdp4. This shows that it’s faulty.

  2. $ sudo mdadm --detail /dev/md0

    As in the sample output earlier, if everything is healthy you’ll see something like:

    ...[omitted]...
    
     Active Devices : 4
    Working Devices : 4
     Failed Devices : 0
    
    ...[omitted]...
    
    Number   Major   Minor   RaidDevice State
       0     202      244        0      active sync   /dev/xvdp4
       1     202      245        1      active sync   /dev/xvdp5
       2     202      246        2      active sync   /dev/xvdp6
       3     202      247        3      active sync   /dev/xvdp7

    If a device in the array goes bad it will look something like:

    ...[omitted]...
    
     Active Devices : 4
    Working Devices : 3
     Failed Devices : 1
    
    ...[omitted]...
    
    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1     202      245        1      active sync   /dev/xvdp5
       2     202      246        2      active sync   /dev/xvdp6
       3     202      247        3      active sync   /dev/xvdp7
    
       0     202      244        -      faulty spare   /dev/xvdp4

    From this we can see that /dev/xvdp4 is bad and needs to be removed from the
    array.

How to Replace the Faulty Device

  1. ssh onto the Mongo master instance and remove the faulty device from the array (if mdadm hasn’t already marked the device as failed, see the note after these steps):
    $ sudo mdadm /dev/md0 -r /dev/xvdp4
  2. In the AWS console, create and attach a new volume to your mongo
    master instance.  Make sure to give it a new attachment location (i.e. /dev/xvdp8).
  3. Once attached, locate the newly attached volume in /dev and add it to the array:
    $ sudo mdadm /dev/md0 -a /dev/xvdp#

    NOTE: replace the ‘#’ with the number you used when you attached the device.
    If you forgot, you should be able to see what the attached number was
    in the AWS console.

  4. Run the following command to see if it’s correctly added to RAID and starting
    to sync:

    $ sudo mdadm --detail /dev/md0

    The output will look something like the following:

    ...[omitted]...
    
     Active Devices : 3
    Working Devices : 4
     Failed Devices : 0
      Spare Devices : 1
    
    ...[omitted]...
    
    Rebuild Status : 6% complete
    
    ...[omitted]...
    
    Number   Major   Minor   RaidDevice State
       0     202      244        0      spare rebuilding /dev/xvdp8
       1     202      245        1      active sync   /dev/xvdp5
       2     202      246        2      active sync   /dev/xvdp6
       3     202      247        3      active sync   /dev/xvdp7
  5. Once the sync finishes, you’re done!
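
NOTE: the remove in step 1 only works on a device that mdadm already considers failed (or spare).  If a drive is misbehaving but hasn’t been flagged yet, a minimal sketch, reusing the /dev/xvdp4 example above, is to mark it failed first and then remove it:

    $ sudo mdadm /dev/md0 -f /dev/xvdp4
    $ sudo mdadm /dev/md0 -r /dev/xvdp4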

What to do if the RAID fails

This likely only happens if multiple attached EBS volumes go bad; with RAID 10 the
array fails once both drives in the same mirrored pair are lost (very unlikely), but in
the event this does happen…

  1. Verify this is the case by checking the RAID status:
    $ sudo mdadm --detail /dev/md0
  2. If the state shows failed, then continue on.
  3. ssh onto the raid mongo box
  4. Unmount the md device:
    $ sudo umount /dev/md0
  5. Stop the RAID array:
    $ sudo mdadm -S /dev/md0

    Stopping the array will actually force a secondary replica set member to become
    primary. So if we are having downtime, stopping the array gets us back up.

  6. Recreate the RAID with the correct working devices:
    $ sudo mdadm --create --verbose /dev/md0 --level=10 --chunk=256 --raid-devices=4 /dev/xvdp# /dev/xvdp# /dev/xvdp# /dev/xvdp#
  7. Mount the RAID array:
    $ sudo mount /dev/md0 /ebs
  8. Wait until the resyncing is complete before continuing (a small polling sketch
    follows these steps). Check the status by running:

    $ sudo mdadm --detail /dev/md0
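
If you’d rather not re-run that command by hand, a minimal polling sketch, assuming /proc/mdstat reports “resync” or “recovery” while the array is rebuilding:

    #!/bin/sh
    # Poll /proc/mdstat once a minute until no resync/recovery is in progress.
    while grep -Eq 'resync|recovery' /proc/mdstat
    do
        sleep 60
    done
    echo "RAID rebuild finished"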

Replacing an old MongoDB Instance with a new MongoDB RAID Instance

This is how we made the switch to incorporate the new MongoDB RAID instance into our replica set and make it the master.

  1. Follow the steps to create a new RAID Mongo instance at the beginning of this post
  2. Wait for the RAID sync to happen. This takes a long time (it can take up to a few hours)! You can add the new box with RAID to the mongo replica set now, but I would wait until the RAID sync is finished. You can check the sync status by running:
    $ sudo mdadm --detail /dev/md0
  3. Once the RAID sync is finished, go into the mongo master shell and add the new mongo box to the replica set:
    $ rs.add("ec2-##-##-###-###.compute-1.amazonaws.com")
  4. Wait until the mongo RAID instance has sync’d with the other members of the replica set. Test by going into the mongo shell on the new box and verifying the data is all there.  A simple count of the number of competitions and/or users should match on both the RAID mongo box and the other mongo box.
  5. Once you’re confident all the data has sync’d, make the new mongo box with RAID the master by running the following:
    > config = rs.conf()
    {
    	"_id" : "foo",
    	"version" : 1,
    	"members" : [
    		{
    			"_id" : 0,
    			"host" : "A",
    		},
    		{
    			"_id" : 1,
    			"host" : "B",
    		},
    		{
    			"_id" : 2,
    			"host" : "C",
    		},
    		{
    			"_id" : 3,
    			"host" : "D",
    		}
    	]
    }
    > config.version++
    > // the member number is the 0-based index in the list, not the _id.  So
    > // '3' is the 3rd index or the 4th member in the array.
    > // the default priority is 1
    > config.members[3].priority = 2
    > rs.reconfig(config)

    see: http://www.mongodb.org/display/DOCS/Forcing+a+Member+to+be+Primary

    Keep checking the status until you see the box you want become “Primary” (a quick way to check is sketched at the end of this section).

  6. Remove the old, now-unused mongo box from the replica set by issuing the following commands in the mongo master shell:
    > config = rs.config()
    > config.version++
    > // Find the members you want to keep.
    > config.members = [config.members[1], config.members[2], config.members[3]]
    > rs.reconfig(config)
    > rs.status()

    You should only see the three mongo instance members you wanted to keep.

You have now added the new Mongo instance with RAID into the replica set and removed the old instance.
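
To see at a glance which member has become primary while waiting on step 5, a quick check from the mongo shell (a small sketch; it simply prints each member’s name and state from rs.status()):

    > rs.status().members.forEach(function(m) { print(m.name + " : " + m.stateStr); })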

Helpful resources

Comments
  1. Stuart Battersby, August 19th, 2012 at 2:12 am

     Troy, great article. We’re in the situation where the VM holding the disks has died. If we’re to attach these 4 disks with their data to a new instance, how would we get mdadm to recognise that data exists on them and rebuild the array without wiping the disks? Thanks!

  2. Sandip Chaudhari, August 12th, 2012 at 10:53 am

     Very helpful and thorough article. Thanks for sharing all this information.

     I would like to add to the resources – except monitoring, all the rest can be automated using Amazon’s CloudFormation – http://www.mongodb.org/display/DOCS/Automating+Deployment+with+CloudFormation

     and of course, monitoring can also be added to the CloudFormation by tweaking the template.

  3. Garrett Robinson, June 21st, 2012 at 4:51 pm

     Thanks for taking the time to write this! Great article, very clear and straightforward.

  4. How to build a RAID 10 EBS array on Amazon EC2 with Ubuntu 12.04 « Nate's Blog, May 9th, 2012 at 7:26 pm

     […] http://blog.troygrosfield.com/2011/11/08/mongodb-with-raid-10-on-ubuntu-11-04/  (great article) […]

  5. Notes on MongoDB @ AWS-Ubuntu-12.04 XFS, RAID10 & LVM « My missives, May 5th, 2012 at 11:03 am

     […] MongoDB, RAID10 & Ubuntu – includes detailed commands & explanations […]

  6. Joe, February 18th, 2012 at 1:46 pm

     Thanks a lot for this post, very helpful.

  7. Amnon, January 28th, 2012 at 7:50 am

     Thank you very much for this post, very helpful and saved me probably 2 working days of sorting this out on my own.

  8. [Rough translation] MongoDB With RAID 10 on Ubuntu 11.04 | Charsyam’s Blog, January 15th, 2012 at 9:34 am

     […] This post is a rough translation of MongoDB with RAID 10 on Ubuntu 11.04 (http://blog.troygrosfield.com/2011/11/08/mongodb-with-raid-10-on-ubuntu-11-04/). For any mistranslations […]

  9. Alex, December 30th, 2011 at 5:11 pm

     Does anything need to be done to have the RAID array up on reboot and /dev/md0 mounted to /ebs?

  10. Valeeum, November 25th, 2011 at 10:23 pm

     Thanks for this great writeup. If one is to automate this process by creating an install script using the steps you’ve outlined above, how can you implement a delay with a check to see if the RAID re-syncing is complete?

  11. MongoDB Raid 10, November 12th, 2011 at 11:01 am

     […] What is RAID 10 […]