Automate incremental backups of your family's data with FreeNAS

10. September 2012

FreeNAS: network attached storage

All hard drives fail. This guide will show you how to create something every home needs: automated backups for every computer that use so little network bandwidth they're practical even over Wireless-G (thanks, rsync). All you need is an old PC with at least one hard drive, a 2GB or larger flash drive for the FreeNAS operating system, and a blank CD.

I prefer to burn the ISO from FreeNAS.org with ImgBurn. Network your soon-to-be FreeNAS box to your router via an ethernet cable. Before booting to the CD you may want to jump into BIOS and confirm the system is set to boot from not only CDs but also USB drives, and that it isn't set to halt on a missing keyboard if you plan to remove the peripherals later.

Insert and boot to the CD. At the FreeNAS prompt choose to Install/Upgrade it to your flash drive. More than likely you aren't upgrading an existing installation. As noted, the drive you install FreeNAS to will be completely wiped, so make sure you've properly selected your flash drive.

Remove the CD, reboot, and hopefully you're greeted with a URL that you can successfully access. (Provided your router has DHCP enabled.) If you can, and the BIOS isn't set to halt on a missing keyboard, your FreeNAS box should no longer require any peripherals such as a monitor, keyboard, or mouse.

Browse to the web interface from a different machine. http://freenas should work; if not, try the URL you were given at boot (you may also need to tweak your network's DNS configuration). Set a password by clicking Account, Change Password (leave Change root password as well checked).

Now we will set up a storage volume. Click Storage and Volume Manager. Select your storage disk(s) and filesystem type. ZFS is recommended but "requires 4GB of memory and needs 6 to run smoothly"; mine is an old box so I chose UFS. I named my volume data but you may want to choose something that shows up better in log files. FreeNAS will wipe the selected drives, so make sure you don't need any data from them.

I created a user account for my mom so that I could give only her access to her folder. If you're going to add others and would like folders shareable between them, first create a group they can belong to (such as "family"). You'll find Add Group and Add User under Account. Add a user, specifying username, full name, and password. Uncheck Create a new primary group for the user, and select either the family group you created or nogroup. Leave the rest at defaults.

(Screenshot of edit window, the one during creation is larger.)

Now we will create our folders. Click Shell and navigate to the volume with cd /mnt/data or similar. Since we may store other files on our NAS, let's create a folder just for backups: mkdir backup. We'll navigate into our folder, cd backup, and create a folder for mom: mkdir mom. We can view permissions with ls -l. Her folder is owned by root, and group wheel (whose members consist of just root by default). Mom's account is neither root nor a member of wheel, so let's change the folder's ownership to mom: chown mom mom (username then folder).
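If you expect to add more family members later, the same steps can be wrapped in a small function. This is only a sketch: make_backup_dir is a name I made up, it assumes the user account already exists, and it is written for plain sh (save it to a file and run it with sh if your login shell is something else).

```shell
#!/bin/sh
# Sketch: create <volume>/backup/<user> and hand ownership to that user.
# make_backup_dir is a hypothetical helper, not part of FreeNAS.
make_backup_dir() {
    volume=$1   # e.g. /mnt/data
    user=$2     # e.g. mom; the account must already exist
    dir="$volume/backup/$user"
    mkdir -p "$dir" || return 1      # creates backup/ too if it is missing
    chown "$user" "$dir" || return 1 # same as: chown mom mom
    ls -ld "$dir"                    # show the resulting owner and permissions
}
```

Calling make_backup_dir /mnt/data mom as root repeats the mkdir and chown commands from the paragraph above.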

Now we'll set up our Windows shares. Close the shell and navigate to Sharing, Windows (CIFS) Shares, Add Windows (CIFS) Share. If we share the backup folder in general she may see folders to which she doesn't have access, so let's share her folder directly. Name: mom, Path: /mnt/data/backup/mom, defaults for everything else. Click OK and you will be asked to enable the CIFS service, choose Yes.

You can now navigate to \\freenas\mom from a networked computer. Browse there from hers, supplying her FreeNAS username and password, and select Remember my credentials. This is important. Hooray! Access. You've just learned to use FreeNAS.

rsync on the server

rsync is an incredibly cool protocol and tool for sending only changed data through a network, hence small incremental backups to our FreeNAS box on a regular basis will yield a consistently complete backup.

We'll begin by creating an rsync module on the server, which isn't much more than an alias to a path with the credentials to access it. Use Add Rsync Module under Services. We'll name it mom, path: /mnt/data/backup/mom, user: mom.

Under Services click to turn on Rsync, and the wrench icon to configure it. We will add the following Auxiliary parameters:

incoming chmod = ug=rwx,o=
exclude from = /mnt/data/backup/exclude.txt
These settings apply to all rsync modules. Chmod sets the permission on incoming data so that our folder and files can continue to be accessed.

Back in Shell navigate to backup (cd /mnt/data/backup) and let's create exclude.txt to avoid backing up non-essentials. nano exclude.txt opens the file in a text editor, where we enter the following:

/$Recycle.Bin
/Windows
/Program Files
/hiberfil.sys
/pagefile.sys
Temp
Temporary Internet Files
*.tmp
Paths matching these patterns will be excluded. The initial forward slash designates the root of the folder we're sending, in our case this will be C:\, thus excluding C:\Windows, C:\Program Files and so forth. The others match anywhere, excluding all folders named Temp, Temporary Internet Files, and all files ending in .tmp.
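To make the anchored versus unanchored distinction concrete, here is a rough sketch in plain sh. It is not rsync's real matcher: is_excluded is a made-up helper that only covers the two cases used in exclude.txt, and the sample paths are invented for illustration.

```shell
#!/bin/sh
# Simplified illustration of the two kinds of pattern in exclude.txt.
# NOT rsync's actual filter code; is_excluded is a hypothetical helper.
is_excluded() {
    path=$1      # path relative to the transfer root, e.g. Users/mom/a.tmp
    pattern=$2   # one line from exclude.txt
    case $pattern in
        /*) # leading slash: only matches starting at the transfer root
            p=${pattern#/}
            case $path in
                $p|$p/*) return 0 ;;
            esac
            ;;
        *)  # no leading slash: may match any single path component
            case /$path/ in
                */$pattern/*) return 0 ;;
            esac
            ;;
    esac
    return 1
}

# /Windows excludes C:\Windows but not a folder named Windows deeper down.
for p in "Windows/System32/cmd.exe" "Users/mom/Windows/notes.txt"; do
    if is_excluded "$p" "/Windows"; then
        echo "excluded: $p"
    else
        echo "kept:     $p"
    fi
done
```

The real rules have more cases (wildcards across directories, include rules, trailing slashes), so treat this as a reading aid rather than a test harness for your own exclude file.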

Ctrl+X is nano's exit shortcut (Ctrl+Shift+X in the web Shell); type Y to save changes and press Enter to confirm the filename.

rsync on the client

The client configuration is easier. First we need rsync, so run Cygwin's setup.exe and install Cygwin to C:\cygwin. When choosing packages, search for rsync, expand Net, and click Skip; it should change to a version number indicating it will be installed. Continue with the installation.

Now we'll schedule our backup. Launch Task Scheduler and Create a basic task. Name it rsync backup to FreeNAS or similar, set to run Daily at a time of your choice.

Program/script: C:\cygwin\bin\bash.exe
Arguments: -c '/bin/rsync -amv --delete --ignore-errors /cygdrive/c/ freenas::mom 2>/dev/null'
The last part, 2>/dev/null, hides error messages. --delete removes files at the destination that no longer exist at the source, and --ignore-errors lets that deletion go ahead even if there were I/O errors. In -amv, a (archive mode) means we want to recurse into sub-directories and preserve almost everything, m prunes empty directories, and v outputs our progress. The trailing slash on /c/ is important as it specifies we're transferring the contents of the C drive. Without it our paths in exclude.txt would be incorrect.
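Because --delete is unforgiving, it can be worth previewing a run first. rsync's --dry-run flag lists what would be transferred or deleted without changing anything. The sketch below only prints the command line so you can paste it into a Cygwin shell. build_backup_cmd is a made-up helper, and it leaves off 2>/dev/null on purpose so errors stay visible while testing.

```shell
#!/bin/sh
# Sketch: print the rsync command line for a module, optionally as a dry run.
# build_backup_cmd is a hypothetical helper; it only prints, it never runs rsync.
build_backup_cmd() {
    module=$1     # rsync module name on the server, e.g. mom
    dry_run=$2    # "yes" adds --dry-run so nothing is copied or deleted
    opts="-amv --delete --ignore-errors"
    if [ "$dry_run" = "yes" ]; then
        opts="$opts --dry-run"
    fi
    printf '%s\n' "/bin/rsync $opts /cygdrive/c/ freenas::$module"
}

# Print the preview command for mom's module.
build_backup_cmd mom yes
```

Once the dry run lists what you expect, drop --dry-run and schedule the real command as described above.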

For the very first transfer, copying the data manually will be fastest; rsync is designed for sending changes between files and suffers on whole-file copies. Run the task to verify it works. Don't be surprised if it takes a while just building the file list; an entire drive has a lot of files. If successful, open the task's properties and select Run whether user is logged on or not, then check Do not store password. This will completely hide the backup.

You're finished! Your family members now have a complete daily backup with minimal network bandwidth. Congratulations.

possibilities

It shouldn't be too daunting to modify this for the full system image created by Windows Backup and Restore as an instant emergency restoration, though it would require much more storage. Another dynamic possibility is hourly, daily, and weekly snapshots for perusing old files. Future blog post if I do either.

Logging packet loss and repairing ADSL

2. September 2012

When you live in the boonies and your internet source is ADSL, connection issues can be frustrating. As with other connection issues, you need to trace and isolate the problem. Don't just bypass your router and plug your laptop into the modem; bypass your in-house telephone wiring too and connect the modem straight to your external telephone box. It's simple to do. Conduct your tests. If you're still seeing an issue, call your phone/internet company and they will check your connection at the local switch. You may be set for a speed higher than the last mile to your building can handle. Ask the technician if you're using the modem they recommend, especially if yours is one you sourced yourself.

Splitter and filter are my additions, not to mention running a direct line to the modem through the drop ceiling.

My own investigation revealed not only was our modem incompatible (unknown to the technician), not only was the connection set too high at the switch, but the in-house wiring was insufficient as well - and every phone was definitely filtered. I purchased a recommended modem from eBay ($40-$60), and the telephone company lowered our speed. My proudest moment however was installing a splitter right at the phone box with a dedicated line straight to the ADSL modem, and a single ADSL filter on the other (no other filters in the house). Our telephone box is old and it isn't pretty, but I'm damn proud.

All three components needed work before I saw the last of the packet loss disappear, and now the connection is rock solid. Using the highly stable and fantastic Tomato firmware on an ASUS RT-N16 is the other half of excellent reliability. I wouldn't trust anything else for avoiding support calls from my parents or regular reboot cycles.

When I was on this adventure I needed a way to record when packet loss occurred, and so I created the following PowerShell script:

It performs a 2-hop trace route to yahoo.com every second. Since it's only 2 hops we're really just looking at the connection from local machine to router, and router to internet provider, thus yahoo.com can be any external address. (If your modem isn't bridged to your router, or you have other routers in place this script may not be sufficient as-is.) It then checks for dropped pings denoted by asterisks on each hop and logs a failure, noting whether it's between machine and router or router and ISP.
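The check itself is simple enough to sketch. The following is not the PowerShell script. It is a rough stand-in for the asterisk check only, written in plain sh with awk. classify_trace is a made-up name, and the sample lines imitate the layout of Windows tracert output (for example from tracert -h 2 yahoo.com), so adjust the parsing if your output differs.

```shell
#!/bin/sh
# Sketch of the asterisk check described above (not the original script).
# classify_trace is a hypothetical helper: it reads tracert-style output on
# stdin and prints one line for each of the first two hops with a dropped probe.
classify_trace() {
    awk '
        $1 == "1" && /\*/ { print "loss between machine and router" }
        $1 == "2" && /\*/ { print "loss between router and ISP" }
    '
}

# Invented sample: hop 1 answers, hop 2 times out.
printf '%s\n' \
    '  1     1 ms    <1 ms     1 ms  192.168.1.1' \
    '  2     *        *        *     Request timed out.' |
    classify_trace
```

To log failures over time you could run the trace in a loop and prefix each printed line with the output of date.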

The script also records how long the machine has been idle though I don't quite remember why. You will need TimeIdlePhysical.exe, which is nothing more than a compiled 1-line AutoHotkey script.

If you would like a graph of ping response times, you can use the following:

This one creates the graph (you will need gnuplot):

gnuplot's auto-scaling works excellently with any range:

Facebook group geovisualization

1. September 2012

I administer a Facebook group and I noticed I can view the profile and location of its members. Whether this is because I'm an admin or merely another member I'm not sure. I wanted to visualize the spread of the members on a globe so I looked into using Facebook's Graph API to gather the data. To my surprise I discovered that this is not in the API, there is no available permission for this particular relationship.

Of course since I can view it manually, it's accessible. So I took an alternative route and used jQuery and Ajax, executed from Chrome's JavaScript console on any Facebook page (so the profile requests are same-origin and aren't blocked as cross-site). You first need your own basic access token, which is easy to retrieve by simply visiting Facebook's Graph API Explorer. You also need your group id, retrieved by looking at the link to your group on Facebook. With access token and group id we can assemble the code.

var accessToken = "ACCESS TOKEN GOES HERE";
var groupId = "GROUP ID GOES HERE";
var membersUrl = "https://graph.facebook.com/" + groupId + "/members?fields=id&access_token=" + accessToken + "&callback=?";

$.getJSON(membersUrl, function(members) {
    $.each(members.data, function(i, user) {
        $.ajax("http://www.facebook.com/" + user.id).done(function(data) {
            var match = data.match(/Lives in .+?<\/a>/);
            var location = match && match.length > 0 ? match[0] : "";
            console.log(user.id + ' ' + $(location).text());
        });
    });
});

Use a jQuerify bookmarklet to first load jQuery before executing the code from the console. It uses the Graph API to retrieve member ids, then loads each of their profiles asynchronously, matching the text "Lives in", which is used by the newer Timeline profiles. I couldn't find an older profile or I'd have added support for them. The user id and location are output to the console, where you can simply copy & paste them.

There are a couple of great tools for parsing unstructured geographical data, such as Yahoo Placemaker (here's a nice Python wrapper) and Geodict (works offline).

Finally to actually display the data, WebGL Globe over at Chrome Experiments looks promising, also Geochart. Geochart can be easily limited to a single country as well, for instance using: region: 'US', resolution: 'provinces' with a simple state and member count array yields nice results:

SSH on Windows 7 continued: charade, ssh, rsync, Unison

13. October 2010

05-16-11 Updated and polished.

In the previous article we established an SSH session with KiTTY. However to take full advantage of Cygwin and SSH it's equally important that your Windows client can connect with Cygwin's ssh.

As Pageant handles authentication for KiTTY, so ssh-agent authenticates for ssh.exe.
But we're on Windows! We like KiTTY, and Pageant has a nice interface, it should be all we need.

Charade is an ssh-agent in Cygwin that proxies requests to Pageant.

Client instructions (Windows-centric variation on keychain)

  1. Install Cygwin and hstart and configure environment variables on the client as done previously on the server.
  2. Download charade.exe
  3. Drop it in C:\cygwin\bin
  4. Add another program start action to our Pageant entry in Task Scheduler.
    Program: hstart Arguments: /noconsole "bash -c "charade > ~/.ssh-agent""
    Move this entry up, before Pageant's start action.
  5. Append source ~/.ssh-agent to the end of C:\cygwin\home\<User>\.bash_profile
  6. Run task, launch local Cygwin shell, connect to your server: ssh <hostname>. Hooray!


With charade operational, we can use rsync and Unison over SSH. Awesome!
Remember when we exported our private key in OpenSSH format (no file extension)? That's the one ssh.exe requires.

Here's an example bash script for pushing changes over a LAN with rsync that handles spaces in filenames.

#!/bin/bash
receiver=$1

# escape spaces in file paths
# (the escapes won't be visible if you echo... you'd need to triple escape... which we don't want)
# $2 and $3 are quoted so a path containing spaces reaches cygpath as one argument
src=`cygpath "$2"`
src="echo $src | sed 's/ /\\ /g'"
src=`eval $src`

if [ $# = 2 ]
then
    dest=$src
    #src=$src/
else
    dest=`cygpath "$3"`
    dest="echo $dest | sed 's/ /\\ /g'"
    dest=`eval $dest`
fi

source ~/.ssh-agent

# rsync
# -a, archival mode, does:
#   -r (recursive)
#   -l (copy symlinks as symlinks)
#   -p (preserve permissions)
#   -t (preserve modification times)
#   -g (preserve group)
#   -o (preserve owner)
#   -D (preserve device & special files)
# -v, verbose
# --delete, delete extraneous files from destination dirs (DANGEROUS)
# --rsh, the remote shell to use
# -z, compress file data during the transfer

# ssh
# -a, disables agent forwarding
# -x, disables x11 forwarding
# -c, set the cipher specification (blowfish being the quickest)

#LAN rsync:
rsync -s -av --delete --rsh="ssh -ax -c blowfish" "$src" $receiver:"$dest"
#WAN rsync:
#rsync -s -avz --delete --rsh="ssh -ax" "$src" $receiver:"$dest"

I call the script from within my text editor like this: cmd /c bash ~/push.sh Chris-Laptop 'C:\abc\some_source_dir' 'C:\some_dest_dir'
(or 'C:\abc\source_dir\' 'C:\abc\dest_dir\', or just a single 'C:\abc\source_dest\' if the path is equivalent at the destination.)

It's important to understand the distinction of a trailing slash on the source folder with rsync (especially with --delete). Back up your data before experimenting.
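As a quick reminder of what that distinction means: with a trailing slash rsync copies the contents of the source folder into the destination, without one it copies the folder itself, so an extra directory level appears. The sketch below just computes where a file would land. landing_path is a made-up helper and the paths are examples only.

```shell
#!/bin/sh
# Sketch: where does <file> end up for "rsync -a <src> <dest>"?
# landing_path is a hypothetical helper; it runs no rsync, it only builds the path.
landing_path() {
    src=$1    # source folder, with or without a trailing slash
    dest=$2   # destination folder
    file=$3   # file path relative to the source folder
    case $src in
        */) printf '%s\n' "${dest%/}/$file" ;;                      # contents only
        *)  printf '%s\n' "${dest%/}/$(basename "$src")/$file" ;;   # folder itself
    esac
}

landing_path /cygdrive/c/abc/ /cygdrive/c/dest notes.txt
landing_path /cygdrive/c/abc  /cygdrive/c/dest notes.txt
```

With --delete this matters a great deal: point the wrong form at an existing folder and rsync will happily remove everything there that doesn't match the source.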



More to come!