I've been spending some time lately familiarizing myself with EC2, setting up some MySQL servers & clusters here and there, and doing some really basic configuration testing. One situation you'll run into when interacting with EC2 is that it gets unwieldy to use the AWS Management Console web interface for interacting with your instances. There ends up being lots of scrolling, lots of staring, and lots of sighs. Since I'm using SSH to connect to and interact with my instances, I want a reasonable way to find information about them on the Unix command line.
Amazon has an official set of tools [http://aws.amazon.com/developertools/351] that gives you this information, at least theoretically. It is a gigantic distribution of shell scripts and Java madness that, if you are very patient, will eventually give you some information about your instances, in a format that is very difficult to work with.
$ time ./bin/ec2-describe-instances i-83c5d4e0
Unable to find a $JAVA_HOME at "/usr", continuing with system-provided Java...
RESERVATION r-7db4731c 801025846226
INSTANCE i-83c5d4e0 ami-31814f58 stopped skysql-ec2 0 m1.small 2011-12-09T20:41:39+0000 us-east-1c aki-805ea7e9 monitoring-disabled 10.0.0.164 vpc-cd4fafa5 subnet-c44fafac ebs paravirtual xen sg-134b547f default
BLOCKDEVICE /dev/sda1 vol-19ec6174 2011-12-10T01:30:32.000Z
TAG instance i-83c5d4e0 Name ndb32-02

real 0m7.693s
user 0m10.119s
sys 0m0.451s
OK, it takes me about 7.7 seconds to get data about an instance, and it's given to me in 4 lines. If I get information about all of my instances, I have no idea how I would be able to successfully grep through that to interact with any of it programmatically. I went looking for a different solution, preferably one that would be faster, more flexible, and easier to use.
I found a great script called, simply, aws, written by Timothy Kay [http://timkay.com/aws/].
$ du -hsc ec2-api-tools*
14M ec2-api-tools-1.5.0.1-2011.11.30
11M ec2-api-tools.zip
26M total
$ ls -sk aws
76 aws
Ahem. I'll take a 76K Perl script over a 14M mess any day. Let's see how it performs.
$ time aws din i-83c5d4e0
+------------+--------------+----------------------+------------------------------------------+------------+--------------+--------------------------+---------------------------------------------+--------------+----------------+-----------------+--------------+------------------+-----------------+---------------------------------------------+-------------------------------------------------------------------------------------------------+--------------+----------------+----------------+------------------------------------------------------------------------------------------------------------------------------------+--------------------+--------+------+----------+
| instanceId | imageId | instanceState | reason | keyName | instanceType | launchTime | placement | kernelId | monitoring | subnetId | vpcId | privateIpAddress | sourceDestCheck | groupSet | stateReason | architecture | rootDeviceType | rootDeviceName | blockDeviceMapping | virtualizationType | tagSet | key | value |
+------------+--------------+----------------------+------------------------------------------+------------+--------------+--------------------------+---------------------------------------------+--------------+----------------+-----------------+--------------+------------------+-----------------+---------------------------------------------+-------------------------------------------------------------------------------------------------+--------------+----------------+----------------+------------------------------------------------------------------------------------------------------------------------------------+--------------------+--------+------+----------+
| i-83c5d4e0 | ami-31814f58 | code=80 name=stopped | User initiated (2011-12-10 01:29:51 GMT) | skysql-ec2 | m1.small | 2011-12-09T20:41:39.000Z | availabilityZone=us-east-1c tenancy=default | aki-805ea7e9 | state=disabled | subnet-c44fafac | vpc-cd4fafa5 | 10.0.0.164 | true | item= groupId=sg-134b547f groupName=default | code=Client.UserInitiatedShutdown message=Client.UserInitiatedShutdown: User initiated shutdown | i386 | ebs | /dev/sda1 | item= deviceName=/dev/sda1 ebs= volumeId=vol-19ec6174 status=attached attachTime=2011-12-10T01:30:32.000Z deleteOnTermination=true | paravirtual | | | |
| | | | | | | | | | | | | | | | | | | | | | | Name | ndb32-02 |
+------------+--------------+----------------------+------------------------------------------+------------+--------------+--------------------------+---------------------------------------------+--------------+----------------+-----------------+--------------+------------------+-----------------+---------------------------------------------+-------------------------------------------------------------------------------------------------+--------------+----------------+----------------+------------------------------------------------------------------------------------------------------------------------------------+--------------------+--------+------+----------+

real 0m1.546s
user 0m0.123s
sys 0m0.035s
Well, the output format isn't exactly any more appealing than what you get from the Amazon tool, but it sure gives it to you a lot faster! A little poking around showed me that the aws tool allows you to forgo the pretty-printing and get the actual XML that the tool receives from the AWS API.
$ aws --xml din i-83c5d4e0
<?xml version="1.0" encoding="UTF-8"?>
<DescribeInstancesResponse xmlns="http://ec2.amazonaws.com/doc/2011-11-01/">
  <requestId>4e1bf76d-ad02-439b-b255-108e09713251</requestId>
  <reservationSet>
    <item>
      <reservationId>r-7db4731c</reservationId>
      <ownerId>801025846226</ownerId>
      <groupSet/>
      <instancesSet>
        <item>
          <instanceId>i-83c5d4e0</instanceId>
          <imageId>ami-31814f58</imageId>
          <instanceState>
            <code>80</code>
            <name>stopped</name>
          </instanceState>
          <privateDnsName/>
          <dnsName/>
          <reason>User initiated (2011-12-10 01:29:51 GMT)</reason>
          <keyName>skysql-ec2</keyName>
          <amiLaunchIndex>0</amiLaunchIndex>
          <productCodes/>
          <instanceType>m1.small</instanceType>
          <launchTime>2011-12-09T20:41:39.000Z</launchTime>
          <placement>
            <availabilityZone>us-east-1c</availabilityZone>
            <groupName/>
            <tenancy>default</tenancy>
          </placement>
          <kernelId>aki-805ea7e9</kernelId>
          <monitoring>
            <state>disabled</state>
          </monitoring>
          <subnetId>subnet-c44fafac</subnetId>
          <vpcId>vpc-cd4fafa5</vpcId>
          <privateIpAddress>10.0.0.164</privateIpAddress>
          <sourceDestCheck>true</sourceDestCheck>
          <groupSet>
            <item>
              <groupId>sg-134b547f</groupId>
              <groupName>default</groupName>
            </item>
          </groupSet>
          <stateReason>
            <code>Client.UserInitiatedShutdown</code>
            <message>Client.UserInitiatedShutdown: User initiated shutdown</message>
          </stateReason>
          <architecture>i386</architecture>
          <rootDeviceType>ebs</rootDeviceType>
          <rootDeviceName>/dev/sda1</rootDeviceName>
          <blockDeviceMapping>
            <item>
              <deviceName>/dev/sda1</deviceName>
              <ebs>
                <volumeId>vol-19ec6174</volumeId>
                <status>attached</status>
                <attachTime>2011-12-10T01:30:32.000Z</attachTime>
                <deleteOnTermination>true</deleteOnTermination>
              </ebs>
            </item>
          </blockDeviceMapping>
          <virtualizationType>paravirtual</virtualizationType>
          <clientToken/>
          <tagSet>
            <item>
              <key>Name</key>
              <value>ndb32-02</value>
            </item>
          </tagSet>
          <hypervisor>xen</hypervisor>
        </item>
      </instancesSet>
      <requesterId>058890971305</requesterId>
    </item>
  </reservationSet>
</DescribeInstancesResponse>
Sweet, sweet data! Hold on, though, I can't use grep to get at that. I'm going to have to remember how to interact with XML documents; I decided I had better see if I could dig up any XPath knowledge.
The next question was which tool to use to run XPath expressions against the XML. I was not very keen on having to write an entire Perl or Python script to read the XML, build it into some DOM, and then loop several times over crusty data structures to get the data I wanted. I wanted to be able to do some more generalized things that are very easily accomplished in XPath, such as getting a list of instances based on a prefix of their Name, getting a list of "stopped" instances, getting a list of instances with public IP addresses, et cetera.
I figured there must be some command-line tool that would let me execute arbitrary XPath against an XML file. After poking around a while, I found XMLStarlet [http://xmlstar.sourceforge.net/]. Installing this on my MacBook Pro using Homebrew [http://mxcl.github.com/homebrew/] was very easy and I was off to the races.
After grappling for a very annoying amount of time with XML namespaces, I eventually figured I'd just strip the thing out so that I didn't have to deal with it. (If you leave the namespace in, you have to give it an alias and then specify that before every tag in your XPath expressions. No, thanks.)
$ cat strip_xmlns
sed 's/ xmlns="[^"]*"//'
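To make the effect of that filter concrete, here is a minimal sketch that runs the same kind of sed substitution over a trimmed copy of the response's root element. The function form and the second sample element are assumptions made for the demo, not part of the original script. The pattern used here stops at the first closing quote, so any other attributes on the element are left alone.

```shell
#!/bin/bash
# Demo version of the strip_xmlns filter, written as a function.
# [^"]* ends the match at the first closing quote after xmlns=".
strip_xmlns() {
    sed 's/ xmlns="[^"]*"//'
}

# Root element from the DescribeInstances response, trimmed to one child.
SAMPLE='<DescribeInstancesResponse xmlns="http://ec2.amazonaws.com/doc/2011-11-01/"><requestId>4e1bf76d-ad02-439b-b255-108e09713251</requestId></DescribeInstancesResponse>'
printf '%s\n' "$SAMPLE" | strip_xmlns

# Hypothetical element with a second attribute, to show that only xmlns goes.
printf '%s\n' '<a xmlns="urn:example" id="1"/>' | strip_xmlns

# For comparison, the aliased form would look roughly like this
# (-N binds the arbitrary alias "e" to the namespace URI):
#   aws --xml din | xml sel -N e=http://ec2.amazonaws.com/doc/2011-11-01/ \
#       -T -t -m '//e:instancesSet/e:item' -v 'e:instanceId' -n
```

With the namespace gone, the XPath expressions can use bare element names such as instancesSet and instanceId.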
The xmlstarlet/xmlstar/xml tool works by specifying a template that includes some expression to match and some expressions to generate output. The tool does a lot, so some of the options can appear to be a bit verbose at first. Here's a very basic use of the tool to get just a list of instance IDs:
$ aws --xml din | strip_xmlns | xml sel -T -t -m '//instancesSet/item' -v 'instanceId' -n
i-d1dbceb2
i-afdacfcc
i-cbc2d7a8
i-99bfaafa
i-1d40547e
i-f7c5d494
i-83c5d4e0
i-77c4d514
i-75c4d516
i-47feee24
i-707d9512
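Since the options are terse, it may help to see that command taken apart. The sketch below only assembles and prints the argument list (the SEL_ARGS array name is made up for illustration); it does not call aws or xml, so it runs anywhere bash does.

```shell
#!/bin/bash
# The pieces of the "xml sel" invocation, one per line with a note on each.
SEL_ARGS=(
    sel                       # the select/query subcommand
    -T                        # produce plain text instead of XML
    -t                        # begin a template
    -m '//instancesSet/item'  # for each node matching this XPath...
    -v 'instanceId'           # ...print the value of this expression
    -n                        # ...and then a newline
)

# Print the assembled command rather than running it.
printf '%q ' xml "${SEL_ARGS[@]}"
echo
```

Building the arguments as an array like this keeps each XPath expression a single word, however many quotes or brackets it contains.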
You can see the XSLT that the tool is applying internally by using the -C option:
$ aws --xml din | strip_xmlns | xml sel -C -t -m '//instancesSet/item' -v 'instanceId' -n
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output omit-xml-declaration="yes" indent="no"/>
  <xsl:template match="/">
    <xsl:for-each select="//instancesSet/item">
      <xsl:value-of select="instanceId"/>
      <xsl:value-of select="' '"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
OK, so, there's a tool that will let me execute some XPath and get back information about my instances, that's nice. Instead of trying to parse some formatted output, I should be able to select the XML elements I want for a particular task.
Say I want the instance IDs of all instances that are stopped:
aws --xml din | strip_xmlns | xml sel -T -t -m '//instancesSet/item[instanceState/name="stopped"]' -v 'instanceId' -n
Or maybe I want all instances with Names that start with the string "ndb32":
aws --xml din | strip_xmlns | xml sel -T -t -m '//instancesSet/item[starts-with(tagSet/item[key="Name"]/value, "ndb32")]' -v 'instanceId' -n
Instead of having to write several loops in Perl or Python, I'm able to write a very straightforward expression that matches just the nodes I want. Instead of writing that XPath every time, of course, I'll put a few of the more popular ones into a script along with some flexibility to provide arbitrary filtering. (I call this WHERE in the script because that's the first thing my DBMS-addled brain came up with!)
#!/bin/bash
while getopts "p:s:w:" OPTION
do
    case $OPTION in
        p)
            WHERE="[starts-with(tagSet/item[key='Name']/value, '$OPTARG')]"
            ;;
        s)
            WHERE="[instanceState/name = '$OPTARG']"
            ;;
        w)
            WHERE="$OPTARG"
            ;;
    esac
done
MATCHEXPR="/DescribeInstancesResponse/reservationSet/item/instancesSet/item$WHERE"
# Each XPath is quoted so the shell never tries to glob the square brackets.
aws --xml din | strip_xmlns | xml sel -T -t -m "$MATCHEXPR" \
    -o "instanceId " -v instanceId -n \
    -o "instanceName " -v "tagSet/item[key='Name']/value" -n \
    -o "privateIp " -v privateIpAddress -n \
    -o "ipAddress " -v ipAddress -n \
    -o "instanceState " -v instanceState/name -n \
    -n

$ ec2-ls -p ndb
$ ec2-ls -s stopped
$ ec2-ls -w "[instanceType='m1.small']"
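One thing to be aware of in the script above: each option assigns WHERE outright, so if more than one of -p, -s, and -w is given, only the last one takes effect. A minimal sketch of one way around that, assuming you want the filters to combine, is to append each predicate instead. XPath applies chained predicates one after another, so item[a][b] keeps only the nodes that satisfy both. The build_matchexpr function name is hypothetical, and the sketch prints the expression rather than calling aws.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): turns ec2-ls style
# options into a match expression, appending predicates rather than
# overwriting them.
build_matchexpr() {
    local OPTIND=1 OPTARG OPTION WHERE=""
    while getopts "p:s:w:" OPTION; do
        case $OPTION in
            p) WHERE+="[starts-with(tagSet/item[key='Name']/value, '$OPTARG')]" ;;
            s) WHERE+="[instanceState/name = '$OPTARG']" ;;
            w) WHERE+="$OPTARG" ;;
        esac
    done
    printf '%s\n' "/DescribeInstancesResponse/reservationSet/item/instancesSet/item$WHERE"
}

# Stopped instances whose Name starts with "ndb".
build_matchexpr -p ndb -s stopped
```

The printed expression could then be passed to xml sel with -m, in the same position as MATCHEXPR.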
My script returns several items that may or may not be of interest to others. Further extension to the script could easily make the list of items returned a bit more useful. From that basically reasonable if limited script, I vastly overreached my bash skills and turned it into this monstrosity:
#!/bin/bash

OUTPUT="instanceId;instanceName:tagSet/item[key='Name']/value;privateIp:privateIpAddress;ipAddress;instanceState:instanceState/name"
DELIM=$'\t' #a <tab>, spelled out so it survives copy and paste
WHERE=""
DEBUG=0
declare -a XMLARGS

push() # Push item on stack.
{
    if [ -z "$1" ] # Nothing to push?
    then
        return
    fi
    XMLARGS[${#XMLARGS[*]}]="$1"
    return
}

while getopts "p:s:w:o:d:D" OPTION
do
    case $OPTION in
        p)
            WHERE="[starts-with(tagSet/item[key='Name']/value, '$OPTARG')]"
            ;;
        s)
            WHERE="[instanceState/name = '$OPTARG']"
            ;;
        w)
            WHERE="$OPTARG"
            ;;
        o)
            OUTPUT="$OPTARG"
            ;;
        d)
            DELIM="$OPTARG"
            ;;
        D)
            DEBUG=1
            ;;
    esac
done
shift $((OPTIND-1)) #discard the options getopts already parsed

for i in "sel" "-T" "-t" "-m"; do
    push "$i"
done

MATCHEXPR="/DescribeInstancesResponse/reservationSet/item/instancesSet/item$WHERE"
push "$MATCHEXPR"

OLDIFS=$IFS
IFS=";"
for f in $OUTPUT; do
    FIELDNAME=${f%%:*} #label: everything before the first colon
    FIELDEXPR=${f#*:}  #XPath: everything after it, or the whole field if there is no colon
    if [[ -z $FIELDEXPR ]]; then
        FIELDEXPR=$FIELDNAME
    fi
    push "-o"
    push "$FIELDNAME$DELIM"
    push "-v"
    push "$FIELDEXPR"
    push "-n"
done
push "-n"
IFS=$OLDIFS

if [[ $DEBUG -eq 1 ]]; then
    echo "$MATCHEXPR" >&2
    echo "${XMLARGS[@]}" >&2
fi

aws --xml din | strip_xmlns | xml "${XMLARGS[@]}"
I'm sure there are plenty of problems with that script, but at least now I can finally get the information I want about my EC2 instances!
$ ec2-ls -p ndb32 -o "instanceId;privateIp:privateIpAddress"
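The -o argument is a semicolon-separated list of fields, where each field is either a bare XPath expression or a label:expression pair. The sketch below shows, in isolation, how such a spec expands into xml sel arguments. The spec_to_args function is a hypothetical stand-in for the loop in the script, written with bash parameter expansion instead of cut, and it prints one argument per line so the result is easy to inspect.

```shell
#!/bin/bash
# Hypothetical helper: expands an -o style spec into "xml sel" arguments.
# $1 is the spec, $2 the delimiter printed after each label (default: tab).
spec_to_args() {
    local spec=$1 delim=${2:-$'\t'}
    local -a fields=() args=()
    local f name expr
    IFS=';' read -r -a fields <<< "$spec"
    for f in "${fields[@]}"; do
        name=${f%%:*}   # text before the first colon, or the whole field
        expr=${f#*:}    # text after the first colon, or the whole field
        args+=(-o "$name$delim" -v "$expr" -n)
    done
    args+=(-n)          # blank line between instances
    printf '%s\n' "${args[@]}"
}

# The spec from the example invocation, with "=" as a visible delimiter.
spec_to_args "instanceId;privateIp:privateIpAddress" "="
```

Each field becomes an -o (literal label), -v (XPath value), -n (newline) triple, which is why the output has one line per field for every instance.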

Comments
Looks awesome! Need to try! Thanks for sharing.
AWS EC2 Scripts
You can find my scripts at https://github.com/ronaldbradford/aws
The aws_audit.sh script will give you valuable information on instances (EC2) and load balancers (ELB), along with cross-references for instance, public name, IP, etc., including a parallel-ssh file.
I use these to manage environments with 500+ EC2 instances.