Amazon Elastic Compute Cloud (Amazon EC2)

Amazon EC2

Secure and resizable compute capacity in the cloud. Launch applications when needed without upfront commitments.

Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides secure, resizable compute capacity in the cloud. It is designed to make web-scale cloud computing easier for developers.
Amazon EC2’s simple web service interface allows you to obtain and configure capacity with minimal friction. It provides you with complete control of your computing resources and lets you run on Amazon’s proven computing environment. Amazon EC2 reduces the time required to obtain and boot new server instances to minutes, allowing you to quickly scale capacity, both up and down, as your computing requirements change. Amazon EC2 changes the economics of computing by allowing you to pay only for capacity that you actually use. Amazon EC2 provides developers the tools to build failure-resilient applications and isolate them from common failure scenarios.

What Is an Instance?

At the simplest level, an instance can be thought of as a virtual server, the same as you might rent on a monthly basis from a virtual private server (VPS) provider. Indeed, some people are using EC2 in exactly the same way as they would a VPS. While perfectly serviceable in this respect, to use it in this way ignores several interesting features and technologies that can make your job a lot more convenient.
Amazon Machine Images (AMIs) are the main building blocks of EC2. They allow you to configure an instance once (say, installing Apache or Nginx) and then create an image of that instance. The image can be used to launch more instances, all of which are functionally identical to the original. Of course, some attributes—such as the IP address or instance ID—must be unique, so there will be some differences.
AWS REGIONS AND AVAILABILITY ZONES
AWS services operate in multiple geographic regions around the world. At the time of this writing, there are seventeen public AWS regions, each of which is further divided into multiple availability zones. This geographic dispersion has two main benefits: you can place your application resources close to your end users for performance reasons, and you can design your application so that it is resilient to loss of service in one particular region or availability zone. AWS provides the tools to build automatic damage control into your infrastructure, so if an availability zone fails, more resources can be provisioned in the other availability zones to handle the additional load.
Each availability zone (AZ) is located in a physically separate datacenter within its region. There are three datacenters in or around Dublin, Ireland, that make up the three availability zones in the EU West 1 region—each with separate power and network connections. In theory, this means that an outage in one AZ will not have any effect on the other AZs in the region. In practice, however, an outage in one AZ can trigger a domino effect on its neighboring AZs, and not necessarily due to any failing on Amazon’s part.
Consider a well-architected application that, in the event of an AZ failure, will distribute traffic to the remaining AZs. This will result in new instances being launched in the AZs that are still available. Now consider what happens when hundreds of well-architected applications all failover at the same time—the rush for new instances could outstrip the capability of AWS to provide them, leaving some applications with too few instances.
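The arithmetic behind this rush for capacity can be sketched in a few lines. This is only an illustration, assuming instances are spread evenly across zones and that each surviving zone absorbs an equal share of the lost capacity; the function name and the numbers are hypothetical.

```python
import math

def extra_instances_per_zone(total_instances: int, zones: int, failed_zones: int = 1) -> int:
    """Instances each surviving zone must add to restore the original capacity.

    Assumes an even spread across zones (an illustrative simplification).
    """
    surviving = zones - failed_zones
    if surviving <= 0:
        raise ValueError("at least one zone must survive")
    per_zone_before = total_instances / zones
    per_zone_after = total_instances / surviving
    return math.ceil(per_zone_after - per_zone_before)

# 12 instances over 3 zones: each surviving zone grows from 4 to 6 instances.
print(extra_instances_per_zone(12, 3))
```

Multiplied across hundreds of tenants failing over at once, even this modest per-application increase adds up to a large simultaneous demand on the remaining zones.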
This is an unlikely event—although AWS has service outages like any other cloud provider, deploying your application to multiple AZs will usually be sufficient for most use cases. To sustain the loss of a significant number of AZs within a region, applications must be deployed to multiple regions. This is considerably more challenging than running an application in multiple AZs.
A final reminder that AWS services are not uniformly available across all regions—validate deployment plans involving regions you are not already familiar with against the newest version of the official Region Table.

Instance Types

EC2 instances come in a range of sizes, referred to as instance types, to suit various use cases. The instance types differ wildly in the amount of resources allocated to them. The m3.medium instance type has 3.75 GB of memory and 1 virtual CPU core, whereas its significantly bigger brother c3.8xlarge has 60 GB of memory and 32 virtual CPU cores. Each virtual CPU is a hyperthread of an Intel Xeon core in the m3 and c3 instance classes.
For most of the examples in the book, we will use a t2.micro instance. It is one of the smallest and cheapest instance types and is suitable for any operating system choice, which makes it ideal for our tests.
In production, picking the right instance type for each component in your stack is important to minimize costs and maximize performance, and benchmarking can be the key when making this decision.

Processing Power

EC2, along with the rest of AWS, is built using commodity hardware running Amazon’s software to provide the services and APIs. Because Amazon adds this hardware incrementally, several hardware generations are in service at any one time.
WARNING
When it comes to discussing the underlying hardware that makes up the EC2 cloud, Amazon used to play the cards close to its chest and reveal relatively little information about the exact hardware specifications. This led to the creation of a dedicated compute unit:
One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.
It is easy to encounter this metric in older AWS benchmarks. Amazon now openly identifies what hardware underlies the EC2 compute layer, and these abstract units are obsolete and no longer in use.
Amazon provides a rather vast selection of instance types, the current generation of which is described at the EC2 Instance Types page. The previously mentioned t2.micro instance type therefore refers to a second generation general-purpose burstable performance instance. An immediate update of already running applications is generally not required as older generations remain available for provisioning, with their original functionality intact. It remains advisable to adopt the latest instance type generation when designing a new (or revised) application, so as to benefit from the capabilities of the newer hosting hardware.
TIP
No EC2 instance type has ever been discontinued in almost 10 years. This record is made possible by market forces: as newer instance types become available, their significantly better price/performance ratio induces a user migration away from the previous generation. A reduced demand base in turn allows Amazon to continue to supply those deprecated instance types without having to add capacity with old hardware that may be unavailable.
Older instance types are, however, not available in the newer AWS regions they predate—for example, the first generation to be deprecated, cc1, is not found in the newest region ap-northeast-2 hosted in Seoul, Korea. If our spirited advice and the cost savings produced by migrating to newer instance generations are not sufficient to entice you to regularly update your instance selection, perhaps your global expansion plans will.
AWS machine images may make use of either of the two virtualization types supported by the Xen hypervisor: paravirtualized or hardware virtual machine (HVM). It is not necessary to be conversant in the finer differences of the two technologies to make effective use of AWS, but the two approaches present boot-time differences to the guest OS environment. A given Linux machine image will only support booting one virtualization type as a result, a requirement easily met by filtering any image search with the appropriate virtualization type.
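Filtering an image search by virtualization type might look like the sketch below. It assumes a boto3-style EC2 client is passed in (for example `boto3.client("ec2")`); the helper names `image_filters` and `find_images` are hypothetical, while `describe_images` and the `virtualization-type` filter are part of the EC2 API.

```python
def image_filters(virtualization_type: str = "hvm") -> list:
    """Build a Filters list for an EC2 DescribeImages request."""
    if virtualization_type not in ("hvm", "paravirtual"):
        raise ValueError("expected 'hvm' or 'paravirtual'")
    return [
        {"Name": "virtualization-type", "Values": [virtualization_type]},
        {"Name": "state", "Values": ["available"]},
    ]

def find_images(ec2_client, owner: str = "amazon", virtualization_type: str = "hvm") -> list:
    """Return the IDs of images owned by `owner` that boot with the given type."""
    response = ec2_client.describe_images(
        Owners=[owner],
        Filters=image_filters(virtualization_type),
    )
    return [image["ImageId"] for image in response["Images"]]
```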
Amazon recommends using HVM virtualization on current-generation AMIs. Where that approach is not suitable, it becomes necessary to determine what virtualization type is supported by the older generation of a specific instance type. This is quickly accomplished by launching a test HVM instance from the AWS CLI and watching for a helpful error message. The AWS documentation also provides insight into what virtualization type is supported by what older instance type.
Different combinations of CPU, memory, network bandwidth, and even custom hardware differentiate AWS instance types. There are nine instance type classes in the current generation at the time of writing, including general purpose (M4, M3), burstable performance (T2), compute optimized (C4, C3), memory intensive (R3), storage optimized (I2 for performance, or D2 for cost), and GPU enabled (G2). These in turn include multiple types with resource allotments of increasing size, bringing the total number of choices to more than forty.
NOTE
Jeff Barr of Amazon has published an interesting timeline of EC2’s instance generations.
Taking a scientific approach to benchmarking is the only way to really be sure you are using the right instance type. AWS makes it really simple to run the very same workload configuration with a succession of different instance types, considerably simplifying this task. The most common approach in the AWS user community is to start with an instance type considered high-CPU for the workload under consideration. While running top, drive the CPU to 100% using your application’s load generator of choice. Now examine memory use: if you observe the instance running out of memory before the CPU is at full throttle, switch to a higher-memory instance type. Continue this process until you achieve a reasonable balance.
Alongside fixed-performance instances, including the C4, C3, and R3 types, EC2 offers burstable performance instances like the T2 type. Burstable performance instances generally operate at a CPU performance baseline but can “burst” above this limit for a time. Bursting is governed by CPU credits that are accumulated when the instance runs without its full allotment of CPU. A CPU credit represents use of a full CPU core for one minute.
A practical example will illustrate the accounting mechanism EC2 employs: a t2.micro instance type allocates one virtual CPU to your cloud instance, with six CPU credits earned each hour, representing a 10% share of a real CPU core. Let’s assume our workload is a web server, often idling while waiting for requests. If the CPU load falls below 10%, the unspent CPU credits accumulate in that instance’s credit balance, where they remain available for up to 24 hours. Burstable performance is particularly useful for workloads that do not consistently use their full share of the CPU, but benefit from having access to additional, fast CPUs when the occasion arises; applications include small databases, web servers, and development systems.
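The credit accounting can be approximated with a small simulation. This is a simplified model rather than the exact EC2 algorithm: it works in whole hours, uses the t2.micro figures quoted above (six credits per hour, a cap of 24 hours' worth of credits), and simply clamps the balance at zero instead of modeling the throttling back to baseline.

```python
def simulate_credits(hourly_utilization, earn_rate=6.0, max_balance=144.0, start=0.0):
    """Track a burstable instance's CPU credit balance hour by hour.

    hourly_utilization: fraction of one vCPU used during each hour (0.0 to 1.0).
    One credit is one vCPU-minute, so an hour at utilization u spends 60 * u credits.
    max_balance defaults to 24 hours of earnings (24 * 6 = 144).
    """
    balance = start
    history = []
    for utilization in hourly_utilization:
        balance += earn_rate - 60.0 * utilization
        balance = max(0.0, min(balance, max_balance))
        history.append(round(balance, 2))
    return history

# Three idle hours at 5% CPU, then one busy hour at 25% CPU.
print(simulate_credits([0.05, 0.05, 0.05, 0.25]))
```

At exactly 10% utilization the instance spends six credits per hour, the same amount it earns, which is why 10% is the baseline.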
STOLEN CPU TIME
Alongside the traditional CPU shares of us (user), sy (system), id (idle), and wa (IO wait), the EC2 hypervisor exposes the additional metric st, meaning stolen:
%Cpu(s):  0.1 us,  0.1 sy,  0.1 ni, 98.2 id,  1.0 wa,  0.0 hi,  0.0 si,  0.5 st
Stolen CPU time represents the share of time the instance’s virtual CPU has been waiting for a real CPU while the hypervisor is using it to service another virtual processor. Stolen CPU has gained prominence as a metric that Netflix, possibly the most prominent AWS tenant, tracks closely. Despite its present fame, Stolen CPU is not as significant for workloads that are not sensitive to network jitter or real-time in nature.
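To track the stolen share from a script, the summary line can be parsed as in the sketch below. It assumes the space-separated layout shown above; other versions of top format this line differently, and the function name is hypothetical.

```python
import re

def parse_cpu_line(line: str) -> dict:
    """Parse a top '%Cpu(s):' summary line into {label: percentage}."""
    pairs = re.findall(r"(\d+(?:\.\d+)?)\s+([a-z]{2})", line)
    return {label: float(value) for value, label in pairs}

sample = "%Cpu(s):  0.1 us,  0.1 sy,  0.1 ni, 98.2 id,  1.0 wa,  0.0 hi,  0.0 si,  0.5 st"
shares = parse_cpu_line(sample)
print(shares["st"])  # stolen CPU share, in percent
```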
The Noisy Neighbor is a related compute cause célèbre: in any virtual environment, the noisy neighbor effect occurs when an instance starves other instances for a shared resource, causing performance issues to others on the same infrastructure. You will not observe memory or CPU contention as EC2 instances are generally not overprovisioned; any potential noisy neighbor problems will be limited to network or disk I/O.
One simple approach countering this issue is to automatically allocate a new instance, replacing the one where the performance problem was encountered. Larger instance types are less likely to present this problem on account of sharing a host with fewer neighbors. SR-IOV support (Enhanced Networking) increases storage and network I/O bandwidth, helping to minimize any noise. The ultimate solution is to use Dedicated Hosts, a facility providing complete control of your instance placement for an additional fee.
Specific instance types may provide the latest advanced features found in Intel hardware, including on-chip support for AES encryption and the Advanced Vector Extensions instruction set. The G2 instance type is currently the most prominent example of enhanced compute support, featuring more than 1,500 NVIDIA GPU cores. Advanced compute options are rapidly evolving; their most recent iteration is documented in the instance types page, which we recommend you review often.
EC2 instances can be purchased in three ways. Allocated by the hour and requiring no upfront commitment, on-demand instances are the default and are used exclusively throughout this book. Reserved instances represent a prepaid commitment on the part of a customer, which is usually rewarded by AWS with very steep discounts, up to 75% off on-demand pricing. Spot instances require no upfront commitment, and their pricing fluctuates according to the supply and demand of compute capacity. The customer may define a maximum hourly price not to be exceeded, and EC2 will automatically shut those instances down if their spot pricing tops the set threshold.
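The spot threshold rule can be illustrated with a toy check. The price series is invented for the example, and the function only models the rule as described here, not the actual EC2 spot market.

```python
def first_interruption(hourly_spot_prices, max_price):
    """Return the index of the first hour whose spot price exceeds max_price,
    or None if the instance would have kept running throughout."""
    for hour, price in enumerate(hourly_spot_prices):
        if price > max_price:
            return hour
    return None

prices = [0.012, 0.015, 0.031, 0.014]  # hypothetical USD per hour
print(first_interruption(prices, max_price=0.020))
```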

Storage

There are two options when it comes to virtual disk storage for your instances: instance storage (also known as ephemeral storage) and Elastic Block Store (or EBS). Both are simply block storage devices that can be attached to instances. Once attached, they can be formatted with your operating system’s tools and will act like a standard disk. AWS storage comes in two flavors: magnetic disks and solid-state drives (SSDs). SSDs provide higher read and write performance when compared to magnetic disks, but the cost is slightly higher.
There are some key differences between instance storage and EBS. Instance storage is directly attached to the physical host that runs your instance, whereas EBS is attached over the network. This has implications in terms of disk latency and throughput, so we recommend performing another series of benchmarks to see which is best if your application is sensitive to latency or I/O jitter.
I/O speeds are not the only difference: EBS has features that make it preferable to instance storage in nearly all usage scenarios. One of the most useful is the ability to create a snapshot from an EBS volume. A snapshot is a copy of an EBS volume at a particular point in time. Once you have created a snapshot, you can then create additional EBS volumes that will be identical copies of the source snapshot. You could, for example, create a snapshot containing your database backups. Every time a new instance is launched, it will have a copy of the data ready for use. EBS snapshots form the backbone of many AWS backup strategies.
When an instance is terminated, any data stored on instance storage volumes is lost permanently. EBS volumes can persist after the instance has been terminated. Given all of the additional features, using EBS volumes is clearly preferable except in a few cases, such as when you need fast temporary storage for data that can be safely discarded.
Multiple volumes (of either type) can be attached to an instance, leading to pretty flexible storage configurations. The Block Device Mapping facility allows multiple volumes to be associated with an instance at boot time. It is even possible to attach multiple volumes to an instance and build a software RAID array on them—an advantage of volumes appearing as block storage devices to the operating system.
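A block device mapping is expressed as a list of device entries in the launch request. The sketch below builds such a list in the shape accepted by the EC2 RunInstances API; the device names, volume type, and function name are illustrative choices, not requirements.

```python
def block_device_mappings(ebs_size_gb: int, ephemeral_count: int = 0) -> list:
    """Build a BlockDeviceMappings list: one EBS volume plus instance store volumes."""
    if not 0 <= ephemeral_count <= 4:
        raise ValueError("this sketch maps at most four instance store volumes")
    mappings = [{
        "DeviceName": "/dev/sdf",
        "Ebs": {
            "VolumeSize": ebs_size_gb,
            "VolumeType": "gp2",
            "DeleteOnTermination": False,  # keep the data after termination
        },
    }]
    # Instance store volumes are referenced by virtual name: ephemeral0, ephemeral1, ...
    for index in range(ephemeral_count):
        mappings.append({
            "DeviceName": "/dev/sd" + chr(ord("b") + index),
            "VirtualName": f"ephemeral{index}",
        })
    return mappings

print(block_device_mappings(100, ephemeral_count=2))
```

The resulting list would be passed as the `BlockDeviceMappings` argument when launching the instance; how many instance store volumes are actually available depends on the instance type.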
TIP
The disk_setup and mounts modules of Cloud-init allow customization of all disks associated with an instance upon boot, including partitioning and formatting disks as well as configuring mount points in /etc/fstab. The official documentation also sheds light on the details of how many public clouds can initialize their instance storage using Cloud-init.
In June 2012, AWS began offering SSDs as a higher-performance alternative to magnetic storage, and over time introduced multiple options with different performance levels and cost. Some instance types now include an SSD-backed instance store to deliver very-high random I/O performance, with types I2 and R3 being the first to support TRIM extensions. Instance types themselves have evolved to include high-I/O instances (type I2), aimed at delivering high IOPS from up to 8 local SSD drives, while dense storage instances (type D2) offer the lowest price per-disk throughput in EC2 and balance cost and performance, using 24 local magnetic drives.
EBS Magnetic and SSD volumes are currently limited to 16 TB in size, limits easily exceeded by dense storage (d2) instances, which can boot with 48 TB of local disk storage. Whereas EBS volumes can be provisioned at any time and in arbitrary configurations, the number and size of available instance store volumes varies with instance type, and can only be attached to an instance at boot time. In addition, EBS volumes can be dynamically resized, which is also used to redefine their performance at runtime.
EBS SSD options include a number of performance flavors. General-purpose SSD volumes are provisioned with 3 IOPS per GB, with burst performance reaching 3,000 IOPS for extended periods. Provisioned IOPS SSD volumes allow the user to define the desired level of performance, up to 20,000 IOPS and 320 MB/s of throughput. A less costly option is offered by the EBS-optimized M4 type instances, which include dedicated EBS bandwidth between 450 and 4,000 Mbps depending on the specific instance type. EBS-optimized instances use an optimized configuration stack requiring corresponding support on the machine image’s part for optimal performance.
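As a quick worked example of the general-purpose SSD figures, the sketch below derives baseline and burst IOPS from volume size. It uses only the two numbers quoted above (3 IOPS per GB and a 3,000 IOPS burst) and deliberately ignores any service-imposed minimums or maximums, so treat the results as rough estimates.

```python
GP_IOPS_PER_GB = 3      # baseline figure quoted in the text
GP_BURST_IOPS = 3000    # burst figure quoted in the text

def general_purpose_iops(size_gb: int) -> tuple:
    """Return (baseline, burst) IOPS for a general-purpose SSD volume."""
    baseline = GP_IOPS_PER_GB * size_gb
    # Volumes whose baseline already exceeds the burst figure gain nothing from bursting.
    burst = max(baseline, GP_BURST_IOPS)
    return baseline, burst

for size in (100, 500, 2000):
    print(size, general_purpose_iops(size))
```

Under these assumptions, a 100 GB volume has a baseline of 300 IOPS, so small volumes depend heavily on burst capacity for peak loads.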
Long-term storage options are best supported by the S3 service, but a block storage option is available through Cold HDD EBS volumes. Backed by magnetic drives, Cold HDD volumes offer the lowest cost per GB of all EBS volume types, and still provide enough performance to support a full-volume scan at burst speeds. EBS also supports native at-rest encryption that is transparently available to EC2 instances and requires very little effort on the administrator’s part to deploy and maintain. EBS encryption has no IOPS performance impact and shows very limited impact on latency, making it a general-purpose architectural option even when high security is not strictly required.

Networking

At its simplest, networking in AWS is straightforward—launching an instance with the default networking configuration will give you an instance with a public IP address. Many applications will require nothing more complicated than enabling SSH or HTTP access. At the other end of the scale, Amazon offers more-advanced solutions that can, for example, give you a secure VPN connection from your datacenter to a Virtual Private Cloud (VPC) within EC2.
At a minimum, an AWS instance has one network device attached. The maximum number of network devices that can be attached depends on the instance type. Running ip addr show on your instance will show that it has a private IP address in the default 172.31.0.0/16 range. Every instance has a private IP and may have a public IP; this can be configured at launch time or later, with the association of an Elastic IP address.
WARNING
AWS accounts created after December 2013 no longer have access to the legacy EC2-Classic networking model. This book covers the current EC2-VPC networking model exclusively.
Amazon Virtual Private Cloud enables you to provision EC2 instances in a virtual network of your own design. A VPC is a network dedicated to your account, isolated from other networks in AWS, and completely under your control. You can create subnets and gateways, configure routing, select IP address ranges, and define its security perimeter—a series of complex tasks that are bypassed by the existence of the default VPC. The default VPC includes a default subnet in each availability zone, along with routing rules, a DHCP setup, and an internet gateway. The default VPC enables new accounts to immediately start launching instances without having to first master advanced VPC configuration, but its security configuration will not allow instances to accept connections from the internet until we expressly give our permission, by assigning our own security group settings.
The default security group allows all outbound traffic from instances to reach the internet, and also permits instances in the same security group to receive inbound traffic from one another, but not from the outside world. Instances launched in the default VPC receive both a public and a private IP address. Behind the scenes, AWS will also create two DNS entries for convenience.
For example, if an instance has a private IP of 172.31.16.166 and a public IP of 54.152.163.171, their respective DNS entries will be ip-172-31-16-166.ec2.internal and ec2-54-152-163-171.compute-1.amazonaws.com. These DNS entries are known as the private hostname and public hostname.
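The naming pattern in this example can be reproduced with a short helper. The suffixes below are taken directly from the example above and are an assumption for any other setup, since other regions use different domain suffixes; the function names are hypothetical.

```python
import ipaddress

PRIVATE_SUFFIX = ".ec2.internal"             # suffix seen in the example above
PUBLIC_SUFFIX = ".compute-1.amazonaws.com"   # suffix seen in the example above

def private_hostname(ip: str) -> str:
    """Derive the private hostname for a private IPv4 address."""
    ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    return "ip-" + ip.replace(".", "-") + PRIVATE_SUFFIX

def public_hostname(ip: str) -> str:
    """Derive the public hostname for a public IPv4 address."""
    ipaddress.IPv4Address(ip)
    return "ec2-" + ip.replace(".", "-") + PUBLIC_SUFFIX

print(private_hostname("172.31.16.166"))
print(public_hostname("54.152.163.171"))
```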
It is interesting to note that Amazon operates a split-view DNS system, which means it is able to provide different responses depending on the source of the request. If you query the public DNS name from outside EC2 (not from an EC2 instance), you will receive the public IP in response. However, if you query the public DNS name from an EC2 instance in the same region, the response will contain the private IP:
# From an EC2 instance
$ dig ec2-54-152-163-171.compute-1.amazonaws.com +short
172.31.16.166
# From Digital Ocean
$ dig ec2-54-152-163-171.compute-1.amazonaws.com +short
54.152.163.171
The purpose of this is to ensure that traffic does not leave the internal EC2 network needlessly. This is important as AWS has a highly granular pricing structure when it comes to networking, and Amazon makes a distinction between traffic destined for the public internet and traffic that will remain on the internal EC2 network. The full breakdown of costs is available on the EC2 Pricing page.
If two instances in the same availability zone communicate using their private IPs, the data transfer is free of charge. However, using their public IPs will incur internet transfer charges on both sides of the connection. Although both instances are in EC2, using the public IPs means the traffic will need to leave the internal EC2 network, which will result in higher data transfer costs.
By using the private IP of your instances when possible, you can reduce your data transfer costs. AWS makes this easy with their split-horizon DNS system: as long as you always reference the public hostname of the instance (rather than the public IP), AWS will pick the cheapest option.

Amazon EC2 Pricing

Amazon EC2 is free to try. There are four ways to pay for Amazon EC2 instances:
  • On-Demand
  • Reserved Instances
  • Spot Instances
  • You can also pay for Dedicated Hosts which provide you with EC2 instance capacity on physical servers dedicated for your use.

On-Demand
With On-Demand instances, you pay for compute capacity per hour or per second, depending on which instances you run. No longer-term commitments or upfront payments are needed. You can increase or decrease your compute capacity depending on the demands of your application and only pay the specified hourly rates for the instances you use.
On-Demand instances are recommended for:
  • Users that prefer the low cost and flexibility of Amazon EC2 without any up-front payment or long-term commitment
  • Applications with short-term, spiky, or unpredictable workloads that cannot be interrupted
  • Applications being developed or tested on Amazon EC2 for the first time
Spot Instances
Amazon EC2 Spot instances allow you to request spare Amazon EC2 computing capacity for up to 90% off the On-Demand price.
Spot instances are recommended for:
  • Applications that have flexible start and end times
  • Applications that are only feasible at very low compute prices
  • Users with urgent computing needs for large amounts of additional capacity



Reserved Instances
Reserved Instances provide you with a significant discount (up to 75%) compared to On-Demand instance pricing. In addition, when Reserved Instances are assigned to a specific Availability Zone, they provide a capacity reservation, giving you additional confidence in your ability to launch instances when you need them.
For applications that have steady state or predictable usage, Reserved Instances can provide significant savings compared to using On-Demand instances.
Reserved Instances are recommended for:
  • Applications with steady state usage
  • Applications that may require reserved capacity
  • Customers that can commit to using EC2 over a 1 or 3 year term to reduce their total computing costs
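To get a feel for the discounts quoted above, the following sketch computes the lowest possible monthly cost under each purchasing model. The "up to" percentages are best cases, so the results are lower bounds rather than quotes, and the on-demand rate used is hypothetical.

```python
# Best-case discounts relative to On-Demand pricing, as quoted in the text.
MAX_DISCOUNT = {"on_demand": 0.0, "reserved": 0.75, "spot": 0.90}

def lowest_possible_cost(on_demand_hourly: float, hours: float, model: str) -> float:
    """Lower bound on the cost of running one instance for `hours` hours."""
    rate = on_demand_hourly * (1.0 - MAX_DISCOUNT[model])
    return round(rate * hours, 2)

HOURLY_RATE = 0.10  # hypothetical On-Demand price in USD per hour
for model in MAX_DISCOUNT:
    print(model, lowest_possible_cost(HOURLY_RATE, 720, model))
```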
Dedicated Hosts
A Dedicated Host is a physical EC2 server dedicated for your use. Dedicated Hosts can help you reduce costs by allowing you to use your existing server-bound software licenses, including Windows Server, SQL Server, and SUSE Linux Enterprise Server (subject to your license terms), and can also help you meet compliance requirements.
  • Can be purchased On-Demand (hourly).
  • Can be purchased as a Reservation for up to 70% off the On-Demand price.
Per Second Billing
With per-second billing, you pay only for what you use. It takes the cost of unused minutes and seconds in an hour off the bill, so you can focus on improving your applications instead of maximizing usage to the hour. Workloads that run for irregular periods of time, such as dev/testing, data processing, analytics, batch processing, and gaming applications, benefit in particular.
EC2 usage is billed in one-second increments, with a minimum of 60 seconds. Similarly, provisioned storage for EBS volumes is billed in per-second increments, with a 60-second minimum. Per-second billing is available for instances launched in:
  • On-Demand, Reserved and Spot forms
  • All regions and Availability Zones
  • Amazon Linux and Ubuntu
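The difference between the two billing schemes is easy to work out. The sketch below applies the 60-second minimum described above and compares it with rounding up to whole hours; the hourly rate is hypothetical.

```python
import math

def billed_seconds(runtime_seconds: int) -> int:
    """Per-second billing with a 60-second minimum."""
    return max(60, runtime_seconds)

def cost_per_second_billing(runtime_seconds: int, hourly_rate: float) -> float:
    return billed_seconds(runtime_seconds) * hourly_rate / 3600

def cost_hourly_billing(runtime_seconds: int, hourly_rate: float) -> float:
    """Hourly billing rounds every started hour up to a full hour."""
    return math.ceil(runtime_seconds / 3600) * hourly_rate

RATE = 0.12  # hypothetical price in USD per hour
# A 15-minute batch job: a quarter of the hourly rate versus the full hour.
print(cost_per_second_billing(900, RATE), cost_hourly_billing(900, RATE))
```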
Create an EC2 Instance
1. On the Amazon Web Services site, click on "Sign In to the Console". Sign in if you have an account. If you don't, you will need to create one.

2. In the AWS Management Console, click on EC2.

Create an Instance
3. On the Amazon EC2 console, click on Launch Instance.

4. Click on the "Select" button in the row with Microsoft Windows Server 2016 Base. Please note that this will create a Windows-based instance instead of a typical Linux-based instance. This affects how you will connect to the instance.

5. Make sure t2.micro (the free tier instance type) is selected, and click on "Review and Launch".

6. Click on Launch.

7. Select "Create a new key pair". In the box below ("Key pair name"), fill in a key pair name. I named my key DataCampTutorial, but you can name it whatever you like. Click on "Download Key Pair". This will download the key. Keep it somewhere safe.

Next, click on "Launch Instances".

8. The instance is now launched. Go back to the Amazon EC2 console. I would recommend that you click on what is enclosed in the red rectangle as it will bring you back to the console.

9. Wait until you see that "Instance State" is running before you proceed to the next step. This can take a few minutes.

Connect to your Instance
10. Click on Connect.

11. Click on "Download Remote Desktop File". Save the remote desktop (rdp) file somewhere safe.

12. Click on "Get Password". Keep in mind that you have to wait at least 4 minutes after you launch an instance before trying to retrieve your password.

13. Choose the pem file you downloaded in step 7 and then click "Decrypt Password".

14. After you decrypt your password, save it somewhere safe. You will need it to log into your instance.

15. Open your rdp file. Click on Continue. If your local computer is a Mac, you will need to download "Microsoft Remote Desktop" from the App Store to be able to open your rdp file.

16. Enter the password you got from step 14.

After you enter your password, you should see the Windows desktop of your instance.
Stop or Terminate an Instance (Important)
After finishing use of an instance, it is a good idea to stop or terminate the instance. To do this, go to the Amazon EC2 console and click on "Actions" then "Instance State" and you will have the option of either stopping or terminating the instance.
If you plan on using the instance again, stop the instance. If you don't plan on using the instance again, terminate the instance.
While the instance in this tutorial was in the "free tier", I would recommend terminating the instance so you don't forget about it.
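The same choice can be scripted instead of clicked. The sketch below assumes a boto3-style EC2 client is passed in (for example `boto3.client("ec2")`); `stop_instances` and `terminate_instances` are EC2 API calls, while the wrapper function and its `reuse_later` flag are hypothetical.

```python
def retire_instance(ec2_client, instance_id: str, reuse_later: bool) -> str:
    """Stop the instance if it will be used again, otherwise terminate it."""
    if reuse_later:
        ec2_client.stop_instances(InstanceIds=[instance_id])
        return "stopped"
    ec2_client.terminate_instances(InstanceIds=[instance_id])
    return "terminated"
```

Keep in mind that a stopped instance no longer accrues instance charges, but any attached EBS volumes are still billed, which is another reason to terminate instances you no longer need.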


Launching Instances

The most useful thing one can do with an instance is launch it, which is a good place for us to start. As an automation-loving sysadmin, you will no doubt quickly automate this process and rarely spend much time manually launching instances. Like any task, though, it is worth stepping slowly through it the first time to familiarize yourself with the process.

Launching from the Management Console

Most people take their first steps with EC2 via the Management Console, which is the public face of EC2. Our first journey through the Launch Instance Wizard will introduce a number of new concepts, so we will go through each page in the wizard and take a moment to look at each of these in turn. Although there are faster methods of launching an instance, the wizard is certainly the best way to familiarize yourself with related concepts.
LAUNCHING A NEW INSTANCE OF AN AMI
To launch a new instance, first log in to Amazon’s web console, open the EC2 section, and click Launch Instance. This shows the first in a series of pages that will let us configure the instance options.

As described earlier, Amazon Machine Images (AMIs) are used to launch instances that already have the required software packages installed, configured, and ready to run. Amazon provides AMIs for a variety of operating systems, and the Community and Marketplace AMIs provide additional choices. For example, Canonical provides officially supported AMIs for various versions of its Ubuntu operating system. Other open source and commercial operating systems are also available, both with and without support. The AWS Marketplace lets you use virtual appliances created by Amazon or third-party developers. These are Amazon Machine Images already configured to run a particular set of software; for example, many variations of AMIs running the popular WordPress blogging software exist. While some of these appliances are free to use (i.e., you only pay for the underlying AWS resources you use), others require you to pay an additional fee on top of the basic cost of the Amazon resources.
If this is your first time launching an instance, the My AMIs tab will be empty. Later in this chapter, we will create our own custom AMIs, which will subsequently be available via this tab. The Quick Start tab lists several popular AMIs that are available for public use.
Click the Select button next to the Amazon Linux AMI.

Selecting the instance type
EC2 instances come in a range of shapes and sizes to suit many use cases. In addition to offering increasing amounts of memory and CPU power, instance types also offer differing ratios of memory to CPU. Different components in your infrastructure will vary in their resource requirements, so it can pay to benchmark each part of your application to see which instance type is best for your particular needs. You can also find useful community-developed resources to quickly compare instance types at EC2instances.info.
The Micro instance class is part of Amazon’s free usage tier. New customers can use 750 instance-hours free of charge each month with the Linux and Windows micro instance types. After exceeding these limits, normal on-demand prices apply.
Select the checkbox next to t2.micro and click Review and Launch. Now you are presented with the review screen, which gives you a chance to confirm your options before launching the instance.
EC2 INSTANCE DETAILS AND USER DATA
So far, we have been using only the most common options when launching our instance. As you will see on the review screen, there are a number of options that we have not changed from the defaults. Some of these will be covered in great detail later in the book, whereas others will rarely be used in the most common use cases. It is worth looking through the advanced options pages to familiarize yourself with the possibilities.
User data is an incredibly powerful feature of EC2, and one that will be used a lot later in the book to demonstrate some of the more interesting things you can do with EC2 instances. Any data entered in this box will be available to the instance once it has launched, which is a useful thing to have in your sysadmin toolbox. Among other things, user data lets you create a single AMI that can fulfill multiple roles depending on the user data it receives, which can be a huge time-saver when it comes to maintaining and updating AMIs. Ubuntu and Amazon Linux support using shell scripts as user data, so you can provide a custom script that will be executed when the instance is launched.
Furthermore, user data is accessible to configuration management tools such as Puppet or Chef, allowing dynamic configuration of the instance based on user data supplied at launch time.
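As a minimal sketch of the "one AMI, several roles" idea, a user-data shell script might choose what to install from a role name. The role names and the package mapping below are assumptions made for illustration, and the install step itself is left as a comment:

```shell
#!/bin/sh
# Hypothetical user-data script: one AMI, several roles.
# ROLE would be edited per launch; "webserver" is an illustrative value.
ROLE="webserver"

# Map a role name to the package that role needs (illustrative mapping).
package_for_role() {
  case "$1" in
    webserver) echo "nginx" ;;
    database)  echo "postgresql" ;;
    *)         echo "" ;;
  esac
}

PACKAGE=$(package_for_role "$ROLE")
echo "role=${ROLE} package=${PACKAGE:-none}"
# On a real Ubuntu instance the script would continue with something like:
#   apt-get update && apt-get install -y "$PACKAGE"
```

Only the ROLE line changes between launches, so the same image can serve as a web server or a database host.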
The Kernel ID and RAM Disk ID options will rarely need to be changed if you are using AMIs provided by Amazon or other developers.
Termination protection provides a small level of protection against operator error in the Management Console. When running a large number of instances, it can be easy to accidentally select the wrong instance for termination. If termination protection is enabled for a particular instance, you will not be able to terminate it via the Management Console or API calls. This protection can be toggled on or off while the instance is running, so there is no need to worry that you will be stuck with an immortal instance. Mike can personally attest to its usefulness—it once stopped him from erroneously terminating a production instance running a master database.
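Termination protection can also be toggled from the AWS CLI. The two helper functions below are a sketch, and their names are our own; they wrap the modify-instance-attribute command and expect an instance ID as their only argument:

```shell
# Enable termination protection on one instance.
protect_instance() {
  aws ec2 modify-instance-attribute --instance-id "$1" --disable-api-termination
}

# Remove the protection again so the instance can be terminated.
unprotect_instance() {
  aws ec2 modify-instance-attribute --instance-id "$1" --no-disable-api-termination
}

# Usage (the instance ID is a placeholder):
#   protect_instance i-64a8a6fe
```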
IAM roles allow you to assign a security role to the instance. Access keys are made available to the instance so it can access other AWS APIs with a restricted set of permissions specific to its role.
Most of the time your instances will be terminated through the Management Console or API calls. Shutdown Behavior controls what happens when the instance itself initiates the shutdown, for example, after running shutdown -h now on a Linux machine. The available options are to stop the machine so it can be restarted later, or to terminate it, in which case it is gone forever.
Tags are a great way to keep track of your instances and other EC2 resources via the Management Console.
Tags perform a similar role to user data, with an important distinction: user data is for the instance’s internal use, whereas tags are primarily for external use. An instance does not have any built-in way to access tags, whereas user data, along with other metadata describing the instance, can be accessed by reading a URL from the instance. It is, of course, possible for the instance to access its own tags by querying the EC2 API, but that would require API access privileges to be granted to the instance itself in the form of a key, something less than desirable in a healthy security posture.
Using the API, you can perform queries to find instances that are tagged with a particular key/value combination. For example, two tags we always use in our EC2 infrastructures are environment (which can take values such as production or staging) and role (which, for instance, could be webserver or database). When scripting common tasks—deployments or software upgrades—it becomes a trivial task to perform a set of actions on all web servers in the staging environment. This makes tags an integral part of any well-managed AWS infrastructure.
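A tag query of this kind might look like the following sketch. The function name is hypothetical, and the environment and role tag keys are simply the two used as examples in the text:

```shell
# Print the IDs of all instances carrying both tags.
# $1 = value of the "environment" tag, $2 = value of the "role" tag.
instances_by_tags() {
  aws ec2 describe-instances \
    --filters "Name=tag:environment,Values=$1" "Name=tag:role,Values=$2" \
    --query 'Reservations[].Instances[].InstanceId' \
    --output text
}

# Usage: loop over every web server in staging.
#   for id in $(instances_by_tags staging webserver); do
#     echo "would deploy to $id"
#   done
```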
If the Cost Allocation Reports feature (on the billing options page of your account settings page) is enabled, your CSV-formatted bill will contain additional fields, allowing you to link line-item costs with resource tags. This information is invaluable when it comes to identifying areas for cost savings, and for larger companies where it is necessary to separate costs on a departmental basis for charge-back purposes. Even for small companies, it can be useful to know where your sources of cost are.
After reviewing the options, click Launch to move to the final screen. At the time of this writing, the wizard’s Quick Start process will automatically create a convenient launch-wizard-1 security group granting the instance SSH access from the internet at large. This is not the default security group previously discussed, and this helpfulness is not present when using the AWS CLI or API interfaces to create instances.

The Review screen (the prominent security warning is alerting you that SSH access has been opened with a default security group)
KEY PAIRS
The next screen presents the available key pair options.

Key pair selection
Key pairs provide secure access to your instances. To understand the benefits of key pairs, consider how we could securely give someone access to an AMI that anyone in the world can launch an instance of. Using default passwords would be a security risk, as it is almost certain some people would forget to change the default password at some point. Amazon has implemented SSH key pairs to help avoid this eventuality. Of course, it is possible to create an AMI that uses standard usernames and passwords, but this is not the default for AWS-supplied AMIs.
All AMIs have a default user: when an instance is booted, the public part of your chosen key pair is copied to that user’s SSH authorized keys file. This ensures that you can securely log in to the instance without a password. In fact, the only thing you need to know about the instance is the default username and its IP address or hostname.
This also means that only people with access to the private part of the key pair will be able to log in to the instance. Sharing your private keys is against security best practices, so to allow others access to the instance, you will need to create additional system accounts and configure them with passwords or SSH authorized keys.
NOTE
The name of the default user varies between AMIs. For example, Amazon’s own AMIs use ec2-user, whereas Ubuntu’s official AMIs use ubuntu.
If you are unsure of the username, one trick you can use is to try to connect to the instance as root. The most friendly AMIs present an error message informing you that root login is disabled, and letting you know which username you should use to connect instead.
Changing the default user of an existing AMI is not recommended, but can be easily done. The details of how to accomplish this have been documented by Eric Hammond of Alestic. The following table enumerates default usernames for most popular Linux distributions:
Distribution    Default Username
Amazon Linux    ec2-user
Ubuntu          ubuntu
Debian          admin
RHEL            ec2-user (since 6.4), root (before 6.4)
CentOS          root
Fedora          ec2-user
SUSE            root
FreeBSD         ec2-user
BitNami         bitnami
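For scripts that connect to instances built from different AMIs, the list of default usernames can be encoded as a small lookup. This is only a sketch: the function name and the lowercase distribution keys are our own invention, and the RHEL entry assumes version 6.4 or later.

```shell
# Return the default login name for a distribution, following the list above.
# Unknown distributions produce no output and a nonzero exit status.
default_user_for() {
  case "$1" in
    amazon|rhel|fedora|freebsd) echo "ec2-user" ;;  # rhel: 6.4 and later
    ubuntu)                     echo "ubuntu" ;;
    debian)                     echo "admin" ;;
    centos|suse)                echo "root" ;;
    bitnami)                    echo "bitnami" ;;
    *)                          return 1 ;;
  esac
}

# Usage (key path and hostname are placeholders):
#   ssh -i /path/to/keypair.pem "$(default_user_for ubuntu)@<public-dns-name>"
```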
You can create a new SSH key pair through the EC2 Key Pairs page in the AWS Management Console—note that key pairs are region-specific, and this URL refers to the US East 1 region. Keys you create in one EC2 region cannot be immediately used in another region, although you can, of course, upload the same key to each region instead of maintaining a specific key pair for each region. After creating a key, a .pem file will be automatically downloaded.
Alternatively, you can upload the public part of an existing SSH key pair to AWS. This can be of great help practically because it may eliminate the need to add the -i /path/to/keypair.pem option to each SSH command where multiple keys are in use (refer to ssh-agent’s man page if you need to manage multiple keys). It also means that the private part of the key pair remains entirely private—you never need to upload this to AWS, it is never transmitted over the internet, and Amazon does not need to generate it on your behalf, all of which have security implications.
Alestic offers a handy Bash script to import an existing public SSH key into each region.
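The following is not the Alestic script, only a rough sketch of the same idea using the AWS CLI. The function name, key name, and file path are placeholders. We assume the fileb:// prefix so that the CLI reads the key file as raw bytes, and note that the import fails in any region where a key pair of the same name already exists.

```shell
# Import one public key into every region visible to the account.
# $1 = key pair name, $2 = path to the public key file.
import_key_everywhere() {
  for region in $(aws ec2 describe-regions \
      --query 'Regions[].RegionName' --output text); do
    aws ec2 import-key-pair --region "$region" \
      --key-name "$1" --public-key-material "fileb://$2"
  done
}

# Usage:
#   import_key_everywhere mykey "$HOME/.ssh/id_rsa.pub"
```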
TIP
If you are a Windows user connecting with PuTTY, you will need to convert this to a PPK file using PuTTYgen before you can use it. To do this, launch PuTTYgen, select Conversions → Import Key, and follow the on-screen instructions to save a new key in the correct format. Once the key has been converted, it can be used with PuTTY and PuTTY Agent.
From the Key Pairs screen in the launch wizard, you can select which key pair will be used to access the instance, or launch the instance without any key pair. You can select from your existing key pairs or choose to create a new key pair. It is not possible to import a new key pair at this point. If you would like to use an existing SSH key that you have not yet uploaded to AWS, you will need to upload it first by following the instructions on the EC2 Key Pairs page.
Once you have created a new key pair or imported an existing one, click “Choose from your existing Key Pairs,” select your key pair from the drop-down menu, and continue to the next screen. You have now completed the last step of the wizard—click Launch Instances to create the instance.
WAITING FOR THE INSTANCE
Phew, we made it. Launching an instance can take a few seconds, depending on the instance type, current traffic levels on AWS, and other factors. The Instances page of the Management Console will show you the status of your new instance. Initially, this will be pending, while the instance is being created on the underlying physical hardware. Once the instance has been created and has begun the boot process, the page will show the running state. This does not mean your instance is servicing requests or ready for you to log in to, merely that the instance has been created.
Selecting an instance in the Management Console will show you its public DNS name, as well as more detail about the settings and status of the instance. At this point, you can try to SSH to the public hostname. If the connection fails, it means SSH is not yet ready to accept connections, so wait a moment and try again. Once you manage to log in to the instance, you will see a welcome screen specific to the AMI you launched.
QUERYING INFORMATION ABOUT THE INSTANCE
Now that you have an instance, what can you do with it? The answer is—anything you can do with an equivalent Linux server running on physical hardware. Later chapters demonstrate some of the more useful things you can do with EC2 instances. For now, let’s take a look at the ec2metadata tool, which is included on most well-designed AMIs.
WARNING
In the infancy of AWS, EC2 had no real style guide; the question of how to name something was up to the developer. A few different but equivalent tools for parsing instance metadata appeared: ec2metadata on Ubuntu, and ec2-metadata on Amazon Linux.
The ec2metadata tool is useful for quickly accessing the metadata attributes of your instance: for example, the instance ID, or the ID of the AMI from which this instance was created. Running ec2metadata without arguments will display all available metadata.
If you are interested in specific metadata attributes, you can read the values one at a time by passing the name of the attribute as a command-line option. For example:
$ ec2metadata --instance-id
i-ba932720
$ ec2metadata --ami-id
ami-f5f41398
This is useful if you are writing shell scripts that need to access this information. Rather than getting all the metadata and parsing it yourself, you can do this:
INSTANCE_ID=$(ec2metadata --instance-id)
AMI_ID=$(ec2metadata --ami-id)
echo "The instance $INSTANCE_ID was created from AMI $AMI_ID"
NOTE
Every instance downloads its metadata from the following URL:
http://169.254.169.254/latest/meta-data/<attribute_name>
So to get the instance ID, you could request the URL http://169.254.169.254/latest/meta-data/instance-id.
This URL is accessible only from within the instance. The IP address also maps to the hostname http://instance-data, which is easier for users to remember. See AWS’s Documentation for full details on instance metadata.
If you want to query the metadata from outside the instance, you will need to use the aws ec2 describe-instances command.
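On an AMI that ships neither metadata tool, the URL can be read directly with curl. The helper below is a minimal sketch with a name of our choosing. It assumes the plain request style described above; instances configured to require session tokens for metadata requests need an extra token header that is not shown here.

```shell
# Fetch one metadata attribute over HTTP; only works from inside an instance.
# -s silences progress output, -f makes curl fail on HTTP errors.
metadata() {
  curl -s -f "http://169.254.169.254/latest/meta-data/$1"
}

# Usage:
#   INSTANCE_ID=$(metadata instance-id)
#   AMI_ID=$(metadata ami-id)
```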
TERMINATING THE INSTANCE
Once you have finished testing and exploring the instance, you can terminate it. In the Management Console, right-click the instance and select Terminate Instance.
Next, we will look at some of the other available methods of launching instances.
TIP
In early 2013, Amazon introduced a mobile app interface to the AWS Management Console with versions supporting both iOS and Android devices. After multiple updates and enhancements, the app has become an excellent tool for administrators who need a quick look at the state of their AWS deployment while on the move.
The app’s functionality is not as comprehensive as the web console’s, but it showcases remarkable usability in its streamlined workflow (see Figure 2-5 for an example), and most users enjoy the quick access to select functionality it provides: some users now even pull up their mobile phone to execute certain tasks rather than resorting to their trusted terminal!

The AWS console mobile app

Launching with Command-Line Tools

If you followed the steps in the previous section, you probably noticed a few drawbacks to launching instances with the Management Console. The number of steps involved and the variety of available options engender complex documentation that takes a while to absorb. This is not meant as criticism of the Management Console—EC2 is a complex beast, thus any interface to it requires a certain level of complexity.
Because AWS is a self-service system, it must support the use cases of many users, each with differing requirements and levels of familiarity with AWS itself. By necessity, the Management Console is equivalent to an enormous multipurpose device that can print, scan, fax, photocopy, shred, and collate.
This flexibility is great when it comes to discovering and learning the AWS ecosystem, but is less useful when you have a specific task on your to-do list that must be performed as quickly as possible. Interfaces for managing production systems should be streamlined for the task at hand, and not be conducive to making mistakes.
Documentation should also be easy to use, particularly in a crisis, and the Management Console does not lend itself well to this idea. Picture yourself in the midst of a downtime situation, where you need to quickly launch some instances, each with different AMIs and user data. Would you rather have to consult a 10-page document describing which options to choose in the Launch Instance Wizard, or copy and paste some commands into the terminal?
Fortunately, Amazon gives us precisely the tools required to do the latter. The EC2 command-line tools can be used to perform any action available from the Management Console, in a fashion that is much easier to document and much more amenable to automation.
WARNING
As you start exploring dynamic infrastructure provisioning with AWS CLI, we recommend you set up a billing alarm. Leveraging the CloudWatch and Simple Notification services, billing alerts will notify you if you exceed preset spending thresholds.
While not ruinously expensive, forgetting to shut down a few of your test instances and letting them run for the rest of the month (until you notice as you are billed) will easily exceed your personal phone bill. It is a snap to inadvertently make this mistake; we have slipped up ourselves and advise you let the system help keep track with these friendly notifications.
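A billing alarm can itself be created from the command line. The sketch below assumes that billing alerts have already been enabled in the account's billing preferences and that an SNS topic exists to receive the notification. The function name, alarm name, and topic ARN are placeholders. Billing metrics are published in the us-east-1 region, hence the fixed region.

```shell
# Create a CloudWatch alarm on the account's estimated charges.
# $1 = threshold in USD, $2 = ARN of an existing SNS topic.
create_billing_alarm() {
  aws cloudwatch put-metric-alarm \
    --region us-east-1 \
    --alarm-name "billing-over-$1-usd" \
    --namespace AWS/Billing \
    --metric-name EstimatedCharges \
    --dimensions Name=Currency,Value=USD \
    --statistic Maximum \
    --period 21600 \
    --evaluation-periods 1 \
    --threshold "$1" \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions "$2"
}

# Usage (the ARN is a placeholder):
#   create_billing_alarm 50 arn:aws:sns:us-east-1:123456789012:billing-alerts
```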
Make sure you have set the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables or the equivalent values in the .aws/credentials file in your home directory.
ACCESS KEY IDS AND SECRETS
When you log in to the AWS Management Console, you will use your email address and password to authenticate yourself. Things work a little bit differently when it comes to the command-line tools. Instead of a username and password, you use an access key ID and secret access key. Together, these are often referred to as your access credentials.
Although access credentials consist of a pair of keys, they are not the same as an SSH key pair. The former is used to access AWS’s APIs, while the latter is used to SSH into an instance to perform work on the shell.
When you created your AWS account, you also generated a set of access credentials for your root account identity. These keys have full access to your AWS account—keep them safe! You are responsible for the cost of any resources created using these keys, so if a malicious person were to use these keys to launch some EC2 instances, you would be left with the bill.
“IAM Users and Groups” discusses how you can set up additional accounts and limit which actions they can perform, as defined by current security best practices. For the following examples, we will just use the access keys you have already created during CLI setup.
AWS lets you inspect all active access credentials for your account through the Security Credentials page of the Management Console, but for increased security you will be unable to retrieve their secret access keys after creation. This stops any unauthorized access to your account from resulting in a compromise of your API credentials, but has the annoying side effect of requiring you to replace your access keys if they ever were lost.
To launch an instance from the command line, you need to provide values that correspond to the options you can choose from when using the Management Console. Because all of this information must be entered in a single command, rather than gathered through a series of web pages, it is necessary to perform some preliminary steps so you know which values to choose. The Management Console can present you with a nice drop-down box containing all the valid AMIs for your chosen region, but to use the command line, you need to know the ID of the AMI before you can launch it.
The easiest way to get a list of available images is in the Instances tab of the Management Console, which lets you search through all available AMIs. Keep in mind that AMIs exist independently in EC2 regions—the Amazon Linux AMI in the US East region is not the same image as the Amazon Linux AMI in Europe, although they are functionally identical. Amazon, Canonical, and other providers make copies of their AMIs available in each region as a convenience to their users, but the same AMI will show a different ID in different regions.
FINDING UBUNTU IMAGES
Searching for Ubuntu images yields 27,175 results in the us-east-1 region alone at the time of this writing. Filtering for official images released by Canonical (owner 099720109477) reduces the crop to only 6,074 images. These high numbers are due to Ubuntu’s high popularity in public cloud environments, and to Canonical’s commitment to refreshing AMIs with the newest packages as security updates or bug fixes are published. Older AMIs remain available as new ones are issued by the vendor, the timing of when to switch to newer images being entirely under the admin’s control, not AWS’s. All these factors conspire to make finding the correct Ubuntu image a rather nontrivial task.
Ubuntu AMIs can be most easily found using Canonical’s AMI Locator which shows only the most recent release by default and which updates results as you search by substring or select from prepopulated pull-down menus. This is an essential resource for navigating the sea of Ubuntu images found on AWS. At the time of this writing, the Locator narrows down our options to twelve images varying in storage and system bit width.

Figure 2-6. Ubuntu EC2 AMI Locator (clicking the selected AMI ID launches it from the Management Console)
Equally interesting to power users is the collection of official Ubuntu cloud images found on Ubuntu.com. This site includes both daily builds and official releases. Finding the latter is accomplished by navigating to the release/ subdirectory of any Ubuntu version, which is http://cloud-images.ubuntu.com/releases/16.04/release/ for Xenial.
If you need to find an AMI using the command-line tools, you can do so with the aws ec2 describe-images command. A few examples follow:
# Describe all of your own images in the US East region
aws ec2 describe-images --owners self --region us-east-1

# Find Amazon-owned images for Windows Server 2012, 64-bit version
aws ec2 describe-images --owners amazon --filters Name=architecture,Values=x86_64 | grep Server-2012

# List the AMIs that have a specific set of key/value tags
aws ec2 describe-images --owners self --filters Name=tag:role,Values=webserver Name=tag:environment,Values=production
The first query should of course yield no results, unless you have already created some AMIs of your own. Later examples showcase combining the tool’s own filtering and grep to find the image you are really looking for. In our second example we are searching for a Windows Server image created by another party. Note that we explicitly searched for Amazon-owned images, as any AWS customer can decide to make her AMIs accessible to all other customers. Image names are freely chosen by their creator just like their contents, thus not only complicating our search with a very large number of results, but potentially posing a security problem if one carelessly selects an unknown party’s bits.
At the time of writing, the most popular Ubuntu long-term support (LTS) version on AWS is 16.04, going by the nickname of Xenial Xerus. In the Eastern US EC2 region, the latest version of Canonical’s official AMI is ami-43a15f3e (64b, HVM, EBS storage), which is used in many of the examples. Make sure to update this with your chosen AMI. If you are not sure which to use, or have no real preference, the authors recommend using the latest LTS version of Ubuntu for 64-bit systems.
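Rather than copying an AMI ID by hand, a script can ask for the most recently created official Xenial image. This is a sketch under two assumptions: that Canonical keeps naming its images with the ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server- prefix, and that the newest creation date identifies the image you want. The function name is ours.

```shell
# Print the ID of the newest Canonical-owned Xenial HVM/EBS image.
# $1 = region (defaults to us-east-1).
latest_xenial_ami() {
  aws ec2 describe-images \
    --region "${1:-us-east-1}" \
    --owners 099720109477 \
    --filters "Name=name,Values=ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-*" \
              "Name=state,Values=available" \
    --query 'sort_by(Images, &CreationDate)[-1].ImageId' \
    --output text
}

# Usage:
#   AMI_ID=$(latest_xenial_ami us-east-1)
```

Filtering on the owner ID avoids picking up a similarly named image published by an unknown party.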
The command used to launch an instance is aws ec2 run-instances. The most basic invocation is simply aws ec2 run-instances --image-id ami-6d060707, which will launch an older m1.small instance in the default us-east-1 region. If you are paying attention, you will have noticed that we used a different AMI ID with paravirtualization support, as the older m1.small instance type does not support the newer HVM virtualization style. However, if you run this command and attempt to log in to the instance, you will soon notice a rather large problem: because no key pair name was specified, there is no way to log in to the instance. Instead, try running the command with the --key-name option to specify one of the SSH key pairs you created earlier. In the following example, we have also changed the instance type to t2.micro, the smallest instance type all AWS operating systems are currently comfortable with:
$ aws ec2 run-instances --image-id ami-43a15f3e --region us-east-1 \
--key-name federico --instance-type t2.micro --output text
740376006796 r-bbcfff10
INSTANCES 0 x86_64  False xen ami-43a15f3e i-64a8a6fe t2.micro federico 2016-04-03T07:40:48.000Z ip-172-31-52-118.ec2.internal 172.31.52.118  /dev/sda1 ebs True  subnet-2a45b400 hvm vpc-934935f7
[ output truncated ]
Once EC2 receives the request to launch an instance, it prints some information about the pending instance. The value we need for the next command is the instance ID, in this case, i-64a8a6fe.
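In a script, it is easier to let the CLI extract the instance ID than to pick it out of the text output. The wrapper below is a sketch with a hypothetical name; it uses the --query option to return only the ID of the first launched instance:

```shell
# Launch one t2.micro instance and print only its instance ID.
# $1 = AMI ID, $2 = key pair name.
launch_instance() {
  aws ec2 run-instances \
    --image-id "$1" --key-name "$2" --instance-type t2.micro \
    --region us-east-1 \
    --query 'Instances[0].InstanceId' --output text
}

# Usage:
#   INSTANCE_ID=$(launch_instance ami-43a15f3e federico)
```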
Although this command returns almost immediately, you will still need to wait a short while before your instance is ready to accept SSH connections. You can check on the status of the instance while it is booting with the aws ec2 describe-instance-status command. While the instance is still booting, its status will be listed as pending. This will change to running once the instance is ready. Remember that ready in this context means that the virtual instance has been created, and the operating system’s boot process has started. It does not necessarily mean that the instance is ready to receive an SSH connection, which is important when writing scripts that automate these commands.
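For scripts, the CLI also provides wait subcommands that block until a condition holds, which saves polling by hand. The helper name below is ours. As a sketch, it waits first for the running state and then for the status checks to pass; even then, the SSH daemon may need a few more moments.

```shell
# Block until the instance is running and its status checks report ok.
# $1 = instance ID.
wait_until_ready() {
  aws ec2 wait instance-running --instance-ids "$1" --region us-east-1 &&
  aws ec2 wait instance-status-ok --instance-ids "$1" --region us-east-1
}

# Usage:
#   wait_until_ready i-64a8a6fe && echo "instance is up"
```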
TIP
Granting access to an already running image can involve multiple manual steps adding the new user’s SSH credentials to the authorized keys file. Juggling files can be avoided working with Ubuntu images thanks to the ssh-import-id command. Just invoking the following:
ssh-import-id lp:f2
will retrieve Federico’s SSH identity from Launchpad.net and grant him access as the user the command was run under. You can accomplish the same for Mike by using his GitHub user ID:
ssh-import-id gh:mikery
All that is required is the user ID from either site. This is roughly equivalent to running the following (which could be used to derive alternative import strategies for other sites):
wget https://launchpad.net/~f2/+sshkeys -O - >> ~/.ssh/authorized_keys && echo >> ~/.ssh/authorized_keys
Once your instance is running, the output should look similar to this:
$ aws ec2 describe-instance-status --instance-ids i-64a8a6fe --region us-east-1 --output text
INSTANCESTATUSES us-east-1a i-64a8a6fe
INSTANCESTATE 16 running
INSTANCESTATUS ok
DETAILS reachability passed
SYSTEMSTATUS ok
DETAILS reachability passed
Another way to display information about your instance is with aws ec2 describe-instances, which will show much more detail. In particular, it will show the public DNS name (for example, ec2-54-247-40-225.eu-west-1.compute.amazonaws.com), which you can use to SSH into your instance:
$ aws ec2 describe-instances --instance-ids i-64a8a6fe --region us-east-1 --output text
RESERVATIONS 740376006796 r-bbcfff10
INSTANCES 0 x86_64  False xen ami-43a15f3e i-64a8a6fe t2.micro federico 2016-04-03T07:40:48.000Z ip-172-31-52-118.ec2.internal 172.31.52.118 ec2-52-90-56-122.compute-1.amazonaws.com 52.90.56.122 /dev/sda1 ebs True  subnet-2a45b400 hvm vpc-934935f7
BLOCKDEVICEMAPPINGS /dev/sda1
[ output truncated ]
EBS 2016-04-03T07:40:48.000Z True attached vol-e9c0c637
MONITORING disabled
NETWORKINTERFACES  12:5a:33:b3:b5:97 eni-ce4084ea 740376006796 ip-172-31-52-118.ec2.internal 172.31.52.118 True in-use subnet-2a45b400 vpc-934935f7
ASSOCIATION amazon ec2-52-90-56-122.compute-1.amazonaws.com 52.90.56.122
ATTACHMENT 2016-04-03T07:40:48.000Z eni-attach-2545d3d4 True 0 attached
GROUPS sg-384f3a41 default
PRIVATEIPADDRESSES True ip-172-31-52-118.ec2.internal 172.31.52.118
ASSOCIATION amazon  52.90.56.122
PLACEMENT us-east-1a  default
SECURITYGROUPS sg-384f3a41 default
STATE 16 running
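When only the public DNS name is needed, the --query option can reduce this output to a single value. A minimal sketch, with a function name of our choosing:

```shell
# Print the public DNS name of one instance.
# $1 = instance ID.
public_dns() {
  aws ec2 describe-instances --instance-ids "$1" --region us-east-1 \
    --query 'Reservations[0].Instances[0].PublicDnsName' --output text
}

# Usage:
#   ssh "ubuntu@$(public_dns i-64a8a6fe)"
```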
To terminate the running instance, issue aws ec2 terminate-instances. To verify that this instance has indeed been terminated, you can use the aws ec2 describe-instances command again:
$ aws ec2 terminate-instances --instance-ids i-64a8a6fe --region us-east-1
INSTANCE i-64a8a6fe running shutting-down
$ aws ec2 describe-instances --instance-ids i-64a8a6fe --region us-east-1
RESERVATION r-991230d1 612857642705 default
INSTANCE i-64a8a6fe ami-43a15f3e   terminated mike 0  t1.micro 2012-11-25T15:51:45+0000 
[ output truncated ]
As you find yourself using the command-line tools more frequently, and for more complex tasks, you will probably begin to identify procedures that are good candidates for automation. Besides saving you both time and typing, automating the more complex tasks has the additional benefits of reducing the risk of human error and simply removing some thinking time from the process.
The command-line tools are especially useful when it comes to documenting these procedures. Processes become more repeatable. Tasks can be more easily delegated and shared among the other members of the team.
TIP
Trying to connect multiple times as an instance boots is inelegant. Fortunately, we can one-line script our way out of this. The BSD version of ping, notably on macOS, includes a convenient “one ping only” option (-o) that we like to think honors Sean Connery’s famous quote in The Hunt for Red October. The option terminates ping once the first reply is received. Like Captain Marko Ramius, we can use this to ask for “one ping only, please”:
$ ping -o 52.90.56.122; sleep 2; ssh ubuntu@52.90.56.122
PING 52.90.56.122 (52.90.56.122): 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
Request timeout for icmp_seq 2
64 bytes from 52.90.56.122: icmp_seq=3 ttl=48 time=40.492 ms
[ output truncated ]

Welcome to Ubuntu 16.04.4 LTS (GNU/Linux 4.4.0-1052-aws x86_64)
Perhaps less steeped in movie lore, but nonetheless equally effective is this GNU-compatible version that waits in a loop for the SSH service to start up:
$ until ssh ubuntu@52.90.56.122; do sleep 1; done
ssh: connect to host 52.90.56.122 port 22: Connection refused
ssh: connect to host 52.90.56.122 port 22: Connection refused
[ output truncated ]

Welcome to Ubuntu 16.04.4 LTS (GNU/Linux 4.4.0-1052-aws x86_64)
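The until loop above will spin forever if the instance never comes up. In an unattended script, a bounded variant may be safer. The retry helper below is our own sketch, not part of any tool: it runs a command up to a fixed number of times and gives up with a nonzero status if every attempt fails.

```shell
# Run a command until it succeeds, at most $1 times.
# $1 = maximum attempts, $2 = seconds to sleep after each failed attempt,
# remaining arguments = the command to run.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while [ "$n" -le "$attempts" ]; do
    "$@" && return 0
    n=$((n + 1))
    sleep "$delay"
  done
  return 1
}

# Usage: probe with a no-op remote command, then open the real session.
#   retry 30 2 ssh -o ConnectTimeout=5 ubuntu@52.90.56.122 true \
#     && ssh ubuntu@52.90.56.122
```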
