


Amazon Elastic Compute Cloud (Amazon EC2)

Amazon EC2

Secure and resizable compute capacity in the cloud. Launch applications when needed without upfront commitments.

Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides secure, resizable compute capacity in the cloud. It is designed to make web-scale cloud computing easier for developers.
Amazon EC2’s simple web service interface allows you to obtain and configure capacity with minimal friction. It provides you with complete control of your computing resources and lets you run on Amazon’s proven computing environment. Amazon EC2 reduces the time required to obtain and boot new server instances to minutes, allowing you to quickly scale capacity, both up and down, as your computing requirements change. Amazon EC2 changes the economics of computing by allowing you to pay only for capacity that you actually use. Amazon EC2 provides developers the tools to build failure-resilient applications and isolate them from common failure scenarios.

What Is an Instance?

At the simplest level, an instance can be thought of as a virtual server, the same as you might rent on a monthly basis from a virtual private server (VPS) provider. Indeed, some people are using EC2 in exactly the same way as they would a VPS. While perfectly serviceable in this respect, to use it in this way ignores several interesting features and technologies that can make your job a lot more convenient.
Amazon Machine Images (AMIs) are the main building blocks of EC2. They allow you to configure an instance once (say, installing Apache or Nginx) and then create an image of that instance. The image can be used to launch more instances, all of which are functionally identical to the original. Of course, some attributes—such as the IP address or instance ID—must be unique, so there will be some differences.
AWS services operate in multiple geographic regions around the world. At the time of this writing, there are seventeen public AWS regions, each of which is further divided into multiple availability zones. This geographic distribution has two main benefits: you can place your application resources close to your end users for performance reasons, and you can design your application so that it is resilient to loss of service in one particular region or availability zone. AWS provides the tools to build automatic damage control into your infrastructure, so if an availability zone fails, more resources can be provisioned in the other availability zones to handle the additional load.
Each availability zone (AZ) is located in a physically separate datacenter within its region. There are three datacenters in or around Dublin, Ireland, that make up the three availability zones in the EU West 1 region—each with separate power and network connections. In theory, this means that an outage in one AZ will not have any effect on the other AZs in the region. In practice, however, an outage in one AZ can trigger a domino effect on its neighboring AZs, and not necessarily due to any failing on Amazon’s part.
Consider a well-architected application that, in the event of an AZ failure, will distribute traffic to the remaining AZs. This will result in new instances being launched in the AZs that are still available. Now consider what happens when hundreds of well-architected applications all failover at the same time—the rush for new instances could outstrip the capability of AWS to provide them, leaving some applications with too few instances.
This is an unlikely event—although AWS has service outages like any other cloud provider, deploying your application to multiple AZs will usually be sufficient for most use cases. To sustain the loss of a significant number of AZs within a region, applications must be deployed to multiple regions. This is considerably more challenging than running an application in multiple AZs.
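The capacity pressure described above can be made concrete with a little arithmetic. The sketch below assumes load is spread evenly across availability zones and that every instance carries an equal share; the function name and the numbers are illustrative only, not an AWS formula.

```python
import math

def instances_needed_per_az(total_instances, az_count, failed_azs=0):
    """Instances each surviving AZ must run to carry the full load."""
    surviving = az_count - failed_azs
    if surviving <= 0:
        raise ValueError("no availability zones left to absorb the load")
    return math.ceil(total_instances / surviving)

# Twelve instances spread evenly over three AZs.
normal = instances_needed_per_az(12, 3)
# The same twelve instances after one AZ is lost.
degraded = instances_needed_per_az(12, 3, failed_azs=1)
print(normal, degraded)
```

When many tenants need that same jump at once, the extra instances all have to come from the surviving zones, which is where the shortfall can appear.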
A final reminder that AWS services are not uniformly available across all regions—validate deployment plans involving regions you are not already familiar with against the newest version of the official Region Table.

Instance Types

EC2 instances come in a range of sizes, referred to as instance types, to suit various use cases. The instance types differ wildly in the amount of resources allocated to them. The m3.medium instance type has 3.75 GB of memory and 1 virtual CPU core, whereas its significantly bigger brother c3.8xlarge has 60 GB of memory and 32 virtual CPU cores. Each virtual CPU is a hyperthread of an Intel Xeon core in the m3 and c3 instance classes.
For most of the examples in the book, we will use a t2.micro instance, among the smaller and one of the cheapest instance types suitable for any operating system choice, which makes it ideal for our tests.
In production, picking the right instance type for each component in your stack is important to minimize costs and maximize performance, and benchmarking can be the key when making this decision.

Processing Power

EC2, along with the rest of AWS, is built using commodity hardware running Amazon’s software to provide the services and APIs. Because Amazon adds this hardware incrementally, several hardware generations are in service at any one time.
When it comes to discussing the underlying hardware that makes up the EC2 cloud, Amazon used to play its cards close to its chest and reveal relatively little information about the exact hardware specifications. This led to the creation of a dedicated compute unit:
One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.
It is easy to encounter this metric in older AWS benchmarks. Amazon now openly identifies what hardware underlies the EC2 compute layer, and these abstract units are obsolete and no longer in use.
Amazon provides a rather vast selection of instance types, the current generation of which is described at the EC2 Instance Types page. The previously mentioned t2.micro instance type therefore refers to a second generation general-purpose burstable performance instance. An immediate update of already running applications is generally not required as older generations remain available for provisioning, with their original functionality intact. It remains advisable to adopt the latest instance type generation when designing a new (or revised) application, so as to benefit from the capabilities of the newer hosting hardware.
No EC2 instance type has been discontinued in almost 10 years. This record is made possible by market forces: as newer instance types become available, their significantly better price/performance ratio induces a user migration away from the previous generation. A reduced demand base in turn allows Amazon to continue to supply those deprecated instance types without having to add capacity with old hardware that may be unavailable.
Older instance types are, however, not available in the newer AWS regions they predate—for example, the first generation to be deprecated, cc1, is not found in the newest region ap-northeast-2 hosted in Seoul, Korea. If our spirited advice and the cost savings produced by migrating to newer instance generations are not sufficient to entice you to regularly update your instance selection, perhaps your global expansion plans will.
AWS machine images may make use of either of the two virtualization types supported by the Xen hypervisor: paravirtualized or hardware virtual machine (HVM). It is not necessary to be conversant in the finer differences of the two technologies to make effective use of AWS, but the two approaches present boot-time differences to the guest OS environment. A given Linux machine image will only support booting one virtualization type as a result, a requirement easily met by filtering any image search with the appropriate virtualization type.
Amazon recommends using HVM virtualization on current-generation AMIs. Where that approach is not suitable, it becomes necessary to determine what virtualization type is supported by the older generation of a specific instance type. This is quickly accomplished by launching a test HVM instance from the AWS CLI and watching for a helpful error message. The AWS documentation also provides insight into what virtualization type is supported by what older instance type.
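Filtering an image search by virtualization type might look like the sketch below. It assumes the boto3 library (not mentioned in this article), configured AWS credentials, and a default region. The `virtualization-type` and `state` filter names come from the EC2 DescribeImages API; the helper names are hypothetical.

```python
def build_image_filters(virtualization_type):
    """Build EC2 DescribeImages filters for one virtualization type."""
    allowed = {"hvm", "paravirtual"}
    if virtualization_type not in allowed:
        raise ValueError(f"expected one of {sorted(allowed)}")
    return [
        {"Name": "virtualization-type", "Values": [virtualization_type]},
        {"Name": "state", "Values": ["available"]},
    ]

def find_images(virtualization_type, owners=("amazon",)):
    """Query EC2 for matching images; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the filter helper works without it
    ec2 = boto3.client("ec2")
    response = ec2.describe_images(
        Owners=list(owners),
        Filters=build_image_filters(virtualization_type),
    )
    return [image["ImageId"] for image in response["Images"]]

print(build_image_filters("hvm"))
```

Only `build_image_filters` runs offline; `find_images` makes a real API call and may return a very long list.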
Different combinations of CPU, memory, network bandwidth, and even custom hardware differentiate AWS instance types. There are nine instance type classes in the current generation at the time of writing, including general purpose (M4, M3), burstable performance (T2), compute optimized (C4, C3), memory intensive (R3), storage optimized (I2 for performance, or D2 for cost), and GPU enabled (G2). These in turn include multiple types with resource allotments of increasing size, bringing the total number of choices we select from above forty.
Jeff Barr of Amazon has published an interesting timeline of EC2’s instance generations.
Taking a scientific approach to benchmarking is the only way to really be sure you are using the right instance type. AWS makes it really simple to run the very same workload configuration with a succession of different instance types, considerably simplifying this task. The most common approach in the AWS user community is to start with an instance type considered high-CPU for the workload under consideration. While running top, drive the CPU to 100% using your application’s load generator of choice. Now examine memory use: if you observe the instance running out of memory before the CPU is at full throttle, switch to a higher-memory instance type. Continue this process until you achieve a reasonable balance.
Alongside fixed-performance instances, including the C4, C3, and R3 types, EC2 offers burstable performance instances like the T2 type. Burstable performance instances generally operate at a CPU performance baseline but can “burst” above this limit for a time. Bursting is governed by CPU credits that are accumulated when the instance runs without its full allotment of CPU. A CPU credit represents use of a full CPU core for one minute.
A practical example will illustrate the accounting mechanism EC2 employs: a t2.micro instance type allocates one virtual CPU to your cloud instance, with six CPU credits earned each hour, representing a 10% share of a real CPU core. Let’s assume our workload is a web server, often idling while waiting for requests. If the CPU load falls below 10%, the unused CPU credits are added to that instance’s credit balance, where they remain available for up to 24 hours. Burstable performance is particularly useful for workloads that do not consistently use their full share of the CPU but benefit from having access to additional, fast CPUs when the occasion arises; applications include small databases, web servers, and development systems.
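The credit arithmetic for the t2.micro example can be checked with a few lines. This is a simplified sketch that uses only the figures quoted above (six credits per hour, one credit equal to one core-minute, a 24-hour accrual window). It ignores credits earned while bursting, and the function names are made up for illustration.

```python
CREDITS_PER_HOUR = 6        # t2.micro earn rate quoted above
MINUTES_PER_CREDIT = 1      # one credit is one full core for one minute
ACCRUAL_WINDOW_HOURS = 24   # credits accumulate for up to 24 hours

def baseline_share(credits_per_hour):
    """Fraction of a core the instance can use indefinitely."""
    return credits_per_hour * MINUTES_PER_CREDIT / 60

def max_credit_balance(credits_per_hour, window_hours=ACCRUAL_WINDOW_HOURS):
    """Largest balance an idle instance can build up."""
    return credits_per_hour * window_hours

def burst_minutes(balance, cores=1):
    """Minutes at 100% CPU the balance funds, ignoring credits earned meanwhile."""
    return balance * MINUTES_PER_CREDIT / cores

print(baseline_share(CREDITS_PER_HOUR))
print(max_credit_balance(CREDITS_PER_HOUR))
print(burst_minutes(max_credit_balance(CREDITS_PER_HOUR)))
```

Six core-minutes out of every sixty is where the 10% baseline comes from.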
Alongside the traditional CPU shares of us (user), sy (system), id (idle), and wa (IO wait), the EC2 hypervisor exposes the additional metric st, meaning stolen:
%Cpu(s):  0.1 us,  0.1 sy,  0.1 ni, 98.2 id,  1.0 wa,  0.0 hi,  0.0 si,  0.5 st
Stolen CPU time represents the share of time the instance’s virtual CPU has been waiting for a real CPU while the hypervisor is using it to service another virtual processor. Stolen CPU has gained prominence as a metric that Netflix, possibly the most prominent AWS tenant, tracks closely. Despite its present fame, Stolen CPU is not as significant for workloads that are not sensitive to network jitter or real-time in nature.
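To track the stolen share over time, the summary line printed by top can be parsed. The sketch below assumes the exact line layout shown above; other versions of top, or other locales, may format the line differently, and `parse_cpu_line` is a hypothetical helper.

```python
import re

def parse_cpu_line(line):
    """Map each two-letter CPU label in a top summary line to its percentage."""
    pairs = re.findall(r"(\d+(?:\.\d+)?)\s+([a-z]{2})\b", line)
    return {label: float(value) for value, label in pairs}

sample = ("%Cpu(s):  0.1 us,  0.1 sy,  0.1 ni, 98.2 id,"
          "  1.0 wa,  0.0 hi,  0.0 si,  0.5 st")
shares = parse_cpu_line(sample)
print(shares["st"])
```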
The Noisy Neighbor is a related compute cause célèbre: in any virtual environment, the noisy neighbor effect occurs when an instance starves other instances for a shared resource, causing performance issues to others on the same infrastructure. You will not observe memory or CPU contention as EC2 instances are generally not overprovisioned; any potential noisy neighbor problems will be limited to network or disk I/O.
One simple approach countering this issue is to automatically allocate a new instance, replacing the one where the performance problem was encountered. Larger instance types are less likely to present this problem on account of sharing a host with fewer neighbors. SR-IOV support (Enhanced Networking) increases storage and network I/O bandwidth, helping to minimize any noise. The ultimate solution is to use Dedicated Hosts, a facility providing complete control of your instance placement for an additional fee.
Specific instance types may provide the latest advanced features found in Intel hardware, including on-chip support for AES encryption and the Advanced Vector Extensions instruction set. The G2 instance type is currently the most prominent example of enhanced compute support, featuring more than 1,500 NVIDIA GPU cores. Advanced compute options are rapidly evolving; their most recent iteration is documented in the instance types page, which we recommend you review often.
EC2 instances can be purchased in three ways. Allocated by the hour and requiring no upfront commitment, on-demand instances are the default and are used exclusively throughout this book. Reserved instances represent a prepaid commitment on the part of a customer, which is usually rewarded by AWS with very steep discounts, up to 75% of on-demand pricing. Spot instance pricing requires no upfront commitment, and their pricing fluctuates according to the supply and demand of compute capacity. The customer may define a maximum hourly price not to be exceeded, and EC2 will automatically shut those instances down if their spot pricing tops the set threshold.


Storage

There are two options when it comes to virtual disk storage for your instances: instance storage (also known as ephemeral storage) and Elastic Block Store (or EBS). Both are simply block storage devices that can be attached to instances. Once attached, they can be formatted with your operating system’s tools and will act like a standard disk. AWS storage comes in two flavors: magnetic disks and solid-state drives (SSDs). SSDs provide higher read and write performance when compared to magnetic disks, but the cost is slightly higher.
There are some key differences between instance storage and EBS. Instance storage is directly attached to the physical host that runs your instance, whereas EBS is attached over the network. This has implications in terms of disk latency and throughput, so we recommend performing another series of benchmarks to see which is best if your application is sensitive to latency or I/O jitter.
I/O speeds are not the only difference: EBS has features that make it preferable to instance storage in nearly all usage scenarios. One of the most useful is the ability to create a snapshot from an EBS volume. A snapshot is a copy of an EBS volume at a particular point in time. Once you have created a snapshot, you can then create additional EBS volumes that will be identical copies of the source snapshot. You could, for example, create a snapshot containing your database backups. Every time a new instance is launched, it will have a copy of the data ready for use. EBS snapshots form the backbone of many AWS backup strategies.
When an instance is terminated, any data stored on instance storage volumes is lost permanently. EBS volumes can persist after the instance has been terminated. Given all of the additional features, using EBS volumes is clearly preferable except in a few cases, such as when you need fast temporary storage for data that can be safely discarded.
Multiple volumes (of either type) can be attached to an instance, leading to pretty flexible storage configurations. The Block Device Mapping facility allows multiple volumes to be associated with an instance at boot time. It is even possible to attach multiple volumes to an instance and build a software RAID array on them—an advantage of volumes appearing as block storage devices to the operating system.
The disk_setup and mounts modules of Cloud-init allow customization of all disks associated with an instance upon boot, including partitioning and formatting disks as well as configuring mount points in /etc/fstab. The official documentation also sheds light on the details of how many public clouds can initialize their instance storage using Cloud-init.
In June 2012, AWS began offering SSDs as a higher-performance alternative to magnetic storage, and over time introduced multiple options with different performance levels and cost. Some instance types now include an SSD-backed instance store to deliver very-high random I/O performance, with types I2 and R3 being the first to support TRIM extensions. Instance types themselves have evolved to include high-I/O instances (type I2), aimed at delivering high IOPS from up to 8 local SSD drives, while dense storage instances (type D2) offer the lowest price per-disk throughput in EC2 and balance cost and performance, using 24 local magnetic drives.
EBS Magnetic and SSD volumes are currently limited to 16 TB in size, limits easily exceeded by dense storage (d2) instances, which can boot with 48 TB of local disk storage. Whereas EBS volumes can be provisioned at any time and in arbitrary configurations, the number and size of available instance store volumes varies with instance type, and can only be attached to an instance at boot time. In addition, EBS volumes can be dynamically resized, which is also used to redefine their performance at runtime.
EBS SSD options include a number of performance flavors. General-purpose SSD volumes are provisioned with 3 IOPS per GB, with burst performance reaching 3,000 IOPS for extended periods. Provisioned IOPS SSD volumes allow the user to define the desired level of performance, up to 20,000 IOPS and 320 MB/s of throughput. A less costly option is offered by the EBS-optimized M4 type instances, which include dedicated EBS bandwidth between 450 and 4,000 Mbps depending on the specific instance type. EBS-optimized instances use an optimized configuration stack requiring corresponding support on the machine image’s part for optimal performance.
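The relationship between volume size and general-purpose SSD performance can be worked out directly from the figures above. This sketch uses only the 3 IOPS per GB and 3,000 IOPS burst numbers quoted in the text; any floors or caps AWS applies in practice are deliberately left out, and the function names are illustrative.

```python
IOPS_PER_GB = 3      # general-purpose SSD baseline quoted above
BURST_IOPS = 3000    # burst ceiling quoted above

def gp_ssd_baseline_iops(size_gb):
    """Baseline IOPS provisioned for a volume of the given size."""
    return IOPS_PER_GB * size_gb

def benefits_from_burst(size_gb):
    """True when the burst ceiling is higher than the volume's baseline."""
    return gp_ssd_baseline_iops(size_gb) < BURST_IOPS

for size in (100, 500, 1000):
    print(size, gp_ssd_baseline_iops(size), benefits_from_burst(size))
```

Under these figures, a volume of 1,000 GB or more already has a baseline at or above the burst level.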
Long-term storage options are best supported by the S3 service, but a block storage option is available through Cold HDD EBS volumes. Backed by magnetic drives, Cold HDD volumes offer the lowest cost per GB of all EBS volume types, and still provide enough performance to support a full-volume scan at burst speeds. EBS also supports native at-rest encryption that is transparently available to EC2 instances and requires very little effort on the administrator’s part to deploy and maintain. EBS encryption has no IOPS performance impact and shows very limited impact on latency, making it a general-purpose architectural option even when high security is not strictly required.


Networking

At its simplest, networking in AWS is straightforward: launching an instance with the default networking configuration will give you an instance with a public IP address. Many applications will require nothing more complicated than enabling SSH or HTTP access. At the other end of the scale, Amazon offers more-advanced solutions that can, for example, give you a secure VPN connection from your datacenter to a Virtual Private Cloud (VPC) within EC2.
At a minimum, an AWS instance has one network device attached. The maximum number of network devices that can be attached depends on the instance type. Running ip addr show on your instance will show that it has a private IP address in the default range. Every instance has a private IP and may have a public IP; this can be configured at launch time or later, with the association of an Elastic IP address.
AWS accounts created after December 2013 no longer have access to the legacy EC2-Classic networking model. This book covers the current EC2-VPC networking model exclusively.
Amazon Virtual Private Cloud enables you to provision EC2 instances in a virtual network of your own design. A VPC is a network dedicated to your account, isolated from other networks in AWS, and completely under your control. You can create subnets and gateways, configure routing, select IP address ranges, and define its security perimeter—a series of complex tasks that are bypassed by the existence of the default VPC. The default VPC includes a default subnet in each availability zone, along with routing rules, a DHCP setup, and an internet gateway. The default VPC enables new accounts to immediately start launching instances without having to first master advanced VPC configuration, but its security configuration will not allow instances to accept connections from the internet until we expressly give our permission, by assigning our own security group settings.
The default security group allows all outbound traffic from instances to reach the internet, and also permits instances in the same security group to receive inbound traffic from one another, but not from the outside world. Instances launched in the default VPC receive both a public and a private IP address. Behind the scenes, AWS will also create two DNS entries for convenience.
For example, if an instance has a private IP of 172.31.16.166 and a public IP of 54.152.163.171, their respective DNS entries will be ip-172-31-16-166.ec2.internal and ec2-54-152-163-171.compute-1.amazonaws.com. These DNS entries are known as the private hostname and public hostname.
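The naming pattern in this example can be reproduced with two small helpers. The sketch reads the IP addresses off the hostnames above. The suffixes are copied from that single example and will likely differ in other regions, and the function names are hypothetical.

```python
def private_hostname(private_ip, suffix="ec2.internal"):
    """Private hostname in the style of the example above."""
    return "ip-" + private_ip.replace(".", "-") + "." + suffix

def public_hostname(public_ip, suffix="compute-1.amazonaws.com"):
    """Public hostname in the style of the example above."""
    return "ec2-" + public_ip.replace(".", "-") + "." + suffix

print(private_hostname("172.31.16.166"))
print(public_hostname("54.152.163.171"))
```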
It is interesting to note that Amazon operates a split-view DNS system, which means it is able to provide different responses depending on the source of the request. If you query the public DNS name from outside EC2 (not from an EC2 instance), you will receive the public IP in response. However, if you query the public DNS name from an EC2 instance in the same region, the response will contain the private IP:
# From an EC2 instance
$ dig ec2-54-152-163-171.compute-1.amazonaws.com +short
172.31.16.166
# From Digital Ocean
$ dig ec2-54-152-163-171.compute-1.amazonaws.com +short
54.152.163.171
The purpose of this is to ensure that traffic does not leave the internal EC2 network needlessly. This is important as AWS has a highly granular pricing structure when it comes to networking, and Amazon makes a distinction between traffic destined for the public internet and traffic that will remain on the internal EC2 network. The full breakdown of costs is available on the EC2 Pricing page.
If two instances in the same availability zone communicate using their private IPs, the data transfer is free of charge. However, using their public IPs will incur internet transfer charges on both sides of the connection. Although both instances are in EC2, using the public IPs means the traffic will need to leave the internal EC2 network, which will result in higher data transfer costs.
By using the private IP of your instances when possible, you can reduce your data transfer costs. AWS makes this easy with their split-horizon DNS system: as long as you always reference the public hostname of the instance (rather than the public IP), AWS will pick the cheapest option.
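To see which view of the split-horizon DNS a given machine receives, you can resolve the public hostname and check whether the answer is a private address. This is a minimal sketch using the Python standard library. The resolver is injectable so the logic can be exercised without network access, and `resolved_address_kind` is a made-up name.

```python
import ipaddress
import socket

def resolved_address_kind(hostname, resolver=socket.gethostbyname):
    """Return 'private' or 'public' for the address the hostname resolves to."""
    address = ipaddress.ip_address(resolver(hostname))
    return "private" if address.is_private else "public"

# Example with stand-in resolvers instead of a real DNS lookup.
name = "ec2-54-152-163-171.compute-1.amazonaws.com"
print(resolved_address_kind(name, resolver=lambda host: "172.31.16.166"))
print(resolved_address_kind(name, resolver=lambda host: "54.152.163.171"))
```

Called without the `resolver` argument from inside EC2, the function performs a real lookup and should report the private view for instances in the same region.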

Amazon EC2 Pricing

Amazon EC2 is free to try. There are four ways to pay for Amazon EC2 instances:
  • On-Demand
  • Reserved Instances
  • Spot Instances
  • Dedicated Hosts, which provide you with EC2 instance capacity on physical servers dedicated for your use

On-Demand Instances
With On-Demand instances, you pay for compute capacity by the hour or by the second, depending on which instances you run. No longer-term commitments or upfront payments are needed. You can increase or decrease your compute capacity depending on the demands of your application and pay only the specified hourly rate for the instances you use.
On-Demand instances are recommended for:
  • Users that prefer the low cost and flexibility of Amazon EC2 without any up-front payment or long-term commitment
  • Applications with short-term, spiky, or unpredictable workloads that cannot be interrupted
  • Applications being developed or tested on Amazon EC2 for the first time
Spot Instances
Amazon EC2 Spot instances allow you to request spare Amazon EC2 computing capacity for up to 90% off the On-Demand price.
Spot instances are recommended for:
  • Applications that have flexible start and end times
  • Applications that are only feasible at very low compute prices
  • Users with urgent computing needs for large amounts of additional capacity

Reserved Instances
Reserved Instances provide you with a significant discount (up to 75%) compared to On-Demand instance pricing. In addition, when Reserved Instances are assigned to a specific Availability Zone, they provide a capacity reservation, giving you additional confidence in your ability to launch instances when you need them.
For applications that have steady state or predictable usage, Reserved Instances can provide significant savings compared to using On-Demand instances.
Reserved Instances are recommended for:
  • Applications with steady state usage
  • Applications that may require reserved capacity
  • Customers that can commit to using EC2 over a 1 or 3 year term to reduce their total computing costs
Dedicated Hosts
A Dedicated Host is a physical EC2 server dedicated for your use. Dedicated Hosts can help you reduce costs by allowing you to use your existing server-bound software licenses, including Windows Server, SQL Server, and SUSE Linux Enterprise Server (subject to your license terms), and can also help you meet compliance requirements.
  • Can be purchased On-Demand (hourly).
  • Can be purchased as a Reservation for up to 70% off the On-Demand price.
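The discount ceilings quoted in this section can be turned into a rough best-case comparison. The hourly rate below is a hypothetical placeholder rather than an actual AWS price, and the "up to" figures are upper bounds, so real savings will usually be smaller. Dedicated Hosts are left out because their discount applies to the host price rather than to an instance price.

```python
def discounted_rate(on_demand_hourly, discount):
    """Hourly rate after a fractional discount (0.75 means 75% off)."""
    if not 0 <= discount <= 1:
        raise ValueError("discount must be between 0 and 1")
    return on_demand_hourly * (1 - discount)

def monthly_cost(hourly_rate, hours=730):
    """Cost of running one instance for a month of about 730 hours."""
    return hourly_rate * hours

ON_DEMAND_HOURLY = 0.10  # hypothetical rate, not an actual AWS price

best_case = {
    "on-demand": monthly_cost(ON_DEMAND_HOURLY),
    "reserved (up to 75% off)": monthly_cost(discounted_rate(ON_DEMAND_HOURLY, 0.75)),
    "spot (up to 90% off)": monthly_cost(discounted_rate(ON_DEMAND_HOURLY, 0.90)),
}
for option, cost in best_case.items():
    print(f"{option}: ${cost:.2f} per month")
```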

Adding New Custom Fields in Server Class in iTop CMDB tool

I have added four fields to the Server class, highlighted below, by following the instructions on the iTop Hub page, which are reproduced further down.



You can simply download the extension from here and deploy it from the setup page in iTop if you want to configure the same four fields.

Here is the link to download the four files that deploy the four fields above: How to Access, Patch History, Cluster Details, and Additional Details. It is similar to the sample below, in which an Additional Notes field was added and tested.


Goals of this tutorial

In this step-by-step tutorial you will learn to:

  • create your own extension module for iTop 2.0

  • add a new field to an existing class of object

For the purpose of this tutorial we will add a text field labeled Additional Notes to the Server object.

Customized Server

What you will need

  • iTop installed on a development machine, on which you can easily access/edit the files.

  • A text editor capable of editing PHP and XML files and supporting UTF-8. On Windows you can use Wordpad (Notepad does not like Unix line endings) or one of the excellent free development IDEs like PSPad or Notepad++.

Customization process

The customization process is the following:

  1. Install a development instance of iTop. It is always better not to experiment in production!

  2. Install the toolkit to assist you in the customization

  3. Create a new (empty) module using the module creation wizard

  4. Copy this new module into the extensions folder of iTop and run the setup again to install the empty module

  5. Modify the module in extensions and use the toolkit to check your customizations

Repeat the last step until you are satisfied with your customization. When you are done, your new module is ready to be deployed. Copy the module folder into the extensions directory on your production iTop instance and run the setup to install it.

Install the empty module

Expand the content of the zip into the extensions folder of your development iTop instance. You should now have a folder named sample-add-attribute inside the extensions folder. This folder contains the following files:

  • datamodel.sample-add-attribute.xml

  • module.sample-add-attribute.php

  • en.dict.sample-add-attribute.php

  • model.sample-add-attribute.php

Make sure that the file conf/production/config-itop.php is writable for the web server (on Windows: right click to display the file properties and uncheck the read-only flag; on Linux change the rights of the file), then launch the iTop installation by pointing your browser to http://your_itop/setup/
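The write-permission check can also be scripted before launching the setup. The sketch below tests access for the user running the script, which is not necessarily the web server's user, so treat a positive result as a hint rather than proof. The `is_writable` helper is hypothetical, and the path is the one named above, relative to the iTop root.

```python
import os

def is_writable(path):
    """True when the path is an existing file the current user may write to."""
    return os.path.isfile(path) and os.access(path, os.W_OK)

config_path = os.path.join("conf", "production", "config-itop.php")
print(config_path, "writable:", is_writable(config_path))
```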

Launching the re-install

Click “Continue »” to start the re-installation.

Make sure that “Update an existing instance” is selected before clicking “Next »”.

Continue to the next steps of the wizard…

Your custom module should appear in the list of “Extensions”. If it's not the case, check that the module files were copied in the proper location and that the web server has enough rights to read them.
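If the module does not show up, a quick script can confirm that the four files listed earlier are present and readable. This sketch assumes the folder layout described in this tutorial (a sample-add-attribute folder inside extensions) and checks readability for the current user only; `missing_module_files` is a made-up helper name.

```python
import os

EXPECTED_FILES = [
    "datamodel.sample-add-attribute.xml",
    "module.sample-add-attribute.php",
    "en.dict.sample-add-attribute.php",
    "model.sample-add-attribute.php",
]

def missing_module_files(extensions_dir, module="sample-add-attribute"):
    """Return the expected module files that are absent or unreadable."""
    module_dir = os.path.join(extensions_dir, module)
    return [
        name for name in EXPECTED_FILES
        if not os.access(os.path.join(module_dir, name), os.R_OK)
    ]

print(missing_module_files("extensions"))
```

An empty list means all four files were found; any names printed point to files that need to be copied again or have their permissions fixed.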

Select your custom module before clicking “Next »” and complete the installation.

Add a new field to the Server class

Using your favorite text editor, open the file datamodel.sample-add-attribute.xml.

Remove the tags <menus></menus> since the module will not contain any menu definition.

Inside the classes tag, add the following piece of code:

    <class id="Server">
        <field id="notes" xsi:type="AttributeText" _delta="define">

This instructs iTop to modify the existing class “Server” by adding a new field (notice the _delta="define" on the field tag) of type AttributeText. This new field is named notes (since it is defined with id="notes"). The corresponding values will be stored in the database in the column notes (thanks to the definition <sql>notes</sql>).
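The nesting described here can be sketched by building the fragment programmatically. The example below uses Python's standard ElementTree module and includes only the elements this tutorial names: the classes tag, the Server class, the fields wrapper mentioned later, the field with its attributes, and the sql child. Any other child elements a complete field definition may need are not shown, so check the XML reference documentation before using the output.

```python
import xml.etree.ElementTree as ET

def build_server_field():
    """Build <classes> holding the Server class and its new notes field."""
    classes = ET.Element("classes")
    server = ET.SubElement(classes, "class", {"id": "Server"})
    fields = ET.SubElement(server, "fields")
    field = ET.SubElement(
        fields,
        "field",
        {"id": "notes", "xsi:type": "AttributeText", "_delta": "define"},
    )
    # The <sql> child names the database column, as explained above.
    ET.SubElement(field, "sql").text = "notes"
    return classes

print(ET.tostring(build_server_field(), encoding="unicode"))
```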

For more information about the meaning of the various parameters of the field tag (and also for the list of all possible types of fields) refer to the XML reference documentation.

You should now have the following XML file:

<?xml version="1.0" encoding="UTF-8"?>
<itop_design xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="1.0">
    <class id="Server">
        <field id="notes" xsi:type="AttributeText" _delta="define">

Check your modification by running the toolkit. Point your browser to http://your_itop/toolkit.

Checking the modifications with the toolkit

If any error is reported at this stage, fix it by editing the XML file and check your modifications again by clicking on the “Refresh” button in the toolkit page.

Once all the errors have been fixed, you can apply the modifications to iTop by using the second tab of the toolkit:

Applying the modifications to iTop

Click on the button Update iTop Code to:

  1. Compile the XML data model to PHP classes

  2. Update the database schema to add the new text column.

At this point, if you look at the schema of the MySQL database, you can see the additional “notes” column added to the “server” table. However if you navigate to a Server in iTop, nothing has changed.

The updated database schema in phpMyAdmin

This is because iTop was not instructed how to display the added field. So the field exists but is not displayed in iTop.

Make the new field visible

Let's add the new field to the “details” of the Server object, just below the “Description”. This can be achieved by redefining the way the “details” of a Server are displayed.

Using your text editor, open the file datamodels/2.x/itop-config-mgmt/datamodel.itop-config-mgmt.xml.

Search for the string <class id="Server" to locate the definition of the Server class.

Scroll down to the <presentation> tag and copy the whole content of the <details>…</details> tag.

Paste this whole definition in datamodel.sample-add-attribute.xml after the closing </fields> tag, and enclose it in <presentation>…</presentation> tags.

Change the opening tag <details> to <details _delta="redefine"> in order to instruct iTop to redefine the presentation for the “details”.

Insert the 3 lines:

                    <item id="notes">

Just after the lines:

                    <item id="description">
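The same edit, placing a notes item right after the description item and marking the details block as redefined, can be expressed as a small tree manipulation. This is a sketch using Python's standard ElementTree module on a cut-down stand-in for the real details block. The helper name is hypothetical, and any child elements that real item tags carry are omitted.

```python
import xml.etree.ElementTree as ET

def insert_item_after(parent, after_id, new_id):
    """Insert <item id=new_id> directly after the child item with id=after_id."""
    for index, child in enumerate(list(parent)):
        if child.tag == "item" and child.get("id") == after_id:
            new_item = ET.Element("item", {"id": new_id})
            parent.insert(index + 1, new_item)
            return new_item
    raise ValueError(f"no item with id {after_id!r} found")

# Cut-down stand-in for the fieldset that holds the description item.
details = ET.Element("details")
fieldset = ET.SubElement(details, "item", {"id": "fieldset:Server:otherinfo"})
for item_id in ("powerA_id", "powerB_id", "description"):
    ET.SubElement(fieldset, "item", {"id": item_id})

details.set("_delta", "redefine")
insert_item_after(fieldset, "description", "notes")
print([child.get("id") for child in fieldset])
```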

You should now obtain the following XML file:

<?xml version="1.0" encoding="UTF-8"?>
<itop_design xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="1.0">
    <class id="Server">
        <field id="notes" xsi:type="AttributeText" _delta="define">
        <details _delta="redefine">
            <item id="softwares_list">
            <item id="contacts_list">
            <item id="documents_list">
            <item id="tickets_list">
            <item id="physicalinterface_list">
            <item id="fiberinterfacelist_list">
            <item id="networkdevice_list">
            <item id="san_list">
            <item id="logicalvolumes_list">
            <item id="providercontracts_list">
            <item id="services_list">
            <item id="col:col1">
                <item id="fieldset:Server:baseinfo">
                    <item id="name">
                    <item id="org_id">
                    <item id="status">
                    <item id="business_criticity">
                    <item id="location_id">
                    <item id="rack_id">
                    <item id="enclosure_id">
                <item id="fieldset:Server:moreinfo">
                    <item id="brand_id">
                    <item id="model_id">
                    <item id="osfamily_id">
                    <item id="osversion_id">
                    <item id="oslicence_id">
                    <item id="cpu">
                    <item id="ram">
                    <item id="nb_u">
                    <item id="serialnumber">
                    <item id="asset_number">
            <item id="col:col2">
                <item id="fieldset:Server:Date">
                    <item id="move2production">
                    <item id="purchase_date">
                    <item id="end_of_warranty">
                <item id="fieldset:Server:otherinfo">
                    <item id="powerA_id">
                    <item id="powerB_id">
                    <item id="description">
                    <item id="notes">

Check your modification by running the toolkit. Point your browser to http://your_itop/toolkit.

Checking the modifications with the toolkit

If any error is reported at this stage, fix it by editing the XML file, then check your modifications again by clicking the “Refresh” button in the toolkit page.
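The toolkit remains the authoritative check, but unbalanced tags can be caught locally before reloading the page. Here is a minimal sketch in Python using only the standard library; it is not part of iTop, the two XML fragments are hypothetical, and it tests well-formedness only, not whether iTop accepts the data model.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return False, str(exc)
    return True, None


# Hypothetical fragments, used only to exercise the check.
good = '<details _delta="redefine"><item id="notes"/></details>'
bad = '<details _delta="redefine"><item id="notes"></details>'

print(check_well_formed(good))    # well-formed fragment
print(check_well_formed(bad)[0])  # unclosed <item> tag
```

To check the real file, read its contents and pass the text to check_well_formed.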

Once all the errors have been fixed, you can apply the modifications to iTop by using the second tab of the toolkit:

Applying the modifications to iTop

If you now navigate to the details of a Server in iTop you should see the following:

New field with missing dictionary

Add a label for the new field

Notice that the label of the new field in iTop is notes (by default it is equal to the name of the field). To change this to Additional Notes, we have to add an entry in the dictionary.

Using your text editor, open the file en.dict.sample-add-attribute.php.

Insert the line:

  'Class:Server/Attribute:notes' => 'Additional Notes',

Just below the comment:

  // Dictionary entries go here

You should obtain the following file:

<?php
/**
 * Localized data
 * @copyright   Copyright (C) 2013 Your Company
 * @license     http://opensource.org/licenses/AGPL-3.0
 */
Dict::Add('EN US', 'English', 'English', array(
        // Dictionary entries go here
        'Class:Server/Attribute:notes' => 'Additional Notes',
));

One more time, check your modification by running the toolkit.

Checking the modifications with the toolkit

If errors are reported at this stage, fix them by editing the PHP file, then check your modifications again by clicking the “Refresh” button in the toolkit page.

Once all the errors have been fixed, you can apply the modifications to iTop by using the second tab of the toolkit:

Applying the modifications to iTop

If you navigate to the details of a Server in iTop, you should now see the following:

Customized Server

Final Customization Module

The final result of the customization is available in the zip file below:



Install & Configure IT Operational Portal Using iTop on Red Hat or CentOS

iTop, which stands for IT Operational Portal, is an open source, web-based application for the day-to-day operations of an IT environment. iTop was designed with ITIL best practices in mind but does not dictate any specific process. The application is flexible enough to adapt to your processes, whether you want informal and pragmatic processes or strict ITIL-aligned behaviour.


Using iTop you can:

- Document your entire IT infrastructure assets such as servers, applications, network devices,
   virtual machines, contacts, etc.
- Manage incidents, user requests, planned outages.
- Document IT services and contracts with external providers including service level agreements.
- Export all the information in a manual or scripted manner.
- Import or synchronize/federate any data from external systems.

Features:

- Fully configurable CMDB.
- HelpDesk and Incident Management.
- Service and Contract Management.
- Change Management.
- Configuration Management.
- Automatic SLA management.
- Automatic impact analysis.
- CSV import tool for all data.
- Consistency audit to check data quality.
- Data synchronization (data federation).

Configuration :
Step: 1. Install EPEL Repo :
# yum -y install epel-release
Step: 2. Install Apache Server :
# yum -y install httpd httpd-devel mod_ssl wget
Step: 3. Start Apache Server :
# service httpd restart
# chkconfig httpd on
Step: 4. Install MySQL Server :
# yum -y install mysql mysql-server mysql-devel
Step: 5. Set MySQL Root Password :
# service mysqld restart
# chkconfig mysqld on
# mysql_secure_installation

Step: 6. Install PHP5 Scripting Language :

# yum -y install php php-mysql php-common php-gd php-mbstring php-mcrypt php-devel \
   php-xml php-imap php-ldap php-odbc php-pear php-xmlrpc php-soap \
   php-cli graphviz
Step: 7. Adjust the following PHP setting :
# vi /etc/php.ini
post_max_size = 32M
-- Save & Quit (:wq)
Step: 8. Restart Apache Server To Load the New Configuration :
# service httpd restart
Step: 9. Download & Install iTop :
# yum -y install zip unzip
# cd /var/www/html
# wget -O iTop-2.5.1-4123.zip "https://downloads.sourceforge.net/project/itop/itop/2.5.1/iTop-2.5.1-4123.zip?r=https%3A%2F%2Fsourceforge.net%2Fprojects%2Fitop%2Ffiles%2Flatest%2Fdownload%3Fsource%3Ddirectory&ts=1540532105"
# unzip iTop-2.5.1-4123.zip
# mv web itop
# rm -rf iTop-2.5.1-4123.zip INSTALL LICENSE README
Step: 10. Create the following directories and make them writable :
# mkdir /var/www/html/itop/conf
# mkdir /var/www/html/itop/data
# mkdir /var/www/html/itop/env-production
# mkdir /var/www/html/itop/log
# chmod 777 /var/www/html/itop/conf/
# chmod 777 /var/www/html/itop/data
# chmod 777 /var/www/html/itop/env-production/
# chmod 777 /var/www/html/itop/log
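Before opening the web installer, it can help to confirm that the four directories from Step 10 exist and are writable. The sketch below is one way to do that in Python. The helper name missing_or_readonly is made up for this example, and the check applies to the user running the script, which may differ from the user Apache runs as.

```python
import os
import tempfile

# Directory names taken from Step 10.
REQUIRED_DIRS = ("conf", "data", "env-production", "log")


def missing_or_readonly(base):
    """Return the required subdirectories that are absent or not writable."""
    problems = []
    for name in REQUIRED_DIRS:
        path = os.path.join(base, name)
        if not (os.path.isdir(path) and os.access(path, os.W_OK)):
            problems.append(name)
    return problems


# Demonstration against a throwaway directory instead of /var/www/html/itop.
with tempfile.TemporaryDirectory() as base:
    os.mkdir(os.path.join(base, "conf"))
    print(missing_or_readonly(base))  # the three directories not yet created
```

On the server you would call missing_or_readonly("/var/www/html/itop") and expect an empty list.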
Step: 11. Finally, Install iTop Using Web Browser :

Welcome to iTop version 2.0.2 - Mozilla Firefox_001
-- Click on "Continue"
-- Select "Install a New iTOP"

Install or Upgrade choice - Mozilla Firefox_002
-- Click on "Next"
-- I Accept the Agreement.

License Agreement - Mozilla Firefox_003
-- Click Next
-- MySQL Server Details :
Server Name: localhost,
Login: root,
Password: redhat
Database :
Select Create a new Database: itopdb
-- Click Next.

Database Configuration - Mozilla Firefox_005
Administrator Account :
Login: admin
Password: Passw0rd
Confirm password: Passw0rd

Administrator Account - Mozilla Firefox_006
-- Language: English
-- Click Next.
Sample Data :
If you are installing directly into a production environment, select the second option and click Next. I wanted to populate my database with some demo data, so I checked the first option.
-- Click Next.
-- Click Next.

Miscellaneous Parameters - Mozilla Firefox_008
-- Select "Service Management for Enterprises"
-- Click Next.

Configuration Management options - Mozilla Firefox_010
-- Select "ITIL Compliant Tickets Management" & Check 'User Request Management' & 
     'Incident Management'
-- Then Click Next

Service Management options - Mozilla Firefox_012
-- Select "ITIL Change Management"
-- Click Next.

Tickets Management options - Mozilla Firefox_013
-- Check Both Option 'Known Errors Management' & 'Problem Management'
-- Click Next.

Additional ITIL tickets - Mozilla Firefox_016
-- Click Install.
-- Finally Click on Enter iTop.

Ready to install - Mozilla Firefox_018

Done - Mozilla Firefox_019

Welcome to iTop - Mozilla Firefox_020



Salt Essentials

What Is Salt?

Salt is a remote execution framework and configuration management system. It is similar to Chef, Puppet, Ansible, and cfengine. These systems are all written to solve the same basic problem: how do you maintain consistency across many machines, whether it is 2 machines or 20,000? What makes Salt different is that it accomplishes this high-level goal via a very fast and secure communication system. Its high-speed data bus allows Salt to manage just a few hosts or even a very large environment containing thousands of hosts. This is the very backbone of Salt. Once this encrypted communication channel is established, many more options open up. On top of this authenticated event bus is the remote execution engine. Then, continuing to build on existing layers, comes the state system. The state system uses the remote execution system, which, in turn, is layered on top of the secure event bus. This layering of functionality is what makes Salt so powerful.
But this is just the core of what Salt provides. Salt is written in Python, and its execution framework is just more Python. The default configuration uses a standard data format, YAML. Salt comes with a couple of other options if you don’t want to use YAML. The power of Salt is in its extensibility. Most of Salt can easily be customized—everything from the format of the data files to the code that runs on each host to how data is exchanged. Salt provides powerful application programming interfaces (APIs) and easy ways to layer new code on top of existing modules. This code can either be run on the centralized coordination host (aka the master) or on the clients themselves (aka the minions). Salt will do the “heavy lifting” of determining which host should run which code based on a number of different targeting options.

High-Level Architecture

There are a few key terms that you need to understand before moving on. First, all of your hosts are called minions. Actions performed on them are usually coordinated via a centralized machine called the master. As a host, the master is also a minion to itself. In most cases, you will initiate commands on the master giving a target, the command to run, and any arguments. Salt will expand the target into a list of minions. In the simplest case, the target can be a single minion specified by its minion ID. You can also list several minion IDs, or use globs to provide some pattern to match against. (For example, a simple * will match all minions.) You can even reach further into the minion’s data and target based on the operating system, or the number of CPUs, or any custom metadata you set.
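To make target expansion concrete, here is a small Python model of how a target and a target type could be turned into a list of minion IDs. It illustrates the idea only and is not Salt's implementation; the inventory, the grain values, and the function name expand_target are all hypothetical.

```python
from fnmatch import fnmatch

# Hypothetical inventory: minion ID -> grains (metadata reported by the minion).
MINIONS = {
    "web1": {"os": "CentOS", "num_cpus": 4},
    "web2": {"os": "Ubuntu", "num_cpus": 8},
    "db1": {"os": "CentOS", "num_cpus": 24},
}


def expand_target(target, target_type="glob"):
    """Return the sorted minion IDs selected by a target expression."""
    if target_type == "glob":
        matched = [m for m in MINIONS if fnmatch(m, target)]
    elif target_type == "list":
        matched = [m for m in target if m in MINIONS]
    elif target_type == "grain":
        # A grain target is written as "key:value", e.g. "os:CentOS".
        key, _, value = target.partition(":")
        matched = [m for m, g in MINIONS.items() if str(g.get(key)) == value]
    else:
        raise ValueError("unsupported target type: " + target_type)
    return sorted(matched)


print(expand_target("*"))
print(expand_target("os:CentOS", "grain"))
```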
The basic design is a very simple client/server model. Salt runs as a daemon, or background, process. This is true for both the master and the minion. When the master process comes up, it provides a location (socket) where minions can “bind” and watch for commands. The minion is configured with the location—that is, the domain name system (DNS) or IP address—of the master. When the minion daemon starts, it connects to that master socket and listens for events. As previously mentioned, each minion has an ID. This ID must be unique so that the master can exchange data with only that minion, if desired. This ID is usually the hostname, but can be configured as something else. Once the minion connects to the master, there is an initial “handshake” process where the master needs to confirm that the minion matches the ID it is advertising.
In the default case, this means you will need to manually confirm the minion. Once the minion ID is established, the master and minion can communicate along a ZeroMQ data bus. When the master sends out a command to ZeroMQ, it is said to “publish” events, and when the minions are listening to the data bus, they are said to “subscribe” to, or listen for, those events; hence the descriptor pub-sub.
When the master publishes a command, it simply puts it on the ZeroMQ bus for all of the minions to see. Each minion will then look at the command and the target (and the target type) to determine if it should run that command. If the minion determines that it does not match the combination of target and target type, then it will simply ignore that command. When the master sends a command out to the minions, it relies on the minions being able to identify the command via the target.
We have glossed over some details here. While the minions do listen on the data bus to match their ID and associated data (e.g., grains) to the target (using the target type to determine which set of data to use), the master will verify that the target given does or does not match any minions. In the case of a match against the name (e.g., glob, regex, or simple list), the master will match the target with a list of IDs it has (via salt-key). But, for grain matching, the master will look at a local cache of the grains to determine if any minions match. This is similar for pillar matching and for matching by IP address. All of that logic allows the master to build a list of minions that should respond to a given command. The master can then compare that list against the list of minions that return data and thus identify which minions did not respond in time. Also, the master can determine that no minions match the criteria given (target combined with target type) and thus not send any command.
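The bookkeeping described above comes down to comparing two sets. The sketch below shows it in Python; publish_plan is a made-up name for this example and the minion IDs are hypothetical, so treat it as a picture of the logic rather than Salt's code.

```python
def publish_plan(expected, returned):
    """Decide whether to publish and which targeted minions stayed silent.

    expected: minion IDs the master believes match the target.
    returned: minion IDs that actually sent data back.
    """
    if not expected:
        # Nothing matches the target, so no command is sent at all.
        return {"published": False, "missing": []}
    missing = sorted(set(expected) - set(returned))
    return {"published": True, "missing": missing}


print(publish_plan(["web1", "web2", "db1"], ["web1", "db1"]))
print(publish_plan([], []))
```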
This is only half of the communication. Once a minion has decided to execute the given command, it will return data to the master (see Figure 1-1). The first part of the communication, where the minions are listening for commands, is called publish and subscribe, or pub-sub. The minions all connect to a single port on the master to listen for these commands. But there is a second port on the master to which all minions send back their data. This includes whether the command succeeded or not, and a variety of other data.
Communication Overview
Figure 1-1. Communication between master and minions
This remote execution framework provides the basic toolset upon which other functionality is built. The most notable example is salt states. States are a way to manage configurations across all of your minions. A salt state defines how you want a given host to be configured. For example, you might want a list of packages installed on a specific type of machine—for example, all web servers. Or maybe you want to have a number of users added on a shared development server. The state has those requirements enumerated, normally using YAML. Once you have the configuration defined, you give the state system the minions for which you want that particular configuration applied. The minions are defined through the same flexible targeting system mentioned earlier. A salt state gives you a very flexible way to define a “template” for setting up a given host.
There are two basic schools of thought on configuration management. In imperative management, you explicitly give Salt an ordered list of actions to perform. In declarative management, on the other hand, you merely give the desired end state and allow the system to figure out how best to enforce it.
The proponents of the declarative model argue that it simplifies the configuration and thus makes it easier to understand. Obviously, you will need to trust that the system is handling edge cases in the manner you expect. Also, if there are problems, you need to rely on the system to provide you with sufficient information to diagnose the root cause.
However, the imperative model is more natural to a programmer accustomed to providing a list of commands. The downside of it is that you must list all of the corner cases and how you want them handled.
A detailed discussion on this topic is beyond the scope of this book. However, Salt does provide the option to use either model. As you will see when we discuss how to extend Salt, you can run commands explicitly using the remote execution environment (imperative) or you can specify your desired end state using the state system (declarative).
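The difference between the two models can be shown with a toy package manager in Python. Nothing here calls Salt; the package names and both function names are invented for the example. The imperative version runs a fixed list of steps, while the declarative version compares the current state with the desired one and derives only the actions that are still needed.

```python
def imperative(installed):
    """Run an explicit, ordered list of actions against a set of packages."""
    steps = [("install", "httpd"), ("install", "mod_ssl"), ("remove", "telnet")]
    for action, pkg in steps:
        if action == "install":
            installed.add(pkg)
        else:
            installed.discard(pkg)
    return installed


def declarative(installed, present, absent):
    """Derive the actions needed to reach the desired end state."""
    actions = [("install", p) for p in sorted(present - installed)]
    actions += [("remove", p) for p in sorted(absent & installed)]
    return actions


current = {"httpd", "telnet"}
print(declarative(current, present={"httpd", "mod_ssl"}, absent={"telnet"}))
```

When the host already matches the desired state, declarative returns an empty list, which is the property that makes repeated runs safe.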
The last important architectural cornerstone of Salt is that all of the communication is done via a secure, encrypted channel. Earlier, we briefly mentioned that when a minion first connects to the master, there is a process whereby the minion is validated. The default process is that you must view the list and manually accept the known minions. Once the minion is validated, the minion and master exchange encryption keys. The encryption uses the industry-standard AES specification. The master will store the public key of every minion. It is therefore critical that you maintain tight security control on your master. Once the trust relationship is established, any communication between the master and all minions is secure. However, this security is dependent on that initial setup of trust and on the sustained security of the master. The minions, on the other hand, do not have any global secrets. If a minion is compromised, it will be able to watch the ZeroMQ data bus and see commands sent out to the minions. But that is all it will be able to do. The net result is that all data sent between the master and its minions remains secure. But while the communication channel is kept secure, you still need to maintain a tight security profile on your master.

Some Quick Examples

Let’s run through a couple of quick examples so you can see what Salt can do.

System Management

A common use case for a remote execution framework is to install packages. With Salt, a single command can be used to install (or upgrade) packages across your entire infrastructure. With its powerful targeting syntax, you can install a package on all hosts, or only on CentOS 5.2 hosts, or maybe only on hosts with 24 CPUs.
Here is a simple example:
salt '*' pkg.install apache
This installs the Apache package on every host (*). If you want to target a list of minions based on information about the host (e.g., the operating system or some hardware attribute), you do so by using some data that the master keeps about each minion. This data coming from the minion (e.g., operating system) is called grains. But there is another type of data: pillar data. While grains are advertised by the minion back to the master, pillar data is stored on the master and is made available to each minion individually; that is, a minion cannot see any pillar data but its own. It is common for people new to Salt to ask about grains versus pillar data, so we will discuss them further. For the moment, you can think of grains as metadata about the host (e.g., number of CPUs), while pillar is data the host needs (e.g., a database password). In other words, a minion tells the master what its grains are, while the minion asks the master for its pillar data. For now, just know that you can use either to define the target for a command.
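As a rough illustration of targeting by something other than the minion ID, the Python sketch below builds salt command lines for a few target types. The -L, -G, and -I options are the salt CLI flags for list, grain, and pillar targeting. The helper salt_command and the grain value os:CentOS are assumptions made for this example, and the sketch only assembles the command without running it.

```python
import shlex

# Target type -> salt CLI flag (glob is the default and needs no flag).
TARGET_FLAGS = {"glob": None, "list": "-L", "grain": "-G", "pillar": "-I"}


def salt_command(target, function, *args, target_type="glob"):
    """Build the argument list for a salt invocation."""
    argv = ["salt"]
    flag = TARGET_FLAGS[target_type]
    if flag:
        argv.append(flag)
    argv += [target, function, *args]
    return argv


print(shlex.join(salt_command("*", "pkg.install", "apache")))
print(shlex.join(salt_command("os:CentOS", "pkg.install", "apache",
                              target_type="grain")))
```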

Configuration Management

The central master can distribute files that describe how a system should be configured. As we’ve discussed, these descriptions are called states, and they are stored in simple YAML files called SLS (salt states). A state to manage the main index file for Apache might look like the following:
webserver_main_index_file:
  file.managed:
    - name: /var/www/index.html
    - source: salt://webserver/main.html
The first line is simply a unique identifier. Next comes the command to enforce. The description (i.e., state) says that a file is managed by Salt. The source of the file is on the Salt master in the location given. (Salt comes with a very lightweight file server that can manage files it needs—for example, configuration files, Java WAR files, or Windows installers.) The next two lines describe where the file should end up on the minion (/var/www/index.html), and where on the master to find the file (…/webserver/main.html). (The path for the source of the file is relative to the file root for the master. That will be explained later, but just know that the source is not an absolute file path, while the destination is an absolute path.)
The file server is a mechanism for Salt to send files out to the minions. Larger files will be broken up into chunks to be more easily sent over the encrypted communication channel. This makes the file server very handy. But keep in mind that Salt’s file server is not meant to be a generic file server like NFS or CIFS.
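The chunking idea can be sketched in a few lines of Python. This is a generic illustration, not Salt's file server code, and the chunk size of 4 bytes is chosen only to keep the demonstration small.

```python
import io


def iter_chunks(stream, chunk_size):
    """Yield successive chunks of at most chunk_size bytes from a binary stream."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# An in-memory stream stands in for a large file on the master.
payload = io.BytesIO(b"abcdefghij")
print(list(iter_chunks(payload, 4)))
```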
Salt comes with a number of built-in state modules to help create the descriptions that define how an entire host should be configured. The file state module is just a simple introduction. You can also define the users that should be present, the services (applications) that should be running, and which packages should be installed. Not only is there a wealth of state modules built in to Salt, but you can also write your own.
You may have noticed that when we installed the package using the execution module directly, we gave a target host: every host (*). But when we showed the state, there was no target minion given. In the state system, there is high-level abstraction that specifies which host should have which states. This is called the top file. While the states give a recipe for how a host should look, the top file says which hosts should have which recipes.
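The role of the top file can be modeled as a mapping from targets to state names. The Python sketch below shows that idea only: a real top file is written in YAML, and the patterns, state names, and function states_for here are hypothetical.

```python
from fnmatch import fnmatch

# Hypothetical mapping in the spirit of a top file: target glob -> state names.
TOP = {
    "*": ["common"],
    "web*": ["webserver"],
    "db*": ["database"],
}


def states_for(minion_id):
    """Collect every state whose target pattern matches the minion ID."""
    states = []
    for pattern, names in TOP.items():
        if fnmatch(minion_id, pattern):
            states.extend(names)
    return states


print(states_for("web1"))
print(states_for("cache1"))
```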

A Brief History

Like many projects and ideas, Salt was born out of necessity. I (Tom) had created a couple of in-house remote execution incarnations over the years. But I found that these and the other open sourced options didn’t quite have the power I was looking for. I then decided to base a new system on the fast ZeroMQ messaging layer. As I began adding more and more functionality, the state system just naturally appeared. Then, as the community grew, more and more functionality was added. But the core remote execution framework remained extensible.
There is a lot of speculation over why we chose the name Salt. SaltStack is based out of Salt Lake City, so that is a popular theory. But the name of the framework is not related to the city of its birth. When looking for a name for the project, I was watching the Lord of the Rings and the topic of “salted pork” came up. Then it hit me: salt makes everything better. Thus the name Salt—because it makes system management better.

Topology Options

Thus far, we have discussed Salt only as a single master with a number of connected minions. However, this is not the only option. You can divide up your minions and have them talk to an intermediate host called a syndication master. An example use case is when you have clusters of hosts that are geographically dispersed. You may have high-latency links between the clusters, but each cluster has a fast network locally. For example, you have a bunch of hosts in New York, another large cluster in Sydney, maybe another grouping in London, and, finally, all of your development in San Francisco. A syndication master will act as a proxy for the master.
You may even decide that you only want to use Salt’s execution modules and states without any master at all. A masterless minion setup is briefly discussed in “Masterless Minions”.
Lastly, you may want to allow some users to harness the power Salt provides without giving them access directly to the main master. The peer publisher system allows you to give special access to some minions. This could allow you to let developers run deployment commands without giving them access to the entire set of tools that Salt provides.
The various topologies mentioned here are not necessarily mutually exclusive. You can use them individually, or even mix and match them. For example, you could have the majority of your infrastructure managed using the standard master–minion topology, but then have your more security-sensitive host managed via a masterless setup. Salt’s basic usage and core functionality remain the same; only the implementation details differ.

Extending Salt

Out of the box, Salt is extremely powerful and comes with a number of modules to help you administer a variety of operating systems. However, no matter how powerful the system is or how complete it attempts to be, it cannot be all things to all people. As a result, Salt’s extensibility underpins the entire system. You can dynamically generate the data in the configuration files using a templating engine (e.g., Jinja or Mako), a DSL, or just straight code. Or you can write your own custom execution modules using Python. Salt provides a number of libraries and data structures, which allow custom modules to peer into the core of the Salt system to extract data or even run other modules. Once you have the concept of extending using modules, you can then write your own states to enforce whatever logic you see fit.
As powerful as custom modules or custom states may be, they are only the beginning of what you can change. As previously mentioned, the format of the state files is YAML. But you can add your own renderer to convert any data file into a data structure that Salt can handle. Even the data about a host (i.e., grains and pillar) can be altered and customized.
All of these customizations do not live in their own sandbox. They are available to the rest of Salt, and the rest of Salt is available to them. Thus, you can write your own custom execution module and call it using the state system. Or you can write your own state that uses only the modules that ship with Salt.
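To give a feel for how small a custom execution module can be, here is a minimal sketch. It assumes the usual convention of placing the file in a _modules directory under the master's file root and syncing it to the minions (for example with saltutil.sync_modules). The module name greeting and the function hello are made up for this example.

```python
# Hypothetical file: _modules/greeting.py under the master's file root.


def hello(name="world"):
    """Return a greeting; callable as greeting.hello once the module is synced."""
    return "Hello, {0}!".format(name)


# Outside of Salt the function is ordinary Python and can be called directly.
print(hello("minion"))
```

Once synced, it could be invoked like any built-in function, along the lines of salt '*' greeting.hello.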
All of this makes Salt very powerful and a bit overwhelming. This book is here to guide you through the basics and give some very simple examples of what Salt can do. Just to sweeten the pot, Salt has a very active community that is here to help you when you run into obstacles.
There are a lot of terms and commands that are specific to Salt. Rather than discuss them at an abstract level, let’s dive right in and run some very simple commands. In order to proceed, you will need several minions set up and configured to communicate with your Salt master. In this book’s companion code, there is a Vagrant file you can use to quickly set up five hosts: a single master and four minions.  If you already have your hosts set up and Salt installed, you can skip ahead to “Starting Up”.
If you used the companion code’s Vagrant configuration, the Salt daemons are already started. However, you should still read “Starting Up” to become familiar with the process.

Single-Master Setup

The most straightforward and common use of Salt is to have a few minions attached to a single master. We will set up a single master and then configure a couple of minions to talk to that master.
But first we need to figure out how to install Salt. Both minion and master share a great deal of code. We will install all of the core libraries on all hosts: minions and our single master. There are some command-line utilities that make sense only to run on the master (e.g., salt-run and salt). If they are installed on a host without the correct master configuration, they will report an error and do nothing. It is not harmful to have the master-specific utilities installed on the minions, but it can lead to confusion. Thus, we will have slightly different installs for the master and the minions to help prevent any confusion. But all hosts will have the core libraries installed. Then the master will have some additional code (mostly the command-line interface, or CLIs) installed.
The examples in this book use both Ubuntu and CentOS minions. As a result, we are concerned only with the RPM and apt packages. However, Salt supports a wide variety of platforms, and this list is ever-changing. Therefore, we recommend that you check the Salt documentation for how to install on your specific platform.
Again, there are instructions for how to start up a master and four additional minions using Vagrant. These instructions use the book’s utilities located on GitHub.

From Packages

It is assumed that you have some basic familiarity with the package system on your particular operating system. But, to just give you a quick flavor of what is involved, here are the basic instructions for installing using yum (and RPM).
First, you need to verify that the packages are available via the repositories you have configured:
[vagrant@master ~]$ sudo yum list salt salt-master salt-minion
Available Packages
salt.noarch                   2014.7.0-3.el6                      epel
salt-master.noarch            2014.7.0-3.el6                      epel
salt-minion.noarch            2014.7.0-3.el6                      epel
You need to install the base Salt package and the minion on every host:
# yum install -y salt salt-minion
Also, on the host you designate as the master, you will need the salt-master package as well:
# yum install -y salt-master
Note that the packages are not in the main CentOS repositories. But they are available via EPEL (Extra Packages for Enterprise Linux). You can install the EPEL repositories very easily with:
sudo rpm -Uvh \
Again, this is merely a quick example of how to install on an RPM-based operating system. The installation instructions on Salt’s documentation pages will provide details for other supported operating systems.

Bootstrap Script

Since the installation is so varied across so many different platforms, a simplified installation script was created. It lives at a simple URL: http://bootstrap.saltstack.com. This URL will provide a bash script that supports the installation of Salt on a couple of dozen different UNIX-like variants. You can find the current list of supported operating systems on the Salt bootstrap page.
This bootstrap script is meant to make it easy to install Salt, but it is not the most secure method for installation. Therefore, it is not recommended for production environments. It is simply an easy way for you to get Salt installed so you can start learning. Once you are familiar with Salt, you should be able to install it using the package manager of your choosing or even directly from the source on GitHub if you desire.

Starting Up

There are two main daemons:
  • /usr/bin/salt-minion
  • /usr/bin/salt-master
Before we start up the minions, we need to make sure they can communicate with the master. The default setting is to use a host named simply salt. The basic minion configuration is located in /etc/salt/minion. Like so much of Salt, it is a YAML-formatted file. You need to configure each minion with the DNS name or IP address of your Salt master:
[vagrant@master ~]$ sudo grep '#master:' /etc/salt/minion
#master: salt
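In the output above the master line is commented out, so the minion falls back to the default host name salt. The Python sketch below mimics that lookup. It is a simplification for illustration: the helper configured_master is made up, the address 192.168.1.10 is a placeholder, and a real minion parses the whole file as YAML, where master may also be a list.

```python
def configured_master(config_text, default="salt"):
    """Return the master named by an uncommented top-level 'master:' line."""
    for line in config_text.splitlines():
        if line.startswith("master:"):
            value = line.split(":", 1)[1].strip()
            if value:
                return value
    return default


print(configured_master("#master: salt\n"))                        # commented out
print(configured_master("#master: salt\nmaster: 192.168.1.10\n"))  # explicit value
```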

VMware ESXi down/red/disconnected alerts/issues

Troubleshooting common VMware ESX host server problems

VMware ESX/ESXi Monitoring

Gain in-depth insights on CPU, memory, disk, datastore, and network of your ESX/ESXi servers. Receive instant alerts when the server is down or when thresholds are exceeded.

Add a VMware ESX/ESXi Monitor

    1. Go to the Admin tab and select VMware ESX/ESXi Server (under Virtualization) in Add Monitor page.
    2. Specify the following information to add the VMware ESX/ESXi Server monitor:
      • Display Name: Provide a display name to identify the VMware ESX/ESXi Server monitor.
      • ESX/ESXi host: Specify the IP address or Domain Name for the ESX/ESXi host.
      • Port: Specify the designated port number of the managed host.
      • Username: Enter the user account for the system.
      • Password: Enter the credentials associated with your user account.
      • Monitoring locations: Select the location profile from the drop-down list from where the ESX/ESXi can be monitored.
      • Monitor Groups: Associate the ESX/ESXi monitor with a monitor group. Choose an existing monitor group or create a new one. To create a new monitor group, read our product documentation.
      • Dependent on monitor: From the drop-down, select the dependent monitor you want. Site24x7 will suppress alerts for the configured ESX/ESXi monitor if the dependent monitor is already down.
    3. Toggle yes or no to Discover and Auto-add resources. By enabling this, the resources running on this ESX/ESXi host will be discovered automatically and added as individual monitors for monitoring.
      • Virtual machines: Toggle Yes to enable monitoring virtual machines.
      • Resource pools: Toggle Yes to enable monitoring resource pools.
      • Datastores: Toggle Yes to enable monitoring datastores.
    4. Specify the following details for Configuration Profiles:
      • Threshold and Availability: Select a threshold profile from the drop-down list or choose the default threshold set available and get notified when the resources cross the configured threshold and availability. To create a customized threshold and availability profile, refer to Threshold and Availability.
      • Notification Profile: Choose a notification profile from the drop-down list or select the default profile available. The notification profile configures when and who gets notified in case of downtime.
        Refer to Notification Profile to create a customized notification profile.
      • User Alert Group: Select the user group that needs to be alerted during an outage.
        To add multiple users to a group, see User Groups.
      • Tags: Associate your monitor with predefined Tag(s) to help organize and manage your monitors creatively. Learn how to add Tags.
      • IT Automation: Select an automation to be executed when the VMware ESX/ESXi server is down / trouble / up / any status change / any attribute change. The defined action gets executed when there is a state change and selected user groups are alerted. To automate corrective actions on failure
      • Third Party Integration: Associate your monitor with a pre-configured third-party service. It lets you push your monitor alarms to selected services and facilitate improved incident management.
    5. Click Save.
Get a grip on potential VMware ESX host server problems including the purple screen of death, a frozen service console, and rebuilding your network configurations after they've been lost.

Panicking at the onset of a high-impact technical problem can lead to impulsive decisions that make the problem worse. Before trying to troubleshoot any problem, pause and relax so you can approach the task with a clear mind, then address each symptom, possible cause and resolution appropriately.

In this series, I offer solutions for many common problems that arise with VMware ESX host servers, VirtualCenter, and virtual machines in general. Let's begin by addressing common issues with VMware ESX host servers.
Windows server administrators have long been familiar with the dreaded Blue Screen of Death (BSOD), which signifies a complete halt by the server. VMware ESX has a similar state called the purple screen of death (PSOD) which is typically caused by hardware problems or a bug in the VMware code.

Troubleshooting a purple screen of death

When a PSOD occurs, the first thing you want to do is note the information displayed on the screen. I suggest using a digital camera or cell phone to take a quick photo. The PSOD message consists of the ESX version and build, the exception type, register dump, what was running on each CPU at the time of the crash, back-trace, server up-time, error messages and memory core dump info. The information won't be useful to you, but VMware support can decipher it and help determine the cause of the crash.
Unfortunately, other than recording the information on the screen, your only option when experiencing a PSOD is to power the server off and back on. Once the server reboots, you should find a vmkernel-zdump-* file in your server's /root directory. This file will be valuable for determining the cause. You can use the vmkdump utility to extract the vmkernel log file from the dump file (vmkdump -l followed by the dump file name) and examine it for clues as to what caused the PSOD. VMware support will usually want this file as well. One common cause of PSODs is defective server memory; the dump file will help identify which memory module caused the problem so it can be replaced.

Checking your RAM for errors

If you suspect your system's RAM may be at fault, you can use a built-in utility to check your RAM in the background without affecting your running virtual machines. The RAM check utility runs in the VMkernel space and can be started by logging in to the Service Console and typing service ramcheck start.
While the RAM check is running, it logs all activity and any errors to the /var/log/vmware directory in files called ramcheck.log and ramcheck-err.log. One drawback, however, is that it's hard to test all of your RAM with this utility if you have virtual machines (VMs) running, as it will only test unused RAM in the ESX system. A more thorough method of testing your server's RAM is to shut down ESX, boot from a CD, and run Memtest86+.

Using the vm-support utility

If you contact VMware support, they will usually ask you to run the vm-support utility that packages all of the ESX server log and configuration files into a single file. To run this utility, simply log in to the service console with root access, and type "vm-support" without any options. The utility will run and create a single Tar file that will be named "esx- -- 
Alternatively, you can generate the same file by using the VMware Infrastructure Client (VI Client). Select Administration, then Export Diagnostic Data, and select your host (VirtualCenter data optional) and a directory on your local PC to store the file that will be created.

Using log files for troubleshooting

Log files are generally your best tool for troubleshooting any type of problem. ESX has many log files. Which ones you should check depends on the problem you are experiencing. Below is the list of ESX log files that you will commonly use to troubleshoot ESX server problems. The VMkernel and hosted log files are usually the logs you will want to check first.
  • VMkernel - /var/log/vmkernel – Records activities related to the virtual machines and ESX server. Rotated with a numeric extension, current log has no extension, most recent has a ".1" extension.
  • VMkernel Warnings - /var/log/vmkwarning – Records activities with the virtual machines, a subset of the VMkernel log and uses the same rotation scheme.
  • VMkernel Summary - /var/log/vmksummary - Used to determine uptime and availability statistics for ESX Server; readable summary found in /var/log/vmksummary.txt.
  • ESX Server host agent log - /var/log/vmware/hostd.log - Contains information on the agent that manages and configures the ESX Server host and its virtual machines. (Search the file date/time stamps to find the log file it is currently outputting to, or open hostd.log, which is linked to the current log file.)
  • ESX Firewall log - /var/log/vmware/esxcfg-firewall.log – Logs all firewall rule events.
  • ESX Update log - /var/log/vmware/esxupdate.log – Logs all updates done through the esxupdate tool.
  • Service Console - /var/log/messages - Contains all general log messages used to troubleshoot virtual machines or ESX Server.
  • Web Access - /var/log/vmware/webAccess - Records information on web-based access to ESX Server.
  • Authentication log - /var/log/secure - Contains records of connections that require authentication, such as VMware daemons and actions initiated by the xinetd daemon.
  • Vpxa log - /var/log/vmware/vpx - Contains information on the agent that communicates with VirtualCenter. Search the file date/time stamps to find the log file it is currently outputting to, or open vpxa.log, which is linked to the current log file.
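The rotation scheme described for the VMkernel log (current log with no extension, most recent rotated copy with ".1") can be sketched in Python as a small ordering helper. This is only an illustration: `rotation_order` is a hypothetical name, and it assumes higher numeric extensions always mean older files.

```python
from typing import Iterable, List

def rotation_order(names: Iterable[str], base: str = "vmkernel") -> List[str]:
    """Return the rotated copies of `base`, newest first.

    `base` itself is the current log; `base.1` is the most recent
    rotated copy, `base.2` the one before it, and so on.
    """
    prefix = base + "."

    def age(name: str) -> int:
        # The current log has no extension and sorts first.
        return 0 if name == base else int(name[len(prefix):])

    # Keep only the base file and files with a purely numeric extension.
    candidates = [
        n for n in names
        if n == base or (n.startswith(prefix) and n[len(prefix):].isdigit())
    ]
    return sorted(candidates, key=age)
```

Feeding it the output of a directory listing of /var/log gives you the order in which to read the files when you are working backwards from a crash.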
As part of the troubleshooting process, you'll often need to find out the version of various ESX components and which patches are applied. Below are some commands you can run from the service console to do this:
  • Type vmware -v to check the ESX Server version, e.g., VMware ESX Server 3.0.1 build-32039.
  • Type esxupdate -l query to see which patches are installed.
  • Type vpxa -v to check the ESX Server management version, e.g., VMware VirtualCenter Agent Daemon 2.0.1 build-40644.
  • Type rpm -qa | grep VMware-esx-tools to check the installed ESX Server VMware Tools version, e.g., VMware-esx-tools-3.0.1-32039.
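If you collect these version strings from several hosts, a short Python sketch like the one below can split a banner into product, version and build number for comparison. The names `BuildInfo` and `parse_banner` are hypothetical, and the pattern only assumes the "name, dotted version, build-N" shape of the two example banners above.

```python
import re
from typing import NamedTuple, Optional

class BuildInfo(NamedTuple):
    product: str
    version: str
    build: int

# "<product> <dotted version> build-<number>", as in the examples above.
_BANNER = re.compile(
    r"^(?P<product>.+?)\s+(?P<version>\d+(?:\.\d+)+)\s+build-(?P<build>\d+)$"
)

def parse_banner(line: str) -> Optional[BuildInfo]:
    """Parse a version banner; return None if the line has another shape."""
    m = _BANNER.match(line.strip())
    if m is None:
        return None
    return BuildInfo(m["product"], m["version"], int(m["build"]))
```

Comparing the `build` field across hosts is a quick way to spot a host that missed a patch.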

If all else fails, restart the VMware host agent service

Many ESX problems can be resolved by simply restarting the VMware host agent service (vmware-hostd), which is responsible for managing most of the operations on the ESX host. To do this, log into the service console and type service mgmt-vmware restart.
NOTE: ESX 3.0.1 contained a bug that would restart all your VMs if your ESX server was configured to use auto-startups for your VMs. This bug was fixed in a patch for 3.0.1 and also in 3.0.2, but appeared again in ESX 3.5 with another patch released to fix it. It's best to temporarily disable auto-startups before you run this command.
In some cases restarting the vmware-vpxa service when you restart the host agent will fix problems that occur between ESX and both the VI Client and VirtualCenter. This service is the management agent that handles all communication between ESX and its clients. To restart it, log into the ESX host and type service vmware-vpxa restart. It is important to note that restarting either of these services will not impact the operation of your virtual machines (with the exception of the bug noted above).

Fixing a frozen service console

Another problem that can occur is that your Service Console hangs and does not allow you to log in locally. This can be caused by hardware lock-ups or a deadlocked condition. Your VMs may continue to operate normally when this occurs, but rebooting ESX is usually the only way to recover. Before you do that, however, try shutting down your guest VMs and/or using VMotion to migrate them to another ESX host. To do this, use the VI Client, connect remotely via SSH, or use one of the alternate/emergency consoles, which you can access by pressing Alt-F2 through Alt-F6. You can also press Alt-F12 to display VMkernel messages on the console screen.
If you are able to shut down or move your VMs, you can then try rebooting the server by issuing the reboot command through the VI Client or the alternate consoles. If not, cold-booting the server is your only option.

Lost network configurations

Another problem that can occur is that you lose part or all of your networking configuration. If this happens, you must rebuild your network by using the ESX local service console, since you will be unable to connect using the VI Client. VMware has published knowledgebase articles that detail how to rebuild your networking using the esxcfg-* service console commands and also how to verify your network settings.


In this tip, I have addressed a few of the most common problems that can occur with VMware ESX. In the next installment of this series, I will cover troubleshooting VirtualCenter issues.
Check the following links for solutions to other possible ESX problems:
ESX/ESXi hosts do not respond and is grayed out (1019082)


  • ESX/ESXi host is not responding to vCenter Server
  • All virtual machines that are registered to the ESX or ESXi host are grayed out.
  • You are unable to connect to the ESX or ESXi host directly using vSphere Client.
  • The vpxd.log files residing in vCenter Server may contain events indicating an error when attempting to communicate with the ESXi host. The events always contain the words vmodl.fault.HostCommunication and may appear similar to the following examples:
    [VpxLRO] -- ERROR task-internal-6433833 -- host-24499 -- vim.host.NetworkSystem.queryNetworkHint: vmodl.fault.HostCommunication:
    (vmodl.fault.HostCommunication) {
    dynamicType = <unset>,
    faultCause = (vmodl.MethodFault) null,
    msg = "",
    [VpxdMoHost::CollectRemote] Stats collection cannot proceed because host may no longer be available or reachable: vmodl.fault.HostCommunication.
    For more information on the location of the vpxd.log file
    The issue may appear on multiple hosts, so take note of the opID that identifies the offending ESX/ESXi host:
    2012-04-09T15:03:51.540-04:00 [29348 verbose 'Default' opID=f6a80d55] [ServerAccess] Attempting to connect to service at vc1.hostname.vmware.net:10443
    For more information about this type of failure
  • If this issue occurs due to a communication issue between the ESXi host and the vCenter Server, but the host is still responsive to user interaction, you may see events similar to these in the /var/log/vmware/vpxa.log files:
    Failed to bind heartbeat socket (-1). Using any IP.
    Agent can't send heartbeats.msg size: 66, sendto() returned: Network is unreachable.
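When the fault shows up on several hosts, it can help to count which host IDs appear next to vmodl.fault.HostCommunication in vpxd.log. The Python below is a rough sketch under the assumption that the host is identified by a `host-NNNN` token on the same line, as in the first example above; `hosts_with_comm_faults` is a hypothetical helper, not a VMware tool.

```python
import re
from collections import Counter
from typing import Iterable

FAULT = "vmodl.fault.HostCommunication"
HOST_ID = re.compile(r"\bhost-\d+\b")

def hosts_with_comm_faults(lines: Iterable[str]) -> Counter:
    """Count HostCommunication faults per host ID found on the same line."""
    counts: Counter = Counter()
    for line in lines:
        if FAULT not in line:
            continue
        # Lines that name the fault but no host ID contribute nothing.
        for host in HOST_ID.findall(line):
            counts[host] += 1
    return counts
```

You could call it with an open file object for vpxd.log and then look at `counts.most_common()` to see which hosts are affected most often.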


This article provides troubleshooting steps to determine why an ESX/ESXi host is inaccessible from vCenter Server or vSphere Client.


To determine why an ESX/ESXi host is inaccessible:
  1. Verify the current state of the ESX/ESXi host hardware and power. Physically go to the VMware ESX/ESXi host hardware, and make note of any lights on the face of the server hardware that may indicate the power or hardware status. For more information regarding the hardware lights, consult the hardware vendor documentation or support.
    Note: Depending on the configuration of your physical environment, you may consider connecting to the physical host by using a remote hardware interface provided by your hardware vendor. For more information about how this interface interprets the condition of the hardware, consult the hardware vendor documentation or support.
    • If the hardware lights indicate that there is a hardware issue, consult the hardware vendor documentation or support to identify any existing hardware issues.
    • If the hardware is currently turned off, turn on the hardware and see Determining why a ESX/ESXi host was powered off or restarted .
  2. Determine the state of the user interface of the ESX host in the physical console.
    Note: Depending on the configuration of your physical environment, you may consider connecting to the physical host by using a remote application such as a Keyboard/Video/Mouse switch or a remote hardware interface provided by your hardware vendor. These interfaces are known to interfere with keyboard and mouse functionality. VMware recommends verifying the responsiveness at the local physical console prior to taking any action.
    • If the user interface does not respond to user interaction, see Determining why an ESX/ESXi host does not respond to user interaction at the console .
    • If the user interface displays a purple diagnostic screen, see Interpreting an ESX/ESXi host purple diagnostic screen.
  3. Verify that DNS is configured correctly on the ESX/ESXi host. For more information, see:
    • Configuring VMware ESXi Management Network from the direct console
    • Clearing the DNS cache in VMware ESXi host
    • Reconnecting or adding an VMware ESXi host to VMware vCenter Server fails with the error: Agent can't send heartbeats because socket address structure initialization is failing 
    • Identifying issues with and setting up name resolution on ESX/ESXi Server
  4. Determine if the ESX host responds to ping requests. For more information, see Testing network connectivity with the Ping command. If you are using ESXi, there are several menu options provided to test the management network. If the ESX host responds to user interaction, but does not respond to pings, you may have a networking issue.
  5. Verify that you can connect to the VMware ESX/ESXi host using vSphere Client:
    1. Open the vSphere Client.
    2. Specify the hostname or IP address of the VMware ESX/ESXi host, along with the appropriate credentials for the root user.
    3. Click Login.
      • If you receive an error indicating that a connection failure occurred, it may indicate that the agents responsible for facilitating the vSphere API are not functioning.
      • If you are able to connect to the VMware ESX/ESXi host using the vSphere Client, but it continues to show as unresponsive from vCenter Server, verify that the correct Managed IP Address is set in vCenter Server.
  6. Determine if the ESX/ESXi host was rebooted.
    1. Physically log in to the console of the VMware ESX/ESXi host.
      • If you are using ESX, log in to the service console as root.
      • If you are using ESXi 4.0 and below, log in to Emergency Tech Support mode.
      • If you are using ESXi 4.1 and later, log in to Tech Support mode. For more information
    2. Type the command uptime to view the uptime of the VMware ESX/ESXi host. If the VMware ESX/ESXi host is recently rebooted

Related Information

High Availability
The High Availability (HA) feature uses a different trigger than vCenter Server when checking that an ESX or ESXi host is operational. The following is a brief explanation of each criterion:
  • The Host connection and power state alarm is triggered as a result of a HostCommunication fault. A HostCommunication fault occurs if vCenter Server is unable to communicate to an ESX or ESXi host using the vSphere API.
  • The HA isolation response is triggered as a result of an agent on the ESX or ESXi host that is unable to communicate with agents on other ESX or ESXi hosts (not the vCenter server). It must also fail to communicate with a designated isolation address (by default, it is the default gateway). If both of these conditions are met, the host performs the designated HA isolation response.
Both systems are managed by different agents and may communicate with different hosts on the network. Therefore, with respect to the relationship:
  • A host that is Not Responding within vCenter Server does not always trigger a high availability isolation response. It may still be maintaining a network connection with other hosts or its isolation address and thus is not isolated.
  • A host experiencing an HA isolation response is likely to appear as Not Responding within vCenter Server.
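The difference between the two triggers is easier to see as two independent checks. The Python below is only a model of the rules stated above, with hypothetical names (`HostReachability`, `not_responding_alarm`, `ha_isolation_response`); it does not talk to vCenter or to any HA agent.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HostReachability:
    vcenter_api_ok: bool        # vCenter Server reaches the host via the vSphere API
    peer_agents_ok: bool        # HA agent reaches agents on other hosts
    isolation_address_ok: bool  # HA agent reaches the isolation address

def not_responding_alarm(h: HostReachability) -> bool:
    """vCenter side: a HostCommunication fault raises the alarm."""
    return not h.vcenter_api_ok

def ha_isolation_response(h: HostReachability) -> bool:
    """HA side: both the peers and the isolation address must be unreachable."""
    return (not h.peer_agents_ok) and (not h.isolation_address_ok)
```

A host with `vcenter_api_ok=False` but a reachable isolation address raises the alarm without being isolated, which is the first case in the list above.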

Condition 1

  1. The ESXi host is in a Not Responding state and the VMs residing on it show as orphaned or disconnected.
  2. In this case, start with basic checks of the datastore connection and the free space available on the datastore.
  3. Connect to the console of the affected ESXi host through iDRAC/iLO and restart the management agents. Most of the time this resolves the connection issue; verify that the ESXi host reconnects to vCenter automatically.
    To restart the ESXi management agents:
    1. Log in to ESXi using SSH as root.
    2. Restart the ESXi host daemon and vCenter Agent services using these commands:
      /etc/init.d/hostd restart
      /etc/init.d/vpxa restart
  4. If iLO is not configured or not reachable, work with the datacenter team to verify the physical state of the ESXi host and restart it using the crash cart console.
  5. After the reboot, verify ESXi connectivity with vCenter and put the host in maintenance mode if it is part of a cluster.
  6. As part of the root cause investigation, review the recent tasks and events of the ESXi host in the web client console.
  7. For further analysis, export the system logs of the affected ESXi host, or get them over SSH from the ESXi log location: 
  8. Collect diagnostic logs with the help of the article below.
  9. Share the diagnostic logs with the Compute team for further analysis with VMware support.
Condition 2
  1. The ESXi host is in a Not Responding state but the VMs residing on it are still available.
  2. Follow steps 2 and 3 above.
  3. Try to reconnect the ESXi host using the root password.
  4. If the affected ESXi host still fails to connect, the VMs need to be shut down manually. Note: this requires a change request and confirmation/approval from the VM owners.
  5. After the activity, keep the ESXi host in maintenance mode and share the exported logs with the Compute team for further analysis with VMware support.
“vm-support” command in ESX/ESXi to collect diagnostic information (1010705)



VMware Technical Support routinely requests diagnostic information from you when a support request is handled. This diagnostic information contains product specific logs, configuration files, and data appropriate to the situation. The information is gathered using a specific script or tool for each product and can include a host support bundle from the ESXi host and vCenter Server support bundle. Data collected in a host support bundle may be considered sensitive. Additionally, as of vSphere 6.5, support bundles can include encrypted information from an ESXi host. For more information on what information is included in the support bundles
This article provides procedures for obtaining diagnostic information for a VMware ESXi/ESX host using the vm-support command line utility. For other methods of collecting the same information
The diagnostic information obtained by using this article is uploaded to VMware Technical Support. To uniquely identify your information, use the Support Request (SR) number you receive when you create the new SR.


The command-line vm-support utility is present on all versions of VMware ESXi/ESX, though some of the options available with the utility differ among versions.
Running vm-support in a console session on ESXi/ESX hosts
The traditional way of using the vm-support command-line utility produces a gzipped tarball (.tgz file) locally on the host. The resulting file can be copied off the host using FTP, SCP, or another method.
  1. Open a console to the ESX or ESXi host and run the vm-support command.
    Note: Additional options can be specified to customize the log bundle collection. Use the vm-support -h command for a list of options available on a given version of ESXi/ESX.
  2. A compressed bundle of logs is produced and stored in a file with a .tgz extension in one of these locations:
    • /var/tmp/
    • /var/log/
    • The current working directory
    • To export the log bundle to a shared vmfs datastore, use this command:
      vm-support -f -w /vmfs/volumes/DATASTORE_NAME

    Note: The -f option is not available in ESXi 5.x, ESXi/ESX 4.1 Update 3, and later.
  3. After the log bundle is collected and downloaded to a client, upload the logs to the SFTP/FTP site.
Streaming vm-support output from an ESXi 5.x and 6.0 host
Starting with ESXi 5.0, the vm-support command-line utility supports streaming content to the standard output. This allows you to send the content over an SSH connection without saving anything locally on the ESXi host.
  1. Enable SSH access to the ESXi shell. For more information, see Enable ESXi Shell and SSH Access with the Direct Console User Interface section in the vSphere Installation and Setup guide.
  2. Using a Linux or Posix client, such as the vSphere Management Assistant appliance, log in to the ESXi host and run the vm-support command with the streaming option enabled, specifying a new local file. A compressed bundle of logs is produced on the client at the specified location. For example:
    ssh root@ESXHostnameOrIPAddress vm-support -s > vm-support-Hostname.tgz
    Note: This requires you to enter a password for the root account, and cannot be used with lockdown mode.
  3. You can also direct the support log bundle to a desired datastore location using the same command (mentioning the destination path). For example:
    ssh root@ESXHostnameOrIPAddress 'vm-support -s > /vmfs/volumes/datastorexxx/vm-support-Hostname.tgz'
  4. After the log bundle has been collected and downloaded to a client, upload the logs to the SFTP/FTP site.
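If you collect bundles from several hosts, the streaming command in step 2 can be wrapped in a few lines of Python. This is a sketch, not part of the VMware procedure: `streaming_command` and `collect_bundle` are hypothetical names, the host name in the usage note is made up, and the `runner` parameter exists only so the command can be exercised without a real host.

```python
import subprocess
from typing import Callable, List

def streaming_command(host: str, user: str = "root") -> List[str]:
    """Build the ssh invocation that runs vm-support in streaming mode."""
    return ["ssh", f"{user}@{host}", "vm-support", "-s"]

def collect_bundle(host: str, out_path: str, runner: Callable = subprocess.run):
    """Stream the log bundle from `host` into `out_path` on the client."""
    with open(out_path, "wb") as out:
        # ssh still asks for the password on the terminal;
        # only standard output (the .tgz bytes) is redirected to the file.
        return runner(streaming_command(host), stdout=out, check=True)
```

Calling `collect_bundle("esx01.example.com", "vm-support-esx01.tgz")` is then equivalent to the ssh line in step 2, and `check=True` raises an error if ssh or vm-support exits with a failure.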
HTTP-based download of vm-support output from an ESXi 5.x and 6.0 host
Starting with ESXi 5.0, the vm-support command-line utility can be invoked via HTTP. This allows you to download content using a web browser or a command line tool like wget or curl.
  1. Using any HTTP client, download the resource from:
    For example, download the resource using the wget utility on a Linux or other Posix client, such as the vSphere Management Assistant appliance. A compressed bundle of logs is produced on the client at the specified location:
  2. After the log bundle is collected and downloaded to a client, upload the logs to the SFTP/FTP site.

Related Information

There have been updates for the vm-support command-line utility for some versions of VMware ESX 2.x and 3.x. Ensure that the version of vm-support on each ESX host is up to date. The minimum version listed provides improvements required to protect the security of your data when providing support information to VMware. For more information about these security improvements
Verifying the version of the vm-support utility
Verify that your version of vm-support is at least that listed for your version of ESXi/ESX:
  • ESX Server 2.5.5 requires version 1.15 or higher
  • ESX Server 3.0.x requires version 1.29 or higher
  • ESXi/ESX Server 3.5 requires version 1.30 or higher
  • ESXi/ESX Server 4.x requires version 1.29 or higher
  • ESXi Server 5.x requires version 2.0 or higher
To see which version is installed on your system, run the vm-support command with no options and then cancel the collection, or run the command vm-support --version. For example:
  • [user@esxhost]$ cd /tmp
    [user@esxhost]$ vm-support
    VMware ESX Server Support Script 0.94
    Preparing Files: |
    [Ctrl+C to cancel]
  • [user@esxhost]$ vm-support --version
    vm-support v2.0
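The minimum versions listed above can be checked mechanically. The Python sketch below parses either of the two output formats shown in the example and compares the result against the list; the dictionary keys are hypothetical labels for the ESX families, and treating "1.15" as major 1, minor 15 (so that 1.15 is lower than 1.29) is an assumption about how the script is versioned.

```python
import re
from typing import Tuple

# Minimum vm-support version per ESX/ESXi family, from the list above.
MIN_VERSION = {
    "2.5.5": (1, 15),
    "3.0": (1, 29),
    "3.5": (1, 30),
    "4": (1, 29),
    "5": (2, 0),
}

_VERSION = re.compile(r"(?:vm-support v|Support Script )(\d+)\.(\d+)")

def parse_tool_version(output: str) -> Tuple[int, int]:
    """Extract (major, minor) from vm-support's banner or --version output."""
    m = _VERSION.search(output)
    if m is None:
        raise ValueError("no vm-support version found in output")
    return int(m.group(1)), int(m.group(2))

def is_up_to_date(output: str, esx_family: str) -> bool:
    """True if the reported version meets the minimum for that family."""
    return parse_tool_version(output) >= MIN_VERSION[esx_family]
```

With the example output above, the 0.94 script would be flagged as too old for an ESX 3.0.x host, while v2.0 satisfies the ESXi 5.x requirement.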
Updating the version of the vm-support utility on ESX
To update the vm-support utility on an ESX host:
  1. Open a console to the ESX host.
  2. Verify the version of the vm-support utility installed.
  3. Make a backup of the existing vm-support utility using the command:
    cp /usr/bin/vm-support /usr/bin/vm-support.old
  4. Download the appropriate file for your version of VMware ESX and place it in the /tmp/ directory in the service console of the ESX system.

    ESX 2.5.5

    ESX 3.0.1

    ESX 3.0.2

    ESX 3.0.3

    ESX 3.5

    ESXi/ESX 4.x
    vm-support is up to date for ESXi/ESX 4, no updates are available.
    ESXi 5.x
    vm-support is up to date for ESXi/ESX 5, no updates are available.
  5. Run this command to extract the archived file:
    tar xvzf filename.tgz
  6. Verify that the MD5 sum of the vm-support file in the attachment matches the value for your software version listed in the table above. For example:
    md5sum vm-support
    11af1759471892c240376cdf1e7a4ad0 vm-support
  7. Copy the vm-support utility to the /usr/bin/ directory, replacing the original vm-support script.
    Note: When running on an older version of ESX, the updated script might report errors about missing commands.
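The checksum comparison in step 6 can also be done from a client machine before you copy the file to the host. The following Python sketch uses the standard hashlib module; `md5_of_file` and `matches_expected` are hypothetical helper names, and the expected value must still come from the published list for your ESX version.

```python
import hashlib

def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_expected(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with a published value, ignoring case and whitespace."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

Only replace /usr/bin/vm-support if `matches_expected` returns True for the extracted file.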
You can power off or restart (reboot) any ESX/ESXi host using the vSphere Client. You can also power off ESX hosts from the service console. Powering off a managed host disconnects it from vCenter Server, but does not remove it from the inventory.


Shut down all virtual machines running on the ESX/ESXi host.

Select the ESX/ESXi host you want to shut down.

From the main or right-click menu, select Reboot or Shut Down.

If you select Reboot, the ESX/ESXi host shuts down and reboots.
If you select Shut Down, the ESX/ESXi host shuts down. You must manually power the system back on.

Provide a reason for the shut down.
This information is added to the log.
Determining why an ESXi/ESX host was powered off or restarted (1019238)


  • An ESXi/ESX host is disabled (grayed out) and displays as Not Responding.
  • An ESXi/ESX host is disabled (grayed out) and displays as Disconnected.
  • Clients connected to services running in one or more virtual machines are no longer accessible.
  • Applications dependent on services running in one or more virtual machines are reporting errors.
  • One or more virtual machines are no longer responding to network connections.


This article provides steps to determine if an ESX or ESXi host was powered off or restarted.


ESX 4.x
To determine the reason for an abrupt shutdown or reboot of an ESX host:
  1. If the host is currently turned off, turn the host back on.
  2. Ensure that there are no hardware lights that may indicate a hardware issue. For more information, engage the hardware vendor.
  3. Log in to the host at the console as the root user.
  4. Run the command:
    # cat /var/log/vmksummary
  5. Determine if the ESX host was deliberately rebooted. When a user or script reboots a VMware ESX host, it generates a series of events under /var/log/vmksummary similar to:
    localhost logger: (1265803308) hb: vmk loaded, 1746.98, 1745.148, 0, 208167, 208167, 0, vmware-h-59580, sfcbd-7660, sfcbd-3524
    localhost vmkhalt: (1268148282) Rebooting system...
    localhost vmkhalt: (1268148374) Starting system...
    localhost logger: (1268148407) loaded VMkernel
    Hostd: [<YYYY-MM-DD> <time>.284 27D13B90 info 'TaskManager'] Task Created : haTask-ha-host-vim.HostSystem.reboot-50
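To turn the vmkhalt lines in such a log into readable times, a short Python sketch like this can help. It assumes the number in parentheses is a Unix timestamp in seconds, which matches the magnitude of the values in the example but is not stated in the log itself; `reboot_events` is a hypothetical helper name.

```python
import re
from datetime import datetime, timezone
from typing import Iterable, List, Tuple

# Matches the two vmkhalt messages shown in the example above.
_EVENT = re.compile(r"vmkhalt: \((\d+)\) (Rebooting|Starting) system")

def reboot_events(lines: Iterable[str]) -> List[Tuple[datetime, str]]:
    """Return (UTC time, event) pairs for vmkhalt reboot/start messages."""
    events = []
    for line in lines:
        m = _EVENT.search(line)
        if m:
            # Assumption: the parenthesised value is epoch seconds.
            when = datetime.fromtimestamp(int(m.group(1)), tz=timezone.utc)
            events.append((when, m.group(2)))
    return events
```

A "Rebooting" entry followed shortly by a "Starting" entry points to a deliberate reboot; the gap between the two timestamps is roughly how long the host was down.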
