|By Hovhannes Avoyan||
|May 10, 2012 10:03 AM EDT||
Last time we mentioned two fundamental principles while monitoring any object:
1. The monitor should collect as much important information as possible that will allow to accurately evaluate the health state of an object.
2. The monitor should have little to no effect on the activity of the object.
Normally, it is necessary to add couple lines in your existing Node.js server code to activate the monitor-plugin:
var monitor = require('monitor');// insert monitor module-plugin
var server = … // the definition of current Node.js server
monitor.Monitor(server); //add server to monitor
Now the monitor will begin collecting and measuring data. The monitor-plugin has an embedded simple HTTP server that sends accumulated data by request and should correspond (in current implementation) to the following pattern
10010 – the listen port of monitor-plugin (configurable)
‘node_monitor’ – the pathname keyword
‘action=getdata’ – command for getting collected data
‘access_code’ – the specially generated access code that changes for every session
Please notice that monitis-plugin server (in current implementation) listens the localhost only. This and usage of the security access code for every session gives enough security while monitoring. More detailed information can be found along with implemented code in the github repository.
Server monitoring metrics
There are a largely standard set of metrics which can be used to monitor the underlying health of any server.
* CPU Usage describes the level of utilisation of the system CPU(s) and is usually broken down into three states.
o IO Wait – indicates the proportion of CPU cycles spent waiting for IO (disk or network) events. If you experience large IO Wait proportions, it can indicate that your disks are causing a performance bottleneck.
o System – indicates the proportion of CPU cycles spent performing kernel-level processing. Generally you will find only a small proportion of your CPU cycles are spent on system tasks, Hence if you see spikes it could indicate a problem.
o User – indicates the proportion of CPU cycles spent performing user instigated processing. This is where you should see the bulk of your CPU cycles consumed; it includes activities such as web serving, application execution, and every other process not owned by the kernel.
o Idle – indicates the spare CPU capacity you have – all the cycles where the CPU is, quite literally, doing nothing.
* Load Average is a metric that indicates the level of load that a server is under at a given point in time. Usually evaluated as number of requests per second.
* RAM usage by server is broken down usually into the following parts.
o Free – the amount of unallocated RAM available. Linux systems tend to keep this as low as possible; and do not free up the system’s physical RAM until it is requested by another process.
o Inactive – RAM that is in-use for buffers and page caching, but hasn’t been used recently so will be reclaimed first for use by a running process.
o Active – RAM that has been used recently and will not be reclaimed unless we have insufficient Inactive RAM to claim from. In Linux systems this is generally the one to keep an eye on. Sudden, rapid increases signal a memory hungry process that will soon cause VM swapping to occur.
* The server uptime is a metric showing the elapsed time since the last reboot. Non-linear behavior of the server uptime line indicates that the server was rebooted somehow.
* Throughput – the amount of data traffic passing through the servers’ network interface is fundamentally important. It is usually broken down into inbound and outbound throughput and normally measured as average values for some period by kbit or kbyte per sec.
* Server Response time is defined as the duration from receiving a request to sending a response. Normally, it should not exceed reasonable timeout (usually this depends on the complexity of processing) but should be as little as possible. Usually, the average and peak response time is evaluated for some time period.
* The count of successfully processed requests is evaluated as the percentage of responses with 2xx status codes (success) to requests during observing time. This value should be as close as possible to 100%.
We have used part of these metrics and added some specific statistics that are important for our task (e.g. client platform, detailed info for response codes, etc.)
The results below were obtained on a Node.js server equipped with the monitor described above on a Debian6-x64. The server listens on HTTP (81) and HTTPS (443) ports and does not have a large load.
The data can be shown in graphical view too.
Samsung VP Jacopo Lenzi, who headed the company's recent SmartThings acquisition under the auspices of Samsung's Open Innovaction Center (OIC), answered a few questions we had about the deal. This interview was in conjunction with our interview with SmartThings CEO Alex Hawkinson. IoT Journal: SmartThings was developed in an open, standards-agnostic platform, and will now be part of Samsung's Ope...
Oct. 26, 2014 01:00 AM EDT Reads: 2,719
Explosive growth in connected devices. Enormous amounts of data for collection and analysis. Critical use of data for split-second decision making and actionable information. All three are factors in making the Internet of Things a reality. Yet, any one factor would have an IT organization pondering its infrastructure strategy. How should your organization enhance its IT framework to enable an In...
Oct. 25, 2014 05:00 PM EDT Reads: 1,736
The major cloud platforms defy a simple, side-by-side analysis. Each of the major IaaS public-cloud platforms offers their own unique strengths and functionality. Options for on-site private cloud are diverse as well, and must be designed and deployed while taking existing legacy architecture and infrastructure into account. Then the reality is that most enterprises are embarking on a hybrid cloud...
Oct. 25, 2014 11:45 AM EDT Reads: 1,855
In her General Session at 15th Cloud Expo, Anne Plese, Senior Consultant, Cloud Product Marketing, at Verizon Enterprise, will focus on finding the right mix of renting vs. buying Oracle capacity to scale to meet business demands, and offer validated Oracle database TCO models for Oracle development and testing environments. Anne Plese is a marketing and technology enthusiast/realist with over 19...
Oct. 25, 2014 10:00 AM EDT Reads: 1,733
StackIQ offers a comprehensive software suite that automates the deployment, provisioning, and management of Big Infrastructure. With StackIQ’s software, you can spin up fully configured big data clusters, quickly and consistently — from bare-metal up to the applications layer — and manage them efficiently. Our software’s modular architecture allows customers to integrate nearly any application wi...
Oct. 25, 2014 10:00 AM EDT Reads: 1,793
As Platform as a Service (PaaS) matures as a category, developers should have the ability to use the programming language of their choice to build applications and have access to a wide array of services. Bluemix is IBM's open cloud development platform that enables users to easily build cloud-based, creative mobile and web applications without having to spend large amounts of time and resources o...
Oct. 25, 2014 08:00 AM EDT Reads: 1,807
The Internet of Things will greatly expand the opportunities for data collection and new business models driven off of that data. In her session at Internet of @ThingsExpo, Esmeralda Swartz, CMO of MetraTech, will discuss how for this to be effective you not only need to have infrastructure and operational models capable of utilizing this new phenomenon, but increasingly service providers will nee...
Oct. 24, 2014 09:30 PM EDT Reads: 1,454
When you set off to build an app that will change the world, designing your system architecture to be reliable and scalable is important but the stark reality is that, for your MVP, you probably had a “need for speed” (of development). You didn’t know what all the axes were to scale your application, where your stress points would be, and what weird and wonderful ways your customers would use it d...
Oct. 24, 2014 09:00 PM EDT Reads: 1,282
Compute virtualization has been transformational, yet security policy implementation and enforcement has lagged behind in agility and automation. There are a number of key considerations when implementing policy in private and hybrid clouds. In his session at 15th Cloud Expo, Holland Barry, VP of Technology at Catbird, will discuss the impact of this new paradigm and what organizations can do to...
Oct. 24, 2014 07:00 PM EDT Reads: 1,550
SYS-CON Events announced today that Red Hat, the world's leading provider of open source solutions, will exhibit at Internet of @ThingsExpo, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Red Hat is the world's leading provider of open source software solutions, using a community-powered approach to reliable and high-performing cloud, Linux, ...
Oct. 23, 2014 11:30 PM EDT Reads: 1,749