Monitoring Servers, Services and Links

We're pleased you're looking at this page. Many people consider an IT provider's internal systems only as an afterthought, basing their decision purely on cost per line item or the uptime figures in sales material.

Cost is very important, as can be the revenue lost to downtime. Our figures are not marketing or sales numbers, nor are they taken from a system in a High Availability location. The data presented here comes from real-world running systems monitored over a regular ADSL connection.

Unavailability shown includes regular, planned unavailability due to maintenance and backups performed out of hours.

Microsoft Windows Servers

2008r2, Remote Desktop

The RDS service of a 2008r2 VPS over a 31 day window. The 18 minutes of downtime is unavailability due to snapshotting for backup purposes (backups are taken outside working hours).

For Windows servers we routinely monitor:

  1. Host response to ICMP
  2. Free Space per drive (configured separately per drive since limits depend on volume use)
  3. CPU utilization
  4. Memory
  5. RDesktop (remote admin access)
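As an illustration of why free-space limits are configured per drive, here is a minimal Python sketch. It is not the Nagios check we run; the drive letters, the thresholds and the names `DRIVE_LIMITS`, `drive_state` and `check_drives` are assumptions made up for the example.

```python
import shutil

# Hypothetical per-drive limits: minimum free space as a fraction of capacity.
# A system drive and a bulk data drive would usually get different limits.
DRIVE_LIMITS = {"C:\\": 0.15, "D:\\": 0.05}


def drive_state(total_bytes, free_bytes, min_free_fraction):
    """Return 'OK' or 'CRITICAL' for one drive, given its own limit."""
    if free_bytes / total_bytes >= min_free_fraction:
        return "OK"
    return "CRITICAL"


def check_drives(limits=DRIVE_LIMITS):
    """Check every configured drive against its own threshold."""
    results = {}
    for path, min_free in limits.items():
        usage = shutil.disk_usage(path)  # named tuple: total, used, free
        results[path] = drive_state(usage.total, usage.free, min_free)
    return results
```

With these example limits, a drive with 10% free space would be critical as `C:\` but fine as `D:\`.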

Linux Servers

VPS Host server

A VPS host over a 31 day window. This is purely ICMP. The 9 minutes of downtime was caused by neither the host nor DC connectivity; it was a client-side connectivity issue.

HTTPS and certificate expiry

HTTPS availability on a Linux VPS (on the host above). The certificate expiry date is checked too.
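A certificate expiry check can be sketched in Python using only the standard library. This is an illustration rather than the plugin we use; the function names and the 30 day and 7 day thresholds are assumptions chosen for the example.

```python
import socket
import ssl
import time


def fetch_not_after(host, port=443, timeout=10):
    """Connect over TLS and return the certificate's 'notAfter' string."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after, now=None):
    """Days left before expiry; not_after looks like 'Jan  1 00:00:00 2030 GMT'."""
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires - now) / 86400


def cert_state(days_left, warn_days=30, crit_days=7):
    """Map days remaining to a state (thresholds are hypothetical)."""
    if days_left <= crit_days:
        return "CRITICAL"
    if days_left <= warn_days:
        return "WARNING"
    return "OK"
```

A check would call `cert_state(days_until_expiry(fetch_not_after(host)))` for the monitored host name, so that a failed connection and an expiring certificate both surface as problems.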

For Linux servers we routinely monitor:

  1. Host response to ICMP
  2. Webserver (http/https)
  3. SSH (remote admin access)

RRDTool Graphs

We store (using RRDTool) data on:

  1. CPU utilisation (per core)
  2. RAM
  3. Interface Traffic
  4. Disk IO (at both individual drive level and array level): throughput and IOPS, as well as drive temperature.

This is graphed every 5 minutes and checked by a human once a day for each server. This data helps spot anomalies and changes in usage patterns that are not picked up by other methods.

Applications / Services

Email End to End

An SMTP/IMAP end-to-end relay and delivery check over a 31 day window.

  1. MS Exchange - monitor end to end. Send an email to a mailbox on the server (via SMTP) and check that it appears in the (IMAP) mailbox within a certain time period.
  2. MSSQL / MySQL - count or sum a numeric column across load-balanced or failover servers.
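The email end-to-end check can be sketched as follows: send a uniquely tagged message over SMTP, then poll the IMAP mailbox until it appears or a deadline passes. This is a minimal Python illustration, not our production check; the host names, credentials, port, timings and function names are all assumptions.

```python
import imaplib
import smtplib
import time
import uuid
from email.message import EmailMessage


def send_probe(smtp_host, sender, recipient):
    """Send a probe email with a unique token in the subject; return the token."""
    token = uuid.uuid4().hex
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"probe {token}"
    msg.set_content("End-to-end monitoring probe.")
    with smtplib.SMTP(smtp_host, 25, timeout=30) as smtp:
        smtp.send_message(msg)
    return token


def probe_arrived(imap_host, user, password, token):
    """Return True if a message carrying the token is in the INBOX."""
    with imaplib.IMAP4_SSL(imap_host) as imap:
        imap.login(user, password)
        imap.select("INBOX")
        status, data = imap.search(None, "SUBJECT", f'"probe {token}"')
        return status == "OK" and bool(data[0].split())


def wait_for(found, deadline_s=300, poll_s=15,
             clock=time.monotonic, sleep=time.sleep):
    """Poll found() until it is true; return seconds taken, or None on timeout."""
    start = clock()
    while True:
        if found():
            return clock() - start
        if clock() - start >= deadline_s:
            return None
        sleep(poll_s)
```

A check would call `send_probe(...)` and then `wait_for(lambda: probe_arrived(..., token))`; a result of `None` means the message did not arrive within the allowed period and an alert should be raised.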

Telephone Systems

Monitoring an Asterisk SIP trunk

A SIP trunk over a 31 day window. The test is whether certain SIP peers are currently reachable by the Asterisk server.

As with any best-practice ISP / IT services company, we monitor our own and certain customers' servers and services. To do this we use Nagios, the self-styled but widely accepted "Industry Standard In IT Infrastructure Monitoring".

Our monitoring system watches items across 6 sites from an independent location.

We monitor to ensure we respond quickly to issues and to provide figures for Service Level Agreement purposes.

We commit to 99.5% (or better) availability, i.e. 3.6525 hours (or less) of downtime per 30.4375 day window.
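The arithmetic behind that figure is simple: 30.4375 days is the average month (365.25 / 12), which is 730.5 hours, and 0.5% of that is 3.6525 hours. A short Python sketch, with function names chosen for this example, shows the calculation in both directions:

```python
def allowed_downtime_hours(availability_pct, window_days):
    """Maximum downtime (hours) permitted by an availability target."""
    return (100 - availability_pct) / 100 * window_days * 24


def measured_availability_pct(downtime_minutes, window_days):
    """Availability (%) achieved for a given downtime over a window."""
    window_minutes = window_days * 24 * 60
    return 100 * (1 - downtime_minutes / window_minutes)


# 99.5% over an average month of 30.4375 days: about 3.6525 hours
print(allowed_downtime_hours(99.5, 30.4375))

# The 18 minutes and 9 minutes quoted above, each over a 31 day window
print(round(measured_availability_pct(18, 31), 2))
print(round(measured_availability_pct(9, 31), 2))
```

On this basis, the 18 minutes and 9 minutes of downtime quoted for the example servers above correspond to roughly 99.96% and 99.98% availability over their 31 day windows.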

24 x 7 x 365 Monitoring

To ensure someone knows what's happening 24x7x365 we send notifications out via 2 separate email infrastructures.

There is a 6-7 hour window every day (depending on daylight saving time) where it is probable that no engineers are awake. If any (predetermined) critical system is noted to have an issue, our monitoring system will place a telephone call to the on-duty engineer to wake them up.

We monitor variables at 5 minute intervals. Once an alert state is determined (or no response is received), the item is checked 3 more times at 1 minute intervals before an alert is issued. The time between the initial failure and notification is thus 3 to 8 minutes.
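The 3 to 8 minute range follows from where in the 5 minute cycle the failure happens. A small Python sketch makes the reasoning explicit; the function and parameter names are assumptions for illustration, and it ignores the time the checks themselves take to run.

```python
def notification_delay(minutes_since_last_check, check_interval=5,
                       rechecks=3, recheck_interval=1):
    """Minutes from a failure occurring to the alert being issued.

    minutes_since_last_check: how long after the last successful
    scheduled check the failure happened (0 up to check_interval).
    """
    if not 0 <= minutes_since_last_check <= check_interval:
        raise ValueError("failure must fall within one check interval")
    # Time until the next scheduled check notices the failure.
    detection_wait = check_interval - minutes_since_last_check
    # The extra confirmation checks before notifying anyone.
    confirmation = rechecks * recheck_interval
    return detection_wait + confirmation


# Failure just before the next scheduled check: best case
print(notification_delay(5))   # 3
# Failure just after a successful check: worst case
print(notification_delay(0))   # 8
```

The rechecks trade a few minutes of delay for fewer false alarms from momentary glitches.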

In contrast, you will find support companies who learn that your disk is full a whole day after your database application started failing to save changes.

We do not pay someone to watch a traffic light system (although Nagios does include one). Our alerts immediately interrupt the right responder's normal or scheduled work. Most alerts are sent to 2 people.


© Commercial Internet Solutions Limited (2019-)
Registered in England and Wales, Company No. 07276867


Commercial Internet Solutions provides internet applications and services to small business clients around London from our Tier 4 hosting facility, Custodian Data Centers in Maidstone, Kent, using n+1 redundant Supermicro servers.

We provide fast web hosting, secure (SSL) IMAP and POP3 email hosting, and cheap, compliant, easy-to-use email marketing software.

We host, manage and back up Microsoft Windows Small Business Servers, dedicated Linux servers and Asterisk/SIP-based VoIP PBX solutions.