
Monday, 14 January 2013

Comparing Amazon VPC connectivity options

In August 2009 Amazon announced its Virtual Private Cloud (VPC) service, essentially giving enterprise customers worried about security and control in the cloud a solution to that concern. Since then the Amazon VPC has matured as more and more services have become available from within the VPC.

Amazon Virtual Private Cloud allows IT administrators to provision a private, isolated section of the Amazon Web Services (AWS) Cloud where they can launch AWS resources in a virtual network that they define. They can have complete control over the virtual networking environment, including selection of IP address ranges, configuration of routing tables, subnets and network gateways.

Furthermore, customers can connect their existing data centers and branch offices to the Amazon VPC and access the AWS cloud as if it were an extension of the corporate network. This connectivity between the corporate offices and the Amazon VPC can be accomplished in several ways.

In this short blog, we will explore the options available for connecting the enterprise network to the Amazon VPC whilst we compare and contrast the advantages, disadvantages and associated costs.


AWS Direct Connect


AWS Direct Connect is an AWS service that allows you to establish a dedicated network connection between your WAN and the Amazon Web Services global network. If your corporate network has a presence in one of the Direct Connect locations, the service provides dedicated 1G or 10G connectivity between your network equipment at that location and Amazon's routers.

Pricing information can be found on the AWS Direct Connect pricing page.

If connecting in London Telecity, a single 1G port will cost at least $223 per month for the port connection-hours. Additionally you pay $0.03 per GB for data transfers outbound from the VPC to the corporate network. Furthermore, if your corporate offices and datacenters are already reachable from the Direct Connect peering location across the enterprise WAN, only minimal configuration will be required to route traffic between the VPC and those offices.
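
As a rough illustration of how the two charges combine, the sketch below adds the monthly port fee to the outbound transfer charge. It uses only the London Telecity figures quoted above, which should be treated as a snapshot rather than current pricing; the function name is made up for this example.

```python
# Figures quoted in the text for London Telecity (illustrative snapshot only).
PORT_FEE_PER_MONTH = 223.00   # USD, single 1G port, connection-hours for a month
TRANSFER_OUT_PER_GB = 0.03    # USD per GB leaving the VPC towards the corporate network


def monthly_direct_connect_cost(outbound_gb, ports=1):
    """Estimate one month's Direct Connect bill in USD."""
    return ports * PORT_FEE_PER_MONTH + outbound_gb * TRANSFER_OUT_PER_GB


if __name__ == "__main__":
    for gb in (100, 1000, 10000):
        print(gb, "GB out ->", round(monthly_direct_connect_cost(gb), 2), "USD")
```

With low traffic the fixed port fee dominates the bill, which is why the option only pays off for traffic-heavy workloads.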

Advantages

  • Reduces bandwidth costs for traffic-heavy applications.
  • Provides consistent network performance compared to other options.
  • Can be used for accessing AWS services outside the VPC.

Disadvantages

  • Requires existing network presence in a very limited set of locations.
  • Requires more complex network hardware and configuration, for example 802.1q VLANs, BGP, etc.
  • If the traffic loads are not heavy enough, this is an expensive option.
  • Not very elastic: the options are 1G or 10G ports, with nothing in between.

Monday, 7 January 2013

Varnish and Autoscaling... a love story


While working on a cool project at Cloudreach, I stumbled upon Varnish and fell in love with it instantly. The first thing I tried to do was to combine Varnish with the awesomeness provided by AWS Elastic Load Balancer (ELB), in a combination which looks like:

[Diagram: frontend ELB -> Varnish -> backend ELB -> autoscaling group]

While the frontend ELB works out of the box with Varnish (no surprises here), the backend ELB doesn’t work as expected with Varnish. The problem lies in the fact that Varnish resolves the name assigned to the ELB once and caches the IP addresses until the VCL gets reloaded. Because of the dynamic nature of the ELB, the IPs linked to the CNAME can change at any time, resulting in Varnish routing traffic to an IP which is no longer linked to the correct ELB.
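
To make the failure mode concrete, here is a minimal sketch that compares a set of addresses captured at an earlier point (as Varnish does when the VCL is loaded) with a fresh DNS lookup. The helper names are invented for this example and are not part of Varnish or AWS.

```python
import socket


def resolve_ips(hostname, port=80):
    """Return the set of IPv4 addresses the name resolves to right now."""
    infos = socket.getaddrinfo(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    return {info[4][0] for info in infos}


def stale_ips(cached, current):
    """Addresses kept from an earlier lookup that DNS no longer returns."""
    return set(cached) - set(current)
```

Calling resolve_ips with the ELB's DNS name at two different times and passing both results to stale_ips shows which cached addresses Varnish would still be sending traffic to.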

The problem is discussed here and here but after Googling around I couldn't find any solution which didn’t involve doing:

ELB -> VARNISH -> NGINX (or HAproxy) ->  ELB -> AUTOSCALING GROUP

Going through so many layers seemed too much, considering that Varnish can load balance requests and perform health checks on the backend nodes without the need for an internal ELB. The more I thought about it, the more I realised how simple it would be to implement a solution... so I did it. Using Varnish to perform the load balancing removes the overhead of going through an internal ELB, and the backend nodes only need to be reloaded when an autoscaling activity takes place.


The solution I've implemented uses the varnishadm command line tool, boto, and some bash scripting to glue it all together.

First of all we need to get the backend nodes configured in Varnish and store them in a file:


varnishadm -T $HOSTPORT -S $SECRET backend.list > varnish_ips
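
A possible way to read that file back in Python is sketched below. It assumes the backend.list output contains each backend's address, for example in lines shaped like name(ip,,port); the exact layout depends on the Varnish version, so the sample text and the regular expression are assumptions to check against your own output.

```python
import re

# Matches dotted-quad IPv4 addresses anywhere in the text.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def parse_backend_ips(text):
    """Collect the distinct IPv4 addresses mentioned in backend.list output."""
    return sorted(set(IPV4_RE.findall(text)))


def get_varnish_ips(path="varnish_ips"):
    """Load the file written by the varnishadm command and extract the IPs."""
    with open(path) as handle:
        return parse_backend_ips(handle.read())


# Assumed sample of what the file might contain.
SAMPLE = """Backend name                   Refs   Admin      Probe
node0(10.0.1.15,,80)           1      probe      Healthy 5/5
node1(10.0.2.27,,80)           1      probe      Healthy 5/5
"""

if __name__ == "__main__":
    print(parse_backend_ips(SAMPLE))
```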

Then, we will have to query the autoscaling group, and update the backends if any instance has been added/terminated. The following Python code does most of the job:


Let’s break it down:

  • get_autoscaling_ips gets the IPs associated with the instances in a specific autoscaling group.
  • get_varnish_ips loads the backend IPs into a Python list.
  • update_vlc_file compares the two lists of IPs. If there is any difference between them (you might want to reconsider this aspect), it creates a new VCL file containing the IPs retrieved from the autoscaling group.
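
The sketch below is not the original listing. It is one way the autoscaling lookup and the comparison step could look, written against boto3-style client calls (describe_auto_scaling_groups and describe_instances) rather than the older boto library mentioned above. Passing the clients in as arguments is a choice made for this example so the function can be exercised without AWS credentials; backends_changed is a hypothetical helper name.

```python
def get_autoscaling_ips(asg_client, ec2_client, group_name):
    """Private IPs of the in-service instances of one autoscaling group.

    asg_client and ec2_client are assumed to behave like
    boto3.client("autoscaling") and boto3.client("ec2").
    """
    groups = asg_client.describe_auto_scaling_groups(
        AutoScalingGroupNames=[group_name])["AutoScalingGroups"]
    ids = [inst["InstanceId"]
           for group in groups
           for inst in group["Instances"]
           if inst["LifecycleState"] == "InService"]
    if not ids:
        return []
    reservations = ec2_client.describe_instances(InstanceIds=ids)["Reservations"]
    return sorted(inst["PrivateIpAddress"]
                  for res in reservations
                  for inst in res["Instances"]
                  if "PrivateIpAddress" in inst)


def backends_changed(varnish_ips, autoscaling_ips):
    """True when the two collections do not hold exactly the same addresses."""
    return set(varnish_ips) != set(autoscaling_ips)
```

Comparing as sets ignores ordering, so a new VCL is only generated when an instance has actually been added or removed.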

To decouple the VCL section that defines request handling and document caching policies (unlikely to change with the autoscaling group) from the section that configures the backends, the Python script outputs the new VCL in the following format:

include "/etc/varnish/healthcheck.vcl";

node definitions


director definitions


include "/etc/varnish/use.vcl";

The node definitions and the director definition are dynamically generated by the script, while healthcheck.vcl is a static file where the health check conditions are defined (what a surprise :) and use.vcl is another static Varnish config file, which makes use of the director definition.
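
A generator for that layout could look roughly like the following. It assumes Varnish 3 style syntax (backend blocks plus a round-robin director), a probe called healthcheck defined in healthcheck.vcl, and a director called backend_director referenced from use.vcl; all of those names, and the node0, node1 naming, are placeholders chosen for this sketch.

```python
def render_backend_vcl(ips, port=80, director="backend_director",
                       probe="healthcheck"):
    """Build the dynamic VCL file: includes, node and director definitions."""
    lines = ['include "/etc/varnish/healthcheck.vcl";', ""]
    names = []
    # One backend block per autoscaling instance, in a stable order.
    for index, ip in enumerate(sorted(ips)):
        name = "node%d" % index
        names.append(name)
        lines.append('backend %s { .host = "%s"; .port = "%d"; .probe = %s; }'
                     % (name, ip, port, probe))
    # A single director spreading requests over every node.
    lines += ["", "director %s round-robin {" % director]
    lines += ["    { .backend = %s; }" % name for name in names]
    lines += ["}", "", 'include "/etc/varnish/use.vcl";', ""]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_backend_vcl(["10.0.2.27", "10.0.1.15"]))
```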

Once the new VCL is generated, it’s just a matter of reloading it, running:

varnishadm -T $HOSTPORT -S $SECRET vcl.load $NAME $FILE
varnishadm -T $HOSTPORT -S $SECRET vcl.use $NAME
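
If the reload is driven from Python instead of the shell, the same two commands can be issued through subprocess, as in this sketch. The function names are invented for the example; the varnishadm arguments are the ones shown above.

```python
import subprocess


def reload_commands(hostport, secret, name, vcl_path):
    """Argument lists for loading a VCL file and then switching to it."""
    base = ["varnishadm", "-T", hostport, "-S", secret]
    return [base + ["vcl.load", name, vcl_path],
            base + ["vcl.use", name]]


def reload_vcl(hostport, secret, name, vcl_path):
    """Run both commands in order, stopping at the first failure."""
    for command in reload_commands(hostport, secret, name, vcl_path):
        subprocess.check_call(command)  # raises CalledProcessError on non-zero exit
```

Each load needs a configuration name that is not already in use, so deriving $NAME from a timestamp or a counter is a simple way to avoid clashes.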


Something I noticed when creating the script is that backend.list returns the list of all configured backends, regardless of whether the VCL which defines them is in use or not. This behaviour makes the whole exercise of comparing VCL backends with autoscaling IPs useless, so we need to remove all the previous VCL configs by running:

varnishadm -T $HOSTPORT -S $SECRET vcl.discard $OLD_VCL
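
To find the values for $OLD_VCL, one option is to parse the output of varnishadm vcl.list and pick every configuration that is loaded but not active. The sketch assumes each line starts with a status word (active or available) and ends with the configuration name; the columns in between differ across Varnish versions, and the sample text is an assumption.

```python
def discardable_vcls(vcl_list_output):
    """Names of loaded configurations that are not the active one."""
    names = []
    for line in vcl_list_output.splitlines():
        fields = line.split()
        # First field is the status, last field is the configuration name.
        if len(fields) >= 2 and fields[0] == "available":
            names.append(fields[-1])
    return names


# Assumed sample of vcl.list output.
SAMPLE_VCL_LIST = """active          2 asg_3
available       0 asg_2
available       0 boot
"""

if __name__ == "__main__":
    print(discardable_vcls(SAMPLE_VCL_LIST))
```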

The three scripts can be glued together in a bash script which runs as a cron job on each Varnish server. The code above has not been used in production yet, so please do test thoroughly before usage. I’m always curious to hear any feedback, so get in touch if you have any comments on this.

As usual, please reach out to us if you need any help or advice using AWS!


Nicola Salvo
System Developer