Tuesday, November 24, 2015

Monetate open-sources Koupler: a versatile interface to Kinesis!


I'm happy to announce that Monetate has open-sourced Koupler, a versatile interface for Kinesis.  We took the best practices outlined by Amazon and codified them.

Hopefully, this will be the first of many contributions back to the community.

For the full story, check out the formal announcement.


Wednesday, November 18, 2015

Using Squid as an HTTP Proxy via SSH (to fetch remotely from Amazon yum repos)


We've been playing around with Vagrant for local development. Combined with Ansible, it lets you recreate complex systems locally with high fidelity to your deployment environment. Through magic voodoo (kudos to @jjpersch and @kmolendyke), we managed to get an Amazon AMI crammed into VirtualBox. Unfortunately, Amazon's yum repos are only reachable from inside the EC2 network. Thus, when we fired up our Vagrant machine locally, it couldn't update!

No worries. We could use a single EC2 instance to proxy the HTTP requests from yum. Here's how.

First, fire up an EC2 instance and install squid on that instance.

yum install squid

Start squid with:

/etc/init.d/squid start

If it has trouble starting, have a look at:

cat /var/log/squid/squid.out

If you see something like this:

FATAL: Could not determine fully qualified hostname.  Please set 'visible_hostname'

You may need to set visible_hostname in the /etc/squid/squid.conf file. Add it to the end of that file:

visible_hostname myec2
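If you'd rather script that change than edit the file by hand, here is a minimal sketch that appends the directive only when it is missing. The function name ensure_visible_hostname is made up for illustration, and the sketch assumes the config file already ends with a newline.

```shell
# Hypothetical helper: append visible_hostname to a squid config if absent.
# Usage: ensure_visible_hostname <config-file> <hostname>
ensure_visible_hostname() {
    conf="$1"
    name="$2"
    # Look for an uncommented visible_hostname directive.
    if grep -qE '^[[:space:]]*visible_hostname[[:space:]]' "$conf"; then
        echo "visible_hostname already set in $conf"
    else
        printf 'visible_hostname %s\n' "$name" >> "$conf"
        echo "added visible_hostname $name to $conf"
    fi
}

# Example (the real file needs root):
# ensure_visible_hostname /etc/squid/squid.conf myec2
```

Running it twice leaves a single visible_hostname line in the file, so it is safe to put in a provisioning script.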

Then, squid should fire up.  By default squid runs on port 3128.
(But you can check with netstat -plant | grep squid)

Now, you are ready for HTTP proxying. In this scenario, let's assume your EC2 instance is myec2.foo.com, and you want to proxy all HTTP requests from your laptop through that machine.

On your laptop, you would run:

ssh -L 4444:localhost:3128 myec2.foo.com

This will ssh into your EC2 instance, simultaneously setting up port forwarding from your laptop. Every connection made to port 4444 on your laptop will be forwarded over the secure tunnel to port 3128 on your EC2 instance. And since squid is listening on that port, squid will in turn forward any HTTP requests along, but they will now look like they are coming from your EC2 instance!
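If you only want the tunnel and not an interactive shell, ssh's -N flag skips the remote command. Here is a rough sketch that builds that variant of the command from the ports and host used in this post; the variable names and the tunnel_cmd function are invented for illustration.

```shell
# Illustrative values taken from the example above.
LOCAL_PORT=4444            # port on the laptop
REMOTE_PORT=3128           # squid's default port on the EC2 instance
PROXY_HOST=myec2.foo.com   # example hostname from this post

# Hypothetical helper: print the tunnel command instead of running it.
tunnel_cmd() {
    # -N: do not execute a remote command, just forward ports
    # -L: forward a local port to host:port as seen from the remote side
    echo "ssh -N -L ${LOCAL_PORT}:localhost:${REMOTE_PORT} ${PROXY_HOST}"
}

# Show the command so it can be copied into a terminal.
tunnel_cmd
```

Printing the command rather than executing it keeps the sketch harmless to run; paste the output into a terminal when you actually want the tunnel.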

Shazam. You can test out the setup using wget. For example, previously we couldn't fetch a package list from Amazon's yum repos. The following wget would fail:

wget http://packages.ap-southeast-2.amazonaws.com/2014.09/updates/d1c36cf420e2/x86_64/repodata/repomd.xml

wget pays attention to the http_proxy environment variable. When set, wget will send all HTTP requests through that proxy. Thus, to test out our proxy, set http_proxy with the following:

export http_proxy=http://localhost:4444
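If you would rather not export the variable for your whole session, one option is to set it for a single command only. A small sketch; with_proxy and PROXY_URL are names invented here, and the tunnel from the ssh command above is assumed to be open.

```shell
# Assumed proxy address: the local end of the SSH tunnel.
PROXY_URL=http://localhost:4444

# Hypothetical helper: run one command with http_proxy set,
# leaving the current shell's environment untouched.
with_proxy() {
    env http_proxy="$PROXY_URL" "$@"
}

# Example (needs the tunnel to be open):
# with_proxy wget http://packages.ap-southeast-2.amazonaws.com/2014.09/updates/d1c36cf420e2/x86_64/repodata/repomd.xml
```

Because env only changes the environment of the command it launches, other programs started from the same shell keep talking to the network directly.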

Once set, you should be able to re-run the wget and it should succeed! Finally, if you are in a similar situation and want to proxy all yum requests, go into /etc/yum.conf and set the following:

proxy=http://localhost:4444
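For provisioning scripts, that edit can be automated. The sketch below replaces an existing proxy= line or appends one if none exists. The function name set_yum_proxy is hypothetical, and it assumes the file contains only a [main] section (so an appended line lands in the right place) and that the URL contains no | or & characters.

```shell
# Hypothetical helper: set proxy= in a yum config file.
# Usage: set_yum_proxy <config-file> <proxy-url>
set_yum_proxy() {
    conf="$1"
    url="$2"
    if grep -q '^proxy=' "$conf"; then
        # Rewrite the existing line; '|' is the sed delimiter because URLs contain '/'.
        # Writing back with cat keeps the original file's owner and permissions.
        sed "s|^proxy=.*|proxy=${url}|" "$conf" > "$conf.tmp" \
            && cat "$conf.tmp" > "$conf" \
            && rm -f "$conf.tmp"
    else
        printf 'proxy=%s\n' "$url" >> "$conf"
    fi
}

# Example (the real file needs root):
# set_yum_proxy /etc/yum.conf http://localhost:4444
```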

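With that setting, you should be able to yum update all day long from your Vagrant machine. =) (Keep in mind that localhost here means the machine running yum, so the SSH tunnel has to be open on that machine.)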