Buddycloud Blog

imaginator

26 March 2014

Using Docker with GitHub and Jenkins for repeatable deployments

Docker helps us make repeatable deployments to Buddycloud’s testing and production environments. We use Docker to run Buddycloud’s hosted environment and deploy after every commit to our testing and production Git branches.

Again and again.

Automatically.

Background: we’re holed up in the Austrian mountains working on the new Buddycloud hosting platform. We need to be able to reliably test and deploy code. What generally happens is that, unless servers are completely rebuilt between test and production runs, they slowly degrade over time. A new library here, a config change there. And soon your tests are no longer running against a vanilla environment.


Also, we want to test on exactly the same environment that we run in production.

Core services and Docker deployments

What services should run under Docker, and what are the “core” services that other apps use?

We separate core services like databases, DNS and LDAP (for server authentication) from the services that are updated and managed by Docker. The core services are our plumbing - they just have to be there.

This has the knock-on effect that when we deploy a service that uses a database, the deployed service MUST take care of ensuring that the database is at the correct schema version.

This is a good thing: it keeps us honest about upgrades working smoothly and catches failures before they hit production.
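
For example, a deploy or startup script can refuse to bring the new version up until the schema matches what the code expects. A minimal sketch, assuming a Postgres database with a hypothetical schema_version table and migration script (none of these names are our actual tooling):

# abort the deployment if the database schema is behind
EXPECTED=12
CURRENT=$(psql -tA -c 'SELECT max(version) FROM schema_version' hosting)
if [ "$CURRENT" -lt "$EXPECTED" ]; then
  ./migrate.sh "$EXPECTED" || exit 1
fi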


By default Docker exposes no ports. We need to expose some ports to the outside world, and some ports to localhost so that processes can communicate with each other. For example, our Apache instances run a reverse proxy back to the buddycloud-http-api. The ports are exposed using the docker startup commands.

For example, Apache uses:

  
# run the Apache image detached, set its hostname, and publish HTTP/HTTPS to the world
docker run -d --name=apache -h=si.buddycloud.com -P -p 80:80 -p 443:443 -t apache
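
Ports that should only be reachable from the host itself can be bound to 127.0.0.1, which keeps a service private while still letting Apache reverse-proxy to it. A sketch, assuming the buddycloud-http-api container listens on port 9123 (the port number here is an assumption):

# publish the API port on localhost only, so only the reverse proxy can reach it
docker run -d --name=api -p 127.0.0.1:9123:9123 -t buddycloud-http-api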


So we designed the following code flow: it takes code from a developer’s laptop to SI (system integration) and, if that’s passing, we manually make a call to push to production.

[diagram: code flow from developer laptop to system integration to production]

We ended up with the following code-to-deployment flow:

  1. A developer checks out https://github.com/buddycloud/hosting.git and works on their change or feature branch.
  2. When they are ready to test in the system integration environment, they commit back to the dev-candy branch.
  3. GitHub triggers a Jenkins hook that fires off the scripts from https://github.com/buddycloud/buddycloud-package/tree/master/projects/hosting/docker
  4. Jenkins tries to build the project.
  5. If the build succeeds, Jenkins calls the Docker scripts that deploy to either the production or system integration environment (sketched below).
  6. The service is started and visible on https://hosting.buddycloud.com
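
The Docker deploy step in (5) boils down to something like the following. This is a minimal sketch of the idea, not the actual buddycloud-package scripts; the image and container names are assumptions:

#!/bin/sh
set -e

# build a fresh image from the branch Jenkins just checked out
docker build -t hosting:candidate .

# replace the running container; the previous image stays available for rollback
docker stop hosting || true
docker rm hosting || true
docker run -d --name=hosting -P -t hosting:candidate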

Buddycloud helps you quickly add communication to your app. If you would like to try it out, see the Buddycloud Getting Started page.


TAGS: docker buddycloud

abmargb

7 February 2014

Buddycloud Crawler in XEP-0060 and the Buddycloud XEP

A couple of years ago I wrote the buddycloud channel directory as my GSoC’12 project and as my first buddycloud project to go live in production.

The component is running fine at search.buddycloud.org; however, the crawler wasn’t crawling as we required. The crawler was designed on top of XEP-0060 and uses Smack with its pubsub extensions to discover Buddycloud channels and items on the Buddycloud channel server.

But then comes a day when you get the guts to look at old code. Reason: the crawler was dumb. It crawled every single item of every single node every time it ran. Stupid, eh? But there’s more: by using Smack’s XEP-0060 implementation, the crawler missed an important feature that the Buddycloud XEP specifies - XEP-0059, a.k.a. RSM (Result Set Management).

The Buddycloud XEP uses RSM on several occasions, e.g. disco#items and pubsub <items>:

<iq type='result' from='channelserver.example.com'
    to='channeluser@example.com/KillerApp' id='items1'>
  <pubsub xmlns='http://jabber.org/protocol/pubsub'>
    <items node='/user/kitteh@example.com/posts'>
      <item id='1291048810046'>
        <entry xmlns='http://www.w3.org/2005/Atom'
               xmlns:thr='http://purl.org/syndication/thread/1.0'>
          <author>
            <name>Dirk</name>
            <jid>fahrertuer@example.com</jid>
          </author>
          <content type='text'>A comment, wondering what all this testing does</content>
          <published>2010-11-29T16:40:10Z</published>
          <updated>2010-11-29T16:40:10Z</updated>
          <id>/user/channeluser@example.com/posts:1291048810046</id>
        </entry>
      </item>
      <!-- [more items…] -->
    </items>
    <set xmlns='http://jabber.org/protocol/rsm'>
      <first>1291048810046</first>
      <last>1131048810046</last>
      <count>19827</count>
    </set>
  </pubsub>
</iq>

Thus, we needed to somehow integrate RSM into Smack to get the crawler working properly, so that it could page through all the results that the channel server provides and avoid looking into the past when not needed.

Finally, we achieved that by creating a class that works as a packet extension, called RSMSet, and by overriding some of the PubSub classes offered by Smack, such as the PubSub class itself. Now, when we need to do RSM, we ask the PubSubManager for a new PubSub packet and then inject the RSMSet extension into it. As in:

// Build a disco#items request that resumes where the previous crawl stopped:
// setting the RSM <after> field to the last item seen makes the channel
// server return only the next page of results.
DiscoverItems discoRequest = new DiscoverItems();
discoRequest.setTo(discoverInfo.getFrom());
RSMSet rsmSet = new RSMSet();
rsmSet.setAfter(discoverInfo.getRsmSet().getLast());
discoRequest.setRsmSet(rsmSet);

Then we store the last item crawled for every node in the directory’s database, so the crawler can be smarter about what to crawl.

Job done.

Next step on performance improvement list: crawl the /firehose when possible.

imaginator

26 January 2014

Buddycloud in the Mountains Update

We have enough food, drink, wood for the fire, good wifi and it looks like we might be snowed in. We’re in the Austrian mountains working on a really simple and quick way to host a new Buddycloud site.

Some background: We have worked hard to organise our install docs and to make buddycloud easy to install with .deb packages. Guilherme has also worked on a protocol tester to check that buddycloud instances really are installed and working correctly.

Buddycloud’s design is to be as modular as possible. This gives you choice:

  • Don’t like the media server? Slide it out and use a replacement.
  • Don’t like how we do the webclient? Slide it out and use your own frontend.

The problem with this approach is that there are more moving pieces to set up and more places to make a mistake at install time. And this is before you have even decided that Buddycloud is right for your needs.

So we’re building a way for you to get your own Buddycloud and XMPP account. When you grow, you can spin up your own Buddycloud site on your own hardware and migrate your data over.

The hosting platform (at least v0.0.1) will:

  • host your buddycloud on a sub-domain (e.g. megacorp.buddycloud.net)

  • run your own buddycloud website, API, buddycloud service and media service

  • handle user management: invite, add, delete and disable users

  • give you a nice UI and API to manage everything

  • ship tested updates using Docker (which helps us ship working code)

Expect more news and code soon.

Rene starting the fire

Doors are not very big


imaginator

17 January 2014

Buddycloud in the Mountains

Imagine 5 days of being in a snowbound cabin deep in the mountains. Eating good food and hacking on great code.

We’re doing something a little special: going to the mountains for a few days of distraction free hacking. 

The plan: use the time to build a Buddycloud hosting platform that helps you spin up and run your own buddycloud site (we’ll be prepping everything beforehand so that we can hit the ground with a good spec and an idea of how this will work):

  • ui / ux
  • backend code
  • DNS updating
  • adaptations to the buddycloud server
  • documentation
  • deployment

Details:

  • we will be staying at the Mesnerhof house (15 mins to the nearest town). More photos on the Facebook Album
  • from Saturday 25th January - Thursday 30th January (before the XMPP summit and FOSDEM)
  • food: I’ll be bringing in a lot of food and wine. Then we’re cooking and eating in the house
  • getting there: There is a pickup from Munich on Saturday.
We depart on Thursday morning and fly to the XMPP summit in Brussels, followed by FOSDEM, where we are looking forward to presenting the results of our mountain hacking.


imaginator

31 August 2013

Scaling buddycloud with Fanout.io

We want the buddycloud social network stack to be useful for everyone. From hobbyists to large service providers. In order to help address the scaling needs at the higher end of the spectrum, we’ve added support for Fanout.io in buddycloud’s HTTP API component.

Fanout.io is a cloud service that handles realtime data delivery. It’s kind of like Pusher and PubNub, but with an emphasis on APIs. Normally, the buddycloud HTTP API component sends realtime updates to clients directly. To scale up to really large numbers of connections it becomes necessary to distribute the connection pooling across multiple servers. This is where Fanout can help. Rather than having to add extra servers on your own to handle high load, you can route the pushes through the Fanout CDN instead. This way, you get limitless scaling of realtime pushes without having to manage any additional servers.

To use it, just put your Fanout realm & key in the gripProxies section of the buddycloud HTTP API’s config.js:

exports.production = {
  ...
  gripProxies: [
    ...
    // fanout.io
    {
      key: new Buffer('{key}', 'base64'),
      controlUri: 'http://api.fanout.io/realm/{realm}',
      controlIss: '{realm}'
    }
  ]
}

Then, make sure your API domain is registered within Fanout and set as a CNAME:

api.yourdomain.com  IN  CNAME  httpfront-{id}.fanout.io

That’s all it takes. If you’re using buddycloud as part of a large service then you can rest easy knowing that realtime updates will never be a bottleneck. Thousands of people could have a channel page open and see updates instantly. We’ve got this enabled on our demo site as well.

How we did it

Some months back, Fanout’s founder, Justin Karneges, approached us about a possible integration. At first we were skeptical. We understood the value of a CDN, but it was important to us that buddycloud always have the ability to run standalone and not depend on a third-party service. Therefore, the feature would have to be optional. Additionally, we didn’t want this to break our API. We already had a long-polling API defined, and we didn’t want the API to be different depending on whether Fanout routing was enabled or not.

Fanout uses a concept called “GRIP” proxying that addresses all of these concerns. The approach dictates that a special proxy server sit in front of our webserver, handling the work of pushing out lots of data to clients. Because it acts as an HTTP proxy, we’re able to retain our API. Our webserver then speaks the open GRIP protocol to the proxy. This is the same protocol that the open source Pushpin proxy uses as well. What’s neat is that buddycloud only needs to support the GRIP protocol, and it can be fronted by either a local Pushpin instance or the Fanout cloud. Or any future GRIP-supporting proxy. This means we have a singular code path, and the code isn’t even Fanout-specific. We like standards.

Below is a diagram of how a standalone (non-Fanout) server setup looks:

[diagram: standalone setup with a local Pushpin instance in front of the buddycloud HTTP API]

In this case, a local Pushpin instance sits on the same physical server as the buddycloud HTTP API component, so that the server itself is capable of performing its own realtime deliveries. This means the server maintains all the open connections with any clients. The end result is more or less the same as before we converted the HTTP API code to use GRIP, although we hope the code will be more maintainable this way, too.

Now for the fun part. If you configure buddycloud to use the Fanout cloud service, then you essentially replace Pushpin in the scheme:

[diagram: the Fanout CDN replacing the local Pushpin instance]

Here the Fanout CDN represents Fanout’s global cluster of servers. With this setup, the buddycloud server doesn’t need to maintain any of the open connections, and scaling concerns are handled by Fanout.

Code

There were two main changes we had to make to get this to work. First, at any place where we were holding an HTTP request open as a long-poll, we changed the code to reply immediately with a special GRIP response instead. This special response tells the proxy service to hold the connection open on our behalf. The held connection is bound to a channel identifier so that we can later push to it using that channel. Second, whenever new data is available we now publish it to the GRIP service. What’s nice is we publish the data regardless of whether or not there are any held connections. This means we no longer need to deal with connection management.

The buddycloud HTTP API component is written in Node with Express, so we used the Nodegrip library for our GRIP support. Below we define some utility methods.

First, the sendHoldResponse() method sends an HTTP response with GRIP instructions. It specifies the channel to bind to and defines the HTTP response to use in case of a timeout. We use this method wherever we need to cause a request to long-poll.

var griplib = require('grip');

exports.sendHoldResponse = function(req, res, channelBase, prevId) {
  var origin = req.header('Origin', '*');
  if (origin == 'null') {
    origin = '*';
  }
  var headers = {};
  headers['Access-Control-Allow-Origin'] = origin;
  headers['Access-Control-Allow-Credentials'] = 'true';
  headers['Access-Control-Allow-Methods'] = 'GET, POST, PUT, DELETE';
  headers['Access-Control-Allow-Headers'] =
    'Authorization, Content-Type, X-Requested-With, X-Session-Id';
  headers['Access-Control-Expose-Headers'] = 'Location, X-Session-Id';

  var channel;
  var contentType;
  var body;
  if (req.accepts('application/atom+xml')) {
    channel = channelBase + '-atom';
    contentType = 'application/atom+xml';
    body = '<?xml version="1.0" encoding="utf-8"?>\n' +
      '<feed xmlns="http://www.w3.org/2005/Atom"/>\n';
  } else if (req.accepts('application/json')) {
    channel = channelBase + '-json';
    contentType = 'application/json';
    body = {};
    body['last_cursor'] = prevId;
    body['items'] = [];
    body = JSON.stringify(body);
  } else {
    res.send(406);
    return;
  }

  var channelObj = new griplib.Channel(channel, prevId);
  headers['Content-Type'] = contentType;
  var response = new griplib.Response({'headers': headers, 'body': body});
  var instruct = griplib.createHoldResponse(channelObj, response);

  console.log('grip: sending hold for channel ' + channel);
  res.send(instruct, {'Content-Type': 'application/grip-instruct'});
};

The code’s a bit large since we want our timeout response to support CORS along with both Atom and JSON encodings. The actual GRIP stuff is in the last few lines.

Next, the publishAtomResponse() method pushes the HTTP response that we want to send to any connected clients. As we want to support both Atom and JSON formats, this is done by having a channel for each format (suffixed with “-atom” or “-json”). We bind to one or the other channel based on the Accept header in a request, and whenever we publish we send to both channels.

exports.publishAtomResponse = function(origin, channelBase, doc, id, prevId) {
  if (origin == 'null') {
    origin = '*';
  }
  var headers = {};
  headers['Access-Control-Allow-Origin'] = origin;
  headers['Access-Control-Allow-Credentials'] = 'true';
  headers['Access-Control-Allow-Methods'] = 'GET, POST, PUT, DELETE';
  headers['Access-Control-Allow-Headers'] =
    'Authorization, Content-Type, X-Requested-With, X-Session-Id';
  headers['Access-Control-Expose-Headers'] = 'Location, X-Session-Id';

  headers['Content-Type'] = 'application/atom+xml';
  var response = doc.toString();
  var channel = channelBase + '-atom';
  console.log('grip: publishing on channel ' + channel);
  grip.publish(channel, id, prevId, headers, response);

  headers['Content-Type'] = 'application/json';
  if (id != null) {
    response = {};
    response['last_cursor'] = id;
    response['items'] = atom.toJSON(doc);
    response = JSON.stringify(response);
  } else {
    response = JSON.stringify(atom.toJSON(doc));
  }
  channel = channelBase + '-json';
  console.log('grip: publishing on channel ' + channel);
  grip.publish(channel, id, prevId, headers, response);
};

The grip.publish() method seen above is another convenience method. It handles the GRIP payload formatting, and also supports publishing to multiple GRIP proxies at once:

var pubs = null;  // lazily initialized list of PubControl clients, one per proxy

exports.publish = function(channel, id, prevId, rheaders, rbody, sbody) {
  if (!config.gripProxies || config.gripProxies.length < 1) {
    return;
  }

  if (!pubs) {
    pubs = [];
    for(var i = 0; i < config.gripProxies.length; ++i) {
      var gripProxy = config.gripProxies[i];
      var auth = null;
      if(gripProxy.controlIss) {
        auth = new pubcontrol.Auth.AuthJwt({'iss': gripProxy.controlIss},
          gripProxy.key);
      }
      pubs.push(new pubcontrol.PubControl(gripProxy.controlUri, auth));
    }
  }

  var formats = [];
  if (rbody != null) {
    formats.push(new griplib.HttpResponseFormat(200, 'OK', rheaders, rbody));
  }
  if (sbody != null) {
    formats.push(new griplib.HttpStreamFormat(sbody));
  }

  var item = new pubcontrol.Item(formats, id, prevId);

  for(var i = 0; i < config.gripProxies.length; ++i) {
    (function() {
      var gripProxy = config.gripProxies[i];
      pubs[i].publish(channel, item, function(success, message) {
        if (!success) {
          console.log("grip: failed to publish to " + gripProxy.controlUri +
            ", reason: " + message);
        }
      });
    }());
  }
};

With these convenience methods in place, implementing our /notifications/posts endpoint was easy. When a request is received but we have no items to deliver, we call sendHoldResponse() to cause the request to stay open. Whenever a new item becomes available, we call publishAtomResponse() to have it delivered down any open connections.

The buddycloud HTTP API maintains an item queue per logged-in user, with each one backed by a persistent XMPP session. The GRIP channels are therefore unique per user, using the form “np-{jid}” (“np” standing for “notifications/posts”). Requests by unauthenticated users share a single anonymous JID, and that’s where the big scaling win is. If there are 1000 anonymous browsers viewing a channel, the GRIP service will deliver updates to all of them with a single push from the buddycloud HTTP API. In a future revision we will look into how to better scale updates for logged-in users.
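
To make the flow concrete, here is roughly how the pieces fit together in an Express route. This is an illustrative sketch, not the actual buddycloud-http-api source: jidForRequest() and pendingItems() are hypothetical helpers standing in for the real session logic, and we assume the methods above are exported from a module required as grip.

var express = require('express');
var grip = require('./grip');  // the helper module sketched above

var app = express();

// Hypothetical stand-ins for the real XMPP session logic:
function jidForRequest(req) { return 'anon@buddycloud.org'; }
function pendingItems(jid, since) { return []; }

app.get('/notifications/posts', function(req, res) {
  var jid = jidForRequest(req);    // unauthenticated users share one JID
  var channelBase = 'np-' + jid;   // per-user GRIP channel base
  var since = req.query.since;     // e.g. "cursor:1"

  var items = pendingItems(jid, since);
  if (items.length === 0) {
    // Nothing to deliver yet: reply with GRIP instructions so the proxy
    // holds the connection open, bound to this user's channels.
    grip.sendHoldResponse(req, res, channelBase, since);
  } else {
    res.send(items);
  }
});

// Later, when the XMPP session produces a new item for this user, push it
// down any held connections (both the Atom and JSON channels are published):
// grip.publishAtomResponse(origin, 'np-' + jid, doc, newCursor, prevCursor);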

API

With our changes in place, what does an API call look like? (long lines wrapped)

GET /api/notifications/posts?since=cursor:1
Host: demo.buddycloud.org
Accept: application/json

HTTP/1.1 200 OK
Access-Control-Allow-Credentials: true
Access-Control-Allow-Headers: Authorization, Content-Type, X-Requested-With, X
-Session-Id
Access-Control-Allow-Methods: GET, POST, PUT, DELETE
Access-Control-Expose-Headers: Location, X-Session-Id
Access-Control-Allow-Origin: *
Content-Length: 250
Content-Type: application/json

{"last_cursor":"2","items":[{"id":"02befca2-5b5f-4ae1-801f-5d2113bf8ba5","sour
ce":"test@topics.buddycloud.org/posts","author":"justin@buddycloud.org","publi
shed":"2013-05-13T20:17:56.537Z","updated":"2013-05-13T20:17:56.537Z","content
":"test post"}]}

Answer: it looks exactly the same as it did before this change. That’s the whole point. There may be a proxy service in the middle, but to the outside world it looks like buddycloud’s API.

Multiple GRIP configurations

The buddycloud HTTP API can be configured to use multiple GRIP proxies at once. This makes it possible for the component to work with both a local Pushpin instance and the Fanout.io service simultaneously. To do this, keep Pushpin in the network path, and set upstream_key in pushpin.conf with your Fanout key:

upstream_key=base64:{key}

The resulting topology looks like this:

[diagram: buddycloud HTTP API publishing through both a local Pushpin instance and Fanout]

Why would you want to do this? The main reason is that it helps with transitioning from a non-Fanout setup to a Fanout setup. With this configuration, the buddycloud HTTP API publishes messages to both services, such that either one is capable of handling realtime connections at any moment. This can be very useful right when you set your CNAME domain to point at Fanout. While you’re waiting for the DNS change to propagate, realtime pushes will still work via the local Pushpin instance. In fact, turning Fanout on and off at this point is just a matter of a DNS change; no additional configuration or restarts of buddycloud are required. You can even point your /etc/hosts file at Fanout within your test environment to confirm everything is fine before making the official DNS change. This greatly reduces the risk involved in configuring and activating Fanout.
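
For example, in a test environment the override is a single line (the IP here is a placeholder for whatever httpfront-{id}.fanout.io resolves to in your case):

# /etc/hosts - point the API domain at Fanout before touching public DNS
203.0.113.10  api.yourdomain.com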

Conclusion

We are really impressed with Fanout’s approach to scaling realtime. Of all the similar services available, it’s the only one we could ever think of using, and we love the very open way it goes about things. buddycloud will soon be offering a hosted service for large businesses, and it’s nice to know that the scalability of realtime deliveries is one less thing we’ll have to worry about.


imaginator

23 April 2013

Presenting at realtimeconf.eu


imaginator

12 February 2013

buddycloud presenting at the Realtime Conference: April 22, 23 2013 Lyon, France

The Realtime conference goes from strength to strength and we’re excited to be invited to present there.

If you build amazing services on the internet you really want to get there.

The conference is a smaller, more intimate gathering of experts, and that means tickets are limited. To join us, get yours at http://realtimeconf.eu/

Date check: APRIL 22, 23 2013 LYON, FRANCE


imaginator

11 February 2013

Last week we presented at FOSDEM under the theme of Secure, Federated Communication. Here’s the video.


TAGS: FOSDEM buddycloud secure communication federation

imaginator

1 February 2013

The buddycloud FOSDEM release

Today, just in time for FOSDEM, we’re announcing version 0.9 of the buddycloud webclient and updates to all the buddycloud components.

And everything is nicely packaged at downloads.buddycloud.com

This release also includes new releases of the 

We’ll be demoing the new software at FOSDEM and are ready to help you set up your own buddycloud instance.

Find us at the Realtime Lounge at FOSDEM (K Building, level 2).

See you at FOSDEM. Until then, this is what we’re doing:

[image]


imaginator

29 January 2013

Easily Upload Pictures and Video Into Your buddycloud Channel

From the buddycloud department of extreme-alphaware:

Abmar threw together a quick Android upload intent that will post to your channel. Nothing more, nothing less.

To try it out, check out: http://dl.dropbox.com/u/65800395/buddycloud.apk 

I’ll be using it at FOSDEM to keep the pictures flowing.
