Developers who specialise in only one language are a dying breed. Successful technologists need to master more than one, juggle many technologies and keep expanding their toolsets to maintain their edge. The same applies to organisations, which keep adding ingredients to their technology stack to overcome new obstacles. It is natural, then, for their engineers to pick the right tool for each task and to add more as they go. Managers, however, can become reluctant, fearing that the mix of technologies might grow too elaborate or peculiar. As an aspiring small startup, we wanted to calculate our exposure to all the different languages we employ in a way that is easily accessible and reproducible.
There is a “thing” called Github!
When such tasks arise, Github can be of great help with its full-fledged API, and it just so happens that we use it for all our projects, so that was the first place to check. However, when you want a holistic, organisation-level view, with statistics for all the repos or the contributions of all the members, you quickly hit your first obstacle: this is not something the API offers out of the box.
What about those guys over there?
You have probably heard about some of the popular integration services like IFTTT and Zapier, with their many integrations and seamless user experience. Their “trigger-response” model is great when the problem you are trying to solve is connecting simple events from different services. However, if you need to process relationships between events or join them with external data sources, you will need a different tool. That’s where our platform comes in handy, since we can process new streams of data alongside older archived data and provide capabilities ranging from simple data collection to machine learning. That being said, both parties can benefit greatly from combining the platforms into one workflow. Our platform gets all the wonderful integrations they currently provide, and they can enhance their current set of triggers with events fired as a result of data being processed in our platform. So we thought it would be a great opportunity to put that to the test.
Before we dive any deeper, if this is the first time you have stumbled onto one of our posts, I would recommend reading the previous blog post from my colleague, which gives an overview of our architecture, and then continuing with some of our other posts, which will give you more insight into our vision and the experiments we have carried out so far.
Our attempt to explore these synergies can be summarised as follows:
- Trigger an event each time a new commit is pushed to any of the repositories of our organisation.
- Update the existing stats for the organisation’s exposure to the different languages, along with the user-level stats across all its repos.
- Lastly, trigger notifications on interesting events (e.g. big changes in the codebase).
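To make the second step more concrete, here is a minimal Python sketch of how per-repository language breakdowns could be rolled up into organisation-wide exposure. It assumes each repository’s breakdown is a mapping of language name to bytes of code, which is the shape Github’s Languages endpoint returns. The function name `language_exposure` and the sample repositories are made up for illustration and are not taken from our Sift.

```python
from collections import Counter


def language_exposure(repo_languages):
    """Roll per-repo language byte counts up into organisation-wide shares.

    repo_languages maps a repo name to a {language: bytes} mapping.
    Returns {language: percentage of all bytes}, largest share first.
    """
    totals = Counter()
    for languages in repo_languages.values():
        totals.update(languages)  # adds the byte counts per language

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}

    return {
        language: round(100 * byte_count / grand_total, 1)
        for language, byte_count in totals.most_common()
    }


# Hypothetical sample data, not real figures from our organisation.
sample = {
    "api": {"Go": 6000, "Shell": 500},
    "frontend": {"JavaScript": 3000, "CSS": 500},
}
exposure = language_exposure(sample)
print(exposure)
```

Working with percentages rather than raw byte counts keeps the numbers comparable as the codebase grows, although byte counts are only a rough proxy for how much a language really matters to a team.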
Finding the missing pieces.
We started by looking at IFTTT, with its minimalistic and user-friendly interface. From the first few clicks it’s evident that integrations with services like Github can be completed in a matter of seconds. Unfortunately, the functionality we needed to capture the commit events happening across the repos of our organisation didn’t exist. So we decided to set that service aside for the moment and come back to it towards the end.
Next stop was Zapier, which sacrifices some simplicity to support solutions with higher complexity. Thus we were able to hook it up with Github’s API, receive the commit events we were looking for, parse them for information relevant to us and forward them to our platform. First step done! Yay!
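As a rough illustration of the parsing step, the Python sketch below reduces a push-style event to the handful of fields worth forwarding. It assumes the event mirrors the shape of Github’s push webhook payload (a `repository` object, a `ref` and a list of `commits`); what Zapier actually hands over may be shaped differently, and the function name `summarise_push` and the sample values are hypothetical.

```python
import json


def summarise_push(payload):
    """Extract the fields we care about from a push-style event.

    Assumes the payload mirrors Github's push webhook: a "repository"
    object with "full_name", a "ref" such as "refs/heads/main", and a
    "commits" list whose items carry an "author" object plus
    "added"/"modified"/"removed" file lists.
    """
    commits = payload.get("commits", [])
    files_touched = set()
    for commit in commits:
        for key in ("added", "modified", "removed"):
            files_touched.update(commit.get(key, []))

    return {
        "repo": payload["repository"]["full_name"],
        # "refs/heads/feature/x" -> "feature/x"
        "branch": payload.get("ref", "").split("/", 2)[-1],
        "authors": sorted({c["author"]["name"] for c in commits}),
        "commit_count": len(commits),
        "files_touched": len(files_touched),
    }


# Hypothetical event, trimmed to the fields used above.
event = {
    "ref": "refs/heads/main",
    "repository": {"full_name": "example-org/api"},
    "commits": [
        {"author": {"name": "alice"}, "added": ["a.go"], "modified": [], "removed": []},
        {"author": {"name": "bob"}, "added": [], "modified": ["a.go", "b.go"], "removed": []},
    ],
}
summary = summarise_push(event)
message = json.dumps(summary)  # what would be forwarded downstream
print(message)
```

Forwarding only a small summary keeps the hand-off light; the receiving side can use the repo name to fetch fresh statistics itself.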
Moving forward, the next step was to build the required Sift in the Redsift platform. This is probably not your first rodeo with us, so let’s move straight on to the computational graph, or “DAG”, that powers our solution. To capture the required metrics for the repos, we had to tap into Github’s Statistics API and the Languages endpoint for each repository. As a side note, although their API is limited only by our imagination, it does have its own set of tricks: following paginated links to assemble the full dataset for a resource, polling again for computationally expensive requests such as stats, and of course staying within their rate limits. Without further ado, here is a high-level overview of what is happening.
- The first time our Sift is installed, the user needs to authenticate with Github using OAuth, which is required for any request we make to their API. Once this is completed, we grab a snapshot of all the data up to that point and push it through our DAG, where we generate the required metrics for the organisation.
- The integration with Zapier notifies us which repo had a new commit, so we can re-fetch all the information for that repo and update our calculations and our visualisations.
- Slack commands have proven really useful, so we added a trigger there to export a snapshot of our data, with a visualisation, inside one of our channels.
- Finally, by comparing each new snapshot with the previous one, we are able to trigger events that our integration with IFTTT will listen to and then propagate them to any of the many integrations it supports.
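The snapshot comparison in the last step could look something like the Python sketch below. It assumes a snapshot is a mapping of language name to its percentage share of the codebase and treats a move of five percentage points or more as “interesting”. Both the snapshot shape and the threshold are illustrative assumptions rather than what our Sift actually uses, and delivering the resulting events to IFTTT is not shown.

```python
def detect_big_changes(previous, current, threshold=5.0):
    """Compare two {language: percentage} snapshots.

    Returns one event dict for every language whose share moved by at
    least `threshold` percentage points, including languages that appear
    in or disappear from the snapshot.
    """
    events = []
    for language in sorted(set(previous) | set(current)):
        before = previous.get(language, 0.0)
        after = current.get(language, 0.0)
        delta = after - before
        if abs(delta) >= threshold:
            events.append({
                "language": language,
                "before": before,
                "after": after,
                "delta": round(delta, 1),
            })
    return events


# Hypothetical snapshots for illustration only.
previous = {"Go": 60.0, "JavaScript": 30.0, "Shell": 10.0}
current = {"Go": 50.0, "JavaScript": 32.0, "Rust": 8.0, "Shell": 10.0}

events = detect_big_changes(previous, current)
for event in events:
    print(event)
```

Treating a missing language as a share of zero means a newly adopted or fully retired language shows up as an event without any special casing.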
We now have a service that can enhance triggers from other services such as Zapier and IFTTT by allowing data computation on the fly. Most importantly, it is easily accessible to people who are not that tech-savvy. At the moment we have the Slack integration with an in-channel command, and later on, when we launch our product, anyone will be able to use it, since Sifts are meant to be written once and used by anyone.
We are handing out a new batch of invites for our early access program in the coming months and we would love to see the interesting Sifts you can come up with. If hacking your own data and potentially creating solutions that can help thousands (or millions) of other people sounds interesting to you, why not register here for early access or email us?