Git subtree introduction

[NOTE: I recovered this post from my old wordpress blog]

We often find ourselves developing projects that depend on other vendors' libraries, or even on external software components that we produced ourselves.

Git subtree provides a way to incorporate that external project into another one (normally bigger) by copying it inside the parent one and making it share the parent’s commit history from that moment on.

This is known as a system-based approach to development, where you architect your design by taking the different interconnected projects as a whole. That strategy involves tagging, merging and pushing the whole repository constantly. One commit history to rule them all.

How does it work? Imagine we have two projects: the big one (the-backend), and the small one (the-frontend). The former is the main project, constantly changing and receiving commits at a heavy pace. The latter is the mobile web application that consumes the backend's API. It is stable, and only changes with every major backend release.

It is therefore interesting to manage both projects independently, with separate commit histories, while maintaining the cohesion of the project as a whole. One way to do it is, well, with git subtrees.

NOTE: this is an extremely simple use case of git subtree, just to get a sense of it. Any corrections are more than welcome.

the-backend/ (parent project)
the-frontend/ (child project)

Both the-backend/ and the-frontend/ are independent projects. They can be on the same server or on remote servers; only the references will change.

λ mkdir the-backend/
λ touch the-backend/file1 the-backend/file2
λ mkdir the-frontend/
λ touch the-frontend/item1 the-frontend/item2

We want to add the-frontend/ as a dependency of the-backend/ project, while letting it stand as an independent repository.

λ cd the-backend/
λ git init
λ git add .
λ git commit -a -m "Two first files added to the-backend parent project "
λ cd ../the-frontend/
λ git init
λ git add .
λ git commit -a -m "Two first files added to the-frontend child project"

Now with both repositories initialized, we add the the-frontend/ repo as a remote repo to the-backend/ project.

λ (In the-backend/ folder)
λ git remote add frontend-subtree ../the-frontend/
λ git subtree add --prefix=frontend frontend-subtree master
This creates a subtree of the-frontend/ project inside the-backend/ project under the specified prefix (the --prefix option is mandatory).

Now if we git log inside the-backend/ we see a commit message something like: “Add ‘the-frontend/’ from commit ”.
Say that now we want to add some changes in the-frontend/ project outside the-backend/ one.

λ (In the-frontend/ folder, not the-backend/frontend/)
λ touch item3
λ git add item3 && git commit item3 -m "Item3 added inside the-frontend/ outside project."
λ git log
* c557ce6 - (HEAD, master) Item3 added inside the-frontend/ outside project. (4 seconds ago) <Esteban>
* 45074d3 - Two first files added to the-frontend child project (4 days ago) <Esteban>

However, these changes are only visible in the-frontend/ project, while in the-backend/frontend/ nothing has changed:

λ ls the-backend/frontend
item1 item2
λ git log
* e009438 - (HEAD, master) Add 'the-frontend/' from commit '45074d397c99079acd20cb24e9d8b8830afcf802' (4 days ago) <Esteban>
| * 45074d3 - (frontend-subtree/master) Two first files added to the-frontend child project (4 days ago) <Esteban>
* 3ea1a67 - Two first files added to the-backend parent project (4 days ago) <Esteban>

If we want those changes to be visible in the-backend/frontend/ subtree folder, we have to run git subtree pull:

(Run it from the-backend/ root folder, above frontend/; otherwise we get a message like 'You need to run this command from the toplevel of the working tree.')

λ git subtree pull --prefix=frontend/ frontend-subtree master
λ git log
* 7dd7677 - (HEAD, master) Merge commit from the parent the-backend/ pulling changes from the the-frontend/ subtree (28 seconds ago) <Esteban>
| * c557ce6 - (frontend-subtree/master) Item3 added inside the-frontend/ outside project. (6 minutes ago) <Esteban>
* | e009438 - Add 'the-frontend/' from commit '45074d397c99079acd20cb24e9d8b8830afcf802' (4 days ago) <Esteban>
|\ \
| |/
| * 45074d3 - Two first files added to the-frontend/ child project (4 days ago) <Esteban>
* 3ea1a67 - Two first files added to the-backend/ parent project (4 days ago) <Esteban>

Notice also how the local commit from the-frontend/ repository made it into the history of the-backend/ parent project. This can be avoided by adding the --squash option to the git subtree pull command. That option compresses the pulled subtree commits into one single commit.

Nonetheless, we might as well do it the other way around. If someone working from the parent the-backend/ folder makes a change to the frontend/ subtree, and we want to see those changes reflected in the external the-frontend/ repository:

(In the-backend/frontend)

λ touch item4_from_parent
λ git add item4_from_parent && git commit item4_from_parent -m "Item4 added from the parent to frontend/ subtree folder"
λ git log
* 4a50c88 - (HEAD, master) Item4 added from the parent to frontend/ subtree folder (9 seconds ago)
( . . . )
λ git subtree push --prefix=frontend/ frontend-subtree new-branch-from-master

This will create a new branch in the-frontend/ project with those changes. It is extremely cumbersome to create a new branch in the external frontend project every time a change is made in the subtree from the parent, but pushing straight to master is rejected by default in this setup. The likely reason is that the-frontend/ is a non-bare repository with master checked out, and git refuses pushes to a checked-out branch to avoid leaving its working tree in an inconsistent state.

One alternative to git subtree is git submodules, but that is a topic for another article.


Routing in Javascript


Originally, web applications consisted of interconnected HTML documents that one could navigate through the links between them. Every time a user clicked a link on a website, a new document would be generated on the server and sent back to the browser to be rendered on their screen.

Around the year 2005 the term Single-Page Application (SPA) became popular. The term encompassed a new way of architecting websites to make them behave more like desktop applications: snappy, with graphical animations and smooth transitions between links.
This was achieved by taking advantage of JavaScript, HTML & CSS, as new APIs became available to give the browser more native-like capabilities.

SPAs are based on a single document model. This means that the web application's whole lifespan happens on a single HTML page, including the transitions between the different views. But since links no longer imply fetching and generating a new document, how are those transitions modelled? They are achieved by using a router.

What is a Javascript Router?

A JavaScript router is a key component in most frontend frameworks. It is the piece of software in charge of organizing the states of the application, switching between different views. For example, the router will render the login screen initially, and when the login is successful it will perform the transition to the user's welcome screen.

How it works.

The router will be in charge of simulating transitions between documents by watching changes on the URL. When the document is reloaded or the URL is modified somehow, it will detect that change and render the view that is associated with the new URL.

I wrote a small router in JavaScript to illustrate the idea. To begin with we need two objects, one to store the routes and another to store the templates, along with two simple functions to register them.

Templates are just one way of describing the DOM that will be generated when the transition from one route to the other is completed. The whole javascript application will live in a div element.

// Application div
const appDiv = "app";

// Both set of different routes and template generation functions
let routes = {};
let templates = {};

// Register a template (this is to mimic a template engine)
let template = (name, templateFunction) => {
  return templates[name] = templateFunction;
};

// Define the routes. Each route is described with a route path & a template to render
// when entering that path. A template can be a string (file name), or a function that
// will directly create the DOM objects.
let route = (path, template) => {
    if (typeof template == "function") {
      return routes[path] = template;
    } else if (typeof template == "string") {
      return routes[path] = templates[template];
    } else {
      // Neither a function nor a template name: nothing is registered.
      return;
    }
};

Now we will be able to register templates and routes, creating the mapping between them:

// Register the templates.
template('template1', () => {
    let myDiv = document.getElementById(appDiv);
    myDiv.innerHTML = "";
    const link1 = createLink('view1', 'Go to view1', '#/view1');
    const link2 = createLink('view2', 'Go to view2', '#/view2');

    // Append both links, otherwise only the second one would be shown.
    myDiv.appendChild(link1);
    return myDiv.appendChild(link2);
});

template('template-view1', () => {
    let myDiv = document.getElementById(appDiv);
    myDiv.innerHTML = "";
    const link1 = createDiv('view1', "<div><h1>This is View 1 </h1><a href='#/'>Go Back to Index</a></div>");
    return myDiv.appendChild(link1);
});

template('template-view2', () => {
    let myDiv = document.getElementById(appDiv);
    myDiv.innerHTML = "";
    const link2 = createDiv('view2', "<div><h1>This is View 2 </h1><a href='#/'>Go Back to Index</a></div>");
    return myDiv.appendChild(link2);
});

// Define the mappings route->template.
route('/', 'template1');
route('/view1', 'template-view1');
route('/view2', 'template-view2');

For the templates we match a template name with a function that will generate javascript elements and append the resulting DOM to the div where the application lives. This functionality in a real router would be taken over by the templating engine. For the routes, we just do the mapping between a route path and the corresponding template.

The createLink & createDiv are auxiliary functions to generate DOM:

// Generate DOM tree from a string
let createDiv = (id, xmlString) => {
    let d = document.createElement('div');
    d.id = id;
    d.innerHTML = xmlString;
    return d.firstChild;
};

// Helper function to create a link.
let createLink = (title, text, href) => {
    let a = document.createElement('a');
    let linkText = document.createTextNode(text);
    // Attach the text node, otherwise the link would render empty.
    a.appendChild(linkText);
    a.title = title;
    a.href = href;
    return a;
};

What is left is the logic to detect changes in the URL and resolve them to render the template. To do so, we listen for the load & hashchange events. The former fires when a document is loaded, and the latter when the URL hash changes.

// Give the correspondent route (template) or fail
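let resolveRoute = (route) => {
    // Looking up a missing key returns undefined instead of throwing,
    // so the check has to be explicit.
    const resolved = routes[route];
    if (typeof resolved !== "function") {
        throw new Error("The route is not defined");
    }
    return resolved;
};

// The actual router, get the current URL and generate the corresponding template
let router = (evt) => {
    const url = window.location.hash.slice(1) || "/";
    const routeResolved = resolveRoute(url);
    // Run the template function to render the view.
    routeResolved();
};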

// For first load or when routes are changed in browser url box.
window.addEventListener('load', router);
window.addEventListener('hashchange', router);
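
The lookup at the heart of the router can also be tried outside a browser. What follows is only a minimal sketch under some assumptions, not the code from the gist: pathFromHash and render are hypothetical names, the route table is a Map instead of a plain object, and the views return strings instead of touching the DOM.

```javascript
// Hypothetical route table: path -> function returning the view as a string.
const views = new Map([
  ["/", () => "index"],
  ["/view1", () => "This is View 1"],
  ["/view2", () => "This is View 2"],
]);

// Same normalisation the router applies to window.location.hash:
// drop the leading "#" and fall back to "/" when the hash is empty.
const pathFromHash = (hash) => hash.slice(1) || "/";

// Resolve the view for a given hash, failing loudly on unknown routes.
const render = (hash) => {
  const path = pathFromHash(hash);
  const view = views.get(path);
  if (typeof view !== "function") {
    throw new Error(`The route is not defined: ${path}`);
  }
  return view();
};

console.log(render(""));        // index
console.log(render("#/view1")); // This is View 1
```

In the browser version the only difference is where the hash comes from (window.location.hash) and what the view function does with the DOM.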

That's it! Of course much functionality is lacking: the use of controllers to transform data before passing it to the views, nested routes, the use of the History API, etc., but the idea of JavaScript routing is quite easy to grasp. The complete code can be found in this gist.



Have fun!

Ember Websockets & nginx integration

In a previous article I explained our approach at work to deploy an Ember.js application in an nginx server running on docker. Today I had to integrate an instance of that application to communicate with another microservice using WebSockets.

A simplified diagram of the architecture would be:

All these services run using a docker-compose script like this one (it is a very simplified version):


services:
  frontend:
    image: bde2020/ember-swarm-ui-frontend:0.6.0
    ports:
      - "88:80"
    links:
      - dispatcher:backend
    volumes:
      - ./config/frontend:/etc/nginx/conf.d

  dispatcher:
    image: semtech/mu-dispatcher:1.0.1
    links:
      - push-service:push-service
    volumes:
      - ./config/dispatcher:/config

  db:
    image: tenforce/virtuoso:1.2.0-virtuoso7.2.2

  push-service:
    image: tenforce/mu-push-service
    environment:
      - MU_SPARQL_ENDPOINT=http://database:8890/sparql
    links:
      - db:database
    ports:
      - "83:80"

The Dispatcher will proxy calls to other microservices based on the request path. This is very useful because the frontend does not need to know anything about the host names of the other microservices. More information here.

The push-service will listen for GET requests on the root path (/), open a websocket and start sending some JSON. The frontend code used to test it is taken from ember-websockets:

export default Ember.Component.extend({
  websockets: Ember.inject.service(),

  didInsertElement() {
    const socket = this.get('websockets').socketFor('ws://localhost/push-service/');
  }
});

But this is not enough: nginx needs additional configuration to open and maintain a connection using WebSockets. Luckily, nginx has supported this since version 1.3.13, and it can be activated by explicitly passing the headers that start the handshake for the WebSocket protocol.

# Set the server to proxy requests to when used in configuration
upstream backend_app {
    server backend;
}

# Server specifies the domain, and location the relative url
server {

    # WebSockets support
    location /push-service {
      proxy_pass http://backend_app;
      proxy_http_version 1.1;
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header Host $http_host;
      proxy_set_header Upgrade $http_upgrade;
      proxy_set_header Connection "upgrade";
    }
}
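
The Upgrade and Connection headers set in that location block are what mark a request as a WebSocket handshake. They are hop-by-hop headers, so nginx does not forward them to the proxied server unless told to. As a rough illustration of what the service behind the proxy looks for, here is a small sketch. The isWebSocketUpgrade helper and the sample header objects are made up for this example, and it assumes lower-cased header names, which is how Node.js exposes them.

```javascript
// Hypothetical helper: decide whether a set of request headers
// asks for a protocol switch to WebSocket.
const isWebSocketUpgrade = (headers) => {
  const upgrade = (headers["upgrade"] || "").toLowerCase();
  // "Connection" may carry several comma-separated tokens.
  const connection = (headers["connection"] || "")
    .toLowerCase()
    .split(",")
    .map((token) => token.trim());
  return upgrade === "websocket" && connection.includes("upgrade");
};

// What the backend sees when the proxy passes the headers along.
console.log(isWebSocketUpgrade({ upgrade: "websocket", connection: "Upgrade" })); // true

// What it sees when the proxy drops them: just a regular request.
console.log(isWebSocketUpgrade({ connection: "close" })); // false
```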

The call from the frontend goes to http://localhost/push-service, but the push-service only understands calls to /. Then why specify location /push-service in nginx's configuration file? Because of the Dispatcher: it detects the call to http://localhost/push-service and rewrites it to http://push-service/, a host it can reach through its link in the docker-compose.yml file.
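
To make the rewriting step more tangible, here is a sketch of prefix-based dispatching. It is only an illustration of the idea and not the mu-dispatcher's actual code or configuration: the dispatchTable structure and the rewrite function are hypothetical names invented for this example.

```javascript
// Hypothetical dispatch table: public path prefix -> internal service base URL.
const dispatchTable = [
  { prefix: "/push-service", target: "http://push-service" },
];

// Strip the matching prefix and forward the rest of the path to the service.
// Returns null when no rule matches.
const rewrite = (requestPath) => {
  for (const { prefix, target } of dispatchTable) {
    if (requestPath === prefix || requestPath.startsWith(prefix + "/")) {
      const rest = requestPath.slice(prefix.length) || "/";
      return target + rest;
    }
  }
  return null;
};

console.log(rewrite("/push-service"));  // http://push-service/
console.log(rewrite("/push-service/")); // http://push-service/
```

This way the push-service keeps serving its root path, and only the dispatcher needs to know which prefix belongs to which host.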

Have fun!


Ember & nginx docker deployment with multi-stage builds


At work we use docker as the virtualization technology of choice for our projects' deployments. We follow a microservices architecture that allows us to do rapid development & testing, quickly trying new ideas and iterating on new functionality using a modular approach, choosing the language/framework that best adapts to our needs for each particular use case.

The Problem

Our frontend stack consists of Ember.js happily running in an nginx server inside a docker container. The initial building & deployment process that we had was effective but a little cumbersome. I will use TenForce's webcat repository as an example.

Initially the Ember application is built via command line (ember build --prod), generating a file. The file is then uploaded to the repository releases with a new tag assigned.

Afterwards, when building the nginx docker image, from the Dockerfile we detect the current version of the frontend reading it from the package.json file and fetch it from the github releases, unpacking the zip file contents into the nginx serving directory.
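
The version detection is done with a grep/awk/sed pipeline in the Dockerfile below. As a sketch of what that pipeline computes, here is the same extraction written step by step in JavaScript. The extractVersion name and the sample package.json content are invented for this example.

```javascript
// Hypothetical re-implementation of:
//   cat package.json | grep version | head -n 1 | awk -F: '{ print $2 }' | sed 's/[ ",]//g'
const extractVersion = (packageJsonText) => {
  // grep version | head -n 1: first line that mentions "version"
  const line = packageJsonText.split("\n").find((l) => l.includes("version"));
  if (line === undefined) return "";
  // awk -F: '{ print $2 }': second colon-separated field
  const secondField = line.split(":")[1] || "";
  // sed 's/[ ",]//g': drop spaces, double quotes and commas
  return secondField.replace(/[ ",]/g, "");
};

// Made-up package.json content, only for illustration.
const sample = `{
  "name": "example-app",
  "version": "1.2.3",
  "private": true
}`;

console.log(extractVersion(sample));     // 1.2.3
console.log(JSON.parse(sample).version); // 1.2.3
```

The text-based approach depends on the layout of the file (one key per line, "version" appearing first as the version key), whereas parsing the JSON does not.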

The Dockerfile is self explanatory:

FROM semtech/mu-nginx-spa-proxy

MAINTAINER Aad Versteden <>

RUN apt-get update; apt-get upgrade -y; apt-get install -y unzip wget;
COPY package.json /package.json
RUN mkdir /app; cd /app; wget$(cat /package.json | grep version | head -n 1 | awk -F: '{ print $2 }' | sed 's/[ ",]//g')/
RUN cd /app; unzip; mv dist/* .
RUN rm /app/ package.json

Now this has two problems:

  • We have to manually build the ember application and upload it to the github releases url.
  • Builds are not deterministic since each person has their own node, npm, bower & ember-cli combination. This has already cost us some time figuring out why seemingly identical builds sometimes failed and sometimes did not.

The Solution

The solution came by using a combination of two new approaches:

  1. Using a docker image with node, npm, bower & ember-cli installed, therefore guaranteeing that every build is done with the same versions.
  2. Using Docker's multi-stage builds. Simply put, they allow using the output of a given image as the input of the next one, avoiding fat images and simplifying the building process.

The first part is achieved by using the docker-ember image, ensuring fixed versions for the build tools:

FROM ubuntu:16.04
MAINTAINER Aad Versteden <>

# Install nodejs as per
RUN apt-get -y update; apt-get -y install wget python build-essential git libfontconfig
RUN wget -qO- >
RUN bash
RUN apt-get -y install nodejs
RUN npm install -g bower@1.7.9
RUN echo '{ "allow_root": true }' > /root/.bowerrc
RUN npm install -g ember-cli@2.14.0


The second part is achieved by using the multi-stage build in the process, building the ember app and copying the resulting dist output folder inside nginx's serving directory.

FROM madnificent/ember:2.14.0 as ember
MAINTAINER Esteban Sastre <>

COPY . /app
RUN npm install && bower install
RUN ember build

FROM semtech/mu-nginx-spa-proxy
COPY --from=ember /app/dist /app

This way, all the building process is limited to a simple docker build .

Have fun!

Understanding HTML terminology


I know, I know. "But Esteban, this article would have been useful 20 years ago, now it is a little outdated to say the least". I cannot disagree with that, but this kind of post serves more as a reminder to me, and perhaps also to satisfy your curiosity. I don't have a comments section, but feel free to reach out to me over twitter, email or linkedin.


When the Web was born, it was a system of internet servers that allowed documents to be accessed via a web browser. From those documents you could reach others via links, as well as other formats like graphics, video, or audio, forming a big network of interconnected documents called the web.

Web browsers (at the time) were simple programs running on users' machines that would fetch the document from a web server, read it, and show it on the user's screen.

HTML was born as a declarative way to structure those documents to tell the web browsers how they should paint and show the documents. HTML stands for HyperText Markup Language. Hypertext describes the ability to link to other documents from the current one, and markup defines the structure of a web page or a web application. An example document would be:

<html>
  <head>
    <title>My Website</title>
  </head>
  <body>
    <section>
      <p>My paragraph</p>
      <p>My other paragraph</p>
    </section>
  </body>
</html>

So the web browser would fetch that document from some server somewhere, and start interpreting its content: "Oh hey, this is an HTML document! And.. oh yes, it has a title of My Website, so I will write that into the tab title, and also I see you have a document body, with a paragraph inside a section! So I will paint that paragraph. But then I have another paragraph, so this means they must be separated by an empty line, since they are different blocks."

One important characteristic about HTML is that it is not strictly parsed. It means that in the event of receiving wrong code, for instance an unclosed tag, the web browser won't fail to load and show the page but will do the best it can to correct the mistake and paint the document.

So the author of the document wrote that "code" seen before, and the user who visited the web page of the author would see this:

It really is as simple as that. Leaving aside CSS to style those documents, giving them colours, shapes, you name it, and JavaScript to interact with them, websites were just text documents written in a concrete way.


The Standard Generalized Markup Language came before HTML. One could say that HTML was derived from SGML although they were developed more or less in parallel. HTML would focus more on how the data reflected in the document looks. SGML is more generic, it is a (meta)language to define other markup languages, while in HTML you have a limited set of tags that define the structure of the document.

With SGML, you would need to specify:

  • The SGML declaration, enumerating the characters and delimiters that may appear in the application. You can find the charset declaration for HTML 4.0 here.
  • The Document Type Definition, defining the syntax of the markup constructs, for example:
<!DOCTYPE tvguide [
<!ELEMENT tvguide - - (date,channel+)>
<!ELEMENT date - - (#PCDATA)>
<!ELEMENT channel - - (channel_name,format?,program*)>
<!ATTLIST channel teletext (yes|no) "no">
<!ELEMENT format - - (#PCDATA)>
<!ELEMENT program - - (name,start_time,(end_time|duration))>
<!ATTLIST program
     min_age CDATA #REQUIRED
     lang CDATA "es">
<!ELEMENT name - - (#PCDATA)>
<!ELEMENT start_time - - (#PCDATA)>
<!ELEMENT end_time - - (#PCDATA)>
<!ELEMENT duration - - (#PCDATA)>
]>
  • A specification describing the semantics of the markup.
  • Document instances containing data and markup.


XML was based on SGML and was designed to describe a set of rules that encode data in a format that is both human-readable and machine-readable. It was designed to focus primarily on what the data is, rather than on how it is represented.

That is why it is often used as the data exchange format across services over the internet. An example of XML would be: (examples taken from

<?xml version="1.0" encoding="UTF-8"?>
  <body>Don't forget me this weekend!</body>
  <body>Who cares right?</body>

The first line specifies the version and encoding. The rest of the document represents two notes, with information associated to them. A service can receive this xml document, parse it and do something with the data.

XML documents also can have a DTD just as the SGML documents, formalizing the structure of the document and describing a common ground to exchange the data for multiple users. You can add a DTD adding this line to the document: <!DOCTYPE note SYSTEM "Note.dtd">


<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT heading (#PCDATA)>
]>

XML documents have other features like XML Schema, XML namespaces or XPath to name a few, but they are specific mechanisms for the xml documents. More information on w3schools


XHTML is simply HTML expressed as valid XML. It has the same functionality, but complies with the stricter rules of the XML standard. This means that rules whose violation was tolerated in HTML must be strictly followed in XHTML. For example:

  • In HTML you can write <br>, in XHTML it must be <br></br> or <br/> or <br />.
  • In HTML you can write <em><strong>Text</em></strong>, in XHTML it has to be <em><strong>Text</strong></em>, following the correct opening/closing order.
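
Both rules boil down to "every opened tag must be closed, in reverse order of opening". As a naive sketch of that check (a regex plus a stack, nowhere near a real XML parser; isProperlyNested is a name made up for this example):

```javascript
// Naive nesting check: push opening tags on a stack, pop on closing tags.
// Self-closing tags such as <br/> are skipped. Attributes are ignored.
const isProperlyNested = (markup) => {
  const stack = [];
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*?(\/?)>/g;
  for (const [, closing, name, selfClosing] of markup.matchAll(tagPattern)) {
    if (selfClosing) continue;
    if (!closing) {
      stack.push(name);
      continue;
    }
    // A closing tag must match the most recently opened one.
    if (stack.pop() !== name) return false;
  }
  // Anything left on the stack was never closed.
  return stack.length === 0;
};

console.log(isProperlyNested("<em><strong>Text</strong></em>")); // true
console.log(isProperlyNested("<em><strong>Text</em></strong>")); // false
console.log(isProperlyNested("<p>line<br/>break</p>"));          // true
console.log(isProperlyNested("<p>line<br>break</p>"));           // false
```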


The term DHTML (Dynamic HTML) was introduced by Microsoft when Internet Explorer 4 came out, and it has no clear meaning. It encompasses the set of technologies that make it possible to create interactive and animated web sites. This means that a site made with HTML, styled with CSS and made interactive using JavaScript would fall into the DHTML category.