Documentation of a project

In my previous post I described 10 steps we should take to improve the security of web applications. In this article I'm going to describe the purpose of documenting a project and what information should be included.
Every successful project requires documentation to communicate the project goal and the requirements for reaching it.

Purpose of documentation

The reason documentation exists is to communicate in clear, understandable language with all people involved. For web applications this documentation should describe the idea as a concept, the steps needed to convert the idea into an application, which tools, applications and services are required in the process of building the application, how the application is architected and who the people involved are.

Since documentation is never complete, a good structure is required to allow additions or corrections to be made over time. I prefer to use a wiki-type platform to write my documentation. A wiki often has a simplified markup language and allows multiple editors to collaborate on the documentation. Wikipedia is a great example of using a wiki to write documentation. Other tools work equally well, such as Google Docs, Microsoft SharePoint or Microsoft Word documents on a shared drive in your organisation.

For all our projects we use Phabricator, as we can integrate references to tasks, commits, users and a whole lot more into our documentation wiki, making it a true source of reference. With Phabricator we're able to create global knowledge in our team and in our whole organisation.

Example screens of Phabricator

Structuring documentation

As mentioned earlier, documentation changes over time for a lot of reasons. Here are a few common reasons why I needed to change or add documentation:
  • Ideas change over time
  • Team members come and go
  • Unforeseen problems have surfaced
  • More features are requested
  • Unexpected events occurred
  • Hardware requirements changed
  • Regulation, compliance or company structure changed
  • ...
To prepare your documentation for these and other changes, a documentation structure is required. What works for me and the people I work with is a structure based on "expanding knowledge". This is a structure that allows your documentation to grow, without becoming a single massive document.
  • project name: just a name you give your project
    • base: this is the front-cover of the project containing just a brief description of the idea and goal of the project. It also serves as the base for your documentation hierarchy.
      • project: This is a container location where you can create documentation describing the project phases or goals like "Initial phase", "MVP", "Alpha release", "RC4", etc…
      • architecture: This is also a container location, where we describe all the requirements needed to run our application, segmented into specific environments like development, testing, staging and production. We even provide a segment to describe our CI/CD. Examples are hardware details, cloud services and applications like PHP, MySQL and Nginx, but also the network, certificates and optionally what types of secrets are required to access and interact with our systems (not the actual certificates and secrets).
      • meeting notes: I like to keep meeting notes part of a project because it helps explain why certain choices were made and gives full transparency to the team
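The hierarchy above can be sketched as a small script that lists the wiki pages to create for a new project. This is only a minimal illustration in Python: the section slugs, the `page_paths` helper and the project name "acme-website" are assumptions made for this example, and nothing here calls Phabricator itself.

```python
from __future__ import annotations

from pathlib import PurePosixPath

# Hypothetical page slugs mirroring the "expanding knowledge" structure.
# The child pages are examples only; add or rename them as the project grows.
SECTIONS: dict[str, list[str]] = {
    "project": ["initial-phase", "mvp", "alpha-release"],
    "architecture": ["development", "testing", "staging", "production", "ci-cd"],
    "meeting-notes": [],
}


def page_paths(project_name: str, sections: dict[str, list[str]] = SECTIONS) -> list[str]:
    """Return every wiki page path for a project, with the base page first."""
    base = PurePosixPath(project_name)
    paths = [str(base)]
    for section, children in sections.items():
        paths.append(str(base / section))
        for child in children:
            paths.append(str(base / section / child))
    return paths


if __name__ == "__main__":
    # Print the page tree for a hypothetical project.
    for path in page_paths("acme-website"):
        print(path)
```

Because the structure is plain data, adding a new phase or environment later means adding one entry rather than reorganising a single large document.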
At the start of every project I'm also using Phabricator's feature to create a "space", "team", "project" and "tag". Let me first explain what they mean.
  • project: identifies the main project and can have sub-projects
  • team: allows me to select members of a team, often given the name of the project like "Acme Website Team" or "Corp CRM Refactoring Team", which I can assign to a project or sub-project.
  • space: this is a restricted area that I can assign to a project, giving the project team members permission to access the information kept within the space
  • tag: I use tags just to label certain aspects in the lifecycle of a project. Tags are common across multiple projects like "meeting", "infrastructure", "easy pick", etc… and allow users to find information more quickly
The combination of spaces, teams, projects and tags with the documentation architecture described above gives me and the teams I work with clarity and insights about what needs to be done and what goals are to be reached.

Security aspect

As you're defining the components needed to work on your web application, don't forget to include security in your documentation! Most projects I have worked on were approached from a functional perspective: features and functionality are well documented, the architecture is defined in terms of hardware and software requirements, and it is clear who is doing what, yet security is often left out.

What I define as "security" documentation covers the following:
  • servers: the physical, virtual or container-based systems where (parts of) your project resides, files are kept and data is persisted
  • network: the physical or virtual connection between all the components, sockets included
  • protocol: the way components of your application are communicating with each other
  • direction: the direction communication goes, sometimes one way, sometimes both ways
  • ports: which ports are required to communicate over? Even though this can be considered part of the network documentation, I consider it important enough to give it a specific mention
  • operating system: the choice of operating system is important for the purpose of the application and defines the tools to be used to install the required services on it
  • services: each component of your application requires a separate service and it's a good practice to define them in advance, including the technology stack in which you build your web application
  • secrets and certificates: define upfront what type of certificates you need and how secrets are managed so you have a procedure ready when you need to apply it (I like to use HashiCorp Vault for management of these "secrets")
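One way to keep the network, protocol, direction and port items consistent is to record each connection as structured data. The following is a minimal Python sketch under stated assumptions: the `Connection` class, its field names and the two example connections (Nginx to PHP-FPM, PHP-FPM to MySQL) are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from enum import Enum


class Direction(Enum):
    ONE_WAY = "one-way"
    BOTH_WAYS = "both-ways"


@dataclass(frozen=True)
class Connection:
    """One documented link between two components of the application."""
    source: str
    target: str
    protocol: str
    port: int
    direction: Direction

    def describe(self) -> str:
        arrow = "->" if self.direction is Direction.ONE_WAY else "<->"
        return f"{self.source} {arrow} {self.target} ({self.protocol}/{self.port})"


def validate(connections):
    """Return a list of problems found in the documented connections."""
    problems = []
    for c in connections:
        if not 1 <= c.port <= 65535:
            problems.append(f"{c.describe()}: port out of range")
    return problems


def ports_by_target(connections):
    """Group the documented ports per target component, sorted ascending."""
    ports = {}
    for c in connections:
        ports.setdefault(c.target, set()).add(c.port)
    return {target: sorted(p) for target, p in ports.items()}


# Hypothetical example entries; replace with the connections of your project.
CONNECTIONS = [
    Connection("nginx", "php-fpm", "fastcgi", 9000, Direction.BOTH_WAYS),
    Connection("php-fpm", "mysql", "tcp", 3306, Direction.BOTH_WAYS),
]

if __name__ == "__main__":
    for connection in CONNECTIONS:
        print(connection.describe())
    print(ports_by_target(CONNECTIONS))
```

A list like this can live next to the wiki pages in version control, and the per-target port overview is a convenient starting point when firewall rules are discussed.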

Automation is key

Setting up environments with all the required bells and whistles is a repetitive task that should be automated. Automation allows you to set up systems faster and to make improvements over time, which results in more stable and better protected systems. It also removes the fear that a step is skipped or a configuration setting forgotten.

Automation Tools
Automation tools to configure and provision infrastructure as code

Git SCM

Git is not an automation tool, but it is an essential part of the automation process. Git is a version control system in which developers mostly keep their application code. Because Git can track changes to any text-based content (and, to a limited extent, binary content), we can also use it to store our automation artefacts.

CI/CD

A continuous integration (CI) and continuous deployment (CD) platform is not essential but highly recommended for the purpose of automation. My personal favourite is Jenkins CI, but there are many alternatives that you can install on-prem or that are provided as SaaS. Just search for "CI/CD Tool" in a search engine of your choice and you'll find plenty of solutions. NOTE: All automation tools mentioned below can also be triggered by manual commands on the command line.

Terraform

HashiCorp Terraform is a great tool for treating infrastructure as code. It allows you to quickly create your VM instances from blueprints. It works especially well for cloud services such as Amazon Web Services, Azure or Google Cloud Platform. If you're using VMware vSphere or Docker containers, Terraform can also be a true help. A full list of supported infrastructure can be found at www.terraform.io/docs/providers/index.html.

Ansible

Red Hat Ansible is a provisioning tool that aims to automate the installation and configuration of tools or services on your platform without using agents. This means fast, repeatable setup of one, a group or all of the instances you have created.

Vagrant

HashiCorp Vagrant is an environment deployment tool for creating virtual environments using VirtualBox or VMware on a local system. It's great for creating local development environments and works well with Red Hat Ansible to provision these virtualised platforms.

Docker

Docker containers are small, computational units of software, packaged as a platform with OS, network and application, that run the same way everywhere. Because containers share the kernel of the host instead of running a full guest OS on a virtualisation layer, applications run faster and require fewer resources. Even though Docker containers are great for the development of micro-services, we now see more and more containers being deployed in production.

Kubernetes

Kubernetes is an open source container orchestration technology that allows you to manage, scale and deploy containers with ease. If you're already using Docker containers, managing them with Kubernetes is highly advisable.

Next steps

We have now defined how we should structure our documentation, which aspects we need to focus on to make our application secure by design, and how automation can help us ensure a repeatable setup, provisioning and configuration of our infrastructure as code. The next step is to apply what we have seen so far to a real project.

In my next article I will document a web application project and provide examples of what this looks like when the suggestions described in this article are applied. Hope to see you next time.
