Skip to main content

Documentation of a project

In my previous post I described 10 steps we should take to improve security of web applications. In this article I'm going to describe the purpose of documenting a project and what information should be included.
Every successful project requires documentation to communicate a project goal and requirements to reach it.

Purpose of documentation

The reason documentation exists is to communicate in a clear, understandable language with all people involved. For web applications this documentation should describe the idea as a concept, steps needed to convert the idea into an application, what tools, applications and services are required in the proces of building the application, how the application is architected and who are the people involved.

Since documentation is never complete, a good structure is required to allow additions or corrections being made over time. I prefer to use a Wiki type of platform to write my documentation. A wiki has often a simplified markup language and allows for multiple editors to collaborate on the documentation. Wikipedia is a great example of using a wiki to write documentation. But other tools are equally good like Google Docs, Microsoft Sharepoint or Microsoft Word documents on a shared drive in your organisation.

For all our project we are using Phabricator as we can integrate references to tasks, commits, users and a whole lot more features in our documentation wiki, making it a true source of reference. With Phabricator we're able to create global knowledge in our team and in our whole organisation.

Example screens of Phabricator
Example screens of Phabricator

Structuring documentation

As mentioned earlier, documentation changes over time for a lot of reasons. Here are a few common reasons why I needed to change or add documentation:
  • Ideas change over time
  • Team members come and go
  • Unforeseen problems have surfaced
  • More features are requested
  • Unexpected events occured
  • Hardware requirements changed
  • Regulation, compliance or company structure changed
  • ...
To prepare your documentation for these and other changes, a documentation structure is required. What works for me and the people I work with is a structure based on "expanding knowledge". This is a structure that allows your documentation to grow, without becoming a single massive document.
  • project name: just a name you give your project
    • base: this is the front-cover of the project containing just a brief description of the idea and goal of the project. It also serves as the base for your documentation hierarchy.
      • project: This is a container location where you can create documentation describing the project phases or goals like "Initial phase", "MVP", "Alpha release", "RC4", etc…
      • architecture: This is also a container location where we describe all the requirements needed to run our application, segmented into specific environments like development, testing, staging and production. We even provide a segment to describe our CI/CD. Examples are hardware details, cloud services, applications like PHP, MySQL and Nginx. But also network, certificates and optionally what type of secrets are required to access and interact with our systems (not the actual certificates and secrets)
      • meeting notes: I like to keep meeting notes part of a project because it helps explain why certain choices were made and gives full transparency to the team
At the start of every project I'm also using Phabricator's feature to create a "space", "team", "project" and "tag". Let me first explain what they mean.
  • project: identifies the main project and can have sub-projects
  • team: allows me to select members of a team, often given the name of the project like "Acme Website Team" or "Corp CRM Refactoring Team" which I can assign on a project or sub-project.
  • space: this is a restricted area that I can assign on a project and give permissions to the project team members to access the information kept within the space
  • tag: I use tags just to label certain aspects in the lifecycle of a project. Tags are common across multiple projects like "meeting", "infrastructure", "easy pick", etc… and allow users to find information more quickly
The combination of spaces, teams, projects and tags with the documentation architecture described above gives me and the teams I work with clarity and insights about what needs to be done and what goals are to be reached.

Security aspect

As you're defining the components needed to work on your web application, don't forget to include security in your documentation! Most projects I worked on have been approached from a functional perspective where features and functionality are well documented, the architecture defined in terms of hardware and software requirements and who's doing what.

What I define as "security" documentation can be listed as the following:
  • servers: the physical, virtual or container bases systems where (parts of) your project resides on, files are kept and data is persisted
  • network: the physical or virtual connection between all the components, sockets included
  • protocol: the way components of your application are communicating with each other
  • direction: the direction communication goes, sometimes one way, sometimes both ways
  • ports: which ports are required to communicate over? Even though this can be considered part of Network documentation, I consider it important enough to give it a specific mentioning
  • operating system: the choice of operating system is important for the purpose of the application and defines the tools to be used to install the required services on it
  • services: each component of your application requires a separate service and it's a good practice to define them in advance, including the technology stack in which you build your web application
  • secrets and certificates: define upfront what type of certificates you need and how secrets are managed so you have a procedure ready when you need to apply it (I like to use HashiCorp Vault for management of these "secrets")

Automation is key

Setting up environments with all required bells and whistles is a repetitive task that should be automated. It allows you to set up systems faster, but allows you to make improvements over time. This results in more stable and better protected systems. It also removes the fear that a step is skipped or a configuration setting forgotten.

Automation Tools
Automation tools to configure and provision infrastructure as code

GIT SCM

GIT is not an automation tool but is an essential part of the automation process. GIT is a version control system where mostly developers keep their application code. But because GIT can be used for all text based (and limited binary) tracking of changes, we can also use it to store the automation artefacts in it.

CI/CD

A continuous integration (CI) and continuous deployment (CD) platform is not essential but highly recommended for the purpose of automation. My personal favourite is Jenkins CI, but there are many alternatives you can install on-prem or are being provided as a SaaS. Just search for "CI/CD Tool" in a search engine of choice you'll find plenty of solutions. NOTE: All automation tools mentioned below can also be triggered by manual commands on the command line.

Terraform

HashCorp Terraform is a great tool to treat infrastructure as code. It allows you to quickly create your VM instances by using blueprints. Especially for Amazon Web ServicesAzure or Google Cloud Platform cloud services it works really well. If you're using VMWare vSphere or Docker Containers, Terraform can be a true help. A full list of supported infrastructure can be found at www.terraform.io/docs/providers/index.html.

Ansible

RedHat Ansible is a provisioning tool that aims at automating the installation and configuration of tools or services on your platform without the usage of agents. This means fast, repeatable setup of one, a group or all instances you have created.

Vagrant

HashiCorp Vagrant is an environment deployment tool to create virtual environments using Virtualbox or VMWare on a local system. It's great for creating local development environments and works great with RedHat Ansible to provision these virtualised platforms.

Docker

Docker containers are small, computational units of software, packaged as a platform with OS, network and application, that run everywhere the same way. Because there's no virtualisation layer between your application and your OS, applications run faster and require less resources. Even though Docker containers are great for development of micro-services, we now see more and more containers being deployed in production.

Kubernetes

Kubernetes is an open source container orchestration technology that allows you to manage, scale and deploy containers with ease. If you're already using Docker containers, managing them with Kubernetes is highly advisable.

Next steps

Now that we have defined how we should structure our documentation, what aspects we need to focus on to make our application secure by design and how automation can help us ensuring that we have repeatable setup, provisioning and configuration of our infrastructure as code, we should get started applying what we have seen so far onto a real project.

In my next article I will document a web application project and provide examples how this would look like with the suggestions described in this article. Hope to see you next time.

Comments

Popular posts from this blog

PHP Arrays - Associative Arrays or Hash Maps

Associative array or hash maps are listings of key and value pairs with a posibility to nest additional keys and values. An associative array is a very powerful construct within PHP.

In our previous article we discussed simple arrays, which in their turn are indexed associative arrays under the hood. Take the following example:

$array = [
'apple',
'banana',
'chocolate',
]; 

Is in fact an indexed associative array under the hood:

$array = [
0 => 'apple',
1 => 'banana',
2 => 'chocolate',
]; 

But associative arrays can be so much more than just an indexed array, and you will find many database operations returning arrays where the fields of a table are the keys in the array while their values are also the values within the array.

$productRowData = [
'product_id' => 1234,
'brand_id' => 321,
'product_name' => 'Our awesome product',
'prodcut_description' => 'This is our most awesome product.&#…

Speeding up database calls with PDO and iterators

When you review lots of code, you often wonder why things were written the way they were. Especially when making expensive calls to a database, I still see things that could and should be improved.
No framework development When working with a framework, mostly these database calls are optimized for the developer and abstract the complex logic to improve and optimize the retrieval and usage of data. But then developers need to build something without a framework and end up using the basics of PHP in a sub-optimal way.

$pdo = new \PDO( $config['db']['dsn'], $config['db']['username'], $config['db']['password'] ); $sql = 'SELECT * FROM `gen_contact` ORDER BY `contact_modified` DESC'; $stmt = $pdo->prepare($sql); $stmt->execute(); $data = $stmt->fetchAll(\PDO::FETCH_OBJ); echo 'Getting the contacts that changed the last 3 months' . PHP_EOL; foreach ($data as $row) { $dt = new \DateTime('2015-04-…

Deploy Docker containers fast to Microsoft Azure

DEPLOY DOCKER CONTAINERS FAST TO MICROSOFT AZURE It’s hard to ignore the fact thatDockeris a way to move forward for rapid application development, distributed architectures and microservices. For developersDockeroffers great advantages as they can build their containers specifically for the task they work on. They grab a base image of a container, modify it for their purpose and prepare the functionality inside the container. Quality, testing and security teams now have a single instance to look at and ensure all functional and regulatory requirements are met. System engineers now don’t have to worry about providing a system with the required specs as the container is already provisioned for that purpose. But where do you deploy yourDockercontainers? You can set up your existing bare metal infrastructure to allow them to run containers, but this also means you need to learn about securing your container infrastructure, which is not an easy task. Luckily “the cloud” offers container …