Pages

2013/03/17

Look mama, no databases

I recently got a question about testing services, where this person was under the impression that when calling services you did not need to know what was going on. You just pass in data and a result is returned. How this result was given was of no importance, all you needed to test was that you got a result and it matched up with what was expected.

The only problem in this case was that he was talking about testing the business logic in the service and I replied he should perform tests on that business logic instead of just asserting the returned result. After some further discussions it was clear that he tested the service "as is", meaning the service makes a call to a database to gather its results and returns the calculation on those results.

When I state "as is", I truly mean the way it's being used in production. So the database call collects real data on which business logic is applied. You can see this is not a healthy situation, especially when you also have services that apply business logic on data and store it back into the database.

In "The Grumpy Programmer's PHPUnit Cookbook", author Chris Hartjes wrote this one sentence that says it all: "Unit test suites are meant to be testing code, not the ability of a database server to return results". And he's right, you shouldn't use database connections when your testing business rules and functional logic.

DISCLAIMER: I'm still using PHPUnit 3.4.15 for testing Zend Framework 1 projects, so all tests here should work with higher versions of PHPUnit. If not, let me know in the comments.

Say you have a service that calculates margins for sold products, it's easy to connect to a sandboxed database and retrieve the product purchase and sales prices, retrieve the commission for the sales person on top of it all and return the calculated margin. Easy enough, right? But say you have thousands of tests like this that all connect to several tables in the database. Running tests will only go slower and slower until it reaches the point developers on the team don't run them because they will take too long to run.

Example: Product.php

<?php

class Product
{
    protected $_db;

    public function __construct()
    {
        $dsn = 'mysql:dbname=test;host=localhost;';
        $user = 'testuser';
        $pass = 'test123';
        $this->_db = new PDO($dsn, $user, $pass);
    }
    public function calculateMargins($productId)
    {
        $sql = 'SELECT * FROM product
                    LEFT JOIN employee
                        ON product.employeeId = employee.employeeId
                    WHERE product.id = ?';
        $stmt = $this->_db->prepare(
            $sql, array (PDO::ATTR_CURSOR => PDO::CURSOR_SCROLL));

        $stmt->execute(array ($productId));
        $result = $stmt->fetchAll();
        $margin = ($result[0]['salesPrice'] - $result[0]['purchasePrice'])
                * (1 - $result[0]['commission']);
        return $margin;
    }
}

This can be just a normal class like you must have dozens off. In the case where a test is being made to validate this business logic, this class will connect to your database, retrieves the data and performs the calculation.

Example test that will pass when the values are the following:

  • purchasePrice: 299.95
  • salesPrice: 399.95
  • commission: 15%

<?php
require_once 'Product.php';

class ProductTest extends PHPUnit_Framework_TestCase
{
    /**
     * @backupGlobals disabled
     * @backupStaticAttributes disabled
     */
    public function testCalculateProductMargin()
    {
        $expectedMargin = 85.00;
        $product = new Product();
        $result = $product->calculateMargins(1);
        $this->assertSame($expected, $result,
            sprintf('Expected margin of %02f did not match actual margin %02f',
                $expected, $result));
    }
}

The main issue I have with this test is that I'm too depending on the database to deliver me the data. And what happens if another developer needs to modify the data in the database for his tests? Exactly, my test will fail.

How can we solve this so we can test the business logic without connecting to the database?

The first thing we need to do is to ensure we can pass in a database replacement into our Product class, our mock object. So we rewrite the Product class a little so the constructor accepts a dbAdapter as parameter.

<?php

class Product
{
    protected $_db;

    public function __construct($db = null)
    {
        if (null === $db) {
            $dsn = 'mysql:dbname=test;host=localhost;';
            $user = 'testuser';
            $pass = 'test123';
            $db = new PDO($dsn, $user, $pass);
        }
        $this->_db = $db;
    }
    public function calculateMargins($productId)
    {
        $sql = 'SELECT * FROM product
                    LEFT JOIN employee
                        ON product.employeeId = employee.employeeId
                    WHERE product.id = ?';
        $stmt = $this->_db->prepare(
            $sql, array (PDO::ATTR_CURSOR => PDO::CURSOR_SCROLL));

        $stmt->execute(array ($productId));
        $result = $stmt->fetchAll();
        $margin = ($result[0]['salesPrice'] - $result[0]['purchasePrice'])
                * (1 - $result[0]['commission']);
        return $margin;
    }
}

The next thing we need to do is to modify our ProductTest class so we can set our values directly in our test, without relying on databases for providing us this data.

<?php
require_once 'Product.php';

class ProductTest extends PHPUnit_Framework_TestCase
{
    /**
     * @backupGlobals disabled
     * @backupStaticAttributes disabled
     */
    public function testCalculateProductMargin()
    {
        $data = array (
            array (
                'purchasePrice' => 299.95,
                'salesPrice' => 399.95,
                'commission' => 0.15,
            ),
        );

        $stmt = $this->getMock('PDOStatement', array ('execute','fetchAll'));
        $stmt->expects($this->any())
             ->method('execute')
             ->will($this->returnValue(true));
        $stmt->expects($this->any())
             ->method('fetchAll')
             ->will($this->returnValue($data));

        $pdo = $this->getMock('PDO', array('prepare'),
            array('sqlite:dbname=:memory'),'PDOMock',true);
        $pdo->expects($this->any())
            ->method('prepare')
            ->will($this->returnValue($stmt));

        $expectedMargin = 85.00;
        $product = new Product($pdo);
        $result = $product->calculateMargins(1);
        $this->assertSame($expected, $result,
            sprintf('Expected margin of %02f did not match actual margin %02f',
                $expected, $result));
    }
}

Yes, we need to mock out our PDO object and our PDOStatement object just to mock the behaviour of our database call, but the benefit is now we can focus on the business logic instead of expecting everything will be provided. We now have just one set of values, but we can just add more and more values to it and move it to a data provider.

<?php
require_once 'Product.php';

class ProductTest extends PHPUnit_Framework_TestCase
{
    public function marginDataProvider()
    {
        return array (
            array (299.95, 399.95, 0.15, 85.00),
            array (1200.00, 1500.00, 0.10, 270.00),
            array (1200.00, 1200.00, 0.10, 0.00),
            array (1200.00, 1000.00, 0.10, -180.00),
        );
    }
    /**
     * @backupGlobals disabled
     * @backupStaticAttributes disabled
     * @dataProvider marginDataProvider
     */
    public function testCalculateProductMargin($pp, $sp, $comm, $expected)
    {
        $data = array (
            array (
                'purchasePrice' => $pp,
                'salesPrice' => $sp,
                'commission' => $comm,
            ),
        );

        $stmt = $this->getMock('PDOStatement', array ('execute','fetchAll'));
        $stmt->expects($this->any())
             ->method('execute')
             ->will($this->returnValue(true));
        $stmt->expects($this->any())
             ->method('fetchAll')
             ->will($this->returnValue($data));

        $pdo = $this->getMock('PDO', array('prepare'),
            array('sqlite:dbname=:memory'),'PDOMock_' . uniqid(),true);
        $pdo->expects($this->any())
            ->method('prepare')
            ->will($this->returnValue($stmt));

        $product = new Product($pdo);
        $result = $product->calculateMargins(1);
        $this->assertSame($expected, $result,
            sprintf('Expected margin of %02f did not match actual margin %02f',
                $expected, $result));
    }
}

So now we can test multiple values over and over again, and if for some reason we need to test a specific combination of values, we just add it to our data provider method. In this example we added values to see what happens when we have no margin and when we have sold products below purchase price.

The end result is you have a test that will test the business requirements, isn't making an expensive database call (runs faster) and allows you to modify the test in case the scope of business logic changes (e.g. no commission is paid when sales price is equal or below the purchase price).

8 comments:

  1. You could also decouple calculateMargins from the actual database query code so you don't have to build up a mock object.

    I mean, the calculation itself seems to be what you're after and I would aim to test this seperately. E.g.: calculateMargin($salesPrice, $purchasePrise, $commission);

    Make it a "stupid" method which doesn't need to know about where the data comes from.

    This mock-object build up always seems to overcomplicate tests for me. E.g. when business logic becomes more heavy and depends on database functions (stored procedures, triggers), I'd rather use a sqlite database and do a clear setup and teardown in my test suite so I don't run into issues where a colleague updated a row in the database and all of the sudden my tests fail.

    But that setup seems to be more painful with something like straight PDO or Zend_Db — at least IMHO. This is where something like Doctrine really excels.

    ReplyDelete
  2. Hey Till,


    I completely agree with your suggestion of having a dumb method that just calculate margins based on given parameters.

    I just took the example as it was implemented and added the functionality necessary to overcome the issue to retrieve data from the database.

    But thanks for the feedback, it's a good practice to keep your methods as "dumb" as possible.

    ReplyDelete
  3. Anonymous17/3/13 16:43

    His tests are actually a very healthy situation, the "unhealthy" part is that he only has integration tests and not unit tests too. Both are needed for proper smoke testing, so no need to discourage writing this.

    ReplyDelete
    Replies
    1. Indeed, it wasn't specified which tests were intended by developer. If you're writing unit tests, then an in-memory database will serve you well. In case of integration tests, I would still have suggested an isolated database to avoid contamination from other developers and to have an ability to reset before each sequence.

      Delete
  4. I wanted to properly test database related calls as well, but it's making the test code even less readable than the code of tested method itself. In the end i've ended with writting functional tests instead. They are slower to run, but easy to write and maintain.

    In the end it's still better for me if the tests run in a 100ms than writting 100 lines code just to assert one database query.

    To avoid problem with your colleagues just create fixtures and test database with preloaded fixtures. If you run into problem just recreate database with these fixtures and you are good to go.

    ReplyDelete
    Replies
    1. Yes, the easiest path with less resistant is often taken and it was just this discussion that let me into writing this article.

      But I've seen many tests fail because something on the database side changed and using tools like DBUnit have given developers many headaches because all their tests needed to get updated after the database schema was reconfigured. And also, when you run 5000 unit test and more, developers had to wait over 30 minutes to run their tests.

      My principle here is basically the same as why I advocate others to write unit tests: invest some extra time in the early stage, so you can benefit later.

      Same is with mocking your database, web service, socket, streams and filesystem

      Delete
  5. Anonymous25/3/13 00:58

    You could create a gateway (http://martinfowler.com/eaaCatalog/gateway.html) to access the database and a model class. Two completely different responsibilities, they should be splitted.
    My 2 cents.

    ReplyDelete