2016/01/04

PHP arrays - the basics




In 2015 we celebrated the twentieth birthday of PHP, and we even saw the release of PHP 7.0.0. Life is all good!

I wish it were all that peachy. In my work as a consultant, and when working on OSS projects, I cannot help noticing that knowledge of some PHP basics is missing, especially when it comes to arrays.

To give you an example: developers and site owners often complain about their web application being slow. When I get called in to improve performance, I stumble on a huge number of foreach loops that process data retrieved from a database. The first iteration happens right at data retrieval, where the result set is iterated over to produce a new array of data models.

/**
 * Fetches all entries, optionally matching provided conditions,
 * optionally ordered by provided order, optionally limited by
 * provided number of entries with an optional provided offset.
 *
 * @param array $where
 * @param array $order
 * @param int $limit
 * @param int $offset
 * @return array
 */
public function fetchAll($where = [], $order = [], $limit = 0, $offset = 0)
{
    $sql = 'SELECT * FROM ' . $this->getTable();
    if ([] !== $where) {
        $sql .= ' WHERE ';
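        // Each key of $where must be a complete condition with a positional
        // placeholder (for example 'price > ?'); the values are bound on execute().
        $fields = array_keys($where);
        $sql .= implode(' AND ', $fields);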
    }
    if ([] !== $order) {
        $sql .= ' ORDER BY ' . implode(', ', $order);
    }
    if (0 < $limit) {
        $sql .= ' LIMIT ';
        if (0 < $offset) {
            $sql .= $offset . ', ';
        }
        $sql .= $limit;
    }
    if (false === ($statement = $this->pdo->prepare($sql))) {
        $this->error($this->pdo->errorInfo());
    }
    if (false === $statement->execute(array_values($where))) {
        $this->error($statement->errorInfo());
    }
    $result = $statement->fetchAll(\PDO::FETCH_ASSOC);
    $collection = [];
    foreach ($result as $entry) {
        $collection[] = new $this->modelName($entry);
    }
    return $collection;
}

A second iteration is often found at the controller or service level, where the collection of data models is enriched or changed in preparation for the view (or output).

/**
 * Apply a discount percentage on all articles
 *
 * @param float $discount
 * @return array
 */
public function applyDiscountAsPercentage($discount)
{
    $entries = $this->mapper->fetchAll();
    $discounts = [];
    foreach ($entries as $entry) {
        $discountProduct = new DiscountProduct($entry->toArray());
        $discountCalc = round(($entry->getPrice() * (100 - $discount)) / 100, 2);
        $discountProduct->setDiscountPrice($discountCalc);
        $discounts[] = $discountProduct;
    }
    return $discounts;
}
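
To make the rounding inside that loop concrete, here is a minimal sketch that isolates the price calculation in a standalone function. The name discountedPrice() is hypothetical and not part of the example classes; the two prices are demo prices used in this article.

```php
<?php
// Hypothetical helper (not part of the original classes): the same formula
// as in applyDiscountAsPercentage(), isolated for a single price.
function discountedPrice($price, $discount)
{
    // Keep (100 - discount) percent of the price, rounded to 2 decimals.
    return round(($price * (100 - $discount)) / 100, 2);
}

// 295.95 * 0.85 = 251.5575, which rounds to 251.56
printf('%.2f' . PHP_EOL, discountedPrice(295.95, 15));
// 675.00 * 0.85 = 573.75, already at 2 decimals
printf('%.2f' . PHP_EOL, discountedPrice(675.00, 15));
```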

A third iteration is often found at the output, where the collection is presented in a listing, table or grid.

require_once __DIR__ . '/php_arrays_examples.php';

use \DragonBe\ProductService;
use \DragonBe\ProductMapper;
use \DragonBe\ProductGateway;

$pdo = new \PDO('sqlite:phparray.db');

$productService = new ProductService(new ProductMapper(new ProductGateway($pdo)));
$discounts = $productService->applyDiscountAsPercentage(15);

echo sprintf('%-25s %10s %10s', 'Product', 'Sales', 'Promo') . PHP_EOL;
foreach ($discounts as $discountProduct) {
    echo sprintf(
        '%-25s %10.2f %10.2f',
        $discountProduct->getTitle(),
        $discountProduct->getPrice(),
        $discountProduct->getDiscountPrice()
    ) . PHP_EOL;
}

At this point we already count three iterations between fetching the data and outputting it, and often there are a whole lot more in between. All that just to display a simple list of products with discounts.

Product                        Sales      Promo
demo_phone                    295.95     251.56
demo_computer                1999.95    1699.96
demo_tablet                   675.00     573.75
demo_drive                      5.99       5.09
demo_charger                   12.45      10.58
demo_coffee_mug                24.95      21.21
demo_phone_case                29.00      24.65
demo_usb_cable                 45.95      39.06
demo_external_screen          199.95     169.96
Added prodcut                 129.95     110.46
Added prodcut                 129.95     110.46
Added prodcut                 129.95     110.46

So what is the big deal here? Well, in development you probably test these routines with maybe 5 data entries (or 10 for extra edge cases) and performance is great. But in production, especially over time, you're dealing with a couple of thousand records, or millions if you're working for a large company. Simple math shows why this slows things down: three passes over N records means 3 × N loop steps, and every extra pass adds another N. Unfortunately this is where PHP gets its bad reputation for being slow, even though I see similar mistakes in other technologies as well.
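
The arithmetic is easy to check with a rough sketch. The snippet below uses made-up in-memory rows in place of a database and simply counts loop steps; $rowCount, $rows and $steps are illustrative names, and the three loops only mimic the retrieval, discount and output stages described earlier.

```php
<?php
// Illustrative only: plain arrays stand in for database rows.
$rowCount = 5000;
$rows = [];
for ($i = 0; $i < $rowCount; $i++) {
    $rows[] = ['title' => 'demo_' . $i, 'price' => 10.00 + $i];
}

$steps = 0;

// Pass 1: turn raw rows into "models" (here: unchanged arrays).
$models = [];
foreach ($rows as $row) {
    $models[] = $row;
    $steps++;
}

// Pass 2: enrich every model with a 15% discount price.
$discounts = [];
foreach ($models as $model) {
    $model['discount_price'] = round(($model['price'] * (100 - 15)) / 100, 2);
    $discounts[] = $model;
    $steps++;
}

// Pass 3: build the output lines.
$lines = [];
foreach ($discounts as $product) {
    $lines[] = sprintf(
        '%-25s %10.2f %10.2f',
        $product['title'],
        $product['price'],
        $product['discount_price']
    );
    $steps++;
}

// 3 passes over 5000 rows: prints 15000
echo $steps . PHP_EOL;
```

With 5 rows that is 15 steps; with a million rows it is three million, plus one intermediate array per pass.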

Luckily PHP has a few powerful array functions to help developers improve performance and their code.
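
As a small taste, here is a minimal sketch of the style these functions allow, assuming products are plain arrays rather than the model objects used above (the sample data and variable names are made up). It uses the built-in array_map() and array_column() functions and makes no speed claim by itself; measure before and after on your own data.

```php
<?php
// Made-up sample data: plain arrays instead of model objects.
$products = [
    ['title' => 'demo_phone', 'price' => 295.95],
    ['title' => 'demo_tablet', 'price' => 675.00],
    ['title' => 'demo_drive', 'price' => 5.99],
];
$discount = 15;

// array_map() applies the callback to every element and returns a new array.
$discounted = array_map(
    function (array $product) use ($discount) {
        $product['discount_price'] = round(($product['price'] * (100 - $discount)) / 100, 2);
        return $product;
    },
    $products
);

// array_column() extracts one column, here indexed by the 'title' column.
$promoByTitle = array_column($discounted, 'discount_price', 'title');

foreach ($promoByTitle as $title => $promo) {
    printf('%-25s %10.2f' . PHP_EOL, $title, $promo);
}
```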

In upcoming articles I will highlight some of these functions and give real-world examples where they make a difference in performance.