Module design, part 6: Inter-module queries

Module design, part 6: Inter-module queries

August 24, 2013 - by Patrick van Bergen

Tags: ,

shake hands

I will now address a tricky part of module separation: database queries that access the data stores of separate modules.

We have seen that modules can have clear-cut boundaries. Module A should not access Module B's internal datastructure directly. But is that even possible?

In any serious application it is quite common that a web page shows an overview of data that is retrieved from several modules. For small lists this is not a problem, and the objects can be aggregated by calling the API functions of several modules. But let's consider the case of a paged overview of a dataset that contains 100.000 rows. And you want the user to filter and search this data real-time. Such a thing is the strong suit of a relational database. But it requires the query to have several joins. Joins that combine the data of several modules. So this requires accessing the internal datastructures of several modules at one place (the query), doesn't it? And it requires Module A to access the internal structures of Module B. It seems unavoidable.

What you would want to do is to split up the query and have each module compose part of it. The first thing you need to have for such a setup is a builder to create queries. We call ours Q. You can tell Q which clauses to add to the query, and then its __toString method will create the query. Here's an example of a Q-query:

        $Q = Q::createSelect(array(
                'id'                => Q::field('bike.id'),
                'name'              => Q::sql('CONCAT(%1$s, " - ", %2$s)', Q::field('bike.brand'), Q::field('bike.description_short')),
                'brand'             => Q::field('bike.brand'),
                'price'             => Q::field('bike.price'),
                'internal_number'   => Q::field('bike.internal_number'),
                'owner_name'        => Q::field('relation.name'),
            ))
            ->fromTable('mod_bike_bike_data', 'bike')
            ->joinTable('mod_bike_category', 'category', Q::JOIN_INNER, Q::link('category.id', 'bike.category'))
            ->joinTable('mod_bike_brand', 'brand', Q::JOIN_LEFT, Q::link('brand.id', 'bike.brand'))
            ->joinTable('mod_relation', 'relation', Q::JOIN_LEFT, Q::link('bike.owner_id', 'relation.id'));

When you call "echo $Q", you will see:

SELECT 
    `bike`.`id` `id`,
    CONCAT(`bike`.`brand`, " - ", `bike`.`description_short`) `name`,
    `bike`.`brand` `brand`,
    `bike`.`price` `price`,
    `bike`.`internal_number` `internal_number`,
    `relation`.`relation_name` `owner_name`,  
FROM `mod_bike_bike_data` `bike`
INNER JOIN `mod_bike_category` `category` ON `category`.`id` = `bike`.`category`
LEFT JOIN `mod_bike_brand` `brand` ON `brand`.`id` = `bike`.`brand`
LEFT JOIN `mod_relation` `relation` ON `bike`.`owner_id` = `relation`.`id` 

As you can see, Q uses a fluent interface to build queries. Its methods are self-evident, I suspect. Except for the Q::sql() part. It allows you to insert any plain SQL code.

The Q query selects fields from two different modules: mod_bike and mod_relation. most of the datastructures accessed (the tables 'mod_bikes_bike_data', 'mod_bikes_category', and 'mod_bikes_brand') are located in the module 'mod_bikes'. One datastructure, however, is stored in 'mod_relation'. We want mod_bike to create part of the query, and mod_relation the other part.

Now we'll see how we can split the responsibilities for this query between modules. For this purpose each module has a Q API. mod_bike's Q API looks like this:

class BikeQApi
{
    public function fieldBikeId($tableAlias)
    {
        return Q::field($tableAlias . '.id');
    }
    
    public function fieldBikeBrand($tableAlias)
    {
        return Q::field($tableAlias . '.brand');
    }    
    
    public function fieldBikeCategory($tableAlias)
    {
        return Q::field($tableAlias . '.category');
    }    
    
    public function fieldBikeOwnerId($tableAlias)
    {
        return Q::field($tableAlias . '.owner_id');
    }

    public function fromBike(QuerySelect $Q, $tableAlias)
    {
        $Q->fromTable('mod_bike_bike_data', $tableAlias);
    }

    public function joinBrand(QuerySelect $Q, $tableAlias, $joinType, $joinCondition)
    {
        $Q->joinTable('mod_bike_brand', $tableAlias, $joinType, $joinCondition);
    }

    public function joinCategory(QuerySelect $Q, $tableAlias, $joinType, $joinCondition)
    {
        $Q->joinTable('mod_bike_category', $tableAlias, $joinType, $joinCondition);
    }
}   

and mod_relation's Q API looks like this:

class RelationQApi
{
    public function fieldRelationName($tableAlias)
    {
        return Q::field($tableAlias . '.name');
    }
    
    public function fromRelation(QuerySelect $Q, $tableAlias)
    {
        $Q->fromTable('mod_relation', $tableAlias);
    }

    public function joinRelation(QuerySelect $Q, $id, $joinType, $joinCondition)
    {
        $Q->joinTable('mod_relation', $tableAlias, $joinType, $joinCondition);
    }
}

These API's create custom Q functions for specific datastructures. With these API's we can compose a query like this:

$Q = Q::createSelect(array(
    'id'             => $BikeQApi->fieldBikeBrand('bike'),
    'name'           => Q::sql('CONCAT(%1$s, " - ", %2$s)', $BikeQApi->fieldBikeBrand('bike'), $BikeQApi->fieldBikeShortDescription('bike')),
    'price'          => $BikeQApi->fieldBikePrice('bike'),
    'article_number' => $BikeQApi->fieldBikeArticleNumber('bike'),
    'owner_name'     => $RelationQApi->fieldRelationName('relation'),
));

$BikeQApi->fromBike($Q, 'bike');
$BikeQApi->joinCategory($Q, 'category', Q::JOIN_INNER, Q::link($BikeQApi->fieldBikeCategory('bike'), $BikeQApi->fieldCategoryId('category')));
$BikeQApi->joinBrand($Q, 'brand', Q::JOIN_LEFT, Q::link($BikeQApi->fieldBikeBrand('bike'), $BikeQApi->fieldBrandId('brand')));
$RelationQApi->joinRelation($Q, 'relation', Q::JOIN_LEFT, Q::link($BikeQApi->fieldBikeOwnerId('bike'), $RelationQApi->fieldRelationId('relation')));

As you can see, each part of the query is now built by the module that controls the datastructure needed for it. You may also appreciate the fact that the query does not show a tablename or even a table field. This makes it easier to change the datastructures of the modules.

Work in progress

This approach is interesting, but we actually we have just started thinking about the problem of inter-module queries. The approach shown here is a possible one, but it creates a lot of overhead (or boiler-plate) code. It's a work in progress.

You might ask what happens when Module A uses a different type of datastore than B. In that case a simple query is out of the question. We haven't come across this case in practise, and we will cross that bridge when we get there :)


Photo: handshake

Share this post!

Comments

Leave a comment!

Italic and bold

*This is italic*, and _so is this_.
**This is bold**, and __so is this__.

Links

This is a link to [Procurios](http://www.procurios.nl).

Lists

A bulleted list can be made with:
- Minus-signs,
+ Add-signs,
* Or an asterisk.

A numbered list can be made with:
1. List item number 1.
2. List item number 2.

Quote

The text below creates a quote:
> This is the first line.
> This is the second line.

Code

A text block with code can be created. Prefix a line with four spaces and a code-block will be made.