SQL Workshop - Selecting columns without including a non-aggregate column in the group by clause

New Book Reviews!

 SQL Workshop - Selecting columns without including a non-aggregate column in the group by clause
Written by Nikos Vaggalis
Article Index
SQL Workshop - Selecting columns without including a non-aggregate column in the group by clause
The Solution

Thinking in terms of sets and set operations can be difficult at first but after a while you discover that you can do things without needed to drop down to procedural approaches.

This scenario requires us to be members of a hospital's Dietary Department and with the end of the year approaching we are assigned the task of estimating the amount of money needed for next year’s resource shopping, to keep the patients fed for the coming twelve months.

So we need to find the sum of the mean amount spent on the resources/raw material (vegetables, fruit, meat etc) grouped by Account Category (i.e. the account that serves for fruit) and Account Id (actual account number) used for their shopping, and use that as the basis for our new season’s budget estimate.

When a request for, say, fruit comes in, we translate that request into the amount of money consumed using a formula based on the fruit’s dynamically updated Mean Value, the Quantity of the request and a Ratio.

When the unit of measurement is 'PIECES' then our formula is :

`Amount = Mean Value x Quantity x Ratio`

while when the unit of measurement is a ‘KGR’ then the formula becomes:

`Amount = Mean Value x Quantity / Ratio`

The nature and meaning of the two formulae isn't as important as the fact that they vary according to the value of the Unit field.

 Materials Year Material_id Unit Ratio Mean_ value 1/1/2013 TX002 PIECES 1.000 \$36.00 1/1/2013 TX003 KGR 2.000 \$22.67

 Accounts Year Material_id Account_category Account_id 1/1/2013 TX002 01 220201511 1/1/2013 TX003 01 220201511

 Requests Year Req_id Material_id Quantity 1/1/2013 1 TX002 10 1/1/2013 1 TX003 60

It is this variation of the formulas that will create the most trouble, as we’ll soon find out.

Let’s get a preview of our data together with an attempt to implement the formulas:

`SELECT  r.material_id, m.unit,  CASE   WHEN m.unit = 'PIECES'   THEN m.mean_value * r.quantity * m.ratio    //formula 1 ELSE IF m.unit = 'KGR'   THEN m.mean_value * r.quantity / m.ratio  // formula 2 END as amount  from requests r, materials m   where r.year = '1/1/13' and        r.req_id = 1 and        r.material_id = m.material_id`

The desired result set would be :

 Result Set 1 Material_id Unit Amount TX002 PIECES \$360 TX003 KGR \$680.01

If we would also require their Account Category and Account Id (where the money for their purchasing is deducted from), the above query would become :

`SELECT  m.material_id,  m.unit, m.mean_value * r.quantity * m.ratio as amount, ``               a.account_category,``               a.account_id from requests r, materials m, accounts a where r.year = '1/1/13' and       r.material_id = m.material_id and       ``r.material_id = a.material_id and``       r.year = m.year and``       r.year = a.year and``       m.unit = 'PIECES' UNION ALL``SELECT  m.material_id,  m.unit, m.mean_value * r.quantity / m.ratio as     amount,``   a.account_category,``   a.account_idfrom requests r, materials m, accounts awhere r.year = '1/1/13' and      r.material_id = m.material_id and``      r.material_id = a.material_id and``      r.year = m.year and      ``r.year = a.year and``      m.unit = 'KGR'`

Producing a result set like :

 Result Set 2 Material_id Unit Amount Account_Category Account_Id ... PIECES \$360 01 220201511 … KGR \$680.01 01 220201511 PIECES \$580 01 220201511 KGR \$220 01 220201511 PIECES \$139 01 220201511 KGR \$200 01 220201511 … … ...

Of course our result set would include more units of measurement than PIECES and KGR and in a moment we will generalize in order to include additional values.

Using procedural logic we could now write a little program in Perl or any programming language which would treat the result set as a collection of values, iterate through it, and add to a variable called \$AmountPieces when encountering a unit of PIECES and \$AmountKGR when encountering a unit of KGR, doing our calculations row by row.

But let’s choose another path, that of pure SQL and set based logic.

In any case, since we are exclusively interested in the monetary aggregates, we can forgo the attributes of Material_id and Unit in the final answer as long as we get the formulas right.

Let's say that for reasons of simplicity, only PIECES are calculated differently and the rest of the units are all calculated using the same formula, something that simplifies our query from something like:

`CASE  WHEN m.unit = 'PIECES'  THEN m.mean_value * r.quantity * m.ratio  ELSE IF m.unit = 'KGR'  THEN m.mean_value * r.quantity / m.ratio  ELSE IF m.unit = 'LT'   THEN ELSE IF m.unit=….  THEN …. END as amount`

to:

`CASE  WHEN m.unit = 'PIECES'  THEN r.mean_value * r.quantity * m.ratio  ELSE r.mean_value * r.quantity / m.ratio END as amount`

Our first attempt to get to the sums would be:

`SELECT  a.account_category, a.account_id, CASE  WHEN m.unit = 'PIECES'  THEN sum(m.mean_value * r.quantity * m.ratio)   ELSE sum(m.mean_value * r.quantity / m.ratio) END as amount from requests r, materials m, accounts a where r.year = '1/1/13' and      r.material_id = m.material_id and``      r.material_id = a.material_id and``      r.year = m.year and``      r.year = a.year``      GROUP BY account_category, account_id`

but instead of the results we are expecting we are confronted with the following error message instead :

E_US0B63 line 1, The columns in the SELECT clause must be contained in the
GROUP BY clause.