parser

About compatibility of SQL queries

Misha v.3 [February 16, 2007]

Much may be said about the necessity of writing a code that would work on different platforms, and the argument concerning its compatibility would be long.

Perfectly, we would like every piece of our code to be written once and for all, but … “‘perfectly’ is such a liar…”

Today I will tell you what I do to minimize changes that I would have to make in my Parser3 code when moving from one SQL-server to another.

As it is said in FAQ, Parser3 uses the same interface to work with different database servers, which means that you won’t have to rewrite Parser’s queries-related commands. That’s as good as it gets, but if the queries are written for some specific server, the task of moving to another server may prove a hard nut to crack.

No doubt, the compatibility shouldn’t become an end in itself. If the server you are working with allows building tree structures by means of just one query (e.g. Oracle does just that) you better avoid doing it by means of Parser. An SQL-server will make it faster and better, minimizing memory requirements and the size of the code itself. Anyway, the process of building a tree structure like this would better be made a separate Parser method, just in case you may have to move to a server that doesn’t provide this possibility: in this case the changes will be made in a specific, logically complete code fragment.

The first thing you should do when writing a database query is use the functions provided by Parser.

We strongly recommend that you avoid writing MySQL’s LIMIT straight in the queries as long as this function is not provided by some other servers (such as Oracle or Informix). What you should do instead is change this query:

$tNews[^table::sql{
	SELECT
		id,
		name
	FROM
		news
	ORDER BY
		dt DESC
	LIMIT 5
}]
with this:
$tNews[^table::sql{
	SELECT
		id,
		name
	FROM
		news
	ORDER BY
		dt DESC
}[
	$.limit(5)
]]

Parser will fix the query code by itself, depending on the specific server you are working with, and this, in turn, guarantee the correct result.

RULE 1: avoid LIMIT/OFFSET in your queries.

Another thing is that when you are writing a database query and use a function which is provided by this specific server but which is not included in SQL92, put it into a separate method of the class designed for working with this very database server.

At present, I have several classes for the SQL-servers that I work with. Each class is a set of methods bearing the same names, so when moving the code written for MySQL to MSSQL, I simply link up another class and this is it!

RULE 2: put specific SQL functions into separate methods/classes.

Here are some methods from the class mysql.p for MySQL:

@today[]
$result[CURDATE()]
#end @today[]

@now[]
$result[NOW()]
#end @now[]

@year[source]
$result[YEAR($source)]
#end @year[]

@month[source]
$result[DATE_FORMAT($source,'%m')]
#end @month[]

@ymd[source]
$result[DATE_FORMAT($source,'%Y-%m-%d')]
#end @ymd[]

@date_format[source;format_string]
$result[DATE_FORMAT($source, '^if(def $format_string){$format_string}{%Y-%m-%d}')]
#end @date_format[]

Methods made to serve the same purpose in MSSQL:

@today[]
$result[CONVERT(char, GETDATE(), 111)]
#end @today[]

@now[]
$result[CONVERT(char, GETDATE(), 20)]
#end @now[]

@year[source]
$result[YEAR($source)]
#end @year[]

@month[source]
$result[MONTH($source)]
#end @month[]

@ymd[source]
$result[CONVERT(char, $source, 111)]
#end @ymd[]

@date_format[source;format_string]
$result[There is no such method in MSSQL :(]
#end @date_format[]

Of course, there are some useful functions, which are provided by some servers but are missing with others. In this case, I try to reach a compromise by using slightly different but compatible algorithms.

For example, if using MySQL I want to use flags indicating whether the message was added today, I may write it like this:

$tMessage[^table:sql{
	SELECT
		forum_message_id,
		name,
		....
		IF(^pSQL.date_format[dt_published;'%Y-%m-%d'] = ^pSQL.today[], 1, 0) AS new
	FROM
		forum_message
	...
}]

But when trying to transfer this query to MSSQL I will face a problem: MSSQL doesn’t provide opportunity to output a date in an arbitrary format (or maybe I just don’t know that it’s there; if so, please let me know).

If I rewrite the query like this:

$tMessage[^table:sql{
	SELECT
		forum_message_id,
		name,
		....
		IF(dt_published >= ^pSQL.today[], 1, 0) AS new
	FROM
		forum_message
	...
}]

I will have an opportunity to get the same result by means of MSSQL and Parser code will turn out compatible.

Another example:
When I wrote a code for MySQL not knowing that MSSQL didn’t support format strings, my code was like this:

$tNews[^table:sql{
	SELECT
		id,
		name,
	FROM
		news
	WHERE
		^pSQL.date_format[dt;%Y-%m] = '${year}-$month'
}]

I was so upset to learn that MSSQL didn’t support such formatting and I had to rewrite the code. I could rewrite the code for one specific project, but I would then have to rewrite it again and again for each and every project, so decided to supplement sql-classes with methods ^year[], ^month[], ^date[], ^ymd[] and change the query to the following:

$tNews[^table:sql{
	SELECT
		id,
		name,
	FROM
		news
	WHERE
		^pSQL.year[dt] = '$year' AND
		^pSQL.month[dt] = '$month'
}]

This made the query fully logical for MSSQL.

IMPORTANT!
This example is not so good regarding performance as long as the SQL-server will have to calculate functions year and month for every entry and only then select the needed entries. When such a query is processed, the server will not the indexes feature.

Sometimes, making changes in query logic is not worth messing with; it would be better to simply put the whole query into a separate method to be completely rewritten when you move to a different database server.

RULE 3: construction SELECT * FROM... is a bad dream to be forgotten. Always list the names of fields you need, even when you deal with a temporary code.

This is not only because this temporary code will gradually become permanent (and will be often copied from one place to another), and not only because the number of table fields will grow, and you will retrieve them all (cramming your memory) regardless of whether you need them or not at the very moment. This is also because when you do not list exactly the fields you need, everyone (including yourself) will find it rather hard to read a code like this.

As a result SQL classes were developed. Besides unification they allow to get information on the time all queries took to be processed while the document was formed, record “slow” queries, and cache the result of complex queries.