![]() |
About compatibility of SQL queries
Misha v.3 [February 16, 2007]
Much may be said about the necessity of writing a code that would work on different platforms, and the argument concerning its compatibility would be long.
Perfectly, we would like every piece of our code to be written once and for all, but … “‘perfectly’ is such a liar…”
Today I will tell you what I do to minimize changes that I would have to make in my Parser3 code when moving from one SQL-server to another.
As it is said in FAQ, Parser3 uses the same interface to work with different database servers, which means that you won’t have to rewrite Parser’s queries-related commands. That’s as good as it gets, but if the queries are written for some specific server, the task of moving to another server may prove a hard nut to crack.
No doubt, the compatibility shouldn’t become an end in itself. If the server you are working with allows building tree structures by means of just one query (e.g. Oracle does just that) you better avoid doing it by means of Parser. An SQL-server will make it faster and better, minimizing memory requirements and the size of the code itself. Anyway, the process of building a tree structure like this would better be made a separate Parser method, just in case you may have to move to a server that doesn’t provide this possibility: in this case the changes will be made in a specific, logically complete code fragment.
The first thing you should do when writing a database query is use the functions provided by Parser.
We strongly recommend that you avoid writing MySQL’s LIMIT straight in the queries as long as this function is not provided by some other servers (such as Oracle or Informix). What you should do instead is change this query:
$tNews[^table::sql{ SELECT id, name FROM news ORDER BY dt DESC LIMIT 5 }]with this:$tNews[^table::sql{ SELECT id, name FROM news ORDER BY dt DESC }[ $.limit(5) ]]Parser will fix the query code by itself, depending on the specific server you are working with, and this, in turn, guarantee the correct result.
RULE 1: avoid LIMIT/OFFSET in your queries.
Another thing is that when you are writing a database query and use a function which is provided by this specific server but which is not included in SQL92, put it into a separate method of the class designed for working with this very database server.
At present, I have several classes for the SQL-servers that I work with. Each class is a set of methods bearing the same names, so when moving the code written for MySQL to MSSQL, I simply link up another class and this is it!
RULE 2: put specific SQL functions into separate methods/classes.
Here are some methods from the class mysql.p for MySQL:
@today[] $result[CURDATE()] #end @today[] @now[] $result[NOW()] #end @now[] @year[source] $result[YEAR($source)] #end @year[] @month[source] $result[DATE_FORMAT($source,'%m')] #end @month[] @ymd[source] $result[DATE_FORMAT($source,'%Y-%m-%d')] #end @ymd[] @date_format[source;format_string] $result[DATE_FORMAT($source, '^if(def $format_string){$format_string}{%Y-%m-%d}')] #end @date_format[]Methods made to serve the same purpose in MSSQL:
@today[] $result[CONVERT(char, GETDATE(), 111)] #end @today[] @now[] $result[CONVERT(char, GETDATE(), 20)] #end @now[] @year[source] $result[YEAR($source)] #end @year[] @month[source] $result[MONTH($source)] #end @month[] @ymd[source] $result[CONVERT(char, $source, 111)] #end @ymd[] @date_format[source;format_string] $result[There is no such method in MSSQL :(] #end @date_format[]Of course, there are some useful functions, which are provided by some servers but are missing with others. In this case, I try to reach a compromise by using slightly different but compatible algorithms.
For example, if using MySQL I want to use flags indicating whether the message was added today, I may write it like this:
$tMessage[^table:sql{ SELECT forum_message_id, name, .... IF(^pSQL.date_format[dt_published;'%Y-%m-%d'] = ^pSQL.today[], 1, 0) AS new FROM forum_message ... }]But when trying to transfer this query to MSSQL I will face a problem: MSSQL doesn’t provide opportunity to output a date in an arbitrary format (or maybe I just don’t know that it’s there; if so, please let me know).
If I rewrite the query like this:
$tMessage[^table:sql{ SELECT forum_message_id, name, .... IF(dt_published >= ^pSQL.today[], 1, 0) AS new FROM forum_message ... }]I will have an opportunity to get the same result by means of MSSQL and Parser code will turn out compatible.
Another example:
When I wrote a code for MySQL not knowing that MSSQL didn’t support format strings, my code was like this:$tNews[^table:sql{ SELECT id, name, FROM news WHERE ^pSQL.date_format[dt;%Y-%m] = '${year}-$month' }]I was so upset to learn that MSSQL didn’t support such formatting and I had to rewrite the code. I could rewrite the code for one specific project, but I would then have to rewrite it again and again for each and every project, so decided to supplement sql-classes with methods ^year[], ^month[], ^date[], ^ymd[] and change the query to the following:
$tNews[^table:sql{ SELECT id, name, FROM news WHERE ^pSQL.year[dt] = '$year' AND ^pSQL.month[dt] = '$month' }]This made the query fully logical for MSSQL.
IMPORTANT!
This example is not so good regarding performance as long as the SQL-server will have to calculate functions year and month for every entry and only then select the needed entries. When such a query is processed, the server will not the indexes feature.Sometimes, making changes in query logic is not worth messing with; it would be better to simply put the whole query into a separate method to be completely rewritten when you move to a different database server.
RULE 3: construction SELECT * FROM... is a bad dream to be forgotten. Always list the names of fields you need, even when you deal with a temporary code.
This is not only because this temporary code will gradually become permanent (and will be often copied from one place to another), and not only because the number of table fields will grow, and you will retrieve them all (cramming your memory) regardless of whether you need them or not at the very moment. This is also because when you do not list exactly the fields you need, everyone (including yourself) will find it rather hard to read a code like this.
It happened so that I was overly dissatisfied with the functionality of standard ^table::sql{}, ^hash::sql{} & Co. For example, I wanted to occasionally get information on the time all queries took to be processed while the document was formed, record “slow” queries, and cache the result of complex queries.
Since I deemed it incorrect to make changes directly in Parser’s code, I decided to write SQL classes, which would provide the functionality I needed.
To use appropriate class I put the next code into method @auto[]: (of cause I mustn’t forget to include this class by means of @USE or ^use[]):
$pSQL[^mysql::init[$SQL.connect-string; $.cache_dir[/../data/sql_cache] ]]And then I have, in $pSQL, an instance of class mysql, the methods of which may be addressed from any place in Parser code:@main[] <html> ^pSQL.server{ ^rem{ *** retrieve a part of data into Parser table $tPage *** } <head> <title>$tPage.title</title> </head> ... ^rem{ *** all methods which can perform database queries, since we’re already connected *** } ^printPageMenu[] ^rem{ *** add an entry to table info *** } ^pSQL.void{INSERT INTO info (name) VALUES ('$form:name')} ^rem{ *** retrieve last_insert_id() and write it into the field sort_order of the last inserted entry *** } $tmp[^pSQL.set_last_insert_id[info;sort_order]] ... } </html>Note: I always use one ^connect[] per page (with ^SQL.server{} starting at the beginning of the page and ending at the end). This rule holds true except for those cases when I need sql transactions (by the way, with the new classes, you don’t have to place sql queries inside server{} anymore).
To receive debug information, the $pSQL object should be initialized in a different way:
$pSQL[^mysql::init[$SQL.connect-string; $.is_debug(1) $.cache_dir[/../data/sql_cache] ]]and the statistics can be received by means of a method placed in @postprocess[]:
@postprocess[sBody] $result[$sBody] ^if(def $pSQL){ ^if(def $form:mode && ^form:tables.mode.locate[field;debug]){ ^rem{ *** if addressing ?mode=debug we receive and save information on all sql queries on page *** } ^pSQL.printStatistics[$.file[/../data/sql.txt]] }{ ^rem{ *** by default, we store to log only the information related to slow queries and pages with multiple queries *** } ^pSQL.printStatistics[ $.file[/../data/sql.log] $.debug_time_limit(500) $.debug_queries_limit(25) # $.debug_result_limit(3000) # $.is_expand_exceeded_queries_to_log(1) ] } }If the document is addressed with ?mode=debug, information on all sql queries we had to form it through object $pSQL will be stored to file sql.txt. In usual mode, whenever an sql query takes more than 500ms, or the number of queries put in page generation exceeds the limit specified in debug_queries_limit, a record will be made in sql.log.
This feature allows me to always use $.is_debug(1) initializing and make all sql queries through object $pSQL. I also occasionally check sql.log and see which queries were poorly written and must be fixed.
It’s worth mentioning that, when outputting debug information with mysql, every query is accompanied by an appropriate explanation, which is also recorded in the log-file. This makes reading the log easier in that the problem becomes clearer. It’s also worth mentioning that only problematic issues and statistics are accompanied by explanations, and $.is_debug(1) is used for all queries only to measure time/memory taken by each query. This makes the whole procedure less time- and memory-consuming.
And, in conclusion, let me say a word or two about caching the results of the queries.
Working with SQL server is very convenient: the stored information is structured and may be retrieved in a specific way. Sometimes, however, it is difficult, if not impossible, to optimize a complex query without overcomplicating it. If such a query operates data which is rarely modified, the simplest way out of it is to cache its results.
Actually, SQL servers provide the machinery of caching the queries but “poultry farmers” don’t always have access to its settings. In this case, the sql classes we are discussing may come in handy and are easy to use:
$tReqult[^pSQL.table{ SELECT ... FROM table_1, table_2, table_3 WHERE ... GROUP BY ... HAVING ... ORDER BY ... }[ $.limit(10) ][ $.file[hardcore_query.txt] $.cache_interval(1/24) ]]In this case, a complex sql-query will be performed and its results will be stored in file hardcore_query.txt. When the results of the query are needed (e.g. when another visitor requests the page), the sql-query will not be repeated. Its results will be taken from the saved file.
The specified caching parameter indicates that the lifetime of the results saved to file expire every hour. The file will then be deleted and a new visit will initiate a new query, the results of which will be stored to file, and then the new caching cycle will be started.
Some more parameters may be used: $.cache_expiration_time[time] will clear the cache-file with the first address after the specified time. This parameter may be used together with cache_interval. In this case, the cache file will be deleted once per specified interval as well as when the expiration time comes.
To force deleting the cache file, the following method should be called:
^pSQL.clear[hardcore_query.txt]It may be useful when you work in administrator interface and you need to force deleting cache file, as long as the data in the related tables is modified.
Download:
Sql.zip (30.10.2007 15.6 KB)
Classes for working with MySQL, Oracle, MSSQL and PgSQLIMPORTANT!
The classes use some operators described in the corresponding example (unfortunately, this article hasn't been translated yet, so you can just download the needed class and use it).
| E-mail: mailbox@parser.ru | Copyright © 1997–2010 | Art Lebedev Studio |