The correct way to avoid SQL injection attacks, no matter which database you use, is to separate the data from SQL, so that data stays data and will never be interpreted as commands by the SQL parser. It is possible to create SQL statement with correctly formatted data parts, but if you don't fully understand the details, you should always use prepared statements and parameterized queries. These are SQL statements that are sent to and parsed by the database server separately from any parameters. This way it is impossible for an attacker to inject malicious SQL.
You basically have two options to achieve this:
Using PDO (for any supported database driver):
$stmt = $pdo->prepare('SELECT * FROM employees WHERE name = :name');
$stmt->execute([ 'name' => $name ]);
foreach ($stmt as $row) {
// Do something with $row
}
Using MySQLi (for MySQL):
$stmt = $dbConnection->prepare('SELECT * FROM employees WHERE name = ?');
$stmt->bind_param('s', $name); // 's' specifies the variable type => 'string'
$stmt->execute();
$result = $stmt->get_result();
while ($row = $result->fetch_assoc()) {
// Do something with $row
}
If you're connecting to a database other than MySQL, there is a driver-specific second option that you can refer to (for example, pg_prepare()
and pg_execute()
for PostgreSQL). PDO is the universal option.
Correctly setting up the connection
Note that when using PDO to access a MySQL database real prepared statements are not used by default. To fix this you have to disable the emulation of prepared statements. An example of creating a connection using PDO is:
$dbConnection = new PDO('mysql:dbname=dbtest;host=127.0.0.1;charset=utf8', 'user', 'password');
$dbConnection->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
$dbConnection->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
In the above example the error mode isn't strictly necessary, but it is advised to add it. This way the script will not stop with a Fatal Error
when something goes wrong. And it gives the developer the chance to catch
any error(s) which are throw
n as PDOException
s.
What is mandatory, however, is the first setAttribute()
line, which tells PDO to disable emulated prepared statements and use real prepared statements. This makes sure the statement and the values aren't parsed by PHP before sending it to the MySQL server (giving a possible attacker no chance to inject malicious SQL).
Although you can set the charset
in the options of the constructor, it's important to note that 'older' versions of PHP (before 5.3.6) silently ignored the charset parameter in the DSN.
Explanation
The SQL statement you pass to prepare
is parsed and compiled by the database server. By specifying parameters (either a ?
or a named parameter like :name
in the example above) you tell the database engine where you want to filter on. Then when you call execute
, the prepared statement is combined with the parameter values you specify.
The important thing here is that the parameter values are combined with the compiled statement, not an SQL string. SQL injection works by tricking the script into including malicious strings when it creates SQL to send to the database. So by sending the actual SQL separately from the parameters, you limit the risk of ending up with something you didn't intend.
Any parameters you send when using a prepared statement will just be treated as strings (although the database engine may do some optimization so parameters may end up as numbers too, of course). In the example above, if the $name
variable contains 'Sarah'; DELETE FROM employees
the result would simply be a search for the string "'Sarah'; DELETE FROM employees"
, and you will not end up with an empty table.
Another benefit of using prepared statements is that if you execute the same statement many times in the same session it will only be parsed and compiled once, giving you some speed gains.
Oh, and since you asked about how to do it for an insert, here's an example (using PDO):
$preparedStatement = $db->prepare('INSERT INTO table (column) VALUES (:column)');
$preparedStatement->execute([ 'column' => $unsafeValue ]);
Can prepared statements be used for dynamic queries?
While you can still use prepared statements for the query parameters, the structure of the dynamic query itself cannot be parametrized and certain query features cannot be parametrized.
For these specific scenarios, the best thing to do is use a whitelist filter that restricts the possible values.
// Value whitelist
// $dir can only be 'DESC', otherwise it will be 'ASC'
if (empty($dir) || $dir !== 'DESC') {
$dir = 'ASC';
}
The CASE
statement is the closest to IF in SQL and is supported on all versions of SQL Server.
SELECT CAST(
CASE
WHEN Obsolete = 'N' or InStock = 'Y'
THEN 1
ELSE 0
END AS bit) as Saleable, *
FROM Product
You only need to use the CAST
operator if you want the result as a Boolean value. If you are happy with an int
, this works:
SELECT CASE
WHEN Obsolete = 'N' or InStock = 'Y'
THEN 1
ELSE 0
END as Saleable, *
FROM Product
CASE
statements can be embedded in other CASE
statements and even included in aggregates.
SQL Server Denali (SQL Server 2012) adds the IIF statement which is also available in access (pointed out by Martin Smith):
SELECT IIF(Obsolete = 'N' or InStock = 'Y', 1, 0) as Saleable, * FROM Product
Best Solution
I think the very short answer to your question is to use WITH clauses to generate your own.
Unfortunately, the big names in databases don't have built-in queryable number-range pseudo-tables. Or, more generally, easy pure-SQL data generation features. Personally, I think this is a huge failing, because if they did it would be possible to move a lot of code that is currently locked up in procedural scripts (T-SQL, PL/SQL, etc.) into pure-SQL, which has a number of benefits to performance and code complexity.
So anyway, it sounds like what you need in a general sense is the ability to generate data on the fly.
Oracle and T-SQL both support a WITH clause that can be used to do this. They work a little differently in the different DBMS's, and MS calls them "common table expressions", but they are very similar in form. Using these with recursion, you can generate a sequence of numbers or text values fairly easily. Here is what it might look like...
In Oracle SQL:
This is admittedly quite verbose. Oracle's recursion functionality is limited. The syntax is clunky, it's not performant, and it is limited to 500 (I think) nested levels. This is why I chose to use recursion only for the first 10 digits, and then cross (cartesian) joins to combine them into actual numbers.
I haven't used SQL Server's Common Table Expressions myself, but since they allow self-reference, recursion is MUCH simpler than it is in Oracle. Whether performance is comparable, and what the nesting limits are, I don't know.
At any rate, recursion and the WITH clause are very useful tools in creating queries that require on-the-fly generated data sets. Then by querying this data set, doing operations on the values, you can get all sorts of different types of generated data. Aggregations, duplications, combinations, permutations, and so on. You can even use such generated data to aid in rolling up or drilling down into other data.
UPDATE: I just want to add that, once you start working with data in this way, it opens your mind to new ways of thinking about SQL. It's not just a scripting language. It's a fairly robust data-driven declarative language. Sometimes it's a pain to use because for years it has suffered a dearth of enhancements to aid in reducing the redundancy needed for complex operations. But nonetheless it is very powerful, and a fairly intuitive way to work with data sets as both the target and the driver of your algorithms.