The correct way to avoid SQL injection attacks, no matter which database you use, is to separate the data from SQL, so that data stays data and will never be interpreted as commands by the SQL parser. It is possible to create an SQL statement with correctly formatted data parts, but if you don't fully understand the details, you should always use prepared statements and parameterized queries. These are SQL statements that are sent to and parsed by the database server separately from any parameters. This way it is impossible for an attacker to inject malicious SQL.
You basically have two options to achieve this:
Using PDO (for any supported database driver):
$stmt = $pdo->prepare('SELECT * FROM employees WHERE name = :name');
$stmt->execute([ 'name' => $name ]);
foreach ($stmt as $row) {
    // Do something with $row
}
Using MySQLi (for MySQL):
$stmt = $dbConnection->prepare('SELECT * FROM employees WHERE name = ?');
$stmt->bind_param('s', $name); // 's' specifies the variable type => 'string'
$stmt->execute();
$result = $stmt->get_result();
while ($row = $result->fetch_assoc()) {
    // Do something with $row
}
If you're connecting to a database other than MySQL, there is a driver-specific second option that you can refer to (for example, pg_prepare() and pg_execute() for PostgreSQL). PDO is the universal option.
Correctly setting up the connection
Note that when using PDO to access a MySQL database, real prepared statements are not used by default. To fix this, you have to disable the emulation of prepared statements. An example of creating a connection using PDO is:
$dbConnection = new PDO('mysql:dbname=dbtest;host=127.0.0.1;charset=utf8', 'user', 'password');
$dbConnection->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
$dbConnection->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
In the above example, the error mode isn't strictly necessary, but it is advisable to add it. With it, PDO reports every database error by throwing a PDOException, which gives the developer the chance to catch and handle the error.
What is mandatory, however, is the first setAttribute() line, which tells PDO to disable emulated prepared statements and use real prepared statements. This makes sure the statement and the values aren't parsed by PHP before being sent to the MySQL server (giving a possible attacker no chance to inject malicious SQL).
Although you can set the charset in the options of the constructor, it's important to note that 'older' versions of PHP (before 5.3.6) silently ignored the charset parameter in the DSN.
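The same settings can also be passed as constructor options, so they apply from the first query on. What follows is only a sketch: buildMysqlDsn() and securePdoOptions() are hypothetical helper names, and the host, database name and credentials are the placeholders from the example above.

```php
<?php
// Hypothetical helper: assembles the DSN shown above from its parts.
function buildMysqlDsn(string $host, string $dbName, string $charset): string
{
    return sprintf('mysql:dbname=%s;host=%s;charset=%s', $dbName, $host, $charset);
}

// Hypothetical helper: the two attributes from the example above,
// expressed as an options array for the PDO constructor.
function securePdoOptions(): array
{
    return [
        PDO::ATTR_EMULATE_PREPARES => false,                  // use real prepared statements
        PDO::ATTR_ERRMODE          => PDO::ERRMODE_EXCEPTION, // report errors as PDOException
    ];
}

$dsn = buildMysqlDsn('127.0.0.1', 'dbtest', 'utf8');
$options = securePdoOptions();

// Connecting needs a reachable MySQL server, so it is left commented out:
// $dbConnection = new PDO($dsn, 'user', 'password', $options);
```

Keeping the options in one place makes it harder to forget the setAttribute() call on a connection created elsewhere in the code.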
Explanation
The SQL statement you pass to prepare is parsed and compiled by the database server. By specifying parameters (either a ? or a named parameter like :name in the example above) you tell the database engine which values you want to filter on. Then when you call execute, the prepared statement is combined with the parameter values you specify.
The important thing here is that the parameter values are combined with the compiled statement, not an SQL string. SQL injection works by tricking the script into including malicious strings when it creates SQL to send to the database. So by sending the actual SQL separately from the parameters, you limit the risk of ending up with something you didn't intend.
Any parameters you send when using a prepared statement will just be treated as strings (although the database engine may do some optimization so parameters may end up as numbers too, of course). In the example above, if the $name variable contains 'Sarah'; DELETE FROM employees the result would simply be a search for the string "'Sarah'; DELETE FROM employees", and you will not end up with an empty table.
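To see this behavior without a MySQL server, here is a minimal sketch that assumes the pdo_sqlite extension is available and uses an in-memory SQLite database as a stand-in. The table contents are made up for illustration.

```php
<?php
// Assumption: pdo_sqlite is installed; SQLite stands in for the real server.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec("INSERT INTO employees (name) VALUES ('Sarah'), ('Tom')");

// Hostile input: concatenated into the SQL text, it would close the string
// literal and append a second statement.
$name = "'Sarah'; DELETE FROM employees";

$stmt = $pdo->prepare('SELECT * FROM employees WHERE name = :name');
$stmt->execute(['name' => $name]);
$matches = $stmt->fetchAll(PDO::FETCH_ASSOC);

// Count the rows afterwards to confirm that nothing was deleted.
$remaining = (int) $pdo->query('SELECT COUNT(*) FROM employees')->fetchColumn();

echo count($matches), ' matching rows, ', $remaining, " rows still in the table\n";
```

The bound value is only ever compared against the name column as a string, so no row matches and both rows are still there.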
Another benefit of using prepared statements is that if you execute the same statement many times in the same session it will only be parsed and compiled once, giving you some speed gains.
Oh, and since you asked about how to do it for an insert, here's an example (using PDO):
$preparedStatement = $db->prepare('INSERT INTO table (column) VALUES (:column)');
$preparedStatement->execute([ 'column' => $unsafeValue ]);
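A prepared insert can also be executed repeatedly with different values, which is where the "parsed once" benefit mentioned above comes in. This is a sketch only, again assuming pdo_sqlite and an in-memory database; the notes table and the sample values are invented for the example.

```php
<?php
// Assumption: pdo_sqlite is installed; the table and values are illustrative.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE notes (body TEXT)');

$unsafeValues = ['plain text', "O'Reilly", "x'); DROP TABLE notes; --"];

// Prepare once, execute once per value.
$insert = $db->prepare('INSERT INTO notes (body) VALUES (:body)');
foreach ($unsafeValues as $unsafeValue) {
    $insert->execute(['body' => $unsafeValue]);
}

// Every value comes back exactly as it went in, quotes and all.
$stored = $db->query('SELECT body FROM notes ORDER BY rowid')->fetchAll(PDO::FETCH_COLUMN);
```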
Can prepared statements be used for dynamic queries?
While you can still use prepared statements for the query parameters, the structure of the dynamic query itself (for example table names, column names, or the sort direction) cannot be parameterized.
For these specific scenarios, the best thing to do is use a whitelist filter that restricts the possible values.
// Value whitelist
// $dir can only be 'DESC', otherwise it will be 'ASC'
if (empty($dir) || $dir !== 'DESC') {
    $dir = 'ASC';
}
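The same idea extends to column names. Here is a small sketch in which buildOrderBy() and the column list are hypothetical, chosen only for illustration: the clause is assembled from hard-coded values, and the request input is only used to pick among them.

```php
<?php
// Hypothetical helper: builds an ORDER BY clause only from values that
// appear in hard-coded lists, never from the raw input itself.
function buildOrderBy(string $column, string $dir): string
{
    $allowedColumns = ['name', 'hired_on', 'salary']; // illustrative column names
    if (!in_array($column, $allowedColumns, true)) {
        $column = 'name'; // fall back to a known-safe default
    }
    // $dir can only be 'DESC', otherwise it will be 'ASC'
    $dir = ($dir === 'DESC') ? 'DESC' : 'ASC';

    return "ORDER BY $column $dir";
}

// The value to filter on still goes through a bound parameter.
$sql = 'SELECT * FROM employees WHERE name = :name ' . buildOrderBy('salary', 'DESC');

// Hostile input never reaches the SQL text; the defaults are used instead.
$hostile = buildOrderBy('salary; DROP TABLE employees', 'DESC; --');
```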
Syntax:
ALTER TABLE {TABLENAME}
ADD {COLUMNNAME} {TYPE} {NULL|NOT NULL}
CONSTRAINT {CONSTRAINT_NAME} DEFAULT {DEFAULT_VALUE}
WITH VALUES
Example:
ALTER TABLE SomeTable
    ADD SomeCol Bit NULL --Or NOT NULL.
    CONSTRAINT D_SomeTable_SomeCol --When Omitted a Default-Constraint Name is autogenerated.
    DEFAULT (0) --Optional Default-Constraint.
    WITH VALUES --Add if Column is Nullable and you want the Default Value for Existing Records.
Notes:
Optional Constraint Name:
If you leave out CONSTRAINT D_SomeTable_SomeCol then SQL Server will autogenerate
a Default-Constraint with a funny Name like: DF__SomeTa__SomeC__4FB7FEF6
Optional With-Values Statement:
The WITH VALUES is only needed when your Column is Nullable
and you want the Default Value used for Existing Records.
If your Column is NOT NULL, then it will automatically use the Default Value
for all Existing Records, whether you specify WITH VALUES or not.
How Inserts work with a Default-Constraint:
If you insert a Record into SomeTable and do not Specify SomeCol's value, then it will Default to 0.
If you insert a Record and Specify SomeCol's value as NULL (and your column allows nulls),
then the Default-Constraint will not be used and NULL will be inserted as the Value.
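The insert behavior just described can be tried out without SQL Server. The sketch below is an assumption-laden stand-in: it uses PHP's pdo_sqlite with an in-memory SQLite database, so it illustrates only how a column default reacts to an omitted value versus an explicit NULL, not ALTER TABLE or WITH VALUES.

```php
<?php
// Assumption: SQLite (via pdo_sqlite) stands in for SQL Server here.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE SomeTable (Id INTEGER PRIMARY KEY, SomeCol INTEGER DEFAULT 0)');

// SomeCol is not specified: the default applies.
$pdo->exec('INSERT INTO SomeTable (Id) VALUES (1)');

// SomeCol is specified as NULL: the default is not used.
$pdo->exec('INSERT INTO SomeTable (Id, SomeCol) VALUES (2, NULL)');

$rows = $pdo->query('SELECT Id, SomeCol FROM SomeTable ORDER BY Id')->fetchAll(PDO::FETCH_ASSOC);

foreach ($rows as $row) {
    echo $row['Id'], ' => ', var_export($row['SomeCol'], true), "\n";
}
```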
Notes were based on everyone's great feedback below.
Special Thanks to:
@Yatrix, @WalterStabosz, @YahooSerious, and @StackMan for their Comments.
Best Solution
The short answer is yes, yes there is a way to get around mysql_real_escape_string().
For Very OBSCURE EDGE CASES!!!
The long answer isn't so easy. It's based off an attack demonstrated here.
The Attack
So, let's start off by showing the attack...
In certain circumstances, that will return more than 1 row. Let's dissect what's going on here:
Selecting a Character Set
For this attack to work, we need the encoding that the server's expecting on the connection both to encode ' as in ASCII, i.e. 0x27, and to have some character whose final byte is an ASCII \, i.e. 0x5c. As it turns out, there are 5 such encodings supported in MySQL 5.6 by default: big5, cp932, gb2312, gbk and sjis. We'll select gbk here.
Now, it's very important to note the use of SET NAMES here. This sets the character set ON THE SERVER. If we used the call to the C API function mysql_set_charset(), we'd be fine (on MySQL releases since 2006). But more on why in a minute...
The Payload
The payload we're going to use for this injection starts with the byte sequence 0xbf27. In gbk, that's an invalid multibyte character; in latin1, it's the string ¿'. Note that in latin1 and gbk, 0x27 on its own is a literal ' character.
We have chosen this payload because, if we called addslashes() on it, we'd insert an ASCII \, i.e. 0x5c, before the ' character. So we'd wind up with 0xbf5c27, which in gbk is a two character sequence: 0xbf5c followed by 0x27. Or in other words, a valid character followed by an unescaped '. But we're not using addslashes(). So on to the next step...
mysql_real_escape_string()
The C API call to mysql_real_escape_string() differs from addslashes() in that it knows the connection character set. So it can perform the escaping properly for the character set that the server is expecting. However, up to this point, the client thinks that we're still using latin1 for the connection, because we never told it otherwise. We did tell the server we're using gbk, but the client still thinks it's latin1.
Therefore the call to mysql_real_escape_string() inserts the backslash, and we have a free hanging ' character in our "escaped" content! In fact, if we were to look at $var in the gbk character set, we'd see a valid two-byte character followed by an unescaped ', which is exactly what the attack requires.
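The byte-level effect can be reproduced without a database. The sketch below uses addslashes(), which for this two-byte payload produces the same bytes the text describes. looksLikeGbkPair() is a hypothetical helper, and it rests on the assumption that a gbk double-byte character has a lead byte in 0x81-0xfe and a trail byte in 0x40-0xfe other than 0x7f.

```php
<?php
// The two-byte payload from the text: 0xbf followed by an ASCII quote (0x27).
$payload = "\xbf\x27";

// Charset-unaware escaping places a backslash (0x5c) before the quote.
$escaped = addslashes($payload);

// Hypothetical helper, rough check only. Assumed gbk ranges:
// lead byte 0x81-0xfe, trail byte 0x40-0xfe except 0x7f.
function looksLikeGbkPair(string $bytes): bool
{
    if (strlen($bytes) < 2) {
        return false;
    }
    $lead = ord($bytes[0]);
    $trail = ord($bytes[1]);

    return $lead >= 0x81 && $lead <= 0xfe
        && $trail >= 0x40 && $trail <= 0xfe && $trail !== 0x7f;
}

$hex = bin2hex($escaped);                                 // bytes after escaping
$payloadPair = looksLikeGbkPair($payload);                // 0xbf27: 0x27 is not a valid trail byte
$escapedPair = looksLikeGbkPair(substr($escaped, 0, 2));  // 0xbf5c: fits the double-byte ranges
$leftover = substr($escaped, 2);                          // what follows the two-byte character

echo $hex, "\n";
```

Under that range assumption, the backslash meant to neutralize the quote is absorbed as a trail byte, and the quote that remains is unescaped.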
The Query
This part is just a formality, but here's the rendered query:
Congratulations, you just successfully attacked a program using mysql_real_escape_string()...
The Bad
It gets worse.
PDO defaults to emulating prepared statements with MySQL. That means that on the client side, it basically does a sprintf through mysql_real_escape_string() (in the C library), which means the following will result in a successful injection:
Now, it's worth noting that you can prevent this by disabling emulated prepared statements:
This will usually result in a true prepared statement (i.e. the data being sent over in a separate packet from the query). However, be aware that PDO will silently fall back to emulating statements that MySQL can't prepare natively: those that it can are listed in the manual (but be sure to select the appropriate server version).
The Ugly
I said at the very beginning that we could have prevented all of this if we had used mysql_set_charset('gbk') instead of SET NAMES gbk. And that's true provided you are using a MySQL release since 2006.
If you're using an earlier MySQL release, then a bug in mysql_real_escape_string() meant that invalid multibyte characters such as those in our payload were treated as single bytes for escaping purposes even if the client had been correctly informed of the connection encoding, and so this attack would still succeed. The bug was fixed in MySQL 4.1.20, 5.0.22 and 5.1.11.
But the worst part is that PDO didn't expose the C API for mysql_set_charset() until PHP 5.3.6, so in prior versions it cannot prevent this attack for every possible command! It's now exposed as a DSN parameter.
The Saving Grace
As we said at the outset, for this attack to work the database connection must be encoded using a vulnerable character set. utf8mb4 is not vulnerable and yet can support every Unicode character, so you could elect to use that instead, but it has only been available since MySQL 5.5.3. An alternative is utf8, which is also not vulnerable and can support the whole of the Unicode Basic Multilingual Plane.
Alternatively, you can enable the NO_BACKSLASH_ESCAPES SQL mode, which (amongst other things) alters the operation of mysql_real_escape_string(). With this mode enabled, 0x27 will be replaced with 0x2727 rather than 0x5c27, and thus the escaping process cannot create valid characters in any of the vulnerable encodings where they did not exist previously (i.e. 0xbf27 is still 0xbf27 etc.), so the server will still reject the string as invalid. However, see @eggyal's answer for a different vulnerability that can arise from using this SQL mode.
Safe Examples
The following examples are safe:
Because the server's expecting utf8...
Because we've properly set the character set so the client and the server match.
Because we've turned off emulated prepared statements.
Because we've set the character set properly.
Because MySQLi does true prepared statements all the time.
Wrapping Up
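If you:
Use a MySQL release that includes the fix described above (4.1.20, 5.0.22, 5.1.11 or later) AND mysql_set_charset() / $mysqli->set_charset() / PDO's DSN charset parameter (in PHP ≥ 5.3.6)
OR
Don't use a vulnerable character set for the connection encoding (you only use utf8 / latin1 / ascii / etc)
You're 100% safe.
Otherwise, you're vulnerable even though you're using mysql_real_escape_string()...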