Python – Using sqlalchethe session to execute sql DRASTICALLY slows execution time


I have a rather long query (was 7 joins, now is 7 subselects because in raw sql 7 subselects was considerably faster– I don't even know when the 7 joins would have finished if I'd let it run, but longer than 1 min versus .05-.1 seconds with subselects)

When I run it on the db it takes, as I said, .05-.1 seconds to execute. Simply using session.execute() slows it down to over a minute!

Is there anything I can do?

Let me know if you need more info– I kind of suspect this is a general sqlalchemy thing– like maybe sqlalchemy is setting up a query plan instead of just letting mysql do it? Or…?

EDIT: ran explain on both and they seem identical except that sqlalchemy adds a "using temporary; using filesort" to the extra column. Is that what's slowing it down? How do I stop it from doing that?

EDIT 2: DEFINITELY sqlalchemy. I tried using a MySQL cursor to execute instead of an SA session and got the same .05 second runtime.


The code to create our engine:

engine_ro = create_engine(
    config.ro_database_url, #string with username, password, db
    pool_size=config.database_pool_size, #int
    max_overflow=config.database_max_overflow, #int
    pool_timeout=config.database_timeout, # int
    echo=config.database_echo, #False
    echo_pool=config.database_echo, #same as echo #False
    listeners=[GoneAway()] if config.database_use_listeners else None)

where GoneAway() is a method that executes a SELECT 1 to check the connection.

To create the session object:

SessionRO = scoped_session(sessionmaker(bind=engine_ro, autocommit=False))

where scoped_session and sessionmaker are sqlalchemy functions.

Then, the code that executes the query:

session = SessionRO()
results = session.execute(sql, params)

EDIT 4: In case anyone is wondering, if I comment out the listeners bit, it's still slow. As well if I just use sessionmaker without scoped_session.

Best Solution

sqlalchemy doesn't set up a query plan, or anything else fancy. It just generates SQL and sends it over a DB-API-2.0 connection. So, if you explicitly call execute with the same statement that sqlalchemy generates, it will run in exactly the same way.*

The simplest way to see what queries sqlalchemy is generating is to pass echo=True as an extra param on the create_engine call.

In your case, the query generated by sqlalchemy was in fact different from your manual query, because it was testing an integer parameter with a string, instead of with an int.

* This isn't 100% guaranteed; you have to make sure that any connection parameters in the DB-API-2.0 connect function are the same, and that neither you nor sqlalchemy executed any PRAGMA statements. But you can test those in much the same way you can test the query itself.