Connecting to MYSQL with Python 2 in three steps
1 - Setting
You must install a MySQL driver before doing anything. Unlike PHP, Only the SQLite driver is installed by default with Python. The most used package to do so is MySQLdb but it's hard to install it using easy_install. Please note MySQLdb only supports Python 2.
For Windows user, you can get an exe of MySQLdb.
For Linux, this is a casual package (python-mysqldb). (You can use sudo apt-get install python-mysqldb
(for debian based distros), yum install MySQL-python
(for rpm-based), or dnf install python-mysql
(for modern fedora distro) in command line to download.)
For Mac, you can install MySQLdb using Macport.
2 - Usage
After installing, Reboot. This is not mandatory, But it will prevent me from answering 3 or 4 other questions in this post if something goes wrong. So please reboot.
Then it is just like using any other package :
#!/usr/bin/python
import MySQLdb
db = MySQLdb.connect(host="localhost", # your host, usually localhost
user="john", # your username
passwd="megajonhy", # your password
db="jonhydb") # name of the data base
# you must create a Cursor object. It will let
# you execute all the queries you need
cur = db.cursor()
# Use all the SQL you like
cur.execute("SELECT * FROM YOUR_TABLE_NAME")
# print all the first cell of all the rows
for row in cur.fetchall():
print row[0]
db.close()
Of course, there are thousand of possibilities and options; this is a very basic example. You will have to look at the documentation. A good starting point.
3 - More advanced usage
Once you know how it works, You may want to use an ORM to avoid writing SQL manually and manipulate your tables as they were Python objects. The most famous ORM in the Python community is SQLAlchemy.
I strongly advise you to use it: your life is going to be much easier.
I recently discovered another jewel in the Python world: peewee. It's a very lite ORM, really easy and fast to setup then use. It makes my day for small projects or stand alone apps, Where using big tools like SQLAlchemy or Django is overkill :
import peewee
from peewee import *
db = MySQLDatabase('jonhydb', user='john', passwd='megajonhy')
class Book(peewee.Model):
author = peewee.CharField()
title = peewee.TextField()
class Meta:
database = db
Book.create_table()
book = Book(author="me", title='Peewee is cool')
book.save()
for book in Book.filter(author="me"):
print book.title
This example works out of the box. Nothing other than having peewee (pip install peewee
) is required.
Your observation suggests that myapp is terminating without reading (all of the) input. Not knowing anything about myapp, that's hard to confirm, but consider for example
$ echo 'hello world' | tr 'l' 'L'
heLLo worLd
now...:
>>> cmd = ['/usr/bin/tr']
>>> p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, stdin=subprocess.PIPE)
>>> out,err = p.communicate('hello world')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.5/subprocess.py", line 668, in communicate
return self._communicate(input)
File "/usr/lib/python2.5/subprocess.py", line 1218, in _communicate
bytes_written = self._write_no_intr(self.stdin.fileno(), buffer(input, input_offset, 512))
File "/usr/lib/python2.5/subprocess.py", line 997, in _write_no_intr
return os.write(fd, s)
OSError: [Errno 32] Broken pipe
because...:
>>> p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stdin=subprocess.PIPE)
>>> /usr/bin/tr: missing operand
Try `/usr/bin/tr --help' for more information.
and if we fix the bug:
>>> cmd = ['/usr/bin/tr', 'l', 'L']
>>> p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stdin=subprocess.PIPE)
>>> out,err = p.communicate('hello world')>>> print out
heLLo worLd
>>> print err
None
...it fixes everything. What happens if you omit the stderr redirection -- do you perchance see any error messages from myapp...?
Best Solution
You'd be a little happier with the following.
Delegate part of the work to the shell. Let it connect two processes with a pipeline.
You'd be a lot happier rewriting 'script.awk' into Python, eliminating awk and the pipeline.
Edit. Some of the reasons for suggesting that awk isn't helping.
[There are too many reasons to respond via comments.]
Awk is adding a step of no significant value. There's nothing unique about awk's processing that Python doesn't handle.
The pipelining from awk to sort, for large sets of data, may improve elapsed processing time. For short sets of data, it has no significant benefit. A quick measurement of
awk >file ; sort file
andawk | sort
will reveal of concurrency helps. With sort, it rarely helps because sort is not a once-through filter.The simplicity of "Python to sort" processing (instead of "Python to awk to sort") prevents the exact kind of questions being asked here.
Python -- while wordier than awk -- is also explicit where awk has certain implicit rules that are opaque to newbies, and confusing to non-specialists.
Awk (like the shell script itself) adds Yet Another Programming language. If all of this can be done in one language (Python), eliminating the shell and the awk programming eliminates two programming languages, allowing someone to focus on the value-producing parts of the task.
Bottom line: awk can't add significant value. In this case, awk is a net cost; it added enough complexity that it was necessary to ask this question. Removing awk will be a net gain.
Sidebar Why building a pipeline (
a | b
) is so hard.When the shell is confronted with
a | b
it has to do the following.Fork a child process of the original shell. This will eventually become b.
Build an os pipe. (not a Python subprocess.PIPE) but call
os.pipe()
which returns two new file descriptors that are connected via common buffer. At this point the process has stdin, stdout, stderr from its parent, plus a file that will be "a's stdout" and "b's stdin".Fork a child. The child replaces its stdout with the new a's stdout. Exec the
a
process.The b child closes replaces its stdin with the new b's stdin. Exec the
b
process.The b child waits for a to complete.
The parent is waiting for b to complete.
I think that the above can be used recursively to spawn
a | b | c
, but you have to implicitly parenthesize long pipelines, treating them as if they'rea | (b | c)
.Since Python has
os.pipe()
,os.exec()
andos.fork()
, and you can replacesys.stdin
andsys.stdout
, there's a way to do the above in pure Python. Indeed, you may be able to work out some shortcuts usingos.pipe()
andsubprocess.Popen
.However, it's easier to delegate that operation to the shell.