Python – urllib2 – post request

http-request, python, urllib2

I am trying to perform a simple POST request with urllib2.
However, the server's response indicates that it received a plain GET. I checked the method of the outgoing request, and it is set to POST.
To check whether the server behaves as I expect, I performed a GET request with the (former POST) data appended to the URL. This returned the answer I expected.
Does anybody have a clue what I have misunderstood?

import urllib
import urllib2

def connect(self):
    url = 'http://www.mitfahrgelegenheit.de/mitfahrzentrale/Dresden/Potsdam.html/'
    user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
    header = { 'User-Agent' : user_agent }

    values = {
      'city_from' : 69,
      'radius_from' : 0,
      'city_to' : 263,
      'radius_to' : 0,
      'date' : 'date',
      'day' : 5,
      'month' : 3,
      'year' : 2012,
      'tolerance' : 0
    }

    data = urllib.urlencode(values)
    # req = urllib2.Request(url+data, None, header) # GET works fine
    req = urllib2.Request(url, data, header)  # POST request does not work

    self.response = urllib2.urlopen(req)
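
For reference, urllib2 chooses the HTTP method from whether a data argument is passed, so the request built above really is a POST. The snippet below is a minimal sketch of that check. It is written against Python 3's urllib.request (the successor to urllib2), which is an assumption made so that it runs on a current interpreter; build_requests is a hypothetical helper and the URL is a placeholder. Nothing is sent over the network.

```python
import urllib.parse
import urllib.request


def build_requests(url, values):
    """Build a GET and a POST request that carry the same form values."""
    query = urllib.parse.urlencode(values)
    # No data argument: the values travel in the query string and the method is GET.
    get_req = urllib.request.Request(url + "?" + query)
    # With a data argument (bytes in Python 3), the method becomes POST.
    post_req = urllib.request.Request(url, data=query.encode("ascii"))
    return get_req, post_req


get_req, post_req = build_requests(
    "http://www.example.com/search",  # placeholder URL, not the real site
    {"city_from": 69, "city_to": 263},
)
print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.data)
```

If get_method() reports POST but the server still answers as if it had received a GET, the cause is more likely in what happens after the request leaves the client, for example a redirect.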

This seems to be a problem like the one discussed in "Python URLLib / URLLib2 POST", but I'm quite sure that in my case the trailing slash is not missing. 😉

I fear this might be a stupid misconception, but I have been puzzling over it for hours!

EDIT: A convenience function for printing:

def response_to_str(response):
    return response.read()

def dump_response_to_file(response):
    # the with block closes the file even if the write fails
    with open('dump.html', 'w') as f:
        f.write(response_to_str(response))

EDIT 2: Resolution:
I found a tool that captures the real interaction with the site: http://fiddler2.com/fiddler2/. Apparently the server takes the data from the input form, redirects a few times, and the final request is a GET with this data simply appended to the URL.
Everything is fine with urllib2, and I apologize for wasting your time!
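
The effect described in the resolution can be reproduced locally. The sketch below is only an illustration under stated assumptions: it uses Python 3's urllib.request instead of urllib2, and RedirectingHandler is a hypothetical stand-in for the site, not its real behavior. The handler answers a POST with a 302 redirect that carries the form data in the query string. The default redirect handling in urllib re-issues such a redirected POST as a GET without a body, so the final response comes from a GET.

```python
import threading
import urllib.parse
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectingHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the site: answers a POST with a 302 redirect."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        form = self.rfile.read(length).decode("ascii")
        self.send_response(302)
        # Hand the posted form data back as a query string on the redirect target.
        self.send_header("Location", "/result?" + form)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        # Echo the method and path so the client can see what finally arrived.
        body = ("GET " + self.path).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def post_and_follow(values):
    """POST to a throwaway local server and follow its redirect."""
    server = HTTPServer(("127.0.0.1", 0), RedirectingHandler)
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    try:
        url = "http://127.0.0.1:%d/search" % server.server_address[1]
        data = urllib.parse.urlencode(values).encode("ascii")
        req = urllib.request.Request(url, data)
        # An empty ProxyHandler keeps any system proxy out of the local test.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(req, timeout=5) as response:
            return req.get_method(), response.geturl(), response.read().decode("ascii")
    finally:
        server.shutdown()
        server.server_close()


method, final_url, body = post_and_follow({"city_from": 69, "city_to": 263})
print(method)     # method of the request as it was originally built
print(final_url)  # URL after the redirect was followed
print(body)       # what the server saw on the final request
```

The request object still reports POST, which matches the observation in the question, while the page that comes back was produced by a GET.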

Best Solution

Things you need to check:

  • Are you sure you are posting to the right URL?
  • Are you sure you can retrieve results without being logged in?
  • Show us some example output for different post values.

You can find the correct POST URL using Firefox's Firebug or Google Chrome's DevTools.
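
If a browser tool is not at hand, the form's target can also be read from the page source. The following is a rough sketch in Python 3 using only the standard library; FormFinder and find_forms are hypothetical names, and the sample markup is invented for illustration rather than taken from the site. It lists each form's resolved action URL, its method, and the names of its fields.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FormFinder(HTMLParser):
    """Collect the action URL, method, and field names of every form."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.forms = []
        self.in_form = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.in_form = True
            self.forms.append({
                # A missing action means the form submits to the page itself.
                "action": urljoin(self.base_url, attrs.get("action") or ""),
                # HTML forms default to GET when no method is given.
                "method": (attrs.get("method") or "get").upper(),
                "fields": [],
            })
        elif (self.in_form and tag in ("input", "select", "textarea")
              and attrs.get("name")):
            self.forms[-1]["fields"].append(attrs["name"])

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def find_forms(html, base_url):
    finder = FormFinder(base_url)
    finder.feed(html)
    return finder.forms


# Invented sample markup, standing in for a downloaded search page.
sample = (
    '<form action="/search" method="post">'
    '<input name="city_from"><input name="city_to">'
    '<select name="radius_from"></select>'
    '</form>'
)
forms = find_forms(sample, "http://www.example.com/page.html")
print(forms)
```

This only covers forms present in the static HTML. Forms that are built or submitted by JavaScript still need the browser's network panel.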

The code at the end of this answer supports cookies, so that you can log in first and then use the cookie to make the subsequent request with your POST parameters.

Finally, if you could show us some example HTML output, that would make life easier.

Here is my code, which has worked quite reliably for me so far for POSTing to most web pages, including pages protected against CSRF/XSRF, as long as you can correctly figure out what to post and which URL to post it to.

import cookielib
import socket
import urllib
import urllib2

url = 'http://www.mitfahrgelegenheit.de/mitfahrzentrale/Dresden/Potsdam.html/'
http_header = {
                "User-Agent" : "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.46 Safari/535.11",
                "Accept" : "text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,text/png,*/*;q=0.5",
                "Accept-Language" : "en-us,en;q=0.5",
                "Accept-Charset" : "ISO-8859-1",
                "Content-type": "application/x-www-form-urlencoded",
                "Host" : "www.mitfahrgelegenheit.de",
                "Referer" : "http://www.mitfahrgelegenheit.de/mitfahrzentrale/Dresden/Potsdam.html/"
                }

params = {
  'city_from' : 169,
  'radius_from' : 0,
  'city_to' : 263,
  'radius_to' : 0,
  'date' : 'date',
  'day' : 5,
  'month' : 3,
  'year' : 2012,
  'tolerance' : 0
}

# setup socket connection timeout
timeout = 15
socket.setdefaulttimeout(timeout)

# setup cookie handler
cookie_jar = cookielib.LWPCookieJar()
cookie = urllib2.HTTPCookieProcessor(cookie_jar)

# setup proxy handler, in case some day you need to use a proxy server
proxy = {} # example: {"http" : "www.blah.com:8080"}

# create an urllib2 opener()
#opener = urllib2.build_opener(urllib2.ProxyHandler(proxy), cookie) # with proxy
opener = urllib2.build_opener(cookie) # we are not going to use a proxy now

# create your HTTP request
req = urllib2.Request(url, urllib.urlencode(params), http_header)

# submit your request
res = opener.open(req)
html = res.read()

# save retrieved HTML to file
with open("tmp.html", "w") as f:
    f.write(html)
print html
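
The code above targets Python 2. As a hedged sketch of how the same setup might look on Python 3, where cookielib became http.cookiejar and urllib2 became urllib.request, the snippet below builds the cookie-aware opener and the POST request without sending anything. make_opener and make_post_request are hypothetical helper names, and the URL is a placeholder.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_opener():
    """Return an opener that stores and resends cookies, plus its cookie jar."""
    cookie_jar = http.cookiejar.LWPCookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(cookie_jar))
    return opener, cookie_jar


def make_post_request(url, params, headers):
    """Build (but do not send) a form-encoded POST request."""
    # Python 3 requires the request body as bytes, not str.
    body = urllib.parse.urlencode(params).encode("ascii")
    return urllib.request.Request(url, data=body, headers=headers)


opener, cookie_jar = make_opener()
req = make_post_request(
    "http://www.example.com/search",  # placeholder URL, not the real site
    {"city_from": 169, "city_to": 263},
    {"User-Agent": "Mozilla/5.0"},
)
# To actually send it: html = opener.open(req, timeout=15).read()
print(req.get_method(), req.data)
```

Reusing the same opener for a login request and the later search request lets the cookie jar carry the session between them.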