Skip to content
iTecNote
  • Python
  • Javascript
  • PHP
  • Java
  • Android
  • iOS
  • jQuery
  • MySQL

C# – Techniques for logging into websites programmatically

browserc++login-automation

I am trying to automate logging into Photobucket for API use for a project that requires automated photo downloading using stored credentials.

The API generates a URL to use for logging in, and using Firebug i can see what requests and responses are being sent/received.

My question is, how can i use HttpWebRequest and HttpWebResponse to mimic what happens in the browser in C#?

Would it be possible to use a web browser component inside a C# app, populate the username and password fields and submit the login?

Best Solution

I've done this kind of thing before, and ended up with a nice toolkit for writing these types of applications. I've used this toolkit to handle non-trivial back-n-forth web requests, so it's entirely possible, and not extremely difficult.

I found out quickly that doing the HttpWebRequest/HttpWebResponse from scratch really was lower-level than I wanted to be dealing with. My tools are based entirely around the HtmlAgilityPack by Simon Mourier. It's an excellent toolset. It does a lot of the heavy lifting for you, and makes parsing of the fetched HTML really easy. If you can rock XPath queries, the HtmlAgilityPack is where you want to start. It handles poorly foormed HTML quite well too!

You still need a good tool to help debug. Besides what you have in your debugger, being able to inspect the http/https traffic as it goes back-n-forth across the wire is priceless. Since you're code is going to be making these requests, not your browser, FireBug isn't going to be of much help debugging your code. There's all sorts of packet sniffer tools, but for HTTP/HTTPS debugging, I don't think you can beat the ease of use and power of Fiddler 2. The newest version even comes with a plugin for firefox to quickly divert requests through fiddler and back. Because it can also act as a seamless HTTPS proxy you can inspect your HTTPS traffic as well.

Give 'em a try, I'm sure they'll be two indispensable tools in your hacking.

Update: Added the below code example. This is pulled from a not-much-larger "Session" class that logs into a website and keeps a hold of the related cookies for you. I choose this because it does more than a simple 'please fetch that web page for me' code, plus it has a line-or-two of XPath querying against the final destination page.

public bool Connect() {
   if (string.IsNullOrEmpty(_Username)) { base.ThrowHelper(new SessionException("Username not specified.")); } 
   if (string.IsNullOrEmpty(_Password)) { base.ThrowHelper(new SessionException("Password not specified.")); }

   _Cookies = new CookieContainer();
   HtmlWeb webFetcher = new HtmlWeb();
   webFetcher.UsingCache = false;
   webFetcher.UseCookies = true;

   HtmlWeb.PreRequestHandler justSetCookies = delegate(HttpWebRequest webRequest) {
      SetRequestHeaders(webRequest, false);
      return true;
   };
   HtmlWeb.PreRequestHandler postLoginInformation = delegate(HttpWebRequest webRequest) {
      SetRequestHeaders(webRequest, false);

      // before we let webGrabber get the response from the server, we must POST the login form's data
      // This posted form data is *VERY* specific to the web site in question, and it must be exactly right,
      // and exactly what the remote server is expecting, otherwise it will not work!
      //
      // You need to use an HTTP proxy/debugger such as Fiddler in order to adequately inspect the 
      // posted form data. 
      ASCIIEncoding encoding = new ASCIIEncoding();
      string postDataString = string.Format("edit%5Bname%5D={0}&edit%5Bpass%5D={1}&edit%5Bform_id%5D=user_login&op=Log+in", _Username, _Password);
      byte[] postData = encoding.GetBytes(postDataString);
      webRequest.ContentType = "application/x-www-form-urlencoded";
      webRequest.ContentLength = postData.Length;
      webRequest.Referer = Util.MakeUrlCore("/user"); // builds a proper-for-this-website referer string

      using (Stream postStream = webRequest.GetRequestStream()) {
         postStream.Write(postData, 0, postData.Length);
         postStream.Close();
      }

      return true;
   };

   string loginUrl = Util.GetUrlCore(ProjectUrl.Login); 
   bool atEndOfRedirects = false;
   string method = "POST";
   webFetcher.PreRequest = postLoginInformation;

   // this is trimmed...this was trimmed in order to handle one of those 'interesting' 
   // login processes...
   webFetcher.PostResponse = delegate(HttpWebRequest webRequest, HttpWebResponse response) {
      if (response.StatusCode == HttpStatusCode.Found) {
         // the login process is forwarding us on...update the URL to move to...
         loginUrl = response.Headers["Location"] as String;
         method = "GET";
         webFetcher.PreRequest = justSetCookies; // we only need to post cookies now, not all the login info
      } else {
         atEndOfRedirects = true;
      }

      foreach (Cookie cookie in response.Cookies) {
         // *snip*
      }
   };

   // Real work starts here:
   HtmlDocument retrievedDocument = null;
   while (!atEndOfRedirects) {
      retrievedDocument = webFetcher.Load(loginUrl, method);
   }


   // ok, we're fully logged in.  Check the returned HTML to see if we're sitting at an error page, or
   // if we're successfully logged in.
   if (retrievedDocument != null) {
      HtmlNode errorNode = retrievedDocument.DocumentNode.SelectSingleNode("//div[contains(@class, 'error')]");
      if (errorNode != null) { return false; }
   }

   return true; 
}


public void SetRequestHeaders(HttpWebRequest webRequest) { SetRequestHeaders(webRequest, true); }
public void SetRequestHeaders(HttpWebRequest webRequest, bool allowAutoRedirect) {
   try {
      webRequest.AllowAutoRedirect = allowAutoRedirect;
      webRequest.CookieContainer = _Cookies;

      // the rest of this stuff is just to try and make our request *look* like FireFox. 
      webRequest.UserAgent = @"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3";
      webRequest.Accept = @"text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
      webRequest.KeepAlive = true;
      webRequest.Headers.Add(@"Accept-Language: en-us,en;q=0.5");
      //webRequest.Headers.Add(@"Accept-Encoding: gzip,deflate");
   }
   catch (Exception ex) { base.ThrowHelper(ex); }
}

Related Solutions

C# – What are the correct version numbers for C#

C# language version history:

These are the versions of C# known about at the time of this writing:

  • C# 1.0 released with .NET 1.0 and VS2002 (January 2002)
  • C# 1.2 (bizarrely enough); released with .NET 1.1 and VS2003 (April 2003). First version to call Dispose on IEnumerators which implemented IDisposable. A few other small features.
  • C# 2.0 released with .NET 2.0 and VS2005 (November 2005). Major new features: generics, anonymous methods, nullable types, and iterator blocks
  • C# 3.0 released with .NET 3.5 and VS2008 (November 2007). Major new features: lambda expressions, extension methods, expression trees, anonymous types, implicit typing (var), and query expressions
  • C# 4.0 released with .NET 4 and VS2010 (April 2010). Major new features: late binding (dynamic), delegate and interface generic variance, more COM support, named arguments, tuple data type and optional parameters
  • C# 5.0 released with .NET 4.5 and VS2012 (August 2012). Major features: async programming, and caller info attributes. Breaking change: loop variable closure.
  • C# 6.0 released with .NET 4.6 and VS2015 (July 2015). Implemented by Roslyn. Features: initializers for automatically implemented properties, using directives to import static members, exception filters, element initializers, await in catch and finally, extension Add methods in collection initializers.
  • C# 7.0 released with .NET 4.7 and VS2017 (March 2017). Major new features: tuples, ref locals and ref return, pattern matching (including pattern-based switch statements), inline out parameter declarations, local functions, binary literals, digit separators, and arbitrary async returns.
  • C# 7.1 released with VS2017 v15.3 (August 2017). New features: async main, tuple member name inference, default expression, and pattern matching with generics.
  • C# 7.2 released with VS2017 v15.5 (November 2017). New features: private protected access modifier, Span<T>, aka interior pointer, aka stackonly struct, and everything else.
  • C# 7.3 released with VS2017 v15.7 (May 2018). New features: enum, delegate and unmanaged generic type constraints. ref reassignment. Unsafe improvements: stackalloc initialization, unpinned indexed fixed buffers, custom fixed statements. Improved overloading resolution. Expression variables in initializers and queries. == and != defined for tuples. Auto-properties' backing fields can now be targeted by attributes.
  • C# 8.0 released with .NET Core 3.0 and VS2019 v16.3 (September 2019). Major new features: nullable reference-types, asynchronous streams, indices and ranges, readonly members, using declarations, default interface methods, static local functions, and enhancement of interpolated verbatim strings.
  • C# 9.0 released with .NET 5.0 and VS2019 v16.8 (November 2020). Major new features: init-only properties, records, with-expressions, data classes, positional records, top-level programs, improved pattern matching (simple type patterns, relational patterns, logical patterns), improved target typing (target-type new expressions, target typed ?? and ?), and covariant returns. Minor features: relax ordering of ref and partial modifiers, parameter null checking, lambda discard parameters, native ints, attributes on local functions, function pointers, static lambdas, extension GetEnumerator, module initializers, and extending partial.

In response to the OP's question:

What are the correct version numbers for C#? What came out when? Why can't I find any answers about C# 3.5?

There is no such thing as C# 3.5 - the cause of confusion here is that the C# 3.0 is present in .NET 3.5. The language and framework are versioned independently, however - as is the CLR, which is at version 2.0 for .NET 2.0 through 3.5, .NET 4 introducing CLR 4.0, service packs notwithstanding. The CLR in .NET 4.5 has various improvements, but the versioning is unclear: in some places it may be referred to as CLR 4.5 (this MSDN page used to refer to it that way, for example), but the Environment.Version property still reports 4.0.xxx.

As of May 3, 2017, the C# Language Team created a history of C# versions and features on their GitHub repository: Features Added in C# Language Versions. There is also a page that tracks upcoming and recently implemented language features.

C# – How to enable assembly bind failure logging (Fusion) in .NET

Add the following values to

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Fusion
Add:
DWORD ForceLog set value to 1
DWORD LogFailures set value to 1
DWORD LogResourceBinds set value to 1
DWORD EnableLog set value to 1
String LogPath set value to folder for logs (e.g. C:\FusionLog\)

Make sure you include the backslash after the folder name and that the Folder exists.

You need to restart the program that you're running to force it to read those registry settings.

By the way, don't forget to turn off fusion logging when not needed.

enter image description here

Related Question
  • C# – Path.Combine for URLs
  • C# – How to remedy “The breakpoint will not currently be hit. No symbols have been loaded for this document.” warning
  • C# – Programmatically automating a web login
  • C# – Deserialize JSON into C# dynamic object
  • Google-chrome – Disabling Chrome cache for website development
  • C# – reason for C#’s reuse of the variable in a foreach
  • C# – Logging Into A Website Using C# Programmatically