Get a list of URLs from a site

web-crawler

I'm deploying a replacement site for a client but they don't want all their old pages to end in 404s. Keeping the old URL structure wasn't possible because it was hideous.

So I'm writing a 404 handler that should look for an old page being requested and do a permanent redirect to the new page. Problem is, I need a list of all the old page URLs.
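For illustration, the lookup step of such a handler could be sketched like this. The `REDIRECTS` table, its example paths, and the `normalize` and `resolve` helpers are all hypothetical names, and wiring the result into the actual 404 handler depends on whatever framework the new site uses.

```python
# Hypothetical old-path -> new-path table; real entries would come from
# the list of old URLs once it has been collected.
REDIRECTS = {
    "/old/about-us.php": "/about",
    "/old/products.php": "/products",
}


def normalize(path):
    """Drop the query string and any trailing slash so lookups are consistent."""
    clean = path.partition("?")[0]
    if len(clean) > 1:
        clean = clean.rstrip("/")
    return clean or "/"


def resolve(path, redirects=REDIRECTS):
    """Return (status, location) for a request that missed on the new site."""
    target = redirects.get(normalize(path))
    if target is not None:
        return 301, target  # permanent redirect to the new page
    return 404, None  # genuinely unknown page


print(resolve("/old/about-us.php"))
print(resolve("/never-existed"))
```

Ignoring the query string is an assumption made here for simplicity; if the old site distinguished pages by query parameters, the full string would have to be part of the key.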

I could do this manually, but I'd be interested to know whether any apps can produce a list of relative URLs (e.g. /page/path, not http:/…/page/path) given just the home page. Something like a spider, but one that doesn't care about the content other than to find deeper pages.
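To make the idea concrete, here is a rough sketch of that kind of spider using only the Python standard library. It is not a specific existing app: `LinkParser`, `fetch_html`, `crawl`, and the `limit` parameter are names made up for this example. It follows only `<a href>` links on the starting host, so pages reachable solely through JavaScript or forms would be missed.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlsplit
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Download a page; undecodable bytes are replaced rather than raised."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def relative(url):
    """Reduce an absolute URL to its path plus query string."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return path + "?" + parts.query if parts.query else path


def crawl(start_url, fetch=fetch_html, limit=1000):
    """Breadth-first crawl of one host, returning sorted relative URLs."""
    host = urlsplit(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    found = set()
    while queue and len(found) < limit:
        url = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to load
        found.add(relative(url))
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            absolute, _fragment = urldefrag(urljoin(url, href))
            target = urlsplit(absolute)
            same_site = target.scheme in ("http", "https") and target.netloc == host
            if same_site and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return sorted(found)
```

Calling `crawl("http://example.com/")` (a placeholder address) would return a sorted list of paths. The `fetch` argument exists so the crawl logic can be tried against canned pages without touching the network.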

Best Solution

I didn't mean to answer my own question, but I just thought of running a sitemap generator. The first one I found, http://www.xml-sitemaps.com, has a nice text output. Perfect for my needs.
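If the text output is a plain list of absolute URLs, one per line (an assumption about the file format, not something checked against the tool), a few lines of Python would reduce it to the relative paths the redirect handler needs. The `to_relative` name and the `urllist.txt` filename in the usage note are hypothetical.

```python
from urllib.parse import urlsplit


def to_relative(lines):
    """Convert absolute URLs (one per line) to path-plus-query strings."""
    paths = []
    for line in lines:
        url = line.strip()
        if not url:
            continue  # ignore blank lines
        parts = urlsplit(url)
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        paths.append(path)
    return paths
```

With the list saved as, say, `urllist.txt`, this could be run as `to_relative(open("urllist.txt"))`.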