The current Google Mini setup uses a customized stylesheet on the appliance. I am looking to apply a different style to another collection I am integrating into a site. How can I override the default and apply a custom style? Where do I place the custom stylesheet so that it can be referenced?
Can you view cached pages through the Crawl Diagnostics page?
I am not 100% sure I got your question right. I am assuming here that:
- The pages GSA fails to index are the ones that contain the "print this page" link (as opposed to those pages being indexed and the problem being the indexing of their printable versions).
- The following bit means that searching for other terms finds other pages, not that the missing pages turn up when you search for them with a different term:
I can search at /search/google_appliance/TERM and they do not show up. When I search for other terms, they do show up. In other words, I know that GSA is working
Please correct me if I misunderstood your question. Should I have got it wrong, please provide some more details about the terms you are using.
This is, however, what I would do to identify the source of the problem (although probably not in this precise order):
- I would try to understand what the distinctive elements of the "bad pages" are (if any) that trigger the odd behaviour. It seems that you have already done some of this digging and consider the print link to be the culprit. Have you verified this by removing the link altogether and seeing whether the pages then get indexed correctly?
- I would check if there is any rule in robots.txt that might interfere with the indexing. GSA honors that file, so for example if your pages' URLs begin with /admin/, those pages will be skipped.
- I would check if my pages have some kind of access control restricting their view. Should this be the case, I would check that GSA has been configured for that. (The same applies for unpublished pages of course, where you have to be admin to see or index them with an external application).
- I am not sure if GSA uses sitemap.xml to perform the indexing. However, I would inspect the Drupal-generated sitemap.xml file (if any) to check for blatant errors such as a priority set to 0. If you don't have such a file, and know that GSA uses it, I would try to generate one with the appropriate module and see if this solves the problem.
- I would inspect the sitemap generated by GSA to see if it shows any blatant anomaly too. This would clearly not be the cause of the problem, but any kind of self-explanatory anomaly could put you on the right track.
- If the problem is not specific to the page structure (see point #1 of this list), I would begin to systematically search for the non-structural element that generates the error. Does a different theme solve the problem? Does deactivating a given module solve the problem? (Maybe the problem is with meta-tags? Maybe with the "print this page" module? Maybe a module sets the language of those pages to a different language than the rest of the site?) All of these are rather unlikely possibilities, but before smashing the GSA with a sledgehammer I would try them too.
- I would go through (probably for the Nth time) all the settings of my GSA.
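For the robots.txt check in the list above, a quick script can tell you which of your URLs a crawler would be told to skip. This is only a sketch using Python's standard urllib.robotparser. The user-agent name "gsa-crawler", the sample rules and the sample paths are my assumptions, so substitute whatever your appliance actually sends and the contents of your real robots.txt.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_txt, paths, user_agent="gsa-crawler"):
    """Return the paths that the given robots.txt text disallows for user_agent.

    The default user_agent is an assumption; use the name your appliance
    is configured to send.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


# Hypothetical rules for illustration; paste in your site's real robots.txt.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin/
Disallow: /print/
"""

# Hypothetical paths: one ordinary page, one admin page, one printable version.
SAMPLE_PATHS = ["/node/12", "/admin/settings", "/print/12"]

if __name__ == "__main__":
    for path in blocked_paths(SAMPLE_ROBOTS, SAMPLE_PATHS):
        print("blocked by robots.txt:", path)
```

If one of the "bad pages" shows up in the output, the crawler is simply obeying a rule and the print link is probably a red herring.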
If I had the chance, I would do all of the above with a peer. He or she could help rule out the "human factor" as the source of the problem (i.e. that little checkbox in the configuration panel that to him/her is so paramount but that you never noticed before...).
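The sitemap.xml inspection mentioned in the list can also be scripted rather than eyeballed. Here is a rough sketch that flags every entry whose priority is 0, assuming the file follows the standard sitemaps.org 0.9 schema. The sample document and the example.com URLs are made up for illustration, and whether GSA reads this file at all is still the open question from above.

```python
import xml.etree.ElementTree as ET

# Namespace used by the standard sitemap protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def zero_priority_urls(sitemap_xml):
    """Return the <loc> of every <url> entry whose <priority> is 0."""
    root = ET.fromstring(sitemap_xml)
    flagged = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        priority = url.findtext("sm:priority", default=None, namespaces=NS)
        # Entries without a <priority> element are left alone.
        if priority is not None and float(priority) == 0.0:
            flagged.append(loc)
    return flagged


# Hypothetical sitemap for illustration; load your generated file instead.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/node/1</loc><priority>0.5</priority></url>
  <url><loc>http://example.com/node/2</loc><priority>0.0</priority></url>
  <url><loc>http://example.com/node/3</loc></url>
</urlset>"""

if __name__ == "__main__":
    for loc in zero_priority_urls(SAMPLE_SITEMAP):
        print("priority 0:", loc)
```

Comparing the flagged URLs against the pages that never show up in search results would tell you quickly whether the sitemap is worth pursuing.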
If you manage to find any more hints about what is going on, report them back here. If it is a problem on the Drupal side, I'm pretty sure that I or one of the other excellent "drupalists" hanging around on SO will be able to help.