I wonder who tuned the search function to return only a few results.
One example is the search for “iptables”, which returns just one result.
A Google search for “iptables +site:.igniterealtime.org” returns about 250 entries. These also include some PDF documents, as Igniterealtime only blocks bot access to /fisheye.
It is worrying that the search index gets broken within CS; hopefully this affects only the 3.0.0 pre-release version.
Regarding search engines, I see some more issues, but the robots.txt file should be tuned carefully to make sure that no wanted results are dropped.
There are quite a few pages (Home, Overview, All, Discussions) which link to the same thread - but they use different target URLs. This may result in multiple (Google) search results for one thread. I wonder if it makes sense to also add
Disallow: tstart=
Disallow: /community/message/
to make sure that only “/community/thread/12345” will be found. Modifying Clearspace to return a unique link for each thread (similar to /docs/DOC-1234) would be a much better solution.
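Before changing the live robots.txt, the effect of such a rule could be checked offline. The following is only a rough sketch using Python's standard urllib.robotparser; the host name and the message ID are made-up examples, and the rule set is my assumption of what the file would look like (the existing /fisheye block plus the proposed /community/message/ line).

```python
from urllib.robotparser import RobotFileParser

# Assumed rule set: the existing /fisheye block plus the proposed
# /community/message/ rule. This is not the site's real robots.txt.
RULES = """\
User-agent: *
Disallow: /fisheye
Disallow: /community/message/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def is_crawlable(url: str) -> bool:
    """Return True if a generic crawler may fetch the URL under RULES."""
    return parser.can_fetch("*", url)


# Hypothetical URLs, used only to illustrate the two link styles.
thread_url = "http://www.igniterealtime.org/community/thread/12345"
message_url = "http://www.igniterealtime.org/community/message/67890"

print(thread_url, is_crawlable(thread_url))
print(message_url, is_crawlable(message_url))
```

Note that this parser only does plain prefix matching on the path, so it cannot tell whether a rule aimed at a query parameter such as “tstart=” would work. Some crawlers accept wildcard patterns for that, but this should be verified against the crawler's own documentation before relying on it.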
So I did another Google search for `“Firefox is crashing when doing Reply” +site:.igniterealtime.org` which returns 76 results. So one may also want to drop “jsessionid=”. I really wonder how Google gets the ID; it seems that every time it scans for content it gets a new one.
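To illustrate why these variants pile up, here is a small sketch that reduces different URLs of the same thread to one canonical form by removing the session ID and the tstart parameter. The URL shapes (a “;jsessionid=...” path suffix and a “tstart” query parameter) and the host name are assumptions on my side, not taken from the server configuration.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed noise: a ";jsessionid=..." path parameter and a "tstart" query
# parameter. Both names come from the observed URLs, the exact format is a guess.
SESSION_ID = re.compile(r";jsessionid=[^/?#;]*", re.IGNORECASE)
DROPPED_PARAMS = {"tstart"}


def canonicalize(url: str) -> str:
    """Strip the session ID and paging parameter so duplicates compare equal."""
    parts = urlsplit(url)
    path = SESSION_ID.sub("", parts.path)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key not in DROPPED_PARAMS]
    # The fragment is dropped as well, since crawlers ignore it anyway.
    return urlunsplit((parts.scheme, parts.netloc, path, urlencode(kept), ""))


# Hypothetical variants of one thread link.
variants = [
    "http://www.igniterealtime.org/community/thread/12345",
    "http://www.igniterealtime.org/community/thread/12345?tstart=0",
    "http://www.igniterealtime.org/community/thread/12345;jsessionid=ABC123?tstart=0",
]

unique = {canonicalize(url) for url in variants}
print(unique)
```

If every crawl really receives a fresh session ID, each visit produces a new URL for the same page, which would explain the large number of hits for a single thread.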