Preventing Google & co. from crawling/indexing the admin console (port)

Hi,

I just noticed that Google & co. crawl and index my admin console on ports 9090 and 9091. Is there a way to stop Google from indexing it? Adding the meta tag <meta name="robots" content="noindex, nofollow"> to index.jsp would do it, but I can’t locate that file, let alone edit it. So maybe you guys could add this in the next release? robots.txt doesn’t seem to be an option either, because “Disallow: www.domain.com:9090” doesn’t work.

Any suggestions on how I can deal with this?

Thanks!

Hi,

Disallow: /index.jsp

Disallow: /login.jsp

in robots.txt should be fine and block access to these pages.
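If you want to check what these two rules cover before deploying them, a small local test with Python's standard urllib.robotparser might help. This is only a sketch: the "User-agent: *" line, the host example.com and the page /other.jsp are assumptions made up for the test, and real crawlers may match rules slightly differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content. The "User-agent: *" line is an addition:
# the two Disallow rules need a group header to apply to any crawler.
ROBOTS_TXT = """\
User-agent: *
Disallow: /index.jsp
Disallow: /login.jsp
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text locally, without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, url: str, agent: str = "*") -> bool:
    """Return True if the given user agent may fetch the URL."""
    return parser.can_fetch(agent, url)


parser = build_parser(ROBOTS_TXT)
# example.com and /other.jsp are placeholders, not real Openfire pages.
for path in ("/index.jsp", "/login.jsp", "/other.jsp"):
    url = "http://example.com:9090" + path
    print(path, "allowed" if is_allowed(parser, url) else "blocked")
```

With this parser the rules are matched as path prefixes, so only /index.jsp and /login.jsp are reported as blocked and other .jsp pages stay crawlable.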

LG

Hm, OK, but what if all my pages end in .jsp? Disallow: /index.jsp would then block the whole domain.

Besides that, are you really sure that robots.txt works for ports other than 80?

Hi,

One may want to test this. Anyhow, I wonder who links to your admin console. Google usually does not index pages that nothing links to.

Maybe the robots.txt file is automatically bound to the port where it is read from, so http://example.com:12345/robots.txt may be applied only to content on example.com:12345.
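To illustrate the idea: as far as I know, crawlers look up robots.txt separately for each combination of scheme, host and port, so the file that matters for a page is the one at the root of the same origin. The sketch below only derives that location from a page URL; the function name robots_url and the example hosts are made up for illustration.

```python
from urllib.parse import urlsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL that would govern the given page.

    Assumption: rules are scoped to scheme + host + port, so the file
    is looked up at the root of the same origin as the page.
    """
    parts = urlsplit(page_url)
    # netloc keeps an explicit port such as ":9090" if one is present.
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


# Hypothetical hosts, used only to show the port-specific lookup.
print(robots_url("http://example.com:9090/index.jsp"))
print(robots_url("http://example.com/index.html"))
```

Under that assumption, a robots.txt served on port 80 says nothing about port 9090. The admin console port would have to serve its own /robots.txt.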

In any case I would block access to the admin console. If access to plugins is needed, one should use a reverse proxy in front of Openfire. Such a reverse proxy also writes an access.log file, which is quite useful for internet applications.

LG

But Google did index my admin console, and also some pages from plugins…

I also think that robots.txt is automatically bound to the port it is read from, so Disallow: www.domain.com:9090 won’t have any effect.

Access to the admin console and some plugins is needed, but I don’t want them indexed.

How would Google even find your server on port 909x? They crawl on port 80. Unless there is a website out there on port 80 (Fastpath, Sparkweb, or some custom page) with a link to your server’s admin ports, I don’t see how this would have started. You can contact Google and request that it be stopped.

I don’t know! There is no website out there that links to my admin console, and I never set a link to it myself. Google indexed the page anyway. Of course I asked Google to remove the page from the index (which takes time), but that is only a temporary fix, because it doesn’t tell Google not to index the page again. That’s why I am asking.