A search engine web crawler keeps requesting resources from our website in order to read its content. This traffic adds to server load on top of what actual regular users generate, so the crawler's continual requests can put noticeable extra strain on the server.
We can manage and balance this traffic with good use of robots.txt, sitemaps, and bandwidth throttling. Together these mechanisms channel crawler traffic and keep search engines informed about our site's resources, so they index the right information without hammering the server.
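As a sketch of how robots.txt can carry both the throttling hint and the sitemap location: the Crawl-delay directive below is non-standard (Bing and Yandex honor it, Googlebot ignores it), and the URLs are hypothetical examples.

# robots.txt served from the site root, e.g. https://www.example.com/robots.txt
User-agent: *
# Ask crawlers to pause ~10 seconds between requests (Bing/Yandex; ignored by Google)
Crawl-delay: 10
# Tell crawlers where the sitemap lives so they discover pages efficiently
Sitemap: https://www.example.com/sitemap.xml

The sitemap itself is an XML file in the standard sitemaps.org format; a minimal sketch with a placeholder URL:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2024-01-01</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>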
To prevent bots from reading unnecessary content, the following are good candidates to block:
- Images, JavaScript, CSS, and other layout files
- Registration pages
- Search result pages
- Downloadable files such as .zip, .avi, .docx, and .pptx
- Pages that require authentication and authorization
Disallow access to images:

User-agent: *
Disallow: /images/

Exclude Google Image Search from the whole site:

User-agent: Googlebot-Image
Disallow: /
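Extending the same pattern to the other content types in the list above, a fuller robots.txt might look like the sketch below. The paths are hypothetical examples to adapt to your site's layout, and the * and $ wildcards in Disallow rules are extensions honored by major crawlers such as Googlebot and Bingbot rather than part of the original robots.txt standard.

User-agent: *
# Layout assets
Disallow: /images/
Disallow: /js/
Disallow: /css/
# Registration and search result pages
Disallow: /register/
Disallow: /search/
# Downloadable files
Disallow: /*.zip$
Disallow: /*.avi$
Disallow: /*.docx$
Disallow: /*.pptx$

Note that robots.txt is only advisory: it keeps well-behaved crawlers out, but pages that need authentication and authorization must still be protected server-side, since the file itself is public and enforces nothing.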