Customizing robots.txt for WordPress | Workion.ru

Some newbies who create WordPress sites believe that once the engine is installed, they can start filling the site with content. In fact, there are many subtleties to think through before search robots begin to index the site.

The robots.txt file contains rules that keep search engines away from unnecessary information on the site; its presence and proper configuration are a must.

WordPress provides a robots.txt file by default, so all you have to do is configure it.


The robots.txt installed by default already contains some data, for example the User-agent line. This line specifies which search robot the rules that follow it apply to. The * symbol means the settings are common to all search engine robots.
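As an illustration, here is a minimal block (a sketch, not a finished config) that applies a single rule to every robot:

# all robots: keep out of the admin area
User-agent: *
Disallow: /wp-admin/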

If necessary, you can put the name of a specific search engine robot in the User-agent line and thereby set individual parameters for it. Here are the names of the main search robots:

Google - Googlebot
Yandex - Yandex
Bing - Bingbot
Mail.ru - Mail.Ru

Each search engine also has individual robots responsible for specific types of content. You can find the names of each of them online, but they are used quite rarely. For example, here are several Yandex robots:

YandexImages - indexes images
YandexNews - indexes news
YandexMedia - indexes multimedia data
YandexDirect - the Yandex advertising network robot
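For instance, if you wanted to keep only Yandex's image robot away from the entire site, a block like this would do it (an illustrative sketch):

# applies only to the YandexImages robot
User-agent: YandexImages
Disallow: /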

Other important parts of the robots.txt file are the Disallow, Host, and Sitemap directives.

Disallow - this directive lets you hide parts of the site's information from search robots. By default, your robots.txt should already close the following directories from indexing:

Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-trackback
Disallow: /wp-feed
Disallow: /wp-comments
Disallow: /wp-content/plugins
Disallow: /wp-content/themes
Disallow: /wp-login.php
Disallow: /wp-register.php

This is not the complete list of closed directories; it also covers various plugin pages, the cache, the administrative panel, and other service directories.

What should you close from search robots?

Non-unique content and duplicate pages. As practice shows, many CMS users run into a problem with duplicates, and the easiest fix is to hide them, as the sketch below shows.
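On WordPress, typical duplicate sources are tag archives and RSS feeds; a sketch of rules closing them (adjust to your own permalink structure):

# close common WordPress duplicates
Disallow: /tag
Disallow: */feed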

Host - this directive specifies the main address (mirror) of the site. Even if your site has one domain, it can be reached at two addresses, for example www.workion.ru and plain workion.ru. When buying links, you must use one form of your site's URL, and in robots.txt you specify which one is the main one.
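For example, to declare the non-www version the main mirror (note that Host is a directive recognized by Yandex), you would write:

Host: workion.ru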

Sitemap - this line indicates the path to the site map. Creating a sitemap on WordPress is not difficult; there are special plugins for this. It is needed so that search engines can quickly find new material for indexing.

Questions about configuring robots.txt

My regular readers literally shower me with questions about this file. To avoid writing the same thing many times over, I have compiled a list of popular questions and answered them:

  1. How to prevent page indexing?

To prohibit indexing of a single page, use the Disallow directive with the page's path relative to the site root (robots.txt does not accept full URLs); here's an example:

Disallow: /shop/22

  2. How to prohibit site indexing?

Disallow helps here as well: point it at the root of the site (you can apply the ban only to certain search engines using User-agent):

Disallow: /
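If you want to ban only one search engine, wrap the rule in its own User-agent block; for instance, a sketch that blocks only Google's robot:

# ban Google's crawler alone; others are unaffected
User-agent: Googlebot
Disallow: /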

  3. How to specify a Sitemap?

For search engines to find your site map correctly, use the Sitemap directive:

Sitemap: http://sait/sitemap.xml

  4. How to block broken links?

Broken links can appear when various plugins run. Rather than banning a component completely, identify the bad URLs and add them to robots.txt one at a time:

Disallow: /index.php?option=com_jreviews.Itemid=91

  5. How to disable indexing of a subdomain?

To close a subdomain, create a robots.txt in the root of that secondary site and put in it the same code as in the second question (a complete ban on indexing).

These are simple solutions to complex issues. Newbies are often interested in them, so the information should prove useful.

Correct robots.txt for WordPress: how to configure it?

Each site needs its own individual robots.txt. For the Workion.ru blog it looks like this:

User-agent: *
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /trackback
Disallow: */trackback
Disallow: */*/trackback
Disallow: */*/feed/*/
Disallow: */feed
Disallow: /*?*
Disallow: /tag

User-agent: Yandex
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /trackback
Disallow: */trackback
Disallow: */*/trackback
Disallow: */*/feed/*/
Disallow: */feed
Disallow: /*?*
Disallow: /tag
Host: your_site.ru

Sitemap: http://your_site.ru/sitemap.xml.gz
Sitemap: http://your_site.ru/sitemap.xml

If you have already created a site on WordPress and have never paid attention to robots.txt, we strongly recommend doing so now. So that even newbies have no trouble setting up this important file, let's look at what all these lines mean:

User-agent: * - indicates that the specified rules will be taken into account by all search engines. If you need to set rules for a specific search engine, use the format User-agent: Yandex.

Allow - the inverse of Disallow; it explicitly permits indexing (on WordPress you can usually do without it).
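Here is a sketch of how Allow can re-open a subdirectory inside a closed one (assuming, purely for illustration, that you want the uploads folder indexable):

User-agent: *
# close wp-content, but keep uploads open for indexing
Disallow: /wp-content/
Allow: /wp-content/uploads/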

Asterisk (*) - stands for any sequence of characters.
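For example, the rule below (already present in the config above) uses the asterisk to block any URL containing a question mark, i.e. any query string:

Disallow: /*?*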

The remaining directives have already been described in this article. Strictly speaking, you do not need to understand every line, because you can simply take the finished version above.

The official Yandex website has a detailed description of all the important points for setting up robots.txt.

After installing the code above, some readers ran into problems: it turned out their sites did not have human-readable URLs (pretty permalinks, known as CNC in Russian SEO jargon) configured. Without pretty permalinks, every WordPress page address contains a question mark, so the Disallow: /*?* rule would block the entire site. If there are no human-readable addresses on your resource, use the following variant in robots.txt (it omits that rule):

User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /trackback
Disallow: */trackback
Disallow: */*/trackback
Disallow: */*/feed/*/
Disallow: */feed
Disallow: /tag

User-agent: Yandex
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /trackback
Disallow: */trackback
Disallow: */*/trackback
Disallow: */*/feed/*/
Disallow: */feed
Disallow: /tag
Host: your_site.ru
Sitemap: http://your_site/sitemap.xml.gz
Sitemap: http://your_site/sitemap.xml

Check this file every time you change it; search engines provide special tools for this.

If you do not want to configure this file manually, you can use a plugin such as All in One SEO Pack to manage robots.txt.

Search bots cannot determine on their own which directories of your site to visit and what exactly to index.

They need help with this, and a properly configured robots.txt provides it. Make this file ideal for your resource; it is one of the important points of optimization.

