Hi David,
I have a question that I can't seem to find an answer to; hopefully you can advise.
Your normal PT install includes a robots.txt file containing the following:
User-agent: *
Disallow: /categories.php
Disallow: /brands.php
Disallow: /reviews.php
Disallow: /category/
Disallow: /brand/
Disallow: /review/
Disallow: /admin/
Disallow: /search.php
Disallow: /jump.php
Assuming a normal recommended install of PT on WP, where price-tapestry has been installed at wp-root/pt, what would your revised robots.txt look like, and which unwanted files/folders would you remove from the pt folder?
Thanks.
Hi David,
thanks for your reply, although I am a little confused.
What about the files / folders in the pt directory (as asked)?
As I understand it, crawlers only take notice of the robots.txt file that is included at root, not from a sub-directory / folder.
Originally, files / folders were excluded from being crawled by your included robots.txt file, but now, as the /pt directory is not fully included at root, they will presumably create duplicate content between the two installs. Would it not be better to either remove some of the unwanted files / folders from /pt or use Disallow: /pt/ in the new WP robots.txt?
What is actually still needed in /pt for WP PTO to function correctly?
Thanks.
Hi,
You could do both:
Disallow: /pt/
(in which case no need for the disallow /pt/jump.php)
And you can cut the top level files in /pt/ to just:
/pt/config.php
/pt/config.advanced.php
/pt/jump.php
In addition, I would upload placeholder /pt/index.php:
<?php
?>
...and /pt/index.html:
<!-- -->
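Before deleting anything, it may help to list what would go. The following is only a sketch in Python: the function name and the example folder are hypothetical, the keep-list is the three files named above plus the two placeholders, and it only reports top-level files without touching sub-folders or deleting anything.

```python
from pathlib import Path

# Files to keep at the top level of /pt/, per the list above,
# plus the two placeholder files (index.php and index.html).
KEEP = {"config.php", "config.advanced.php", "jump.php", "index.php", "index.html"}


def removable_top_level_files(pt_dir):
    """Return the names of top-level files in pt_dir that are not in KEEP.

    Sub-folders are ignored, and nothing is deleted: this is a dry run.
    """
    return sorted(
        entry.name
        for entry in Path(pt_dir).iterdir()
        if entry.is_file() and entry.name not in KEEP
    )


if __name__ == "__main__":
    # "wp-root/pt" is an assumed location; adjust it to your own install.
    for name in removable_top_level_files("wp-root/pt"):
        print("candidate for removal:", name)
```

Review the output against your own install before removing anything by hand.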
Cheers,
David.
--
PriceTapestry.com
Hi David, back again!
Now that I have the correct robots.txt in place, it has at last managed to remove about 90% of the duplicate page titles I was seeing. However, I still seem to have quite a few left, all attributed to /merchant/.
Would you have any idea what could be causing them?
A couple of examples:
https://example.com/merchant/merchant-name
https://example.com/merchant/merchant-name/
Permalinks are set up with no trailing forward slash.
Hi,
First double check that there are no links being generated to the versions ending "/" - specifically, from the /shopping page go to Merchant A-Z and ensure that you have
https://example.com/merchant
...and then select a merchant, which should then be:
https://example.com/merchant/merchant-name
If that all looks OK, try adding to your WordPress .htaccess the following rule to make sure that any request to the old "/" versions is 301 (Moved Permanently) redirected to the new version:
RewriteRule ^merchant/(.*)/$ merchant/$1 [L,R=301]
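To see which requests that rule would catch, here is a rough Python imitation of the pattern match. It is a sketch only: the function name is hypothetical, it assumes the rule lives in an .htaccess at the site root (where the pattern is matched against the path without its leading slash), and it assumes the RewriteBase / of a stock WordPress .htaccess, so the relative target resolves to /merchant/...

```python
import re

# Same pattern as the RewriteRule above.
RULE = re.compile(r"^merchant/(.*)/$")


def redirect_target(request_path):
    """Return the path the rule would 301 to, or None if the rule does not match."""
    # In .htaccess context the leading slash is not part of the matched string.
    relative = request_path.lstrip("/")
    match = RULE.match(relative)
    if match is None:
        return None
    # Assumes RewriteBase /, so "merchant/$1" becomes "/merchant/$1".
    return "/merchant/" + match.group(1)


if __name__ == "__main__":
    for path in ["/merchant/merchant-name/", "/merchant/merchant-name", "/merchant/"]:
        print(path, "->", redirect_target(path))
```

Only the version ending "/" produces a redirect target; the other two requests are left alone. The rule would most likely need to sit above the WordPress block in .htaccess so that it runs before the WordPress catch-all.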
Hope this helps!
Cheers,
David.
The equivalent would be:
Disallow: /productcategory
Disallow: /brand
Disallow: /review
Disallow: /shopping?pto_q=
Disallow: /pt/jump.php
(if your site is review focussed you might want to permit the /review path, the reason for the exclusion is to avoid duplicate content with the /product path...)
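If you want to check what those rules block before uploading the file, Python's standard urllib.robotparser can be pointed at them. This is a sketch under assumptions: the User-agent: * line is carried over from the original robots.txt, the helper name is hypothetical, and the example paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rules listed above, under the "User-agent: *" line from the original file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /productcategory
Disallow: /brand
Disallow: /review
Disallow: /shopping?pto_q=
Disallow: /pt/jump.php
"""


def is_allowed(path, robots_txt=ROBOTS_TXT, agent="*"):
    """Return True if a crawler identifying as agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, "https://example.com" + path)


if __name__ == "__main__":
    # Example paths are hypothetical.
    for path in ["/brand/acme", "/review/widget", "/pt/jump.php", "/product/widget"]:
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Matching is by prefix, so Disallow: /brand also covers anything beneath /brand/. Individual crawlers may interpret rules slightly differently from this parser, so treat the result as a sanity check rather than a guarantee.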
Cheers,
David.