File Links Creating Duplicate Content? (possible SEO issue)

Hi

I just ran Xenu Link Sleuth over a site and noticed that everywhere I link to a PDF file, the URL is slightly different, even though all the links point to the same file.

e.g.
mysite.com/index.php/download_file/view/1/42/
mysite.com/index.php/download_file/view/1/44/
mysite.com/index.php/download_file/view/1/46/
mysite.com/index.php/download_file/view/1/48/

All these URLs point to the same file, but I noticed the value at the end matches the c5 ID of the page the link appears on.

I'm a little worried this could cause duplicate content (SEO) issues with Google.

Does anyone know how to disable this so that the link is just
mysite.com/index.php/download_file/view/1/
(that is, without the page ID)?

Otherwise, would you know if this could cause a dupe content issue?

Cheers!

malkau
 
johnpaulb replied:
Hello malkau,

Have you tried clearing the page cache? Maybe the download links were generated that way and are just cached; clearing the cache should stop the old links from being used:
1. Login to your Concrete5 Dashboard.

2. Roll your mouse over the Dashboard button and click the System & Settings option. This will bring up the System & Settings menu.

3. Under the Optimization section, select Clear Cache to bring up the Clear Cache Menu.

4. Click the Clear Cache button to clear your cache. Once it completes, you will see a notification stating "Cached files removed."

You can also set the cache to clear periodically in the Cache & Speed settings, for example twice a day (every 720 minutes). Here is a link to an article I did on cache and speed settings:
http://www.webhostinghub.com/support/edu/concrete5/get-started/cach...

I hope this is helpful,
John-Paul
malkau replied:
Hi John-Paul

I tried clearing the cache but that didn't change the file link URLs.

I'm guessing the URLs work like this for tracking purposes... but as I said, I'm just worried that Google will see this as the same PDF uploaded to the site 10 times or something, when it's really one file with the page ID appended to the end of the URL.

Cheers
johnpaulb replied:
Hi again malkau,

If your main concern is Google finding duplicate content, you can use a robots.txt file to limit which pages are crawled. Here is a simple guide on using a robots.txt file:
http://www.webhostinghub.com/support/website/how-tos/using-robotstx...

Also, here is a link to Google's official webmaster tools guide on how to "Block or remove pages using a robots.txt file":
https://support.google.com/webmasters/answer/156449...
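
For example, a minimal robots.txt in the site root along these lines should keep crawlers away from those download URLs. This is just a sketch based on the /index.php/download_file/view/ pattern you posted; if you have pretty URLs enabled, you would drop the /index.php part:

# Keep crawlers out of the per-page file download URLs
User-agent: *
Disallow: /index.php/download_file/

Blocking the whole download_file path means none of the per-page variations get crawled, so they shouldn't be counted as duplicates.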

I hope this is helpful,
John-Paul