How to Noindex OneSignal WordPress Plugin Directory?


(Serdar) #1

Hi,

I’m using OneSignal on my website, and Google is showing indexed OneSignal URLs from my site.

How can I remove them from Google?


(Serdar) #2

Should I add the following line to robots.txt?

Disallow: /wp-content/plugins/onesignal-free-web-push-notifications


#3

Your issue is similar to this thread! Check it out!


(Ultra Noob) #4

Hi dear @Turk

According to the Google Webmaster Guidelines, the first basic rule for noindexing any URL is:

  • Never block it via robots.txt

Why? Because robots.txt prevents crawling. If Googlebot cannot crawl a URL, it cannot see the HTTP header or parse the HTML to find the noindex directive added via a meta tag.

Disallow: / 

is not equal to

<meta name="robots" content="noindex">

or

<meta name="googlebot" content="noindex">

Or HTTP header

X-Robots-Tag: noindex

Or explicitly

HTTP/1.1 200 OK
Date: Tue, 25 May 2010 21:42:43 GMT
(…)
X-Robots-Tag: googlebot: nofollow
X-Robots-Tag: otherbot: noindex, nofollow
(…)

Why?

Because the noindex directive explicitly tells bots not to index a specific URL.

Unfortunately, misinformation circulating on the Internet has led people to believe that robots.txt prevents indexing.

I see webmasters simply block a URL using the Disallow directive in robots.txt, which only blocks crawling. The URL can still be visible in the SERPs.

Solution
To noindex any URL:

  • First, don’t block that URL via robots.txt
  • Second, use the noindex directive
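
To make the difference concrete, here is a minimal Python sketch (not part of the original reply) using the standard-library urllib.robotparser. It assumes a hypothetical site at example.com carrying the Disallow rule proposed earlier in the thread. It only shows that a compliant crawler is not allowed to fetch the URL at all, which is why it never gets to see an X-Robots-Tag header or a meta tag on it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks the OneSignal plugin directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-content/plugins/onesignal-free-web-push-notifications
"""

# example.com is a placeholder host, not a site from this thread.
PLUGIN_URL = (
    "https://example.com/wp-content/plugins/"
    "onesignal-free-web-push-notifications/readme.txt"
)


def crawler_may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt allows user_agent to crawl url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    blocked = not crawler_may_fetch(ROBOTS_TXT, "Googlebot", PLUGIN_URL)
    # A blocked crawler never requests the URL, so any noindex signal on it goes unseen.
    print("crawl blocked:", blocked)
```

Removing the Disallow line makes crawler_may_fetch return True for the same URL, and only then can a noindex header on that URL be read.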

Notable Reference (which most people HATE reading)


(Ultra Noob) #5
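// Send a noindex header when the requested path is inside the OneSignal plugin directory.
$request_uri = isset( $_SERVER['REQUEST_URI'] ) ? $_SERVER['REQUEST_URI'] : '';
if ( strpos( $request_uri, '/onesignal-free-web-push-notifications/' ) !== false ) {
	// "follow" still lets crawlers follow links; "noindex" keeps the URL out of the index.
	header( 'X-Robots-Tag: noindex, follow', true );
}

To noindex the OneSignal path, add the code above to your theme’s functions.php.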


#6

Will the above code work in my case as well? My blog runs on a LEMP stack on a Vultr Cloud VPS, with no Apache server installed (so no .htaccess file).


(Ultra Noob) #7

Assuming you are using WordPress and a theme that has a functions.php file, you just need to add that code at the end of that file. Alternatively, you may add it via a code snippet plugin.


(Serdar) #8

Thank you bro :slight_smile:
I have added your code to functions.php of my child theme.

And my robots.txt now looks like this, following your recommendations:

User-agent: *
Disallow: /?s=*
Disallow: /wp-admin/
Disallow: /xmlrpc.php
Allow: /wp-admin/admin-ajax.php
Allow: /wp-admin/images/
Allow: /wp-admin/css/
Allow: /wp-admin/js/

Sitemap: https://kriptobyte.com/sitemap_index.xml

(Ultra Noob) #9

You’re welcome!

Everything is good but

Disallow: /xmlrpc.php

This line should not be there. I don’t think I ever recommended it.


(Serdar) #10

You’re right! sorry,

I added the Disallow: /xmlrpc.php line following Perishable Press.

But I can remove that :slight_smile:


(Ultra Noob) #11

Once the functions.php code has been added, please make sure to verify the HTTP response headers. They should include this line:

x-robots-tag => noindex, follow

Ref: screenshot (an example) from when I tested the path https://www.gulshankumar.net/wp-content/plugins/onesignal-free-web-push-notifications/readme.txt
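
If you prefer to check this from a script instead of a screenshot, the following is a rough Python sketch using only the standard library. The helper names has_noindex and fetch_headers are made up for illustration, and the check is a simplification: it ignores which bot a prefixed value such as "otherbot: noindex" is addressed to and only looks for the noindex token.

```python
from urllib.request import Request, urlopen


def has_noindex(headers) -> bool:
    """Return True if any X-Robots-Tag value among (name, value) pairs has noindex."""
    for name, value in headers:
        if name.lower() != "x-robots-tag":
            continue
        # Treat a bot prefix ("otherbot: noindex") like one more comma-separated token.
        tokens = [token.strip().lower() for token in value.replace(":", ",").split(",")]
        if "noindex" in tokens:
            return True
    return False


def fetch_headers(url: str):
    """Send a HEAD request and return the response headers as (name, value) pairs."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=10) as response:
        return response.headers.items()


if __name__ == "__main__":
    # Sample headers instead of a live request, so the sketch runs offline.
    sample = [("content-type", "text/plain"), ("x-robots-tag", "noindex, follow")]
    print("noindex present:", has_noindex(sample))
```

For a live check you would pass fetch_headers(your_url) to has_noindex. Note that urlopen raises an HTTPError for responses such as 403, so a blocked file will fail before the headers can be inspected.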


(Ultra Noob) #12

What does it mean?
The next time Googlebot crawls a path related to OneSignal, it will see the noindex directive and drop that URL from the index. It may take some time.

So, this is the proper way to do it. Any questions? Please feel free to ask. Thanks


(Serdar) #13

I also redirected wp-content pages to 403

Is that a bad idea?


(Ultra Noob) #14

403 is a vague kind of HTTP response for Googlebot (Ref: 1). Until this issue is resolved, you should avoid it.

  1. Tweet by the Google representative, webmasters trend analyst

(Serdar) #15

Thank you so much :slight_smile:

I have now realized that readme.txt files cannot be opened because of the following code:

# SECURE LOOSE FILES 
<IfModule mod_alias.c>
	RedirectMatch 403 (?i)(^#.*#|~)$
	RedirectMatch 403 (?i)/readme\.(html|txt)
	RedirectMatch 403 (?i)\.(ds_store|well-known)
	RedirectMatch 403 (?i)/wp-config-sample\.php
	RedirectMatch 403 (?i)\.(7z|bak|bz2|com|conf|dist|fla|git|inc|ini|log|old|psd|rar|tar|tgz|save|sh|sql|svn|swo|swp)$
</IfModule>

(Ultra Noob) #16

Yes, that is true. You can try checking some other file that exists in the OneSignal directory.

Here’s one I just found. It is a plain CSS file, so it should be helpful for debugging.

/wp-content/plugins/onesignal-free-web-push-notifications/views/css/icons.css
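
As a quick sanity check on why readme.txt returns 403 while this CSS file should not, here is a small Python sketch that runs the five RedirectMatch patterns quoted above against both paths. It assumes Python’s re module treats these particular patterns the same way Apache’s regex engine does, and that the patterns are matched anywhere in the URL path. The function name returns_403 is made up for illustration.

```python
import re

# The five RedirectMatch patterns from the .htaccess block, copied as-is.
PATTERNS = [
    r"(?i)(^#.*#|~)$",
    r"(?i)/readme\.(html|txt)",
    r"(?i)\.(ds_store|well-known)",
    r"(?i)/wp-config-sample\.php",
    r"(?i)\.(7z|bak|bz2|com|conf|dist|fla|git|inc|ini|log|old|psd|rar|tar|tgz|save|sh|sql|svn|swo|swp)$",
]

PLUGIN_DIR = "/wp-content/plugins/onesignal-free-web-push-notifications"


def returns_403(path: str) -> bool:
    """Return True if any of the patterns matches somewhere in the URL path."""
    return any(re.search(pattern, path) for pattern in PATTERNS)


if __name__ == "__main__":
    for path in (PLUGIN_DIR + "/readme.txt", PLUGIN_DIR + "/views/css/icons.css"):
        print(path, "->", "403" if returns_403(path) else "served normally")
```

Under those assumptions, only the readme path is caught by the second pattern, which is why the CSS file is the better candidate for testing the header.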


(Serdar) #17

Although I added the code to the functions.php of my child theme, the header looks like this:

(screenshot)

I don’t know why :open_mouth:

https://kriptobyte.com/wp-content/plugins/onesignal-free-web-push-notifications/views/css/icons.css


(Ultra Noob) #18

Yes, I see. Can you please try adding the same snippet using the plugin below? Make sure to save and activate it.


(Amit Tiwari) #19

Why should search results be blocked in robots.txt?

  1. Disallow: /*?s=
  2. Disallow: /?s=*

What is the difference between 1 and 2?

(Ultra Noob) #20

Rule 1 (Disallow: /*?s=) will block URLs of this type:

http://example.com/anything_comes_here?s=

That doesn’t match the WordPress permalink structure. As far as I know, WordPress doesn’t use this kind of permalink with a query string.

Rule 2 (Disallow: /?s=*) is supposed to block crawling of the search results page. For example, this one.
https://www.gulshankumar.net/?s=WordPress
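
To compare the two rules side by side, here is a minimal Python sketch of wildcard matching as I understand Google documents it for robots.txt: a rule matches from the start of the path, * stands for any run of characters, and a trailing $ anchors the end of the URL. The function rule_matches is a made-up helper, not a real crawler’s implementation.

```python
import re


def rule_matches(rule_path: str, url_path: str) -> bool:
    """Return True if a Disallow/Allow path pattern matches a URL path plus query.

    Assumed semantics: prefix match, '*' matches any characters,
    and a trailing '$' requires the URL to end there.
    """
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape the literal pieces and join them with ".*" where the rule had "*".
    regex = ".*".join(re.escape(part) for part in rule_path.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, url_path) is not None


if __name__ == "__main__":
    for rule in ("/*?s=", "/?s=*"):
        for path in ("/?s=WordPress", "/anything_comes_here?s="):
            print(f"{rule!r} vs {path!r}: {rule_matches(rule, path)}")
```

Under these assumptions, rule 1 covers everything rule 2 covers, plus ?s= appearing after any other path, while rule 2 only covers search URLs at the site root. The trailing * in rule 2 adds nothing, since rules are already prefix matches.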