Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: Setup check for X-Robot-Tag expects exact match, resulting in a warning #37409

Open
6 of 9 tasks
stavros-k opened this issue Mar 26, 2023 · 18 comments
Open
6 of 9 tasks
Labels
0. Needs triage Pending check for reproducibility or if it fits our roadmap enhancement feature: settings needs review Needs review to determine if still applicable privacy

Comments

@stavros-k
Copy link

stavros-k commented Mar 26, 2023

⚠️ This issue respects the following points: ⚠️

  • This is a bug, not a question or a configuration/webserver/proxy issue.
  • This issue is not already reported on Github (I've searched it).
  • Nextcloud Server is up to date. See Maintenance and Release Schedule for supported versions.
  • Nextcloud Server is running on 64bit capable CPU, PHP and OS.
  • I agree to follow Nextcloud's Code of Conduct.

Bug description

I have set the following tags using treafik (which I can see in the response headers)

x-robots-tag: none,noarchive,nosnippet,notranslate,noimageindex,nofollow,noindex

Yet nextcloud still shows

The "X-Robots-Tag" HTTP header is not set to "noindex, nofollow". This is a potential security or privacy risk, as it is recommended to adjust this setting accordingly.

Steps to reproduce

  1. Set required X-Robot-Tag plus some extra.
  2. Go to Overview
  3. See the warning

Expected behavior

Warning to only show when headers are lesser than the recommended

Installation method

Community Docker image

Operating system

Other

PHP engine version

PHP 8.1

Web server

Nginx

Database engine version

PostgreSQL

Is this bug present after an update or on a fresh install?

Updated to a major version (ex. 22.2.3 to 23.0.1)

Are you using the Nextcloud Server Encryption module?

Encryption is Disabled

What user-backends are you using?

  • Default user-backend (database)
  • LDAP/ Active Directory
  • SSO - SAML
  • Other

Configuration report

root@a0e913b615b7:/# occ config:list system
{
    "system": {
        "memcache.local": "\\OC\\Memcache\\APCu",
        "memcache.distributed": "\\OC\\Memcache\\Redis",
        "memcache.locking": "\\OC\\Memcache\\Redis",
        "redis": {
            "host": "***REMOVED SENSITIVE VALUE***",
            "port": 6379,
            "password": "***REMOVED SENSITIVE VALUE***"
        },
        "datadirectory": "***REMOVED SENSITIVE VALUE***",
        "instanceid": "***REMOVED SENSITIVE VALUE***",
        "passwordsalt": "***REMOVED SENSITIVE VALUE***",
        "secret": "***REMOVED SENSITIVE VALUE***",
        "trusted_domains": [
            "10.10.10.254",
            "cloud.DOMAIN.TLD"
        ],
        "trusted_proxies": "***REMOVED SENSITIVE VALUE***",
        "overwrite.cli.url": "https:\/\/cloud.DOMAIN.TLD\/",
        "overwriteprotocol": "https",
        "overwritehost": "cloud.DOMAIN.TLD",
        "dbtype": "pgsql",
        "version": "26.0.0.11",
        "dbname": "***REMOVED SENSITIVE VALUE***",
        "dbhost": "***REMOVED SENSITIVE VALUE***",
        "dbport": "",
        "dbtableprefix": "oc_",
        "mysql.utf8mb4": true,
        "dbuser": "***REMOVED SENSITIVE VALUE***",
        "dbpassword": "***REMOVED SENSITIVE VALUE***",
        "installed": true,
        "updater.release.channel": "stable",
        "maintenance": false,
        "theme": "",
        "loglevel": 2,
        "logfile": "\/config\/log\/nextcloud\/nextcloud.log",
        "logdateformat": "D d\/m\/Y H:i:s",
        "log.condition": {
            "apps": [
                "admin_audit"
            ]
        },
        "data-fingerprint": "6619a1ae23dec248f717298ba30bd687",
        "mail_smtpmode": "smtp",
        "mail_smtpauthtype": "LOGIN",
        "mail_sendmailmode": "pipe",
        "mail_smtpauth": 1,
        "mail_smtpsecure": "tls",
        "mail_from_address": "***REMOVED SENSITIVE VALUE***",
        "mail_domain": "***REMOVED SENSITIVE VALUE***",
        "mail_smtphost": "***REMOVED SENSITIVE VALUE***",
        "mail_smtpport": "587",
        "mail_smtpname": "***REMOVED SENSITIVE VALUE***",
        "mail_smtppassword": "***REMOVED SENSITIVE VALUE***",
        "trashbin_retention_obligation": "30,60",
        "versions_retention_obligation": "30,60",
        "enable_previews": true,
        "preview_max_x": "2048",
        "preview_max_y": "2048",
        "preview_max_memory": "1024",
        "jpeg_quality": "60",
        "default_phone_region": "GR",
        "activity_expire_days": "90",
        "app_install_overwrite": [
            "previewgenerator",
            "breezedark",
            "bookmarks",
            "bruteforcesettings",
            "imageconverter",
            "files_antivirus",
            "news",
            "timemanager"
        ],
        "enabledPreviewProviders": [
            "OC\\Preview\\Image",
            "OC\\Preview\\MP3",
            "OC\\Preview\\TXT",
            "OC\\Preview\\PDF",
            "OC\\Preview\\Movie",
            "OC\\Preview\\Photoshop",
            "OC\\Preview\\TIFF",
            "OC\\Preview\\SVG",
            "OC\\Preview\\OpenDocument",
            "OC\\Preview\\HEIC"
        ]
    }
}

List of activated Apps

Enabled:
  - activity: 2.18.0
  - admin_audit: 1.16.0
  - bookmarks: 13.0.1
  - breezedark: 25.0.1
  - bruteforcesettings: 2.6.0
  - calendar: 4.3.1
  - circles: 26.0.0
  - cloud_federation_api: 1.9.0
  - comments: 1.16.0
  - contacts: 5.2.0
  - contactsinteraction: 1.7.0
  - dashboard: 7.6.0
  - dav: 1.25.0
  - deck: 1.9.0
  - federatedfilesharing: 1.16.0
  - federation: 1.16.0
  - files: 1.21.1
  - files_antivirus: 4.0.3
  - files_automatedtagging: 1.16.1
  - files_pdfviewer: 2.7.0
  - files_rightclick: 1.5.0
  - files_sharing: 1.18.0
  - files_trashbin: 1.16.0
  - files_versions: 1.19.1
  - firstrunwizard: 2.15.0
  - imageconverter: 1.3.4
  - logreader: 2.11.0
  - lookup_server_connector: 1.14.0
  - mail: 3.0.2
  - news: 21.1.0
  - nextcloud_announcements: 1.15.0
  - notes: 4.7.2
  - notifications: 2.14.0
  - oauth2: 1.14.0
  - password_policy: 1.16.0
  - photos: 2.2.0
  - previewgenerator: 5.2.1
  - privacy: 1.10.0
  - provisioning_api: 1.16.0
  - recognize: 3.7.0
  - recommendations: 1.5.0
  - related_resources: 1.1.0-alpha1
  - richdocuments: 8.0.0
  - serverinfo: 1.16.0
  - settings: 1.8.0
  - sharebymail: 1.16.0
  - support: 1.9.0
  - survey_client: 1.14.0
  - suspicious_login: 4.4.0
  - systemtags: 1.16.0
  - tasks: 0.14.5
  - text: 3.7.2
  - theming: 2.1.1
  - twofactor_backupcodes: 1.15.0
  - twofactor_totp: 8.0.0-alpha.0
  - updatenotification: 1.16.0
  - user_status: 1.6.0
  - viewer: 1.10.0
  - weather_status: 1.6.0
  - workflowengine: 2.8.0
Disabled:
  - encryption: 2.14.0 (installed 2.8.1)
  - files_external: 1.18.0 (installed 1.17.0)
  - onlyoffice: 7.8.0 (installed 7.8.0)
  - user_ldap: 1.16.0

Nextcloud Signing status

No errors have been found.

Nextcloud Logs

No response

Additional info

No response

@stavros-k stavros-k added 0. Needs triage Pending check for reproducibility or if it fits our roadmap bug labels Mar 26, 2023
@szaimen
Copy link
Contributor

szaimen commented Mar 26, 2023

Cc @MichaIng

@MichaIng
Copy link
Member

MichaIng commented Mar 26, 2023

Not related to our recent change, so I removed 26-feedback tag.

Same request was done here: nextcloud/docker#1218
Since it makes more sense to discuss it here on the server repo, I'm closing the issue at the Docker repo and copying the relevant part of my comment here:

All the other header values you added are obsolete if noindex is set. So they are not "doing more" but should be removed to save the additional bytes transferred and to avoid possible confusion of other search engines which do not understand them.

So while the extras do most likely not hurt, they are all redundant/unnecessary and at best ballast. And while I do not think so, there is also a chance that those, since not supported by all search engines, may even confuse other search engines and in worst case lead to the essential noindex being ignored. So IMO it is correct to explicitly accept only noindex, nofollow, as failsafe value.

@smileyxy
Copy link

smileyxy commented Apr 3, 2023

modify the X-Robots-Tag in nextcloud\config\nginx\site-confs
"add_header X-Robots-Tag "noindex, nofollow" always;"
It should be possible to eliminate this warning.

@hrenard
Copy link
Contributor

hrenard commented Jun 13, 2023

Shouldn't none be equivalent to noindex, nofollow and not show the warning ?

@MichaIng
Copy link
Member

It is not equivalent: #36689

@schneidr
Copy link

So IMO it is correct to explicitly accept only noindex, nofollow, as failsafe value

I disagree. While this might be true for the values noted in the OP I am using the additional value noai, which is not covered by noindex or nofollow. It should only be checked if the wanted values are set, not if they are set exclusively.

@MichaIng
Copy link
Member

Is there some "official" documentation about this header value? I could only find it being respected by the img2dataset tool, which used to massively cause trouble for image-related websites like DeviantArt, not to forget the personal and legal concerns: https://github.com/rom1504/img2dataset/pull/249/files

However, checking the docs of the option which is not set by default to 4 header values, suggests that any of them is interpreted as denial: https://github.com/rom1504/img2dataset/blob/6adc82b/README.md?plain=1#L169
So for this tool at least, noindex implies noai (as long as users of this tool do not override/empty the options default).

If you know a tool/service/case where it is different, and noai is additionally required, let me know, and I will adjust the check when I find time.

@darkBuddha
Copy link

This is still not fixed

@MichaIng
Copy link
Member

It is also still no closed and the confirmation that any additional value for this header makes any sense is outstanding (see my last reply). As long as noindex, nofollow covers everything else, there is no point to allow extending the value.

@darkBuddha
Copy link

darkBuddha commented Oct 29, 2023

Yes, but e.g. nofollow, noindex should also be valid as it has the same meaning. Currently it triggers the warning.

The test should look for the presence of noindex and nofollow instead of looking fornoindex, nofollow.

Simple as.

@MichaIng
Copy link
Member

MichaIng commented Oct 29, 2023

Of course we could make those tests overly complex and hence error-prone. But is it really such a problem to just exchange the two arguments in your webserver config?

@darkBuddha
Copy link

I don't consider a test that tests correctly overly complex

I consider a test that doesn't test correctly a bug

If a project doesn't aim to be bug free, it can never become bug free

@MichaIng
Copy link
Member

I am just a spare time contributor as well, and I have no motivation to rewrite the test unless it is really required to solve peoples need. All headers tests expect a hardcoded string, this was always the case, and it should not be a problem unless this hardcoded string is not fulfilling all possible user needs. As noindex, nofollow and nofollow, noindex are equivalent, the second does not add any use case and hence does not solve any problem, so I am not gonna invest time for this.

If you have a different opinion, and changing the header in your webserver config causes you more trouble then rewriting the test yourself, you are of course free to open a PR 😉.

@kale1d0code
Copy link

I have X-Robots-Tag set to noindex, nofollow yet I still get error messages about it not being set to noindex,nofollow

might be best to remove the error message altogether as I have never known it to work correctly.

@darkBuddha
Copy link

The entire approach is flawed, wondering how low quality code like this can stay in such a big project.

@MichaIng
Copy link
Member

@kale1d0code
Then you did not apply it correctly. Please check this:

curl -IL https://localhost/

or of your Nextcloud is in a sub directory:

curl -IL https://localhost/nextcloud/

@darkBuddha
You are free to open a PR to enhance the code, if you think you can do it better and it is worth your time. I am a volunteer with own projects, and others I am contributing to. All I did was changing the hardcoded expected value none into the hardcoded expected value noindex, nofollow, since the first is not supported by all search engines. This was a quick change allowing you to have your Nextcloud instance removed from all search engines without triggering the admin panel warning, so this was a real enhancement and worth my time. Based on all information I read here and elsewhere, changing this into a complicated conditional to support all variances of this header value, with(out) spaces, additional values, swapped values etc, does not add anything to hiding Nextcloud instance content from search engines. All it does is, allowing admins to copy and paste other values into their webservers, which are not better than noindex, nofollow, but probably worse (if we do not further research about conflicting values, code the check very carefully etc), and this for a significantly larger amount of invested time, either someones free time, or paid developer time, which is surely better invested in real enhancements, at least IMO.

If this topic remains just a "but I want to be able to copy&paste other/worse values into my config, while I am not able or willing to implement this myself", I vote for locking it. If someone comes up with a "real" argument, then we can reopen it.

@kale1d0code
Copy link

kale1d0code commented Jul 18, 2024

I ran

curl -IL http://localhost:8080/

and found the x-robots-tag in every 302 and 303 request but the last 200 request did not have the header.
This is confusing because nginx should always add the header to every single request due to my config

server {
    listen 8080;
    listen [::]:8080;
    server_name ${NGINX_HOST};

    port_in_redirect off;
    fastcgi_param  SERVER_PORT 443;
    proxy_set_header Host $http_host;
    fastcgi_param HTTPS on;

    # HTTP response headers borrowed from Nextcloud `.htaccess`
    add_header Referrer-Policy                      "no-referrer"              always;
    add_header X-Content-Type-Options               "nosniff"                  always;
    add_header X-Download-Options                   "noopen"                   always;
    add_header X-Frame-Options                      "SAMEORIGIN"               always;
    add_header X-Permitted-Cross-Domain-Policies    "none"                     always;
    add_header X-Robots-Tag                         "noindex, nofollow"        always;
    add_header X-XSS-Protection                     "1; mode=block"            always;
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
    fastcgi_hide_header X-Powered-By;

    root /var/www/html;
    ...

another confusing thing is the header shows a space in the curl output but doesn't in chrome devtools.

@MichaIng
Copy link
Member

MichaIng commented Jul 18, 2024

Note that in Nginx, add_header directives or parent config blocks have no effect if applied child config blocks contain any add_header by themselves. So if there is any other location block or similar applied for the last request, this might explain it.

@joshtrichards joshtrichards changed the title [Bug]: Nextcloud checks X-Robot-Tag for exact match, resulting in a warning. [Bug]: Setup check for X-Robot-Tag expects exact match, resulting in a warning Jul 31, 2024
@joshtrichards joshtrichards added needs review Needs review to determine if still applicable enhancement and removed overview labels Oct 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0. Needs triage Pending check for reproducibility or if it fits our roadmap enhancement feature: settings needs review Needs review to determine if still applicable privacy
Projects
None yet
Development

No branches or pull requests

9 participants