HTTPS surface reachable (robots ✓, sitemap ✗, title ✓)
Why it matters: Public files such as robots.txt, sitemap.xml, and head metadata are the first things an attacker reads during reconnaissance. Disallow rules that advertise sensitive paths, stale sitemaps, and verbose generator tags all leak more than intended (ISO 27001 A.8.9).
robots.txt
present
#
# robots.txt for https://www.w3.org/
#
# $Id: robots.txt,v 1.104 2026/02/16 05:12:36 gerald Exp $
#
# the following settings apply to all bots
User-agent: *
# Blogs - WordPress
# https://wordpress.org/documentation/article/search-engine-optimization/#robots-txt-optimization
Disallow: /*/wp-admin/
Disallow: /*/wp-includes/
Disallow: /*/wp-content/plugins/
Disallow: /*/wp-content/cache/
Disallow: /*/wp-content/themes/
Disallow: /blog/?
Disallow: /blog/*/trackback/
Disallow: /blog/*/feed/
Disallow: /blog/*/comments/
Disallow: /blog/*/category/*/*
Disallow: /blog/*/*/trackback/
Disallow: /blog/*/*/feed/
Disallow: /blog/*/*/comments/
Disallow: /blog/*/*?
Disallow: /community/trackback/
Disallow: /community/feed/
Disallow: /community/comments/
Disallow: /community/category/*/*
Disallow: /community/*/trackback/
Disallow: /community/*/feed/
Disallow: /community/*/comments/
Disallow: /community/*/category/*/*
Disallow: /community/*?
Disallow: /Consortium/Offices/trackback/
Disallow: /Consortium/Offices/feed/
Disallow: /Consortium/Offices/comments/
Disallow: /Consortium/Offices/category/*/*
Disallow: /Consortium/Offices/*/trackback/
Disallow: /Consortium/Offices/*/feed/
Disallow: /Consortium/Offices/*/comments/
Disallow: /Consortium/Offices/*?
# Wikis - Mediawiki
# https://www.mediawiki.org/wiki/Manual:Robots.txt
Disallow: /wiki/index.php?
Disallow: /wiki/index.php/Help
Disallow: /wiki/index.php/MediaWiki
Disallow: /wiki/index.php/Special:
Disallow: /wiki/index.php/Template
Disallow: /wiki/Special:
Disallow: /wiki/Help:
Disallow: /wiki/Talk:
Disallow: /wiki/skins/
Disallow: /wiki/*/info
Disallow: /wiki/*/edit
Disallow: /wiki/*/history
Disallow: /*/wiki/index.php?
Disallow: /*/wiki/index.php/Help
Disallow: /*/wiki/index.php/MediaWiki
Disallow: /*/wiki/index.php/Special:
Disallow: /*/wiki/index.php/Template
Disallow: /*/wiki/Special:
Disallow: /*/wiki/Help:
Disallow: /*/wiki/Talk:
Disallow: /*/wiki/skins/
Disallow: /*/wiki/*/info
Disallow: /*/wiki/*/edit
Disallow: /*/wiki/*/history
# various other access-controlled or expensive areas
Disallow: /2004/ontaria/basic
Disallow: /Member/
Disallow: /Team/
Disallow: /Project/
Disallow: /Web/
Disallow: /Systems/
Disallow: /Out-Of-Date
Disallow: /2005/06/blog/
Disallow: /2004/08/W3CTalks
Disallow: /2007/11/Talks/search
Disallow: /People/all/
Disallow: /people/all/
Disallow: /People/domain/
Disallow: /people/domain/
Disallow: /RDF/Validator/ARPServlet
Disallow: /RDF/Validator/rdfval
Disallow: /2003/03/Translations/byLanguage
Disallow: /2003/03/Translations/byTechnology
Disallow: /2005/11/Translations/Query
Disallow: /2000/06/webdata/xslt
Disallow: /2000/09/webdata/xslt
Disallow: /2005/08/online_xslt/xslt
Disallow: /Search/Mail/Public/
Disallow: /2006/02/chartergen
Disallow: /2004/01/pp-impl
Disallow: /Consortium/supporters
Disallow: /2012/pyRdfa/extract
Disallow: /2012/pyRdfa/?
Disallow: /WAI/PF/comments/
Disallow: /WAI/events/
Disallow: /participate/conferences.xml
Disallow: /scripts/
Disallow: /2005/01/yacker/
Disallow: /2005/01/yacker?
Disallow: /2005/07/pubrules?
Disallow: /ns/hydra/console/?
Disallow: /2007/08/grddl/?
Disallow: /2009/07/webidl-check?
Disallow: /RDF/Validator/ARPServlet?
Disallow: /2000/06/webdata/xsv?
Disallow: /2000/09/webdata/xsv?
Disallow: /2007/10/htmldiff?
Disallow: /Style/CSS/members.be/
Disallow: /services
Disallow: /2021/05/view-gallery/
Disallow: /International/questions/qa-html-language-declarations/icons/
Disallow: /International/questions/qa-html-language-declarations/qa-html-language-declarations-data/icons/
Disallow: /*,*
Disallow: /WAI/beta/
Disallow: /WAI/ut1/
Disallow: /WAI/ut2/
Disallow: /WAI/ut3/
Disallow: /WAI/ut4/
Disallow: /WAI/drafts/
# Allow W3C Link checker to check any paths on this site
User-agent: W3C-checklink
Disallow:
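A listing like the one above can be triaged mechanically. The following is a minimal sketch, not the scanner's actual logic: it groups Disallow rules by user-agent, collects any Sitemap directives, and flags paths that contain a keyword suggesting a restricted or staging area. The function names, the `SENSITIVE_HINTS` keyword list, and the shortened `SAMPLE` are assumptions made for illustration.

```python
# Hypothetical keyword list; tune it for the site under review.
SENSITIVE_HINTS = ("admin", "member", "team", "beta", "draft", "systems")


def parse_robots(text):
    """Group Disallow rules by user-agent and collect Sitemap URLs."""
    rules = {}
    sitemaps = []
    agents = []        # user-agents of the group currently being read
    in_rules = False   # True once the current group has seen a rule line
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a rule line was seen, so this starts a new group
                agents, in_rules = [], False
            agents.append(value)
            rules.setdefault(value, [])
        elif field in ("allow", "disallow"):
            in_rules = True
            # An empty Disallow value means "nothing is disallowed".
            if field == "disallow" and value:
                for agent in agents:
                    rules[agent].append(value)
        elif field == "sitemap":
            sitemaps.append(value)
    return rules, sitemaps


def flag_sensitive(paths, hints=SENSITIVE_HINTS):
    """Return the paths that contain one of the hint substrings."""
    return [p for p in paths if any(h in p.lower() for h in hints)]


# Shortened excerpt of the listing above, used only as sample input.
SAMPLE = """\
User-agent: *
Disallow: /*/wp-admin/
Disallow: /Member/
Disallow: /WAI/beta/
Disallow: /scripts/
User-agent: W3C-checklink
Disallow:
"""

if __name__ == "__main__":
    rules, sitemaps = parse_robots(SAMPLE)
    print(flag_sensitive(rules["*"]))
    print(rules["W3C-checklink"], sitemaps)
```

A substring match is deliberately crude. It surfaces candidates for manual review and says nothing about whether a flagged path is actually reachable or protected.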
head
- title: W3C
- description: The World Wide Web Consortium (W3C) develops standards and guidelines to help everyone build a web based on the principles of accessibility, internationalization, privacy and security.
social
- og:url: https://www.w3.org/
- og:type: website
- og:title: W3C
- og:image: https://www.w3.org/assets/website-2021/images/w3c-opengraph-image.png
- og:image:width: 1200
- og:image:height: 630
- og:description: The World Wide Web Consortium (W3C) develops standards and guidelines to help everyone build a web based on the principles of accessibility, internationalization, privacy and security.
- og:site_name: W3C
- twitter:card: summary_large_image
- twitter:site: w3c
- twitter:url: https://www.w3.org/
- twitter:title: W3C
- twitter:description: The World Wide Web Consortium (W3C) develops standards and guidelines to help everyone build a web based on the principles of accessibility, internationalization, privacy and security.
- twitter:image: https://www.w3.org/assets/website-2021/images/w3c-opengraph-image.png
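The head and social fields listed above could be collected with a small parser along these lines. This is a sketch that uses only the standard library's `html.parser`, not the tool that produced this report. The class and function names are illustrative, and the `generator` tag in the sample is hypothetical: it is added to show the check, since no generator tag appears in the scanned page.

```python
from html.parser import HTMLParser


class HeadMeta(HTMLParser):
    """Collect <title> text and <meta> name/property -> content pairs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Open Graph tags use "property"; most others use "name".
            key = attrs.get("name") or attrs.get("property")
            if key and attrs.get("content") is not None:
                self.meta[key.lower()] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit_head(html):
    """Return (title, meta dict, findings) for an HTML document."""
    parser = HeadMeta()
    parser.feed(html)
    parser.close()
    findings = []
    if "generator" in parser.meta:
        findings.append("generator tag exposes: " + parser.meta["generator"])
    return parser.title.strip(), parser.meta, findings


# Sample input: three values from the report plus a hypothetical generator tag.
SAMPLE = """<html><head>
<title>W3C</title>
<meta property="og:type" content="website">
<meta name="twitter:card" content="summary_large_image">
<meta name="generator" content="ExampleCMS 1.2.3">
</head><body></body></html>"""

if __name__ == "__main__":
    title, meta, findings = audit_head(SAMPLE)
    print(title, meta, findings)
```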