HTTPS surface reachable (robots ✓, sitemap ✓, title ✗)
Why it matters: Public files (robots.txt, sitemap.xml, and the meta tags in the page head) are among the first things an attacker inspects during reconnaissance. Misadvertised paths, stale sitemaps, and verbose generator tags can leak more than intended (ISO 27001 A.8.9).
robots.txt
present
# Google Search Engine Robot
# ==========================
User-agent: Googlebot
Allow: /*?s=
Allow: /*?t=
Allow: /*?ref_src=
Allow: /hashtag/*?src=
Allow: /search?q=%23
Allow: /i/api/
Disallow: /*?lang=en-ss
Allow: /*?lang=
Disallow: /search/realtime
Disallow: /search/users
Disallow: /search/*/grid
Disallow: /*/analytics
Disallow: /*/followers
Disallow: /*/following
Disallow: /account/deactivated
Disallow: /settings/deactivated
Disallow: /[_0-9a-zA-Z]+/status/[0-9]+/likes
Disallow: /[_0-9a-zA-Z]+/status/[0-9]+/retweets
Disallow: /[_0-9a-zA-Z]+/likes
Disallow: /[_0-9a-zA-Z]+/media
Disallow: /[_0-9a-zA-Z]+/photo
Allow: /*?
User-agent: Bingbot
Allow: /*?s=
Allow: /*?t=
Allow: /*?ref_src=
Allow: /hashtag/*?src=
Allow: /search?q=%23
Allow: /i/api/
Disallow: /*?lang=en-ss
Allow: /*?lang=
Disallow: /search/realtime
Disallow: /search/users
Disallow: /search/*/grid
Disallow: /*/analytics
Disallow: /*/followers
Disallow: /*/following
Disallow: /account/deactivated
Disallow: /settings/deactivated
Disallow: /[_0-9a-zA-Z]+/status/[0-9]+/likes
Disallow: /[_0-9a-zA-Z]+/status/[0-9]+/retweets
Disallow: /[_0-9a-zA-Z]+/likes
Disallow: /[_0-9a-zA-Z]+/media
Disallow: /[_0-9a-zA-Z]+/photo
Allow: /*?
User-agent: facebookexternalhit
Allow: /*?lang=
Allow: /*?s=
Allow: /*?t=
Allow: /*?ref_src=
Allow: /hashtag/*?src=
Allow: /search?q=%23
Allow: /search?*cashtagRestId=
Allow: /i/api/
Disallow: /search/realtime
Disallow: /search/users
Disallow: /search/*/grid
Disallow: /*?
Disallow: /*/followers
Disallow: /*/following
Disallow: /account/deactivated
Disallow: /settings/deactivated
Disallow: /[_0-9a-zA-Z]+/status/[0-9]+/likes
Disallow: /[_0-9a-zA-Z]+/status/[0-9]+/retweets
Disallow: /[_0-9a-zA-Z]+/likes
Disallow: /[_0-9a-zA-Z]+/media
Disallow: /[_0-9a-zA-Z]+/photo
User-Agent: Google-Extended
Disallow: *
User-Agent: FacebookBot
Disallow: *
User-agent: Discordbot
Disallow: *
# Every bot that might possibly read and respect this file
# ========================================================
User-agent: *
Disallow: /
# WHAT-4882 - Block indexing of links in notification emails. This applies to all bots.
# =====================================================================================
Disallow: /i/u
Noindex: /i/u
# Wait 1 second between successive requests. See ONBOARD-2698 for details.
Crawl-delay: 1
# Independent of user agent. Links in the sitemap are full URLs using https:// and need to match
# the protocol of the sitemap.
Sitemap: https://x.com/sitemap.xml
Sitemap: https://twitter.com/sitemap.xml
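A file of this size is easier to review per user-agent group than line by line. The following is a minimal sketch of such a review, not the scanner's own logic: `summarize_robots` and the `KNOWN` set are hypothetical names introduced here, and which directives count as widely recognised is an assumption made for illustration.

```python
from collections import Counter

# Directive names this sketch treats as widely recognised. This set is an
# assumption for illustration, not an authoritative list.
KNOWN = {"user-agent", "allow", "disallow", "sitemap"}


def summarize_robots(text):
    """Group allow/disallow rules by user-agent and tally directive names."""
    groups = {}             # agent name -> list of (directive, value)
    current_agents = []     # agents named by the group currently being read
    last_was_agent = False  # consecutive User-agent lines share one group
    directives = Counter()

    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or ":" not in line:
            continue
        name, value = line.split(":", 1)
        name, value = name.strip().lower(), value.strip()
        directives[name] += 1

        if name == "user-agent":
            if not last_was_agent:
                current_agents = []  # a new group starts here
            current_agents.append(value)
            groups.setdefault(value, [])
            last_was_agent = True
            continue

        last_was_agent = False
        if name in ("allow", "disallow"):
            for agent in current_agents:
                groups[agent].append((name, value))

    unknown = sorted(d for d in directives if d not in KNOWN)
    return groups, directives, unknown


# Shortened excerpt of the file above, used only to exercise the function.
sample = """\
User-agent: Googlebot
Allow: /*?lang=
Disallow: /search/realtime
User-agent: *
Disallow: /
Noindex: /i/u
Crawl-delay: 1
Sitemap: https://x.com/sitemap.xml
"""

groups, directives, unknown = summarize_robots(sample)
for agent, rules in groups.items():
    print(agent, len(rules), "rule(s)")
print("directives outside the assumed set:", unknown)
```

Run against the full file, a tally like this would list `noindex` and `crawl-delay` outside the assumed set, which is a cue to check whether the intended crawlers actually honor them.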
sitemap.xml
present, 0 URL(s) listed
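A count of 0 URLs can mean the sitemap is empty, or that it is a sitemap index pointing to child sitemaps rather than listing pages directly. The sketch below distinguishes the two cases. It assumes the document uses the standard sitemaps.org namespace; `count_sitemap_entries` is a hypothetical helper name, and the example.com URLs are placeholders.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def count_sitemap_entries(xml_text):
    """Return (page_urls, child_sitemaps) found in a sitemap document."""
    root = ET.fromstring(xml_text)
    page_urls = root.findall("sm:url/sm:loc", NS)           # <urlset> entries
    child_sitemaps = root.findall("sm:sitemap/sm:loc", NS)  # <sitemapindex> entries
    return len(page_urls), len(child_sitemaps)


# Placeholder sitemap index used only to exercise the function.
index_doc = """\
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/sitemap-1.xml</loc></sitemap>
  <sitemap><loc>https://example.com/sitemap-2.xml</loc></sitemap>
</sitemapindex>
"""

pages, children = count_sitemap_entries(index_doc)
print(f"{pages} page URL(s), {children} child sitemap(s)")
```

If the child count is non-zero, each child sitemap would need to be fetched and counted in turn before concluding that the site advertises no URLs.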
social
no OpenGraph or Twitter meta tags found
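To reproduce this check by hand, the page head can be scanned for `<meta>` tags whose `property` or `name` starts with `og:` or `twitter:`. The following is a minimal sketch using Python's standard `html.parser`. `SocialMetaParser` and `social_meta` are hypothetical names, and the sketch only sees the HTML it is given, so tags injected later by client-side scripts would not be found.

```python
from html.parser import HTMLParser


class SocialMetaParser(HTMLParser):
    """Collect <meta> tags whose property/name starts with og: or twitter:."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # OpenGraph conventionally uses property=, Twitter cards use name=.
        key = attr.get("property") or attr.get("name") or ""
        if key.startswith(("og:", "twitter:")):
            self.found[key] = attr.get("content") or ""


def social_meta(html):
    """Return a dict of social meta tags found in the given HTML text."""
    parser = SocialMetaParser()
    parser.feed(html)
    return parser.found


# Placeholder markup used only to exercise the function.
page = (
    '<head><title>Example</title>'
    '<meta property="og:title" content="Example">'
    '<meta name="twitter:card" content="summary" />'
    '<meta name="viewport" content="width=device-width">'
    '</head>'
)
print(social_meta(page))
```

An empty result for the static HTML, as reported here, is worth confirming against the rendered page before treating the tags as missing.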