feat: enhance SurfSense with new skills, blog section, and improve SEO metadata

- Added multiple new skills to skills-lock.json from the repository `aaron-he-zhu/seo-geo-claude-skills`.
- Introduced `fuzzy-search` dependency in package.json for improved search functionality.
- Updated pnpm-lock.yaml to include the new `fuzzy-search` package.
- Enhanced SEO metadata across various pages, including canonical links and descriptions for better search visibility.
- Improved layout and structure of several components, including the homepage and changelog, to enhance user experience.
DESKTOP-RTLN3BA\$punk 2026-04-11 23:38:12 -07:00
parent 61b3f0d7e3
commit 7ea840dbb2
120 changed files with 25729 additions and 352 deletions

@@ -0,0 +1,397 @@
---
name: technical-seo-checker
description: 'Technical SEO audit: Core Web Vitals, crawl, indexing, mobile, speed, architecture, redirects. 技术SEO/网站速度'
version: "6.0.0"
license: Apache-2.0
compatibility: "Claude Code ≥1.0, skills.sh marketplace, ClawHub marketplace, Vercel Labs skills ecosystem. No system packages required. Optional: MCP network access for SEO tool integrations."
homepage: "https://github.com/aaron-he-zhu/seo-geo-claude-skills"
when_to_use: "Use when checking technical SEO health: site speed, Core Web Vitals, indexing, crawlability, robots.txt, sitemaps, or canonical tags."
argument-hint: "<URL or domain>"
allowed-tools: WebFetch
metadata:
author: aaron-he-zhu
version: "6.0.0"
geo-relevance: "low"
tags:
- seo
- technical-seo
- core-web-vitals
- page-speed
- crawlability
- indexability
- mobile-seo
- site-health
- lcp
- cls
- inp
- robots-txt
- xml-sitemap
- 技术SEO
- 网站速度
- テクニカルSEO
- 기술SEO
- seo-tecnico
triggers:
# EN-formal
- "technical SEO audit"
- "check page speed"
- "Core Web Vitals"
- "crawl issues"
- "site indexing problems"
- "canonical tag issues"
- "duplicate content"
- "mobile-friendly check"
- "site speed"
- "site health check"
# EN-casual
- "my site is slow"
- "Google can't crawl my site"
- "Google can't find my pages"
- "mobile issues"
- "indexing problems"
- "why is my site slow"
# EN-question
- "how do I fix my page speed"
- "why is my site not indexed"
- "how to improve Core Web Vitals"
- "why did my site disappear from Google"
# EN-competitor
- "PageSpeed Insights alternative"
- "GTmetrix alternative"
- "Sitebulb alternative"
# ZH-pro
- "技术SEO检查"
- "网站速度优化"
- "核心网页指标"
- "爬虫问题"
- "索引问题"
- "网站收录"
- "sitemap提交"
- "robots设置"
# ZH-casual
- "网站加载太慢"
- "网站太慢了"
- "Google找不到我的页面"
- "手机端有问题"
- "收录不了"
- "Google收录少"
# JA
- "テクニカルSEO"
- "サイト速度"
- "コアウェブバイタル"
- "クロール問題"
- "インデックス登録"
- "モバイル最適化"
# KO
- "기술 SEO"
- "사이트 속도"
- "코어 웹 바이탈"
- "크롤링 문제"
- "사이트 왜 이렇게 느려?"
# ES
- "auditoría SEO técnica"
- "velocidad del sitio"
- "problemas de indexación"
- "mi sitio no aparece en Google"
- "velocidad de carga"
# PT
- "auditoria SEO técnica"
- "meu site não aparece no Google"
- "velocidade de carregamento"
# Misspellings
- "techincal SEO"
- "core web vitalls"
---
# Technical SEO Checker
> **[SEO & GEO Skills Library](https://github.com/aaron-he-zhu/seo-geo-claude-skills)** · 20 skills for SEO + GEO · [ClawHub](https://clawhub.ai/u/aaron-he-zhu) · [skills.sh](https://skills.sh/aaron-he-zhu/seo-geo-claude-skills)
> **System Mode**: This optimization skill follows the shared [Skill Contract](https://github.com/aaron-he-zhu/seo-geo-claude-skills/blob/main/references/skill-contract.md) and [State Model](https://github.com/aaron-he-zhu/seo-geo-claude-skills/blob/main/references/state-model.md).
This skill performs comprehensive technical SEO audits to identify issues that may prevent search engines from properly crawling, indexing, and ranking your site.
**System role**: Optimization layer skill. It turns weak pages, structures, and technical issues into prioritized repair work.
## When This Must Trigger
Use this when the conversation involves any of these situations, even if the user does not use SEO terminology, and whenever the task needs a diagnosis or repair plan that feeds directly into remediation work rather than a one-time opinion:
- Launching a new website
- Diagnosing ranking drops
- Pre-migration SEO audits
- Regular technical health checks
- Identifying crawl and index issues
- Improving site performance
- Fixing Core Web Vitals issues
## What This Skill Does
1. **Crawlability Audit**: Checks robots.txt, sitemaps, crawl issues
2. **Indexability Review**: Analyzes index status and blockers
3. **Site Speed Analysis**: Evaluates Core Web Vitals and performance
4. **Mobile-Friendliness**: Checks mobile optimization
5. **Security Check**: Reviews HTTPS and security headers
6. **Structured Data Audit**: Validates schema markup
7. **URL Structure Analysis**: Reviews URL patterns and redirects
8. **International SEO**: Checks hreflang and localization
## Quick Start
Start with one of these prompts. Finish with a short handoff summary using the repository format in [Skill Contract](https://github.com/aaron-he-zhu/seo-geo-claude-skills/blob/main/references/skill-contract.md).
### Full Technical Audit
```
Perform a technical SEO audit for [URL/domain]
```
### Specific Issue Check
```
Check Core Web Vitals for [URL]
```
```
Audit crawlability and indexability for [domain]
```
### Pre-Migration Audit
```
Technical SEO checklist for migrating [old domain] to [new domain]
```
## Skill Contract
**Expected output**: a scored diagnosis, prioritized repair plan, and a short handoff summary ready for `memory/audits/`.
- **Reads**: the current page or site state, symptoms, prior audits, and current priorities from [CLAUDE.md](https://github.com/aaron-he-zhu/seo-geo-claude-skills/blob/main/CLAUDE.md) and the shared [State Model](https://github.com/aaron-he-zhu/seo-geo-claude-skills/blob/main/references/state-model.md) when available.
- **Writes**: a user-facing audit or optimization plan plus a reusable summary that can be stored under `memory/audits/`.
- **Promotes**: blocking defects, repeated weaknesses, and fix priorities to `memory/open-loops.md` and `memory/decisions.md`.
- **Next handoff**: use the `Next Best Skill` below when the repair path is clear.
## Data Sources
> See [CONNECTORS.md](https://github.com/aaron-he-zhu/seo-geo-claude-skills/blob/main/CONNECTORS.md) for tool category placeholders.
**With ~~web crawler + ~~page speed tool + ~~CDN connected:**
Claude can automatically crawl the entire site structure via ~~web crawler, pull Core Web Vitals and performance metrics from ~~page speed tool, analyze caching headers from ~~CDN, and fetch mobile-friendliness data. This enables comprehensive automated technical audits.
**With manual data only:**
Ask the user to provide:
1. Site URL(s) to audit
2. PageSpeed Insights screenshots or reports
3. robots.txt file content
4. sitemap.xml URL or file
Proceed with the full audit using provided data. Note in the output which findings are from automated crawl vs. manual review.
## Instructions
When a user requests a technical SEO audit:
1. **Audit Crawlability**
```markdown
## Crawlability Analysis
### Robots.txt Review
**URL**: [domain]/robots.txt
**Status**: [Found/Not Found/Error]
**Current Content**:
```
[robots.txt content]
```
| Check | Status | Notes |
|-------|--------|-------|
| File exists | ✅/❌ | [notes] |
| Valid syntax | ✅/⚠️/❌ | [errors found] |
| Sitemap declared | ✅/❌ | [sitemap URL] |
| Important pages blocked | ✅/⚠️/❌ | [blocked paths] |
| Assets blocked | ✅/⚠️/❌ | [CSS/JS blocked?] |
| Correct user-agents | ✅/⚠️/❌ | [notes] |
**Issues Found**:
- [Issue 1]
- [Issue 2]
**Recommended robots.txt**:
```
User-agent: *
Allow: /
Disallow: /admin/
Disallow: /private/
Sitemap: https://example.com/sitemap.xml
```
---
### XML Sitemap Review
**Sitemap URL**: [URL]
**Status**: [Found/Not Found/Error]
| Check | Status | Notes |
|-------|--------|-------|
| Sitemap exists | ✅/❌ | [notes] |
| Valid XML format | ✅/⚠️/❌ | [errors] |
| In robots.txt | ✅/❌ | [notes] |
| Submitted to ~~search console | ✅/⚠️/❌ | [notes] |
| URLs count | [X] | [appropriate?] |
| Only indexable URLs | ✅/⚠️/❌ | [notes] |
| Includes priority | ✅/⚠️ | [notes] |
| Includes lastmod | ✅/⚠️ | [accurate?] |
**Issues Found**:
- [Issue 1]
---
### Crawl Budget Analysis
| Factor | Status | Impact |
|--------|--------|--------|
| Crawl errors | [X] errors | [Low/Med/High] |
| Duplicate content | [X] pages | [Low/Med/High] |
| Thin content | [X] pages | [Low/Med/High] |
| Redirect chains | [X] found | [Low/Med/High] |
| Orphan pages | [X] found | [Low/Med/High] |
**Crawlability Score**: [X]/10
```
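Where shell access is available, a couple of quick checks can seed the crawlability tables above. A minimal sketch, assuming `curl` and `xmllint` (from libxml2) are installed; `example.com` is a placeholder:
```bash
# Confirm robots.txt is reachable and see which directives it declares
curl -s https://example.com/robots.txt | grep -iE '^(user-agent|disallow|allow|sitemap|crawl-delay):'

# Spot-check the sitemap: well-formed XML, then a rough URL count
# (assumes one <loc> entry per line, which most generators produce)
curl -s https://example.com/sitemap.xml | xmllint --noout - && echo "sitemap: valid XML"
curl -s https://example.com/sitemap.xml | grep -c '<loc>'
```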
2. **Audit Indexability**
```markdown
## Indexability Analysis
### Index Status Overview
| Metric | Count | Notes |
|--------|-------|-------|
| Pages in sitemap | [X] | |
| Pages indexed | [X] | [source: site: search] |
| Index coverage ratio | [X]% | [good if >90%] |
### Index Blockers Check
| Blocker Type | Found | Pages Affected |
|--------------|-------|----------------|
| noindex meta tag | [X] | [list or "none"] |
| noindex X-Robots | [X] | [list or "none"] |
| Robots.txt blocked | [X] | [list or "none"] |
| Canonical to other | [X] | [list or "none"] |
| 4xx/5xx errors | [X] | [list or "none"] |
| Redirect loops | [X] | [list or "none"] |
### Canonical Tags Audit
| Check | Status | Notes |
|-------|--------|-------|
| Canonicals present | ✅/⚠️/❌ | [X]% of pages |
| Self-referencing | ✅/⚠️/❌ | [notes] |
| Consistent (HTTP/HTTPS) | ✅/⚠️/❌ | [notes] |
| Consistent (www/non-www) | ✅/⚠️/❌ | [notes] |
| No conflicting signals | ✅/⚠️/❌ | [notes] |
### Duplicate Content Issues
| Issue Type | Count | Examples |
|------------|-------|----------|
| Exact duplicates | [X] | [URLs] |
| Near duplicates | [X] | [URLs] |
| Parameter duplicates | [X] | [URLs] |
| WWW/non-WWW | [X] | [notes] |
| HTTP/HTTPS | [X] | [notes] |
**Indexability Score**: [X]/10
```
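When filling in the index-blocker rows by hand, the three main signals for a single URL are all visible from the command line. A sketch assuming `curl`; the URL is a placeholder and the grep patterns are deliberately loose about attribute order:
```bash
URL=https://example.com/page

# Header-level noindex (X-Robots-Tag)
curl -sI "$URL" | grep -i 'x-robots-tag'

# Meta robots tag and canonical target in the HTML
curl -s "$URL" | grep -oiE '<meta[^>]*robots[^>]*>'
curl -s "$URL" | grep -oiE '<link[^>]*canonical[^>]*>'
```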
3. **Audit Site Speed & Core Web Vitals** — CWV metrics (LCP/FID/CLS/INP), additional performance metrics (TTFB/FCP/Speed Index/TBT), resource loading breakdown, optimization recommendations
> **Reference**: See [references/technical-audit-templates.md](https://github.com/aaron-he-zhu/seo-geo-claude-skills/blob/main/optimize/technical-seo-checker/references/technical-audit-templates.md) for the performance analysis template (Step 3).
4. **Audit Mobile-Friendliness** — Mobile-friendly test, responsive design check, mobile-first indexing verification
> **Reference**: See [references/technical-audit-templates.md](https://github.com/aaron-he-zhu/seo-geo-claude-skills/blob/main/optimize/technical-seo-checker/references/technical-audit-templates.md) for the mobile optimization template (Step 4).
5. **Audit Security & HTTPS** — SSL certificate, HTTPS enforcement, mixed content, HSTS, security headers (CSP, X-Frame-Options, etc.)
> **Reference**: See [references/technical-audit-templates.md](https://github.com/aaron-he-zhu/seo-geo-claude-skills/blob/main/optimize/technical-seo-checker/references/technical-audit-templates.md) for the security analysis template (Step 5).
6. **Audit URL Structure** — URL patterns, issues (dynamic params, session IDs, uppercase), redirect analysis (chains, loops, 302s)
> **Reference**: See [references/technical-audit-templates.md](https://github.com/aaron-he-zhu/seo-geo-claude-skills/blob/main/optimize/technical-seo-checker/references/technical-audit-templates.md) for the URL structure template (Step 6).
7. **Audit Structured Data** — Schema markup validation, missing schema opportunities. CORE-EEAT alignment: maps to O05.
> **Reference**: See [references/technical-audit-templates.md](https://github.com/aaron-he-zhu/seo-geo-claude-skills/blob/main/optimize/technical-seo-checker/references/technical-audit-templates.md) for the structured data template (Step 7).
8. **Audit International SEO (if applicable)** — Hreflang implementation, language/region targeting
> **Reference**: See [references/technical-audit-templates.md](https://github.com/aaron-he-zhu/seo-geo-claude-skills/blob/main/optimize/technical-seo-checker/references/technical-audit-templates.md) for the international SEO template (Step 8).
9. **Generate Technical Audit Summary** — Overall health score with visual breakdown, critical/high/medium issues, quick wins, implementation roadmap (weeks 1-4+), monitoring recommendations
> **Reference**: See [references/technical-audit-templates.md](https://github.com/aaron-he-zhu/seo-geo-claude-skills/blob/main/optimize/technical-seo-checker/references/technical-audit-templates.md) for the audit summary template (Step 9).
## Validation Checkpoints
### Input Validation
- [ ] Site URL or domain clearly specified
- [ ] Access to technical data (robots.txt, sitemap, or crawl results)
- [ ] Performance metrics available (via ~~page speed tool or screenshots)
### Output Validation
- [ ] Every recommendation cites specific data points (not generic advice)
- [ ] All issues include affected URLs or page counts
- [ ] Performance metrics include actual numbers with units (seconds, KB, etc.)
- [ ] Source of each data point clearly stated (~~web crawler data, ~~page speed tool, user-provided, or estimated)
## Example
> **Reference**: See [references/technical-audit-example.md](https://github.com/aaron-he-zhu/seo-geo-claude-skills/blob/main/optimize/technical-seo-checker/references/technical-audit-example.md) for a full worked example (cloudhosting.com technical audit) and the comprehensive technical SEO checklist.
## Tips for Success
1. **Prioritize by impact** - Fix critical issues first
2. **Monitor continuously** - Use ~~search console alerts
3. **Test changes** - Verify fixes work before deploying widely
4. **Document everything** - Track changes for troubleshooting
5. **Regular audits** - Schedule quarterly technical reviews
> **Technical reference**: For issue severity framework, prioritization matrix, and Core Web Vitals optimization quick reference, see [references/http-status-codes.md](https://github.com/aaron-he-zhu/seo-geo-claude-skills/blob/main/optimize/technical-seo-checker/references/http-status-codes.md).
### Save Results
After delivering audit or optimization findings to the user, ask:
> "Save these results for future sessions?"
If yes, write a dated summary to `memory/audits/technical-seo-checker/YYYY-MM-DD-<topic>.md` containing:
- One-line verdict or headline finding
- Top 3-5 actionable items
- Open loops or blockers
- Source data references
If any veto-level issue was found (CORE-EEAT T04, C01, R10 or CITE T03, T05, T09), also append a one-liner to `memory/hot-cache.md` without asking.
## Reference Materials
- [robots.txt Reference](https://github.com/aaron-he-zhu/seo-geo-claude-skills/blob/main/optimize/technical-seo-checker/references/robots-txt-reference.md) — Syntax guide, templates, common configurations
- [HTTP Status Codes](https://github.com/aaron-he-zhu/seo-geo-claude-skills/blob/main/optimize/technical-seo-checker/references/http-status-codes.md) — SEO impact of each status code, redirect best practices
- [Technical Audit Templates](https://github.com/aaron-he-zhu/seo-geo-claude-skills/blob/main/optimize/technical-seo-checker/references/technical-audit-templates.md) — Detailed output templates for steps 3-9 (CWV, mobile, security, URL structure, structured data, international, audit summary)
- [Technical Audit Example & Checklist](https://github.com/aaron-he-zhu/seo-geo-claude-skills/blob/main/optimize/technical-seo-checker/references/technical-audit-example.md) — Full worked example and comprehensive technical SEO checklist
## Next Best Skill
- **Primary**: [on-page-seo-auditor](https://github.com/aaron-he-zhu/seo-geo-claude-skills/blob/main/optimize/on-page-seo-auditor/SKILL.md) — continue from infrastructure issues into page-level remediation.

@@ -0,0 +1,705 @@
# HTTP Status Codes for SEO
SEO-relevant HTTP status codes, their implications, and how to diagnose and fix issues.
## Status Code Categories
- **2xx**: Success - Request succeeded
- **3xx**: Redirection - Further action needed
- **4xx**: Client Error - Problem with the request
- **5xx**: Server Error - Server failed to fulfill request
---
## 2xx Success Codes
### 200 OK
**What it means**: Request succeeded, content returned normally.
**SEO impact**: Positive - page is accessible and indexable.
**When to use**: Standard response for all working pages.
**When it's a problem**: When different URLs return 200 for the same content (should use a 301 redirect).
---
### 204 No Content
**What it means**: Request succeeded but no content to return.
**SEO impact**: Neutral - rarely used for pages meant to be indexed.
**Common use**: API responses, AJAX requests.
---
## 3xx Redirection Codes
### 301 Moved Permanently
**What it means**: Resource permanently moved to new URL. All link equity transfers.
**SEO impact**: Positive when used correctly - passes 90-99% of link equity.
**When to use**:
- Permanently changing URL structure
- Consolidating duplicate content
- Moving to new domain
- Changing HTTP to HTTPS
- Changing www to non-www (or vice versa)
**Example header**:
```
HTTP/1.1 301 Moved Permanently
Location: https://example.com/new-page
```
**Common mistakes**:
- Using 302 instead of 301 for permanent changes
- Creating redirect chains (A→B→C)
- Redirecting to irrelevant pages
- Not redirecting HTTP to HTTPS
**How to implement**:
- **.htaccess** (Apache): `Redirect 301 /old-page /new-page`
- **nginx**: `rewrite ^/old-page$ /new-page permanent;`
- **Server-side**: Set Location header with 301 status
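However the redirect is implemented, verify the status code and target before relying on it. A quick check with `curl` (URLs are placeholders; the status line reads `HTTP/2 301` on HTTP/2 servers):
```bash
# Expect a 301 status line and the final Location header
curl -sI https://example.com/old-page | grep -iE '^(HTTP|location)'
```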
---
### 302 Found (Temporary Redirect)
**What it means**: Resource temporarily at different URL. Original URL should still be used.
**SEO impact**: Neutral to negative - does NOT pass full link equity. Search engines keep indexing original URL.
**When to use**:
- A/B testing
- Temporary promotions
- Maintenance redirects
- Geolocation redirects (sometimes)
**When NOT to use**: Permanent URL changes (use 301).
**Warning**: Google may treat long-standing 302s as 301s, but better to be explicit.
---
### 303 See Other
**What it means**: Response can be found at another URI using GET.
**SEO impact**: Minimal - rarely used for SEO purposes.
**Common use**: After form submissions, redirect to results page.
---
### 307 Temporary Redirect
**What it means**: Temporary redirect that preserves request method (POST stays POST).
**SEO impact**: Similar to 302 - temporary, doesn't pass full link equity.
**Difference from 302**: Guarantees request method won't change (more precise than 302).
**When to use**: Temporary redirects where HTTP method preservation matters.
---
### 308 Permanent Redirect
**What it means**: Permanent redirect that preserves request method.
**SEO impact**: Similar to 301 - passes link equity.
**Difference from 301**: Guarantees request method won't change (POST stays POST).
**When to use**: Permanent redirects where method preservation matters (rare for SEO).
---
### Redirect Chain Issues
**Problem**: Multiple redirects before reaching final destination.
**Example chain**:
```
http://example.com/page
→ https://example.com/page (redirect 1)
→ https://www.example.com/page (redirect 2)
→ https://www.example.com/new-page (redirect 3)
```
**SEO impact**:
- Slows page load (each redirect = new HTTP request)
- Dilutes link equity with each hop
- Wastes crawl budget
- Poor user experience
**How to fix**: Redirect directly from original URL to final destination.
**Fixed version**:
```
http://example.com/page
→ https://www.example.com/new-page (single redirect)
```
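curl's redirect-following counters make chains easy to spot in bulk. A sketch with a placeholder URL:
```bash
# hops > 1 indicates a chain worth flattening into a single redirect
curl -sIL -o /dev/null -w 'hops=%{num_redirects} final=%{url_effective}\n' \
  http://example.com/page
```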
---
### Redirect Loops
**Problem**: Redirects create infinite loop.
**Example**:
```
/page-a → /page-b
/page-b → /page-a
```
**SEO impact**: Severe - page completely inaccessible.
**Symptoms**:
- Browser shows "Too many redirects" error
- Page never loads
- Search Console shows crawl errors
**How to diagnose**:
1. Use redirect checker tool
2. Check .htaccess or nginx config for conflicting rules
3. Review server-side redirect logic
**How to fix**:
1. Identify conflicting redirect rules
2. Remove or correct the loop
3. Test thoroughly
4. Request recrawl in Search Console
---
## 4xx Client Error Codes
### 404 Not Found
**What it means**: Requested resource doesn't exist.
**SEO impact**: Neutral to negative depending on context.
**When 404s are OK**:
- Legitimately deleted pages with no equivalent
- Never-existed URLs from typos
- Temporary content that expired (old promotions)
- Intentionally removed low-quality content
**When 404s are problems**:
- Pages that should exist are returning 404
- Previously working pages now broken
- Important pages missing from navigation
- High-traffic pages deleted without redirect
**How to fix**:
1. **If content moved**: Set up 301 redirect to new location
2. **If content deleted**: Either keep 404 or redirect to relevant category
3. **If never existed**: Leave as 404
4. **If important**: Restore the page
**Monitoring 404s**:
- Check Search Console → Coverage → Not found (404)
- Review referrer data to see what's linking to 404s
- Fix high-value 404s first (most traffic/backlinks)
**Soft 404s** (BAD):
- Page returns 200 but shows "not found" message
- Search engines may keep page indexed
- Creates duplicate content issues
- Fix: Return proper 404 status code
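A simple smoke test for soft 404s is to request a URL that cannot exist and check the status code. A sketch with `curl`; the path is deliberately bogus:
```bash
# A healthy server returns 404 here; a 200 suggests soft-404 handling
curl -s -o /dev/null -w '%{http_code}\n' \
  https://example.com/this-page-should-not-exist-xyz123
```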
---
### 410 Gone
**What it means**: Resource permanently deleted, never coming back.
**SEO impact**: Stronger signal than 404 - tells search engines the page will not return.
**When to use**:
- Discontinued products
- Expired promotions
- Permanently removed content
- Outdated information
**Difference from 404**:
- 404: "Not found" (might exist at another URL)
- 410: "Gone forever" (don't look for it)
**When to use 410 vs 301**:
- Use 410: No equivalent replacement exists
- Use 301: Relevant alternative exists
**How search engines respond**:
- Faster de-indexing than 404
- Stop crawling sooner
- Better for crawl budget
---
### 403 Forbidden
**What it means**: Server understood request but refuses to authorize it.
**SEO impact**: Negative - page inaccessible and won't be indexed.
**Common causes**:
- Permission restrictions
- IP blocking
- .htaccess restrictions
- File permissions (chmod)
- Authentication required
**When it's intentional**:
- Admin areas
- Member-only content
- Geographic restrictions
**When it's a problem**:
- Public pages returning 403
- Search engine bots blocked
- Accidental permission changes
**How to diagnose**:
1. Check .htaccess for IP restrictions
2. Verify file permissions (should be 644 for files, 755 for directories)
3. Check server-level access rules
4. Test with different IPs/user-agents
**How to fix**:
1. Adjust file permissions: `chmod 644 filename`
2. Remove blocking rules from .htaccess
3. Whitelist search engine bots
4. Review server firewall rules
---
### 401 Unauthorized
**What it means**: Authentication required but not provided or failed.
**SEO impact**: Negative - page won't be indexed.
**Common causes**:
- Password-protected pages
- HTTP Basic Authentication
- Expired sessions
- Missing credentials
**When it's intentional**: Member areas, staging sites, admin panels.
**How to handle for SEO**:
- Don't password-protect pages you want indexed
- Use separate staging domain with 401
- For members-only content, show teaser with meta robots noindex
---
### 429 Too Many Requests
**What it means**: User/bot sent too many requests in given timeframe (rate limiting).
**SEO impact**: Negative if search engines can't crawl.
**Common causes**:
- Aggressive crawling
- DDoS protection triggered
- API rate limits
- Server throttling
**How to handle**:
1. Check Googlebot isn't being rate-limited (use Search Console)
2. Whitelist verified search engine bots
3. Configure rate limits appropriately
4. Monitor crawl rate in Search Console
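Before whitelisting a bot that claims to be Googlebot, verify it with the reverse-then-forward DNS check Google documents. A sketch assuming `dig` is available; the IP is an example of the kind seen in access logs:
```bash
IP=66.249.66.1                      # example address from your server logs
HOST=$(dig +short -x "$IP")         # expect a *.googlebot.com or *.google.com name
echo "reverse: $HOST"
dig +short "$HOST"                  # should resolve back to the same IP
```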
---
## 5xx Server Error Codes
### 500 Internal Server Error
**What it means**: Generic server error, something went wrong.
**SEO impact**: Very negative if persistent - prevents indexing and ranking.
**Common causes**:
- PHP/code errors
- Database connection issues
- .htaccess syntax errors
- Resource limits exceeded
- Plugin/theme conflicts (WordPress)
**How to diagnose**:
1. Check server error logs
2. Review recent code/config changes
3. Test locally or on staging
4. Disable plugins one by one (if CMS)
5. Check .htaccess syntax
**How to fix**:
1. Review error logs for specific error
2. Roll back recent changes
3. Fix code errors
4. Increase resource limits if needed
5. Test thoroughly before re-deploying
**Monitoring**: Set up alerts for 500 errors (sudden spike = problem).
---
### 502 Bad Gateway
**What it means**: Server received invalid response from upstream server.
**SEO impact**: Negative if persistent - prevents crawling/indexing.
**Common causes**:
- Proxy/load balancer issues
- Upstream server down
- Timeout issues
- Firewall blocking
**Common scenarios**:
- CDN can't reach origin server
- Application server crashed
- Database server unresponsive
**How to fix**:
1. Check upstream server status
2. Verify firewall rules
3. Check timeout settings
4. Restart proxy/load balancer if needed
5. Review CDN configuration
---
### 503 Service Unavailable
**What it means**: Server temporarily unable to handle request.
**SEO impact**: Neutral if truly temporary with Retry-After header. Negative if prolonged.
**Common causes**:
- Maintenance mode
- Server overload
- Database down
- Resource exhaustion
**Proper use for maintenance**:
```
HTTP/1.1 503 Service Unavailable
Retry-After: 3600
```
**Best practices for maintenance**:
1. Use 503 (not 404 or 500)
2. Include Retry-After header
3. Keep maintenance brief (<24 hours)
4. Schedule during low-traffic times
5. Inform users with clear message
**How search engines handle 503**:
- Short-term (hours): Will retry, no ranking impact
- Long-term (days+): May drop rankings, de-index pages
---
### 504 Gateway Timeout
**What it means**: Server didn't receive timely response from upstream server.
**SEO impact**: Negative - prevents crawling.
**Common causes**:
- Slow database queries
- External API timeouts
- Insufficient server resources
- Network issues
**How to fix**:
1. Optimize slow queries
2. Increase timeout limits
3. Add caching
4. Scale server resources
5. Review external dependencies
---
## Status Code Decision Flowchart
### Content Moved Permanently?
→ YES: Use **301 redirect**
→ NO: Continue
### Content Moved Temporarily?
→ YES: Use **302 redirect**
→ NO: Continue
### Content Deleted with No Replacement?
→ YES: Use **404** (or **410** if permanently gone)
→ NO: Continue
### Content Exists at This URL?
→ YES: Use **200 OK**
→ NO: Use **404**
### Need Authentication?
→ YES: Use **401**
→ NO: Continue
### Access Forbidden?
→ YES: Use **403**
→ NO: Continue
### Server Error?
→ YES: Use **500**, **502**, **503**, or **504** depending on cause
→ NO: Use **200 OK**
---
## Diagnosing Status Code Issues
### Tools
**Browser DevTools**:
1. Open DevTools (F12)
2. Go to Network tab
3. Reload page
4. Check status code in first request
**cURL command**:
```bash
curl -I https://example.com/page
```
**Online checkers**:
- httpstatus.io
- redirect-checker.org
- websiteplanet.com/webtools/redirects/
**Google Search Console**:
- Coverage report → Error/Excluded sections
- URL Inspection tool → Check specific URLs
---
### Common Diagnostic Scenarios
### "Page Won't Index"
**Check**:
1. Status code (should be 200)
2. Redirects (shouldn't redirect away)
3. 4xx/5xx errors
4. robots.txt blocking
5. noindex meta tag
### "Page Disappeared from Results"
**Check**:
1. Returns 404/410/5xx
2. Redirecting elsewhere (301/302)
3. Changed to 403/401
4. Server timing out (504)
### "Traffic Dropped After Migration"
**Check**:
1. Old URLs return 404 (should be 301)
2. Redirect chains (should be direct)
3. Redirect loops
4. Wrong redirect type (302 vs 301)
5. Incorrect redirect targets
---
## Status Codes and Crawl Budget
### Impact on Crawl Budget
**Efficient (minimal impact)**:
- 200 OK
- 301 redirects (if minimal chains)
- 410 Gone (removes from crawl queue)
**Moderate impact**:
- 302 redirects (search engine may keep checking)
- 404 errors (search engines periodically recheck)
- Redirect chains (multiple requests per URL)
**High impact (wasteful)**:
- 5xx errors (search engines retry frequently)
- Redirect loops (waste crawl budget)
- Soft 404s (search engine confused, keeps crawling)
- 429 rate limiting (prevents efficient crawling)
---
## SEO Status Code Best Practices
### For Migrations
- [ ] Use 301 redirects for all permanently moved pages
- [ ] Redirect directly to final destination (no chains)
- [ ] Test all redirects before launching
- [ ] Keep redirects in place for at least 1 year
- [ ] Monitor 404 errors in Search Console post-launch
- [ ] Map 1:1 where possible (old URL → equivalent new URL)
### For Deleted Content
- [ ] Use 301 if relevant replacement exists
- [ ] Use 404 if no replacement and might return
- [ ] Use 410 if permanently gone, never returning
- [ ] Don't redirect to irrelevant pages (creates soft 404)
- [ ] Create custom 404 page with search and navigation
### For Maintenance
- [ ] Use 503 with Retry-After header
- [ ] Keep maintenance window brief (<24 hours)
- [ ] Create user-friendly maintenance page
- [ ] Inform users of expected downtime
- [ ] Monitor Search Console for crawl issues
### For Performance
- [ ] Minimize redirect chains
- [ ] Fix redirect loops immediately
- [ ] Monitor 5xx errors closely
- [ ] Set up alerts for sudden status code changes
- [ ] Optimize to reduce 504 timeouts
---
## Status Code Monitoring
### Key Metrics to Track
**In Search Console**:
- Crawl errors by type
- Server errors (5xx) trend
- Not found (404) trend
- Redirect errors
**In analytics**:
- 404 page views
- Entry pages with high exit rate (might be errors)
- Sudden traffic drops (could indicate status code issues)
**Server logs**:
- Status code distribution
- 5xx error frequency
- Unusual patterns
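For servers on the common/combined log format, the status-code distribution falls out of one pipeline. A sketch; the log path and field position are assumptions that depend on your server configuration:
```bash
# In the combined log format, field 9 is the status code
awk '{print $9}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head
```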
### Setting Up Alerts
**Alert on**:
- Sudden increase in 5xx errors
- Increase in 404 errors
- New redirect chains
- Crawl error spikes in Search Console
**Tools**:
- Google Search Console email alerts
- Server monitoring (UptimeRobot, Pingdom)
- Log analysis tools
- Custom scripts for log monitoring
---
## Quick Reference Table
| Code | Name | SEO Impact | Use When | Passes Link Equity? |
|------|------|------------|----------|---------------------|
| 200 | OK | ✅ Positive | Page works normally | N/A (original URL) |
| 301 | Moved Permanently | ✅ Positive | Permanent URL change | ✅ Yes (~90-99%) |
| 302 | Found | ⚠️ Neutral | Temporary redirect | ❌ No |
| 307 | Temporary Redirect | ⚠️ Neutral | Temporary (method preserved) | ❌ No |
| 308 | Permanent Redirect | ✅ Positive | Permanent (method preserved) | ✅ Yes |
| 404 | Not Found | ⚠️ Neutral | Content doesn't exist | N/A |
| 410 | Gone | ⚠️ Neutral | Permanent deletion | N/A |
| 403 | Forbidden | ❌ Negative | Access denied | N/A |
| 401 | Unauthorized | ❌ Negative | Auth required | N/A |
| 500 | Internal Server Error | ❌ Negative | Server error | N/A |
| 502 | Bad Gateway | ❌ Negative | Upstream error | N/A |
| 503 | Service Unavailable | ⚠️ Neutral | Temporary downtime | N/A |
| 504 | Gateway Timeout | ❌ Negative | Timeout error | N/A |
---
## Status Code Testing Checklist
Before launching site changes:
- [ ] Test all redirects return correct status codes
- [ ] Verify no redirect chains exist
- [ ] Check no redirect loops present
- [ ] Confirm important pages return 200
- [ ] Ensure deleted pages return 404/410 (not 200)
- [ ] Verify 301s point to correct destinations
- [ ] Test with multiple user-agents
- [ ] Check status codes in Search Console
- [ ] Monitor server logs for unusual patterns
- [ ] Set up alerts for error spikes
---
## Technical SEO Severity Framework
### Issue Classification
| Severity | Impact Description | Examples | Response Time |
|----------|-------------------|---------|---------------|
| **Critical** | Prevents indexation or causes site-wide issues | Robots.txt blocking site, noindex on key pages, site-wide 500 errors | Same day |
| **High** | Significantly impacts rankings or user experience | Slow page speed, missing hreflang, duplicate content, redirect chains | Within 1 week |
| **Medium** | Affects specific pages or has moderate impact | Missing schema, suboptimal canonicals, thin content pages | Within 1 month |
| **Low** | Minor optimization opportunities | Image compression, minor CLS issues, non-essential schema missing | Next quarter |
### Technical Debt Prioritization Matrix
| Factor | Weight | Assessment |
|--------|--------|-----------|
| Pages affected | 30% | Site-wide > Section > Single page |
| Revenue impact | 25% | Revenue pages > Blog > Utility pages |
| Fix difficulty | 20% | Config change < Template change < Code rewrite |
| Competitive impact | 15% | Competitors passing you > parity > you ahead |
| Crawl budget waste | 10% | High waste > Moderate > Minimal |
## Core Web Vitals Optimization Quick Reference
### LCP (Largest Contentful Paint) Optimization
| Root Cause | Detection | Fix |
|-----------|-----------|-----|
| Large hero image | PageSpeed Insights | Serve WebP, resize to container, add loading="lazy" |
| Render-blocking CSS/JS | DevTools Coverage | Defer non-critical, inline critical CSS |
| Slow server response | TTFB >800ms | CDN, server-side caching, upgrade hosting |
| Third-party scripts | DevTools Network | Defer/async, use facade pattern |
### CLS (Cumulative Layout Shift) Optimization
| Root Cause | Detection | Fix |
|-----------|-----------|-----|
| Images without dimensions | DevTools | Add explicit width/height attributes |
| Ads/embeds without reserved space | Visual inspection | Set min-height on containers |
| Web fonts causing FOUT | DevTools | font-display: swap + preload fonts |
| Dynamic content injection | Visual inspection | Reserve space with CSS |
### INP (Interaction to Next Paint) Optimization
| Root Cause | Detection | Fix |
|-----------|-----------|-----|
| Long JavaScript tasks | DevTools Performance | Break into smaller tasks, use requestIdleCallback |
| Heavy event handlers | DevTools | Debounce/throttle, use passive listeners |
| Main thread blocking | DevTools | Web workers for heavy computation |
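For lab measurements of these metrics, the Lighthouse CLI can be scripted. A minimal sketch, assuming Node.js is installed (`npx` fetches `lighthouse` on demand); the audit IDs match current Lighthouse releases but may change between versions:
```bash
npx lighthouse https://example.com --only-categories=performance \
  --output=json --output-path=./report.json --quiet --chrome-flags="--headless"

# Pull the LCP, CLS, and TBT audits out of the JSON report
python3 -c "import json
a = json.load(open('report.json'))['audits']
for k in ('largest-contentful-paint', 'cumulative-layout-shift', 'total-blocking-time'):
    print(k, a[k]['displayValue'])"
```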

@@ -0,0 +1,717 @@
# Robots.txt Reference Guide
Complete reference for creating, testing, and troubleshooting robots.txt files.
## Syntax Guide
### Basic Structure
```
User-agent: [bot name]
Disallow: [path to block]
Allow: [path to allow]
Sitemap: [sitemap URL]
Crawl-delay: [seconds]
```
---
## Core Directives
### User-agent
Specifies which bot the rules apply to.
**Syntax**: `User-agent: [bot-name]`
**Common user-agents**:
```
User-agent: * # All bots
User-agent: Googlebot # Google's crawler
User-agent: Bingbot # Bing's crawler
User-agent: GPTBot # OpenAI's crawler
User-agent: CCBot # Common Crawl bot
User-agent: anthropic-ai # Anthropic's crawler
User-agent: PerplexityBot # Perplexity AI crawler
User-agent: ClaudeBot # Claude's web crawler
```
**Multiple user-agents**: Group rules by leaving no blank lines between user-agent declarations.
```
User-agent: Googlebot
User-agent: Bingbot
Disallow: /admin/
```
---
### Disallow
Blocks bots from crawling specified paths.
**Syntax**: `Disallow: [path]`
**Examples**:
```
Disallow: / # Block entire site
Disallow: /admin/ # Block admin directory
Disallow: /private # Block private directory (and subdirectories)
Disallow: /*.pdf$ # Block all PDF files
Disallow: /*? # Block all URLs with parameters
Disallow: # Allow everything (empty disallow)
```
**Path matching**:
- `/` at end = block directory and all subdirectories
- Without `/` at end = block all paths starting with string
- `*` = wildcard, matches any sequence
- `$` = end of URL
---
### Allow
Explicitly allows crawling (overrides Disallow).
**Syntax**: `Allow: [path]`
**Common use**: Allow specific subdirectories within blocked parent.
```
User-agent: *
Disallow: /admin/
Allow: /admin/public/
```
**Note**: Allow is not standard but supported by Google, Bing, and most major crawlers.
---
### Sitemap
Specifies location of XML sitemap.
**Syntax**: `Sitemap: [absolute URL]`
**Examples**:
```
Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/sitemap_index.xml
Sitemap: https://example.com/blog/sitemap.xml
```
**Best practices**:
- Use absolute URLs (not relative)
- Can include multiple Sitemap directives
- Place at end of file
- Submit same sitemap(s) to Google Search Console
---
### Crawl-delay
Adds delay between requests (seconds).
**Syntax**: `Crawl-delay: [seconds]`
**Example**:
```
User-agent: *
Crawl-delay: 10
```
**Warning**: Not supported by Googlebot (use Search Console rate limiting instead). Supported by Bing, Yandex, and others.
---
## Common Configurations
### 1. Allow All Bots (Default)
```
User-agent: *
Disallow:
Sitemap: https://example.com/sitemap.xml
```
Use when you want all bots to crawl entire site.
---
### 2. Block All Bots
```
User-agent: *
Disallow: /
```
Use for development/staging sites or private content.
---
### 3. Block Specific Directories
```
User-agent: *
Disallow: /admin/
Disallow: /private/
Disallow: /temp/
Disallow: /cgi-bin/
Sitemap: https://example.com/sitemap.xml
```
Standard configuration blocking admin and utility directories.
---
### 4. Block All AI Crawlers
```
# Block OpenAI
User-agent: GPTBot
Disallow: /
# Block Anthropic
User-agent: anthropic-ai
User-agent: ClaudeBot
Disallow: /
# Block Common Crawl
User-agent: CCBot
Disallow: /
# Block Perplexity
User-agent: PerplexityBot
Disallow: /
# Block Google-Extended (Bard training)
User-agent: Google-Extended
Disallow: /
# Allow search engines
User-agent: Googlebot
Disallow:
User-agent: Bingbot
Disallow:
Sitemap: https://example.com/sitemap.xml
```
Use when you want search indexing but not AI training.
---
### 5. Allow Search Engines, Block Everything Else
```
# Block all by default
User-agent: *
Disallow: /
# Allow Google
User-agent: Googlebot
Disallow:
# Allow Bing
User-agent: Bingbot
Disallow:
# Allow DuckDuckGo
User-agent: DuckDuckBot
Disallow:
Sitemap: https://example.com/sitemap.xml
```
---
### 6. Block URL Parameters
```
User-agent: *
Disallow: /*? # Block all URLs with parameters
Allow: /? # Allow homepage with parameters
Sitemap: https://example.com/sitemap.xml
```
Prevents duplicate content from parameter variations.
---
### 7. Block File Types
```
User-agent: *
Disallow: /*.pdf$
Disallow: /*.doc$
Disallow: /*.xls$
Disallow: /*.zip$
Sitemap: https://example.com/sitemap.xml
```
---
### 8. E-commerce Configuration
```
User-agent: *
# Block search/filter pages
Disallow: /*?q=
Disallow: /*?sort=
Disallow: /*?filter=
# Block account pages
Disallow: /account/
Disallow: /cart/
Disallow: /checkout/
# Block admin
Disallow: /admin/
# Allow product pages
Allow: /products/
Sitemap: https://example.com/sitemap.xml
```
---
### 9. WordPress Configuration
```
User-agent: *
# WordPress core
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
# WordPress directories
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/
# Allow uploads
Allow: /wp-content/uploads/
# Block parameter pages
Disallow: /?s=
Disallow: /feed/
Disallow: /trackback/
Sitemap: https://example.com/sitemap_index.xml
```
---
### 10. Shopify Configuration
```
User-agent: *
# Block admin and account
Disallow: /admin
Disallow: /account
Disallow: /cart
Disallow: /checkout
# Block search
Disallow: /search
# Block collections with filters
Disallow: /collections/*+*
Disallow: /collections/*?*
Sitemap: https://example.com/sitemap.xml
```
---
## Platform-Specific Templates
### Wix
```
User-agent: *
Disallow: /_api/
Disallow: /_partials/
Sitemap: https://example.com/sitemap.xml
```
### Squarespace
```
User-agent: *
Disallow: /config/
Disallow: /search
Sitemap: https://example.com/sitemap.xml
```
### Webflow
```
User-agent: *
Allow: /
Sitemap: https://example.com/sitemap.xml
```
### Drupal
```
User-agent: *
Disallow: /admin/
Disallow: /user/
Disallow: /node/add/
Disallow: /?q=
Sitemap: https://example.com/sitemap.xml
```
---
## Testing and Validation
### Google Search Console Robots.txt Tester
1. Go to: Search Console → Settings → robots.txt
2. View current robots.txt
3. Test specific URLs
4. See which user-agents are affected
### Manual Testing
Test URL pattern: `https://example.com/robots.txt`
Check file is:
- Accessible (returns 200 status)
- Plain text format
- UTF-8 encoded
- Located at root domain
- No more than 500KB (Google limit)
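These checks can be scripted. A sketch with `curl`; the domain is a placeholder:
```bash
# Expect a 200 status and a text/plain content type at the root domain
curl -sI https://example.com/robots.txt | grep -iE '^(HTTP|content-type)'

# Byte count against Google's 500KB limit
curl -s https://example.com/robots.txt | wc -c
```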
### Common Testing Scenarios
Test these URLs in tester:
- Homepage: `/`
- Product page: `/products/example`
- Admin page: `/admin/`
- Parameter page: `/search?q=test`
- File: `/document.pdf`
---
## Common Mistakes and Fixes
### Mistake 1: Blocking CSS/JS Files
**Wrong**:
```
User-agent: *
Disallow: /css/
Disallow: /js/
```
**Why it's wrong**: Google needs CSS/JS to render pages properly.
**Fix**:
```
User-agent: *
Allow: /css/
Allow: /js/
```
---
### Mistake 2: Using Relative URLs for Sitemap
**Wrong**:
```
Sitemap: /sitemap.xml
```
**Fix**:
```
Sitemap: https://example.com/sitemap.xml
```
---
### Mistake 3: Spaces in Directives
**Wrong**:
```
User-agent : Googlebot
Disallow : /admin/
```
**Fix** (no spaces before colons):
```
User-agent: Googlebot
Disallow: /admin/
```
---
### Mistake 4: Forgetting Trailing Slash
**Intention**: Block /admin directory
**Wrong**:
```
Disallow: /admin
```
**Result**: Also blocks /admin-panel, /administrator, etc.
**Fix**:
```
Disallow: /admin/
```
---
### Mistake 5: Blocking Entire Site Accidentally
**Wrong**:
```
User-agent: *
Disallow: /
Allow: /blog/
```
**Why it's wrong**: Many bots don't support the Allow directive.
**Fix**: Use noindex meta tags for pages you don't want indexed, not robots.txt.
---
### Mistake 6: Not Blocking Development Environments
**Wrong**: No robots.txt on staging.example.com
**Result**: Staging site gets indexed.
**Fix**:
```
User-agent: *
Disallow: /
```
On all non-production environments.
---
### Mistake 7: Case Sensitivity Errors
**Note**: Directives are case-insensitive, but paths are case-sensitive.
**Example**:
```
Disallow: /Admin/ # Blocks /Admin/ but not /admin/
```
**Fix**: Block both if needed:
```
Disallow: /admin/
Disallow: /Admin/
```
---
## Advanced Patterns
### Wildcard Examples
```
# Block all PDFs
Disallow: /*.pdf$
# Block all URLs with parameters
Disallow: /*?
# Block all URLs ending in .php
Disallow: /*.php$
# Block all admin paths regardless of location
Disallow: /*/admin/
```
### Multiple Sitemaps
```
Sitemap: https://example.com/sitemap-pages.xml
Sitemap: https://example.com/sitemap-posts.xml
Sitemap: https://example.com/sitemap-products.xml
```
### Bot-Specific Rules
```
# Aggressive bot - slow it down
User-agent: BadBot
Crawl-delay: 60
Disallow: /
# Good bots - full access
User-agent: Googlebot
User-agent: Bingbot
Disallow:
# Default for others
User-agent: *
Crawl-delay: 10
Disallow: /admin/
```
---
## Robots.txt vs Meta Robots vs X-Robots-Tag
### When to use each:
**Robots.txt**:
- Block crawling of entire directories
- Reduce crawl budget waste
- Block parameter variations
- Does NOT prevent indexing if page is linked from elsewhere
**Meta robots tag**:
- Prevent specific pages from being indexed
- Control snippet display
- Control following links
- Example: `<meta name="robots" content="noindex,follow">`
**X-Robots-Tag HTTP header**:
- Control non-HTML files (PDFs, images)
- Server-level control
- Example: `X-Robots-Tag: noindex`
**Important**: If you don't want a page indexed, use noindex (meta tag or header), NOT robots.txt.
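To confirm which mechanism a given URL is actually using, both signals are visible from the command line. A sketch with `curl`; URLs are placeholders:
```bash
# X-Robots-Tag arrives as a response header (works for PDFs and images too)
curl -sI https://example.com/document.pdf | grep -i 'x-robots-tag'

# Meta robots lives in the HTML itself
curl -s https://example.com/page | grep -oiE '<meta[^>]*robots[^>]*>'
```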
---
## Monitoring and Maintenance
### Regular Checks
**Monthly**:
- [ ] Verify robots.txt is accessible
- [ ] Check Search Console for blocked URLs
- [ ] Review crawl stats for blocked resources
**Quarterly**:
- [ ] Audit blocked paths - still relevant?
- [ ] Check for new admin/private sections to block
- [ ] Review AI crawler landscape (new bots?)
**After site changes**:
- [ ] Update robots.txt if URL structure changed
- [ ] Test new sections (should they be blocked?)
- [ ] Verify sitemaps still referenced
### Search Console Monitoring
Check these reports:
- **Coverage** → Excluded by robots.txt
- **Settings** → Crawl stats
- **URL Inspection** → Test specific URLs
---
## Robots.txt Checklist
Before deploying:
- [ ] File is named exactly `robots.txt` (lowercase)
- [ ] Located at root domain (`example.com/robots.txt`)
- [ ] Plain text format (not HTML or PDF)
- [ ] UTF-8 encoding
- [ ] No HTML tags in file
- [ ] All paths start with `/`
- [ ] Sitemap URLs are absolute
- [ ] No spaces before colons
- [ ] Tested in Search Console robots.txt tester
- [ ] Not blocking important CSS/JS/images
- [ ] Not blocking content you want indexed
- [ ] Trailing slashes used correctly for directories
- [ ] Wildcard patterns tested
- [ ] File size under 500KB
---
## Emergency Fixes
### Accidentally Blocked Entire Site
**Symptom**: All pages blocked in Search Console
**Fix**:
1. Edit robots.txt to:
```
User-agent: *
Disallow:
Sitemap: https://example.com/sitemap.xml
```
2. Test in Search Console
3. Request urgent recrawl for key pages
4. Monitor Coverage report for recovery
**Recovery time**: 1-7 days
---
### Blocked CSS/JS Files
**Symptom**: "Blocked by robots.txt" in Mobile-Friendly Test
**Fix**:
1. Add Allow directives:
```
User-agent: *
Allow: /css/
Allow: /js/
Allow: /wp-content/uploads/
```
2. Test in robots.txt tester
3. Request re-render in URL Inspection tool
---
### Staging Site Indexed
**Symptom**: staging.example.com appears in search results
**Fix**:
1. Add to staging robots.txt:
```
User-agent: *
Disallow: /
```
2. Add noindex meta tag to all staging pages
3. Remove staging URLs in Search Console (Removals tool)
---
## Resources and Tools
**Testing**:
- Google Search Console robots.txt tester
- Bing Webmaster Tools robots.txt analyzer
- Technical SEO browser extensions
**Validation**:
- https://www.google.com/webmasters/tools/robots-testing-tool
- https://en.ryte.com/free-tools/robots-txt/
- https://technicalseo.com/tools/robots-txt/
**Documentation**:
- Google: https://developers.google.com/search/docs/crawling-indexing/robots/intro
- Bing: https://www.bing.com/webmasters/help/robots-txt-validation
- Robots.txt spec: https://www.robotstxt.org/

@@ -0,0 +1,169 @@
# Technical SEO Checker — Worked Example & Checklist
Referenced from [SKILL.md](https://github.com/aaron-he-zhu/seo-geo-claude-skills/blob/main/optimize/technical-seo-checker/SKILL.md).
---
## Worked Example
**User**: "Check the technical SEO of cloudhosting.com"
**Output**:
```markdown
# Technical SEO Audit Report
**Domain**: cloudhosting.com
**Audit Date**: 2024-09-15
**Pages Analyzed**: 312
## Crawlability Analysis
### Robots.txt Review
**URL**: cloudhosting.com/robots.txt
**Status**: Found
| Check | Status | Notes |
|-------|--------|-------|
| File exists | ✅ | 200 response |
| Valid syntax | ⚠️ | Wildcard pattern `Disallow: /*?` too aggressive — blocks faceted pages |
| Sitemap declared | ❌ | No Sitemap directive in robots.txt |
| Important pages blocked | ⚠️ | /pricing/ blocked by `Disallow: /pricing` rule |
| Assets blocked | ✅ | CSS/JS accessible |
**Issues Found**:
- Sitemap URL not declared in robots.txt
- `/pricing/` inadvertently blocked — high-value commercial page
### XML Sitemap Review
**Sitemap URL**: cloudhosting.com/sitemap.xml
**Status**: Found (not referenced in robots.txt)
| Check | Status | Notes |
|-------|--------|-------|
| Sitemap exists | ✅ | Valid XML, 287 URLs |
| Only indexable URLs | ❌ | 23 noindex URLs included |
| Includes lastmod | ⚠️ | All dates set to 2023-01-01 — not accurate |
**Crawlability Score**: 5/10
## Performance Analysis
### Core Web Vitals
| Metric | Mobile | Desktop | Target | Status |
|--------|--------|---------|--------|--------|
| LCP (Largest Contentful Paint) | 4.8s | 2.1s | <2.5s | ❌ Mobile / ✅ Desktop |
| FID (First Input Delay) | 45ms | 12ms | <100ms | ✅ / ✅ |
| CLS (Cumulative Layout Shift) | 0.24 | 0.08 | <0.1 | ❌ Mobile / ✅ Desktop |
| INP (Interaction to Next Paint) | 380ms | 140ms | <200ms | ❌ Mobile / ✅ Desktop |
### Additional Performance Metrics
| Metric | Value | Status |
|--------|-------|--------|
| Time to First Byte (TTFB) | 1,240ms | ❌ |
| Page Size | 3.8MB | ❌ |
| Requests | 94 | ⚠️ |
**LCP Issues**:
- Uncompressed hero image (2.4MB PNG): Convert to WebP, est. save 1.9MB
- No CDN detected: TTFB 1,240ms from origin server
**CLS Issues**:
- Ad banner at top of page injects without reserved height (0.18 shift contribution)
**Performance Score**: 3/10
## Security Analysis
### HTTPS Status
| Check | Status | Notes |
|-------|--------|-------|
| SSL certificate valid | ✅ | Expires: 2025-03-22 |
| HTTPS enforced | ⚠️ | http://cloudhosting.com returns 200 instead of 301 redirect |
| Mixed content | ❌ | 7 images loaded over HTTP on /features/ page |
| HSTS enabled | ❌ | Header not present |
**Security Score**: 5/10
## Structured Data Analysis
### Schema Markup Found
| Schema Type | Pages | Valid | Errors |
|-------------|-------|-------|--------|
| Organization | 1 (homepage) | ✅ | None |
| Article | 0 | — | Missing on 48 blog posts |
| Product | 0 | — | Missing on 5 plan pages |
| FAQ | 0 | — | Missing on 12 pages with FAQ content |
**Structured Data Score**: 3/10
## Overall Technical Health: 42/100
```
Score Breakdown:
█████░░░░░ Crawlability: 5/10
██████░░░░ Indexability: 6/10
███░░░░░░░ Performance: 3/10
██████░░░░ Mobile: 6/10
█████░░░░░ Security: 5/10
██████░░░░ URL Structure: 6/10
███░░░░░░░ Structured Data: 3/10
```
## Priority Issues
### 🔴 Critical (Fix Immediately)
1. **Mobile LCP 4.8s (target <2.5s)** — Compress hero image to WebP (est. save 1.9MB) and implement a CDN to reduce TTFB from 1,240ms to <400ms.
### 🟡 Important (Fix Soon)
2. **HTTP not redirecting to HTTPS** — Add 301 redirect from http:// to https:// and enable HSTS header. 7 mixed-content images on /features/ need URL updates.
### 🟢 Minor (Optimize)
3. **No Article/FAQ schema on blog posts** — Add Article schema to 48 blog posts and FAQ schema to 12 FAQ pages for rich result eligibility.
```
---
## Technical SEO Checklist
```markdown
### Crawlability
- [ ] robots.txt is valid and not blocking important content
- [ ] XML sitemap exists and is submitted to ~~search console
- [ ] No crawl errors in ~~search console
- [ ] No redirect chains or loops
### Indexability
- [ ] Important pages are indexable
- [ ] Canonical tags are correct
- [ ] No duplicate content issues
- [ ] Pagination is handled correctly
### Performance
- [ ] Core Web Vitals pass
- [ ] Page speed under 3 seconds
- [ ] Images are optimized
- [ ] JS/CSS are minified
### Mobile
- [ ] Mobile-friendly test passes
- [ ] Viewport is configured
- [ ] Touch elements are properly sized
### Security
- [ ] HTTPS is enforced
- [ ] SSL certificate is valid
- [ ] No mixed content
- [ ] Security headers present
### Structure
- [ ] URLs are clean and descriptive
- [ ] Site architecture is logical
- [ ] Internal linking is strong
```

@@ -0,0 +1,311 @@
# Technical SEO Checker — Output Templates
Detailed output templates for technical-seo-checker steps 3-9. Referenced from [SKILL.md](https://github.com/aaron-he-zhu/seo-geo-claude-skills/blob/main/optimize/technical-seo-checker/SKILL.md).
---
## Step 3: Audit Site Speed & Core Web Vitals
```markdown
## Performance Analysis
### Core Web Vitals
| Metric | Mobile | Desktop | Target | Status |
|--------|--------|---------|--------|--------|
| LCP (Largest Contentful Paint) | [X]s | [X]s | <2.5s | ✅/❌ / ✅/❌ |
| FID (First Input Delay) | [X]ms | [X]ms | <100ms | ✅/❌ / ✅/❌ |
| CLS (Cumulative Layout Shift) | [X] | [X] | <0.1 | ✅/❌ / ✅/❌ |
| INP (Interaction to Next Paint) | [X]ms | [X]ms | <200ms | ✅/❌ / ✅/❌ |
### Additional Performance Metrics
| Metric | Value | Status |
|--------|-------|--------|
| Time to First Byte (TTFB) | [X]ms | ✅/⚠️/❌ |
| First Contentful Paint (FCP) | [X]s | ✅/⚠️/❌ |
| Speed Index | [X] | ✅/⚠️/❌ |
| Total Blocking Time | [X]ms | ✅/⚠️/❌ |
| Page Size | [X]MB | ✅/⚠️/❌ |
| Requests | [X] | ✅/⚠️/❌ |
### Performance Issues
**LCP Issues**:
- [Issue]: [Impact] - [Solution]
- [Issue]: [Impact] - [Solution]
**CLS Issues**:
- [Issue]: [Impact] - [Solution]
**Resource Loading**:
| Resource Type | Count | Size | Issues |
|---------------|-------|------|--------|
| Images | [X] | [X]MB | [notes] |
| JavaScript | [X] | [X]MB | [notes] |
| CSS | [X] | [X]KB | [notes] |
| Fonts | [X] | [X]KB | [notes] |
### Optimization Recommendations
**High Impact**:
1. [Recommendation] - Est. improvement: [X]s
2. [Recommendation] - Est. improvement: [X]s
**Medium Impact**:
1. [Recommendation]
2. [Recommendation]
**Performance Score**: [X]/10
```
---
## Step 4: Audit Mobile-Friendliness
```markdown
## Mobile Optimization Analysis
### Mobile-Friendly Test
| Check | Status | Notes |
|-------|--------|-------|
| Mobile-friendly overall | ✅/❌ | [notes] |
| Viewport configured | ✅/❌ | [viewport tag] |
| Text readable | ✅/⚠️/❌ | Font size: [X]px |
| Tap targets sized | ✅/⚠️/❌ | [notes] |
| Content fits viewport | ✅/❌ | [notes] |
| No horizontal scroll | ✅/❌ | [notes] |
### Responsive Design Check
| Element | Desktop | Mobile | Issues |
|---------|---------|--------|--------|
| Navigation | [status] | [status] | [notes] |
| Images | [status] | [status] | [notes] |
| Forms | [status] | [status] | [notes] |
| Tables | [status] | [status] | [notes] |
| Videos | [status] | [status] | [notes] |
### Mobile-First Indexing
| Check | Status | Notes |
|-------|--------|-------|
| Mobile version has all content | ✅/⚠️/❌ | [notes] |
| Mobile has same structured data | ✅/⚠️/❌ | [notes] |
| Mobile has same meta tags | ✅/⚠️/❌ | [notes] |
| Mobile images have alt text | ✅/⚠️/❌ | [notes] |
**Mobile Score**: [X]/10
```
---
## Step 5: Audit Security & HTTPS
```markdown
## Security Analysis
### HTTPS Status
| Check | Status | Notes |
|-------|--------|-------|
| SSL certificate valid | ✅/❌ | Expires: [date] |
| HTTPS enforced | ✅/❌ | [redirects properly?] |
| Mixed content | ✅/⚠️/❌ | [X] issues |
| HSTS enabled | ✅/⚠️ | [notes] |
| Certificate chain | ✅/⚠️/❌ | [notes] |
### Security Headers
| Header | Present | Value | Recommended |
|--------|---------|-------|-------------|
| Content-Security-Policy | ✅/❌ | [value] | [recommendation] |
| X-Frame-Options | ✅/❌ | [value] | DENY or SAMEORIGIN |
| X-Content-Type-Options | ✅/❌ | [value] | nosniff |
| X-XSS-Protection | ✅/❌ | [value] | 1; mode=block |
| Referrer-Policy | ✅/❌ | [value] | [recommendation] |
**Security Score**: [X]/10
```
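The header rows above can be filled in from a single request. A sketch using `curl` against a placeholder domain:
```bash
curl -sI https://example.com | grep -iE \
  'strict-transport-security|content-security-policy|x-frame-options|x-content-type-options|x-xss-protection|referrer-policy'
```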
---
## Step 6: Audit URL Structure
```markdown
## URL Structure Analysis
### URL Pattern Review
| Check | Status | Notes |
|-------|--------|-------|
| HTTPS URLs | ✅/⚠️/❌ | [X]% HTTPS |
| Lowercase URLs | ✅/⚠️/❌ | [notes] |
| No special characters | ✅/⚠️/❌ | [notes] |
| Readable/descriptive | ✅/⚠️/❌ | [notes] |
| Appropriate length | ✅/⚠️/❌ | Avg: [X] chars |
| Keywords in URLs | ✅/⚠️/❌ | [notes] |
| Consistent structure | ✅/⚠️/❌ | [notes] |
### URL Issues Found
| Issue Type | Count | Examples |
|------------|-------|----------|
| Dynamic parameters | [X] | [URLs] |
| Session IDs in URLs | [X] | [URLs] |
| Uppercase characters | [X] | [URLs] |
| Special characters | [X] | [URLs] |
| Very long URLs (>100) | [X] | [URLs] |
### Redirect Analysis
| Check | Status | Notes |
|-------|--------|-------|
| Redirect chains | [X] found | [max chain length] |
| Redirect loops | [X] found | [URLs] |
| 302 → 301 needed | [X] found | [URLs] |
| Broken redirects | [X] found | [URLs] |
**URL Score**: [X]/10
```
---
## Step 7: Audit Structured Data
> **CORE-EEAT alignment**: Schema markup quality maps to O05 (Schema Markup) in the CORE-EEAT benchmark. See [content-quality-auditor](https://github.com/aaron-he-zhu/seo-geo-claude-skills/blob/main/cross-cutting/content-quality-auditor/SKILL.md) for full content quality audit.
```markdown
## Structured Data Analysis
### Schema Markup Found
| Schema Type | Pages | Valid | Errors |
|-------------|-------|-------|--------|
| [Type 1] | [X] | ✅/❌ | [errors] |
| [Type 2] | [X] | ✅/❌ | [errors] |
### Validation Results
**Errors**:
- [Error 1]: [affected pages] - [solution]
- [Error 2]: [affected pages] - [solution]
**Warnings**:
- [Warning 1]: [notes]
### Missing Schema Opportunities
| Page Type | Current Schema | Recommended |
|-----------|----------------|-------------|
| Blog posts | [current] | Article + FAQ |
| Products | [current] | Product + Review |
| Homepage | [current] | Organization |
**Structured Data Score**: [X]/10
```
---
## Step 8: Audit International SEO (if applicable)
```markdown
## International SEO Analysis
### Hreflang Implementation
| Check | Status | Notes |
|-------|--------|-------|
| Hreflang tags present | ✅/❌ | [notes] |
| Self-referencing | ✅/⚠️/❌ | [notes] |
| Return tags present | ✅/⚠️/❌ | [notes] |
| Valid language codes | ✅/⚠️/❌ | [notes] |
| x-default tag | ✅/⚠️ | [notes] |
### Language/Region Targeting
| Language | URL | Hreflang | Status |
|----------|-----|----------|--------|
| [en-US] | [URL] | [tag] | ✅/⚠️/❌ |
| [es-ES] | [URL] | [tag] | ✅/⚠️/❌ |
**International Score**: [X]/10
```
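Hreflang tags can be extracted for the table with a quick grep over the rendered HTML. A sketch; note this only sees tags in the raw HTML, not hreflang delivered via XML sitemaps or HTTP headers:
```bash
curl -s https://example.com/ | grep -oiE '<link[^>]*hreflang[^>]*>'
```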
---
## Step 9: Generate Technical Audit Summary
```markdown
# Technical SEO Audit Report
**Domain**: [domain]
**Audit Date**: [date]
**Pages Analyzed**: [X]
## Overall Technical Health: [X]/100
```
Score Breakdown:
████████░░ Crawlability: 8/10
███████░░░ Indexability: 7/10
█████░░░░░ Performance: 5/10
████████░░ Mobile: 8/10
█████████░ Security: 9/10
██████░░░░ URL Structure: 6/10
█████░░░░░ Structured Data: 5/10
```
## Critical Issues (Fix Immediately)
1. **[Issue]**: [Impact]
- Affected: [pages/scope]
- Solution: [specific fix]
- Priority: 🔴 Critical
2. **[Issue]**: [Impact]
- Affected: [pages/scope]
- Solution: [specific fix]
- Priority: 🔴 Critical
## High Priority Issues
1. **[Issue]**: [Solution]
2. **[Issue]**: [Solution]
## Medium Priority Issues
1. **[Issue]**: [Solution]
2. **[Issue]**: [Solution]
## Quick Wins
These can be fixed quickly for immediate improvement:
1. [Quick fix 1]
2. [Quick fix 2]
3. [Quick fix 3]
## Implementation Roadmap
### Week 1: Critical Fixes
- [ ] [Task 1]
- [ ] [Task 2]
### Week 2-3: High Priority
- [ ] [Task 1]
- [ ] [Task 2]
### Week 4+: Optimization
- [ ] [Task 1]
- [ ] [Task 2]
## Monitoring Recommendations
Set up alerts for:
- Core Web Vitals drops
- Crawl error spikes
- Index coverage changes
- Security issues
```