SEOPABLO SUITE
Automated, non-destructive, and built for engineering excellence.
Overview
SEOPablo is a custom-built, asynchronous SEO auditing suite written in Python. It is designed to provide deep technical insights into company registries and web platforms.
Unlike generic crawlers, SEOPablo follows a "ReadOnly-First" principle: no state-changing requests (such as POST or PUT) are ever sent to the infrastructure during an audit. This makes it safe to monitor production environments continuously.
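One way to enforce such a principle is a method allow-list that is checked before any request leaves the client. The sketch below is illustrative only: `SAFE_METHODS`, `StateChangingRequestError`, and `ensure_read_only` are hypothetical names, not taken from SEOPablo's code.

```python
# Hypothetical read-only guard; names are illustrative, not SEOPablo's API.
SAFE_METHODS = frozenset({"GET", "HEAD"})


class StateChangingRequestError(RuntimeError):
    """Raised when an audit attempts a request that could change state."""


def ensure_read_only(method: str) -> str:
    """Return the normalized HTTP method, or raise if it is not read-only."""
    normalized = method.upper()
    if normalized not in SAFE_METHODS:
        raise StateChangingRequestError(
            f"{normalized} is not allowed during an audit"
        )
    return normalized
```

Calling `ensure_read_only("get")` returns `"GET"`, while `ensure_read_only("POST")` raises `StateChangingRequestError`. Placing the check in a single request wrapper would keep the guarantee in one place.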
Capabilities
The suite performs comprehensive analysis across multiple dimensions of technical SEO and content quality:
- Core SEO Monitoring: Verification of meta tags, canonicals, robots.txt compliance, and status codes.
- Content Quality: Detection of thin content, missing descriptions, and word count monitoring.
- Advanced Analysis: Schema.org validation, image alt-text auditing, and favicon presence detection.
- Near-Duplicate Detection: Uses SimHash algorithms to detect near-duplicate content across the platform.
- Performance Insights: Deep integration with the PageSpeed Insights API for Lighthouse metrics.
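To illustrate the near-duplicate check listed above, here is a minimal SimHash sketch. It assumes simple word tokens, equal token weights, and a 64-bit fingerprint derived from MD5. The function names and the threshold of 3 differing bits are hypothetical choices for this example; SEOPablo's actual tokenization and threshold may differ.

```python
import hashlib
import re

FINGERPRINT_BITS = 64  # assumed fingerprint width for this sketch


def simhash(text: str) -> int:
    """Compute a 64-bit SimHash fingerprint from lowercase word tokens."""
    weights = [0] * FINGERPRINT_BITS
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        token_hash = int.from_bytes(digest[:8], "big")
        # Each token votes +1 or -1 on every bit position.
        for bit in range(FINGERPRINT_BITS):
            weights[bit] += 1 if (token_hash >> bit) & 1 else -1
    fingerprint = 0
    for bit, weight in enumerate(weights):
        if weight > 0:
            fingerprint |= 1 << bit
    return fingerprint


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(text_a: str, text_b: str, threshold: int = 3) -> bool:
    """Treat two texts as near-duplicates if few fingerprint bits differ."""
    return hamming_distance(simhash(text_a), simhash(text_b)) <= threshold
```

Because similar texts share most tokens, their fingerprints tend to differ in only a few bits, so comparing pages reduces to a Hamming distance on two integers.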
Usage
SEOPablo is operated from the command line. It supports recursive crawling and depth-limited audits.
python tools/SEOPablo/main.py --url https://central.enterprises --depth 3 --check-speed
Command-line arguments include:
- --url: The entry point for the audit (required).
- --depth: How many levels deep to crawl (default: 3).
- --check-speed: Enable PageSpeed Insights analysis for the homepage.
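The documented flags could be declared with Python's standard `argparse` module roughly as follows. This is a sketch based only on the argument descriptions above; `build_parser` is a hypothetical helper, and the real `main.py` may define its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented SEOPablo flags (sketch)."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Read-only SEO audit (illustrative parser).",
    )
    parser.add_argument(
        "--url", required=True, help="Entry point for the audit."
    )
    parser.add_argument(
        "--depth", type=int, default=3, help="How many levels deep to crawl."
    )
    parser.add_argument(
        "--check-speed",
        action="store_true",
        help="Enable PageSpeed Insights analysis for the homepage.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.url, args.depth, args.check_speed)
```

With this layout, omitting `--depth` yields 3 and omitting `--check-speed` yields `False`, matching the defaults described above.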
Architecture
The system is built with modularity and scalability in mind, separating crawling from analysis and reporting:
- Crawler Engine: Asyncio-based engine with intelligent sitemap discovery.
- Analysis Modules: Pluggable analyzers (SEO, Content, Speed, Advanced).
- Utility Layer: Efficient HTML parsing and resilient network management.
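The separation between crawling and analysis described above can be sketched as a depth-limited asyncio crawl that hands each page to a list of pluggable analyzers. Everything here is an assumption made for illustration: the `Analyzer` protocol, `TitleAnalyzer`, `crawl`, and the in-memory `FAKE_SITE` stand in for the real engine, which fetches pages over the network.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Protocol, Tuple

Page = Tuple[str, List[str]]  # (html, outgoing links)
Fetcher = Callable[[str], Awaitable[Page]]


class Analyzer(Protocol):
    """Hypothetical plug-in interface: return a list of issue strings."""

    def analyze(self, url: str, html: str) -> List[str]: ...


class TitleAnalyzer:
    """Example analyzer that flags pages without a <title> element."""

    def analyze(self, url: str, html: str) -> List[str]:
        return [] if "<title>" in html.lower() else [f"{url}: missing <title>"]


async def crawl(
    start: str, fetch: Fetcher, analyzers: List[Analyzer], max_depth: int
) -> List[str]:
    """Breadth-first crawl; depth 0 is the start page itself."""
    seen = {start}
    frontier = [start]
    issues: List[str] = []
    for _ in range(max_depth + 1):
        if not frontier:
            break
        # Fetch every page of the current level concurrently.
        pages = await asyncio.gather(*(fetch(url) for url in frontier))
        next_frontier: List[str] = []
        for url, (html, links) in zip(frontier, pages):
            for analyzer in analyzers:
                issues.extend(analyzer.analyze(url, html))
            for link in links:
                if link not in seen:
                    seen.add(link)
                    next_frontier.append(link)
        frontier = next_frontier
    return issues


# In-memory stand-in for a website, so the sketch runs without a network.
FAKE_SITE: Dict[str, Page] = {
    "/": ("<title>Home</title>", ["/a", "/b"]),
    "/a": ("<h1>No title</h1>", ["/c"]),
    "/b": ("<title>B</title>", []),
    "/c": ("<p>deep</p>", []),
}


async def fake_fetch(url: str) -> Page:
    return FAKE_SITE[url]


if __name__ == "__main__":
    print(asyncio.run(crawl("/", fake_fetch, [TitleAnalyzer()], max_depth=1)))
```

Because analyzers only receive a URL and its HTML, a new check can be added without touching the crawl loop.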
Reporting
SEOPablo generates multi-format artifacts for different stakeholders:
- Markdown (.md): Human-readable summaries and prioritized issue lists.
- JSON (.json): Machine-readable data for integration into CI/CD pipelines.
- CSV (.csv): Flattened data for spreadsheets and deep-dive analysis.
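As a rough illustration of how one set of findings can feed all three formats, the sketch below renders a list of issues as Markdown, JSON, and CSV strings. The issue fields (`url`, `check`, `severity`, with 1 as the most urgent) and the `render_reports` function are assumptions for this example and do not reflect SEOPablo's actual report schema.

```python
import csv
import io
import json
from typing import Dict, List, Union

Issue = Dict[str, Union[str, int]]
FIELDS = ["url", "check", "severity"]  # hypothetical issue schema


def render_reports(issues: List[Issue]) -> Dict[str, str]:
    """Render issues as Markdown, JSON, and CSV text keyed by extension."""
    # Most urgent first (lower severity number = higher priority).
    ordered = sorted(issues, key=lambda issue: issue["severity"])

    markdown_lines = ["# Audit summary", ""]
    for issue in ordered:
        markdown_lines.append(
            f"- [P{issue['severity']}] {issue['url']}: {issue['check']}"
        )

    csv_buffer = io.StringIO()
    writer = csv.DictWriter(csv_buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(ordered)

    return {
        "md": "\n".join(markdown_lines) + "\n",
        "json": json.dumps(ordered, indent=2),
        "csv": csv_buffer.getvalue(),
    }
```

A CI job could parse the `json` output and fail the build when any issue has severity 1, while the `md` output stays readable for people.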