Format and validate markdown documents with auto-generated TOC, frontmatter, structure validation, and cross-reference linking. Export to GitHub/CommonMark/Jekyll/Hugo.
Structure, validate, and format long-form markdown content for documentation, blogs, and static site generators. Auto-generate tables of contents, add frontmatter, validate structure, and convert between markdown flavors.
The markdown formatting process follows these steps:
```python
from scripts.markdown_formatter import MarkdownFormatter

# Load and format markdown
formatter = MarkdownFormatter(file_path='document.md')

# Generate table of contents
toc = formatter.generate_toc(max_depth=3)

# Validate structure
validation = formatter.validate_structure()
if not validation['valid']:
    print("Issues found:")
    for error in validation['errors']:
        print(f" - {error['message']}")

# Add frontmatter
formatter.add_frontmatter({
    'title': 'My Document',
    'author': 'John Doe',
    'date': '2024-01-15'
})

# Export formatted version
formatter.export(
    output_path='formatted.md',
    include_toc=True,
    target_flavor='github'
)
```
Auto-generate TOC from document heading structure:
Add YAML/TOML/JSON frontmatter for static site generators:
- YAML (---) for Jekyll/Hugo
- TOML (+++) for Hugo

Check document structure for common issues:
Enhance code blocks with syntax highlighting markers:
Auto-link headings and create cross-references:
Apply consistent formatting rules:
Convert between markdown flavors:
The validator identifies these common issues:
| Issue Type | Description | Example |
|---|---|---|
| Heading Skip | Level jumps (H2 → H4) | Missing H3 between H2 and H4 |
| Broken Link | Invalid internal/external link | [link](#missing-section) |
| Duplicate Heading | Same heading appears multiple times | Two "Introduction" headings |
| Missing ID | Heading lacks unique identifier | Anchor link fails |
| Invalid Structure | Incorrect nesting or formatting | List inside heading |
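
To make two of these checks concrete, here is a minimal standalone sketch of how heading skips and duplicate headings could be detected. It is not the `MarkdownFormatter` implementation: the helper names (`extract_headings`, `find_issues`) and the `type` and `line` keys are assumptions made for illustration, and only the `message` key mirrors the error dictionaries shown earlier. It handles ATX (`#`) headings only and skips lines inside fenced code blocks.

```python
import re

# ATX heading: 1-6 '#' characters, whitespace, then the title text.
HEADING_RE = re.compile(r'^(#{1,6})[ \t]+(.*\S)[ \t]*$')


def extract_headings(markdown_text):
    """Return (line_number, level, title) for each ATX heading outside code fences."""
    headings = []
    in_fence = False
    for number, line in enumerate(markdown_text.splitlines(), start=1):
        if line.lstrip().startswith(('```', '~~~')):
            in_fence = not in_fence  # toggle on every fence line
            continue
        if in_fence:
            continue
        match = HEADING_RE.match(line)
        if match:
            headings.append((number, len(match.group(1)), match.group(2)))
    return headings


def find_issues(markdown_text):
    """Report heading-level skips and repeated heading titles, in document order."""
    issues = []
    previous_level = None
    first_seen = {}  # lowercased title -> line of first occurrence
    for number, level, title in extract_headings(markdown_text):
        # A skip is a jump of more than one level deeper (for example H2 to H4).
        if previous_level is not None and level > previous_level + 1:
            issues.append({
                'type': 'heading_skip',
                'line': number,
                'message': f"H{previous_level} jumps to H{level} at '{title}'",
            })
        key = title.lower()
        if key in first_seen:
            issues.append({
                'type': 'duplicate_heading',
                'line': number,
                'message': f"'{title}' already used on line {first_seen[key]}",
            })
        else:
            first_seen[key] = number
        previous_level = level
    return issues


if __name__ == '__main__':
    sample = "# Guide\n## Introduction\n#### Details\n## Introduction\n"
    for issue in find_issues(sample):
        print(f"line {issue['line']}: {issue['message']}")
```

Broken-link and missing-ID checks would need the set of generated anchors as well, so they are left out of this sketch.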
Initialization:
```python
# Load from a file...
formatter = MarkdownFormatter(file_path='document.md')

# ...or pass markdown content directly
formatter = MarkdownFormatter(content='# Markdown text...')
```
Parameters:
- file_path (str): Path to markdown file (optional)
- content (str): Direct markdown content (optional)

One of file_path or content must be provided.
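
The "one of" rule can be enforced with a small guard. The sketch below uses a hypothetical helper, `load_markdown`, rather than the real constructor, and it assumes that passing both arguments is an error as well as passing neither; the source only states that one must be provided.

```python
from pathlib import Path


def load_markdown(file_path=None, content=None):
    """Return markdown text from exactly one of file_path or content.

    Hypothetical helper: rejecting the case where both are given is an assumption.
    """
    # True when both are missing or both are present.
    if (file_path is None) == (content is None):
        raise ValueError("Provide exactly one of file_path or content")
    if content is not None:
        return content
    return Path(file_path).read_text(encoding='utf-8')
```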
```python
toc = formatter.generate_toc(
    max_depth=3,      # Max heading level (1-6)
    start_level=2,    # Start from H2 (skip H1)
    style='github'    # 'github', 'numbered', 'bullets'
)
```
Returns: TOC markdown string
Styles:
- github: Bulleted list with anchor links
- numbered: Numbered outline
- bullets: Simple bullet list

Example Output (github style):
```markdown
## Table of Contents
- [Introduction](#introduction)
- [Getting Started](#getting-started)
- [Installation](#installation)
- [Configuration](#configuration)
- [Advanced Topics](#advanced-topics)
```
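
For readers curious how a github-style TOC like this can be derived, here is a rough standalone sketch. `slugify` and `build_toc` are illustrative names, not part of `MarkdownFormatter`. The slug rule only approximates GitHub's anchor generation (lowercase, drop punctuation, spaces to hyphens, numeric suffix for repeats), and the two-space indent per level is an assumption.

```python
import re


def slugify(title):
    """Approximate a GitHub-style anchor: lowercase, drop punctuation, hyphenate spaces."""
    slug = title.strip().lower()
    slug = re.sub(r'[^\w\- ]', '', slug)  # keep word characters, hyphens, spaces
    return slug.replace(' ', '-')


def build_toc(headings, max_depth=3, start_level=2):
    """Build a bulleted TOC from (level, title) pairs.

    Headings above start_level or below max_depth are left out, but they still
    count toward duplicate-anchor numbering so the links stay aligned.
    """
    lines = []
    seen = {}  # slug -> number of times it has appeared so far
    for level, title in headings:
        slug = slugify(title)
        count = seen.get(slug, 0)
        seen[slug] = count + 1
        anchor = slug if count == 0 else f"{slug}-{count}"
        if level < start_level or level > max_depth:
            continue
        indent = '  ' * (level - start_level)
        lines.append(f"{indent}- [{title}](#{anchor})")
    return '\n'.join(lines)


if __name__ == '__main__':
    sample = [
        (1, 'My Document'),
        (2, 'Introduction'),
        (2, 'Getting Started'),
        (3, 'Installation'),
        (3, 'Configuration'),
        (2, 'Advanced Topics'),
    ]
    print(build_toc(sample))
```

With `start_level=2`, the H1 title is skipped and H3 entries are indented one step under their parent H2.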
```python
content = formatter.add_frontmatter(
    metadata={
        'title': 'Document Title',
        'author': 'John Doe',
        'date': '2024-01-15',
        'tags': ['markdown', 'documentation']
    },
    format='yaml'  # 'yaml', 'toml', or 'json'
)
```
Returns: Markdown content with frontmatter prepended
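
As an illustration of what prepending frontmatter involves, the following standard-library sketch renders flat metadata in the three formats. `render_frontmatter` and `prepend_frontmatter` are hypothetical names, and `fmt` stands in for the `format` parameter above to avoid shadowing the Python builtin. Values are serialized with `json.dumps`, which yields valid YAML flow syntax. The TOML branch is assumed to receive only strings, numbers, booleans, and flat lists under plain identifier keys. Real output from `add_frontmatter` may be styled differently, for example with unquoted YAML scalars.

```python
import json


def render_frontmatter(metadata, fmt='yaml'):
    """Render flat metadata as a YAML, TOML, or JSON frontmatter block."""
    if fmt == 'yaml':
        # JSON scalars and arrays are also valid YAML flow values.
        body = '\n'.join(f"{key}: {json.dumps(value)}" for key, value in metadata.items())
        return f"---\n{body}\n---\n"
    if fmt == 'toml':
        # Assumption: no None values and no nested dicts, which TOML writes differently.
        body = '\n'.join(f"{key} = {json.dumps(value)}" for key, value in metadata.items())
        return f"+++\n{body}\n+++\n"
    if fmt == 'json':
        return json.dumps(metadata, indent=2) + '\n'
    raise ValueError(f"Unsupported frontmatter format: {fmt!r}")


def prepend_frontmatter(content, metadata, fmt='yaml'):
    """Return content with the rendered frontmatter block placed before it."""
    return render_frontmatter(metadata, fmt) + '\n' + content


if __name__ == '__main__':
    print(prepend_frontmatter(
        '# Document Title\n',
        {'title': 'Document Title', 'tags': ['markdown', 'documentation']},
    ))
```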
Example Output (YAML):
---