Reports & Export

Generate interactive HTML reports and export benchmark results in multiple formats.

HTML Dashboard Reports

Generate interactive reports with comprehensive visualizations:

# Generate report after running tests
praisonaibench --suite tests.yaml --report

# Generate report from existing results
praisonaibench --report-from output/json/benchmark_results_20241211_123456.json

# Compare multiple test results
praisonaibench --compare result1.json result2.json result3.json
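
The result filenames shown above embed a run timestamp (`benchmark_results_YYYYMMDD_HHMMSS.json`), which makes it easy to pick the most recent file to pass to `--report-from`. The following is a minimal sketch, not part of praisonaibench itself: the helper names `parse_run_time` and `latest_results` are hypothetical, and the `output/json` default is assumed from the example path above.

```python
import re
from datetime import datetime
from pathlib import Path

# Filename pattern assumed from the example above:
# benchmark_results_20241211_123456.json
_PATTERN = re.compile(r"benchmark_results_(\d{8}_\d{6})\.json$")


def parse_run_time(path):
    """Return the datetime embedded in a results filename, or None if it does not match."""
    match = _PATTERN.search(Path(path).name)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y%m%d_%H%M%S")


def latest_results(directory="output/json"):
    """Return the newest matching results file in `directory`, or None if there is none."""
    candidates = [
        p for p in Path(directory).glob("benchmark_results_*.json")
        if parse_run_time(p) is not None
    ]
    return max(candidates, key=parse_run_time, default=None)
```

The path returned by `latest_results()` can then be passed on the command line, for example `praisonaibench --report-from <path>`.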

Report Features

📊 Dashboard Tab

  • Summary cards with key metrics
  • Interactive charts:
      • Status distribution (success/failure)
      • Execution time by model
      • Evaluation scores (radar chart)
      • Errors & warnings

🏆 Leaderboard Tab

  • Model rankings with multiple criteria:
      • Overall Score (default)
      • Functional Score
      • Quality Score
      • Pass Rate
      • Speed (fastest first)
  • Top 3 models highlighted with medals
  • Click criteria to re-rank dynamically
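
The re-ranking behavior described above can be sketched in a few lines. This is an illustration only, not the report's actual implementation: the field names (`overall_score`, `functional_score`, `quality_score`, `pass_rate`, `avg_time`) and the `rank_models` function are hypothetical. The one point it shows is that speed sorts ascending (fastest first) while every other criterion sorts descending.

```python
# Hypothetical per-model summary fields; the report's internal names may differ.
# Each criterion maps to (field name, whether a higher value is better).
CRITERIA = {
    "overall": ("overall_score", True),
    "functional": ("functional_score", True),
    "quality": ("quality_score", True),
    "pass_rate": ("pass_rate", True),
    "speed": ("avg_time", False),  # lower execution time ranks first
}


def rank_models(summaries, criterion="overall"):
    """Return (rank, model, has_medal) tuples ordered by the chosen criterion."""
    field, higher_is_better = CRITERIA[criterion]
    ordered = sorted(summaries, key=lambda s: s[field], reverse=higher_is_better)
    # The top three positions are the ones highlighted with medals.
    return [(i + 1, s["model"], i < 3) for i, s in enumerate(ordered)]


# Illustrative data only, not real benchmark results.
sample = [
    {"model": "model-a", "overall_score": 80, "functional_score": 85,
     "quality_score": 75, "pass_rate": 0.9, "avg_time": 2.0},
    {"model": "model-b", "overall_score": 90, "functional_score": 88,
     "quality_score": 92, "pass_rate": 0.8, "avg_time": 3.5},
]
by_overall = rank_models(sample)
by_speed = rank_models(sample, "speed")
```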

⚖️ Comparison Tab

  • Side-by-side model comparison
  • Comprehensive metrics table:
      • Overall score, functional score, quality score
      • Pass rate with color coding
      • Average execution time
      • Total errors and warnings count

📋 Results Tab

  • Complete test results table
  • Individual test status, scores, time, tokens, cost
  • Sortable columns
  • Status indicators

Report Benefits

Feature           Benefit
🎨 Modern UI      Gradient headers, smooth transitions
📱 Responsive     Works on all devices
⚡ Lightweight    No external dependencies
📊 Interactive    Chart.js powered
💾 Standalone     Works offline
📧 Shareable      Single HTML file

CSV Export

Export results for spreadsheet analysis:

# Export to CSV format
praisonaibench --suite tests.yaml --format csv

# Results saved to: output/csv/benchmark_results_20241211_123456.csv

CSV Columns

  • Test names and status
  • Model information
  • Execution times
  • Token usage (input/output/total)
  • Costs per test
  • Evaluation scores
  • Prompts and response lengths
  • Error messages (if any)

CSV Use Cases

  • Spreadsheet analysis (Excel/Google Sheets)
  • Data visualization tools
  • Statistical analysis
  • Sharing with non-technical stakeholders
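
For quick statistical analysis without a spreadsheet, the exported CSV can be summarized with Python's standard `csv` module. The sketch below is hedged: the column names `model`, `status`, and `execution_time`, and the status value `success`, are assumptions for illustration. Check the header row of your own export and adjust them accordingly.

```python
import csv
import io
from collections import defaultdict


def summarize_by_model(csv_file):
    """Return {model: {"tests", "pass_rate", "avg_time"}} from an open CSV file.

    Column names are assumed, not confirmed against the real export.
    """
    totals = defaultdict(lambda: {"tests": 0, "passed": 0, "time": 0.0})
    for row in csv.DictReader(csv_file):
        entry = totals[row["model"]]
        entry["tests"] += 1
        entry["passed"] += 1 if row["status"] == "success" else 0
        entry["time"] += float(row["execution_time"])
    return {
        model: {
            "tests": t["tests"],
            "pass_rate": t["passed"] / t["tests"],
            "avg_time": t["time"] / t["tests"],
        }
        for model, t in totals.items()
    }


# Illustrative in-memory data standing in for an exported file.
sample_csv = io.StringIO(
    "model,status,execution_time\n"
    "model-a,success,1.0\n"
    "model-a,failure,3.0\n"
    "model-b,success,2.0\n"
)
summary = summarize_by_model(sample_csv)
```

With a real export, replace the `io.StringIO` object with `open(path, newline="", encoding="utf-8")`.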

Comparison Reports

Multi-run comparison shows:

  • Side-by-side success rates
  • Performance trends
  • Cost and token usage evolution
  • Model improvements over time
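
To make the idea of a multi-run success-rate comparison concrete, here is a small sketch of the underlying arithmetic. It is not how `--compare` is implemented. It assumes, hypothetically, that each results file is a JSON list of test records with a `status` field whose passing value is `success`; the real schema may differ.

```python
import json


def load_records(path):
    """Load one results file (assumed here to be a JSON list of test records)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def success_rate(records):
    """Fraction of records whose (assumed) status field equals "success"."""
    if not records:
        return 0.0
    passed = sum(1 for record in records if record.get("status") == "success")
    return passed / len(records)


def compare_runs(runs):
    """Given (label, records) pairs in run order, report each rate and its change."""
    rows = []
    previous = None
    for label, records in runs:
        rate = success_rate(records)
        change = None if previous is None else rate - previous
        rows.append({"run": label, "success_rate": rate, "change": change})
        previous = rate
    return rows


# Illustrative records only, not real benchmark output.
trend = compare_runs([
    ("result1", [{"status": "success"}, {"status": "failure"}]),
    ("result2", [{"status": "success"}, {"status": "success"}]),
])
```

In practice the record lists would come from `load_records("result1.json")` and so on, in the order the runs were produced.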