Testing Guide¶
This guide covers strategies for testing PortableRalph itself and for testing the code that Ralph generates.
Overview¶
Testing Ralph involves two aspects:
- Testing Ralph itself - Ensuring Ralph works correctly
- Testing Ralph's output - Validating AI-generated code
Testing Ralph Installation¶
Quick Verification¶
# Check Ralph is installed
ralph --version
# Expected output:
# PortableRalph v1.6.0
# Check help works
ralph --help
# Check dependencies
which claude # Claude CLI
which git # Git
which curl # For updates/notifications
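The three `which` calls above can be folded into one check that reports every missing dependency and exits non-zero if any is absent. This is a minimal sketch; `check_deps` is a hypothetical helper name, not a command shipped with Ralph.

```shell
# check_deps: report which of the given commands are missing from PATH.
# Hypothetical helper for illustration; not part of Ralph.
check_deps() {
    deps_missing=0
    for dep in "$@"; do
        if command -v "$dep" >/dev/null 2>&1; then
            echo "✓ $dep found"
        else
            echo "✗ $dep missing"
            deps_missing=1
        fi
    done
    return "$deps_missing"
}

# Example:
#   check_deps claude git curl
```

`command -v` is used instead of `which` because it is built into POSIX shells and behaves consistently across platforms.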
New Test Suites¶
Ralph now includes comprehensive test suites for security, validation, constants, and Windows compatibility.
Test Suite: Security Fixes¶
File: tests/test-security.sh
Tests all security vulnerabilities and their fixes:
# Run security tests
cd ~/ralph
./tests/test-security.sh
# Test categories:
# - Command injection prevention (sed, eval, system calls)
# - Path traversal attack prevention
# - SSRF protection in URL validation
# - Custom script validation
# - Input sanitization
# - Token masking in logs
# - Config file validation
Sample output:
Running Ralph Security Tests
============================
Command Injection Tests:
✓ Sed injection prevented
✓ Command substitution blocked
✓ Eval injection prevented
Path Traversal Tests:
✓ Parent directory traversal blocked
✓ Absolute path injection prevented
✓ Symlink attack mitigated
SSRF Protection Tests:
✓ Localhost URLs rejected
✓ Private IP ranges blocked
✓ Metadata service URLs blocked
Tests: 42 | Passed: 42 | Failed: 0
What it tests:
- Sed injection with malicious patterns
- Command injection via $(), backticks, |, ;, &
- Path traversal using .., ~, absolute paths
- SSRF attempts to localhost, 127.0.0.1, 192.168.x.x, 10.x.x.x, 169.254.169.254
- Custom script validation (permissions, existence, timeout)
- Token masking (ensures secrets not logged)
- JSON escaping for special characters
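To make the SSRF category concrete, here is a rough sketch of the kind of host check such tests exercise. Both function names are hypothetical and the pattern list only mirrors the addresses named above; Ralph's actual validation may work differently and cover more cases (for example other private ranges or IPv6).

```shell
# url_host: extract the host part from an http(s) URL.
# Hypothetical helper for illustration; not Ralph's implementation.
url_host() {
    host=${1#*://}     # drop the scheme
    host=${host%%/*}   # drop the path
    host=${host##*@}   # drop any user info
    host=${host%%:*}   # drop the port
    printf '%s\n' "$host"
}

# is_blocked_host: succeed (exit 0) if the host looks local or private.
# Pattern-based and deliberately simple; an assumption, not a full SSRF defence.
is_blocked_host() {
    case "$1" in
        localhost|127.*|10.*|192.168.*|169.254.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   is_blocked_host "$(url_host http://169.254.169.254/latest/meta-data)" && echo "rejected"
```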
Test Suite: Validation Library¶
File: lib/test-compat.sh
Tests the validation library functions:
# Run validation tests
cd ~/ralph
./lib/test-compat.sh
# Test categories:
# - URL validation (format, SSRF, protocols)
# - Email validation (RFC compliance)
# - Numeric validation (range checking)
# - Path validation (security, existence)
# - JSON escaping
# - Token masking
Sample output:
Testing Validation Library
=========================
URL Validation:
✓ Valid HTTPS URLs accepted
✓ Invalid URLs rejected
✓ SSRF attempts blocked
✓ Protocol validation working
Email Validation:
✓ Valid emails accepted
✓ Invalid emails rejected
✓ RFC compliance verified
Numeric Validation:
✓ Valid numbers accepted
✓ Invalid input rejected
✓ Range checking works
Tests: 35 | Passed: 35 | Failed: 0
What it tests:
- validate_url() - URL format, SSRF protection, protocol checking
- validate_email() - Email format, RFC 5322 compliance
- validate_numeric() - Integer validation, range checking
- validate_path() - Path security, traversal prevention, existence
- json_escape() - Special character escaping for JSON
- mask_token() - Sensitive data masking
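As an illustration of what a masking test checks, the sketch below keeps a short prefix of a secret and hides the rest. The name `mask_token_sketch` and the four-character prefix are assumptions made for this example; the real `mask_token()` in Ralph's library may mask differently.

```shell
# mask_token_sketch: show at most the first 4 characters of a secret.
# Illustrative only; not the library's mask_token().
mask_token_sketch() {
    token=$1
    if [ "${#token}" -le 4 ]; then
        # Too short to reveal anything safely
        printf '****\n'
        return 0
    fi
    prefix=$(printf '%s' "$token" | cut -c1-4)
    printf '%s****\n' "$prefix"
}

# Example:
#   echo "Using token $(mask_token_sketch "$API_TOKEN")"
```

A test for such a function would assert both that the prefix is preserved and that the remainder of the secret never appears in the output.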
Test Suite: Constants Library¶
File: lib/constants.sh (with inline tests)
Tests that all constants are defined and exported:
# Verify constants are loaded
source ~/ralph/lib/constants.sh
# Check specific constant
echo $HTTP_MAX_TIME
# Should output: 10
# Check all constants exported
env | grep -E 'HTTP_|NOTIFY_|MAX_|TIMEOUT'
What it tests:
- All timeout constants defined
- All rate limit constants defined
- All retry constants defined
- All validation limit constants defined
- All constants properly exported
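The manual `echo` and `grep` checks can be turned into a pass/fail check. The sketch below assumes a hypothetical helper, `require_exported`, that fails when any named variable is unset, empty, or not exported; the constant names in the example comment are the ones used elsewhere in this guide.

```shell
# require_exported: fail if any named variable is unset, empty, or not exported.
# Hypothetical helper for illustration; not part of lib/constants.sh.
require_exported() {
    req_status=0
    for name in "$@"; do
        # printenv runs as a child process, so it only sees exported variables
        if value=$(printenv "$name") && [ -n "$value" ]; then
            echo "✓ $name=$value"
        else
            echo "✗ $name is unset, empty, or not exported"
            req_status=1
        fi
    done
    return "$req_status"
}

# Example (after sourcing lib/constants.sh):
#   require_exported HTTP_MAX_TIME NOTIFY_MAX_RETRIES
```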
Test Suite: Windows Compatibility¶
Bash Tests: lib/test-compat.sh
PowerShell Tests: lib/test-compat.ps1
Tests cross-platform compatibility:
# Run bash compatibility tests
cd ~/ralph
./lib/test-compat.sh
# Test categories:
# - Platform detection
# - Path conversion (Windows ↔ Unix)
# - Command availability checking
# - Process management
# - Configuration loading
PowerShell tests:
# Run PowerShell compatibility tests
cd $HOME\ralph
.\lib\test-compat.ps1
# Test categories:
# - Platform utilities (Get-RalphPlatform, Test-IsWSL)
# - Path conversion (Get-UnixPath, Get-WindowsPath)
# - Command checking (Test-CommandExists)
# - Config management (Get-RalphConfig, Set-RalphConfig)
# - Process management (Test-RalphProcess, Start-RalphBackground)
Sample output:
Testing Windows Compatibility
=============================
Platform Detection:
✓ Platform correctly detected
✓ WSL detection working
✓ Architecture detection working
Path Conversion:
✓ Windows to Unix conversion works
✓ Unix to Windows conversion works
✓ Path normalization works
Command Utilities:
✓ Command existence checking works
✓ Safe command execution works
Tests: 28 | Passed: 28 | Failed: 0
What it tests:
- Platform detection (Windows, WSL, Linux, macOS)
- Path conversions between Windows and Unix formats
- Command availability (git, claude, curl, etc.)
- Process management (start, stop, status)
- Configuration reading and writing
- Cross-platform compatibility
Test Suite: PowerShell Scripts¶
File: tests/test-ralph.ps1, tests/test-notify.ps1, tests/test-monitor.ps1
PowerShell-specific test suites:
# Run all PowerShell tests
cd $HOME\ralph\tests
# Test Ralph core
.\test-ralph.ps1
# Test notifications
.\test-notify.ps1
# Test monitoring
.\test-monitor.ps1
What they test:
- Ralph command execution in PowerShell
- Notification system in PowerShell environment
- Progress monitoring in PowerShell
- Configuration management
- Error handling
Running All Tests¶
# Run all bash tests
cd ~/ralph
for test in tests/test-*.sh lib/test-*.sh; do
    echo "Running $test..."
    bash "$test"
done
# Run all PowerShell tests (Windows)
cd $HOME\ralph
Get-ChildItem -Path tests,lib -Filter "test-*.ps1" | ForEach-Object {
    Write-Host "Running $($_.Name)..."
    & $_.FullName
}
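The bash loop above keeps going after a failing suite but does not report an overall result. A small wrapper can count failures and return a non-zero status, which is what a CI job needs. This is a sketch; `run_suites` is a hypothetical name, not a script in the Ralph repository.

```shell
# run_suites: run each script, count failures, return non-zero if any failed.
# Hypothetical helper for illustration; not part of Ralph.
run_suites() {
    suites_failed=0
    suites_total=0
    for suite in "$@"; do
        [ -f "$suite" ] || continue   # skip glob patterns that matched nothing
        suites_total=$((suites_total + 1))
        echo "Running $suite..."
        if ! bash "$suite"; then
            echo "FAILED: $suite"
            suites_failed=$((suites_failed + 1))
        fi
    done
    echo "Suites: $suites_total | Failed: $suites_failed"
    [ "$suites_failed" -eq 0 ]
}

# Example:
#   cd ~/ralph && run_suites tests/test-*.sh lib/test-*.sh
```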
Functional Tests¶
Test 1: Plan Mode¶
# Create test plan
cat > test-plan.md << 'EOF'
# Test Feature
## Goal
Create a simple hello world function
## Requirements
- Function named `hello()`
- Returns "Hello, World!"
- Include test
EOF
# Run plan mode
ralph ./test-plan.md plan
# Verify progress file created
test -f test-plan_PROGRESS.md && echo "✓ Progress file created" || echo "✗ Failed"
# Check for task list
grep -q "Task" test-plan_PROGRESS.md && echo "✓ Tasks generated" || echo "✗ Failed"
# Verify status is IN_PROGRESS
grep -q "IN_PROGRESS" test-plan_PROGRESS.md && echo "✓ Status correct" || echo "✗ Failed"
# Clean up
rm test-plan.md test-plan_PROGRESS.md
Test 2: Build Mode (Dry Run)¶
# Create simple plan
cat > test-build.md << 'EOF'
# Test Build
## Goal
Add a comment to README
## Requirements
- Add "# Test Comment" to top of README.md
DO_NOT_COMMIT
EOF
# Backup README
cp README.md README.md.bak
# Run with 1 iteration
ralph ./test-build.md build 1
# Verify change was made
grep -q "Test Comment" README.md && echo "✓ File modified" || echo "✗ Failed"
# Restore
mv README.md.bak README.md
rm test-build.md test-build_PROGRESS.md
Test 3: Notifications¶
# Test notification system
ralph notify test
# Expected output:
# Testing Ralph notifications...
#
# Configured platforms:
# - Slack: configured (or not configured)
# - Discord: configured (or not configured)
# ...
#
# Sending test message...
# Slack: sent (or FAILED)
# ...
# Verify notification appeared in Slack/Discord
Test 4: Configuration¶
# Test config commands
ralph config commit status
# Expected output shows current setting:
# Auto-commit setting:
# Current: enabled (or disabled)
# Toggle setting
ralph config commit off
ralph config commit status | grep -q "disabled" && echo "✓ Config changed" || echo "✗ Failed"
# Restore
ralph config commit on
Test 5: Update System¶
# Check for updates
ralph update --check
# List versions
ralph update --list
# Verify current version shown
ralph update --list | grep -q "$(ralph --version | awk '{print $2}')" && echo "✓ Version found" || echo "✗ Failed"
Continuous Integration Testing¶
Add Ralph tests to your CI/CD pipeline:
GitHub Actions:
name: Ralph Tests
on: [push, pull_request]
jobs:
  test-ralph:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run Ralph security tests
        run: ./tests/test-security.sh
      - name: Run validation tests
        run: ./lib/test-compat.sh
      - name: Verify constants
        run: |
          source ./lib/constants.sh
          test -n "$HTTP_MAX_TIME"
          test -n "$NOTIFY_MAX_RETRIES"
Windows CI with PowerShell:
name: Ralph Windows Tests
on: [push, pull_request]
jobs:
  test-windows:
    runs-on: windows-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run PowerShell compatibility tests
        shell: pwsh
        run: .\lib\test-compat.ps1
      - name: Run PowerShell notification tests
        shell: pwsh
        run: .\tests\test-notify.ps1
Testing Ralph's Output¶
Unit Testing Generated Code¶
Ralph should write code that passes your test suite. Ensure Ralph:
- Runs tests after implementing
- Fixes failing tests
- Adds tests for new functionality
Example plan with test requirements:
# Feature: User Authentication
## Goal
Add user login endpoint
## Requirements
- POST /auth/login endpoint
- Validates username/password
- Returns JWT token
## Testing Requirements
- Unit tests for login function
- Integration test for endpoint
- Test invalid credentials
- Test missing fields
- All tests must pass
Manual Validation¶
After Ralph completes, manually verify:
# Review all commits
git log --oneline --author="Ralph" -10
# Check each commit
git show HEAD
git show HEAD~1
# Run full test suite
npm test # Node.js
pytest # Python
cargo test # Rust
go test ./... # Go
# Run linters
npm run lint # JavaScript/TypeScript
pylint **/*.py # Python
cargo clippy # Rust
go vet ./... # Go
# Build project
npm run build # Node.js
python setup.py build # Python
cargo build # Rust
go build # Go
Integration Testing¶
Verify Ralph's changes work with the rest of the system:
# Run integration tests
npm run test:integration
pytest tests/integration
cargo test --test integration
# Manual testing
# Start development server
npm run dev
# Test new features manually
curl http://localhost:3000/new-endpoint
# Or use Postman, browser, etc.
Automated Testing in CI/CD¶
GitHub Actions Example¶
name: Test Ralph Output
on:
  push:
    branches: ['ralph/**'] # Branches created by Ralph
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup environment
        run: |
          # Install dependencies
          npm install # or pip install -r requirements.txt
      # Each step fails the job automatically if its command exits non-zero
      - name: Lint
        run: npm run lint
      - name: Unit tests
        run: npm test
      - name: Integration tests
        run: npm run test:integration
      - name: Security scan
        run: npm audit --audit-level=high
      - name: Build
        run: npm run build
      - name: Comment on PR
        # Post the success message only when every previous step passed.
        # context.issue.number is set only for runs tied to a pull request.
        if: success()
        uses: actions/github-script@v6
        with:
          script: |
            await github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: '✅ All tests passed! Ralph-generated code is ready for review.'
            })
Pre-merge Validation¶
#!/bin/bash
# Git hook: save this file as .git/hooks/pre-push
echo "Running tests before push..."
# Run tests
if ! npm test; then
    echo "❌ Tests failed. Push aborted."
    exit 1
fi
# Run linter
if ! npm run lint; then
    echo "❌ Linting failed. Push aborted."
    exit 1
fi
echo "✅ All checks passed. Pushing..."
exit 0
Make it executable:
chmod +x .git/hooks/pre-push
Test-Driven Development with Ralph¶
Ralph works well with TDD. Write tests first, then let Ralph implement:
Example TDD Workflow¶
Step 1: Write tests
// tests/calculator.test.js
describe('Calculator', () => {
  test('add() should sum two numbers', () => {
    const calc = new Calculator();
    expect(calc.add(2, 3)).toBe(5);
  });
  test('subtract() should subtract two numbers', () => {
    const calc = new Calculator();
    expect(calc.subtract(5, 3)).toBe(2);
  });
  test('multiply() should multiply two numbers', () => {
    const calc = new Calculator();
    expect(calc.multiply(2, 3)).toBe(6);
  });
});
Step 2: Create plan
# Implement Calculator
## Goal
Implement Calculator class to pass all tests
## Requirements
- Create Calculator class
- Implement add() method
- Implement subtract() method
- Implement multiply() method
- All tests in tests/calculator.test.js must pass
## Success Criteria
- `npm test` exits with code 0
- No linting errors
Step 3: Run Ralph
# Tests currently fail (no implementation)
npm test # Fails
# Run Ralph
ralph ./calculator-plan.md build 5
# Tests now pass
npm test # Passes ✓
Regression Testing¶
Ensure Ralph doesn't break existing functionality:
Create Regression Test Suite¶
# Before Ralph makes changes
npm test > baseline-tests.log
# Run Ralph
ralph ./plan.md build
# After Ralph
npm test > after-ralph-tests.log
# Compare
diff baseline-tests.log after-ralph-tests.log
# If different, investigate
Automated Regression Check¶
#!/bin/bash
# regression-check.sh
# Get baseline
git checkout main
npm test > /tmp/baseline.log
BASELINE_EXIT=$?
# Get current
git checkout -
npm test > /tmp/current.log
CURRENT_EXIT=$?
# Compare
if [ "$BASELINE_EXIT" -eq 0 ] && [ "$CURRENT_EXIT" -ne 0 ]; then
    echo "❌ Regression detected! Tests passed on main but fail now."
    diff /tmp/baseline.log /tmp/current.log
    exit 1
fi
echo "✅ No regression detected"
exit 0
Performance Testing¶
Verify Ralph's changes don't degrade performance:
Benchmark Tests¶
// benchmark.js
const { performance } = require('perf_hooks');

function benchmark(name, fn, iterations = 1000) {
  const start = performance.now();
  for (let i = 0; i < iterations; i++) {
    fn();
  }
  const end = performance.now();
  const duration = end - start;
  const avg = duration / iterations;
  console.log(`${name}: ${duration.toFixed(2)}ms total, ${avg.toFixed(4)}ms avg`);
}

// Before Ralph
benchmark('Old Implementation', () => {
  oldFunction();
});

// After Ralph
benchmark('New Implementation', () => {
  newFunction();
});
Load Testing¶
# Before Ralph
ab -n 1000 -c 10 http://localhost:3000/endpoint > before.txt
# Run Ralph
ralph ./plan.md build
# After Ralph
ab -n 1000 -c 10 http://localhost:3000/endpoint > after.txt
# Compare
diff before.txt after.txt
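A plain `diff` of two `ab` reports almost always shows differences, because timings vary from run to run. Comparing a single metric is usually more informative. The sketch below reads the "Requests per second" line that `ab` prints and flags a drop beyond a tolerance. The helper names and the 10% tolerance in the example are assumptions for illustration.

```shell
# rps_from_ab: print the "Requests per second" value from an ab report file.
# Hypothetical helper for illustration.
rps_from_ab() {
    awk '/^Requests per second:/ { print $4; exit }' "$1"
}

# rps_regressed: succeed (exit 0) if the "after" rate is more than
# pct percent below the "before" rate.
rps_regressed() {
    awk -v b="$1" -v a="$2" -v pct="$3" \
        'BEGIN { exit !((a + 0) < (b + 0) * (1 - pct / 100)) }'
}

# Example:
#   before=$(rps_from_ab before.txt)
#   after=$(rps_from_ab after.txt)
#   if rps_regressed "$before" "$after" 10; then
#       echo "Throughput dropped by more than 10%: $before -> $after"
#   fi
```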
Coverage Testing¶
Ensure Ralph adds adequate test coverage:
Measure Coverage¶
# JavaScript/TypeScript
npm run test:coverage
# or
jest --coverage
# Python
pytest --cov=src tests/
# Go
go test -coverprofile=coverage.out ./...
go tool cover -html=coverage.out
# Rust
cargo tarpaulin --out Html
Coverage Requirements in Plan¶
# Feature: User Service
## Goal
Add user management functionality
## Requirements
- Create User model
- CRUD operations
- Input validation
## Testing Requirements
- Unit tests for all public methods
- Test edge cases (null, empty, invalid)
- **Minimum 80% code coverage**
- Coverage report must show increase, not decrease
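The last two requirements in the plan above (a minimum percentage and no decrease) can be checked mechanically once you have the two coverage figures. How you obtain them depends on your coverage tool, so this sketch takes them as arguments; `coverage_ok` is a hypothetical helper name.

```shell
# coverage_ok BASELINE CURRENT MINIMUM
# Succeeds only if CURRENT meets MINIMUM and is not below BASELINE.
# Hypothetical helper for illustration; percentages are plain numbers such as 82.5.
coverage_ok() {
    awk -v base="$1" -v cur="$2" -v min="$3" \
        'BEGIN { exit !((cur + 0) >= (min + 0) && (cur + 0) >= (base + 0)) }'
}

# Example:
#   coverage_ok 82.5 84.0 80 && echo "Coverage requirement met"
```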
Enforce Coverage Thresholds¶
// jest.config.js
module.exports = {
  coverageThreshold: {
    global: {
      branches: 80,
      functions: 80,
      lines: 80,
      statements: 80
    }
  }
};
Test Data Management¶
Fixtures for Testing¶
# tests/fixtures.py
import pytest


@pytest.fixture
def sample_user():
    return {
        "id": 1,
        "username": "testuser",
        "email": "test@example.com"
    }


@pytest.fixture
def mock_database():
    # Set up mock database (MockDB is a placeholder for your project's test double)
    db = MockDB()
    yield db
    # Teardown
    db.close()
Test Plan with Fixtures¶
# Feature: User Management
## Testing Requirements
- Use existing fixtures in tests/fixtures.py
- Add new fixtures for new entities
- Don't use real database in unit tests
- Integration tests can use test database
Continuous Testing¶
Watch Mode¶
Run tests continuously as Ralph makes changes:
# JavaScript/TypeScript
npm test -- --watch
# Python
pytest-watch
# Rust
cargo watch -x test
# Go
gow test ./...
Monitor Ralph + Tests¶
# Terminal 1: Run Ralph
ralph ./plan.md build
# Terminal 2: Watch tests
npm test -- --watch
# Terminal 3: Watch logs
tail -f ~/ralph/monitor.log
Validation Checklist¶
Use this checklist after Ralph completes:
Code Quality¶
- [ ] All tests pass (npm test)
- [ ] Linting passes (npm run lint)
- [ ] Build succeeds (npm run build)
- [ ] No TypeScript errors (tsc --noEmit)
- [ ] Code coverage maintained or increased
Functionality¶
- [ ] Feature works as described in plan
- [ ] Edge cases handled
- [ ] Error handling present
- [ ] Input validation added
Security¶
- [ ] No hardcoded credentials
- [ ] No SQL injection vulnerabilities
- [ ] No XSS vulnerabilities
- [ ] Dependencies are secure (npm audit)
Documentation¶
- [ ] Code is commented
- [ ] JSDoc/docstrings added
- [ ] README updated if needed
- [ ] CHANGELOG updated
Performance¶
- [ ] No obvious performance regressions
- [ ] Efficient algorithms used
- [ ] No memory leaks
Common Testing Issues¶
Issue: Tests Fail After Ralph¶
Diagnosis:
# See what changed
git diff HEAD~1
# Run specific test
npm test -- --testNamePattern="failing test"
# Check for syntax errors
npm run lint
Solutions:
- Fix the failing code manually and commit
- Revert the change and adjust the plan
- Let Ralph fix it
Issue: Coverage Decreased¶
Diagnosis:
# Check coverage
npm run test:coverage
# See untested code
# Look at coverage report (usually in coverage/index.html)
Solution:
Add testing requirement to plan:
## Testing Requirements
- Add tests for all new functions
- Maintain minimum 80% coverage
- Test edge cases: null, undefined, empty, invalid inputs
Best Practices¶
1. Always Include Test Requirements in Plans¶
# Good Plan
## Requirements
- Implement feature X
- Add unit tests
- Add integration tests
- All tests must pass
- Minimum 80% coverage
# Bad Plan
## Requirements
- Implement feature X
2. Run Plan Mode First¶
# Review task list before building
ralph ./plan.md plan
cat plan_PROGRESS.md
# Check if testing tasks included
grep -i "test" plan_PROGRESS.md
3. Limit Iterations for Testing¶
# Build incrementally, test frequently
ralph ./plan.md build 5
npm test # Verify
ralph ./plan.md build 5
npm test # Verify again
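The build-then-verify pattern above can be wrapped in a loop that stops at the first failing round, so a broken build does not accumulate further changes. This is a sketch; `build_in_rounds` is a hypothetical helper, and the commands in the example are the ones shown above.

```shell
# build_in_rounds ROUNDS BUILD_CMD TEST_CMD
# Runs BUILD_CMD then TEST_CMD up to ROUNDS times, stopping at the first failure.
# Hypothetical helper for illustration; not part of Ralph.
build_in_rounds() {
    rounds=$1
    build_cmd=$2
    test_cmd=$3
    round=1
    while [ "$round" -le "$rounds" ]; do
        echo "Round $round: building"
        sh -c "$build_cmd" || { echo "Build step failed in round $round"; return 1; }
        sh -c "$test_cmd" || { echo "Tests failed after round $round"; return 1; }
        round=$((round + 1))
    done
    echo "All $rounds rounds passed"
}

# Example:
#   build_in_rounds 2 "ralph ./plan.md build 5" "npm test"
```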
4. Use DO_NOT_COMMIT for Experiments¶
5. Keep Test Data Separate¶
# Keep fixture changes out of Ralph's commits
# (note: .gitignore has no effect on files git already tracks)
echo "tests/fixtures/**" >> .gitignore
# Or make them read-only so they cannot be modified
chmod -R a-w tests/fixtures
See Also¶
- Usage Guide - Ralph commands
- CI/CD Examples - Automated testing in CI
- Security Guide - Security testing
- Troubleshooting - Test-related issues