---
description: Unit testing guidelines for concise and effective test coverage
globs: tests/**/*.py
---
# Unit Testing Guidelines
## Test Conciseness Principles
### Default Test Structure
Most functions should have **only 2 unit tests**:
1. **Happy Path Test** - Valid inputs with multiple scenarios
2. **Error Handling Test** - Invalid inputs and edge cases
### Consolidate Related Scenarios
Instead of separate tests for each condition, combine related validations:
```python
# ✅ Good: Consolidated valid scenarios
def test_function_name_valid_inputs(self):
"""Test function with various valid inputs."""
# Test multiple valid scenarios in one test
assert function(valid_input_1) == expected_1
assert function(valid_input_2) == expected_2
assert function(edge_case_valid) == expected_3
# ✅ Good: Consolidated error scenarios
def test_function_name_error_handling(self):
"""Test function error handling."""
with pytest.raises(ValueError):
function(invalid_input_1)
with pytest.raises(TypeError):
function(invalid_input_2)
```
```python
# ❌ Bad: Too many separate tests
def test_function_with_input_1(self):
def test_function_with_input_2(self):
def test_function_with_edge_case(self):
def test_function_missing_column_1(self):
def test_function_missing_column_2(self):
def test_function_null_values(self):
def test_function_empty_dataframe(self):
```
## When to Add More Tests
Only create additional tests for:
- **Complex functions** with multiple distinct code paths
- **Critical business logic** requiring detailed validation
- **Integration points** with external systems
- **Performance-sensitive** functions
## Test Naming Convention
Use descriptive names that capture the test scope:
- `test_<function_name>_valid_scenarios`
- `test_<function_name>_error_handling`
- `test_<function_name>_integration` (if needed)
## Fixtures and Setup
Keep fixtures simple and focused:
- One fixture per test data type
- Combine related test data in single fixtures
- Avoid over-engineered test setup
## Example: Before and After
### Before (Too Many Tests)
```python
def test_add_column_successful_calculation(self):
def test_add_column_missing_columns(self):
def test_add_column_zero_distance_handling(self):
def test_add_column_null_values(self):
def test_add_column_empty_dataframe(self):
def test_add_column_return_type(self):
def test_add_column_preserves_columns(self):
```
### After (Concise)
```python
def test_add_avg_fare_per_mile_by_zones_valid_scenarios(self, test_dataframe):
"""Test function with valid inputs including edge cases."""
result_df = add_avg_fare_per_mile_by_zones(test_dataframe)
# Test column addition
assert "avg_fare_per_mile_by_zones" in result_df.columns
# Test return type
assert isinstance(result, DataFrame)
# Test column preservation
assert len(result_df.columns) == len(test_dataframe.columns) + 1
# Test calculations with various scenarios
results = result_df.collect()
# ... validate multiple scenarios in one test
def test_add_avg_fare_per_mile_by_zones_error_handling(self, spark_session):
"""Test function error handling and edge cases."""
# Test missing columns
with pytest.raises(ValueError, match="Missing required columns"):
add_avg_fare_per_mile_by_zones(incomplete_df)
# Test empty DataFrame handling
empty_result = add_avg_fare_per_mile_by_zones(empty_df)
assert empty_result.count() == 0
# Test null value handling
null_result = add_avg_fare_per_mile_by_zones(null_df)
assert null_result.count() == len(null_data)
```
## Benefits of Concise Testing
- Faster test execution
- Easier maintenance
- Focus on essential functionality
- Clearer test intent
- Reduced test code complexity