
Internal Implementation Specification: Payment Recovery ML Model

Hands In Payment Recovery Service - Machine Learning System

Document Version: 1.0
Date: October 6, 2025
Owner: Hands In Data Science & Engineering Team
Status: Draft - Design Phase
Audience: Internal - Engineering, Data Science, Product Teams


Executive Summary

This is an internal technical specification for the Payment Recovery Intelligence Engine implementation. This document is intended for Hands In's engineering and data science teams who will build, train, deploy, and maintain the machine learning system that powers payment recovery.

For customer-facing integration documentation, see the Payment Recovery Service Technical Specification.

The Payment Recovery Intelligence Engine uses machine learning to predict the optimal recovery strategy for failed payments. By analyzing historical transaction data, customer behavior patterns, processor performance metrics, and contextual factors, the model is designed to achieve 30-40% recovery rates on previously failed payments.

Key Capabilities

  • Strategy Selection: Predict which recovery strategy has the highest success probability
  • Processor Routing: Select optimal payment processor based on failure characteristics
  • Success Prediction: Estimate likelihood of recovery success (0-100%)
  • Time Estimation: Predict optimal timing for recovery attempts
  • Confidence Scoring: Provide confidence intervals for predictions

1. Model Architecture

1.1 Ensemble Approach

The system uses an ensemble of three complementary models to maximize prediction accuracy:

Input Features (50+ dimensions)

Feature Engineering Pipeline

┌──────────────────────────────────────────────┐
│               Ensemble Model                 │
│                                              │
│  ┌────────────────────────────────────────┐  │
│  │ Model 1: Gradient Boosted Trees        │  │
│  │ (XGBoost)                              │  │
│  │ - Best for: Categorical decisions      │  │
│  │ - Weight: 0.40                         │  │
│  └────────────────────────────────────────┘  │
│                                              │
│  ┌────────────────────────────────────────┐  │
│  │ Model 2: Deep Neural Network           │  │
│  │ - Architecture: 3 hidden layers        │  │
│  │ - Best for: Non-linear patterns        │  │
│  │ - Weight: 0.35                         │  │
│  └────────────────────────────────────────┘  │
│                                              │
│  ┌────────────────────────────────────────┐  │
│  │ Model 3: Random Forest                 │  │
│  │ - 500 trees, max depth 15              │  │
│  │ - Best for: Feature interactions       │  │
│  │ - Weight: 0.25                         │  │
│  └────────────────────────────────────────┘  │
│                                              │
└──────────────────────────────────────────────┘

Weighted Voting & Confidence Calculation

Output Predictions:
├─ Primary Strategy (with probability)
├─ Fallback Strategies (ranked)
├─ Optimal Processor
├─ Expected Success Rate (%)
├─ Estimated Time to Recovery (hours)
└─ Confidence Score (0-1)
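The weighted voting step above can be sketched in a few lines. The weights mirror the 0.40/0.35/0.25 split in the diagram; the margin-based confidence measure here is one simple choice for illustration, not the production definition.

```python
# Sketch of the weighted-voting step, assuming each model returns a vector
# of per-strategy probabilities for one transaction.
import numpy as np

WEIGHTS = np.array([0.40, 0.35, 0.25])  # XGBoost, Neural Network, Random Forest

def ensemble_vote(model_probs: list, weights: np.ndarray = WEIGHTS) -> dict:
    """Combine per-model probability vectors into one ranked prediction."""
    probs = np.average(np.stack(model_probs), axis=0, weights=weights)
    primary = int(np.argmax(probs))
    # Confidence: margin between top-1 and top-2 probabilities (one simple choice)
    top2 = np.sort(probs)[-2:]
    confidence = float(top2[1] - top2[0])
    fallbacks = [int(i) for i in np.argsort(probs)[::-1][1:]]
    return {"primary": primary, "probability": float(probs[primary]),
            "confidence": confidence, "fallbacks": fallbacks}

# Example: 3 models voting over 5 strategies
p = ensemble_vote([np.array([0.5, 0.2, 0.1, 0.1, 0.1]),
                   np.array([0.4, 0.3, 0.1, 0.1, 0.1]),
                   np.array([0.6, 0.1, 0.1, 0.1, 0.1])])
```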

1.2 Model Components

1.2.1 XGBoost Model

import xgboost as xgb

model_xgb = xgb.XGBClassifier(
    objective='multi:softprob',
    num_class=5,  # 5 recovery strategies
    max_depth=10,
    learning_rate=0.1,
    n_estimators=200,
    subsample=0.8,
    colsample_bytree=0.8,
    reg_alpha=0.1,
    reg_lambda=1.0,
    random_state=42
)

Strengths:

  • Excellent handling of categorical features
  • Robust to missing data
  • Fast inference time (< 50ms)
  • Built-in feature importance

1.2.2 Neural Network Model

import tensorflow as tf
from tensorflow import keras

# A multi-output model requires the Keras functional API; a Sequential stack
# cannot expose the 'strategy' and 'success_probability' heads side by side.
inputs = keras.Input(shape=(feature_dim,))

x = keras.layers.Dense(128, activation='relu')(inputs)
x = keras.layers.Dropout(0.3)(x)
x = keras.layers.BatchNormalization()(x)

x = keras.layers.Dense(64, activation='relu')(x)
x = keras.layers.Dropout(0.2)(x)
x = keras.layers.BatchNormalization()(x)

x = keras.layers.Dense(32, activation='relu')(x)
x = keras.layers.Dropout(0.1)(x)

# Multi-output heads
strategy_out = keras.layers.Dense(5, activation='softmax', name='strategy')(x)
success_out = keras.layers.Dense(1, activation='sigmoid', name='success_probability')(x)

model_nn = keras.Model(inputs=inputs, outputs=[strategy_out, success_out])

model_nn.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.001),
    loss={
        'strategy': 'categorical_crossentropy',
        'success_probability': 'binary_crossentropy'
    },
    loss_weights={'strategy': 0.6, 'success_probability': 0.4},
    metrics=['accuracy']
)

Strengths:

  • Captures complex non-linear relationships
  • Multi-task learning (strategy + success prediction)
  • Handles continuous features well
  • Good generalization with dropout

1.2.3 Random Forest Model

from sklearn.ensemble import RandomForestClassifier

model_rf = RandomForestClassifier(
    n_estimators=500,
    max_depth=15,
    min_samples_split=10,
    min_samples_leaf=5,
    max_features='sqrt',
    bootstrap=True,
    oob_score=True,
    random_state=42,
    n_jobs=-1
)

Strengths:

  • Robust to outliers
  • Handles feature interactions naturally
  • Provides uncertainty estimates
  • Minimal hyperparameter tuning

2. Feature Engineering

2.1 Input Features (50+ Dimensions)

interface ModelFeatures {
  // Failure Characteristics (10 features)
  failure_category_encoded: number[];  // One-hot encoded (9 categories)
  failure_severity: number;            // 0-1 score
  processor_specific_code: number;     // Hashed processor code
  time_since_failure: number;          // Hours

  // Customer Characteristics (15 features)
  customer_lifetime_value: number;     // Normalized LTV
  customer_risk_score: number;         // 0-1 fraud risk
  customer_tenure_days: number;        // Account age
  total_successful_payments: number;
  total_failed_payments: number;
  customer_success_rate: number;       // Historical %
  avg_payment_amount: number;          // Historical average
  payment_frequency: number;           // Payments per month
  recency_last_payment: number;        // Days since last payment
  preferred_payment_method: number[];  // One-hot encoded
  customer_segment: number;            // Encoded segment (VIP, regular, new)

  // Payment Characteristics (12 features)
  payment_amount_normalized: number;   // Log-normalized amount
  amount_to_avg_ratio: number;         // Current / historical avg
  currency_encoded: number[];          // One-hot encoded
  payment_method_type: number[];       // One-hot encoded
  card_bin_reputation: number;         // 0-1 score for card BIN
  card_brand_encoded: number[];        // One-hot encoded
  card_expiry_months_remaining: number;
  is_subscription: boolean;
  subscription_period: number;         // Encoded period

  // Contextual Features (8 features)
  hour_of_day: number;                 // 0-23
  day_of_week: number;                 // 0-6
  is_weekend: boolean;
  is_business_hours: boolean;
  merchant_category: number[];         // One-hot encoded
  geographic_region: number[];         // One-hot encoded

  // Historical Performance Features (10 features)
  processor_success_rate_category: number;    // Historical % for this category
  processor_success_rate_overall: number;     // Overall historical %
  processor_avg_response_time: number;        // Milliseconds
  strategy_success_rate_category: number;     // Historical % for this category
  similar_transactions_success_rate: number;  // KNN-based similarity
  failure_category_recovery_rate: number;     // Historical category recovery %
  time_of_day_success_rate: number;           // Historical success by hour
  merchant_recovery_rate: number;             // Merchant-specific historical %
  global_recovery_rate: number;               // Platform-wide baseline
  seasonality_factor: number;                 // Seasonal adjustment
}

2.2 Feature Engineering Pipeline

import numpy as np
from sklearn.preprocessing import OneHotEncoder

class FeatureEngineer:
    def __init__(self):
        self.scalers = {}
        self.encoders = {}
        self.feature_stats = {}

    def fit_transform(self, raw_data):
        """Transform raw transaction data into model features"""
        features = {}

        # 1. Categorical Encoding
        features['failure_category'] = self._one_hot_encode(
            raw_data['failure_category'],
            categories=FAILURE_CATEGORIES
        )

        # 2. Numerical Normalization
        features['payment_amount_normalized'] = self._log_normalize(
            raw_data['payment_amount']
        )

        # 3. Temporal Features
        features['hour_of_day'] = self._extract_hour(raw_data['timestamp'])
        features['day_of_week'] = self._extract_day_of_week(raw_data['timestamp'])
        features['is_weekend'] = features['day_of_week'].isin([5, 6])

        # 4. Customer Aggregations
        features['customer_success_rate'] = self._calculate_historical_rate(
            raw_data['customer_id'],
            success_col='is_successful'
        )

        # 5. BIN Reputation Lookup
        features['card_bin_reputation'] = self._lookup_bin_reputation(
            raw_data['card_bin']
        )

        # 6. Similarity Features
        features['similar_transactions_success_rate'] = self._knn_similarity(
            raw_data,
            k=50
        )

        # 7. Interaction Features (the historical average comes from the
        # customer-aggregation data, not from this batch)
        features['amount_to_avg_ratio'] = (
            raw_data['payment_amount'] /
            raw_data['avg_payment_amount']
        )

        return features

    def _one_hot_encode(self, series, categories):
        """One-hot encoding with handling for unknown categories"""
        encoder = OneHotEncoder(
            categories=[categories],
            handle_unknown='ignore',
            sparse_output=False  # `sparse=False` on scikit-learn < 1.2
        )
        return encoder.fit_transform(series.values.reshape(-1, 1))

    def _log_normalize(self, series):
        """Log transformation with standardization"""
        log_values = np.log1p(series)  # log(1 + x) to handle zeros
        return (log_values - log_values.mean()) / log_values.std()

    def _calculate_historical_rate(self, customer_ids, success_col):
        """Calculate per-customer historical success rates"""
        # Implementation would query historical database
        pass

    def _knn_similarity(self, data, k=50):
        """Find k similar transactions and compute their success rate"""
        # Use KNN on feature space to find similar historical transactions
        pass

2.3 Feature Importance

Based on SHAP values from initial model training:

Feature                           | Importance | Category
failure_category                  | 0.18       | Failure
customer_success_rate             | 0.15       | Customer
processor_success_rate_category   | 0.12       | Historical
payment_amount_normalized         | 0.09       | Payment
customer_lifetime_value           | 0.08       | Customer
card_bin_reputation               | 0.07       | Payment
time_since_failure                | 0.06       | Failure
similar_transactions_success_rate | 0.05       | Historical
processor_avg_response_time       | 0.04       | Historical
merchant_recovery_rate            | 0.04       | Historical
(Other features)                  | 0.12       | Various
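The importance shares above are derived from mean absolute SHAP values. A hedged sketch of that aggregation follows; the toy SHAP matrix and feature names here are illustrative, not real training output.

```python
# Sketch: turning a raw SHAP value matrix (samples x features) into the
# normalized importance shares shown in the table above.
import numpy as np

def shap_importance_table(shap_values: np.ndarray, feature_names: list):
    """Mean |SHAP| per feature, normalized so the shares sum to 1."""
    mean_abs = np.abs(shap_values).mean(axis=0)
    shares = mean_abs / mean_abs.sum()
    ranked = sorted(zip(feature_names, shares), key=lambda t: -t[1])
    return [(name, round(float(share), 2)) for name, share in ranked]

# Illustrative 2-sample, 3-feature SHAP matrix
table = shap_importance_table(
    np.array([[0.9, -0.3, 0.1], [-0.7, 0.5, -0.1]]),
    ["failure_category", "customer_success_rate", "time_since_failure"],
)
```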

3. Model Training

3.1 Training Data Requirements

interface TrainingDataset {
  // Minimum samples for initial training
  minimumSamples: 50000;

  // Minimum samples per class
  minimumPerStrategy: 5000;

  // Data collection period
  historicalWindow: '6 months';

  // Data balance requirements
  classDistribution: {
    alternative_processor: 0.30,
    delayed_retry: 0.25,
    alternative_payment_method: 0.20,
    installments: 0.15,
    not_recoverable: 0.10
  };
}
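The requirements above translate into a small validation gate before training is allowed to start. This sketch assumes labels arrive as strategy-name strings and mirrors the thresholds in the interface.

```python
# Sketch: validating a candidate training set against the minimum-sample
# requirements above. Thresholds mirror the TrainingDataset interface.
from collections import Counter

MIN_SAMPLES = 50_000
MIN_PER_STRATEGY = 5_000
STRATEGIES = ["alternative_processor", "delayed_retry",
              "alternative_payment_method", "installments", "not_recoverable"]

def validate_training_set(labels: list) -> list:
    """Return a list of human-readable violations (empty list = dataset OK)."""
    problems = []
    if len(labels) < MIN_SAMPLES:
        problems.append(f"need {MIN_SAMPLES} samples, have {len(labels)}")
    counts = Counter(labels)
    for strategy in STRATEGIES:
        if counts[strategy] < MIN_PER_STRATEGY:
            problems.append(f"{strategy}: {counts[strategy]} < {MIN_PER_STRATEGY}")
    return problems

# A tiny, single-class dataset fails both checks
issues = validate_training_set(["delayed_retry"] * 100)
```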

3.2 Training Pipeline

import numpy as np
import xgboost as xgb
import tensorflow as tf
from tensorflow import keras
from sklearn.metrics import accuracy_score

class ModelTrainer:
    def __init__(self):
        self.models = {
            'xgboost': None,
            'neural_network': None,
            'random_forest': None
        }
        self.ensemble_weights = [0.40, 0.35, 0.25]

    def train_ensemble(self, X_train, y_train, X_val, y_val):
        """Train all models in ensemble"""

        # 1. Train XGBoost
        print("Training XGBoost...")
        self.models['xgboost'] = self._train_xgboost(
            X_train, y_train, X_val, y_val
        )

        # 2. Train Neural Network
        print("Training Neural Network...")
        self.models['neural_network'] = self._train_neural_network(
            X_train, y_train, X_val, y_val
        )

        # 3. Train Random Forest
        print("Training Random Forest...")
        self.models['random_forest'] = self._train_random_forest(
            X_train, y_train, X_val, y_val
        )

        # 4. Optimize ensemble weights
        print("Optimizing ensemble weights...")
        self.ensemble_weights = self._optimize_weights(X_val, y_val)

        # 5. Evaluate ensemble
        val_accuracy = self._evaluate_ensemble(X_val, y_val)
        print(f"Validation accuracy: {val_accuracy:.4f}")

        return self.models

    def _train_xgboost(self, X_train, y_train, X_val, y_val):
        model = xgb.XGBClassifier(
            objective='multi:softprob',
            num_class=5,
            max_depth=10,
            learning_rate=0.1,
            n_estimators=200,
            subsample=0.8,
            colsample_bytree=0.8,
            reg_alpha=0.1,
            reg_lambda=1.0,
            random_state=42,
            early_stopping_rounds=20  # constructor arg on xgboost >= 2.0
        )

        model.fit(
            X_train, y_train,
            eval_set=[(X_val, y_val)],
            verbose=10
        )

        return model

    def _train_neural_network(self, X_train, y_train, X_val, y_val):
        # Convert labels to categorical
        y_train_cat = tf.keras.utils.to_categorical(y_train, num_classes=5)
        y_val_cat = tf.keras.utils.to_categorical(y_val, num_classes=5)

        model = self._build_neural_network(X_train.shape[1])

        # Training callbacks
        callbacks = [
            keras.callbacks.EarlyStopping(
                monitor='val_loss',
                patience=10,
                restore_best_weights=True
            ),
            keras.callbacks.ReduceLROnPlateau(
                monitor='val_loss',
                factor=0.5,
                patience=5,
                min_lr=1e-6
            )
        ]

        model.fit(
            X_train, y_train_cat,
            validation_data=(X_val, y_val_cat),
            epochs=100,
            batch_size=256,
            callbacks=callbacks,
            verbose=1
        )

        return model

    def _optimize_weights(self, X_val, y_val):
        """Use constrained optimization (SLSQP) to find ensemble weights"""
        from scipy.optimize import minimize

        def objective(weights):
            predictions = self._ensemble_predict(X_val, weights)
            accuracy = accuracy_score(y_val, predictions)
            return -accuracy  # Minimize negative accuracy

        # Constraint: weights sum to 1
        constraints = {'type': 'eq', 'fun': lambda w: np.sum(w) - 1}
        bounds = [(0, 1) for _ in range(3)]

        result = minimize(
            objective,
            x0=[0.33, 0.33, 0.34],
            method='SLSQP',
            bounds=bounds,
            constraints=constraints
        )

        return result.x

3.3 Training Schedule

interface TrainingSchedule {
  // Initial training
  initial: {
    data_collection_period: '6 months';
    minimum_samples: 50000;
    training_frequency: 'one-time';
  };

  // Ongoing retraining
  retraining: {
    frequency: 'weekly';
    incremental_update: true;
    full_retrain_frequency: 'monthly';
    trigger_conditions: {
      accuracy_drop: 0.05;           // Retrain if accuracy drops 5%
      new_samples_threshold: 10000;  // Retrain after 10k new samples
      distribution_shift: 0.1;       // Retrain if data distribution shifts
    };
  };

  // A/B testing
  ab_testing: {
    new_model_rollout: 'gradual';
    initial_traffic_percentage: 0.05;
    ramp_up_duration: '2 weeks';
    success_criteria: {
      accuracy_improvement: 0.02;    // 2% improvement required
      no_performance_degradation: true;
    };
  };
}
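The `trigger_conditions` above reduce to a simple predicate; a minimal sketch using those thresholds:

```python
# Sketch of the retraining trigger logic: retrain when accuracy drops by 5
# points, 10k new labelled samples arrive, or feature drift passes 0.1.
def should_retrain(baseline_acc: float, current_acc: float,
                   new_samples: int, drift_score: float) -> bool:
    return (
        baseline_acc - current_acc >= 0.05
        or new_samples >= 10_000
        or drift_score >= 0.1
    )
```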

4. Model Inference

4.1 Prediction Pipeline

class RecoveryPredictor {
  private models: EnsembleModels;
  private featureEngineer: FeatureEngineer;

  async predict(
    failureContext: FailureContext,
    customerContext: CustomerContext,
    paymentContext: PaymentContext
  ): Promise<RecoveryPrediction> {

    // 1. Feature engineering
    const features = await this.featureEngineer.transform({
      failure: failureContext,
      customer: customerContext,
      payment: paymentContext
    });

    // 2. Get predictions from each model
    const xgboostPred = await this.models.xgboost.predict(features);
    const nnPred = await this.models.neuralNetwork.predict(features);
    const rfPred = await this.models.randomForest.predict(features);

    // 3. Ensemble predictions
    const ensemblePrediction = this.combinePredictions(
      [xgboostPred, nnPred, rfPred],
      this.models.ensembleWeights
    );

    // 4. Post-processing and business logic
    const finalPrediction = this.applyBusinessRules(
      ensemblePrediction,
      failureContext,
      customerContext
    );

    return finalPrediction;
  }

  private combinePredictions(
    predictions: Prediction[],
    weights: number[]
  ): EnsemblePrediction {
    // Weighted average of probabilities
    const ensembleProbs = predictions[0].probabilities.map((_, idx) => {
      return predictions.reduce((sum, pred, modelIdx) => {
        return sum + pred.probabilities[idx] * weights[modelIdx];
      }, 0);
    });

    // Select strategy with highest probability
    const primaryStrategyIdx = this.argmax(ensembleProbs);
    const primaryStrategy = STRATEGIES[primaryStrategyIdx];

    // Calculate confidence based on probability distribution
    const confidence = this.calculateConfidence(ensembleProbs);

    // Get fallback strategies
    const fallbackStrategies = this.getFallbackStrategies(
      ensembleProbs,
      primaryStrategyIdx
    );

    return {
      primaryStrategy,
      probability: ensembleProbs[primaryStrategyIdx],
      confidence,
      fallbackStrategies,
      allProbabilities: ensembleProbs
    };
  }

  private applyBusinessRules(
    prediction: EnsemblePrediction,
    failureContext: FailureContext,
    customerContext: CustomerContext
  ): RecoveryPrediction {
    // Override ML prediction if business rules dictate

    // Rule 1: Never retry fraud-suspected transactions
    if (failureContext.category === 'fraud_suspected') {
      return {
        strategy: 'not_recoverable',
        reason: 'fraud_risk',
        confidence: 1.0
      };
    }

    // Rule 2: High-value customers get priority routing
    if (customerContext.lifetimeValue > 10000) {
      // Use alternative processor first for VIP customers
      if (prediction.primaryStrategy !== 'alternative_processor') {
        return {
          ...prediction,
          primaryStrategy: 'alternative_processor',
          reason: 'vip_customer_priority'
        };
      }
    }

    // Rule 3: Low confidence predictions default to safe strategy
    if (prediction.confidence < 0.5) {
      return {
        strategy: 'delayed_retry',
        reason: 'low_confidence_safe_default',
        confidence: prediction.confidence
      };
    }

    return prediction;
  }
}

4.2 Performance Requirements

interface InferencePerformance {
  // Latency targets
  p50_latency: '< 50ms';
  p95_latency: '< 100ms';
  p99_latency: '< 200ms';

  // Throughput
  requests_per_second: 1000;

  // Resource usage
  memory_per_request: '< 10MB';
  cpu_utilization: '< 70%';

  // Model size
  xgboost_size: '~50MB';
  neural_network_size: '~30MB';
  random_forest_size: '~100MB';
  total_ensemble_size: '~180MB';
}
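Checking observed latencies against the p50/p95/p99 targets above can be sketched as follows; the nearest-rank percentile used here is one of several percentile definitions, chosen for simplicity.

```python
# Sketch: flag any latency percentile that misses its target. Targets mirror
# the InferencePerformance interface (all "< N ms").
import math

TARGETS_MS = {50: 50, 95: 100, 99: 200}

def percentile(values: list, pct: float) -> float:
    """Nearest-rank percentile (values need not be pre-sorted)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def latency_violations(latencies_ms: list) -> dict:
    """Return {percentile: observed_latency} for every missed target."""
    return {p: percentile(latencies_ms, p)
            for p, limit in TARGETS_MS.items()
            if percentile(latencies_ms, p) >= limit}

# Uniform 1..100 ms latencies: p50 is exactly 50 ms, which misses "< 50ms"
violations = latency_violations(list(range(1, 101)))
```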

5. Model Evaluation

5.1 Evaluation Metrics

import numpy as np
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score,
    confusion_matrix, classification_report, log_loss, brier_score_loss
)

class ModelEvaluator:
    def evaluate(self, y_true, y_pred, y_pred_proba):
        """Comprehensive model evaluation"""

        metrics = {
            # Classification metrics
            'accuracy': accuracy_score(y_true, y_pred),
            'precision': precision_score(y_true, y_pred, average='weighted'),
            'recall': recall_score(y_true, y_pred, average='weighted'),
            'f1_score': f1_score(y_true, y_pred, average='weighted'),

            # Multi-class metrics
            'confusion_matrix': confusion_matrix(y_true, y_pred),
            'classification_report': classification_report(y_true, y_pred),

            # Probability calibration
            'log_loss': log_loss(y_true, y_pred_proba),
            # Note: brier_score_loss is binary-only; apply it to the
            # success-probability head, not the 5-class strategy output
            'brier_score': brier_score_loss(y_true, y_pred_proba),

            # Business metrics
            'recovery_rate': self._calculate_recovery_rate(y_true, y_pred),
            'revenue_impact': self._calculate_revenue_impact(y_true, y_pred),
            'time_to_recovery': self._calculate_avg_time_to_recovery(y_pred),

            # Per-strategy metrics
            'strategy_metrics': self._per_strategy_metrics(y_true, y_pred)
        }

        return metrics

    def _calculate_recovery_rate(self, y_true, y_pred):
        """Calculate actual recovery rate from predictions"""
        # Predictions that led to successful recovery
        successful = (y_pred != 'not_recoverable') & (y_true == 'recovered')
        return successful.sum() / len(y_true)

    def _calculate_revenue_impact(self, y_true, y_pred):
        """Calculate revenue recovered vs. potential"""
        # Would need transaction amounts to calculate actual $$$
        pass

5.2 Target Metrics (Year 1)

Metric                 | Q1 Target | Q2 Target | Q3 Target | Q4 Target
Strategy Accuracy      | 65%       | 70%       | 75%       | 78%
Recovery Rate          | 20%       | 25%       | 30%       | 35%
Precision (weighted)   | 60%       | 65%       | 70%       | 73%
Recall (weighted)      | 55%       | 60%       | 65%       | 68%
Confidence Calibration | 70%       | 75%       | 80%       | 85%
p95 Latency            | < 150ms   | < 120ms   | < 100ms   | < 80ms
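The document does not define how "Confidence Calibration" is scored; one common proxy is expected calibration error (ECE), sketched here under that assumption (the calibration percentage would then be 1 - ECE).

```python
# Sketch: expected calibration error over equal-width confidence bins.
# `confidences` are predicted probabilities, `correct` are 0/1 outcomes.
import numpy as np

def expected_calibration_error(confidences, correct, n_bins: int = 10) -> float:
    conf = np.asarray(confidences, dtype=float)
    hit = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            # Weight each bin by its share of samples; penalize the gap
            # between average confidence and empirical accuracy
            ece += mask.mean() * abs(conf[mask].mean() - hit[mask].mean())
    return float(ece)
```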

5.3 Model Monitoring

interface ModelMonitoring {
  // Real-time metrics
  realtime: {
    prediction_latency: TimeSeries;
    prediction_distribution: Distribution;
    confidence_scores: Distribution;
    error_rate: number;
  };

  // Model performance tracking
  performance: {
    daily_accuracy: TimeSeries;
    strategy_success_rates: Map<string, TimeSeries>;
    calibration_error: TimeSeries;
    feature_drift: Map<string, number>;
  };

  // Alerts
  alerts: {
    accuracy_drop: {
      threshold: 0.05;  // Alert if drops 5%
      window: '24 hours';
    };
    latency_spike: {
      threshold: 200;   // Alert if p95 > 200ms
      window: '5 minutes';
    };
    feature_drift: {
      threshold: 0.15;  // Alert if feature distribution shifts 15%
      window: '7 days';
    };
    prediction_bias: {
      threshold: 0.10;  // Alert if bias towards one strategy
      window: '24 hours';
    };
  };
}
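The `feature_drift` alert above needs a concrete statistic; the Population Stability Index (PSI) between training and live feature distributions is one common choice (an assumption here, not something the spec mandates).

```python
# Sketch: PSI between a reference (training) sample and a live sample of one
# feature, computed over shared histogram bins.
import numpy as np

def population_stability_index(expected, actual, n_bins: int = 10) -> float:
    """PSI over shared bins; a small epsilon avoids log(0) on empty bins."""
    eps = 1e-6
    edges = np.histogram_bin_edges(np.concatenate([expected, actual]), bins=n_bins)
    e_pct = np.histogram(expected, bins=edges)[0] / len(expected) + eps
    a_pct = np.histogram(actual, bins=edges)[0] / len(actual) + eps
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))
```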

6. Continuous Learning

6.1 Feedback Loop

class FeedbackCollector {
  async recordOutcome(
    recoveryId: string,
    prediction: RecoveryPrediction,
    actualOutcome: RecoveryOutcome
  ): Promise<void> {
    // Store prediction and outcome for model retraining
    await this.database.insert('model_feedback', {
      recovery_id: recoveryId,
      prediction: {
        strategy: prediction.strategy,
        probability: prediction.probability,
        confidence: prediction.confidence,
        model_version: prediction.modelVersion
      },
      outcome: {
        success: actualOutcome.success,
        strategy_used: actualOutcome.strategyUsed,
        time_to_recovery: actualOutcome.timeToRecovery,
        revenue_recovered: actualOutcome.revenueRecovered
      },
      features: prediction.features,
      timestamp: new Date()
    });

    // Update real-time metrics
    await this.metricsTracker.updateAccuracy(
      prediction.strategy === actualOutcome.strategyUsed
    );
  }

  async collectRetrainingData(): Promise<TrainingDataset> {
    // Collect feedback since last training
    const feedback = await this.database.query(`
      SELECT * FROM model_feedback
      WHERE timestamp > :last_training_date
      AND outcome IS NOT NULL
    `);

    return this.prepareRetrainingDataset(feedback);
  }
}

6.2 Online Learning Strategy

interface OnlineLearningStrategy {
  // Incremental updates
  incremental: {
    enabled: true;
    update_frequency: 'daily';
    min_samples_per_update: 1000;
    learning_rate_decay: 0.95;
  };

  // Model versioning
  versioning: {
    maintain_versions: 5;
    rollback_enabled: true;
    champion_challenger: {
      enabled: true;
      challenger_traffic: 0.10;
      evaluation_period: '1 week';
    };
  };

  // Adaptive learning
  adaptive: {
    merchant_specific_models: true;
    regional_adaptations: true;
    seasonal_adjustments: true;
  };
}

7. Model Interpretability

7.1 SHAP Values

import shap

class ModelExplainer:
    def __init__(self, models):
        self.models = models
        self.explainers = {
            'xgboost': shap.TreeExplainer(models['xgboost']),
            'neural_network': shap.DeepExplainer(models['neural_network']),
            'random_forest': shap.TreeExplainer(models['random_forest'])
        }

    def explain_prediction(self, features, prediction):
        """Generate SHAP explanation for a single prediction"""

        # Get SHAP values from each model
        shap_values = {}
        for model_name, explainer in self.explainers.items():
            shap_values[model_name] = explainer.shap_values(features)

        # Combine SHAP values using ensemble weights
        ensemble_shap = self._combine_shap_values(shap_values)

        # Get top contributing features
        top_features = self._get_top_features(ensemble_shap, n=10)

        return {
            'prediction': prediction,
            'feature_contributions': top_features,
            'shap_values': ensemble_shap,
            'explanation_text': self._generate_explanation_text(top_features)
        }

    def _generate_explanation_text(self, top_features):
        """Generate human-readable explanation"""
        explanations = []

        for feature, contribution in top_features:
            if contribution > 0:
                explanations.append(
                    f"{feature} increased the likelihood of this strategy by {contribution:.1%}"
                )
            else:
                explanations.append(
                    f"{feature} decreased the likelihood of this strategy by {abs(contribution):.1%}"
                )

        return "\n".join(explanations)

7.2 Decision Transparency

interface PredictionExplanation {
  // High-level explanation
  summary: string;

  // Feature contributions
  topFeatures: Array<{
    name: string;
    value: number;
    contribution: number;
    description: string;
  }>;

  // Strategy reasoning
  reasoning: {
    whyRecommended: string;
    alternativeStrategies: Array<{
      strategy: string;
      probability: number;
      reason: string;
    }>;
  };

  // Confidence factors
  confidenceFactors: {
    modelAgreement: number;     // How much models agree
    historicalAccuracy: number; // Model accuracy on similar cases
    dataQuality: number;        // Completeness of input data
  };
}

8. Ethical AI & Fairness

8.1 Fairness Constraints

class FairnessValidator:
    def validate_fairness(self, model, test_data):
        """Ensure model doesn't discriminate based on protected attributes"""

        metrics = {}

        # Test for demographic parity
        metrics['demographic_parity'] = self._test_demographic_parity(
            model, test_data,
            protected_attributes=['geographic_region', 'customer_segment']
        )

        # Test for equal opportunity
        metrics['equal_opportunity'] = self._test_equal_opportunity(
            model, test_data,
            protected_attributes=['geographic_region', 'customer_segment']
        )

        # Test for calibration across groups
        metrics['calibration'] = self._test_calibration_across_groups(
            model, test_data,
            protected_attributes=['geographic_region', 'customer_segment']
        )

        return metrics

    def _test_demographic_parity(self, model, data, protected_attributes):
        """Ensure similar prediction rates across demographic groups"""
        results = {}

        for attr in protected_attributes:
            groups = data[attr].unique()
            prediction_rates = {}

            for group in groups:
                group_data = data[data[attr] == group]
                predictions = model.predict(group_data)
                prediction_rates[group] = (predictions == 'recovered').mean()

            # Calculate disparity
            max_rate = max(prediction_rates.values())
            min_rate = min(prediction_rates.values())
            disparity = (max_rate - min_rate) / max_rate

            results[attr] = {
                'rates': prediction_rates,
                'disparity': disparity,
                'passes': disparity < 0.15  # Max 15% disparity allowed
            }

        return results

8.2 Bias Mitigation

interface BiasMitigation {
  // Pre-processing
  preprocessing: {
    balanced_sampling: boolean;
    protected_attribute_removal: string[];
    fairness_aware_encoding: boolean;
  };

  // In-processing
  inprocessing: {
    fairness_constraints: boolean;
    adversarial_debiasing: boolean;
    prejudice_remover: boolean;
  };

  // Post-processing
  postprocessing: {
    threshold_optimization: boolean;
    calibration_adjustment: boolean;
    reject_option_classification: boolean;
  };
}
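The `balanced_sampling` flag above can be realized by downsampling every class to the minority-class count before training; a minimal sketch (the `strategy` label key is illustrative).

```python
# Sketch: class-balanced downsampling for the pre-processing step above.
# Deterministic via a seeded RNG so retraining runs are reproducible.
import random
from collections import defaultdict

def balanced_downsample(rows, label_key="strategy", seed=42):
    """Downsample each class to the size of the smallest class."""
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)
    target = min(len(group) for group in by_class.values())
    rng = random.Random(seed)
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, target))
    return balanced

# 8 vs 2 samples -> both classes downsampled to 2
sample = balanced_downsample(
    [{"strategy": "delayed_retry"}] * 8 + [{"strategy": "installments"}] * 2
)
```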

9. Processor Selection

9.1 Performance Targets

Metric                       | Target
Prediction Accuracy          | > 75%
Recovery Rate                | > 30%
Inference Latency (p95)      | < 100ms
Model Confidence Calibration | > 80%

9.2 Processor Selection Algorithm

interface ProcessorScore {
  processorId: string;
  score: number;
  factors: {
    historicalSuccessRate: number;
    cardBINCompatibility: number;
    geographicOptimization: number;
    costEfficiency: number;
    responseTime: number;
  };
}

class ProcessorSelector {
  async selectOptimalProcessor(
    failureContext: FailureContext,
    customerContext: CustomerContext,
    availableProcessors: Processor[]
  ): Promise<ProcessorScore[]> {
    const scores: ProcessorScore[] = [];

    for (const processor of availableProcessors) {
      const score = await this.calculateProcessorScore(
        processor,
        failureContext,
        customerContext
      );
      scores.push(score);
    }

    // Sort by score descending
    return scores.sort((a, b) => b.score - a.score);
  }

  private async calculateProcessorScore(
    processor: Processor,
    failureContext: FailureContext,
    customerContext: CustomerContext
  ): Promise<ProcessorScore> {
    // Fetch historical data
    const historicalSuccessRate = await this.getProcessorSuccessRate(
      processor.id,
      failureContext.failureCategory,
      customerContext.customerSegment
    );

    // Check BIN routing tables
    const cardBINCompatibility = await this.checkBINCompatibility(
      processor.id,
      failureContext.cardBIN
    );

    // Geographic optimization
    const geographicOptimization = this.calculateGeoScore(
      processor.supportedRegions,
      customerContext.country
    );

    // Cost analysis
    const costEfficiency = this.calculateCostScore(
      processor.fees,
      failureContext.amount
    );

    // Performance metrics
    const responseTime = await this.getAverageResponseTime(processor.id);

    // Weighted scoring
    const score = (
      historicalSuccessRate * 0.40 +
      cardBINCompatibility * 0.25 +
      geographicOptimization * 0.15 +
      costEfficiency * 0.10 +
      responseTime * 0.10
    );

    return {
      processorId: processor.id,
      score,
      factors: {
        historicalSuccessRate,
        cardBINCompatibility,
        geographicOptimization,
        costEfficiency,
        responseTime
      }
    };
  }
}

10. Recovery Strategies

10.1 Alternative Processor Routing

Strategy Logic

class AlternativeProcessorStrategy implements RecoveryStrategy {
  async execute(context: RecoveryContext): Promise<RecoveryResult> {
    // 1. Select optimal alternative processor
    const processors = await this.processorSelector.selectOptimalProcessor(
      context.failure,
      context.customer,
      context.availableProcessors.filter(p => p.id !== context.originalProcessor)
    );

    if (processors.length === 0) {
      return { success: false, reason: 'No alternative processors available' };
    }

    // 2. Attempt payment with top 3 processors in sequence
    for (const processor of processors.slice(0, 3)) {
      try {
        const result = await this.attemptPayment(
          processor.processorId,
          context.payment,
          context.customer
        );

        if (result.success) {
          // Track success for ML model
          await this.trackStrategySuccess(processor.processorId, context);

          return {
            success: true,
            transactionId: result.transactionId,
            processor: processor.processorId,
            strategy: 'alternative_processor'
          };
        }
      } catch (error) {
        // Log failure and try next processor
        await this.trackStrategyFailure(processor.processorId, context, error);
        continue;
      }
    }

    return { success: false, reason: 'All alternative processors failed' };
  }
}

10.2 Delayed Retry Strategy

Strategy Logic

class DelayedRetryStrategy implements RecoveryStrategy {
  async execute(context: RecoveryContext): Promise<RecoveryResult> {
    // Calculate optimal retry time based on failure type
    const retrySchedule = this.calculateRetrySchedule(context.failure);

    // Schedule retry attempts
    for (const retryTime of retrySchedule) {
      await this.scheduleRetry(context.recoveryId, retryTime);

      // Send customer notification
      await this.notificationService.sendRetryNotification(
        context.customer,
        retryTime,
        context.payment.amount
      );
    }

    return {
      success: true,
      strategy: 'delayed_retry',
      nextAttemptAt: retrySchedule[0]
    };
  }

  private calculateRetrySchedule(failure: FailureContext): Date[] {
    const schedule: Date[] = [];
    const now = new Date();

    switch (failure.category) {
      case FailureCategory.INSUFFICIENT_FUNDS:
        // Retry after typical pay cycles
        schedule.push(addDays(now, 3));   // Next pay cycle
        schedule.push(addDays(now, 7));   // Bi-weekly
        schedule.push(addDays(now, 15));  // Monthly
        break;

      case FailureCategory.PROCESSING_ERROR:
        // Quick retries for transient errors
        schedule.push(addMinutes(now, 15));
        schedule.push(addHours(now, 1));
        schedule.push(addHours(now, 6));
        break;

      case FailureCategory.DO_NOT_HONOR:
        // Medium-term retries
        schedule.push(addDays(now, 1));
        schedule.push(addDays(now, 3));
        schedule.push(addDays(now, 7));
        break;

      default:
        schedule.push(addHours(now, 24));
        schedule.push(addDays(now, 3));
        break;
    }

    return schedule;
  }
}

10.3 Alternative Payment Method Strategy

Strategy Logic

class AlternativePaymentMethodStrategy implements RecoveryStrategy {
  async execute(context: RecoveryContext): Promise<RecoveryResult> {
    // Generate recovery UI for customer
    const recoverySession = await this.createRecoverySession(context);

    // Determine alternative payment methods to offer
    const alternativeMethods = this.selectAlternativeMethods(
      context.payment.paymentMethod.type,
      context.customer.country
    );

    // Send notification to customer
    await this.notificationService.sendPaymentMethodUpdateRequest(
      context.customer,
      recoverySession.url,
      alternativeMethods
    );

    return {
      success: true,
      strategy: 'alternative_payment_method',
      recoveryUrl: recoverySession.url,
      status: 'customer_action_required'
    };
  }

  private selectAlternativeMethods(
    failedMethod: PaymentMethodType,
    country: string
  ): PaymentMethodType[] {
    const alternatives: PaymentMethodType[] = [];

    // Always offer these as alternatives
    if (failedMethod !== 'bank_account') {
      alternatives.push('bank_account');
    }
    if (failedMethod !== 'card') {
      alternatives.push('card');
    }

    // Region-specific alternatives
    switch (country) {
      case 'US':
        alternatives.push('venmo', 'cashapp', 'paypal');
        break;
      case 'GB':
        alternatives.push('open_banking', 'paypal');
        break;
      case 'DE':
        alternatives.push('sofort', 'giropay', 'paypal');
        break;
      default:
        alternatives.push('paypal');
    }

    return alternatives;
  }
}

10.4 Installment Plan Strategy

Strategy Logic

class InstallmentStrategy implements RecoveryStrategy {
  async execute(context: RecoveryContext): Promise<RecoveryResult> {
    // Only offer installments for insufficient funds failures
    if (context.failure.category !== FailureCategory.INSUFFICIENT_FUNDS) {
      return { success: false, reason: 'Installments not applicable for this failure type' };
    }

    // Calculate installment plans
    const plans = this.calculateInstallmentPlans(context.payment.amount.value);

    // Create recovery session with installment options
    const recoverySession = await this.createInstallmentSession(context, plans);

    // Notify customer
    await this.notificationService.sendInstallmentOffer(
      context.customer,
      recoverySession.url,
      plans
    );

    return {
      success: true,
      strategy: 'installments',
      recoveryUrl: recoverySession.url,
      status: 'customer_action_required'
    };
  }

  private calculateInstallmentPlans(amount: number): InstallmentPlan[] {
    const plans: InstallmentPlan[] = [];

    // Minimum $50 per installment
    const minInstallmentAmount = 5000; // cents

    // Note: Math.ceil can overshoot the total by a few cents; the final
    // installment must be adjusted down so the plan sums to exactly `amount`.
    if (amount >= minInstallmentAmount * 2) {
      plans.push({
        numberOfPayments: 2,
        paymentAmount: Math.ceil(amount / 2),
        frequency: 'biweekly'
      });
    }

    if (amount >= minInstallmentAmount * 3) {
      plans.push({
        numberOfPayments: 3,
        paymentAmount: Math.ceil(amount / 3),
        frequency: 'monthly'
      });
    }

    if (amount >= minInstallmentAmount * 4) {
      plans.push({
        numberOfPayments: 4,
        paymentAmount: Math.ceil(amount / 4),
        frequency: 'monthly'
      });
    }

    return plans;
  }
}
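As a worked example, a $300.00 failure qualifies for all three plan sizes. The snippet below is a standalone mirror of the plan-calculation routine above, used only to demonstrate its output:

```typescript
interface InstallmentPlan {
  numberOfPayments: number;
  paymentAmount: number; // cents
  frequency: 'biweekly' | 'monthly';
}

// Standalone mirror of calculateInstallmentPlans, for illustration.
function calculateInstallmentPlans(amount: number): InstallmentPlan[] {
  const plans: InstallmentPlan[] = [];
  const minInstallmentAmount = 5000; // $50 minimum per installment, in cents
  if (amount >= minInstallmentAmount * 2) {
    plans.push({ numberOfPayments: 2, paymentAmount: Math.ceil(amount / 2), frequency: 'biweekly' });
  }
  if (amount >= minInstallmentAmount * 3) {
    plans.push({ numberOfPayments: 3, paymentAmount: Math.ceil(amount / 3), frequency: 'monthly' });
  }
  if (amount >= minInstallmentAmount * 4) {
    plans.push({ numberOfPayments: 4, paymentAmount: Math.ceil(amount / 4), frequency: 'monthly' });
  }
  return plans;
}

// A $300.00 (30000-cent) failure qualifies for 2-, 3-, and 4-part plans.
const plans = calculateInstallmentPlans(30000);
// plans[0]: 2 x $150.00 biweekly; plans[2]: 4 x $75.00 monthly
```

A $90.00 failure, by contrast, yields no plans, since even two installments would fall below the $50 minimum.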

6. Customer Recovery Experience

6.1 Recovery Flow UI

6.1.1 Email Notification Template

<!DOCTYPE html>
<html>
<head>
  <style>
    /* Responsive email template */
  </style>
</head>
<body>
  <div class="container">
    <img src="{{merchant_logo}}" alt="{{merchant_name}}" />

    <h1>Payment Issue - Easy Fix Available</h1>

    <p>Hi {{customer_first_name}},</p>

    <p>We noticed your recent payment of <strong>{{amount}}</strong> for
    <strong>{{order_description}}</strong> couldn't be processed.</p>

    <div class="issue-box">
      <strong>Issue:</strong> {{failure_message}}
    </div>

    <p>Good news! We have several easy ways to complete your purchase:</p>

    <div class="options">
      {{#if show_retry}}
      <div class="option">
        <h3>🔄 Retry Payment</h3>
        <p>We'll automatically retry your payment on {{retry_date}}</p>
      </div>
      {{/if}}

      {{#if show_alternative_method}}
      <div class="option">
        <h3>💳 Use Different Payment Method</h3>
        <p>Add a different card, bank account, or digital wallet</p>
      </div>
      {{/if}}

      {{#if show_installments}}
      <div class="option">
        <h3>📅 Pay in Installments</h3>
        <p>Split your payment into {{installment_count}} smaller payments</p>
      </div>
      {{/if}}
    </div>

    <a href="{{recovery_url}}" class="cta-button">Complete Your Purchase</a>

    <p class="footer">
      This link expires in {{expiry_hours}} hours.
      <a href="{{contact_url}}">Need help?</a>
    </p>
  </div>
</body>
</html>

6.1.2 Recovery UI Components

interface RecoveryUIConfig {
  merchantBranding: {
    logo: string;
    primaryColor: string;
    fontFamily?: string;
  };
  paymentDetails: {
    amount: Money;
    description: string;
    orderId: string;
  };
  availableOptions: RecoveryOption[];
  customerInfo: {
    name: string;
    email: string;
  };
  expiresAt: Date;
}

interface RecoveryOption {
  type: 'retry' | 'alternative_method' | 'installments';
  title: string;
  description: string;
  icon: string;
  recommended?: boolean;
  metadata?: {
    retryDate?: Date;
    availableMethods?: PaymentMethodType[];
    installmentPlans?: InstallmentPlan[];
  };
}
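An illustrative option object for the delayed-retry path (the title, icon, and date are hypothetical sample values, not shipped defaults):

```typescript
// Minimal local copy of RecoveryOption so the example stands alone.
interface RecoveryOption {
  type: 'retry' | 'alternative_method' | 'installments';
  title: string;
  description: string;
  icon: string;
  recommended?: boolean;
  metadata?: { retryDate?: Date };
}

// Hypothetical sample values; real copy comes from merchant configuration.
const retryOption: RecoveryOption = {
  type: 'retry',
  title: 'Retry Payment',
  description: "We'll automatically retry your payment",
  icon: '🔄',
  recommended: true,
  metadata: { retryDate: new Date('2025-10-09T09:00:00Z') }
};
```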

6.2 SMS Notification

{{merchant_name}}: Payment issue for order #{{order_id}}. Complete your ${{amount}} purchase here: {{short_url}}

Options available:
- Retry payment
- Different payment method
- Pay in installments

Link expires in {{expiry_hours}}h.
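Both the email and SMS templates use `{{placeholder}}` interpolation. A minimal render sketch (the `render` helper is an assumption for illustration; the `{{#if}}` blocks in the email template would need a real template engine such as Handlebars):

```typescript
// Naive {{placeholder}} substitution; unknown keys render as empty strings.
// Does NOT handle {{#if}} conditionals - use a real template engine for those.
function render(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_m: string, key: string) => vars[key] ?? '');
}

const sms = render(
  '{{merchant_name}}: Payment issue for order #{{order_id}}. ' +
  'Complete your ${{amount}} purchase here: {{short_url}}',
  { merchant_name: 'Acme', order_id: '1042', amount: '150.00', short_url: 'https://hnds.in/r/abc' }
);
// sms: "Acme: Payment issue for order #1042. Complete your $150.00 purchase here: https://hnds.in/r/abc"
```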

8. Data Storage & Analytics

8.1 Database Schema

-- Schema uses MySQL syntax (inline INDEX declarations). On PostgreSQL,
-- declare indexes with separate CREATE INDEX statements and use JSONB.

-- Recovery sessions
CREATE TABLE recovery_sessions (
  id VARCHAR(36) PRIMARY KEY,
  merchant_id VARCHAR(36) NOT NULL,
  merchant_order_id VARCHAR(255) NOT NULL,
  customer_id VARCHAR(36) NOT NULL,
  status VARCHAR(50) NOT NULL,
  created_at TIMESTAMP NOT NULL,
  updated_at TIMESTAMP NOT NULL,
  expires_at TIMESTAMP NOT NULL,
  completed_at TIMESTAMP,

  -- Original payment details
  original_amount INTEGER NOT NULL,
  original_currency VARCHAR(3) NOT NULL,
  original_processor VARCHAR(100) NOT NULL,
  original_payment_method_type VARCHAR(50) NOT NULL,

  -- Failure details
  failure_category VARCHAR(50) NOT NULL,
  failure_code VARCHAR(100) NOT NULL,
  failure_message TEXT,
  failure_timestamp TIMESTAMP NOT NULL,

  -- Recovery details
  current_strategy VARCHAR(50),
  strategy_confidence DECIMAL(3,2),
  total_attempts INTEGER DEFAULT 0,
  recovery_url TEXT,

  -- Result
  recovered_amount INTEGER,
  recovered_at TIMESTAMP,
  final_strategy VARCHAR(50),
  final_processor VARCHAR(100),
  final_transaction_id VARCHAR(255),

  INDEX idx_merchant_order (merchant_id, merchant_order_id),
  INDEX idx_customer (customer_id),
  INDEX idx_status (status),
  INDEX idx_created_at (created_at)
);

-- Recovery attempts
CREATE TABLE recovery_attempts (
  id VARCHAR(36) PRIMARY KEY,
  recovery_id VARCHAR(36) NOT NULL,
  attempt_number INTEGER NOT NULL,
  strategy VARCHAR(50) NOT NULL,
  processor VARCHAR(100),
  status VARCHAR(50) NOT NULL,
  started_at TIMESTAMP NOT NULL,
  completed_at TIMESTAMP,

  -- Result
  success BOOLEAN,
  transaction_id VARCHAR(255),
  failure_reason TEXT,
  failure_code VARCHAR(100),

  -- Performance metrics
  processing_time_ms INTEGER,

  FOREIGN KEY (recovery_id) REFERENCES recovery_sessions(id),
  INDEX idx_recovery (recovery_id),
  INDEX idx_strategy (strategy),
  INDEX idx_processor (processor)
);

-- Customer interactions
CREATE TABLE recovery_interactions (
  id VARCHAR(36) PRIMARY KEY,
  recovery_id VARCHAR(36) NOT NULL,
  interaction_type VARCHAR(50) NOT NULL,
  timestamp TIMESTAMP NOT NULL,

  -- Context
  device_type VARCHAR(50),
  browser VARCHAR(100),
  location VARCHAR(100),

  -- Details
  details JSON, -- JSONB on PostgreSQL

  FOREIGN KEY (recovery_id) REFERENCES recovery_sessions(id),
  INDEX idx_recovery (recovery_id),
  INDEX idx_type (interaction_type)
);

-- Processor performance metrics
CREATE TABLE processor_performance (
  id VARCHAR(36) PRIMARY KEY,
  processor VARCHAR(100) NOT NULL,
  failure_category VARCHAR(50) NOT NULL,

  -- Time period
  date DATE NOT NULL,
  hour INTEGER,

  -- Metrics
  total_attempts INTEGER DEFAULT 0,
  successful_attempts INTEGER DEFAULT 0,
  failed_attempts INTEGER DEFAULT 0,
  success_rate DECIMAL(5,4),
  avg_processing_time_ms INTEGER,

  -- Amount metrics
  total_amount INTEGER DEFAULT 0,
  recovered_amount INTEGER DEFAULT 0,

  INDEX idx_processor_date (processor, date),
  INDEX idx_category (failure_category)
);
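The success_rate column in processor_performance is a derived value. A sketch of the rollup, assumed to run in an analytics worker before each row is written (the function name is an illustration, not an existing service):

```typescript
interface HourlyRollup {
  totalAttempts: number;
  successfulAttempts: number;
  failedAttempts: number;
  successRate: number | null; // matches DECIMAL(5,4); null when no attempts
}

// Aggregate raw attempt outcomes into one processor_performance row.
function rollupAttempts(outcomes: boolean[]): HourlyRollup {
  const totalAttempts = outcomes.length;
  const successfulAttempts = outcomes.filter(Boolean).length;
  const failedAttempts = totalAttempts - successfulAttempts;
  const successRate =
    totalAttempts === 0
      ? null // avoid divide-by-zero; store NULL rather than 0
      : Number((successfulAttempts / totalAttempts).toFixed(4));
  return { totalAttempts, successfulAttempts, failedAttempts, successRate };
}

const row = rollupAttempts([true, true, false, true]);
// row.successRate: 0.75
```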

8.2 Analytics & Metrics

8.2.1 Key Metrics to Track

interface RecoveryMetrics {
  // Overall performance
  totalRecoveries: number;
  successfulRecoveries: number;
  failedRecoveries: number;
  overallSuccessRate: number;

  // Amount metrics
  totalAttemptedAmount: Money;
  totalRecoveredAmount: Money;
  recoveryRate: number;

  // Time metrics
  averageTimeToRecovery: number; // hours
  medianTimeToRecovery: number;

  // Strategy performance
  strategyBreakdown: {
    [strategy: string]: {
      attempts: number;
      successes: number;
      successRate: number;
      avgTimeToRecovery: number;
    };
  };

  // Processor performance
  processorBreakdown: {
    [processor: string]: {
      attempts: number;
      successes: number;
      successRate: number;
      avgProcessingTime: number;
    };
  };

  // Failure category analysis
  failureCategoryBreakdown: {
    [category: string]: {
      total: number;
      recovered: number;
      recoveryRate: number;
    };
  };

  // Customer behavior
  customerInteractionRate: number;
  averageTimeToInteraction: number;
  customerDropoffRate: number;
}
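Note that overallSuccessRate is count-based while recoveryRate is amount-based, so the two can diverge when large and small payments recover at different rates. A minimal sketch of the distinction (names are illustrative):

```typescript
interface RecoveryOutcome { recovered: boolean; amount: number } // amount in cents

function summarize(outcomes: RecoveryOutcome[]) {
  const successes = outcomes.filter(o => o.recovered);
  const attemptedAmount = outcomes.reduce((s, o) => s + o.amount, 0);
  const recoveredAmount = successes.reduce((s, o) => s + o.amount, 0);
  return {
    overallSuccessRate: successes.length / outcomes.length, // count-based
    recoveryRate: recoveredAmount / attemptedAmount         // amount-based
  };
}

// Recovering 1 of 2 payments, but the larger one: 50% by count, 80% by value.
const s = summarize([
  { recovered: true, amount: 40000 },
  { recovered: false, amount: 10000 }
]);
// s.overallSuccessRate: 0.5, s.recoveryRate: 0.8
```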

8.2.2 Dashboard Queries

class RecoveryAnalytics {
  async getRecoveryMetrics(
    merchantId: string,
    startDate: Date,
    endDate: Date
  ): Promise<RecoveryMetrics> {
    // Implementation would query database and calculate metrics
  }

  async getStrategyEffectiveness(
    strategy: string,
    failureCategory: FailureCategory
  ): Promise<StrategyEffectiveness> {
    // Analyze which strategies work best for which failure types
  }

  async getProcessorRecommendations(
    merchantId: string,
    failureCategory: FailureCategory
  ): Promise<ProcessorRecommendation[]> {
    // Return ranked list of processors for specific failure types
  }

  async predictRecoveryProbability(
    failureContext: FailureContext,
    customerContext: CustomerContext
  ): Promise<number> {
    // ML model prediction of recovery success
    // See Payment_Recovery_ML_Model_Spec.md for details
  }
}

9. Security & Compliance

9.1 Data Security

9.1.1 Sensitive Data Handling

  • All payment card data encrypted at rest using AES-256
  • PCI DSS Level 1 compliance
  • No storage of full card numbers (only last 4 digits + BIN)
  • Tokenization for all payment methods
  • TLS 1.3 for data in transit

9.1.2 API Security

interface SecurityControls {
  authentication: {
    type: 'api_key' | 'oauth2';
    keyRotation: number; // days
    rateLimiting: {
      requestsPerMinute: number;
      burstLimit: number;
    };
  };

  webhookSecurity: {
    signatureValidation: boolean;
    secretRotation: number; // days
    ipWhitelisting?: string[];
  };

  dataAccess: {
    encryptionAtRest: boolean;
    encryptionInTransit: boolean;
    dataRetention: number; // days
    autoRedaction: boolean;
  };
}
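An illustrative configuration satisfying this interface (all values are hypothetical defaults, not shipped settings):

```typescript
// Local copy of the SecurityControls interface so the example stands alone.
interface SecurityControls {
  authentication: {
    type: 'api_key' | 'oauth2';
    keyRotation: number; // days
    rateLimiting: { requestsPerMinute: number; burstLimit: number };
  };
  webhookSecurity: {
    signatureValidation: boolean;
    secretRotation: number; // days
    ipWhitelisting?: string[];
  };
  dataAccess: {
    encryptionAtRest: boolean;
    encryptionInTransit: boolean;
    dataRetention: number; // days
    autoRedaction: boolean;
  };
}

// Hypothetical defaults for illustration.
const defaultControls: SecurityControls = {
  authentication: {
    type: 'api_key',
    keyRotation: 90,
    rateLimiting: { requestsPerMinute: 600, burstLimit: 100 }
  },
  webhookSecurity: { signatureValidation: true, secretRotation: 180 },
  dataAccess: {
    encryptionAtRest: true,
    encryptionInTransit: true,
    dataRetention: 90, // aligned with the 90-day retention policy in 9.2.1
    autoRedaction: true
  }
};
```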

9.2 Privacy & Compliance

9.2.1 Data Retention

  • Recovery session data: 90 days
  • Customer interaction logs: 90 days
  • Analytics aggregates: 2 years
  • Auto-deletion of PII after retention period
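
The auto-deletion job reduces to a pure cutoff check per record; a minimal sketch (the helper name is an assumption):

```typescript
// True when a record created at `createdAt` has exceeded its retention
// window and is due for deletion/redaction.
function isPastRetention(
  createdAt: Date,
  retentionDays: number,
  now: Date = new Date()
): boolean {
  const ageMs = now.getTime() - createdAt.getTime();
  return ageMs > retentionDays * 24 * 60 * 60 * 1000;
}

// A recovery session created 91 days ago exceeds the 90-day window.
const created = new Date('2025-07-07T00:00:00Z');
const expired = isPastRetention(created, 90, new Date('2025-10-06T00:00:00Z'));
// expired: true
```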

9.2.2 GDPR Compliance

  • Right to erasure (customer data deletion API)
  • Data portability (export API)
  • Consent management for notifications
  • Data processing agreements with processors

9.2.3 PSD2 Compliance (EU)

  • Strong Customer Authentication (SCA) for retries
  • Dynamic linking for payment confirmations
  • Transaction monitoring and reporting

10. Implementation Roadmap

Phase 1: MVP (Months 1-3)

Core Features:

  • Basic API for failed payment submission
  • Simple processor routing (2-3 alternative processors)
  • Delayed retry strategy
  • Email notifications
  • Basic recovery UI
  • Webhook events (initiated, completed, failed)
  • JavaScript SDK

Success Criteria:

  • 10+ beta merchants onboarded
  • 20% recovery rate on failed payments
  • < 500ms API response time

Phase 2: Intelligence (Months 4-6)

Core Features:

  • ML-based routing decisions (see ML Model Spec in this document)
  • Alternative payment method strategy
  • Installment plan offering
  • SMS notifications
  • Mobile SDKs (iOS/Android)
  • Advanced analytics dashboard
  • A/B testing framework

Success Criteria:

  • 50+ active merchants
  • 30% recovery rate
  • ML model accuracy > 70%

Phase 3: Scale (Months 7-9)

Core Features:

  • Multi-region support
  • 10+ payment processor integrations
  • Advanced fraud detection
  • Custom recovery flow builder
  • White-label solutions
  • Real-time decisioning (< 100ms)

Success Criteria:

  • 200+ active merchants
  • 35% recovery rate
  • 99.9% uptime SLA

Phase 4: Enterprise (Months 10-12)

Core Features:

  • Enterprise SLA tiers
  • Custom ML model training
  • Dedicated support
  • Advanced reporting & BI tools
  • Multi-merchant orchestration
  • Global processor network

Success Criteria:

  • 500+ active merchants
  • 40% recovery rate
  • Enterprise customer acquisition

11. Pricing Model

11.1 Pricing Structure

interface PricingTier {
  name: string;
  monthlyFee: number | 'Custom'; // Enterprise pricing is negotiated
  successFee: {
    percentage: number;
    perTransaction: number; // cents
  };
  limits: {
    monthlyRecoveries: number; // -1 = unlimited
    apiCalls: number;          // -1 = unlimited
  };
  features: string[];
}

const pricingTiers: PricingTier[] = [
  {
    name: 'Starter',
    monthlyFee: 0,
    successFee: {
      percentage: 5.0, // 5% of recovered amount
      perTransaction: 50 // $0.50
    },
    limits: {
      monthlyRecoveries: 100,
      apiCalls: 10000
    },
    features: [
      'Basic processor routing',
      'Email notifications',
      'Standard recovery UI',
      'Basic analytics'
    ]
  },
  {
    name: 'Growth',
    monthlyFee: 299,
    successFee: {
      percentage: 3.5,
      perTransaction: 50
    },
    limits: {
      monthlyRecoveries: 1000,
      apiCalls: 100000
    },
    features: [
      'ML-powered routing',
      'SMS + Email notifications',
      'Custom recovery flows',
      'Advanced analytics',
      'Installment plans',
      'Priority support'
    ]
  },
  {
    name: 'Enterprise',
    monthlyFee: 'Custom',
    successFee: {
      percentage: 2.5,
      perTransaction: 50
    },
    limits: {
      monthlyRecoveries: -1, // unlimited
      apiCalls: -1
    },
    features: [
      'Everything in Growth',
      'Custom ML model training',
      'White-label solutions',
      'Dedicated support',
      'Custom SLA',
      'Multi-merchant management'
    ]
  }
];
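Under these tiers, the fee on a single recovered payment is the tier percentage plus the flat per-transaction charge. A minimal sketch (amounts in cents; the helper name is an assumption):

```typescript
interface SuccessFee { percentage: number; perTransaction: number }

// Fee charged on one recovered payment, in cents.
function successFeeCents(recoveredAmount: number, fee: SuccessFee): number {
  return Math.round(recoveredAmount * (fee.percentage / 100)) + fee.perTransaction;
}

// Growth tier on a recovered $150.00 payment: 3.5% ($5.25) + $0.50 = $5.75.
const growthFee = successFeeCents(15000, { percentage: 3.5, perTransaction: 50 });
// growthFee: 575
```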

11.2 ROI Calculator

function calculateROI(
  monthlyFailedPayments: number,
  averagePaymentAmount: number,
  currentRecoveryRate: number = 0
): ROIAnalysis {
  const handsInRecoveryRate = 0.30; // 30% average
  const additionalRecoveries = monthlyFailedPayments * (handsInRecoveryRate - currentRecoveryRate);
  const additionalRevenue = additionalRecoveries * averagePaymentAmount;
  // Simplified: applies the Growth-tier percentage only and omits the
  // $0.50 per-transaction fee.
  const serviceFee = additionalRevenue * 0.035; // 3.5% for Growth tier
  const netGain = additionalRevenue - serviceFee;

  return {
    additionalRecoveries: Math.round(additionalRecoveries),
    additionalRevenue: Math.round(additionalRevenue),
    serviceFee: Math.round(serviceFee),
    netGain: Math.round(netGain),
    roi: Math.round((netGain / serviceFee) * 100)
  };
}

// Example:
// 1000 failed payments/month
// $150 average payment
// 0% current recovery
// = 300 additional recoveries
// = $45,000 additional revenue
// = $1,575 service fee
// = $43,425 net gain
// = 2,757% ROI

12. Success Metrics & KPIs

12.1 Product KPIs

interface ProductKPIs {
  // Core metrics
  recoveryRate: number; // % of submitted failures that recover
  revenueRecovered: Money; // total $ recovered
  averageTimeToRecovery: number; // hours

  // Strategy metrics
  strategySuccessRates: Map<string, number>;
  strategyUsageDistribution: Map<string, number>;

  // Processor metrics
  processorSuccessRates: Map<string, number>;
  processorUtilization: Map<string, number>;

  // Customer experience
  customerInteractionRate: number; // % who engage with recovery flow
  customerSatisfaction: number; // CSAT score
  completionTime: number; // time from view to completion

  // Technical performance
  apiLatency: {
    p50: number;
    p95: number;
    p99: number;
  };
  uptime: number; // %
  errorRate: number; // %

  // Business metrics
  activeMerchants: number;
  revenuePerMerchant: Money;
  merchantRetention: number; // %
  nps: number; // Net Promoter Score
}

12.2 Success Targets (Year 1)

Metric                      Q1 Target   Q2 Target   Q3 Target   Q4 Target
Recovery Rate               20%         25%         30%         35%
Active Merchants            25          75          150         300
Monthly Recovered Revenue   $100K       $500K       $1.5M       $3M
API Latency (p95)           < 750ms     < 500ms     < 300ms     < 200ms
Customer Interaction Rate   40%         50%         60%         65%
Merchant NPS                30          40          50          60

Appendices

Appendix A: Model Hyperparameters

Complete hyperparameter configurations for all models.

Appendix B: Feature Dictionary

Detailed descriptions of all input features and their ranges.

Appendix C: Training Data Schema

Database schema for training data collection and storage.

Appendix D: Model Performance Benchmarks

Comprehensive benchmark results across different scenarios.

Appendix E: API for Model Serving

Technical API specification for model inference endpoints.


End of ML Model Specification

This document is subject to updates as the model evolves. Last updated: October 6, 2025