Chapter 5: Relationship Analysis

Purpose: Explore correlations between features and their relationships with the target to identify predictive signals.

What you'll learn:

  • How to interpret correlation matrices and identify multicollinearity
  • How to visualize feature distributions by target class
  • How to identify which features have the strongest relationship with retention
  • How to analyze categorical features for predictive power

Outputs:

  • Correlation heatmap with multicollinearity detection
  • Feature distributions by retention status (box plots)
  • Retention rates by categorical features
  • Feature-target correlation rankings

Understanding Feature Relationships

Analysis                      What It Tells You                                  Action
High Correlation (|r| ≥ 0.7)  Features carry redundant information               Consider removing one
Target Correlation            Feature's predictive power                         Prioritize features strongly correlated with the target
Class Separation              How different retained and churned customers look  Good separation = good predictor
Categorical Rates             Retention varies by category                       Use for segmentation and encoding
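
As a rough illustration of the "Categorical Rates" row, retention rates per category can be computed with a plain pandas group-by. This is only a sketch: the toy frame, the `plan` column, and the `retained` target are made-up stand-ins, not columns from this dataset.

```python
import pandas as pd

# Hypothetical toy data: `plan` and `retained` are illustrative names only.
toy = pd.DataFrame({
    "plan": ["basic", "basic", "basic", "pro", "pro", "pro", "pro"],
    "retained": [0, 0, 1, 1, 1, 1, 0],
})

# The mean of a 0/1 target is the retention rate; the count shows how much
# data backs each rate.
rates = (
    toy.groupby("plan")["retained"]
    .agg(retention_rate="mean", customers="count")
    .sort_values("retention_rate", ascending=False)
)
print(rates)
```

Categories with very few customers can show extreme rates by chance, so it helps to read the rate and the count together.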

5.1 Setup

In [1]:
from customer_retention.analysis.notebook_progress import track_and_export_previous

track_and_export_previous("05_relationship_analysis.ipynb")

import numpy as np
import pandas as pd
import plotly.graph_objects as go
import yaml
from plotly.subplots import make_subplots

from customer_retention.analysis.auto_explorer import ExplorationFindings, ExplorationManager, RecommendationRegistry
from customer_retention.analysis.visualization import ChartBuilder, display_figure
from customer_retention.core.config.column_config import ColumnType
from customer_retention.core.config.experiments import (
    FINDINGS_DIR,
)
from customer_retention.core.utils.leakage import detect_leaking_features
from customer_retention.stages.profiling import RecommendationCategory, RelationshipRecommender
In [2]:
from pathlib import Path

from customer_retention.analysis.auto_explorer import load_notebook_findings

FINDINGS_PATH, _namespace, dataset_name = load_notebook_findings("05_relationship_analysis.ipynb")
print(f"Using: {FINDINGS_PATH}")

RECOMMENDATIONS_PATH = FINDINGS_PATH.replace("_findings.yaml", "_recommendations.yaml")

findings = ExplorationFindings.load(FINDINGS_PATH)

# Load data - handle aggregated vs standard paths
from customer_retention.analysis.auto_explorer.active_dataset_store import load_active_dataset
from customer_retention.stages.temporal import TEMPORAL_METADATA_COLS

if "_aggregated" in FINDINGS_PATH:
    source_path = Path(findings.source_path)
    if not source_path.is_absolute():
        if str(source_path).startswith("experiments"):
            source_path = Path("..") / source_path
        else:
            source_path = FINDINGS_DIR / source_path.name
    if source_path.is_dir():
        from customer_retention.integrations.adapters.factory import get_delta
        df = get_delta(force_local=True).read(str(source_path))
    elif source_path.is_file():
        df = pd.read_parquet(source_path)
    else:
        df = load_active_dataset(_namespace, dataset_name)
    data_source = f"aggregated:{source_path.name}"
else:
    df = load_active_dataset(_namespace, dataset_name)
    data_source = dataset_name

_df_cols = set(df.columns)
findings.columns = {k: v for k, v in findings.columns.items() if k in _df_cols}
if findings.target_column and findings.target_column not in _df_cols:
    findings.target_column = None

charts = ChartBuilder()

if Path(RECOMMENDATIONS_PATH).exists():
    with open(RECOMMENDATIONS_PATH, "r") as f:
        registry = RecommendationRegistry.from_dict(yaml.safe_load(f))
    print(f"Loaded existing recommendations: {len(registry.all_recommendations)} total")
else:
    registry = RecommendationRegistry()
    registry.init_bronze(findings.source_path)
    _entity_col = (findings.time_series_metadata.entity_column
                   if findings.time_series_metadata else None)
    registry.init_silver(_entity_col or "entity_id")
    registry.init_gold(findings.target_column or "target")
    print("Initialized new recommendation registry")

print(f"\nLoaded {len(df):,} rows from: {data_source}")
Using: /Users/Vital/python/CustomerRetention/experiments/runs/email-6301db6c/datasets/customer_emails/findings/customer_emails_aggregated_findings.yaml
Loaded existing recommendations: 164 total

Loaded 4,998 rows from: aggregated:customer_emails_aggregated

5.1b Leakage Exclusion Gate

Features that leak target information are automatically detected and removed before relationship analysis. Add column names to EXCLUDE_LEAKING_FEATURES to manually exclude additional features you suspect of leakage.

In [3]:
EXCLUDE_LEAKING_FEATURES = []

_check_cols = [
    name for name, col in findings.columns.items()
    if col.inferred_type in [ColumnType.NUMERIC_CONTINUOUS, ColumnType.NUMERIC_DISCRETE, ColumnType.BINARY]
    and name != findings.target_column
    and name not in TEMPORAL_METADATA_COLS
]

_auto_leakers = detect_leaking_features(df, _check_cols, findings.target_column)
_all_excluded = sorted(set(_auto_leakers) | set(EXCLUDE_LEAKING_FEATURES))

if _all_excluded:
    for _col in _all_excluded:
        findings.columns.pop(_col, None)
    df = df.drop(columns=[c for c in _all_excluded if c in df.columns])
    findings.excluded_leaking_features = _all_excluded

    _auto_only = [c for c in _auto_leakers if c not in EXCLUDE_LEAKING_FEATURES]
    _manual_only = [c for c in EXCLUDE_LEAKING_FEATURES if c not in _auto_leakers]
    print(f"Excluded {len(_all_excluded)} leaking feature(s):")
    if _auto_only:
        print(f"  Auto-detected: {', '.join(_auto_only)}")
    if _manual_only:
        print(f"  Manual: {', '.join(_manual_only)}")
    _both = [c for c in _all_excluded if c in _auto_leakers and c in EXCLUDE_LEAKING_FEATURES]
    if _both:
        print(f"  Both: {', '.join(_both)}")
else:
    print("No leaking features detected.")
No leaking features detected.
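
The gate above relies on the project's `detect_leaking_features`, whose internals are not shown in this notebook. Purely to illustrate the idea, one simple heuristic is to flag features that are almost perfectly correlated with the target. The helper name `flag_suspect_leakers`, the 0.95 threshold, and the toy columns below are assumptions for this sketch, not the project's implementation.

```python
import numpy as np
import pandas as pd


def flag_suspect_leakers(frame, feature_cols, target_col, threshold=0.95):
    """Return features whose |Pearson r| with the target meets the threshold.

    Hypothetical helper for illustration; not the project's detector.
    """
    target_corr = frame[feature_cols].corrwith(frame[target_col]).abs()
    return sorted(target_corr[target_corr >= threshold].index)


# Toy data: `cancel_flag` is a copy of the target, `noise` is unrelated.
rng = np.random.default_rng(0)
toy = pd.DataFrame({"target": rng.integers(0, 2, size=200)})
toy["noise"] = rng.normal(size=200)
toy["cancel_flag"] = toy["target"]

suspects = flag_suspect_leakers(toy, ["noise", "cancel_flag"], "target")
print(suspects)
```

A check like this only catches linear, single-feature leaks; features recorded after the outcome can leak without a near-perfect correlation, which is why the manual `EXCLUDE_LEAKING_FEATURES` list is still worth reviewing.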

5.2 Numeric Correlation Matrix

📖 How to Read the Heatmap:

  • Red (+1): Perfect positive correlation - features move together
  • Blue (-1): Perfect negative correlation - features move in opposite directions
  • White (0): No linear relationship

⚠️ Multicollinearity Warning:

  • Pairs with |r| ≥ 0.7 may cause issues in linear models (unstable, hard-to-interpret coefficients)
  • Consider removing one feature from each highly correlated pair
  • Tree-based models are more robust to multicollinearity
In [4]:
numeric_cols = [
    name for name, col in findings.columns.items()
    if col.inferred_type in [ColumnType.NUMERIC_CONTINUOUS, ColumnType.NUMERIC_DISCRETE, ColumnType.TARGET]
    and name not in TEMPORAL_METADATA_COLS
]

if len(numeric_cols) >= 2:
    corr_matrix = df[numeric_cols].corr()
    fig = charts.heatmap(
        corr_matrix.values,
        x_labels=numeric_cols,
        y_labels=numeric_cols,
        title="Numeric Correlation Matrix"
    )
    display_figure(fig)
else:
    print("Not enough numeric columns for correlation analysis.")
[Figure: Numeric Correlation Matrix heatmap]

5.3 High Correlation Pairs

In [5]:
high_corr_threshold = 0.7
high_corr_pairs = []

if len(numeric_cols) >= 2:
    corr_matrix = df[numeric_cols].corr()
    for i in range(len(numeric_cols)):
        for j in range(i+1, len(numeric_cols)):
            corr_val = corr_matrix.iloc[i, j]
            if abs(corr_val) >= high_corr_threshold:
                high_corr_pairs.append({
                    "Column 1": numeric_cols[i],
                    "Column 2": numeric_cols[j],
                    "Correlation": f"{corr_val:.3f}"
                })

if high_corr_pairs:
    print(f"High Correlation Pairs (|r| >= {high_corr_threshold}):")
    display(pd.DataFrame(high_corr_pairs))
    print("\nConsider removing one of each pair to reduce multicollinearity.")
else:
    print("No high correlation pairs detected.")
High Correlation Pairs (|r| >= 0.7):
     Column 1                           Column 2                          Correlation
0    event_count_180d                   event_count_365d                  0.821
1    event_count_180d                   opened_count_180d                 1.000
2    event_count_180d                   clicked_count_180d                1.000
3    event_count_180d                   send_hour_sum_180d                0.975
4    event_count_180d                   send_hour_count_180d              1.000
...  ...                                ...                               ...
440  bounced_vs_cohort_mean             bounced_cohort_zscore             1.000
441  bounced_vs_cohort_pct              bounced_cohort_zscore             1.000
442  time_to_open_hours_vs_cohort_mean  time_to_open_hours_vs_cohort_pct  1.000
443  time_to_open_hours_vs_cohort_mean  time_to_open_hours_cohort_zscore  1.000
444  time_to_open_hours_vs_cohort_pct   time_to_open_hours_cohort_zscore  1.000

445 rows × 3 columns

Consider removing one of each pair to reduce multicollinearity.
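
With hundreds of flagged pairs, removing one feature per pair by hand is impractical. The sketch below shows one common greedy approach: scan the upper triangle of the absolute correlation matrix and drop any column that is highly correlated with a column appearing earlier. The function name `columns_to_drop` and the toy columns are illustrative assumptions, not part of the project.

```python
import numpy as np
import pandas as pd


def columns_to_drop(frame, threshold=0.7):
    """Greedy pass: flag a column if it correlates with any earlier column."""
    corr = frame.corr().abs()
    # Keep only the strict upper triangle so each pair is inspected once.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    upper = corr.where(mask)
    # Column j of `upper` holds its correlations with columns 0..j-1.
    return [col for col in upper.columns if (upper[col] >= threshold).any()]


# Toy data: `opens_180d` is a near-duplicate of `events_180d`.
rng = np.random.default_rng(1)
base = rng.normal(size=300)
toy = pd.DataFrame({
    "events_180d": base,
    "opens_180d": base * 2.0 + 0.01 * rng.normal(size=300),
    "send_hour": rng.normal(size=300),
})

print(columns_to_drop(toy))
```

Which member of a pair survives here depends only on column order. In practice you may prefer to keep the feature with the stronger relationship to the target, or the one that is easier to explain.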

5.4 Feature Distributions by Retention Status

📖 How to Interpret Box Plots:

  • Box = Middle 50% of data (IQR)
  • Line inside box = Median
  • Whiskers = Reach to the most extreme points within 1.5 × IQR of the box edges
  • Points outside = Outliers

⚠️ What Makes a Good Predictor:

  • Clear separation between retained (green) and churned (red) boxes
  • Different medians = Feature values differ between classes
  • Minimal overlap = Easier to distinguish classes
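
To make the box plot anatomy concrete, the sketch below computes the same quantities for a small list of numbers with NumPy. The helper name `box_stats` is an assumption for this example, and plotting libraries may use a slightly different quartile method, so values can differ a little from what a chart shows.

```python
import numpy as np


def box_stats(values):
    """Quartiles, 1.5 x IQR fences, whisker ends, and outliers for one group."""
    data = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(data, [25, 50, 75])
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = data[(data >= lower_fence) & (data <= upper_fence)]
    return {
        "q1": q1, "median": median, "q3": q3, "iqr": iqr,
        # Whiskers stop at the most extreme points still inside the fences.
        "whisker_low": inside.min(), "whisker_high": inside.max(),
        "outliers": data[(data < lower_fence) | (data > upper_fence)].tolist(),
    }


summary = box_stats([1, 2, 3, 4, 100])
print(summary)
```

In this toy list the value 100 lies beyond the upper fence, so it is drawn as an outlier point and the upper whisker ends at 4.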
In [6]:
# Feature Distributions by Retention Status
if findings.target_column and findings.target_column in df.columns:
    target = findings.target_column

    feature_cols = [
        name for name, col in findings.columns.items()
        if col.inferred_type in [ColumnType.NUMERIC_CONTINUOUS, ColumnType.NUMERIC_DISCRETE]
        and name != target
        and name not in TEMPORAL_METADATA_COLS
    ]

    if feature_cols:
        print("=" * 80)
        print(f"FEATURE DISTRIBUTIONS BY TARGET: {target}")
        print("=" * 80)

        # Calculate summary statistics by target
        summary_by_target = []
        for col in feature_cols:
            for target_val, label in [(0, "Churned"), (1, "Retained")]:
                subset = df[df[target] == target_val][col].dropna()
                if len(subset) > 0:
                    summary_by_target.append({
                        "Feature": col,
                        "Group": label,
                        "Count": len(subset),
                        "Mean": subset.mean(),
                        "Median": subset.median(),
                        "Std": subset.std()
                    })

        if summary_by_target:
            summary_df = pd.DataFrame(summary_by_target)

            # Display summary table
            print("\n📊 Summary Statistics by Retention Status:")
            display_summary = summary_df.pivot(index="Feature", columns="Group", values=["Mean", "Median"])
            display_summary.columns = [f"{stat} ({group})" for stat, group in display_summary.columns]
            display(display_summary.round(3))

        # Calculate effect size (Cohen's d) for each feature
        print("\n📈 Feature Importance Indicators (Effect Size - Cohen's d):")
        print("-" * 70)
        effect_sizes = []
        for col in feature_cols:
            churned = df[df[target] == 0][col].dropna()
            retained = df[df[target] == 1][col].dropna()

            if len(churned) > 0 and len(retained) > 0:
                # Cohen's d
                pooled_std = np.sqrt(((len(churned)-1)*churned.std()**2 + (len(retained)-1)*retained.std()**2) /
                                     (len(churned) + len(retained) - 2))
                if pooled_std > 0:
                    d = (retained.mean() - churned.mean()) / pooled_std
                else:
                    d = 0

                # Interpret effect size
                abs_d = abs(d)
                if abs_d >= 0.8:
                    interpretation = "Large effect"
                    emoji = "🔴"
                elif abs_d >= 0.5:
                    interpretation = "Medium effect"
                    emoji = "🟡"
                elif abs_d >= 0.2:
                    interpretation = "Small effect"
                    emoji = "🟢"
                else:
                    interpretation = "Negligible"
                    emoji = "⚪"

                effect_sizes.append({
                    "feature": col,
                    "cohens_d": d,
                    "abs_d": abs_d,
                    "interpretation": interpretation
                })

                direction = "↑ Higher in retained" if d > 0 else "↓ Lower in retained"
                print(f"  {emoji} {col}: d={d:+.3f} ({interpretation}) {direction}")

        # Sort by effect size for identifying important features
        if effect_sizes:
            effect_df = pd.DataFrame(effect_sizes).sort_values("abs_d", ascending=False)
            important_features = effect_df[effect_df["abs_d"] >= 0.2]["feature"].tolist()
            if important_features:
                print(f"\n⭐ Features with notable effect (|d| ≥ 0.2): {', '.join(important_features)}")
        else:
            print("  No effect sizes could be calculated (insufficient data in one or both groups)")
    else:
        print("No numeric feature columns found for distribution analysis.")
else:
    print("Target column not available.")
================================================================================
FEATURE DISTRIBUTIONS BY TARGET: unsubscribed
================================================================================

📊 Summary Statistics by Retention Status:
Feature                            Mean (Churned)  Mean (Retained)  Median (Churned)  Median (Retained)
active_span_days                   2943.246        1446.595         3002.000          1395.000
bounced_acceleration               0.001           0.002            0.000             0.000
bounced_beginning                  0.151           0.091            0.000             0.000
bounced_cohort_zscore              -0.006          0.007            -0.154            -0.154
bounced_count_180d                 1.093           0.073            1.000             0.000
...                                ...             ...              ...               ...
time_to_open_hours_trend_ratio     3.052           1.409            0.714             0.000
time_to_open_hours_velocity        0.010           -0.005           0.000             0.000
time_to_open_hours_velocity_pct    0.311           -0.659           -1.000            -1.000
time_to_open_hours_vs_cohort_mean  0.426           -0.530           -0.689            -0.689
time_to_open_hours_vs_cohort_pct   1.619           0.230            0.000             0.000

179 rows × 4 columns

📈 Feature Importance Indicators (Effect Size - Cohen's d):
----------------------------------------------------------------------
  🔴 event_count_180d: d=-1.170 (Large effect) ↓ Lower in retained
  🔴 event_count_365d: d=-1.524 (Large effect) ↓ Lower in retained
  🔴 event_count_all_time: d=-0.893 (Large effect) ↓ Lower in retained
  🟡 opened_sum_180d: d=-0.663 (Medium effect) ↓ Lower in retained
  🟡 opened_mean_180d: d=-0.579 (Medium effect) ↓ Lower in retained
  🔴 opened_count_180d: d=-1.170 (Large effect) ↓ Lower in retained
  🟢 clicked_sum_180d: d=-0.385 (Small effect) ↓ Lower in retained
  🟢 clicked_mean_180d: d=-0.297 (Small effect) ↓ Lower in retained
  🔴 clicked_count_180d: d=-1.170 (Large effect) ↓ Lower in retained
  🔴 send_hour_sum_180d: d=-1.136 (Large effect) ↓ Lower in retained
  ⚪ send_hour_mean_180d: d=+0.026 (Negligible) ↑ Higher in retained
  ⚪ send_hour_max_180d: d=-0.070 (Negligible) ↓ Lower in retained
  🔴 send_hour_count_180d: d=-1.170 (Large effect) ↓ Lower in retained
  ⚪ bounced_sum_180d: d=-0.173 (Negligible) ↓ Lower in retained
  ⚪ bounced_mean_180d: d=+0.012 (Negligible) ↑ Higher in retained
  🔴 bounced_count_180d: d=-1.170 (Large effect) ↓ Lower in retained
  🟢 time_to_open_hours_sum_180d: d=-0.487 (Small effect) ↓ Lower in retained
  ⚪ time_to_open_hours_mean_180d: d=-0.055 (Negligible) ↓ Lower in retained
  ⚪ time_to_open_hours_max_180d: d=+0.025 (Negligible) ↑ Higher in retained
  🟡 time_to_open_hours_count_180d: d=-0.663 (Medium effect) ↓ Lower in retained
  🔴 opened_sum_365d: d=-0.957 (Large effect) ↓ Lower in retained
  🟡 opened_mean_365d: d=-0.705 (Medium effect) ↓ Lower in retained
  🔴 opened_count_365d: d=-1.524 (Large effect) ↓ Lower in retained
  🟡 clicked_sum_365d: d=-0.546 (Medium effect) ↓ Lower in retained
  🟢 clicked_mean_365d: d=-0.361 (Small effect) ↓ Lower in retained
  🔴 clicked_count_365d: d=-1.524 (Large effect) ↓ Lower in retained
  🔴 send_hour_sum_365d: d=-1.481 (Large effect) ↓ Lower in retained
  ⚪ send_hour_mean_365d: d=-0.000 (Negligible) ↓ Lower in retained
  ⚪ send_hour_max_365d: d=-0.122 (Negligible) ↓ Lower in retained
  🔴 send_hour_count_365d: d=-1.524 (Large effect) ↓ Lower in retained
  🟢 bounced_sum_365d: d=-0.261 (Small effect) ↓ Lower in retained
  ⚪ bounced_mean_365d: d=-0.134 (Negligible) ↓ Lower in retained
  🔴 bounced_count_365d: d=-1.524 (Large effect) ↓ Lower in retained
  🟡 time_to_open_hours_sum_365d: d=-0.687 (Medium effect) ↓ Lower in retained
  ⚪ time_to_open_hours_mean_365d: d=+0.051 (Negligible) ↑ Higher in retained
  ⚪ time_to_open_hours_max_365d: d=+0.044 (Negligible) ↑ Higher in retained
  🔴 time_to_open_hours_count_365d: d=-0.957 (Large effect) ↓ Lower in retained
  🔴 opened_sum_all_time: d=-1.008 (Large effect) ↓ Lower in retained
  🔴 opened_mean_all_time: d=-0.839 (Large effect) ↓ Lower in retained
  🔴 opened_count_all_time: d=-0.893 (Large effect) ↓ Lower in retained
  🟡 clicked_sum_all_time: d=-0.692 (Medium effect) ↓ Lower in retained
  🟢 clicked_mean_all_time: d=-0.467 (Small effect) ↓ Lower in retained
  🔴 clicked_count_all_time: d=-0.893 (Large effect) ↓ Lower in retained
  🔴 send_hour_sum_all_time: d=-0.886 (Large effect) ↓ Lower in retained
  ⚪ send_hour_mean_all_time: d=-0.013 (Negligible) ↓ Lower in retained
  🟡 send_hour_max_all_time: d=-0.594 (Medium effect) ↓ Lower in retained
  🔴 send_hour_count_all_time: d=-0.893 (Large effect) ↓ Lower in retained
  🟢 bounced_sum_all_time: d=-0.338 (Small effect) ↓ Lower in retained
  ⚪ bounced_mean_all_time: d=-0.049 (Negligible) ↓ Lower in retained
  🔴 bounced_count_all_time: d=-0.893 (Large effect) ↓ Lower in retained
  🔴 time_to_open_hours_sum_all_time: d=-0.823 (Large effect) ↓ Lower in retained
  ⚪ time_to_open_hours_mean_all_time: d=-0.023 (Negligible) ↓ Lower in retained
  🟢 time_to_open_hours_max_all_time: d=-0.412 (Small effect) ↓ Lower in retained
  🔴 time_to_open_hours_count_all_time: d=-1.008 (Large effect) ↓ Lower in retained
  🔴 days_since_last_event_x: d=+2.403 (Large effect) ↑ Higher in retained
  ⚪ days_since_first_event_x: d=+0.134 (Negligible) ↑ Higher in retained
  ⚪ dow_sin: d=-0.105 (Negligible) ↓ Lower in retained
  🟢 dow_cos: d=+0.350 (Small effect) ↑ Higher in retained
  ⚪ bounced_momentum_180_365: d=+0.052 (Negligible) ↑ Higher in retained
  ⚪ clicked_momentum_180_365: d=+0.009 (Negligible) ↑ Higher in retained
  🟡 lag0_opened_sum: d=-0.632 (Medium effect) ↓ Lower in retained
  🟡 lag0_opened_mean: d=-0.729 (Medium effect) ↓ Lower in retained
  🟢 lag0_opened_count: d=+0.210 (Small effect) ↑ Higher in retained
  🟢 lag0_clicked_sum: d=-0.314 (Small effect) ↓ Lower in retained
  🟢 lag0_clicked_mean: d=-0.361 (Small effect) ↓ Lower in retained
  🟢 lag0_clicked_count: d=+0.210 (Small effect) ↑ Higher in retained
  ⚪ lag0_send_hour_sum: d=+0.189 (Negligible) ↑ Higher in retained
  ⚪ lag0_send_hour_mean: d=-0.001 (Negligible) ↓ Lower in retained
  🟢 lag0_send_hour_count: d=+0.210 (Small effect) ↑ Higher in retained
  ⚪ lag0_send_hour_max: d=+0.062 (Negligible) ↑ Higher in retained
  ⚪ lag0_bounced_sum: d=+0.013 (Negligible) ↑ Higher in retained
  ⚪ lag0_bounced_mean: d=-0.008 (Negligible) ↓ Lower in retained
  🟢 lag0_bounced_count: d=+0.210 (Small effect) ↑ Higher in retained
  🟢 lag0_time_to_open_hours_sum: d=-0.418 (Small effect) ↓ Lower in retained
  ⚪ lag0_time_to_open_hours_mean: d=+0.106 (Negligible) ↑ Higher in retained
  🟡 lag0_time_to_open_hours_count: d=-0.632 (Medium effect) ↓ Lower in retained
  ⚪ lag0_time_to_open_hours_max: d=+0.135 (Negligible) ↑ Higher in retained
  🟢 lag1_opened_sum: d=-0.390 (Small effect) ↓ Lower in retained
  🟡 lag1_opened_mean: d=-0.534 (Medium effect) ↓ Lower in retained
  ⚪ lag1_opened_count: d=+0.146 (Negligible) ↑ Higher in retained
  🟢 lag1_clicked_sum: d=-0.282 (Small effect) ↓ Lower in retained
  🟢 lag1_clicked_mean: d=-0.337 (Small effect) ↓ Lower in retained
  ⚪ lag1_clicked_count: d=+0.146 (Negligible) ↑ Higher in retained
  🟢 lag1_send_hour_sum: d=+0.321 (Small effect) ↑ Higher in retained
  ⚪ lag1_send_hour_mean: d=-0.013 (Negligible) ↓ Lower in retained
  ⚪ lag1_send_hour_count: d=+0.146 (Negligible) ↑ Higher in retained
  ⚪ lag1_send_hour_max: d=+0.084 (Negligible) ↑ Higher in retained
  ⚪ lag1_bounced_mean: d=-0.045 (Negligible) ↓ Lower in retained
  ⚪ lag1_bounced_count: d=+0.146 (Negligible) ↑ Higher in retained
  🟢 lag1_time_to_open_hours_sum: d=-0.291 (Small effect) ↓ Lower in retained
  ⚪ lag1_time_to_open_hours_mean: d=+0.057 (Negligible) ↑ Higher in retained
  ⚪ lag1_time_to_open_hours_count: d=-0.125 (Negligible) ↓ Lower in retained
  ⚪ lag1_time_to_open_hours_max: d=+0.107 (Negligible) ↑ Higher in retained
  🟢 lag2_opened_sum: d=-0.359 (Small effect) ↓ Lower in retained
  🟢 lag2_opened_mean: d=-0.498 (Small effect) ↓ Lower in retained
  ⚪ lag2_opened_count: d=+0.144 (Negligible) ↑ Higher in retained
  ⚪ lag2_clicked_sum: d=-0.162 (Negligible) ↓ Lower in retained
  🟢 lag2_clicked_mean: d=-0.225 (Small effect) ↓ Lower in retained
  ⚪ lag2_clicked_count: d=+0.144 (Negligible) ↑ Higher in retained
  🟢 lag2_send_hour_sum: d=+0.228 (Small effect) ↑ Higher in retained
  ⚪ lag2_send_hour_mean: d=-0.104 (Negligible) ↓ Lower in retained
  ⚪ lag2_send_hour_count: d=+0.144 (Negligible) ↑ Higher in retained
  ⚪ lag2_send_hour_max: d=-0.049 (Negligible) ↓ Lower in retained
  ⚪ lag2_bounced_mean: d=+0.009 (Negligible) ↑ Higher in retained
  ⚪ lag2_bounced_count: d=+0.144 (Negligible) ↑ Higher in retained
  🟢 lag2_time_to_open_hours_sum: d=-0.302 (Small effect) ↓ Lower in retained
  ⚪ lag2_time_to_open_hours_mean: d=-0.098 (Negligible) ↓ Lower in retained
  ⚪ lag2_time_to_open_hours_count: d=-0.102 (Negligible) ↓ Lower in retained
  ⚪ lag2_time_to_open_hours_max: d=-0.055 (Negligible) ↓ Lower in retained
  ⚪ lag3_opened_sum: d=-0.177 (Negligible) ↓ Lower in retained
  🟢 lag3_opened_mean: d=-0.284 (Small effect) ↓ Lower in retained
  ⚪ lag3_opened_count: d=+0.145 (Negligible) ↑ Higher in retained
  ⚪ lag3_clicked_mean: d=-0.199 (Negligible) ↓ Lower in retained
  ⚪ lag3_clicked_count: d=+0.145 (Negligible) ↑ Higher in retained
  🟢 lag3_send_hour_sum: d=+0.264 (Small effect) ↑ Higher in retained
  ⚪ lag3_send_hour_mean: d=+0.025 (Negligible) ↑ Higher in retained
  ⚪ lag3_send_hour_count: d=+0.145 (Negligible) ↑ Higher in retained
  ⚪ lag3_send_hour_max: d=+0.090 (Negligible) ↑ Higher in retained
  ⚪ lag3_bounced_mean: d=-0.070 (Negligible) ↓ Lower in retained
  ⚪ lag3_bounced_count: d=+0.145 (Negligible) ↑ Higher in retained
  ⚪ lag3_time_to_open_hours_sum: d=-0.076 (Negligible) ↓ Lower in retained
  ⚪ lag3_time_to_open_hours_mean: d=+0.189 (Negligible) ↑ Higher in retained
  ⚪ lag3_time_to_open_hours_count: d=-0.030 (Negligible) ↓ Lower in retained
  🟢 lag3_time_to_open_hours_max: d=+0.220 (Small effect) ↑ Higher in retained
  ⚪ opened_velocity: d=-0.169 (Negligible) ↓ Lower in retained
  🟢 opened_velocity_pct: d=-0.264 (Small effect) ↓ Lower in retained
  ⚪ clicked_velocity: d=-0.007 (Negligible) ↓ Lower in retained
  ⚪ send_hour_velocity: d=+0.086 (Negligible) ↑ Higher in retained
  ⚪ send_hour_velocity_pct: d=+0.050 (Negligible) ↑ Higher in retained
  ⚪ bounced_velocity: d=+0.059 (Negligible) ↑ Higher in retained
  ⚪ time_to_open_hours_velocity: d=-0.124 (Negligible) ↓ Lower in retained
  ⚪ time_to_open_hours_velocity_pct: d=-0.167 (Negligible) ↓ Lower in retained
  ⚪ opened_acceleration: d=-0.020 (Negligible) ↓ Lower in retained
  🟢 opened_momentum: d=-0.266 (Small effect) ↓ Lower in retained
  🟢 clicked_acceleration: d=+0.266 (Small effect) ↑ Higher in retained
  ⚪ send_hour_acceleration: d=+0.052 (Negligible) ↑ Higher in retained
  ⚪ send_hour_momentum: d=+0.118 (Negligible) ↑ Higher in retained
  ⚪ bounced_acceleration: d=+0.075 (Negligible) ↑ Higher in retained
  ⚪ time_to_open_hours_acceleration: d=-0.127 (Negligible) ↓ Lower in retained
  ⚪ time_to_open_hours_momentum: d=-0.167 (Negligible) ↓ Lower in retained
  🟡 opened_beginning: d=-0.555 (Medium effect) ↓ Lower in retained
  🔴 opened_end: d=-1.007 (Large effect) ↓ Lower in retained
  🟡 opened_trend_ratio: d=-0.701 (Medium effect) ↓ Lower in retained
  🟢 clicked_beginning: d=-0.338 (Small effect) ↓ Lower in retained
  🟡 clicked_end: d=-0.623 (Medium effect) ↓ Lower in retained
  🟢 clicked_trend_ratio: d=-0.453 (Small effect) ↓ Lower in retained
  🟡 send_hour_beginning: d=-0.697 (Medium effect) ↓ Lower in retained
  🟡 send_hour_end: d=-0.735 (Medium effect) ↓ Lower in retained
  ⚪ send_hour_trend_ratio: d=+0.058 (Negligible) ↑ Higher in retained
  ⚪ bounced_beginning: d=-0.167 (Negligible) ↓ Lower in retained
  🟢 bounced_end: d=-0.202 (Small effect) ↓ Lower in retained
  ⚪ bounced_trend_ratio: d=+0.011 (Negligible) ↑ Higher in retained
  🟢 time_to_open_hours_beginning: d=-0.443 (Small effect) ↓ Lower in retained
  🟡 time_to_open_hours_end: d=-0.753 (Medium effect) ↓ Lower in retained
  ⚪ time_to_open_hours_trend_ratio: d=-0.175 (Negligible) ↓ Lower in retained
  ⚪ days_since_last_event_y: d=+0.000 (Negligible) ↓ Lower in retained
  🔴 days_since_first_event_y: d=-2.365 (Large effect) ↓ Lower in retained
  🔴 active_span_days: d=-2.365 (Large effect) ↓ Lower in retained
  ⚪ recency_ratio: d=+0.000 (Negligible) ↓ Lower in retained
  ⚪ event_frequency: d=+0.172 (Negligible) ↑ Higher in retained
  🟢 inter_event_gap_mean: d=-0.381 (Small effect) ↓ Lower in retained
  🟡 inter_event_gap_std: d=-0.549 (Medium effect) ↓ Lower in retained
  🔴 inter_event_gap_max: d=-0.929 (Large effect) ↓ Lower in retained
  🟢 regularity_score: d=+0.476 (Small effect) ↑ Higher in retained
  🟡 opened_vs_cohort_mean: d=-0.632 (Medium effect) ↓ Lower in retained
  🟡 opened_vs_cohort_pct: d=-0.632 (Medium effect) ↓ Lower in retained
  🟡 opened_cohort_zscore: d=-0.632 (Medium effect) ↓ Lower in retained
  🟢 clicked_vs_cohort_mean: d=-0.314 (Small effect) ↓ Lower in retained
  🟢 clicked_vs_cohort_pct: d=-0.314 (Small effect) ↓ Lower in retained
  🟢 clicked_cohort_zscore: d=-0.314 (Small effect) ↓ Lower in retained
  ⚪ send_hour_vs_cohort_mean: d=+0.189 (Negligible) ↑ Higher in retained
  ⚪ send_hour_vs_cohort_pct: d=+0.189 (Negligible) ↑ Higher in retained
  ⚪ send_hour_cohort_zscore: d=+0.189 (Negligible) ↑ Higher in retained
  ⚪ bounced_vs_cohort_mean: d=+0.013 (Negligible) ↑ Higher in retained
  ⚪ bounced_vs_cohort_pct: d=+0.013 (Negligible) ↑ Higher in retained
  ⚪ bounced_cohort_zscore: d=+0.013 (Negligible) ↑ Higher in retained
  🟢 time_to_open_hours_vs_cohort_mean: d=-0.418 (Small effect) ↓ Lower in retained
  🟢 time_to_open_hours_vs_cohort_pct: d=-0.418 (Small effect) ↓ Lower in retained
  🟢 time_to_open_hours_cohort_zscore: d=-0.418 (Small effect) ↓ Lower in retained

⭐ Features with notable effect (|d| ≥ 0.2): days_since_last_event_x, days_since_first_event_y, active_span_days, send_hour_count_365d, event_count_365d, bounced_count_365d, clicked_count_365d, opened_count_365d, send_hour_sum_365d, send_hour_count_180d, bounced_count_180d, event_count_180d, clicked_count_180d, opened_count_180d, send_hour_sum_180d, opened_sum_all_time, time_to_open_hours_count_all_time, opened_end, opened_sum_365d, time_to_open_hours_count_365d, inter_event_gap_max, send_hour_count_all_time, clicked_count_all_time, bounced_count_all_time, event_count_all_time, opened_count_all_time, send_hour_sum_all_time, opened_mean_all_time, time_to_open_hours_sum_all_time, time_to_open_hours_end, send_hour_end, lag0_opened_mean, opened_mean_365d, opened_trend_ratio, send_hour_beginning, clicked_sum_all_time, time_to_open_hours_sum_365d, opened_sum_180d, time_to_open_hours_count_180d, opened_cohort_zscore, lag0_opened_sum, lag0_time_to_open_hours_count, opened_vs_cohort_pct, opened_vs_cohort_mean, clicked_end, send_hour_max_all_time, opened_mean_180d, opened_beginning, inter_event_gap_std, clicked_sum_365d, lag1_opened_mean, lag2_opened_mean, time_to_open_hours_sum_180d, regularity_score, clicked_mean_all_time, clicked_trend_ratio, time_to_open_hours_beginning, time_to_open_hours_vs_cohort_pct, time_to_open_hours_cohort_zscore, time_to_open_hours_vs_cohort_mean, lag0_time_to_open_hours_sum, time_to_open_hours_max_all_time, lag1_opened_sum, clicked_sum_180d, inter_event_gap_mean, clicked_mean_365d, lag0_clicked_mean, lag2_opened_sum, dow_cos, bounced_sum_all_time, clicked_beginning, lag1_clicked_mean, lag1_send_hour_sum, clicked_vs_cohort_pct, lag0_clicked_sum, clicked_cohort_zscore, clicked_vs_cohort_mean, lag2_time_to_open_hours_sum, clicked_mean_180d, lag1_time_to_open_hours_sum, lag3_opened_mean, lag1_clicked_sum, clicked_acceleration, opened_momentum, lag3_send_hour_sum, opened_velocity_pct, bounced_sum_365d, lag2_send_hour_sum, lag2_clicked_mean, lag3_time_to_open_hours_max, lag0_opened_count, lag0_send_hour_count, lag0_bounced_count, lag0_clicked_count, bounced_end

Interpreting Effect Sizes (Cohen's d)

Effect Size      Interpretation  What It Means for Modeling
|d| ≥ 0.8        Large           Strong discriminator - prioritize this feature
0.5 ≤ |d| < 0.8  Medium          Useful predictor - include in model
0.2 ≤ |d| < 0.5  Small           Weak but may help in combination with others
|d| < 0.2        Negligible      Limited predictive value alone
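
The bands above can be checked by hand on a tiny example. The sketch below recomputes Cohen's d with the pooled standard deviation, using only the standard library, and maps the result onto the same bands. The helper names and the toy values are illustrative assumptions, and each group needs at least two values for `stdev` to be defined.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(retained, churned):
    """Standardized mean difference using the pooled sample std."""
    n_r, n_c = len(retained), len(churned)
    pooled_var = (
        (n_r - 1) * stdev(retained) ** 2 + (n_c - 1) * stdev(churned) ** 2
    ) / (n_r + n_c - 2)
    if pooled_var == 0:
        return 0.0  # no spread in either group, same convention as the notebook
    return (mean(retained) - mean(churned)) / sqrt(pooled_var)


def interpret(d):
    """Map |d| onto the Large / Medium / Small / Negligible bands."""
    size = abs(d)
    if size >= 0.8:
        return "Large"
    if size >= 0.5:
        return "Medium"
    if size >= 0.2:
        return "Small"
    return "Negligible"


# Toy groups: means differ by 1, both have a sample std of about 2.58.
retained = [2, 4, 6, 8]
churned = [1, 3, 5, 7]
d = cohens_d(retained, churned)
print(f"d={d:+.3f} -> {interpret(d)}")
```

Here the means differ by 1 while the pooled standard deviation is about 2.58, so d is roughly 0.39: a visible shift, but the two groups still overlap heavily.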

🎯 Actionable Insights:

  • Features with large effects are your best predictors - ensure they're included in your model
  • Direction matters: "Higher in retained" means customers with high values tend to stay; use this for threshold-based business rules
  • Features with small/negligible effects may still be useful in combination or as interaction terms

⚠️ Cautions:

  • Cohen's d is most meaningful when both groups are roughly normal with similar spread - check skewness in notebook 03
  • Large effects could be due to confounding variables - validate with domain knowledge
  • Correlation ≠ causation: high engagement may not cause retention
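
For heavily skewed features, a rank-based measure can serve as a cross-check on Cohen's d. The sketch below is not part of this notebook's pipeline: it computes the probability that a randomly chosen retained value exceeds a randomly chosen churned value (the Mann-Whitney U statistic scaled to the 0-1 range, which equals the ROC AUC of the single feature). The helper name `rank_auc` and the toy values are assumptions for illustration.

```python
import pandas as pd


def rank_auc(retained, churned):
    """P(random retained value > random churned value), ties counted as half."""
    combined = pd.Series(list(retained) + list(churned), dtype=float)
    ranks = combined.rank(method="average")
    n_r, n_c = len(retained), len(churned)
    # Mann-Whitney U for the retained group, from its rank sum.
    u_stat = ranks.iloc[:n_r].sum() - n_r * (n_r + 1) / 2
    return u_stat / (n_r * n_c)


# One extreme value pulls the retained mean far above the churned mean,
# yet most retained values are lower than every churned value.
skewed_retained = [1, 2, 3, 1000]
skewed_churned = [4, 5, 6, 7]
print(rank_auc(skewed_retained, skewed_churned))
```

A value near 0.5 means the groups are hard to tell apart, while values near 0 or 1 indicate strong separation in one direction. When this measure and the sign of d disagree, as in the toy data, outliers are likely driving the mean difference.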

Box Plot Visualization

📈 How to Read the Box Plots Below:

  • Well-separated boxes (little/no overlap) → Feature clearly distinguishes retained vs churned
  • Different medians (center lines at different heights) → Groups have different typical values
  • Many outliers in one group → May indicate subpopulations worth investigating
In [7]:
# Box Plots: Visual comparison of distributions
if findings.target_column and findings.target_column in df.columns:
    target = findings.target_column

    feature_cols = [
        name for name, col in findings.columns.items()
        if col.inferred_type in [ColumnType.NUMERIC_CONTINUOUS, ColumnType.NUMERIC_DISCRETE]
        and name != target
        and name not in TEMPORAL_METADATA_COLS
    ]

    if feature_cols:
        # Create box plots - one subplot per feature for better control
        n_features = min(len(feature_cols), 6)

        fig = make_subplots(
            rows=1, cols=n_features,
            subplot_titles=feature_cols[:n_features],
            horizontal_spacing=0.05
        )

        for i, col in enumerate(feature_cols[:n_features]):
            col_num = i + 1

            # Retained (1) - Green
            retained_data = df[df[target] == 1][col].dropna()
            fig.add_trace(
                go.Box(
                    y=retained_data,
                    name='Retained',
                    fillcolor='rgba(46, 204, 113, 0.7)',
                    line=dict(color='#1e8449', width=2),
                    marker=dict(
                        color='rgba(46, 204, 113, 0.5)',  # Light green outliers
                        size=5,
                        line=dict(color='#1e8449', width=1)
                    ),
                    boxpoints='outliers',
                    width=0.35,
                    showlegend=(i == 0),
                    legendgroup='retained',
                    offsetgroup='retained'
                ),
                row=1, col=col_num
            )

            # Churned (0) - Red
            churned_data = df[df[target] == 0][col].dropna()
            fig.add_trace(
                go.Box(
                    y=churned_data,
                    name='Churned',
                    fillcolor='rgba(231, 76, 60, 0.7)',
                    line=dict(color='#922b21', width=2),
                    marker=dict(
                        color='rgba(231, 76, 60, 0.5)',  # Light red outliers
                        size=5,
                        line=dict(color='#922b21', width=1)
                    ),
                    boxpoints='outliers',
                    width=0.35,
                    showlegend=(i == 0),
                    legendgroup='churned',
                    offsetgroup='churned'
                ),
                row=1, col=col_num
            )

        fig.update_layout(
            height=450,
            title_text="Feature Distributions: Retained (Green) vs Churned (Red)",
            template='plotly_white',
            showlegend=True,
            legend=dict(orientation="h", yanchor="bottom", y=1.05, xanchor="center", x=0.5),
            boxmode='group',
            boxgap=0.3,
            boxgroupgap=0.1
        )

        # Center the boxes by removing x-axis tick labels (title is above each subplot)
        fig.update_xaxes(showticklabels=False)

        display_figure(fig)

        # Print mean comparison
        print("\n📊 MEAN COMPARISON BY RETENTION STATUS:")
        print("-" * 70)
        for col in feature_cols[:n_features]:
            retained_mean = df[df[target] == 1][col].mean()
            churned_mean = df[df[target] == 0][col].mean()
            diff_pct = ((retained_mean - churned_mean) / churned_mean * 100) if churned_mean != 0 else 0
            print(f"  {col}:")
            print(f"     Retained: {retained_mean:.2f}  |  Churned: {churned_mean:.2f}  |  Diff: {diff_pct:+.1f}%")
[Figure: Feature Distributions: Retained (Green) vs Churned (Red)]
📊 MEAN COMPARISON BY RETENTION STATUS:
----------------------------------------------------------------------
  event_count_180d:
     Retained: 0.07  |  Churned: 1.09  |  Diff: -93.3%
  event_count_365d:
     Retained: 0.20  |  Churned: 2.21  |  Diff: -90.9%
  event_count_all_time:
     Retained: 12.43  |  Churned: 19.89  |  Diff: -37.5%
  opened_sum_180d:
     Retained: 0.00  |  Churned: 0.27  |  Diff: -98.5%
  opened_mean_180d:
     Retained: 0.03  |  Churned: 0.24  |  Diff: -86.1%
  opened_count_180d:
     Retained: 0.07  |  Churned: 1.09  |  Diff: -93.3%
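
Percentage differences in means can look dramatic when the baseline mean is close to zero, as with several features above. A standardized effect size such as Cohen's d is one common complement. The sketch below is illustrative only: `cohens_d` is a hypothetical helper (not part of the `customer_retention` package), it uses plain Python so it runs without the notebook's dependencies, and the sample values are made up.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    diff = mean(group_a) - mean(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    if pooled_var == 0:
        # Both groups are constant: report 0 only when the means also match
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / math.sqrt(pooled_var)


# Made-up values for one feature, split by retention status
retained = [0, 0, 1, 0, 0, 1]
churned = [1, 2, 0, 3, 1, 2]

d = cohens_d(retained, churned)
print(f"Cohen's d: {d:+.2f}")
```

A negative value here means the retained group sits lower than the churned group, which matches the sign convention of the percentage differences printed above.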

5.5 Feature-Target Correlations

Features ranked by absolute correlation with the target variable.

📖 Interpretation:

  • Positive correlation: Higher values = more likely retained
  • Negative correlation: Higher values = more likely churned
  • |r| > 0.3: Moderately predictive
  • |r| > 0.5: Strongly predictive
In [8]:
if findings.target_column and findings.target_column in df.columns:
    target = findings.target_column
    feature_cols = [
        name for name, col in findings.columns.items()
        if col.inferred_type in [ColumnType.NUMERIC_CONTINUOUS, ColumnType.NUMERIC_DISCRETE]
        and name != target
        and name not in TEMPORAL_METADATA_COLS
    ]

    if feature_cols:
        correlations = []
        for col in feature_cols:
            corr = df[[col, target]].corr().iloc[0, 1]
            correlations.append({"Feature": col, "Correlation": corr})

        corr_df = pd.DataFrame(correlations).sort_values("Correlation", key=abs, ascending=False)

        fig = charts.bar_chart(
            corr_df["Feature"].tolist(),
            corr_df["Correlation"].tolist(),
            title=f"Feature Correlations with {target}"
        )
        display_figure(fig)
else:
    print("Target column not available for correlation analysis.")
[Figure: bar chart of feature correlations with the target]
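
When the target is coded 0/1, the Pearson correlation used in the cell above is the point-biserial correlation, which can be read as a scaled difference between the two class means. The sketch below checks that equivalence on made-up numbers; `pearson_r` and `point_biserial` are hypothetical helper names written in plain Python for illustration.

```python
from statistics import mean, pstdev


def pearson_r(xs, ys):
    """Pearson correlation computed from its definition."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def point_biserial(xs, labels):
    """Point-biserial form: class-mean difference scaled by spread and class balance."""
    ones = [x for x, y in zip(xs, labels) if y == 1]
    zeros = [x for x, y in zip(xs, labels) if y == 0]
    p = len(ones) / len(xs)  # share of the positive (retained) class
    return (mean(ones) - mean(zeros)) / pstdev(xs) * (p * (1 - p)) ** 0.5


# Made-up feature values and retention labels (1 = retained, 0 = churned)
feature = [0.0, 1.0, 0.0, 3.0, 2.0, 0.0, 4.0, 1.0]
retained = [1, 1, 1, 0, 0, 1, 0, 1]

r = pearson_r(feature, retained)
r_pb = point_biserial(feature, retained)
print(f"Pearson r = {r:.4f}, point-biserial = {r_pb:.4f}")
```

Both values agree, so a large |r| in the ranking corresponds to class means that are far apart relative to the feature's overall spread.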

5.6 Categorical Feature Analysis

Retention rates by category help identify which segments are at higher risk.

📖 What to Look For:

  • Categories with low retention rates = high-risk segments for intervention
  • Large variation across categories = strong predictive feature
  • Small categories with extreme rates may be unreliable (small sample size)

📊 Metrics Explained:

  • Retention Rate: % of customers in category who were retained
  • Lift: Category retention rate divided by the overall retention rate (>1 = better, <1 = worse)
  • Cramér's V: Strength of association (0-1 scale, like correlation for categorical)
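
To make the three metrics concrete, here is a minimal sketch that computes them by hand for a binary target. It assumes lift is the category rate divided by the overall rate and uses the uncorrected form of Cramér's V; the framework's `CategoricalTargetAnalyzer` may differ in detail. `category_metrics` is a hypothetical helper and the data is made up.

```python
from collections import Counter


def category_metrics(categories, retained):
    """Return {category: (retention_rate, lift)} and Cramér's V for a 0/1 target."""
    n = len(categories)
    overall = sum(retained) / n
    if not 0 < overall < 1:
        raise ValueError("target must contain both retained and churned rows")

    totals = Counter(categories)
    kept = Counter(c for c, y in zip(categories, retained) if y == 1)

    stats = {}
    chi2 = 0.0
    for cat, total in totals.items():
        rate = kept[cat] / total
        stats[cat] = (rate, rate / overall)
        # Chi-square contribution of the retained and churned cells for this category
        for observed, share in ((kept[cat], overall), (total - kept[cat], 1 - overall)):
            expected = total * share
            chi2 += (observed - expected) ** 2 / expected

    # For a k x 2 table with k >= 2, min(rows - 1, cols - 1) = 1, so V = sqrt(chi2 / n)
    return stats, (chi2 / n) ** 0.5


# Made-up segment labels and retention flags
segment = ["A"] * 4 + ["B"] * 4
retained = [1, 1, 1, 0, 0, 0, 0, 1]

stats, v = category_metrics(segment, retained)
print(stats, f"V = {v:.2f}")
```

With only four rows per category, these rates illustrate the small-sample caution above: a single customer changing status would move each rate by 25 percentage points.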
In [9]:
from customer_retention.stages.profiling import CategoricalTargetAnalyzer

if findings.target_column:
    target = findings.target_column
    overall_retention = df[target].mean()

    categorical_cols = [
        name for name, col in findings.columns.items()
        if col.inferred_type in [ColumnType.CATEGORICAL_NOMINAL, ColumnType.CATEGORICAL_ORDINAL]
        and name not in TEMPORAL_METADATA_COLS
    ]

    print("=" * 80)
    print("CATEGORICAL FEATURE ANALYSIS")
    print("=" * 80)
    print(f"Overall retention rate: {overall_retention:.1%}")

    if categorical_cols:
        # Use framework analyzer for summary
        cat_analyzer = CategoricalTargetAnalyzer(min_samples_per_category=10)
        summary_df = cat_analyzer.analyze_multiple(df, categorical_cols, target)

        print("\n📈 Categorical Feature Strength (Cramér's V):")
        print("-" * 60)
        for _, row in summary_df.iterrows():
            if row["cramers_v"] >= 0.3:
                strength = "Strong"
                emoji = "🔴"
            elif row["cramers_v"] >= 0.1:
                strength = "Moderate"
                emoji = "🟡"
            else:
                strength = "Weak"
                emoji = "🟢"
            sig = "***" if row["p_value"] < 0.001 else "**" if row["p_value"] < 0.01 else "*" if row["p_value"] < 0.05 else ""
            print(f"  {emoji} {row['feature']}: V={row['cramers_v']:.3f} ({strength}) {sig}")

        # Detailed analysis for each categorical feature
        for col_name in categorical_cols[:5]:
            result = cat_analyzer.analyze(df, col_name, target)

            print(f"\n{'='*60}")
            print(f"📊 {col_name.upper()}")
            print("="*60)

            # Display stats table
            if len(result.category_stats) > 0:
                display_stats = result.category_stats[['category', 'total_count', 'retention_rate', 'lift', 'pct_of_total']].copy()
                display_stats['retention_rate'] = display_stats['retention_rate'].apply(lambda x: f"{x:.1%}")
                display_stats['lift'] = display_stats['lift'].apply(lambda x: f"{x:.2f}x")
                display_stats['pct_of_total'] = display_stats['pct_of_total'].apply(lambda x: f"{x:.1%}")
                display_stats.columns = [col_name, 'Count', 'Retention Rate', 'Lift', '% of Data']
                display(display_stats)

                # Stacked bar chart
                cat_stats = result.category_stats
                categories = cat_stats['category'].tolist()
                retained_counts = cat_stats['retained_count'].tolist()
                churned_counts = cat_stats['churned_count'].tolist()

                fig = go.Figure()

                fig.add_trace(go.Bar(
                    name='Retained',
                    x=categories,
                    y=retained_counts,
                    marker_color='rgba(46, 204, 113, 0.8)',
                    text=[f"{r/(r+c)*100:.0f}%" for r, c in zip(retained_counts, churned_counts)],
                    textposition='inside',
                    textfont=dict(color='white', size=12)
                ))

                fig.add_trace(go.Bar(
                    name='Churned',
                    x=categories,
                    y=churned_counts,
                    marker_color='rgba(231, 76, 60, 0.8)',
                    text=[f"{c/(r+c)*100:.0f}%" for r, c in zip(retained_counts, churned_counts)],
                    textposition='inside',
                    textfont=dict(color='white', size=12)
                ))

                fig.update_layout(
                    barmode='stack',
                    title=f"Retention by {col_name}",
                    xaxis_title=col_name,
                    yaxis_title="Count",
                    template='plotly_white',
                    height=350,
                    legend=dict(orientation="h", yanchor="bottom", y=1.02, xanchor="center", x=0.5)
                )
                display_figure(fig)

                # Flag high-risk categories from framework result
                if result.high_risk_categories:
                    print("\n  ⚠️ High-risk categories (lift < 0.9x):")
                    for cat in result.high_risk_categories:
                        cat_row = cat_stats[cat_stats['category'] == cat].iloc[0]
                        print(f"     • {cat}: {cat_row['retention_rate']:.1%} retention ({cat_row['lift']:.2f}x lift)")
    else:
        print("\n  ℹ️ No categorical columns detected.")
else:
    print("No target column available for categorical analysis.")
================================================================================
CATEGORICAL FEATURE ANALYSIS
================================================================================
Overall retention rate: 44.6%

📈 Categorical Feature Strength (Cramér's V):
------------------------------------------------------------
  🔴 lifecycle_quadrant: V=0.728 (Strong) ***
  🔴 recency_bucket: V=0.619 (Strong) ***

============================================================
📊 LIFECYCLE_QUADRANT
============================================================
   lifecycle_quadrant  Count  Retention Rate   Lift  % of Data
0     Intense & Brief   1679           82.7%  1.86x      33.6%
1            One-shot    816           76.7%  1.72x      16.3%
2      Steady & Loyal    820           10.4%  0.23x      16.4%
3  Occasional & Loyal   1683            7.6%  0.17x      33.7%
[Figure: Retention by lifecycle_quadrant (stacked bar chart)]
  ⚠️ High-risk categories (lift < 0.9x):
     • Steady & Loyal: 10.4% retention (0.23x lift)
     • Occasional & Loyal: 7.6% retention (0.17x lift)

============================================================
📊 RECENCY_BUCKET
============================================================
   recency_bucket  Count  Retention Rate   Lift  % of Data
0           >180d   3084           68.8%  1.54x      61.7%
1         91-180d    702            7.0%  0.16x      14.0%
2          31-90d    725            5.9%  0.13x      14.5%
3            0-7d    123            4.9%  0.11x       2.5%
4           8-30d    364            2.2%  0.05x       7.3%
[Figure: Retention by recency_bucket (stacked bar chart)]
  ⚠️ High-risk categories (lift < 0.9x):
     • 91-180d: 7.0% retention (0.16x lift)
     • 31-90d: 5.9% retention (0.13x lift)
     • 0-7d: 4.9% retention (0.11x lift)
     • 8-30d: 2.2% retention (0.05x lift)

5.7 Scatter Plot Matrix (Sample)

Visual exploration of pairwise relationships between numeric features.

📖 How to Read the Scatter Matrix:

  • Diagonal: Distribution of each feature (histogram or density)
  • Off-diagonal: Scatter plot showing relationship between two features
  • Each row/column represents one feature

🔍 What to Look For:

Pattern | What It Means | Action
Linear trend (diagonal line of points) | Strong correlation | Check if redundant; may cause multicollinearity
Curved pattern | Non-linear relationship | Consider polynomial features or transformations
Clusters/groups | Natural segments in data | May benefit from segment-aware modeling
Fan shape (spreading out) | Heteroscedasticity | May need log transform or robust methods
Random scatter | No apparent relationship | Features are likely unrelated

⚠️ Cautions:

  • Sample shown (max 1000 points) for performance - patterns may differ in full data
  • Look for the same patterns in correlation matrix (section 4.2) to confirm
In [10]:
top_numeric = numeric_cols[:4]  # slicing already caps the list at four columns

if len(top_numeric) >= 2:
    fig = charts.scatter_matrix(
        df[top_numeric].sample(min(1000, len(df))),
        title="Scatter Plot Matrix (Sample)"
    )
    display_figure(fig)
[Figure: Scatter Plot Matrix (Sample)]

Interpreting the Scatter Matrix Above

🎯 Key Questions to Answer:

  1. Are any features redundant?

    • Look for tight linear patterns → high correlation → consider dropping one
    • Cross-reference with high correlation pairs in section 4.3
  2. Are there natural customer segments?

    • Distinct clusters suggest different customer types
    • Links to segment-aware outlier analysis in notebook 03
  3. Do relationships suggest feature engineering?

    • Curved patterns → polynomial or interaction terms may help
    • Ratios between correlated features may be more predictive
  4. Are distributions suitable for linear models?

    • Fan shapes or heavy skew → consider transformations
    • Outlier clusters → verify with segment analysis

💡 Pro Tip: Hover over points in the interactive plot to see exact values. Look for outliers that appear across multiple scatter plots - these may be influential observations worth investigating.
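
Because the scatter matrix is drawn from a random sample, a fixed seed makes the sample repeatable, and comparing a statistic on the sample with the full data gives a rough sense of how representative it is. The sketch below does this in plain Python on synthetic data; the data, the seed, and the `pearson_r` helper are all assumptions made for illustration.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation computed from its definition."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(42)  # fixed seed so both the data and the sample are repeatable

# Synthetic stand-in for two numeric features with a noisy linear relationship
x = [rng.gauss(0, 1) for _ in range(5000)]
y = [0.8 * xi + rng.gauss(0, 0.6) for xi in x]

# Draw at most 1000 row positions without replacement, mirroring the plot's cap
sample_idx = rng.sample(range(len(x)), k=min(1000, len(x)))
x_s = [x[i] for i in sample_idx]
y_s = [y[i] for i in sample_idx]

r_full = pearson_r(x, y)
r_sample = pearson_r(x_s, y_s)
print(f"full r = {r_full:.3f}, sample r = {r_sample:.3f}")
```

In pandas, passing `random_state` to `DataFrame.sample` gives the same repeatability for the plotted sample.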

5.8 Datetime Feature Analysis

Temporal patterns can reveal important retention signals - when customers joined, their last activity, and seasonal patterns.

📖 What to Look For:

  • Cohort effects: Do customers who joined in certain periods have different retention?
  • Recency patterns: How does time since last activity relate to retention?
  • Seasonal trends: Are there monthly or quarterly patterns?

📊 Common Temporal Features:

Feature Type | Example | Typical Insight
Tenure | Days since signup | Longer tenure often = higher retention
Recency | Days since last order | Recent activity = engaged customer
Cohort | Signup month/year | Economic conditions affect cohorts
Day of Week | Signup day | Weekend vs weekday patterns
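
As a rough illustration of the feature types in the table, the sketch below derives each one from a signup date, a last-order date, and a reference date. `temporal_features` and its field names are hypothetical, and the dates are made up. Measuring from a fixed reference date (for example the snapshot date) rather than "today" is an assumption made here so that the features do not depend on when the notebook runs.

```python
from datetime import date


def temporal_features(signup, last_order, as_of):
    """Derive tenure, recency, cohort, and a weekend flag relative to a reference date."""
    return {
        "tenure_days": (as_of - signup).days,
        "recency_days": (as_of - last_order).days,
        "cohort": f"{signup.year}-{signup.month:02d}",
        "signup_is_weekend": signup.weekday() >= 5,  # Monday is 0, so 5 and 6 are the weekend
    }


features = temporal_features(
    signup=date(2023, 1, 14),
    last_order=date(2024, 5, 1),
    as_of=date(2024, 6, 30),
)
print(features)
```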
In [11]:
from customer_retention.stages.profiling import TemporalTargetAnalyzer

datetime_cols = [
    name for name, col in findings.columns.items()
    if col.inferred_type == ColumnType.DATETIME
]

print("=" * 80)
print("DATETIME FEATURE ANALYSIS")
print("=" * 80)
print(f"Detected datetime columns: {datetime_cols}")

if datetime_cols and findings.target_column:
    target = findings.target_column
    overall_retention = df[target].mean()

    # Use framework analyzer
    temporal_analyzer = TemporalTargetAnalyzer(min_samples_per_period=10)

    for col_name in datetime_cols[:3]:
        result = temporal_analyzer.analyze(df, col_name, target)

        print(f"\n{'='*60}")
        print(f"📅 {col_name.upper()}")
        print("="*60)

        if result.n_valid_dates == 0:
            print("  No valid dates found")
            continue

        print(f"  Date range: {result.min_date} to {result.max_date}")
        print(f"  Valid dates: {result.n_valid_dates:,}")

        # 1. Retention by Year (from framework result)
        if len(result.yearly_stats) > 1:
            print(f"\n  📊 Retention by Year: Trend is {result.yearly_trend}")

            year_stats = result.yearly_stats

            fig = make_subplots(rows=1, cols=2, subplot_titles=["Retention Rate by Year", "Customer Count by Year"],
                               column_widths=[0.6, 0.4])

            fig.add_trace(
                go.Scatter(
                    x=year_stats['period'].astype(str),
                    y=year_stats['retention_rate'],
                    mode='lines+markers',
                    name='Retention Rate',
                    line=dict(color='#3498db', width=3),
                    marker=dict(size=10)
                ),
                row=1, col=1
            )
            fig.add_hline(y=overall_retention, line_dash="dash", line_color="gray",
                         annotation_text=f"Overall: {overall_retention:.1%}", row=1, col=1)

            fig.add_trace(
                go.Bar(
                    x=year_stats['period'].astype(str),
                    y=year_stats['count'],
                    name='Count',
                    marker_color='rgba(52, 152, 219, 0.6)'
                ),
                row=1, col=2
            )

            fig.update_layout(height=350, template='plotly_white', showlegend=False)
            fig.update_yaxes(tickformat='.0%', row=1, col=1)
            display_figure(fig)

        # 2. Retention by Month (from framework result)
        if len(result.monthly_stats) > 1:
            print("\n  📊 Retention by Month (Seasonality):")

            month_stats = result.monthly_stats
            colors = ['rgba(46, 204, 113, 0.7)' if r >= overall_retention else 'rgba(231, 76, 60, 0.7)'
                     for r in month_stats['retention_rate']]

            fig = go.Figure()
            fig.add_trace(go.Bar(
                x=month_stats['month_name'],
                y=month_stats['retention_rate'],
                marker_color=colors,
                text=[f"{r:.0%}" for r in month_stats['retention_rate']],
                textposition='outside'
            ))
            fig.add_hline(y=overall_retention, line_dash="dash", line_color="gray",
                         annotation_text=f"Overall: {overall_retention:.1%}")

            fig.update_layout(
                title=f"Monthly Retention Pattern ({col_name})",
                xaxis_title="Month",
                yaxis_title="Retention Rate",
                template='plotly_white',
                height=350,
                yaxis_tickformat='.0%'
            )
            display_figure(fig)

            # Seasonal insights from framework
            if result.seasonal_spread > 0.05:
                print(f"  📈 Seasonal spread: {result.seasonal_spread:.1%}")
                print(f"     Best month: {result.best_month}")
                print(f"     Worst month: {result.worst_month}")

        # 3. Retention by Day of Week (from framework result)
        if len(result.dow_stats) > 1:
            print("\n  📊 Retention by Day of Week:")

            dow_stats = result.dow_stats
            colors = ['rgba(46, 204, 113, 0.7)' if r >= overall_retention else 'rgba(231, 76, 60, 0.7)'
                     for r in dow_stats['retention_rate']]

            fig = go.Figure()
            fig.add_trace(go.Bar(
                x=dow_stats['day_name'],
                y=dow_stats['retention_rate'],
                marker_color=colors,
                text=[f"{r:.0%}" for r in dow_stats['retention_rate']],
                textposition='outside'
            ))
            fig.add_hline(y=overall_retention, line_dash="dash", line_color="gray")

            fig.update_layout(
                title=f"Day of Week Pattern ({col_name})",
                xaxis_title="Day of Week",
                yaxis_title="Retention Rate",
                template='plotly_white',
                height=300,
                yaxis_tickformat='.0%'
            )
            display_figure(fig)
else:
    if not datetime_cols:
        print("\n  ℹ️ No datetime columns detected in this dataset.")
        print("     Consider adding date parsing in notebook 01 if dates exist as strings.")
    else:
        print("\n  ℹ️ No target column available for retention analysis.")
================================================================================
DATETIME FEATURE ANALYSIS
================================================================================
Detected datetime columns: []

  ℹ️ No datetime columns detected in this dataset.
     Consider adding date parsing in notebook 01 if dates exist as strings.
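
One quick way to check whether a string column is really a date column is to measure how many of its values parse under an expected format. The sketch below is a plain-Python illustration; `parse_share`, the format string, and the sample values are assumptions, not part of the framework.

```python
from datetime import datetime


def parse_share(values, fmt="%Y-%m-%d"):
    """Fraction of values that parse as dates under the given format."""
    if not values:
        return 0.0
    parsed = 0
    for value in values:
        try:
            datetime.strptime(value, fmt)
            parsed += 1
        except (TypeError, ValueError):
            # TypeError covers missing values such as None; ValueError covers bad strings
            pass
    return parsed / len(values)


# Made-up column values: two valid dates, one impossible date, one placeholder
raw = ["2024-01-05", "2024-02-30", "n/a", "2023-12-31"]
share = parse_share(raw)
print(f"{share:.0%} of values parse as dates")
```

In pandas, `pd.to_datetime(series, errors="coerce")` followed by counting non-null results gives a comparable check.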

5.9 Actionable Recommendations Summary

This section consolidates all relationship analysis findings into actionable recommendations organized by their impact on the modeling pipeline.

📋 Recommendation Categories:

Category | Purpose | Impact
Feature Selection | Which features to keep/drop | Reduces noise, improves interpretability
Feature Engineering | New features to create | Captures interactions, improves accuracy
Stratification | Train/test split strategy | Ensures fair evaluation, prevents leakage
Model Selection | Which algorithms to try | Matches model to data characteristics
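
Pairwise multicollinearity flags tend to overlap when several features are near-duplicates of one another, so it can help to collapse the pairs into groups and keep one representative per group. The sketch below is one possible approach, not the framework's method: it links features transitively with a small union-find and keeps the member with the largest absolute target correlation. `pick_representatives` is a hypothetical helper and the correlation values are made up.

```python
def pick_representatives(pairs, target_corr):
    """Group features linked by high-correlation pairs; keep the one with the largest |target r|."""
    parent = {}

    def find(feature):
        parent.setdefault(feature, feature)
        while parent[feature] != feature:
            parent[feature] = parent[parent[feature]]  # path halving keeps lookups short
            feature = parent[feature]
        return feature

    for a, b in pairs:
        parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for feature in list(parent):
        groups.setdefault(find(feature), []).append(feature)

    keep, drop = [], []
    for members in groups.values():
        best = max(members, key=lambda f: abs(target_corr.get(f, 0.0)))
        keep.append(best)
        drop.extend(f for f in members if f != best)
    return sorted(keep), sorted(drop)


# Illustrative pairs; the target correlations below are made-up values
pairs = [
    ("event_count_180d", "opened_count_180d"),
    ("event_count_180d", "clicked_count_180d"),
    ("send_hour_mean_180d", "send_hour_max_180d"),
]
target_corr = {
    "event_count_180d": -0.40,
    "opened_count_180d": -0.42,
    "clicked_count_180d": -0.10,
    "send_hour_mean_180d": 0.05,
    "send_hour_max_180d": 0.02,
}

keep, drop = pick_representatives(pairs, target_corr)
print("keep:", keep)
print("drop:", drop)
```

Transitive grouping can chain together features that are not directly correlated with each other, so large groups are worth a manual look before anything is dropped.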
In [12]:
# Generate comprehensive actionable recommendations
recommender = RelationshipRecommender()

# Gather columns by type
numeric_features = [
    name for name, col in findings.columns.items()
    if col.inferred_type in [ColumnType.NUMERIC_CONTINUOUS, ColumnType.NUMERIC_DISCRETE]
    and name != findings.target_column
    and name not in TEMPORAL_METADATA_COLS
]
categorical_features = [
    name for name, col in findings.columns.items()
    if col.inferred_type in [ColumnType.CATEGORICAL_NOMINAL, ColumnType.CATEGORICAL_ORDINAL]
    and name not in TEMPORAL_METADATA_COLS
]

# Run comprehensive analysis
analysis_summary = recommender.analyze(
    df,
    numeric_cols=numeric_features,
    categorical_cols=categorical_features,
    target_col=findings.target_column,
)

print("=" * 80)
print("ACTIONABLE RECOMMENDATIONS FROM RELATIONSHIP ANALYSIS")
print("=" * 80)

# Group recommendations by category
grouped_recs = analysis_summary.recommendations_by_category
high_priority = analysis_summary.high_priority_actions

if high_priority:
    print(f"\n🔴 HIGH PRIORITY ACTIONS ({len(high_priority)}):")
    print("-" * 60)
    for rec in high_priority:
        print(f"\n  📌 {rec.title}")
        print(f"     {rec.description}")
        print(f"     → Action: {rec.action}")
        if rec.affected_features:
            print(f"     → Features: {', '.join(rec.affected_features[:5])}")

# Persist recommendations to registry
for pair in analysis_summary.multicollinear_pairs:
    registry.add_gold_drop_multicollinear(
        column=pair["feature1"], correlated_with=pair["feature2"],
        correlation=pair["correlation"],
        rationale=f"High correlation ({pair['correlation']:.2f}) - consider dropping one",
        source_notebook="05_relationship_analysis"
    )

for predictor in analysis_summary.strong_predictors:
    registry.add_gold_prioritize_feature(
        column=predictor["feature"], effect_size=predictor["effect_size"],
        correlation=predictor["correlation"],
        rationale=f"Strong predictor with effect size {predictor['effect_size']:.2f}",
        source_notebook="05_relationship_analysis"
    )

for weak_col in analysis_summary.weak_predictors[:10]:
    registry.add_gold_drop_weak(
        column=weak_col, effect_size=0.0, correlation=0.0,
        rationale="Negligible predictive power",
        source_notebook="05_relationship_analysis"
    )

# Persist ratio feature recommendations
for rec in grouped_recs.get(RecommendationCategory.FEATURE_ENGINEERING, []):
    if "ratio" in rec.title.lower() and len(rec.affected_features) >= 2:
        registry.add_silver_ratio(
            column=f"{rec.affected_features[0]}_to_{rec.affected_features[1]}_ratio",
            numerator=rec.affected_features[0], denominator=rec.affected_features[1],
            rationale=rec.description, source_notebook="05_relationship_analysis"
        )
    elif "interaction" in rec.title.lower() and len(rec.affected_features) >= 2:
        for i, f1 in enumerate(rec.affected_features[:3]):
            for f2 in rec.affected_features[i+1:4]:
                registry.add_silver_interaction(
                    column=f"{f1}_x_{f2}", features=[f1, f2],
                    rationale=rec.description, source_notebook="05_relationship_analysis"
                )

# Store for findings metadata
findings.metadata["relationship_analysis"] = {
    "n_recommendations": len(analysis_summary.recommendations),
    "n_high_priority": len(high_priority),
    "strong_predictors": [p["feature"] for p in analysis_summary.strong_predictors],
    "weak_predictors": analysis_summary.weak_predictors[:5],
    "multicollinear_pairs": [(p["feature1"], p["feature2"]) for p in analysis_summary.multicollinear_pairs],
}

print(f"\n✅ Persisted {len(analysis_summary.multicollinear_pairs)} multicollinearity recommendations")
print(f"✅ Persisted {len(analysis_summary.strong_predictors)} strong predictor recommendations")
print(f"✅ Persisted {min(len(analysis_summary.weak_predictors), 10)} weak predictor recommendations")
================================================================================
ACTIONABLE RECOMMENDATIONS FROM RELATIONSHIP ANALYSIS
================================================================================

🔴 HIGH PRIORITY ACTIONS (241):
------------------------------------------------------------

  📌 Remove multicollinear feature
     event_count_180d and opened_count_180d are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: event_count_180d, opened_count_180d

  📌 Remove multicollinear feature
     event_count_180d and clicked_count_180d are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: event_count_180d, clicked_count_180d

  📌 Remove multicollinear feature
     event_count_180d and send_hour_sum_180d are highly correlated (r=0.98)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: event_count_180d, send_hour_sum_180d

  📌 Remove multicollinear feature
     event_count_180d and send_hour_count_180d are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: event_count_180d, send_hour_count_180d

  📌 Remove multicollinear feature
     event_count_180d and bounced_count_180d are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: event_count_180d, bounced_count_180d

  📌 Remove multicollinear feature
     event_count_365d and opened_count_365d are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: event_count_365d, opened_count_365d

  📌 Remove multicollinear feature
     event_count_365d and clicked_count_365d are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: event_count_365d, clicked_count_365d

  📌 Remove multicollinear feature
     event_count_365d and send_hour_sum_365d are highly correlated (r=0.98)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: event_count_365d, send_hour_sum_365d

  📌 Remove multicollinear feature
     event_count_365d and send_hour_count_365d are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: event_count_365d, send_hour_count_365d

  📌 Remove multicollinear feature
     event_count_365d and bounced_count_365d are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: event_count_365d, bounced_count_365d

  📌 Remove multicollinear feature
     event_count_all_time and opened_count_all_time are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: event_count_all_time, opened_count_all_time

  📌 Remove multicollinear feature
     event_count_all_time and clicked_count_all_time are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: event_count_all_time, clicked_count_all_time

  📌 Remove multicollinear feature
     event_count_all_time and send_hour_sum_all_time are highly correlated (r=0.99)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: event_count_all_time, send_hour_sum_all_time

  📌 Remove multicollinear feature
     event_count_all_time and send_hour_count_all_time are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: event_count_all_time, send_hour_count_all_time

  📌 Remove multicollinear feature
     event_count_all_time and bounced_count_all_time are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: event_count_all_time, bounced_count_all_time

  📌 Remove multicollinear feature
     opened_sum_180d and time_to_open_hours_count_180d are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: opened_sum_180d, time_to_open_hours_count_180d

  📌 Remove multicollinear feature
     opened_mean_180d and lag0_opened_mean are highly correlated (r=0.90)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: opened_mean_180d, lag0_opened_mean

  📌 Remove multicollinear feature
     opened_count_180d and clicked_count_180d are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: opened_count_180d, clicked_count_180d

  📌 Remove multicollinear feature
     opened_count_180d and send_hour_sum_180d are highly correlated (r=0.98)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: opened_count_180d, send_hour_sum_180d

  📌 Remove multicollinear feature
     opened_count_180d and send_hour_count_180d are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: opened_count_180d, send_hour_count_180d

  📌 Remove multicollinear feature
     opened_count_180d and bounced_count_180d are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: opened_count_180d, bounced_count_180d

  📌 Remove multicollinear feature
     clicked_sum_180d and clicked_mean_180d are highly correlated (r=0.88)
     β†’ Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     β†’ Features: clicked_sum_180d, clicked_mean_180d

  πŸ“Œ Remove multicollinear feature
     clicked_mean_180d and lag0_clicked_mean are highly correlated (r=0.88)
     β†’ Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     β†’ Features: clicked_mean_180d, lag0_clicked_mean

  πŸ“Œ Remove multicollinear feature
     clicked_count_180d and send_hour_sum_180d are highly correlated (r=0.98)
     β†’ Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     β†’ Features: clicked_count_180d, send_hour_sum_180d

  πŸ“Œ Remove multicollinear feature
     clicked_count_180d and send_hour_count_180d are highly correlated (r=1.00)
     β†’ Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     β†’ Features: clicked_count_180d, send_hour_count_180d

  πŸ“Œ Remove multicollinear feature
     clicked_count_180d and bounced_count_180d are highly correlated (r=1.00)
     β†’ Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     β†’ Features: clicked_count_180d, bounced_count_180d

  πŸ“Œ Remove multicollinear feature
     send_hour_sum_180d and send_hour_count_180d are highly correlated (r=0.98)
     β†’ Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     β†’ Features: send_hour_sum_180d, send_hour_count_180d

  πŸ“Œ Remove multicollinear feature
     send_hour_sum_180d and bounced_count_180d are highly correlated (r=0.98)
     β†’ Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     β†’ Features: send_hour_sum_180d, bounced_count_180d

  πŸ“Œ Remove multicollinear feature
     send_hour_mean_180d and send_hour_max_180d are highly correlated (r=0.88)
     β†’ Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     β†’ Features: send_hour_mean_180d, send_hour_max_180d

  📌 Remove multicollinear feature
     send_hour_mean_180d and lag0_send_hour_mean are highly correlated (r=0.89)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: send_hour_mean_180d, lag0_send_hour_mean

  📌 Remove multicollinear feature
     send_hour_mean_180d and lag0_send_hour_max are highly correlated (r=0.86)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: send_hour_mean_180d, lag0_send_hour_max

  📌 Remove multicollinear feature
     send_hour_count_180d and bounced_count_180d are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: send_hour_count_180d, bounced_count_180d

  📌 Remove multicollinear feature
     bounced_sum_180d and bounced_mean_180d are highly correlated (r=0.89)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: bounced_sum_180d, bounced_mean_180d

  📌 Remove multicollinear feature
     bounced_mean_180d and lag0_bounced_mean are highly correlated (r=0.88)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: bounced_mean_180d, lag0_bounced_mean

  📌 Remove multicollinear feature
     time_to_open_hours_sum_180d and time_to_open_hours_mean_180d are highly correlated (r=0.89)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_sum_180d, time_to_open_hours_mean_180d

  📌 Remove multicollinear feature
     time_to_open_hours_sum_180d and time_to_open_hours_max_180d are highly correlated (r=0.96)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_sum_180d, time_to_open_hours_max_180d

  📌 Remove multicollinear feature
     time_to_open_hours_mean_180d and time_to_open_hours_max_180d are highly correlated (r=0.97)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_mean_180d, time_to_open_hours_max_180d

  📌 Remove multicollinear feature
     time_to_open_hours_mean_180d and time_to_open_hours_mean_365d are highly correlated (r=0.94)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_mean_180d, time_to_open_hours_mean_365d

  📌 Remove multicollinear feature
     time_to_open_hours_mean_180d and time_to_open_hours_max_365d are highly correlated (r=0.88)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_mean_180d, time_to_open_hours_max_365d

  📌 Remove multicollinear feature
     time_to_open_hours_mean_180d and lag0_time_to_open_hours_mean are highly correlated (r=0.98)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_mean_180d, lag0_time_to_open_hours_mean

  📌 Remove multicollinear feature
     time_to_open_hours_mean_180d and lag0_time_to_open_hours_max are highly correlated (r=0.96)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_mean_180d, lag0_time_to_open_hours_max

  📌 Remove multicollinear feature
     time_to_open_hours_mean_180d and lag2_time_to_open_hours_mean are highly correlated (r=0.94)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_mean_180d, lag2_time_to_open_hours_mean

  📌 Remove multicollinear feature
     time_to_open_hours_mean_180d and lag2_time_to_open_hours_max are highly correlated (r=0.94)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_mean_180d, lag2_time_to_open_hours_max

  📌 Remove multicollinear feature
     time_to_open_hours_mean_180d and lag3_time_to_open_hours_mean are highly correlated (r=0.86)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_mean_180d, lag3_time_to_open_hours_mean

  📌 Remove multicollinear feature
     time_to_open_hours_mean_180d and lag3_time_to_open_hours_max are highly correlated (r=0.87)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_mean_180d, lag3_time_to_open_hours_max

  📌 Remove multicollinear feature
     time_to_open_hours_max_180d and time_to_open_hours_mean_365d are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_max_180d, time_to_open_hours_mean_365d

  📌 Remove multicollinear feature
     time_to_open_hours_max_180d and time_to_open_hours_max_365d are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_max_180d, time_to_open_hours_max_365d

  📌 Remove multicollinear feature
     time_to_open_hours_max_180d and lag0_time_to_open_hours_mean are highly correlated (r=0.95)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_max_180d, lag0_time_to_open_hours_mean

  📌 Remove multicollinear feature
     time_to_open_hours_max_180d and lag0_time_to_open_hours_max are highly correlated (r=0.96)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_max_180d, lag0_time_to_open_hours_max

  📌 Remove multicollinear feature
     time_to_open_hours_max_180d and lag2_time_to_open_hours_mean are highly correlated (r=0.90)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_max_180d, lag2_time_to_open_hours_mean

  📌 Remove multicollinear feature
     time_to_open_hours_max_180d and lag2_time_to_open_hours_max are highly correlated (r=0.90)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_max_180d, lag2_time_to_open_hours_max

  📌 Remove multicollinear feature
     opened_sum_365d and time_to_open_hours_count_365d are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: opened_sum_365d, time_to_open_hours_count_365d

  📌 Remove multicollinear feature
     opened_count_365d and clicked_count_365d are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: opened_count_365d, clicked_count_365d

  📌 Remove multicollinear feature
     opened_count_365d and send_hour_sum_365d are highly correlated (r=0.98)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: opened_count_365d, send_hour_sum_365d

  📌 Remove multicollinear feature
     opened_count_365d and send_hour_count_365d are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: opened_count_365d, send_hour_count_365d

  📌 Remove multicollinear feature
     opened_count_365d and bounced_count_365d are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: opened_count_365d, bounced_count_365d

  📌 Remove multicollinear feature
     clicked_count_365d and send_hour_sum_365d are highly correlated (r=0.98)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: clicked_count_365d, send_hour_sum_365d

  📌 Remove multicollinear feature
     clicked_count_365d and send_hour_count_365d are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: clicked_count_365d, send_hour_count_365d

  📌 Remove multicollinear feature
     clicked_count_365d and bounced_count_365d are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: clicked_count_365d, bounced_count_365d

  📌 Remove multicollinear feature
     send_hour_sum_365d and send_hour_count_365d are highly correlated (r=0.98)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: send_hour_sum_365d, send_hour_count_365d

  📌 Remove multicollinear feature
     send_hour_sum_365d and bounced_count_365d are highly correlated (r=0.98)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: send_hour_sum_365d, bounced_count_365d

  📌 Remove multicollinear feature
     send_hour_count_365d and bounced_count_365d are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: send_hour_count_365d, bounced_count_365d

  📌 Remove multicollinear feature
     time_to_open_hours_sum_365d and time_to_open_hours_max_365d are highly correlated (r=0.94)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_sum_365d, time_to_open_hours_max_365d

  📌 Remove multicollinear feature
     time_to_open_hours_mean_365d and time_to_open_hours_max_365d are highly correlated (r=0.95)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_mean_365d, time_to_open_hours_max_365d

  📌 Remove multicollinear feature
     time_to_open_hours_mean_365d and lag0_time_to_open_hours_mean are highly correlated (r=0.94)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_mean_365d, lag0_time_to_open_hours_mean

  📌 Remove multicollinear feature
     time_to_open_hours_mean_365d and lag0_time_to_open_hours_max are highly correlated (r=0.93)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_mean_365d, lag0_time_to_open_hours_max

  📌 Remove multicollinear feature
     time_to_open_hours_mean_365d and lag1_time_to_open_hours_mean are highly correlated (r=0.85)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_mean_365d, lag1_time_to_open_hours_mean

  📌 Remove multicollinear feature
     time_to_open_hours_max_365d and lag0_time_to_open_hours_mean are highly correlated (r=0.89)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_max_365d, lag0_time_to_open_hours_mean

  📌 Remove multicollinear feature
     time_to_open_hours_max_365d and lag0_time_to_open_hours_max are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_max_365d, lag0_time_to_open_hours_max

  📌 Remove multicollinear feature
     opened_sum_all_time and time_to_open_hours_sum_all_time are highly correlated (r=0.86)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: opened_sum_all_time, time_to_open_hours_sum_all_time

  📌 Remove multicollinear feature
     opened_sum_all_time and time_to_open_hours_count_all_time are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: opened_sum_all_time, time_to_open_hours_count_all_time

  📌 Remove multicollinear feature
     opened_count_all_time and clicked_count_all_time are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: opened_count_all_time, clicked_count_all_time

  📌 Remove multicollinear feature
     opened_count_all_time and send_hour_sum_all_time are highly correlated (r=0.99)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: opened_count_all_time, send_hour_sum_all_time

  📌 Remove multicollinear feature
     opened_count_all_time and send_hour_count_all_time are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: opened_count_all_time, send_hour_count_all_time

  📌 Remove multicollinear feature
     opened_count_all_time and bounced_count_all_time are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: opened_count_all_time, bounced_count_all_time

  📌 Remove multicollinear feature
     clicked_count_all_time and send_hour_sum_all_time are highly correlated (r=0.99)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: clicked_count_all_time, send_hour_sum_all_time

  📌 Remove multicollinear feature
     clicked_count_all_time and send_hour_count_all_time are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: clicked_count_all_time, send_hour_count_all_time

  📌 Remove multicollinear feature
     clicked_count_all_time and bounced_count_all_time are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: clicked_count_all_time, bounced_count_all_time

  📌 Remove multicollinear feature
     send_hour_sum_all_time and send_hour_count_all_time are highly correlated (r=0.99)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: send_hour_sum_all_time, send_hour_count_all_time

  📌 Remove multicollinear feature
     send_hour_sum_all_time and bounced_count_all_time are highly correlated (r=0.99)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: send_hour_sum_all_time, bounced_count_all_time

  📌 Remove multicollinear feature
     send_hour_sum_all_time and send_hour_beginning are highly correlated (r=0.85)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: send_hour_sum_all_time, send_hour_beginning

  📌 Remove multicollinear feature
     send_hour_sum_all_time and send_hour_end are highly correlated (r=0.85)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: send_hour_sum_all_time, send_hour_end

  📌 Remove multicollinear feature
     send_hour_count_all_time and bounced_count_all_time are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: send_hour_count_all_time, bounced_count_all_time

  📌 Remove multicollinear feature
     time_to_open_hours_sum_all_time and time_to_open_hours_count_all_time are highly correlated (r=0.86)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_sum_all_time, time_to_open_hours_count_all_time

  📌 Remove multicollinear feature
     days_since_last_event_x and days_since_first_event_y are highly correlated (r=-0.99)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: days_since_last_event_x, days_since_first_event_y

  📌 Remove multicollinear feature
     days_since_last_event_x and active_span_days are highly correlated (r=-0.99)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: days_since_last_event_x, active_span_days

  📌 Remove multicollinear feature
     lag0_opened_sum and lag0_opened_mean are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_opened_sum, lag0_opened_mean

  📌 Remove multicollinear feature
     lag0_opened_sum and lag0_time_to_open_hours_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_opened_sum, lag0_time_to_open_hours_count

  📌 Remove multicollinear feature
     lag0_opened_sum and opened_velocity_pct are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_opened_sum, opened_velocity_pct

  📌 Remove multicollinear feature
     lag0_opened_sum and opened_vs_cohort_mean are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_opened_sum, opened_vs_cohort_mean

  📌 Remove multicollinear feature
     lag0_opened_sum and opened_vs_cohort_pct are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_opened_sum, opened_vs_cohort_pct

  📌 Remove multicollinear feature
     lag0_opened_sum and opened_cohort_zscore are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_opened_sum, opened_cohort_zscore

  📌 Remove multicollinear feature
     lag0_opened_mean and lag0_time_to_open_hours_count are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_opened_mean, lag0_time_to_open_hours_count

  📌 Remove multicollinear feature
     lag0_opened_mean and opened_vs_cohort_mean are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_opened_mean, opened_vs_cohort_mean

  📌 Remove multicollinear feature
     lag0_opened_mean and opened_vs_cohort_pct are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_opened_mean, opened_vs_cohort_pct

  📌 Remove multicollinear feature
     lag0_opened_mean and opened_cohort_zscore are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_opened_mean, opened_cohort_zscore

  📌 Remove multicollinear feature
     lag0_opened_count and lag0_clicked_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_opened_count, lag0_clicked_count

  📌 Remove multicollinear feature
     lag0_opened_count and lag0_send_hour_sum are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_opened_count, lag0_send_hour_sum

  📌 Remove multicollinear feature
     lag0_opened_count and lag0_send_hour_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_opened_count, lag0_send_hour_count

  📌 Remove multicollinear feature
     lag0_opened_count and lag0_bounced_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_opened_count, lag0_bounced_count

  📌 Remove multicollinear feature
     lag0_opened_count and send_hour_vs_cohort_mean are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_opened_count, send_hour_vs_cohort_mean

  📌 Remove multicollinear feature
     lag0_opened_count and send_hour_vs_cohort_pct are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_opened_count, send_hour_vs_cohort_pct

  📌 Remove multicollinear feature
     lag0_opened_count and send_hour_cohort_zscore are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_opened_count, send_hour_cohort_zscore

  📌 Remove multicollinear feature
     lag0_clicked_sum and lag0_clicked_mean are highly correlated (r=0.92)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_clicked_sum, lag0_clicked_mean

  📌 Remove multicollinear feature
     lag0_clicked_sum and clicked_vs_cohort_mean are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_clicked_sum, clicked_vs_cohort_mean

  📌 Remove multicollinear feature
     lag0_clicked_sum and clicked_vs_cohort_pct are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_clicked_sum, clicked_vs_cohort_pct

  📌 Remove multicollinear feature
     lag0_clicked_sum and clicked_cohort_zscore are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_clicked_sum, clicked_cohort_zscore

  📌 Remove multicollinear feature
     lag0_clicked_mean and clicked_vs_cohort_mean are highly correlated (r=0.92)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_clicked_mean, clicked_vs_cohort_mean

  📌 Remove multicollinear feature
     lag0_clicked_mean and clicked_vs_cohort_pct are highly correlated (r=0.92)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_clicked_mean, clicked_vs_cohort_pct

  📌 Remove multicollinear feature
     lag0_clicked_mean and clicked_cohort_zscore are highly correlated (r=0.92)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_clicked_mean, clicked_cohort_zscore

  📌 Remove multicollinear feature
     lag0_clicked_count and lag0_send_hour_sum are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_clicked_count, lag0_send_hour_sum

  📌 Remove multicollinear feature
     lag0_clicked_count and lag0_send_hour_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_clicked_count, lag0_send_hour_count

  📌 Remove multicollinear feature
     lag0_clicked_count and lag0_bounced_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_clicked_count, lag0_bounced_count

  📌 Remove multicollinear feature
     lag0_clicked_count and send_hour_vs_cohort_mean are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_clicked_count, send_hour_vs_cohort_mean

  📌 Remove multicollinear feature
     lag0_clicked_count and send_hour_vs_cohort_pct are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_clicked_count, send_hour_vs_cohort_pct

  📌 Remove multicollinear feature
     lag0_clicked_count and send_hour_cohort_zscore are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_clicked_count, send_hour_cohort_zscore

  📌 Remove multicollinear feature
     lag0_send_hour_sum and lag0_send_hour_count are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_send_hour_sum, lag0_send_hour_count

  📌 Remove multicollinear feature
     lag0_send_hour_sum and lag0_bounced_count are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_send_hour_sum, lag0_bounced_count

  📌 Remove multicollinear feature
     lag0_send_hour_sum and send_hour_vs_cohort_mean are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_send_hour_sum, send_hour_vs_cohort_mean

  📌 Remove multicollinear feature
     lag0_send_hour_sum and send_hour_vs_cohort_pct are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_send_hour_sum, send_hour_vs_cohort_pct

  📌 Remove multicollinear feature
     lag0_send_hour_sum and send_hour_cohort_zscore are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_send_hour_sum, send_hour_cohort_zscore

  📌 Remove multicollinear feature
     lag0_send_hour_mean and lag0_send_hour_max are highly correlated (r=0.95)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_send_hour_mean, lag0_send_hour_max

  📌 Remove multicollinear feature
     lag0_send_hour_count and lag0_bounced_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_send_hour_count, lag0_bounced_count

  📌 Remove multicollinear feature
     lag0_send_hour_count and send_hour_vs_cohort_mean are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_send_hour_count, send_hour_vs_cohort_mean

  📌 Remove multicollinear feature
     lag0_send_hour_count and send_hour_vs_cohort_pct are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_send_hour_count, send_hour_vs_cohort_pct

  📌 Remove multicollinear feature
     lag0_send_hour_count and send_hour_cohort_zscore are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_send_hour_count, send_hour_cohort_zscore

  📌 Remove multicollinear feature
     lag0_bounced_sum and lag0_bounced_mean are highly correlated (r=0.94)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_bounced_sum, lag0_bounced_mean

  📌 Remove multicollinear feature
     lag0_bounced_sum and bounced_vs_cohort_mean are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_bounced_sum, bounced_vs_cohort_mean

  📌 Remove multicollinear feature
     lag0_bounced_sum and bounced_vs_cohort_pct are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_bounced_sum, bounced_vs_cohort_pct

  📌 Remove multicollinear feature
     lag0_bounced_sum and bounced_cohort_zscore are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_bounced_sum, bounced_cohort_zscore

  📌 Remove multicollinear feature
     lag0_bounced_mean and bounced_vs_cohort_mean are highly correlated (r=0.94)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_bounced_mean, bounced_vs_cohort_mean

  📌 Remove multicollinear feature
     lag0_bounced_mean and bounced_vs_cohort_pct are highly correlated (r=0.94)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_bounced_mean, bounced_vs_cohort_pct

  📌 Remove multicollinear feature
     lag0_bounced_mean and bounced_cohort_zscore are highly correlated (r=0.94)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_bounced_mean, bounced_cohort_zscore

  📌 Remove multicollinear feature
     lag0_bounced_count and send_hour_vs_cohort_mean are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_bounced_count, send_hour_vs_cohort_mean

  📌 Remove multicollinear feature
     lag0_bounced_count and send_hour_vs_cohort_pct are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_bounced_count, send_hour_vs_cohort_pct

  📌 Remove multicollinear feature
     lag0_bounced_count and send_hour_cohort_zscore are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_bounced_count, send_hour_cohort_zscore

  📌 Remove multicollinear feature
     lag0_time_to_open_hours_sum and lag0_time_to_open_hours_mean are highly correlated (r=0.97)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_time_to_open_hours_sum, lag0_time_to_open_hours_mean

  📌 Remove multicollinear feature
     lag0_time_to_open_hours_sum and lag0_time_to_open_hours_max are highly correlated (r=0.99)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_time_to_open_hours_sum, lag0_time_to_open_hours_max

  📌 Remove multicollinear feature
     lag0_time_to_open_hours_sum and time_to_open_hours_momentum are highly correlated (r=0.89)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_time_to_open_hours_sum, time_to_open_hours_momentum

  📌 Remove multicollinear feature
     lag0_time_to_open_hours_sum and time_to_open_hours_vs_cohort_mean are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_time_to_open_hours_sum, time_to_open_hours_vs_cohort_mean

  📌 Remove multicollinear feature
     lag0_time_to_open_hours_sum and time_to_open_hours_vs_cohort_pct are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_time_to_open_hours_sum, time_to_open_hours_vs_cohort_pct

  📌 Remove multicollinear feature
     lag0_time_to_open_hours_sum and time_to_open_hours_cohort_zscore are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_time_to_open_hours_sum, time_to_open_hours_cohort_zscore

  📌 Remove multicollinear feature
     lag0_time_to_open_hours_mean and lag0_time_to_open_hours_max are highly correlated (r=0.99)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_time_to_open_hours_mean, lag0_time_to_open_hours_max

  📌 Remove multicollinear feature
     lag0_time_to_open_hours_mean and time_to_open_hours_velocity are highly correlated (r=0.86)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_time_to_open_hours_mean, time_to_open_hours_velocity

  📌 Remove multicollinear feature
     lag0_time_to_open_hours_mean and time_to_open_hours_vs_cohort_mean are highly correlated (r=0.97)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_time_to_open_hours_mean, time_to_open_hours_vs_cohort_mean

  📌 Remove multicollinear feature
     lag0_time_to_open_hours_mean and time_to_open_hours_vs_cohort_pct are highly correlated (r=0.97)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_time_to_open_hours_mean, time_to_open_hours_vs_cohort_pct

  📌 Remove multicollinear feature
     lag0_time_to_open_hours_mean and time_to_open_hours_cohort_zscore are highly correlated (r=0.97)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_time_to_open_hours_mean, time_to_open_hours_cohort_zscore

  📌 Remove multicollinear feature
     lag0_time_to_open_hours_count and opened_velocity_pct are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_time_to_open_hours_count, opened_velocity_pct

  📌 Remove multicollinear feature
     lag0_time_to_open_hours_count and opened_vs_cohort_mean are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_time_to_open_hours_count, opened_vs_cohort_mean

  📌 Remove multicollinear feature
     lag0_time_to_open_hours_count and opened_vs_cohort_pct are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_time_to_open_hours_count, opened_vs_cohort_pct

  📌 Remove multicollinear feature
     lag0_time_to_open_hours_count and opened_cohort_zscore are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_time_to_open_hours_count, opened_cohort_zscore

  📌 Remove multicollinear feature
     lag0_time_to_open_hours_max and time_to_open_hours_velocity are highly correlated (r=0.90)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_time_to_open_hours_max, time_to_open_hours_velocity

  📌 Remove multicollinear feature
     lag0_time_to_open_hours_max and time_to_open_hours_momentum are highly correlated (r=0.89)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_time_to_open_hours_max, time_to_open_hours_momentum

  📌 Remove multicollinear feature
     lag0_time_to_open_hours_max and time_to_open_hours_vs_cohort_mean are highly correlated (r=0.99)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_time_to_open_hours_max, time_to_open_hours_vs_cohort_mean

  📌 Remove multicollinear feature
     lag0_time_to_open_hours_max and time_to_open_hours_vs_cohort_pct are highly correlated (r=0.99)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_time_to_open_hours_max, time_to_open_hours_vs_cohort_pct

  📌 Remove multicollinear feature
     lag0_time_to_open_hours_max and time_to_open_hours_cohort_zscore are highly correlated (r=0.99)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag0_time_to_open_hours_max, time_to_open_hours_cohort_zscore

  📌 Remove multicollinear feature
     lag1_opened_sum and lag1_opened_mean are highly correlated (r=0.92)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag1_opened_sum, lag1_opened_mean

  📌 Remove multicollinear feature
     lag1_opened_sum and lag1_time_to_open_hours_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag1_opened_sum, lag1_time_to_open_hours_count

  📌 Remove multicollinear feature
     lag1_opened_mean and lag1_time_to_open_hours_count are highly correlated (r=0.92)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag1_opened_mean, lag1_time_to_open_hours_count

  📌 Remove multicollinear feature
     lag1_opened_count and lag1_clicked_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag1_opened_count, lag1_clicked_count

  📌 Remove multicollinear feature
     lag1_opened_count and lag1_send_hour_sum are highly correlated (r=0.89)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag1_opened_count, lag1_send_hour_sum

  📌 Remove multicollinear feature
     lag1_opened_count and lag1_send_hour_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag1_opened_count, lag1_send_hour_count

  📌 Remove multicollinear feature
     lag1_opened_count and lag1_bounced_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag1_opened_count, lag1_bounced_count

  📌 Remove multicollinear feature
     lag1_clicked_sum and lag1_clicked_mean are highly correlated (r=0.95)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag1_clicked_sum, lag1_clicked_mean

  📌 Remove multicollinear feature
     lag1_clicked_count and lag1_send_hour_sum are highly correlated (r=0.89)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag1_clicked_count, lag1_send_hour_sum

  📌 Remove multicollinear feature
     lag1_clicked_count and lag1_send_hour_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag1_clicked_count, lag1_send_hour_count

  📌 Remove multicollinear feature
     lag1_clicked_count and lag1_bounced_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag1_clicked_count, lag1_bounced_count

  📌 Remove multicollinear feature
     lag1_send_hour_sum and lag1_send_hour_count are highly correlated (r=0.89)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag1_send_hour_sum, lag1_send_hour_count

  📌 Remove multicollinear feature
     lag1_send_hour_sum and lag1_bounced_count are highly correlated (r=0.89)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag1_send_hour_sum, lag1_bounced_count

  📌 Remove multicollinear feature
     lag1_send_hour_mean and lag1_send_hour_max are highly correlated (r=0.96)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag1_send_hour_mean, lag1_send_hour_max

  📌 Remove multicollinear feature
     lag1_send_hour_count and lag1_bounced_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag1_send_hour_count, lag1_bounced_count

  📌 Remove multicollinear feature
     lag1_time_to_open_hours_sum and lag1_time_to_open_hours_mean are highly correlated (r=0.98)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag1_time_to_open_hours_sum, lag1_time_to_open_hours_mean

  📌 Remove multicollinear feature
     lag1_time_to_open_hours_sum and lag1_time_to_open_hours_max are highly correlated (r=0.99)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag1_time_to_open_hours_sum, lag1_time_to_open_hours_max

  📌 Remove multicollinear feature
     lag1_time_to_open_hours_mean and lag1_time_to_open_hours_max are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag1_time_to_open_hours_mean, lag1_time_to_open_hours_max

  📌 Remove multicollinear feature
     lag2_opened_sum and lag2_opened_mean are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag2_opened_sum, lag2_opened_mean

  📌 Remove multicollinear feature
     lag2_opened_sum and lag2_time_to_open_hours_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag2_opened_sum, lag2_time_to_open_hours_count

  📌 Remove multicollinear feature
     lag2_opened_mean and lag2_time_to_open_hours_count are highly correlated (r=0.91)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag2_opened_mean, lag2_time_to_open_hours_count

  📌 Remove multicollinear feature
     lag2_opened_count and lag2_clicked_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag2_opened_count, lag2_clicked_count

  📌 Remove multicollinear feature
     lag2_opened_count and lag2_send_hour_sum are highly correlated (r=0.88)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag2_opened_count, lag2_send_hour_sum

  📌 Remove multicollinear feature
     lag2_opened_count and lag2_send_hour_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag2_opened_count, lag2_send_hour_count

  📌 Remove multicollinear feature
     lag2_opened_count and lag2_bounced_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag2_opened_count, lag2_bounced_count

  📌 Remove multicollinear feature
     lag2_clicked_sum and lag2_clicked_mean are highly correlated (r=0.92)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag2_clicked_sum, lag2_clicked_mean

  📌 Remove multicollinear feature
     lag2_clicked_count and lag2_send_hour_sum are highly correlated (r=0.88)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag2_clicked_count, lag2_send_hour_sum

  📌 Remove multicollinear feature
     lag2_clicked_count and lag2_send_hour_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag2_clicked_count, lag2_send_hour_count

  📌 Remove multicollinear feature
     lag2_clicked_count and lag2_bounced_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag2_clicked_count, lag2_bounced_count

  📌 Remove multicollinear feature
     lag2_send_hour_sum and lag2_send_hour_count are highly correlated (r=0.88)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag2_send_hour_sum, lag2_send_hour_count

  📌 Remove multicollinear feature
     lag2_send_hour_sum and lag2_bounced_count are highly correlated (r=0.88)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag2_send_hour_sum, lag2_bounced_count

  📌 Remove multicollinear feature
     lag2_send_hour_mean and lag2_send_hour_max are highly correlated (r=0.96)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag2_send_hour_mean, lag2_send_hour_max

  📌 Remove multicollinear feature
     lag2_send_hour_count and lag2_bounced_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag2_send_hour_count, lag2_bounced_count

  📌 Remove multicollinear feature
     lag2_time_to_open_hours_sum and lag2_time_to_open_hours_mean are highly correlated (r=0.94)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag2_time_to_open_hours_sum, lag2_time_to_open_hours_mean

  📌 Remove multicollinear feature
     lag2_time_to_open_hours_sum and lag2_time_to_open_hours_max are highly correlated (r=0.99)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag2_time_to_open_hours_sum, lag2_time_to_open_hours_max

  📌 Remove multicollinear feature
     lag2_time_to_open_hours_mean and lag2_time_to_open_hours_max are highly correlated (r=0.98)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag2_time_to_open_hours_mean, lag2_time_to_open_hours_max

  📌 Remove multicollinear feature
     lag3_opened_sum and lag3_opened_mean are highly correlated (r=0.94)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag3_opened_sum, lag3_opened_mean

  📌 Remove multicollinear feature
     lag3_opened_sum and lag3_time_to_open_hours_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag3_opened_sum, lag3_time_to_open_hours_count

  📌 Remove multicollinear feature
     lag3_opened_mean and lag3_time_to_open_hours_count are highly correlated (r=0.94)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag3_opened_mean, lag3_time_to_open_hours_count

  📌 Remove multicollinear feature
     lag3_opened_count and lag3_clicked_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag3_opened_count, lag3_clicked_count

  📌 Remove multicollinear feature
     lag3_opened_count and lag3_send_hour_sum are highly correlated (r=0.87)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag3_opened_count, lag3_send_hour_sum

  📌 Remove multicollinear feature
     lag3_opened_count and lag3_send_hour_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag3_opened_count, lag3_send_hour_count

  📌 Remove multicollinear feature
     lag3_opened_count and lag3_bounced_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag3_opened_count, lag3_bounced_count

  📌 Remove multicollinear feature
     lag3_clicked_count and lag3_send_hour_sum are highly correlated (r=0.87)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag3_clicked_count, lag3_send_hour_sum

  📌 Remove multicollinear feature
     lag3_clicked_count and lag3_send_hour_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag3_clicked_count, lag3_send_hour_count

  📌 Remove multicollinear feature
     lag3_clicked_count and lag3_bounced_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag3_clicked_count, lag3_bounced_count

  📌 Remove multicollinear feature
     lag3_send_hour_sum and lag3_send_hour_count are highly correlated (r=0.87)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag3_send_hour_sum, lag3_send_hour_count

  📌 Remove multicollinear feature
     lag3_send_hour_sum and lag3_bounced_count are highly correlated (r=0.87)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag3_send_hour_sum, lag3_bounced_count

  📌 Remove multicollinear feature
     lag3_send_hour_mean and lag3_send_hour_max are highly correlated (r=0.96)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag3_send_hour_mean, lag3_send_hour_max

  📌 Remove multicollinear feature
     lag3_send_hour_count and lag3_bounced_count are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag3_send_hour_count, lag3_bounced_count

  📌 Remove multicollinear feature
     lag3_time_to_open_hours_sum and lag3_time_to_open_hours_mean are highly correlated (r=0.99)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag3_time_to_open_hours_sum, lag3_time_to_open_hours_mean

  📌 Remove multicollinear feature
     lag3_time_to_open_hours_sum and lag3_time_to_open_hours_max are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag3_time_to_open_hours_sum, lag3_time_to_open_hours_max

  📌 Remove multicollinear feature
     lag3_time_to_open_hours_mean and lag3_time_to_open_hours_max are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: lag3_time_to_open_hours_mean, lag3_time_to_open_hours_max

  📌 Remove multicollinear feature
     opened_velocity and opened_velocity_pct are highly correlated (r=0.92)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: opened_velocity, opened_velocity_pct

  📌 Remove multicollinear feature
     opened_velocity_pct and opened_vs_cohort_mean are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: opened_velocity_pct, opened_vs_cohort_mean

  📌 Remove multicollinear feature
     opened_velocity_pct and opened_vs_cohort_pct are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: opened_velocity_pct, opened_vs_cohort_pct

  📌 Remove multicollinear feature
     opened_velocity_pct and opened_cohort_zscore are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: opened_velocity_pct, opened_cohort_zscore

  📌 Remove multicollinear feature
     send_hour_velocity and send_hour_velocity_pct are highly correlated (r=0.89)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: send_hour_velocity, send_hour_velocity_pct

  📌 Remove multicollinear feature
     bounced_velocity and bounced_acceleration are highly correlated (r=0.87)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: bounced_velocity, bounced_acceleration

  📌 Remove multicollinear feature
     time_to_open_hours_momentum and time_to_open_hours_vs_cohort_mean are highly correlated (r=0.89)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_momentum, time_to_open_hours_vs_cohort_mean

  📌 Remove multicollinear feature
     time_to_open_hours_momentum and time_to_open_hours_vs_cohort_pct are highly correlated (r=0.89)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_momentum, time_to_open_hours_vs_cohort_pct

  📌 Remove multicollinear feature
     time_to_open_hours_momentum and time_to_open_hours_cohort_zscore are highly correlated (r=0.89)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_momentum, time_to_open_hours_cohort_zscore

  📌 Remove multicollinear feature
     clicked_end and clicked_trend_ratio are highly correlated (r=0.92)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: clicked_end, clicked_trend_ratio

  📌 Remove multicollinear feature
     bounced_end and bounced_trend_ratio are highly correlated (r=0.99)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: bounced_end, bounced_trend_ratio

  📌 Remove multicollinear feature
     days_since_first_event_y and active_span_days are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: days_since_first_event_y, active_span_days

  📌 Remove multicollinear feature
     inter_event_gap_std and inter_event_gap_max are highly correlated (r=0.89)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: inter_event_gap_std, inter_event_gap_max
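
The repeated "Action" advice in this output (drop one feature of each highly correlated pair, keeping the one with the higher target correlation) can be sketched as a greedy pass over the correlation matrix. This is an illustrative sketch only, not the recommender's own logic: `prune_correlated_features`, the 0.85 threshold, and the synthetic data are all assumptions, and the column names are borrowed from the output purely for readability.

```python
import numpy as np
import pandas as pd


def prune_correlated_features(df, target, threshold=0.85):
    """Greedy pruning (hypothetical helper): for each feature pair whose
    absolute correlation is at least `threshold`, drop the feature with the
    weaker absolute correlation to `target`."""
    features = [c for c in df.columns if c != target]
    corr = df[features].corr().abs()
    target_corr = df[features].corrwith(df[target]).abs().fillna(0.0)

    # Collect each qualifying pair once (upper triangle only).
    pairs = [
        (corr.loc[a, b], a, b)
        for i, a in enumerate(features)
        for b in features[i + 1:]
        if corr.loc[a, b] >= threshold
    ]

    dropped = set()
    # Resolve the most strongly correlated pairs first.
    for _, a, b in sorted(pairs, reverse=True):
        if a in dropped or b in dropped:
            continue  # pair already resolved by an earlier drop
        weaker = a if target_corr[a] < target_corr[b] else b
        dropped.add(weaker)

    kept = [f for f in features if f not in dropped]
    return kept, sorted(dropped)


# Synthetic data (assumption): two perfectly correlated columns plus one
# independent column; the values do not come from the notebook's dataset.
rng = np.random.default_rng(0)
base = rng.normal(size=500)
demo = pd.DataFrame({
    "lag0_bounced_sum": base,
    "bounced_cohort_zscore": (base - base.mean()) / base.std(),
    "inter_event_gap_std": rng.normal(size=500),
})
demo["target"] = (base + rng.normal(scale=0.5, size=500) > 0).astype(int)

kept, dropped = prune_correlated_features(demo, target="target")
print("kept:", kept)
print("dropped:", dropped)
```

When two features are correlated at r=1.00 their target correlations are effectively tied, so the choice between them is arbitrary here. The "stronger business meaning" criterion cannot be automated, so the dropped list is best treated as a set of candidates to review by hand.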

  πŸ“Œ Remove multicollinear feature
     opened_vs_cohort_mean and opened_vs_cohort_pct are highly correlated (r=1.00)
     β†’ Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     β†’ Features: opened_vs_cohort_mean, opened_vs_cohort_pct

  πŸ“Œ Remove multicollinear feature
     opened_vs_cohort_mean and opened_cohort_zscore are highly correlated (r=1.00)
     β†’ Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     β†’ Features: opened_vs_cohort_mean, opened_cohort_zscore

  πŸ“Œ Remove multicollinear feature
     opened_vs_cohort_pct and opened_cohort_zscore are highly correlated (r=1.00)
     β†’ Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     β†’ Features: opened_vs_cohort_pct, opened_cohort_zscore

  πŸ“Œ Remove multicollinear feature
     clicked_vs_cohort_mean and clicked_vs_cohort_pct are highly correlated (r=1.00)
     β†’ Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     β†’ Features: clicked_vs_cohort_mean, clicked_vs_cohort_pct

  πŸ“Œ Remove multicollinear feature
     clicked_vs_cohort_mean and clicked_cohort_zscore are highly correlated (r=1.00)
     β†’ Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     β†’ Features: clicked_vs_cohort_mean, clicked_cohort_zscore

  πŸ“Œ Remove multicollinear feature
     clicked_vs_cohort_pct and clicked_cohort_zscore are highly correlated (r=1.00)
     β†’ Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     β†’ Features: clicked_vs_cohort_pct, clicked_cohort_zscore

  📌 Remove multicollinear feature
     send_hour_vs_cohort_mean and send_hour_vs_cohort_pct are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: send_hour_vs_cohort_mean, send_hour_vs_cohort_pct

  📌 Remove multicollinear feature
     send_hour_vs_cohort_mean and send_hour_cohort_zscore are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: send_hour_vs_cohort_mean, send_hour_cohort_zscore

  📌 Remove multicollinear feature
     send_hour_vs_cohort_pct and send_hour_cohort_zscore are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: send_hour_vs_cohort_pct, send_hour_cohort_zscore

  📌 Remove multicollinear feature
     bounced_vs_cohort_mean and bounced_vs_cohort_pct are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: bounced_vs_cohort_mean, bounced_vs_cohort_pct

  📌 Remove multicollinear feature
     bounced_vs_cohort_mean and bounced_cohort_zscore are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: bounced_vs_cohort_mean, bounced_cohort_zscore

  📌 Remove multicollinear feature
     bounced_vs_cohort_pct and bounced_cohort_zscore are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: bounced_vs_cohort_pct, bounced_cohort_zscore

  📌 Remove multicollinear feature
     time_to_open_hours_vs_cohort_mean and time_to_open_hours_vs_cohort_pct are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_vs_cohort_mean, time_to_open_hours_vs_cohort_pct

  📌 Remove multicollinear feature
     time_to_open_hours_vs_cohort_mean and time_to_open_hours_cohort_zscore are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_vs_cohort_mean, time_to_open_hours_cohort_zscore

  📌 Remove multicollinear feature
     time_to_open_hours_vs_cohort_pct and time_to_open_hours_cohort_zscore are highly correlated (r=1.00)
     → Action: Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.
     → Features: time_to_open_hours_vs_cohort_pct, time_to_open_hours_cohort_zscore

  📌 Prioritize strong predictors
     Top predictive features: days_since_last_event_x, days_since_first_event_y, active_span_days
     → Action: Ensure these features are included in your model and check for data quality issues.
     → Features: days_since_last_event_x, days_since_first_event_y, active_span_days

  📌 Stratify by lifecycle_quadrant
     Significant variation in retention rates across lifecycle_quadrant categories (spread: 75.1%)
     → Action: Use stratified sampling by lifecycle_quadrant in train/test split to ensure all segments are represented.
     → Features: lifecycle_quadrant

  📌 Stratify by recency_bucket
     Significant variation in retention rates across recency_bucket categories (spread: 66.6%)
     → Action: Use stratified sampling by recency_bucket in train/test split to ensure all segments are represented.
     → Features: recency_bucket

  📌 Monitor high-risk segments
     Segments with below-average retention: Steady & Loyal, Occasional & Loyal, 0-7d
     → Action: Target these segments for intervention campaigns and ensure adequate representation in training data.
     → Features: lifecycle_quadrant, lifecycle_quadrant, recency_bucket

✅ Persisted 442 multicollinearity recommendations
✅ Persisted 51 strong predictor recommendations
✅ Persisted 10 weak predictor recommendations
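
The stratification recommendations above (by lifecycle_quadrant and recency_bucket) can be illustrated with a small standard-library sketch. This is not the pipeline's own splitting code: the helper name `stratified_split`, the toy rows, and the category labels "A" and "B" are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(rows, key, test_fraction=0.2, seed=42):
    """Split rows into train/test so every category of `key` lands in both.

    `rows` is a list of dicts; `key` names the categorical column to stratify on.
    (Hypothetical helper, not part of customer_retention.)
    """
    rng = random.Random(seed)
    by_category = defaultdict(list)
    for row in rows:
        by_category[row[key]].append(row)

    train, test = [], []
    for category in sorted(by_category):
        members = by_category[category][:]
        rng.shuffle(members)
        # Hold out at least one row per category, but never the whole
        # category when it has more than one member.
        n_test = max(1, round(len(members) * test_fraction))
        if len(members) > 1:
            n_test = min(n_test, len(members) - 1)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


# Toy data with an imbalanced categorical column (labels are illustrative).
rows = (
    [{"lifecycle_quadrant": "A", "retained": 1} for _ in range(50)]
    + [{"lifecycle_quadrant": "B", "retained": 0} for _ in range(10)]
)
train, test = stratified_split(rows, key="lifecycle_quadrant")
print(len(train), len(test))
```

With a pandas DataFrame, scikit-learn's `train_test_split` accepts a `stratify` argument that serves the same purpose; the version above only makes the per-category logic explicit.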

5.9.1 Feature Selection Recommendations

What these recommendations tell you:

  • Which features to prioritize (strong predictors)
  • Which features to consider dropping (weak predictors, redundant features)
  • Which feature pairs cause multicollinearity issues

📊 Decision Guide:

| Finding | Linear Models | Tree-Based Models |
|---|---|---|
| Strong predictors | Include - will have high coefficients | Include - will appear early in splits |
| Weak predictors | Consider dropping | May help in interactions |
| Multicollinear pairs | Drop one feature | Can keep both (trees handle it) |
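
For linear models, the "drop one feature" rule in the guide can be applied greedily. The sketch below is one possible approach under stated assumptions, not the recommender's actual logic: `prune_correlated` is a hypothetical helper, the 0.7 threshold follows the "High Correlation (r > 0.7)" row in the chapter introduction, and the tie-breaking rule (keep the feature with the stronger target correlation) is only one of the criteria the notebook suggests. The pair dictionaries use the same `feature1`, `feature2`, and `correlation` keys that the notebook prints, and the numbers are taken from this chapter's output.

```python
def prune_correlated(pairs, target_corr, threshold=0.7):
    """Greedily pick one feature to drop from each highly correlated pair.

    pairs: list of dicts with 'feature1', 'feature2', 'correlation'.
    target_corr: dict mapping feature name -> correlation with the target.
    Returns the set of feature names to drop. (Hypothetical helper.)
    """
    dropped = set()
    # Resolve the most redundant pairs first.
    for pair in sorted(pairs, key=lambda p: abs(p["correlation"]), reverse=True):
        f1, f2 = pair["feature1"], pair["feature2"]
        if abs(pair["correlation"]) < threshold:
            continue
        if f1 in dropped or f2 in dropped:
            continue  # an earlier drop already broke this pair
        # Keep the feature with the stronger absolute target correlation;
        # on a tie, the second feature of the pair is dropped.
        s1 = abs(target_corr.get(f1, 0.0))
        s2 = abs(target_corr.get(f2, 0.0))
        dropped.add(f1 if s1 < s2 else f2)
    return dropped


pairs = [
    {"feature1": "days_since_last_event_x", "feature2": "days_since_first_event_y", "correlation": -0.99},
    {"feature1": "days_since_last_event_x", "feature2": "active_span_days", "correlation": -0.99},
    {"feature1": "days_since_first_event_y", "feature2": "active_span_days", "correlation": 1.00},
]
target_corr = {
    "days_since_last_event_x": 0.767,
    "days_since_first_event_y": -0.762,
    "active_span_days": -0.762,
}
to_drop = prune_correlated(pairs, target_corr)
print(sorted(to_drop))
```

On these three features the sketch keeps days_since_last_event_x and drops the other two. A greedy pass like this is order-dependent, so business meaning and missing-value rates should still be reviewed before committing to the result.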
In [13]:
# Feature Selection Recommendations
selection_recs = grouped_recs.get(RecommendationCategory.FEATURE_SELECTION, [])

print("=" * 70)
print("FEATURE SELECTION")
print("=" * 70)

# Strong predictors summary
if analysis_summary.strong_predictors:
    print("\n✅ STRONG PREDICTORS (prioritize these):")
    strong_df = pd.DataFrame(analysis_summary.strong_predictors)
    # Format as signed strings for display
    strong_df["effect_size"] = strong_df["effect_size"].apply(lambda x: f"{x:+.3f}")
    strong_df["correlation"] = strong_df["correlation"].apply(lambda x: f"{x:+.3f}")
    # Sort by absolute effect size; regex=False treats "+" as a literal character
    # (older pandas versions default to regex=True, where a bare "+" is an invalid pattern)
    strong_df = strong_df.sort_values("effect_size", key=lambda x: x.str.replace("+", "", regex=False).astype(float).abs(), ascending=False)
    display(strong_df)

    print("\n   💡 These features show strong discrimination between retained/churned customers.")
    print("   → Ensure they're included in your model")
    print("   → Check for data quality issues that could inflate their importance")

# Weak predictors summary (only the first five names are printed)
if analysis_summary.weak_predictors:
    print(f"\n⚪ WEAK PREDICTORS (consider dropping): {', '.join(analysis_summary.weak_predictors[:5])}")
    print("   → Low individual predictive power, but may help in combination")

# Multicollinearity summary
if analysis_summary.multicollinear_pairs:
    print("\n⚠️ MULTICOLLINEAR PAIRS (drop one from each pair for linear models):")
    for pair in analysis_summary.multicollinear_pairs:
        print(f"   • {pair['feature1']} ↔ {pair['feature2']}: r = {pair['correlation']:.2f}")
    print("\n   💡 For each pair, keep the feature with:")
    print("      - Stronger business meaning")
    print("      - Higher target correlation")
    print("      - Fewer missing values")

# Display all feature selection recommendations
if selection_recs:
    print("\n" + "-" * 70)
    print("DETAILED RECOMMENDATIONS:")
    for rec in selection_recs:
        priority_icon = "🔴" if rec.priority == "high" else "🟡" if rec.priority == "medium" else "🟢"
        print(f"\n{priority_icon} {rec.title}")
        print(f"   {rec.description}")
        print(f"   → {rec.action}")
======================================================================
FEATURE SELECTION
======================================================================

✅ STRONG PREDICTORS (prioritize these):
feature correlation effect_size
32 days_since_last_event_x +0.767 +2.403
45 active_span_days -0.762 -2.365
44 days_since_first_event_y -0.762 -2.365
18 bounced_count_365d -0.604 -1.524
1 event_count_365d -0.604 -1.524
17 send_hour_count_365d -0.604 -1.524
15 clicked_count_365d -0.604 -1.524
13 opened_count_365d -0.604 -1.524
16 send_hour_sum_365d -0.593 -1.481
0 event_count_180d -0.503 -1.170
8 send_hour_count_180d -0.503 -1.170
6 clicked_count_180d -0.503 -1.170
5 opened_count_180d -0.503 -1.170
9 bounced_count_180d -0.503 -1.170
7 send_hour_sum_180d -0.492 -1.136
31 time_to_open_hours_count_all_time -0.448 -1.008
21 opened_sum_all_time -0.448 -1.008
38 opened_end -0.447 -1.007
20 time_to_open_hours_count_365d -0.430 -0.957
11 opened_sum_365d -0.430 -0.957
47 inter_event_gap_max -0.419 -0.929
29 bounced_count_all_time -0.406 -0.893
28 send_hour_count_all_time -0.406 -0.893
25 clicked_count_all_time -0.406 -0.893
2 event_count_all_time -0.406 -0.893
23 opened_count_all_time -0.406 -0.893
26 send_hour_sum_all_time -0.403 -0.886
22 opened_mean_all_time -0.385 -0.839
30 time_to_open_hours_sum_all_time -0.379 -0.823
43 time_to_open_hours_end -0.350 -0.753
42 send_hour_end -0.343 -0.735
34 lag0_opened_mean -0.341 -0.729
12 opened_mean_365d -0.186 -0.705
39 opened_trend_ratio -0.317 -0.701
41 send_hour_beginning -0.327 -0.697
24 clicked_sum_all_time -0.325 -0.692
19 time_to_open_hours_sum_365d -0.323 -0.687
3 opened_sum_180d -0.313 -0.663
10 time_to_open_hours_count_180d -0.313 -0.663
48 opened_vs_cohort_mean -0.300 -0.632
49 opened_vs_cohort_pct -0.300 -0.632
50 opened_cohort_zscore -0.300 -0.632
35 lag0_time_to_open_hours_count -0.300 -0.632
33 lag0_opened_sum -0.300 -0.632
40 clicked_end -0.295 -0.623
27 send_hour_max_all_time -0.283 -0.594
4 opened_mean_180d -0.131 -0.579
37 opened_beginning -0.265 -0.555
46 inter_event_gap_std -0.263 -0.549
14 clicked_sum_365d -0.262 -0.546
36 lag1_opened_mean -0.258 -0.534
   💡 These features show strong discrimination between retained/churned customers.
   → Ensure they're included in your model
   → Check for data quality issues that could inflate their importance

⚪ WEAK PREDICTORS (consider dropping): send_hour_mean_180d, send_hour_max_180d, bounced_sum_180d, bounced_mean_180d, time_to_open_hours_mean_180d
   → Low individual predictive power, but may help in combination

⚠️ MULTICOLLINEAR PAIRS (drop one from each pair for linear models):
   • event_count_180d ↔ event_count_365d: r = 0.82
   • event_count_180d ↔ opened_count_180d: r = 1.00
   • event_count_180d ↔ clicked_count_180d: r = 1.00
   • event_count_180d ↔ send_hour_sum_180d: r = 0.98
   • event_count_180d ↔ send_hour_count_180d: r = 1.00
   • event_count_180d ↔ bounced_count_180d: r = 1.00
   • event_count_180d ↔ opened_count_365d: r = 0.82
   • event_count_180d ↔ clicked_count_365d: r = 0.82
   • event_count_180d ↔ send_hour_sum_365d: r = 0.80
   • event_count_180d ↔ send_hour_count_365d: r = 0.82
   • event_count_180d ↔ bounced_count_365d: r = 0.82
   • event_count_365d ↔ opened_count_180d: r = 0.82
   • event_count_365d ↔ clicked_count_180d: r = 0.82
   • event_count_365d ↔ send_hour_sum_180d: r = 0.80
   • event_count_365d ↔ send_hour_count_180d: r = 0.82
   • event_count_365d ↔ bounced_count_180d: r = 0.82
   • event_count_365d ↔ opened_count_365d: r = 1.00
   • event_count_365d ↔ clicked_count_365d: r = 1.00
   • event_count_365d ↔ send_hour_sum_365d: r = 0.98
   • event_count_365d ↔ send_hour_count_365d: r = 1.00
   • event_count_365d ↔ bounced_count_365d: r = 1.00
   • event_count_all_time ↔ opened_sum_all_time: r = 0.80
   • event_count_all_time ↔ opened_count_all_time: r = 1.00
   • event_count_all_time ↔ clicked_count_all_time: r = 1.00
   • event_count_all_time ↔ send_hour_sum_all_time: r = 0.99
   • event_count_all_time ↔ send_hour_count_all_time: r = 1.00
   • event_count_all_time ↔ bounced_count_all_time: r = 1.00
   • event_count_all_time ↔ time_to_open_hours_count_all_time: r = 0.80
   • event_count_all_time ↔ send_hour_beginning: r = 0.84
   • event_count_all_time ↔ send_hour_end: r = 0.84
   • opened_sum_180d ↔ opened_mean_180d: r = 0.82
   • opened_sum_180d ↔ time_to_open_hours_sum_180d: r = 0.74
   • opened_sum_180d ↔ time_to_open_hours_count_180d: r = 1.00
   • opened_sum_180d ↔ opened_sum_365d: r = 0.75
   • opened_sum_180d ↔ time_to_open_hours_count_365d: r = 0.75
   • opened_mean_180d ↔ time_to_open_hours_count_180d: r = 0.82
   • opened_mean_180d ↔ opened_mean_365d: r = 0.79
   • opened_mean_180d ↔ lag0_opened_sum: r = 0.84
   • opened_mean_180d ↔ lag0_opened_mean: r = 0.90
   • opened_mean_180d ↔ lag0_time_to_open_hours_count: r = 0.84
   • opened_mean_180d ↔ opened_vs_cohort_mean: r = 0.84
   • opened_mean_180d ↔ opened_vs_cohort_pct: r = 0.84
   • opened_mean_180d ↔ opened_cohort_zscore: r = 0.84
   • opened_count_180d ↔ clicked_count_180d: r = 1.00
   • opened_count_180d ↔ send_hour_sum_180d: r = 0.98
   • opened_count_180d ↔ send_hour_count_180d: r = 1.00
   • opened_count_180d ↔ bounced_count_180d: r = 1.00
   • opened_count_180d ↔ opened_count_365d: r = 0.82
   • opened_count_180d ↔ clicked_count_365d: r = 0.82
   • opened_count_180d ↔ send_hour_sum_365d: r = 0.80
   • opened_count_180d ↔ send_hour_count_365d: r = 0.82
   • opened_count_180d ↔ bounced_count_365d: r = 0.82
   • clicked_sum_180d ↔ clicked_mean_180d: r = 0.88
   • clicked_sum_180d ↔ clicked_sum_365d: r = 0.71
   • clicked_mean_180d ↔ clicked_mean_365d: r = 0.77
   • clicked_mean_180d ↔ lag0_clicked_sum: r = 0.84
   • clicked_mean_180d ↔ lag0_clicked_mean: r = 0.88
   • clicked_mean_180d ↔ clicked_vs_cohort_mean: r = 0.84
   • clicked_mean_180d ↔ clicked_vs_cohort_pct: r = 0.84
   • clicked_mean_180d ↔ clicked_cohort_zscore: r = 0.84
   • clicked_count_180d ↔ send_hour_sum_180d: r = 0.98
   • clicked_count_180d ↔ send_hour_count_180d: r = 1.00
   • clicked_count_180d ↔ bounced_count_180d: r = 1.00
   • clicked_count_180d ↔ opened_count_365d: r = 0.82
   • clicked_count_180d ↔ clicked_count_365d: r = 0.82
   • clicked_count_180d ↔ send_hour_sum_365d: r = 0.80
   • clicked_count_180d ↔ send_hour_count_365d: r = 0.82
   • clicked_count_180d ↔ bounced_count_365d: r = 0.82
   • send_hour_sum_180d ↔ send_hour_count_180d: r = 0.98
   • send_hour_sum_180d ↔ bounced_count_180d: r = 0.98
   • send_hour_sum_180d ↔ opened_count_365d: r = 0.80
   • send_hour_sum_180d ↔ clicked_count_365d: r = 0.80
   • send_hour_sum_180d ↔ send_hour_sum_365d: r = 0.81
   • send_hour_sum_180d ↔ send_hour_count_365d: r = 0.80
   • send_hour_sum_180d ↔ bounced_count_365d: r = 0.80
   • send_hour_mean_180d ↔ send_hour_max_180d: r = 0.88
   • send_hour_mean_180d ↔ send_hour_mean_365d: r = 0.81
   • send_hour_mean_180d ↔ lag0_send_hour_mean: r = 0.89
   • send_hour_mean_180d ↔ lag0_send_hour_max: r = 0.86
   • send_hour_max_180d ↔ send_hour_mean_365d: r = 0.73
   • send_hour_max_180d ↔ send_hour_max_365d: r = 0.75
   • send_hour_max_180d ↔ lag0_send_hour_mean: r = 0.79
   • send_hour_max_180d ↔ lag0_send_hour_max: r = 0.81
   • send_hour_count_180d ↔ bounced_count_180d: r = 1.00
   • send_hour_count_180d ↔ opened_count_365d: r = 0.82
   • send_hour_count_180d ↔ clicked_count_365d: r = 0.82
   • send_hour_count_180d ↔ send_hour_sum_365d: r = 0.80
   • send_hour_count_180d ↔ send_hour_count_365d: r = 0.82
   • send_hour_count_180d ↔ bounced_count_365d: r = 0.82
   • bounced_sum_180d ↔ bounced_mean_180d: r = 0.89
   • bounced_sum_180d ↔ bounced_sum_365d: r = 0.70
   • bounced_mean_180d ↔ bounced_mean_365d: r = 0.80
   • bounced_mean_180d ↔ lag0_bounced_sum: r = 0.85
   • bounced_mean_180d ↔ lag0_bounced_mean: r = 0.88
   • bounced_mean_180d ↔ bounced_vs_cohort_mean: r = 0.85
   • bounced_mean_180d ↔ bounced_vs_cohort_pct: r = 0.85
   • bounced_mean_180d ↔ bounced_cohort_zscore: r = 0.85
   • bounced_count_180d ↔ opened_count_365d: r = 0.82
   • bounced_count_180d ↔ clicked_count_365d: r = 0.82
   • bounced_count_180d ↔ send_hour_sum_365d: r = 0.80
   • bounced_count_180d ↔ send_hour_count_365d: r = 0.82
   • bounced_count_180d ↔ bounced_count_365d: r = 0.82
   • time_to_open_hours_sum_180d ↔ time_to_open_hours_mean_180d: r = 0.89
   • time_to_open_hours_sum_180d ↔ time_to_open_hours_max_180d: r = 0.96
   • time_to_open_hours_sum_180d ↔ time_to_open_hours_count_180d: r = 0.74
   • time_to_open_hours_sum_180d ↔ time_to_open_hours_sum_365d: r = 0.71
   • time_to_open_hours_mean_180d ↔ time_to_open_hours_max_180d: r = 0.97
   • time_to_open_hours_mean_180d ↔ time_to_open_hours_sum_365d: r = 0.73
   • time_to_open_hours_mean_180d ↔ time_to_open_hours_mean_365d: r = 0.94
   • time_to_open_hours_mean_180d ↔ time_to_open_hours_max_365d: r = 0.88
   • time_to_open_hours_mean_180d ↔ lag0_time_to_open_hours_sum: r = 0.70
   • time_to_open_hours_mean_180d ↔ lag0_time_to_open_hours_mean: r = 0.98
   • time_to_open_hours_mean_180d ↔ lag0_time_to_open_hours_max: r = 0.96
   • time_to_open_hours_mean_180d ↔ lag1_time_to_open_hours_mean: r = 0.73
   • time_to_open_hours_mean_180d ↔ lag1_time_to_open_hours_max: r = 0.73
   • time_to_open_hours_mean_180d ↔ lag2_time_to_open_hours_mean: r = 0.94
   • time_to_open_hours_mean_180d ↔ lag2_time_to_open_hours_max: r = 0.94
   • time_to_open_hours_mean_180d ↔ lag3_time_to_open_hours_mean: r = 0.86
   • time_to_open_hours_mean_180d ↔ lag3_time_to_open_hours_max: r = 0.87
   • time_to_open_hours_mean_180d ↔ time_to_open_hours_vs_cohort_mean: r = 0.70
   • time_to_open_hours_mean_180d ↔ time_to_open_hours_vs_cohort_pct: r = 0.70
   • time_to_open_hours_mean_180d ↔ time_to_open_hours_cohort_zscore: r = 0.70
   • time_to_open_hours_max_180d ↔ time_to_open_hours_sum_365d: r = 0.81
   • time_to_open_hours_max_180d ↔ time_to_open_hours_mean_365d: r = 0.91
   • time_to_open_hours_max_180d ↔ time_to_open_hours_max_365d: r = 0.91
   • time_to_open_hours_max_180d ↔ lag0_time_to_open_hours_sum: r = 0.74
   • time_to_open_hours_max_180d ↔ lag0_time_to_open_hours_mean: r = 0.95
   • time_to_open_hours_max_180d ↔ lag0_time_to_open_hours_max: r = 0.96
   • time_to_open_hours_max_180d ↔ lag2_time_to_open_hours_mean: r = 0.90
   • time_to_open_hours_max_180d ↔ lag2_time_to_open_hours_max: r = 0.90
   • time_to_open_hours_max_180d ↔ lag3_time_to_open_hours_mean: r = 0.76
   • time_to_open_hours_max_180d ↔ lag3_time_to_open_hours_max: r = 0.77
   • time_to_open_hours_max_180d ↔ time_to_open_hours_momentum: r = 0.72
   • time_to_open_hours_max_180d ↔ time_to_open_hours_vs_cohort_mean: r = 0.74
   • time_to_open_hours_max_180d ↔ time_to_open_hours_vs_cohort_pct: r = 0.74
   • time_to_open_hours_max_180d ↔ time_to_open_hours_cohort_zscore: r = 0.74
   • time_to_open_hours_count_180d ↔ opened_sum_365d: r = 0.75
   • time_to_open_hours_count_180d ↔ time_to_open_hours_count_365d: r = 0.75
   • opened_sum_365d ↔ opened_mean_365d: r = 0.76
   • opened_sum_365d ↔ time_to_open_hours_sum_365d: r = 0.75
   • opened_sum_365d ↔ time_to_open_hours_count_365d: r = 1.00
   • opened_mean_365d ↔ time_to_open_hours_count_365d: r = 0.76
   • opened_mean_365d ↔ lag0_opened_sum: r = 0.72
   • opened_mean_365d ↔ lag0_opened_mean: r = 0.76
   • opened_mean_365d ↔ lag0_time_to_open_hours_count: r = 0.72
   • opened_mean_365d ↔ opened_vs_cohort_mean: r = 0.72
   • opened_mean_365d ↔ opened_vs_cohort_pct: r = 0.72
   • opened_mean_365d ↔ opened_cohort_zscore: r = 0.72
   • opened_count_365d ↔ clicked_count_365d: r = 1.00
   • opened_count_365d ↔ send_hour_sum_365d: r = 0.98
   • opened_count_365d ↔ send_hour_count_365d: r = 1.00
   • opened_count_365d ↔ bounced_count_365d: r = 1.00
   • clicked_sum_365d ↔ clicked_mean_365d: r = 0.84
   • clicked_mean_365d ↔ lag0_clicked_sum: r = 0.71
   • clicked_mean_365d ↔ lag0_clicked_mean: r = 0.74
   • clicked_mean_365d ↔ clicked_vs_cohort_mean: r = 0.71
   • clicked_mean_365d ↔ clicked_vs_cohort_pct: r = 0.71
   • clicked_mean_365d ↔ clicked_cohort_zscore: r = 0.71
   • clicked_count_365d ↔ send_hour_sum_365d: r = 0.98
   • clicked_count_365d ↔ send_hour_count_365d: r = 1.00
   • clicked_count_365d ↔ bounced_count_365d: r = 1.00
   • send_hour_sum_365d ↔ send_hour_count_365d: r = 0.98
   • send_hour_sum_365d ↔ bounced_count_365d: r = 0.98
   • send_hour_mean_365d ↔ send_hour_max_365d: r = 0.81
   • send_hour_mean_365d ↔ lag0_send_hour_mean: r = 0.78
   • send_hour_mean_365d ↔ lag0_send_hour_max: r = 0.75
   • send_hour_count_365d ↔ bounced_count_365d: r = 1.00
   • bounced_sum_365d ↔ bounced_mean_365d: r = 0.84
   • bounced_mean_365d ↔ lag0_bounced_sum: r = 0.74
   • bounced_mean_365d ↔ lag0_bounced_mean: r = 0.77
   • bounced_mean_365d ↔ bounced_vs_cohort_mean: r = 0.74
   • bounced_mean_365d ↔ bounced_vs_cohort_pct: r = 0.74
   • bounced_mean_365d ↔ bounced_cohort_zscore: r = 0.74
   • time_to_open_hours_sum_365d ↔ time_to_open_hours_mean_365d: r = 0.83
   • time_to_open_hours_sum_365d ↔ time_to_open_hours_max_365d: r = 0.94
   • time_to_open_hours_sum_365d ↔ time_to_open_hours_count_365d: r = 0.75
   • time_to_open_hours_mean_365d ↔ time_to_open_hours_max_365d: r = 0.95
   • time_to_open_hours_mean_365d ↔ lag0_time_to_open_hours_mean: r = 0.94
   • time_to_open_hours_mean_365d ↔ lag0_time_to_open_hours_max: r = 0.93
   • time_to_open_hours_mean_365d ↔ lag1_time_to_open_hours_mean: r = 0.85
   • time_to_open_hours_mean_365d ↔ lag1_time_to_open_hours_max: r = 0.85
   • time_to_open_hours_mean_365d ↔ lag2_time_to_open_hours_mean: r = 0.81
   • time_to_open_hours_mean_365d ↔ lag2_time_to_open_hours_max: r = 0.81
   • time_to_open_hours_mean_365d ↔ lag3_time_to_open_hours_mean: r = 0.81
   • time_to_open_hours_mean_365d ↔ lag3_time_to_open_hours_max: r = 0.81
   • time_to_open_hours_max_365d ↔ lag0_time_to_open_hours_mean: r = 0.89
   • time_to_open_hours_max_365d ↔ lag0_time_to_open_hours_max: r = 0.91
   • time_to_open_hours_max_365d ↔ lag1_time_to_open_hours_mean: r = 0.72
   • time_to_open_hours_max_365d ↔ lag1_time_to_open_hours_max: r = 0.72
   • time_to_open_hours_max_365d ↔ lag2_time_to_open_hours_mean: r = 0.71
   • time_to_open_hours_max_365d ↔ lag2_time_to_open_hours_max: r = 0.71
   • time_to_open_hours_max_365d ↔ lag3_time_to_open_hours_mean: r = 0.77
   • time_to_open_hours_max_365d ↔ lag3_time_to_open_hours_max: r = 0.77
   • opened_sum_all_time ↔ opened_count_all_time: r = 0.80
   • opened_sum_all_time ↔ clicked_sum_all_time: r = 0.74
   • opened_sum_all_time ↔ clicked_count_all_time: r = 0.80
   • opened_sum_all_time ↔ send_hour_sum_all_time: r = 0.79
   • opened_sum_all_time ↔ send_hour_count_all_time: r = 0.80
   • opened_sum_all_time ↔ bounced_count_all_time: r = 0.80
   • opened_sum_all_time ↔ time_to_open_hours_sum_all_time: r = 0.86
   • opened_sum_all_time ↔ time_to_open_hours_count_all_time: r = 1.00
   • opened_sum_all_time ↔ opened_beginning: r = 0.76
   • opened_sum_all_time ↔ opened_end: r = 0.77
   • opened_count_all_time ↔ clicked_count_all_time: r = 1.00
   • opened_count_all_time ↔ send_hour_sum_all_time: r = 0.99
   • opened_count_all_time ↔ send_hour_count_all_time: r = 1.00
   • opened_count_all_time ↔ bounced_count_all_time: r = 1.00
   • opened_count_all_time ↔ time_to_open_hours_count_all_time: r = 0.80
   • opened_count_all_time ↔ send_hour_beginning: r = 0.84
   • opened_count_all_time ↔ send_hour_end: r = 0.84
   • clicked_sum_all_time ↔ clicked_mean_all_time: r = 0.78
   • clicked_sum_all_time ↔ time_to_open_hours_count_all_time: r = 0.74
   • clicked_count_all_time ↔ send_hour_sum_all_time: r = 0.99
   • clicked_count_all_time ↔ send_hour_count_all_time: r = 1.00
   • clicked_count_all_time ↔ bounced_count_all_time: r = 1.00
   • clicked_count_all_time ↔ time_to_open_hours_count_all_time: r = 0.80
   • clicked_count_all_time ↔ send_hour_beginning: r = 0.84
   • clicked_count_all_time ↔ send_hour_end: r = 0.84
   • send_hour_sum_all_time ↔ send_hour_count_all_time: r = 0.99
   • send_hour_sum_all_time ↔ bounced_count_all_time: r = 0.99
   • send_hour_sum_all_time ↔ time_to_open_hours_count_all_time: r = 0.79
   • send_hour_sum_all_time ↔ send_hour_beginning: r = 0.85
   • send_hour_sum_all_time ↔ send_hour_end: r = 0.85
   • send_hour_count_all_time ↔ bounced_count_all_time: r = 1.00
   • send_hour_count_all_time ↔ time_to_open_hours_count_all_time: r = 0.80
   • send_hour_count_all_time ↔ send_hour_beginning: r = 0.84
   • send_hour_count_all_time ↔ send_hour_end: r = 0.84
   • bounced_sum_all_time ↔ bounced_mean_all_time: r = 0.71
   • bounced_count_all_time ↔ time_to_open_hours_count_all_time: r = 0.80
   • bounced_count_all_time ↔ send_hour_beginning: r = 0.84
   • bounced_count_all_time ↔ send_hour_end: r = 0.84
   • time_to_open_hours_sum_all_time ↔ time_to_open_hours_max_all_time: r = 0.75
   • time_to_open_hours_sum_all_time ↔ time_to_open_hours_count_all_time: r = 0.86
   • time_to_open_hours_sum_all_time ↔ time_to_open_hours_beginning: r = 0.71
   • time_to_open_hours_sum_all_time ↔ time_to_open_hours_end: r = 0.71
   • time_to_open_hours_mean_all_time ↔ time_to_open_hours_max_all_time: r = 0.74
   • time_to_open_hours_count_all_time ↔ opened_beginning: r = 0.76
   • time_to_open_hours_count_all_time ↔ opened_end: r = 0.77
   • days_since_last_event_x ↔ days_since_first_event_y: r = -0.99
   • days_since_last_event_x ↔ active_span_days: r = -0.99
   • lag0_opened_sum ↔ lag0_opened_mean: r = 0.91
   • lag0_opened_sum ↔ lag0_time_to_open_hours_count: r = 1.00
   • lag0_opened_sum ↔ opened_velocity: r = 0.70
   • lag0_opened_sum ↔ opened_velocity_pct: r = 1.00
   • lag0_opened_sum ↔ opened_momentum: r = 0.77
   • lag0_opened_sum ↔ opened_vs_cohort_mean: r = 1.00
   • lag0_opened_sum ↔ opened_vs_cohort_pct: r = 1.00
   • lag0_opened_sum ↔ opened_cohort_zscore: r = 1.00
   • lag0_opened_mean ↔ lag0_time_to_open_hours_count: r = 0.91
   • lag0_opened_mean ↔ opened_velocity_pct: r = 0.84
   • lag0_opened_mean ↔ opened_vs_cohort_mean: r = 0.91
   • lag0_opened_mean ↔ opened_vs_cohort_pct: r = 0.91
   • lag0_opened_mean ↔ opened_cohort_zscore: r = 0.91
   • lag0_opened_count ↔ lag0_clicked_count: r = 1.00
   • lag0_opened_count ↔ lag0_send_hour_sum: r = 0.91
   • lag0_opened_count ↔ lag0_send_hour_count: r = 1.00
   • lag0_opened_count ↔ lag0_bounced_count: r = 1.00
   • lag0_opened_count ↔ send_hour_vs_cohort_mean: r = 0.91
   • lag0_opened_count ↔ send_hour_vs_cohort_pct: r = 0.91
   • lag0_opened_count ↔ send_hour_cohort_zscore: r = 0.91
   • lag0_clicked_sum ↔ lag0_clicked_mean: r = 0.92
   • lag0_clicked_sum ↔ clicked_vs_cohort_mean: r = 1.00
   • lag0_clicked_sum ↔ clicked_vs_cohort_pct: r = 1.00
   • lag0_clicked_sum ↔ clicked_cohort_zscore: r = 1.00
   • lag0_clicked_mean ↔ clicked_vs_cohort_mean: r = 0.92
   • lag0_clicked_mean ↔ clicked_vs_cohort_pct: r = 0.92
   • lag0_clicked_mean ↔ clicked_cohort_zscore: r = 0.92
   • lag0_clicked_count ↔ lag0_send_hour_sum: r = 0.91
   • lag0_clicked_count ↔ lag0_send_hour_count: r = 1.00
   • lag0_clicked_count ↔ lag0_bounced_count: r = 1.00
   • lag0_clicked_count ↔ send_hour_vs_cohort_mean: r = 0.91
   • lag0_clicked_count ↔ send_hour_vs_cohort_pct: r = 0.91
   • lag0_clicked_count ↔ send_hour_cohort_zscore: r = 0.91
   • lag0_send_hour_sum ↔ lag0_send_hour_count: r = 0.91
   • lag0_send_hour_sum ↔ lag0_bounced_count: r = 0.91
   • lag0_send_hour_sum ↔ send_hour_velocity: r = 0.77
   • lag0_send_hour_sum ↔ send_hour_vs_cohort_mean: r = 1.00
   • lag0_send_hour_sum ↔ send_hour_vs_cohort_pct: r = 1.00
   • lag0_send_hour_sum ↔ send_hour_cohort_zscore: r = 1.00
   • lag0_send_hour_mean ↔ lag0_send_hour_max: r = 0.95
   • lag0_send_hour_count ↔ lag0_bounced_count: r = 1.00
   • lag0_send_hour_count ↔ send_hour_vs_cohort_mean: r = 0.91
   • lag0_send_hour_count ↔ send_hour_vs_cohort_pct: r = 0.91
   • lag0_send_hour_count ↔ send_hour_cohort_zscore: r = 0.91
   • lag0_bounced_sum ↔ lag0_bounced_mean: r = 0.94
   • lag0_bounced_sum ↔ bounced_velocity: r = 0.79
   • lag0_bounced_sum ↔ bounced_vs_cohort_mean: r = 1.00
   • lag0_bounced_sum ↔ bounced_vs_cohort_pct: r = 1.00
   • lag0_bounced_sum ↔ bounced_cohort_zscore: r = 1.00
   • lag0_bounced_mean ↔ bounced_velocity: r = 0.72
   • lag0_bounced_mean ↔ bounced_vs_cohort_mean: r = 0.94
   • lag0_bounced_mean ↔ bounced_vs_cohort_pct: r = 0.94
   • lag0_bounced_mean ↔ bounced_cohort_zscore: r = 0.94
   • lag0_bounced_count ↔ send_hour_vs_cohort_mean: r = 0.91
   • lag0_bounced_count ↔ send_hour_vs_cohort_pct: r = 0.91
   • lag0_bounced_count ↔ send_hour_cohort_zscore: r = 0.91
   • lag0_time_to_open_hours_sum ↔ lag0_time_to_open_hours_mean: r = 0.97
   • lag0_time_to_open_hours_sum ↔ lag0_time_to_open_hours_max: r = 0.99
   • lag0_time_to_open_hours_sum ↔ opened_velocity_pct: r = 0.70
   • lag0_time_to_open_hours_sum ↔ time_to_open_hours_velocity: r = 0.77
   • lag0_time_to_open_hours_sum ↔ time_to_open_hours_momentum: r = 0.89
   • lag0_time_to_open_hours_sum ↔ time_to_open_hours_vs_cohort_mean: r = 1.00
   • lag0_time_to_open_hours_sum ↔ time_to_open_hours_vs_cohort_pct: r = 1.00
   β€’ lag0_time_to_open_hours_sum ↔ time_to_open_hours_cohort_zscore: r = 1.00
   β€’ lag0_time_to_open_hours_mean ↔ lag0_time_to_open_hours_max: r = 0.99
   β€’ lag0_time_to_open_hours_mean ↔ time_to_open_hours_velocity: r = 0.86
   β€’ lag0_time_to_open_hours_mean ↔ time_to_open_hours_momentum: r = 0.83
   β€’ lag0_time_to_open_hours_mean ↔ time_to_open_hours_vs_cohort_mean: r = 0.97
   • lag0_time_to_open_hours_mean ↔ time_to_open_hours_vs_cohort_pct: r = 0.97
   • lag0_time_to_open_hours_mean ↔ time_to_open_hours_cohort_zscore: r = 0.97
   • lag0_time_to_open_hours_count ↔ opened_velocity: r = 0.70
   • lag0_time_to_open_hours_count ↔ opened_velocity_pct: r = 1.00
   • lag0_time_to_open_hours_count ↔ opened_momentum: r = 0.77
   • lag0_time_to_open_hours_count ↔ opened_vs_cohort_mean: r = 1.00
   • lag0_time_to_open_hours_count ↔ opened_vs_cohort_pct: r = 1.00
   • lag0_time_to_open_hours_count ↔ opened_cohort_zscore: r = 1.00
   • lag0_time_to_open_hours_max ↔ time_to_open_hours_velocity: r = 0.90
   • lag0_time_to_open_hours_max ↔ time_to_open_hours_momentum: r = 0.89
   • lag0_time_to_open_hours_max ↔ time_to_open_hours_vs_cohort_mean: r = 0.99
   • lag0_time_to_open_hours_max ↔ time_to_open_hours_vs_cohort_pct: r = 0.99
   • lag0_time_to_open_hours_max ↔ time_to_open_hours_cohort_zscore: r = 0.99
   • lag1_opened_sum ↔ lag1_opened_mean: r = 0.92
   • lag1_opened_sum ↔ lag1_time_to_open_hours_sum: r = 0.72
   • lag1_opened_sum ↔ lag1_time_to_open_hours_count: r = 1.00
   • lag1_opened_sum ↔ opened_acceleration: r = -0.79
   • lag1_opened_mean ↔ lag1_time_to_open_hours_count: r = 0.92
   • lag1_opened_count ↔ lag1_clicked_count: r = 1.00
   • lag1_opened_count ↔ lag1_send_hour_sum: r = 0.89
   • lag1_opened_count ↔ lag1_send_hour_count: r = 1.00
   • lag1_opened_count ↔ lag1_bounced_count: r = 1.00
   • lag1_clicked_sum ↔ lag1_clicked_mean: r = 0.95
   • lag1_clicked_sum ↔ clicked_velocity: r = -0.72
   • lag1_clicked_sum ↔ clicked_acceleration: r = -0.84
   • lag1_clicked_mean ↔ clicked_acceleration: r = -0.76
   • lag1_clicked_count ↔ lag1_send_hour_sum: r = 0.89
   • lag1_clicked_count ↔ lag1_send_hour_count: r = 1.00
   • lag1_clicked_count ↔ lag1_bounced_count: r = 1.00
   • lag1_send_hour_sum ↔ lag1_send_hour_count: r = 0.89
   • lag1_send_hour_sum ↔ lag1_bounced_count: r = 0.89
   • lag1_send_hour_mean ↔ lag1_send_hour_max: r = 0.96
   • lag1_send_hour_count ↔ lag1_bounced_count: r = 1.00
   • lag1_time_to_open_hours_sum ↔ lag1_time_to_open_hours_mean: r = 0.98
   • lag1_time_to_open_hours_sum ↔ lag1_time_to_open_hours_count: r = 0.72
   • lag1_time_to_open_hours_sum ↔ lag1_time_to_open_hours_max: r = 0.99
   • lag1_time_to_open_hours_sum ↔ time_to_open_hours_acceleration: r = -0.78
   • lag1_time_to_open_hours_mean ↔ lag1_time_to_open_hours_max: r = 1.00
   • lag1_time_to_open_hours_mean ↔ time_to_open_hours_acceleration: r = -0.78
   • lag1_time_to_open_hours_count ↔ opened_acceleration: r = -0.79
   • lag1_time_to_open_hours_max ↔ time_to_open_hours_acceleration: r = -0.81
   • lag2_opened_sum ↔ lag2_opened_mean: r = 0.91
   • lag2_opened_sum ↔ lag2_time_to_open_hours_sum: r = 0.74
   • lag2_opened_sum ↔ lag2_time_to_open_hours_count: r = 1.00
   • lag2_opened_mean ↔ lag2_time_to_open_hours_count: r = 0.91
   • lag2_opened_count ↔ lag2_clicked_count: r = 1.00
   • lag2_opened_count ↔ lag2_send_hour_sum: r = 0.88
   • lag2_opened_count ↔ lag2_send_hour_count: r = 1.00
   • lag2_opened_count ↔ lag2_bounced_count: r = 1.00
   • lag2_clicked_sum ↔ lag2_clicked_mean: r = 0.92
   • lag2_clicked_count ↔ lag2_send_hour_sum: r = 0.88
   • lag2_clicked_count ↔ lag2_send_hour_count: r = 1.00
   • lag2_clicked_count ↔ lag2_bounced_count: r = 1.00
   • lag2_send_hour_sum ↔ lag2_send_hour_count: r = 0.88
   • lag2_send_hour_sum ↔ lag2_bounced_count: r = 0.88
   • lag2_send_hour_mean ↔ lag2_send_hour_max: r = 0.96
   • lag2_send_hour_count ↔ lag2_bounced_count: r = 1.00
   • lag2_time_to_open_hours_sum ↔ lag2_time_to_open_hours_mean: r = 0.94
   • lag2_time_to_open_hours_sum ↔ lag2_time_to_open_hours_count: r = 0.74
   • lag2_time_to_open_hours_sum ↔ lag2_time_to_open_hours_max: r = 0.99
   • lag2_time_to_open_hours_mean ↔ lag2_time_to_open_hours_max: r = 0.98
   • lag3_opened_sum ↔ lag3_opened_mean: r = 0.94
   • lag3_opened_sum ↔ lag3_time_to_open_hours_count: r = 1.00
   • lag3_opened_mean ↔ lag3_time_to_open_hours_count: r = 0.94
   • lag3_opened_count ↔ lag3_clicked_count: r = 1.00
   • lag3_opened_count ↔ lag3_send_hour_sum: r = 0.87
   • lag3_opened_count ↔ lag3_send_hour_count: r = 1.00
   • lag3_opened_count ↔ lag3_bounced_count: r = 1.00
   • lag3_clicked_count ↔ lag3_send_hour_sum: r = 0.87
   • lag3_clicked_count ↔ lag3_send_hour_count: r = 1.00
   • lag3_clicked_count ↔ lag3_bounced_count: r = 1.00
   • lag3_send_hour_sum ↔ lag3_send_hour_count: r = 0.87
   • lag3_send_hour_sum ↔ lag3_bounced_count: r = 0.87
   • lag3_send_hour_mean ↔ lag3_send_hour_max: r = 0.96
   • lag3_send_hour_count ↔ lag3_bounced_count: r = 1.00
   • lag3_time_to_open_hours_sum ↔ lag3_time_to_open_hours_mean: r = 0.99
   • lag3_time_to_open_hours_sum ↔ lag3_time_to_open_hours_max: r = 1.00
   • lag3_time_to_open_hours_mean ↔ lag3_time_to_open_hours_max: r = 1.00
   • opened_velocity ↔ opened_velocity_pct: r = 0.92
   • opened_velocity ↔ opened_acceleration: r = 0.84
   • opened_velocity ↔ opened_vs_cohort_mean: r = 0.70
   • opened_velocity ↔ opened_vs_cohort_pct: r = 0.70
   • opened_velocity ↔ opened_cohort_zscore: r = 0.70
   • opened_velocity_pct ↔ opened_vs_cohort_mean: r = 1.00
   • opened_velocity_pct ↔ opened_vs_cohort_pct: r = 1.00
   • opened_velocity_pct ↔ opened_cohort_zscore: r = 1.00
   • opened_velocity_pct ↔ time_to_open_hours_vs_cohort_mean: r = 0.70
   • opened_velocity_pct ↔ time_to_open_hours_vs_cohort_pct: r = 0.70
   • opened_velocity_pct ↔ time_to_open_hours_cohort_zscore: r = 0.70
   • clicked_velocity ↔ clicked_acceleration: r = 0.85
   • send_hour_velocity ↔ send_hour_velocity_pct: r = 0.89
   • send_hour_velocity ↔ send_hour_acceleration: r = 0.79
   • send_hour_velocity ↔ send_hour_vs_cohort_mean: r = 0.77
   • send_hour_velocity ↔ send_hour_vs_cohort_pct: r = 0.77
   • send_hour_velocity ↔ send_hour_cohort_zscore: r = 0.77
   • send_hour_velocity_pct ↔ send_hour_acceleration: r = 0.76
   • bounced_velocity ↔ bounced_acceleration: r = 0.87
   • bounced_velocity ↔ bounced_vs_cohort_mean: r = 0.79
   • bounced_velocity ↔ bounced_vs_cohort_pct: r = 0.79
   • bounced_velocity ↔ bounced_cohort_zscore: r = 0.79
   • time_to_open_hours_velocity ↔ time_to_open_hours_acceleration: r = 0.85
   • time_to_open_hours_velocity ↔ time_to_open_hours_momentum: r = 0.72
   • time_to_open_hours_velocity ↔ time_to_open_hours_vs_cohort_mean: r = 0.77
   • time_to_open_hours_velocity ↔ time_to_open_hours_vs_cohort_pct: r = 0.77
   • time_to_open_hours_velocity ↔ time_to_open_hours_cohort_zscore: r = 0.77
   • opened_acceleration ↔ time_to_open_hours_acceleration: r = 0.71
   • opened_momentum ↔ opened_vs_cohort_mean: r = 0.77
   • opened_momentum ↔ opened_vs_cohort_pct: r = 0.77
   • opened_momentum ↔ opened_cohort_zscore: r = 0.77
   • time_to_open_hours_momentum ↔ time_to_open_hours_vs_cohort_mean: r = 0.89
   • time_to_open_hours_momentum ↔ time_to_open_hours_vs_cohort_pct: r = 0.89
   • time_to_open_hours_momentum ↔ time_to_open_hours_cohort_zscore: r = 0.89
   • opened_beginning ↔ time_to_open_hours_beginning: r = 0.77
   • opened_end ↔ opened_trend_ratio: r = 0.73
   • opened_end ↔ time_to_open_hours_end: r = 0.79
   • clicked_end ↔ clicked_trend_ratio: r = 0.92
   • bounced_end ↔ bounced_trend_ratio: r = 0.99
   • days_since_first_event_y ↔ active_span_days: r = 1.00
   • inter_event_gap_std ↔ inter_event_gap_max: r = 0.89
   • opened_vs_cohort_mean ↔ opened_vs_cohort_pct: r = 1.00
   • opened_vs_cohort_mean ↔ opened_cohort_zscore: r = 1.00
   • opened_vs_cohort_pct ↔ opened_cohort_zscore: r = 1.00
   • clicked_vs_cohort_mean ↔ clicked_vs_cohort_pct: r = 1.00
   • clicked_vs_cohort_mean ↔ clicked_cohort_zscore: r = 1.00
   • clicked_vs_cohort_pct ↔ clicked_cohort_zscore: r = 1.00
   • send_hour_vs_cohort_mean ↔ send_hour_vs_cohort_pct: r = 1.00
   • send_hour_vs_cohort_mean ↔ send_hour_cohort_zscore: r = 1.00
   • send_hour_vs_cohort_pct ↔ send_hour_cohort_zscore: r = 1.00
   • bounced_vs_cohort_mean ↔ bounced_vs_cohort_pct: r = 1.00
   • bounced_vs_cohort_mean ↔ bounced_cohort_zscore: r = 1.00
   • bounced_vs_cohort_pct ↔ bounced_cohort_zscore: r = 1.00
   • time_to_open_hours_vs_cohort_mean ↔ time_to_open_hours_vs_cohort_pct: r = 1.00
   • time_to_open_hours_vs_cohort_mean ↔ time_to_open_hours_cohort_zscore: r = 1.00
   • time_to_open_hours_vs_cohort_pct ↔ time_to_open_hours_cohort_zscore: r = 1.00

   💡 For each pair, keep the feature with:
      - Stronger business meaning
      - Higher target correlation
      - Fewer missing values
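
The keep-one-per-pair rule above can be sketched in code. This is a minimal illustration under stated assumptions, not the notebook's own logic: the function name `select_from_correlated_pairs` and its `threshold` and `prefer` parameters are hypothetical. Business meaning cannot be computed, so it is supplied by hand through `prefer`; the remaining two criteria (target correlation, then missing values) are used as tie-breakers in that order.

```python
import pandas as pd


def select_from_correlated_pairs(df, target, threshold=0.7, prefer=()):
    """Greedy sketch: for features with |r| >= threshold, keep the better-ranked one.

    Ranking (hypothetical, mirrors the printed tip):
      1. features named in `prefer` (manual stand-in for business meaning)
      2. higher absolute correlation with the target
      3. lower fraction of missing values
    Returns (kept, dropped) where dropped maps each removed feature to the
    kept feature that made it redundant.
    """
    features = [c for c in df.columns if c != target]
    corr = df[features].corr().abs()                       # pairwise |Pearson r|
    target_corr = df[features].corrwith(df[target]).abs().fillna(0.0)
    missing = df[features].isna().mean()                   # fraction of NaNs

    ranking = sorted(
        features,
        key=lambda c: (c not in prefer, -target_corr[c], missing[c]),
    )

    kept, dropped = [], {}
    for col in ranking:
        # First already-kept feature that this column is redundant with, if any.
        partner = next((k for k in kept if corr.loc[col, k] >= threshold), None)
        if partner is None:
            kept.append(col)
        else:
            dropped[col] = partner
    return kept, dropped
```

Calling `select_from_correlated_pairs(df, "target")` on a numeric feature table returns the surviving columns plus a mapping that records why each dropped column was removed, which makes the decision easy to review against the pair list above. Because the pass is greedy, a whole cluster of mutually correlated features (for example the `*_vs_cohort_mean`, `*_vs_cohort_pct`, and `*_cohort_zscore` triples) collapses to a single representative.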

----------------------------------------------------------------------
DETAILED RECOMMENDATIONS:

🟡 Remove multicollinear feature
   event_count_180d and event_count_365d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   event_count_180d and opened_count_180d are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   event_count_180d and clicked_count_180d are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   event_count_180d and send_hour_sum_180d are highly correlated (r=0.98)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   event_count_180d and send_hour_count_180d are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   event_count_180d and bounced_count_180d are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   event_count_180d and opened_count_365d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   event_count_180d and clicked_count_365d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   event_count_180d and send_hour_sum_365d are highly correlated (r=0.80)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   event_count_180d and send_hour_count_365d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   event_count_180d and bounced_count_365d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   event_count_365d and opened_count_180d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   event_count_365d and clicked_count_180d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   event_count_365d and send_hour_sum_180d are highly correlated (r=0.80)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   event_count_365d and send_hour_count_180d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   event_count_365d and bounced_count_180d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   event_count_365d and opened_count_365d are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   event_count_365d and clicked_count_365d are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   event_count_365d and send_hour_sum_365d are highly correlated (r=0.98)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   event_count_365d and send_hour_count_365d are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   event_count_365d and bounced_count_365d are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   event_count_all_time and opened_sum_all_time are highly correlated (r=0.80)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   event_count_all_time and opened_count_all_time are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   event_count_all_time and clicked_count_all_time are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   event_count_all_time and send_hour_sum_all_time are highly correlated (r=0.99)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   event_count_all_time and send_hour_count_all_time are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   event_count_all_time and bounced_count_all_time are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   event_count_all_time and time_to_open_hours_count_all_time are highly correlated (r=0.80)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   event_count_all_time and send_hour_beginning are highly correlated (r=0.84)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   event_count_all_time and send_hour_end are highly correlated (r=0.84)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_sum_180d and opened_mean_180d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_sum_180d and time_to_open_hours_sum_180d are highly correlated (r=0.74)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   opened_sum_180d and time_to_open_hours_count_180d are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_sum_180d and opened_sum_365d are highly correlated (r=0.75)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_sum_180d and time_to_open_hours_count_365d are highly correlated (r=0.75)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_mean_180d and time_to_open_hours_count_180d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_mean_180d and opened_mean_365d are highly correlated (r=0.79)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_mean_180d and lag0_opened_sum are highly correlated (r=0.84)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   opened_mean_180d and lag0_opened_mean are highly correlated (r=0.90)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_mean_180d and lag0_time_to_open_hours_count are highly correlated (r=0.84)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_mean_180d and opened_vs_cohort_mean are highly correlated (r=0.84)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_mean_180d and opened_vs_cohort_pct are highly correlated (r=0.84)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_mean_180d and opened_cohort_zscore are highly correlated (r=0.84)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   opened_count_180d and clicked_count_180d are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   opened_count_180d and send_hour_sum_180d are highly correlated (r=0.98)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   opened_count_180d and send_hour_count_180d are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   opened_count_180d and bounced_count_180d are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_count_180d and opened_count_365d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_count_180d and clicked_count_365d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_count_180d and send_hour_sum_365d are highly correlated (r=0.80)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_count_180d and send_hour_count_365d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_count_180d and bounced_count_365d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   clicked_sum_180d and clicked_mean_180d are highly correlated (r=0.88)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   clicked_sum_180d and clicked_sum_365d are highly correlated (r=0.71)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   clicked_mean_180d and clicked_mean_365d are highly correlated (r=0.77)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   clicked_mean_180d and lag0_clicked_sum are highly correlated (r=0.84)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   clicked_mean_180d and lag0_clicked_mean are highly correlated (r=0.88)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   clicked_mean_180d and clicked_vs_cohort_mean are highly correlated (r=0.84)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   clicked_mean_180d and clicked_vs_cohort_pct are highly correlated (r=0.84)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   clicked_mean_180d and clicked_cohort_zscore are highly correlated (r=0.84)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   clicked_count_180d and send_hour_sum_180d are highly correlated (r=0.98)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   clicked_count_180d and send_hour_count_180d are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   clicked_count_180d and bounced_count_180d are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   clicked_count_180d and opened_count_365d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   clicked_count_180d and clicked_count_365d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   clicked_count_180d and send_hour_sum_365d are highly correlated (r=0.80)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   clicked_count_180d and send_hour_count_365d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   clicked_count_180d and bounced_count_365d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   send_hour_sum_180d and send_hour_count_180d are highly correlated (r=0.98)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   send_hour_sum_180d and bounced_count_180d are highly correlated (r=0.98)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   send_hour_sum_180d and opened_count_365d are highly correlated (r=0.80)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   send_hour_sum_180d and clicked_count_365d are highly correlated (r=0.80)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   send_hour_sum_180d and send_hour_sum_365d are highly correlated (r=0.81)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   send_hour_sum_180d and send_hour_count_365d are highly correlated (r=0.80)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   send_hour_sum_180d and bounced_count_365d are highly correlated (r=0.80)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   send_hour_mean_180d and send_hour_max_180d are highly correlated (r=0.88)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   send_hour_mean_180d and send_hour_mean_365d are highly correlated (r=0.81)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   send_hour_mean_180d and lag0_send_hour_mean are highly correlated (r=0.89)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   send_hour_mean_180d and lag0_send_hour_max are highly correlated (r=0.86)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   send_hour_max_180d and send_hour_mean_365d are highly correlated (r=0.73)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   send_hour_max_180d and send_hour_max_365d are highly correlated (r=0.75)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   send_hour_max_180d and lag0_send_hour_mean are highly correlated (r=0.79)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   send_hour_max_180d and lag0_send_hour_max are highly correlated (r=0.81)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   send_hour_count_180d and bounced_count_180d are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   send_hour_count_180d and opened_count_365d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   send_hour_count_180d and clicked_count_365d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   send_hour_count_180d and send_hour_sum_365d are highly correlated (r=0.80)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   send_hour_count_180d and send_hour_count_365d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   send_hour_count_180d and bounced_count_365d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   bounced_sum_180d and bounced_mean_180d are highly correlated (r=0.89)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   bounced_sum_180d and bounced_sum_365d are highly correlated (r=0.70)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   bounced_mean_180d and bounced_mean_365d are highly correlated (r=0.80)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   bounced_mean_180d and lag0_bounced_sum are highly correlated (r=0.85)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   bounced_mean_180d and lag0_bounced_mean are highly correlated (r=0.88)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   bounced_mean_180d and bounced_vs_cohort_mean are highly correlated (r=0.85)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   bounced_mean_180d and bounced_vs_cohort_pct are highly correlated (r=0.85)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   bounced_mean_180d and bounced_cohort_zscore are highly correlated (r=0.85)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   bounced_count_180d and opened_count_365d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   bounced_count_180d and clicked_count_365d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   bounced_count_180d and send_hour_sum_365d are highly correlated (r=0.80)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   bounced_count_180d and send_hour_count_365d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   bounced_count_180d and bounced_count_365d are highly correlated (r=0.82)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   time_to_open_hours_sum_180d and time_to_open_hours_mean_180d are highly correlated (r=0.89)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   time_to_open_hours_sum_180d and time_to_open_hours_max_180d are highly correlated (r=0.96)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   time_to_open_hours_sum_180d and time_to_open_hours_count_180d are highly correlated (r=0.74)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   time_to_open_hours_sum_180d and time_to_open_hours_sum_365d are highly correlated (r=0.71)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   time_to_open_hours_mean_180d and time_to_open_hours_max_180d are highly correlated (r=0.97)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_mean_180d and time_to_open_hours_sum_365d are highly correlated (r=0.73)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   time_to_open_hours_mean_180d and time_to_open_hours_mean_365d are highly correlated (r=0.94)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   time_to_open_hours_mean_180d and time_to_open_hours_max_365d are highly correlated (r=0.88)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_mean_180d and lag0_time_to_open_hours_sum are highly correlated (r=0.70)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   time_to_open_hours_mean_180d and lag0_time_to_open_hours_mean are highly correlated (r=0.98)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   time_to_open_hours_mean_180d and lag0_time_to_open_hours_max are highly correlated (r=0.96)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_mean_180d and lag1_time_to_open_hours_mean are highly correlated (r=0.73)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_mean_180d and lag1_time_to_open_hours_max are highly correlated (r=0.73)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   time_to_open_hours_mean_180d and lag2_time_to_open_hours_mean are highly correlated (r=0.94)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   time_to_open_hours_mean_180d and lag2_time_to_open_hours_max are highly correlated (r=0.94)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   time_to_open_hours_mean_180d and lag3_time_to_open_hours_mean are highly correlated (r=0.86)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   time_to_open_hours_mean_180d and lag3_time_to_open_hours_max are highly correlated (r=0.87)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_mean_180d and time_to_open_hours_vs_cohort_mean are highly correlated (r=0.70)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_mean_180d and time_to_open_hours_vs_cohort_pct are highly correlated (r=0.70)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_mean_180d and time_to_open_hours_cohort_zscore are highly correlated (r=0.70)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_max_180d and time_to_open_hours_sum_365d are highly correlated (r=0.81)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   time_to_open_hours_max_180d and time_to_open_hours_mean_365d are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   time_to_open_hours_max_180d and time_to_open_hours_max_365d are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_max_180d and lag0_time_to_open_hours_sum are highly correlated (r=0.74)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   time_to_open_hours_max_180d and lag0_time_to_open_hours_mean are highly correlated (r=0.95)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   time_to_open_hours_max_180d and lag0_time_to_open_hours_max are highly correlated (r=0.96)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   time_to_open_hours_max_180d and lag2_time_to_open_hours_mean are highly correlated (r=0.90)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   time_to_open_hours_max_180d and lag2_time_to_open_hours_max are highly correlated (r=0.90)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_max_180d and lag3_time_to_open_hours_mean are highly correlated (r=0.76)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_max_180d and lag3_time_to_open_hours_max are highly correlated (r=0.77)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_max_180d and time_to_open_hours_momentum are highly correlated (r=0.72)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_max_180d and time_to_open_hours_vs_cohort_mean are highly correlated (r=0.74)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_max_180d and time_to_open_hours_vs_cohort_pct are highly correlated (r=0.74)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_max_180d and time_to_open_hours_cohort_zscore are highly correlated (r=0.74)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_count_180d and opened_sum_365d are highly correlated (r=0.75)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_count_180d and time_to_open_hours_count_365d are highly correlated (r=0.75)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_sum_365d and opened_mean_365d are highly correlated (r=0.76)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_sum_365d and time_to_open_hours_sum_365d are highly correlated (r=0.75)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   opened_sum_365d and time_to_open_hours_count_365d are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_mean_365d and time_to_open_hours_count_365d are highly correlated (r=0.76)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_mean_365d and lag0_opened_sum are highly correlated (r=0.72)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_mean_365d and lag0_opened_mean are highly correlated (r=0.76)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_mean_365d and lag0_time_to_open_hours_count are highly correlated (r=0.72)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_mean_365d and opened_vs_cohort_mean are highly correlated (r=0.72)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_mean_365d and opened_vs_cohort_pct are highly correlated (r=0.72)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_mean_365d and opened_cohort_zscore are highly correlated (r=0.72)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   opened_count_365d and clicked_count_365d are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   opened_count_365d and send_hour_sum_365d are highly correlated (r=0.98)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   opened_count_365d and send_hour_count_365d are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   opened_count_365d and bounced_count_365d are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   clicked_sum_365d and clicked_mean_365d are highly correlated (r=0.84)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   clicked_mean_365d and lag0_clicked_sum are highly correlated (r=0.71)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   clicked_mean_365d and lag0_clicked_mean are highly correlated (r=0.74)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   clicked_mean_365d and clicked_vs_cohort_mean are highly correlated (r=0.71)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   clicked_mean_365d and clicked_vs_cohort_pct are highly correlated (r=0.71)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   clicked_mean_365d and clicked_cohort_zscore are highly correlated (r=0.71)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   clicked_count_365d and send_hour_sum_365d are highly correlated (r=0.98)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   clicked_count_365d and send_hour_count_365d are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   clicked_count_365d and bounced_count_365d are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   send_hour_sum_365d and send_hour_count_365d are highly correlated (r=0.98)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   send_hour_sum_365d and bounced_count_365d are highly correlated (r=0.98)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   send_hour_mean_365d and send_hour_max_365d are highly correlated (r=0.81)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   send_hour_mean_365d and lag0_send_hour_mean are highly correlated (r=0.78)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   send_hour_mean_365d and lag0_send_hour_max are highly correlated (r=0.75)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   send_hour_count_365d and bounced_count_365d are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   bounced_sum_365d and bounced_mean_365d are highly correlated (r=0.84)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   bounced_mean_365d and lag0_bounced_sum are highly correlated (r=0.74)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   bounced_mean_365d and lag0_bounced_mean are highly correlated (r=0.77)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   bounced_mean_365d and bounced_vs_cohort_mean are highly correlated (r=0.74)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   bounced_mean_365d and bounced_vs_cohort_pct are highly correlated (r=0.74)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   bounced_mean_365d and bounced_cohort_zscore are highly correlated (r=0.74)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_sum_365d and time_to_open_hours_mean_365d are highly correlated (r=0.83)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   time_to_open_hours_sum_365d and time_to_open_hours_max_365d are highly correlated (r=0.94)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_sum_365d and time_to_open_hours_count_365d are highly correlated (r=0.75)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   time_to_open_hours_mean_365d and time_to_open_hours_max_365d are highly correlated (r=0.95)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   time_to_open_hours_mean_365d and lag0_time_to_open_hours_mean are highly correlated (r=0.94)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   time_to_open_hours_mean_365d and lag0_time_to_open_hours_max are highly correlated (r=0.93)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   time_to_open_hours_mean_365d and lag1_time_to_open_hours_mean are highly correlated (r=0.85)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_mean_365d and lag1_time_to_open_hours_max are highly correlated (r=0.85)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_mean_365d and lag2_time_to_open_hours_mean are highly correlated (r=0.81)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_mean_365d and lag2_time_to_open_hours_max are highly correlated (r=0.81)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_mean_365d and lag3_time_to_open_hours_mean are highly correlated (r=0.81)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_mean_365d and lag3_time_to_open_hours_max are highly correlated (r=0.81)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   time_to_open_hours_max_365d and lag0_time_to_open_hours_mean are highly correlated (r=0.89)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   time_to_open_hours_max_365d and lag0_time_to_open_hours_max are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_max_365d and lag1_time_to_open_hours_mean are highly correlated (r=0.72)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_max_365d and lag1_time_to_open_hours_max are highly correlated (r=0.72)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_max_365d and lag2_time_to_open_hours_mean are highly correlated (r=0.71)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_max_365d and lag2_time_to_open_hours_max are highly correlated (r=0.71)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_max_365d and lag3_time_to_open_hours_mean are highly correlated (r=0.77)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_max_365d and lag3_time_to_open_hours_max are highly correlated (r=0.77)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_sum_all_time and opened_count_all_time are highly correlated (r=0.80)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_sum_all_time and clicked_sum_all_time are highly correlated (r=0.74)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_sum_all_time and clicked_count_all_time are highly correlated (r=0.80)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_sum_all_time and send_hour_sum_all_time are highly correlated (r=0.79)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_sum_all_time and send_hour_count_all_time are highly correlated (r=0.80)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_sum_all_time and bounced_count_all_time are highly correlated (r=0.80)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   opened_sum_all_time and time_to_open_hours_sum_all_time are highly correlated (r=0.86)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   opened_sum_all_time and time_to_open_hours_count_all_time are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_sum_all_time and opened_beginning are highly correlated (r=0.76)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_sum_all_time and opened_end are highly correlated (r=0.77)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   opened_count_all_time and clicked_count_all_time are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   opened_count_all_time and send_hour_sum_all_time are highly correlated (r=0.99)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   opened_count_all_time and send_hour_count_all_time are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   opened_count_all_time and bounced_count_all_time are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_count_all_time and time_to_open_hours_count_all_time are highly correlated (r=0.80)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_count_all_time and send_hour_beginning are highly correlated (r=0.84)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   opened_count_all_time and send_hour_end are highly correlated (r=0.84)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   clicked_sum_all_time and clicked_mean_all_time are highly correlated (r=0.78)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   clicked_sum_all_time and time_to_open_hours_count_all_time are highly correlated (r=0.74)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   clicked_count_all_time and send_hour_sum_all_time are highly correlated (r=0.99)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   clicked_count_all_time and send_hour_count_all_time are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   clicked_count_all_time and bounced_count_all_time are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   clicked_count_all_time and time_to_open_hours_count_all_time are highly correlated (r=0.80)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   clicked_count_all_time and send_hour_beginning are highly correlated (r=0.84)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   clicked_count_all_time and send_hour_end are highly correlated (r=0.84)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   send_hour_sum_all_time and send_hour_count_all_time are highly correlated (r=0.99)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   send_hour_sum_all_time and bounced_count_all_time are highly correlated (r=0.99)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   send_hour_sum_all_time and time_to_open_hours_count_all_time are highly correlated (r=0.79)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   send_hour_sum_all_time and send_hour_beginning are highly correlated (r=0.85)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   send_hour_sum_all_time and send_hour_end are highly correlated (r=0.85)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   send_hour_count_all_time and bounced_count_all_time are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   send_hour_count_all_time and time_to_open_hours_count_all_time are highly correlated (r=0.80)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   send_hour_count_all_time and send_hour_beginning are highly correlated (r=0.84)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   send_hour_count_all_time and send_hour_end are highly correlated (r=0.84)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   bounced_sum_all_time and bounced_mean_all_time are highly correlated (r=0.71)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   bounced_count_all_time and time_to_open_hours_count_all_time are highly correlated (r=0.80)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   bounced_count_all_time and send_hour_beginning are highly correlated (r=0.84)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   bounced_count_all_time and send_hour_end are highly correlated (r=0.84)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_sum_all_time and time_to_open_hours_max_all_time are highly correlated (r=0.75)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   time_to_open_hours_sum_all_time and time_to_open_hours_count_all_time are highly correlated (r=0.86)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_sum_all_time and time_to_open_hours_beginning are highly correlated (r=0.71)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_sum_all_time and time_to_open_hours_end are highly correlated (r=0.71)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_mean_all_time and time_to_open_hours_max_all_time are highly correlated (r=0.74)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_count_all_time and opened_beginning are highly correlated (r=0.76)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   time_to_open_hours_count_all_time and opened_end are highly correlated (r=0.77)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   days_since_last_event_x and days_since_first_event_y are highly correlated (r=-0.99)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   days_since_last_event_x and active_span_days are highly correlated (r=-0.99)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_opened_sum and lag0_opened_mean are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_opened_sum and lag0_time_to_open_hours_count are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   lag0_opened_sum and opened_velocity are highly correlated (r=0.70)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_opened_sum and opened_velocity_pct are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   lag0_opened_sum and opened_momentum are highly correlated (r=0.77)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_opened_sum and opened_vs_cohort_mean are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_opened_sum and opened_vs_cohort_pct are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_opened_sum and opened_cohort_zscore are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_opened_mean and lag0_time_to_open_hours_count are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   lag0_opened_mean and opened_velocity_pct are highly correlated (r=0.84)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_opened_mean and opened_vs_cohort_mean are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_opened_mean and opened_vs_cohort_pct are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_opened_mean and opened_cohort_zscore are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_opened_count and lag0_clicked_count are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_opened_count and lag0_send_hour_sum are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_opened_count and lag0_send_hour_count are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_opened_count and lag0_bounced_count are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_opened_count and send_hour_vs_cohort_mean are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_opened_count and send_hour_vs_cohort_pct are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_opened_count and send_hour_cohort_zscore are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_clicked_sum and lag0_clicked_mean are highly correlated (r=0.92)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_clicked_sum and clicked_vs_cohort_mean are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_clicked_sum and clicked_vs_cohort_pct are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_clicked_sum and clicked_cohort_zscore are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_clicked_mean and clicked_vs_cohort_mean are highly correlated (r=0.92)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_clicked_mean and clicked_vs_cohort_pct are highly correlated (r=0.92)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_clicked_mean and clicked_cohort_zscore are highly correlated (r=0.92)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_clicked_count and lag0_send_hour_sum are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_clicked_count and lag0_send_hour_count are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_clicked_count and lag0_bounced_count are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_clicked_count and send_hour_vs_cohort_mean are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_clicked_count and send_hour_vs_cohort_pct are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_clicked_count and send_hour_cohort_zscore are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_send_hour_sum and lag0_send_hour_count are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_send_hour_sum and lag0_bounced_count are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   lag0_send_hour_sum and send_hour_velocity are highly correlated (r=0.77)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_send_hour_sum and send_hour_vs_cohort_mean are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_send_hour_sum and send_hour_vs_cohort_pct are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_send_hour_sum and send_hour_cohort_zscore are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_send_hour_mean and lag0_send_hour_max are highly correlated (r=0.95)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_send_hour_count and lag0_bounced_count are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_send_hour_count and send_hour_vs_cohort_mean are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_send_hour_count and send_hour_vs_cohort_pct are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_send_hour_count and send_hour_cohort_zscore are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_bounced_sum and lag0_bounced_mean are highly correlated (r=0.94)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   lag0_bounced_sum and bounced_velocity are highly correlated (r=0.79)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_bounced_sum and bounced_vs_cohort_mean are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_bounced_sum and bounced_vs_cohort_pct are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_bounced_sum and bounced_cohort_zscore are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   lag0_bounced_mean and bounced_velocity are highly correlated (r=0.72)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_bounced_mean and bounced_vs_cohort_mean are highly correlated (r=0.94)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_bounced_mean and bounced_vs_cohort_pct are highly correlated (r=0.94)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_bounced_mean and bounced_cohort_zscore are highly correlated (r=0.94)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_bounced_count and send_hour_vs_cohort_mean are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_bounced_count and send_hour_vs_cohort_pct are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_bounced_count and send_hour_cohort_zscore are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_time_to_open_hours_sum and lag0_time_to_open_hours_mean are highly correlated (r=0.97)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_time_to_open_hours_sum and lag0_time_to_open_hours_max are highly correlated (r=0.99)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   lag0_time_to_open_hours_sum and opened_velocity_pct are highly correlated (r=0.70)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   lag0_time_to_open_hours_sum and time_to_open_hours_velocity are highly correlated (r=0.77)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_time_to_open_hours_sum and time_to_open_hours_momentum are highly correlated (r=0.89)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_time_to_open_hours_sum and time_to_open_hours_vs_cohort_mean are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_time_to_open_hours_sum and time_to_open_hours_vs_cohort_pct are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_time_to_open_hours_sum and time_to_open_hours_cohort_zscore are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_time_to_open_hours_mean and lag0_time_to_open_hours_max are highly correlated (r=0.99)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_time_to_open_hours_mean and time_to_open_hours_velocity are highly correlated (r=0.86)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   lag0_time_to_open_hours_mean and time_to_open_hours_momentum are highly correlated (r=0.83)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_time_to_open_hours_mean and time_to_open_hours_vs_cohort_mean are highly correlated (r=0.97)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_time_to_open_hours_mean and time_to_open_hours_vs_cohort_pct are highly correlated (r=0.97)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_time_to_open_hours_mean and time_to_open_hours_cohort_zscore are highly correlated (r=0.97)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   lag0_time_to_open_hours_count and opened_velocity are highly correlated (r=0.70)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_time_to_open_hours_count and opened_velocity_pct are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   lag0_time_to_open_hours_count and opened_momentum are highly correlated (r=0.77)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_time_to_open_hours_count and opened_vs_cohort_mean are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_time_to_open_hours_count and opened_vs_cohort_pct are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_time_to_open_hours_count and opened_cohort_zscore are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_time_to_open_hours_max and time_to_open_hours_velocity are highly correlated (r=0.90)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_time_to_open_hours_max and time_to_open_hours_momentum are highly correlated (r=0.89)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_time_to_open_hours_max and time_to_open_hours_vs_cohort_mean are highly correlated (r=0.99)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_time_to_open_hours_max and time_to_open_hours_vs_cohort_pct are highly correlated (r=0.99)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag0_time_to_open_hours_max and time_to_open_hours_cohort_zscore are highly correlated (r=0.99)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag1_opened_sum and lag1_opened_mean are highly correlated (r=0.92)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   lag1_opened_sum and lag1_time_to_open_hours_sum are highly correlated (r=0.72)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag1_opened_sum and lag1_time_to_open_hours_count are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   lag1_opened_sum and opened_acceleration are highly correlated (r=-0.79)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag1_opened_mean and lag1_time_to_open_hours_count are highly correlated (r=0.92)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag1_opened_count and lag1_clicked_count are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag1_opened_count and lag1_send_hour_sum are highly correlated (r=0.89)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag1_opened_count and lag1_send_hour_count are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag1_opened_count and lag1_bounced_count are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag1_clicked_sum and lag1_clicked_mean are highly correlated (r=0.95)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   lag1_clicked_sum and clicked_velocity are highly correlated (r=-0.72)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   lag1_clicked_sum and clicked_acceleration are highly correlated (r=-0.84)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   lag1_clicked_mean and clicked_acceleration are highly correlated (r=-0.76)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag1_clicked_count and lag1_send_hour_sum are highly correlated (r=0.89)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag1_clicked_count and lag1_send_hour_count are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag1_clicked_count and lag1_bounced_count are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag1_send_hour_sum and lag1_send_hour_count are highly correlated (r=0.89)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag1_send_hour_sum and lag1_bounced_count are highly correlated (r=0.89)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag1_send_hour_mean and lag1_send_hour_max are highly correlated (r=0.96)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag1_send_hour_count and lag1_bounced_count are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag1_time_to_open_hours_sum and lag1_time_to_open_hours_mean are highly correlated (r=0.98)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   lag1_time_to_open_hours_sum and lag1_time_to_open_hours_count are highly correlated (r=0.72)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag1_time_to_open_hours_sum and lag1_time_to_open_hours_max are highly correlated (r=0.99)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   lag1_time_to_open_hours_sum and time_to_open_hours_acceleration are highly correlated (r=-0.78)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag1_time_to_open_hours_mean and lag1_time_to_open_hours_max are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   lag1_time_to_open_hours_mean and time_to_open_hours_acceleration are highly correlated (r=-0.78)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   lag1_time_to_open_hours_count and opened_acceleration are highly correlated (r=-0.79)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   lag1_time_to_open_hours_max and time_to_open_hours_acceleration are highly correlated (r=-0.81)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag2_opened_sum and lag2_opened_mean are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   lag2_opened_sum and lag2_time_to_open_hours_sum are highly correlated (r=0.74)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag2_opened_sum and lag2_time_to_open_hours_count are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag2_opened_mean and lag2_time_to_open_hours_count are highly correlated (r=0.91)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag2_opened_count and lag2_clicked_count are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag2_opened_count and lag2_send_hour_sum are highly correlated (r=0.88)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag2_opened_count and lag2_send_hour_count are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag2_opened_count and lag2_bounced_count are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag2_clicked_sum and lag2_clicked_mean are highly correlated (r=0.92)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag2_clicked_count and lag2_send_hour_sum are highly correlated (r=0.88)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag2_clicked_count and lag2_send_hour_count are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag2_clicked_count and lag2_bounced_count are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag2_send_hour_sum and lag2_send_hour_count are highly correlated (r=0.88)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag2_send_hour_sum and lag2_bounced_count are highly correlated (r=0.88)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag2_send_hour_mean and lag2_send_hour_max are highly correlated (r=0.96)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag2_send_hour_count and lag2_bounced_count are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag2_time_to_open_hours_sum and lag2_time_to_open_hours_mean are highly correlated (r=0.94)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟡 Remove multicollinear feature
   lag2_time_to_open_hours_sum and lag2_time_to_open_hours_count are highly correlated (r=0.74)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag2_time_to_open_hours_sum and lag2_time_to_open_hours_max are highly correlated (r=0.99)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag2_time_to_open_hours_mean and lag2_time_to_open_hours_max are highly correlated (r=0.98)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag3_opened_sum and lag3_opened_mean are highly correlated (r=0.94)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag3_opened_sum and lag3_time_to_open_hours_count are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag3_opened_mean and lag3_time_to_open_hours_count are highly correlated (r=0.94)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag3_opened_count and lag3_clicked_count are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag3_opened_count and lag3_send_hour_sum are highly correlated (r=0.87)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag3_opened_count and lag3_send_hour_count are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🔴 Remove multicollinear feature
   lag3_opened_count and lag3_bounced_count are highly correlated (r=1.00)
   → Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   lag3_clicked_count and lag3_send_hour_sum are highly correlated (r=0.87)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   lag3_clicked_count and lag3_send_hour_count are highly correlated (r=1.00)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   lag3_clicked_count and lag3_bounced_count are highly correlated (r=1.00)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   lag3_send_hour_sum and lag3_send_hour_count are highly correlated (r=0.87)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   lag3_send_hour_sum and lag3_bounced_count are highly correlated (r=0.87)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   lag3_send_hour_mean and lag3_send_hour_max are highly correlated (r=0.96)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   lag3_send_hour_count and lag3_bounced_count are highly correlated (r=1.00)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   lag3_time_to_open_hours_sum and lag3_time_to_open_hours_mean are highly correlated (r=0.99)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   lag3_time_to_open_hours_sum and lag3_time_to_open_hours_max are highly correlated (r=1.00)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   lag3_time_to_open_hours_mean and lag3_time_to_open_hours_max are highly correlated (r=1.00)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   opened_velocity and opened_velocity_pct are highly correlated (r=0.92)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   opened_velocity and opened_acceleration are highly correlated (r=0.84)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   opened_velocity and opened_vs_cohort_mean are highly correlated (r=0.70)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   opened_velocity and opened_vs_cohort_pct are highly correlated (r=0.70)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   opened_velocity and opened_cohort_zscore are highly correlated (r=0.70)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   opened_velocity_pct and opened_vs_cohort_mean are highly correlated (r=1.00)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   opened_velocity_pct and opened_vs_cohort_pct are highly correlated (r=1.00)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   opened_velocity_pct and opened_cohort_zscore are highly correlated (r=1.00)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   opened_velocity_pct and time_to_open_hours_vs_cohort_mean are highly correlated (r=0.70)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   opened_velocity_pct and time_to_open_hours_vs_cohort_pct are highly correlated (r=0.70)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   opened_velocity_pct and time_to_open_hours_cohort_zscore are highly correlated (r=0.70)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   clicked_velocity and clicked_acceleration are highly correlated (r=0.85)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   send_hour_velocity and send_hour_velocity_pct are highly correlated (r=0.89)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   send_hour_velocity and send_hour_acceleration are highly correlated (r=0.79)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   send_hour_velocity and send_hour_vs_cohort_mean are highly correlated (r=0.77)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   send_hour_velocity and send_hour_vs_cohort_pct are highly correlated (r=0.77)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   send_hour_velocity and send_hour_cohort_zscore are highly correlated (r=0.77)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   send_hour_velocity_pct and send_hour_acceleration are highly correlated (r=0.76)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   bounced_velocity and bounced_acceleration are highly correlated (r=0.87)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   bounced_velocity and bounced_vs_cohort_mean are highly correlated (r=0.79)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   bounced_velocity and bounced_vs_cohort_pct are highly correlated (r=0.79)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   bounced_velocity and bounced_cohort_zscore are highly correlated (r=0.79)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   time_to_open_hours_velocity and time_to_open_hours_acceleration are highly correlated (r=0.85)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   time_to_open_hours_velocity and time_to_open_hours_momentum are highly correlated (r=0.72)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   time_to_open_hours_velocity and time_to_open_hours_vs_cohort_mean are highly correlated (r=0.77)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   time_to_open_hours_velocity and time_to_open_hours_vs_cohort_pct are highly correlated (r=0.77)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   time_to_open_hours_velocity and time_to_open_hours_cohort_zscore are highly correlated (r=0.77)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   opened_acceleration and time_to_open_hours_acceleration are highly correlated (r=0.71)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   opened_momentum and opened_vs_cohort_mean are highly correlated (r=0.77)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   opened_momentum and opened_vs_cohort_pct are highly correlated (r=0.77)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   opened_momentum and opened_cohort_zscore are highly correlated (r=0.77)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   time_to_open_hours_momentum and time_to_open_hours_vs_cohort_mean are highly correlated (r=0.89)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   time_to_open_hours_momentum and time_to_open_hours_vs_cohort_pct are highly correlated (r=0.89)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   time_to_open_hours_momentum and time_to_open_hours_cohort_zscore are highly correlated (r=0.89)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   opened_beginning and time_to_open_hours_beginning are highly correlated (r=0.77)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   opened_end and opened_trend_ratio are highly correlated (r=0.73)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

🟑 Remove multicollinear feature
   opened_end and time_to_open_hours_end are highly correlated (r=0.79)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   clicked_end and clicked_trend_ratio are highly correlated (r=0.92)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   bounced_end and bounced_trend_ratio are highly correlated (r=0.99)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   days_since_first_event_y and active_span_days are highly correlated (r=1.00)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   inter_event_gap_std and inter_event_gap_max are highly correlated (r=0.89)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   opened_vs_cohort_mean and opened_vs_cohort_pct are highly correlated (r=1.00)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   opened_vs_cohort_mean and opened_cohort_zscore are highly correlated (r=1.00)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   opened_vs_cohort_pct and opened_cohort_zscore are highly correlated (r=1.00)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   clicked_vs_cohort_mean and clicked_vs_cohort_pct are highly correlated (r=1.00)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   clicked_vs_cohort_mean and clicked_cohort_zscore are highly correlated (r=1.00)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   clicked_vs_cohort_pct and clicked_cohort_zscore are highly correlated (r=1.00)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   send_hour_vs_cohort_mean and send_hour_vs_cohort_pct are highly correlated (r=1.00)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   send_hour_vs_cohort_mean and send_hour_cohort_zscore are highly correlated (r=1.00)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   send_hour_vs_cohort_pct and send_hour_cohort_zscore are highly correlated (r=1.00)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   bounced_vs_cohort_mean and bounced_vs_cohort_pct are highly correlated (r=1.00)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   bounced_vs_cohort_mean and bounced_cohort_zscore are highly correlated (r=1.00)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   bounced_vs_cohort_pct and bounced_cohort_zscore are highly correlated (r=1.00)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   time_to_open_hours_vs_cohort_mean and time_to_open_hours_vs_cohort_pct are highly correlated (r=1.00)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   time_to_open_hours_vs_cohort_mean and time_to_open_hours_cohort_zscore are highly correlated (r=1.00)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Remove multicollinear feature
   time_to_open_hours_vs_cohort_pct and time_to_open_hours_cohort_zscore are highly correlated (r=1.00)
   β†’ Consider dropping one of these features. Keep the one with stronger business meaning or higher target correlation.

πŸ”΄ Prioritize strong predictors
   Top predictive features: days_since_last_event_x, days_since_first_event_y, active_span_days
   β†’ Ensure these features are included in your model and check for data quality issues.

🟒 Consider removing weak predictors
   Features with low predictive power: send_hour_mean_180d, send_hour_max_180d, bounced_sum_180d, bounced_mean_180d, time_to_open_hours_mean_180d
   β†’ These features may add noise. Consider removing or combining with other features.
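
The advice repeated above ("keep the one with ... higher target correlation") can be applied mechanically rather than pair by pair. The following is a minimal sketch under stated assumptions, not the notebook's own logic: `prune_correlated` is a hypothetical helper, the 0.7 threshold mirrors the chapter's opening table, and the toy columns `x1`, `x2`, `x3` are made up for illustration.

```python
import numpy as np
import pandas as pd


def prune_correlated(features: pd.DataFrame, target: pd.Series, threshold: float = 0.7) -> list:
    """Hypothetical helper: pick features to drop so no kept pair has |r| >= threshold."""
    corr = features.corr().abs()
    target_corr = features.corrwith(target).abs()
    cols = list(corr.columns)

    # Collect every pair at or above the threshold (upper triangle only).
    pairs = [
        (corr.iloc[i, j], cols[i], cols[j])
        for i in range(len(cols))
        for j in range(i + 1, len(cols))
        if corr.iloc[i, j] >= threshold
    ]

    to_drop = set()
    # Resolve the most strongly correlated pairs first.
    for _, a, b in sorted(pairs, reverse=True):
        if a in to_drop or b in to_drop:
            continue  # this pair is already broken by an earlier drop
        # Drop whichever member is less correlated with the target.
        to_drop.add(a if target_corr[a] < target_corr[b] else b)
    return sorted(to_drop)


# Toy data (made up): x2 is a noisy copy of x1, x3 is independent.
rng = np.random.default_rng(0)
x1 = rng.normal(size=2000)
features = pd.DataFrame({
    "x1": x1,
    "x2": x1 + 0.3 * rng.normal(size=2000),
    "x3": rng.normal(size=2000),
})
target = pd.Series((x1 > 0).astype(int))

dropped = prune_correlated(features, target)
print(dropped)
```

Business meaning, the other criterion named in the recommendations, is not captured by this rule, so the resulting list is best treated as a starting point for review.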

5.9.2 Stratification Recommendations

What these recommendations tell you:

  • How to split your data for training and testing
  • Which segments require special attention in sampling
  • High-risk segments that need adequate representation

⚠️ Why This Matters:

  • Random splits can under-represent rare segments
  • High-risk segments may be systematically excluded
  • Model evaluation will be biased without proper stratification

📊 Implementation:

from sklearn.model_selection import train_test_split

# Stratified split by target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

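# Multi-column stratification (for categorical segments):
# build a combined key, then pass it to `stratify`
df['stratify_col'] = df['target'].astype(str) + '_' + df['segment'].astype(str)
train_df, test_df = train_test_split(
    df, test_size=0.2, stratify=df['stratify_col'], random_state=42
)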
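
A combined key such as `stratify_col` can produce strata with a single row, and `train_test_split` raises a `ValueError` when any stratum has fewer than two members. The sketch below shows one possible workaround, assuming it is acceptable to pool rare combinations into a catch-all stratum. `build_stratify_key`, the `"other"` label, and the toy data are hypothetical and not part of the notebook.

```python
import pandas as pd
from sklearn.model_selection import train_test_split


def build_stratify_key(df, columns, min_count=2, fallback="other"):
    """Hypothetical helper: join columns into one key and pool rare strata."""
    key = df[columns].astype(str).agg("_".join, axis=1)
    counts = key.map(key.value_counts())
    # Strata smaller than min_count are pooled into a single fallback stratum.
    # Caveat: the fallback mixes target classes, and it can itself end up with
    # fewer than min_count rows if only one row in the data is rare.
    return key.where(counts >= min_count, fallback)


# Toy data (made up): two segment values occur only once within their class.
toy = pd.DataFrame({
    "target": [0] * 10 + [1] * 10,
    "segment": ["a"] * 9 + ["b"] + ["a"] * 5 + ["c"] * 4 + ["d"],
})

key = build_stratify_key(toy, ["target", "segment"])
train_df, test_df = train_test_split(
    toy, test_size=0.25, stratify=key, random_state=42
)
print(key.value_counts())
```

Pooling keeps the split from failing, but it weakens the guarantee for the pooled rows, so it is worth checking the stratum counts before relying on it.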
In [14]:
# Stratification Recommendations
strat_recs = grouped_recs.get(RecommendationCategory.STRATIFICATION, [])

print("=" * 70)
print("STRATIFICATION (Train/Test Split Strategy)")
print("=" * 70)

# High-risk segments
if analysis_summary.high_risk_segments:
    print("\n🎯 HIGH-RISK SEGMENTS (ensure representation in training data):")
    risk_df = pd.DataFrame(analysis_summary.high_risk_segments)
    risk_df["retention_rate"] = risk_df["retention_rate"].apply(lambda x: f"{x:.1%}")
    risk_df["lift"] = risk_df["lift"].apply(lambda x: f"{x:.2f}x")
    display(risk_df[["feature", "segment", "count", "retention_rate", "lift"]])

    print("\n   💡 These segments have below-average retention.")
    print("   → Ensure they're adequately represented in both train and test sets")
    print("   → Consider oversampling or class weights in modeling")

# Display all stratification recommendations
if strat_recs:
    print("\n" + "-" * 70)
    print("STRATIFICATION RECOMMENDATIONS:")
    for rec in strat_recs:
        priority_icon = "🔴" if rec.priority == "high" else "🟡" if rec.priority == "medium" else "🟢"
        print(f"\n{priority_icon} {rec.title}")
        print(f"   {rec.description}")
        print(f"   → {rec.action}")
else:
    print("\n✅ No special stratification requirements detected.")
    print("   Standard stratified split by target variable is sufficient.")
======================================================================
STRATIFICATION (Train/Test Split Strategy)
======================================================================

🎯 HIGH-RISK SEGMENTS (ensure representation in training data):

   feature             segment             count  retention_rate  lift
0  lifecycle_quadrant  Occasional & Loyal   1683            7.6%  0.17x
1  lifecycle_quadrant  Steady & Loyal        820           10.4%  0.23x
2  recency_bucket      0-7d                  123            4.9%  0.11x
3  recency_bucket      31-90d                725            5.9%  0.13x
4  recency_bucket      8-30d                 364            2.2%  0.05x
5  recency_bucket      91-180d               702            7.0%  0.16x

   💡 These segments have below-average retention.
   → Ensure they're adequately represented in both train and test sets
   → Consider oversampling or class weights in modeling

----------------------------------------------------------------------
STRATIFICATION RECOMMENDATIONS:

🔴 Stratify by lifecycle_quadrant
   Significant variation in retention rates across lifecycle_quadrant categories (spread: 75.1%)
   → Use stratified sampling by lifecycle_quadrant in train/test split to ensure all segments are represented.

🔴 Stratify by recency_bucket
   Significant variation in retention rates across recency_bucket categories (spread: 66.6%)
   → Use stratified sampling by recency_bucket in train/test split to ensure all segments are represented.

🔴 Monitor high-risk segments
   Segments with below-average retention: Steady & Loyal, Occasional & Loyal, 0-7d
   → Target these segments for intervention campaigns and ensure adequate representation in training data.
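
The `retention_rate` and `lift` columns in the high-risk table can be reproduced with a short group-by. This sketch assumes lift is the segment retention rate divided by the overall retention rate, which is the usual definition and appears consistent with the rows above (they imply an overall rate of roughly 45%). The notebook's own computation is not shown here, so treat `segment_lift` and the toy data as illustrative only.

```python
import pandas as pd


def segment_lift(df: pd.DataFrame, feature: str, target: str) -> pd.DataFrame:
    """Hypothetical helper: per-segment count, retention rate, and lift."""
    overall_rate = df[target].mean()
    out = df.groupby(feature)[target].agg(count="size", retention_rate="mean")
    # Assumed definition: lift < 1 means the segment retains worse than average.
    out["lift"] = out["retention_rate"] / overall_rate
    return out.sort_values("lift")


# Toy data (made up): segment "A" retains well, segment "B" does not.
toy = pd.DataFrame({
    "recency_bucket": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "retained": [1, 1, 1, 0, 0, 0, 0, 1],
})

lift_table = segment_lift(toy, "recency_bucket", "retained")
print(lift_table)
```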

5.9.3 Model Selection Recommendations

What these recommendations tell you:

  • Which model types are well-suited for your data characteristics
  • Linear vs non-linear based on relationship patterns
  • Ensemble considerations based on feature interactions

📊 Model Selection Guide Based on Data Characteristics:

| Data Characteristic | Recommended Models | Reason |
|---|---|---|
| Strong linear relationships | Logistic Regression, Linear SVM | Interpretable, fast, less overfit risk |
| Non-linear patterns | Random Forest, XGBoost, LightGBM | Capture complex interactions |
| High multicollinearity | Tree-based models | Robust to correlated features |
| Many categorical features | CatBoost, LightGBM | Native categorical handling |
| Imbalanced classes | Any with class_weight='balanced' | Adjust for minority class |
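
The guide is easiest to act on by scoring a linear baseline and a tree ensemble on the same folds. The sketch below is a rough illustration under stated assumptions: it uses synthetic data from `make_classification` instead of the chapter's dataset, ROC AUC as an arbitrary metric choice, and default-ish hyperparameters. None of these come from the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced data with redundant (correlated) columns.
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=4, n_redundant=4,
    weights=[0.8, 0.2], random_state=42,
)

models = {
    # Linear baseline: scaling plus class weighting for the minority class.
    "logistic_regression": make_pipeline(
        StandardScaler(),
        LogisticRegression(class_weight="balanced", max_iter=1000),
    ),
    # Tree ensemble: no scaling needed.
    "random_forest": RandomForestClassifier(
        n_estimators=100, class_weight="balanced", random_state=42
    ),
}

# Same stratified folds for both models so the scores are comparable.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: mean ROC AUC = {score:.3f}")
```

If the two scores are close, the linear model's interpretability may be the deciding factor; a large gap in favor of the ensemble suggests non-linear structure.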
In [15]:
# Model Selection Recommendations
model_recs = grouped_recs.get(RecommendationCategory.MODEL_SELECTION, [])

print("=" * 70)
print("MODEL SELECTION")
print("=" * 70)

if model_recs:
    for rec in model_recs:
        priority_icon = "🔴" if rec.priority == "high" else "🟡" if rec.priority == "medium" else "🟢"
        print(f"\n{priority_icon} {rec.title}")
        print(f"   {rec.description}")
        print(f"   → {rec.action}")

# Summary recommendations based on data characteristics
print("\n" + "-" * 70)
print("RECOMMENDED MODELING APPROACH:")

has_multicollinearity = len(analysis_summary.multicollinear_pairs) > 0
has_strong_linear = len([p for p in analysis_summary.strong_predictors if abs(p.get("effect_size", 0)) >= 0.5]) > 0
has_categoricals = len(categorical_features) > 0

if has_strong_linear and not has_multicollinearity:
    print("\n✅ RECOMMENDED: Start with Logistic Regression")
    print("   • Strong linear relationships detected")
    print("   • Interpretable coefficients for business insights")
    print("   • Fast training and inference")
    print("   • Then compare with tree-based ensemble for potential improvement")
elif has_multicollinearity:
    print("\n✅ RECOMMENDED: Start with Random Forest or XGBoost")
    print("   • Multicollinearity present - tree models handle it naturally")
    print("   • Can keep all features without VIF analysis")
    print("   • Use feature importance to understand contributions")
else:
    print("\n✅ RECOMMENDED: Compare Linear and Tree-Based Models")
    print("   • No clear linear dominance - test both approaches")
    print("   • Logistic Regression for interpretability baseline")
    print("   • Random Forest/XGBoost for potential accuracy gain")

if has_categoricals:
    print("\n💡 CATEGORICAL HANDLING:")
    print("   • For tree models: Consider CatBoost or LightGBM with native categorical support")
    print("   • For linear models: Use target encoding for high-cardinality features")
======================================================================
MODEL SELECTION
======================================================================

🟡 Consider tree-based models for multicollinearity
   Found 442 highly correlated feature pairs
   → Tree-based models (Random Forest, XGBoost) are robust to multicollinearity. For linear models, remove redundant features first.

🟡 Linear models may perform well
   Strong linear relationships detected (avg effect size: 0.99)
   → Start with Logistic Regression as baseline. Clear feature-target relationships suggest interpretable models may work well.

🟡 Categorical features are predictive
   Strong categorical associations: lifecycle_quadrant, recency_bucket
   → Use target encoding for tree-based models or one-hot encoding for linear models. Consider CatBoost for native categorical handling.

----------------------------------------------------------------------
RECOMMENDED MODELING APPROACH:

✅ RECOMMENDED: Start with Random Forest or XGBoost
   • Multicollinearity present - tree models handle it naturally
   • Can keep all features without VIF analysis
   • Use feature importance to understand contributions

💡 CATEGORICAL HANDLING:
   • For tree models: Consider CatBoost or LightGBM with native categorical support
   • For linear models: Use target encoding for high-cardinality features
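
The output mentions skipping VIF analysis for tree models. If a linear model is chosen instead, the variance inflation factor for feature j is 1 / (1 - R_j^2), where R_j^2 comes from regressing feature j on the other features; numerically this equals the j-th diagonal entry of the inverse correlation matrix. The sketch below uses that identity. `vif_table` and the toy columns are hypothetical, and perfectly collinear features (r = 1.00, as in several pairs reported above) make the matrix singular, so exact duplicates would need to be dropped first.

```python
import numpy as np
import pandas as pd


def vif_table(features: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: VIF per column via the inverse correlation matrix."""
    corr = np.corrcoef(features.to_numpy(dtype=float), rowvar=False)
    # Diagonal of the inverse correlation matrix gives 1 / (1 - R_j^2).
    # np.linalg.inv raises LinAlgError if the matrix is exactly singular.
    vif = np.diag(np.linalg.inv(corr))
    return pd.Series(vif, index=features.columns, name="vif").sort_values(ascending=False)


# Toy data (made up): x2 is a noisy copy of x1, x3 is independent.
rng = np.random.default_rng(0)
x1 = rng.normal(size=1000)
toy = pd.DataFrame({
    "x1": x1,
    "x2": x1 + 0.3 * rng.normal(size=1000),
    "x3": rng.normal(size=1000),
})

vif = vif_table(toy)
print(vif.round(2))
```

A common rule of thumb flags VIF values above 5 or 10; the right cutoff depends on the use case.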

5.9.4 Feature Engineering Recommendations

What these recommendations tell you:

  • Interaction features to create based on correlation patterns
  • Ratio features that may capture relative relationships
  • Polynomial features for non-linear patterns

📊 Common Feature Engineering Patterns:

| Pattern Found | Feature to Create | Example |
|---|---|---|
| Moderate correlation | Ratio feature | feature_a / feature_b |
| Both features predictive | Interaction term | feature_a * feature_b |
| Curved scatter pattern | Polynomial | feature_a ** 2 |
| Related semantics | Difference | total_orders - returned_orders |
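
The four patterns in the table map directly onto pandas column arithmetic. The sketch below uses the table's placeholder names `feature_a` and `feature_b` with made-up values; the zero-denominator handling is an added assumption, since the table does not say how division by zero should be treated.

```python
import numpy as np
import pandas as pd

# Made-up values; feature_b contains a zero to exercise the ratio guard.
df = pd.DataFrame({
    "feature_a": [10.0, 4.0, 0.0, 6.0],
    "feature_b": [2.0, 0.0, 5.0, 3.0],
})

# Ratio: zero denominators become NaN instead of producing inf.
df["a_over_b"] = df["feature_a"] / df["feature_b"].replace(0, np.nan)

# Interaction term.
df["a_times_b"] = df["feature_a"] * df["feature_b"]

# Polynomial (squared) term.
df["a_squared"] = df["feature_a"] ** 2

# Difference of semantically related quantities.
df["a_minus_b"] = df["feature_a"] - df["feature_b"]

print(df)
```

The NaN left by the ratio guard still has to be imputed or handled by the downstream model.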
In [16]:
# Feature Engineering Recommendations
eng_recs = grouped_recs.get(RecommendationCategory.FEATURE_ENGINEERING, [])

print("=" * 70)
print("FEATURE ENGINEERING")
print("=" * 70)

if eng_recs:
    for rec in eng_recs:
        priority_icon = "🔴" if rec.priority == "high" else "🟡" if rec.priority == "medium" else "🟢"
        print(f"\n{priority_icon} {rec.title}")
        print(f"   {rec.description}")
        print(f"   → {rec.action}")
        if rec.affected_features:
            print(f"   → Features: {', '.join(rec.affected_features[:5])}")
else:
    print("\n✅ No specific feature engineering recommendations based on correlation patterns.")
    print("   Consider domain-specific features based on business knowledge.")

# Additional suggestions based on strong predictors
if analysis_summary.strong_predictors:
    print("\n" + "-" * 70)
    print("POTENTIAL INTERACTION FEATURES:")
    strong_features = [p["feature"] for p in analysis_summary.strong_predictors[:5]]
    if len(strong_features) >= 2:
        print("\n   Based on strong predictors, consider interactions between:")
        for i, f1 in enumerate(strong_features[:3]):
            for f2 in strong_features[i+1:4]:
                print(f"   • {f1} × {f2}")
        print("\n   💡 Tree-based models discover interactions automatically.")
        print("   → For linear models, create explicit interaction columns.")
======================================================================
FEATURE ENGINEERING
======================================================================

🟢 Consider ratio features
   Moderately correlated pairs may benefit from ratio features: event_count_180d/event_count_all_time, event_count_180d/opened_sum_180d, event_count_180d/clicked_sum_180d
   → Create ratio features (e.g., feature_a / feature_b) to capture relative relationships.
   → Features: event_count_180d, event_count_180d, event_count_180d, event_count_all_time, opened_sum_180d

🟢 Test feature interactions
   Interaction terms may capture non-linear relationships
   → Use PolynomialFeatures(interaction_only=True) or tree-based models which automatically discover interactions.
   → Features: event_count_180d, event_count_365d, event_count_all_time, opened_sum_180d

----------------------------------------------------------------------
POTENTIAL INTERACTION FEATURES:

   Based on strong predictors, consider interactions between:
   • event_count_180d × event_count_365d
   • event_count_180d × event_count_all_time
   • event_count_180d × opened_sum_180d
   • event_count_365d × event_count_all_time
   • event_count_365d × opened_sum_180d
   • event_count_all_time × opened_sum_180d

   💡 Tree-based models discover interactions automatically.
   → For linear models, create explicit interaction columns.

5.9.5 Recommendations Summary Table

In [17]:
# Create summary table of all recommendations
all_recs_data = []
for rec in analysis_summary.recommendations:
    all_recs_data.append({
        "Category": rec.category.value.replace("_", " ").title(),
        "Priority": rec.priority.upper(),
        "Recommendation": rec.title,
        "Action": rec.action[:80] + "..." if len(rec.action) > 80 else rec.action
    })

if all_recs_data:
    recs_df = pd.DataFrame(all_recs_data)

    # Sort by priority
    priority_order = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}
    recs_df["_sort"] = recs_df["Priority"].map(priority_order)
    recs_df = recs_df.sort_values("_sort").drop("_sort", axis=1)

    print("=" * 80)
    print("ALL RECOMMENDATIONS SUMMARY")
    print("=" * 80)
    print(f"\nTotal: {len(recs_df)} recommendations")
    print(f"  🔴 High priority: {len(recs_df[recs_df['Priority'] == 'HIGH'])}")
    print(f"  🟡 Medium priority: {len(recs_df[recs_df['Priority'] == 'MEDIUM'])}")
    print(f"  🟢 Low priority: {len(recs_df[recs_df['Priority'] == 'LOW'])}")

    display(recs_df)

# Save updated findings and recommendations registry
findings.save(FINDINGS_PATH)
registry.save(RECOMMENDATIONS_PATH)

print(f"\n✅ Findings updated with relationship analysis: {FINDINGS_PATH}")
print(f"✅ Recommendations registry saved: {RECOMMENDATIONS_PATH}")
print(f"   Total recommendations in registry: {len(registry.all_recommendations)}")

if _namespace:
    from customer_retention.analysis.auto_explorer.project_context import ProjectContext

    _namespace.merged_dir.mkdir(parents=True, exist_ok=True)
    _all_findings = _namespace.discover_all_findings(prefer_aggregated=True)
    _mgr = ExplorationManager(_namespace.merged_dir, findings_paths=_all_findings)

    _scaffold = []
    if _namespace.project_context_path.exists():
        _ctx = ProjectContext.load(_namespace.project_context_path)
        _scaffold = _ctx.merge_scaffold

    _multi = _mgr.create_multi_dataset_findings(merge_scaffold=_scaffold)
    _multi.save(str(_namespace.multi_dataset_findings_path))
    _per_dataset_recs = []
    for _ds_name in _namespace.list_datasets():
        _ds_dir = _namespace.dataset_findings_dir(_ds_name)
        if _ds_dir.is_dir():
            for _rp in sorted(_ds_dir.glob("*_recommendations.yaml")):
                _per_dataset_recs.append(RecommendationRegistry.load(str(_rp)))
    if _per_dataset_recs:
        _merged = RecommendationRegistry.merge(_per_dataset_recs)
        _merged.save(str(_namespace.merged_recommendations_path))
    print(f"\n✅ Merged findings/recommendations saved to {_namespace.merged_dir}")
================================================================================
ALL RECOMMENDATIONS SUMMARY
================================================================================

Total: 452 recommendations
  πŸ”΄ High priority: 241
  🟑 Medium priority: 208
  🟒 Low priority: 3
Category Priority Recommendation Action
357 Feature Selection HIGH Remove multicollinear feature Consider dropping one of these features. Keep ...
179 Feature Selection HIGH Remove multicollinear feature Consider dropping one of these features. Keep ...
336 Feature Selection HIGH Remove multicollinear feature Consider dropping one of these features. Keep ...
335 Feature Selection HIGH Remove multicollinear feature Consider dropping one of these features. Keep ...
334 Feature Selection HIGH Remove multicollinear feature Consider dropping one of these features. Keep ...
... ... ... ... ...
157 Feature Selection MEDIUM Remove multicollinear feature Consider dropping one of these features. Keep ...
170 Feature Selection MEDIUM Remove multicollinear feature Consider dropping one of these features. Keep ...
443 Feature Selection LOW Consider removing weak predictors These features may add noise. Consider removin...
450 Feature Engineering LOW Consider ratio features Create ratio features (e.g., feature_a / featu...
451 Feature Engineering LOW Test feature interactions Use PolynomialFeatures(interaction_only=True) ...

452 rows × 4 columns

✅ Findings updated with relationship analysis: /Users/Vital/python/CustomerRetention/experiments/runs/email-6301db6c/datasets/customer_emails/findings/customer_emails_aggregated_findings.yaml
✅ Recommendations registry saved: /Users/Vital/python/CustomerRetention/experiments/runs/email-6301db6c/datasets/customer_emails/findings/customer_emails_aggregated_recommendations.yaml
   Total recommendations in registry: 674
✅ Merged findings/recommendations saved to /Users/Vital/python/CustomerRetention/experiments/runs/email-6301db6c/merged

Summary: What We Learned

In this notebook, we analyzed feature relationships and generated actionable recommendations for modeling.

Analysis Performed

Numeric Features:

  1. Correlation Matrix - Identified multicollinearity issues between feature pairs
  2. Effect Sizes (Cohen's d) - Quantified how well features discriminate retained vs churned
  3. Box Plots - Visualized distribution differences between classes
  4. Feature-Target Correlations - Ranked features by predictive power
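
As a rough illustration of items 2 and 4 above, the sketch below computes Cohen's d with a pooled standard deviation and a Pearson correlation against a 0/1 target. The helper name `cohens_d`, the toy data, and the column names `opens` and `retained` are assumptions made for this example; the notebook's own implementation may differ.

```python
import numpy as np
import pandas as pd


def cohens_d(values: pd.Series, target: pd.Series) -> float:
    """Standardized mean difference between retained (1) and churned (0)."""
    retained = values[target == 1].dropna()
    churned = values[target == 0].dropna()
    n1, n0 = len(retained), len(churned)
    # Pooled variance, weighting each class variance by its degrees of freedom.
    pooled_var = (
        (n1 - 1) * retained.var(ddof=1) + (n0 - 1) * churned.var(ddof=1)
    ) / (n1 + n0 - 2)
    return float((retained.mean() - churned.mean()) / np.sqrt(pooled_var))


# Hypothetical toy data; column names are illustrative only.
toy = pd.DataFrame(
    {
        "opens": [9, 10, 11, 12, 3, 4, 5, 6],
        "retained": [1, 1, 1, 1, 0, 0, 0, 0],
    }
)

d = cohens_d(toy["opens"], toy["retained"])
# Pearson r against a binary target is the point-biserial correlation.
r = toy["opens"].corr(toy["retained"])
print(f"Cohen's d = {d:.2f}, r = {r:.2f}")
```

Both numbers describe the same class separation on different scales, so ranking features by either one usually gives a similar order.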

Categorical Features:

  5. Cramér's V - Measured association strength for categorical variables
  6. Retention by Category - Identified high-risk segments
  7. Lift Analysis - Found categories performing above/below average
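
The categorical measures listed above can be sketched with plain pandas and NumPy. This is a minimal version under stated assumptions: `cramers_v` and `retention_lift` are hypothetical helper names, the chi-square statistic is computed without any continuity or bias correction, and the `plan` and `retained` columns are made-up toy data.

```python
import numpy as np
import pandas as pd


def cramers_v(feature: pd.Series, target: pd.Series) -> float:
    """Cramér's V from an uncorrected chi-square statistic."""
    observed = pd.crosstab(feature, target).to_numpy(dtype=float)
    n = observed.sum()
    # Expected counts under independence: row total * column total / n.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / n
    chi2 = ((observed - expected) ** 2 / expected).sum()
    # Assumes at least two categories and two target classes.
    min_dim = min(observed.shape) - 1
    return float(np.sqrt(chi2 / (n * min_dim)))


def retention_lift(feature: pd.Series, target: pd.Series) -> pd.Series:
    """Per-category retention rate divided by the overall retention rate."""
    return target.groupby(feature).mean() / target.mean()


# Hypothetical toy data; column names are illustrative only.
toy = pd.DataFrame(
    {
        "plan": ["basic"] * 4 + ["premium"] * 4,
        "retained": [1, 0, 0, 0, 1, 1, 1, 0],
    }
)

v = cramers_v(toy["plan"], toy["retained"])
lift = retention_lift(toy["plan"], toy["retained"])
print(f"Cramér's V = {v:.2f}")
print(lift)
```

A lift below 1 marks a category that retains worse than average, and a lift above 1 marks one that retains better.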

Datetime Features:

  8. Cohort Analysis - Retention trends by signup year
  9. Seasonality - Monthly patterns in retention
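
The datetime analyses amount to grouping the target by a calendar component. The sketch below assumes a hypothetical `signup_date` column and a 0/1 `retained` target; it is an illustration, not the notebook's own code.

```python
import pandas as pd

# Hypothetical toy data; column names are illustrative only.
toy = pd.DataFrame(
    {
        "signup_date": pd.to_datetime(
            [
                "2021-01-15",
                "2021-06-03",
                "2022-01-20",
                "2022-06-11",
                "2022-06-25",
                "2023-01-09",
            ]
        ),
        "retained": [1, 0, 1, 1, 0, 1],
    }
)

# Cohort view: retention rate and cohort size per signup year.
by_year = toy.groupby(toy["signup_date"].dt.year)["retained"].agg(["mean", "size"])

# Seasonality view: retention rate and row count per signup month.
by_month = toy.groupby(toy["signup_date"].dt.month)["retained"].agg(["mean", "size"])

print(by_year)
print(by_month)
```

Keeping the group size next to the rate helps avoid over-reading a cohort or month that has only a handful of customers.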

Actionable Recommendations Generated

Category What It Tells You Impact on Pipeline
Feature Selection Which features to prioritize/drop Reduces noise, improves interpretability
Stratification How to split train/test Ensures fair evaluation
Model Selection Which algorithms to try first Matches model to data
Feature Engineering Interactions to create Captures non-linear patterns

Key Metrics Reference

Data Type Effect Measure Strong Signal
Numeric Cohen's d |d| ≥ 0.8
Numeric Correlation |r| ≥ 0.5
Categorical Cramér's V V ≥ 0.3
Categorical Lift < 0.9x or > 1.1x
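
The thresholds in the table above can be encoded as a small lookup so that features are flagged consistently. The names `STRONG_SIGNAL_RULES` and `is_strong_signal` and the measure keys are hypothetical; only the cut-off values come from the table.

```python
# Cut-offs copied from the reference table; the keys are illustrative names.
STRONG_SIGNAL_RULES = {
    "cohens_d": lambda value: abs(value) >= 0.8,
    "correlation": lambda value: abs(value) >= 0.5,
    "cramers_v": lambda value: value >= 0.3,
    "lift": lambda value: value < 0.9 or value > 1.1,
}


def is_strong_signal(measure: str, value: float) -> bool:
    """Return True when the value meets the table's strong-signal rule."""
    return STRONG_SIGNAL_RULES[measure](value)


print(is_strong_signal("cohens_d", -0.95))
print(is_strong_signal("lift", 1.05))
```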

Recommended Actions Checklist

Based on the analysis above, here are the key actions to take:

  • Feature Selection: Review strong/weak predictors and multicollinear pairs
  • Stratification: Use stratified sampling with identified high-risk segments
  • Model Selection: Start with recommended model type based on data characteristics
  • Feature Engineering: Create interaction features between strong predictors
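
Two of the checklist items can be sketched in a few lines of pandas: a stratified hold-out split that preserves the retention rate, and a simple product interaction between two predictors. The column names, the 50% hold-out fraction, and the `opens_x_clicks` feature are assumptions for this example; a library splitter or the `PolynomialFeatures(interaction_only=True)` approach named in the recommendations output would serve the same purpose.

```python
import pandas as pd

# Hypothetical toy data; column names are illustrative only.
toy = pd.DataFrame(
    {
        "opens": [9, 10, 11, 12, 3, 4, 5, 6, 7, 8],
        "clicks": [2, 3, 3, 4, 0, 1, 1, 0, 2, 1],
        "retained": [1, 1, 1, 1, 1, 1, 0, 0, 0, 0],
    }
)

# Stratified hold-out: sample the same fraction from each target class.
test = toy.groupby("retained", group_keys=False).sample(frac=0.5, random_state=42)
train = toy.drop(test.index)

# Simplest pairwise interaction: the product of two predictors.
train = train.assign(opens_x_clicks=train["opens"] * train["clicks"])
test = test.assign(opens_x_clicks=test["opens"] * test["clicks"])

print(train["retained"].mean(), test["retained"].mean())
```

Sampling within each class keeps the churn rate the same in both splits, which matters most when one class is rare.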

Next Steps

Continue to 05_feature_opportunities.ipynb to:

  • Generate derived features (tenure, recency, engagement scores)
  • Identify interaction features based on relationships found here
  • Create business-relevant composite scores
  • Review automated feature recommendations

Save Reminder: Save this notebook (Ctrl+S / Cmd+S) before running the next one. The next notebook will automatically export this notebook's HTML documentation from the saved file.