Defining Custom Metrics and Remedies
This guide explains how to define custom metrics and remediation logic for your uploaded CSV files using the CustomDR class inside the CodeMirror editor. After uploading a dataset, you will navigate you to a page where you can write Python code that extends the platform’s data-review logic.
Workflow Overview for Creating Custom Metrics
Navigate to the file upload page at https://127.0.0.1:5000/upload_file. And upload a CSV file.
After upload, click the Define Custom Metrics button. You will be redirected to the custom metric editor page at https://127.0.0.1:5000/customMetrics.
On this page, a CodeMirror Python editor appears, preloaded with an editable example class
CustomDR. This class inherits fromBaseDRAgentand contains two key methods:metric(): returns a dictionary of metric resultsremedy(): returns a modified dataset based on your remediation logic
Write or modify your code inside the editor.
Press Save to store your custom logic on the server (temporary, 1-hour expiration).
Press Submit to execute your
metric()function, and optionally yourremedy()function if you have checked the Apply Remedy box.
The platform will display:
Your computed metric dictionary
Any remediated dataset to download (if remedy is enabled)
Any warnings or errors raised by your code
Understanding the CustomDR Base Class
Below is the template initially shown in CodeMirror:
from aidrin.custom_metrics.base_dr import BaseDRAgent
from typing import Any
from typing import Dict, Union, Any
class CustomDR(BaseDRAgent):
def __init__(self, dataset: Any, **kwargs):
super().__init__(dataset, **kwargs)
def metric(self, **kwargs):
"""
Implement your custom metric logic here.
"""
# IMPLEMENT YOUR METRIC LOGIC BELOW
# Example: Calculating the total number of missing cells in the entire DataFrame
# df: pd.DataFrame = self.dataset
# return {
# "total_missing_cells": df.isna().sum().to_dict()
# }
return {"message": "Placeholder metric. Implement your logic here."}
def remedy(self, metric_results: dict):
"""
Applies custom remediation logic based on the calculated metrics.
"""
# IMPLEMENT YOUR REMEDIATION LOGIC BELOW
# For example, filling null values with a default value
# df_remedied: pd.DataFrame = self.dataset.copy()
# df_remedied.fillna(0, inplace=True)
# return df_remedied
return self.dataset
The goal is to replace the placeholder logic with your own custom metric and remediation steps.
Implementing metric(): Requirements & Tips
Your metric() method:
Must return a dictionary whose keys are metric names and values are computed results.
Receives the dataset through
self.dataset.May accept additional keyword arguments (depending on future UI extensions).
Should not mutate the dataset; all transformations belong in
remedy().
Example: Compute missing values, row count, and column datatypes.
def metric(self, **kwargs):
df = self.dataset
return {
"row_count": len(df),
"column_count": df.shape[1],
"missing_values": df.isna().sum().to_dict(),
"dtypes": df.dtypes.apply(lambda x: str(x)).to_dict(),
}
Implementing remedy(): Requirements & Tips
The remedy() method receives the metric_results dictionary returned by metric().
Use this method when you want to apply data-cleaning or transformation logic based on your computed metrics. Or, you can modify the dataset directly without relying on metric_results.
You must return the updated dataset at the end of remedy().
Full Practical Example: A Combined Metric + Remedy Class
The example below shows both metric() and remedy() implemented in a realistic workflow.
from aidrin.custom_metrics.base_dr import BaseDRAgent
import pandas as pd
from typing import Dict, Union, Any
class CustomDR(BaseDRAgent):
"""
An agent focused on detecting and removing duplicate rows.
"""
def __init__(self, dataset: Any, **kwargs):
super().__init__(dataset, **kwargs)
def metric(self, **kwargs) -> Dict[str, int]:
"""
Calculates the total count of duplicate rows.
"""
df: pd.DataFrame = self.dataset
duplicate_rows_count: int = df.duplicated().sum()
return {
"duplicate_rows_total": duplicate_rows_count,
}
def remedy(self, metric_results: Dict[str, Any]) -> pd.DataFrame:
"""
Removes duplicate rows using the calculated metric results.
"""
# Create a copy for safe modification to prevent side effects on the original state.
df_remedied: pd.DataFrame = self.dataset.copy()
duplicate_count = metric_results.get("duplicate_rows_total", 0)
if duplicate_count > 0:
df_remedied.drop_duplicates(inplace=True)
return df_remedied
How the System Uses Your CustomDR Class
When you click Submit:
Your code is dynamically loaded and executed in an isolated environment.
The system creates an instance of your
CustomDRclass.The system calls your
metric()method to compute metrics.If Apply Remedy is checked , the system calls your
remedy()method to get the modified dataset.Metrics and (optionally) the remedied data preview are displayed on the results section of the page.
Best Practices for Writing Custom Metrics
Do not mutate the dataset inside ``metric()``. All modifications belong in
remedy().Work on a copy of
self.datasetinremedy()to avoid side effects.Always return the modified dataset at the end of
remedy().
Data & Code Storage Rules
Your custom metric code is stored temporarily for 1 hour.
The (optional) remedied dataset is also stored for 1 hour.
After expiration, all artifacts are safely removed.