Data Studio

Profile, transform, and prepare your data

Data Studio is the second tab (Cmd+2). It lets you profile, transform, and clean your data before training — without leaving MathExec.

Profiling

Select a dataset to see an automatic profile with:

  • Column statistics — type, missing values, unique count, mean/median/std for numerics
  • Data table — sortable, scrollable preview of all rows
  • Distribution hints — quick stats per column header
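
The column statistics listed above can be approximated in pandas. This is a minimal sketch for orientation, not MathExec's actual profiler; the function name `profile` and the sample data are made up for illustration.

```python
import pandas as pd


def profile(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row of summary statistics per column (illustrative helper)."""
    rows = []
    for name in df.columns:
        col = df[name]
        numeric = pd.api.types.is_numeric_dtype(col)
        rows.append({
            "column": name,
            "type": str(col.dtype),
            "missing": int(col.isna().sum()),
            "unique": int(col.nunique()),  # missing values are not counted
            # mean/median/std only make sense for numeric columns
            "mean": col.mean() if numeric else None,
            "median": col.median() if numeric else None,
            "std": col.std() if numeric else None,
        })
    return pd.DataFrame(rows).set_index("column")


# Sample data, invented for this example
df = pd.DataFrame({
    "age": [25, 32, None, 41],
    "city": ["Oslo", "Lima", "Oslo", None],
})
stats = profile(df)
print(stats)
```

Each row of `stats` corresponds to one column of the dataset, which mirrors how the profile is laid out per column.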

Toolbar

The Data Studio has its own set of tool shortcuts:

Tool       Key  Description
Filter     F    Filter rows by column conditions
Visualize  V    Quick charts and distributions
Clean      C    Handle missing values, duplicates, outliers
Derive     G    Create new columns from expressions
Ask        A    Natural-language transforms (LLM-powered)
Export     E    Download transformed dataset

Natural Language Transforms

The Ask tool lets you describe transforms in plain English. For example:

  • "Drop rows where age is negative"
  • "Normalize the price column"
  • "Create a BMI column from height and weight"
  • "One-hot encode the category column"

MathExec generates pandas code under the hood, shows you a before/after preview, and lets you apply or discard the transform.
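
To give a sense of what such generated code might look like, here is one plausible pandas rendering of the four example prompts. It is a sketch under assumptions, not the code MathExec actually emits: the column names `height_m` and `weight_kg` are invented, and "normalize" is interpreted here as min-max scaling, although other readings (such as z-score standardization) are equally reasonable.

```python
import pandas as pd

# Sample data, invented for this example
df = pd.DataFrame({
    "age": [34, -1, 52],
    "price": [10.0, 20.0, 40.0],
    "height_m": [1.70, 1.80, 1.60],
    "weight_kg": [65.0, 81.0, 64.0],
    "category": ["a", "b", "b"],
})

# "Drop rows where age is negative"
# (rows with a missing age are kept, since they are not negative)
df = df[~(df["age"] < 0)].copy()

# "Normalize the price column" (assumed meaning: min-max scaling to [0, 1])
lo, hi = df["price"].min(), df["price"].max()
df["price"] = (df["price"] - lo) / (hi - lo)

# "Create a BMI column from height and weight" (kg divided by metres squared)
df["bmi"] = df["weight_kg"] / df["height_m"] ** 2

# "One-hot encode the category column"
df = pd.get_dummies(df, columns=["category"])

print(df)
```

Ambiguous prompts like "normalize" are a good reason to check the before/after preview before applying a transform.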

💡 Tip

Each transform is added to a pipeline that you can review, reorder, or remove. The pipeline persists per dataset within your project.

Transform Pipeline

Every applied transform is recorded as a step in the pipeline. The pipeline panel shows:

  • Each step with a description and the generated code
  • A Remove button on each step to undo it individually

The pipeline is saved to localStorage, keyed by project and dataset.
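
The pipeline can be pictured as an ordered list of steps that is replayed against the original dataset. The sketch below models that idea in Python. Everything in it is hypothetical: the `Step` and `Pipeline` classes, the `storage_key` format, and the assumption that removing a step means re-running the remaining steps from the raw data are illustrations, not MathExec's documented internals.

```python
from dataclasses import dataclass, field
from typing import Callable

import pandas as pd


@dataclass
class Step:
    description: str   # human-readable summary shown in the panel
    code: str          # generated code shown alongside the description
    fn: Callable[[pd.DataFrame], pd.DataFrame]  # the transform itself


@dataclass
class Pipeline:
    project_id: str
    dataset_id: str
    steps: list[Step] = field(default_factory=list)

    @property
    def storage_key(self) -> str:
        # Hypothetical key format; the real one is not documented here.
        return f"pipeline:{self.project_id}:{self.dataset_id}"

    def remove(self, index: int) -> None:
        """Drop a single step; the remaining steps keep their order."""
        del self.steps[index]

    def run(self, df: pd.DataFrame) -> pd.DataFrame:
        """Replay every step, in order, starting from the given data."""
        for step in self.steps:
            df = step.fn(df)
        return df


# Sample data, invented for this example
raw = pd.DataFrame({"age": [34, -1, 52], "price": [10.0, 20.0, 40.0]})

pipe = Pipeline(project_id="demo", dataset_id="customers")
pipe.steps.append(Step(
    description="Drop rows where age is negative",
    code='df = df[~(df["age"] < 0)]',
    fn=lambda df: df[~(df["age"] < 0)],
))
pipe.steps.append(Step(
    description="Double the price column",
    code='df = df.assign(price=df["price"] * 2)',
    fn=lambda df: df.assign(price=df["price"] * 2),
))

both = pipe.run(raw)        # both steps applied
pipe.remove(0)              # undo the first step only
only_price = pipe.run(raw)  # replay what is left against the raw data
```

Because each step returns a new DataFrame instead of modifying its input, `raw` stays untouched and any subset of steps can be replayed safely.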

Suggestions

MathExec can suggest transforms based on your data. Click Suggest to get LLM-powered recommendations such as:

  • Handling missing values in specific columns
  • Encoding categorical variables
  • Scaling numeric features
  • Removing highly correlated columns
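
As an illustration of the last item, the sketch below shows one common rule-based way to flag highly correlated numeric columns in pandas. It is not how Suggest works (Suggest is LLM-powered); the function name `correlated_columns` and the 0.95 threshold are assumptions chosen for the example.

```python
import numpy as np
import pandas as pd


def correlated_columns(df: pd.DataFrame, threshold: float = 0.95) -> list[str]:
    """Return numeric columns that correlate strongly with an earlier column."""
    corr = df.select_dtypes(include="number").corr().abs()
    # Keep only the strict upper triangle so each pair is examined once
    # and a column is never compared with itself.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    upper = corr.where(mask)
    return [col for col in upper.columns if (upper[col] > threshold).any()]


# Sample data, invented for this example: y is an exact multiple of x
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "y": [2, 4, 6, 8, 10],
    "z": [5, 1, 4, 2, 3],
})
to_drop = correlated_columns(df)
print(to_drop)
```

For each correlated pair, this flags the later column and keeps the earlier one. Which of the two to drop is a judgment call that depends on the model and the meaning of the columns.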