Data Studio
Profile, transform, and prepare your data
Data Studio is the second tab (Cmd+2). It lets you profile, transform, and clean your data before training — without leaving MathExec.
Profiling
Select a dataset to see an automatic profile with:
- Column statistics — type, missing values, unique count, mean/median/std for numerics
- Data table — sortable, scrollable preview of all rows
- Distribution hints — quick stats per column header
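For readers who want to reproduce these statistics outside the app, the profile can be approximated in pandas. This is a minimal sketch, not MathExec's actual implementation: the `profile` helper and the sample columns are hypothetical, and the real profiler may count or round values differently.

```python
import pandas as pd


def profile(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row of summary statistics per column (hypothetical helper)."""
    rows = []
    for name in df.columns:
        col = df[name]
        row = {
            "column": name,
            "type": str(col.dtype),
            "missing": int(col.isna().sum()),
            "unique": int(col.nunique()),  # nunique() ignores missing values
        }
        # mean/median/std only make sense for numeric (non-boolean) columns
        is_numeric = pd.api.types.is_numeric_dtype(col)
        if is_numeric and not pd.api.types.is_bool_dtype(col):
            row.update(mean=col.mean(), median=col.median(), std=col.std())
        rows.append(row)
    return pd.DataFrame(rows).set_index("column")


# Sample data, made up for illustration
df = pd.DataFrame({
    "age": [25, 31, None, 40],
    "city": ["Oslo", "Lima", "Oslo", None],
})
summary = profile(df)
print(summary)
```

Non-numeric columns such as `city` get empty (NaN) cells for mean, median, and std.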
Toolbar
Data Studio has its own set of tool shortcuts:
| Tool | Key | Description |
|---|---|---|
| Filter | F | Filter rows by column conditions |
| Visualize | V | Quick charts and distributions |
| Clean | C | Handle missing values, duplicates, outliers |
| Derive | G | Create new columns from expressions |
| Ask | A | Natural-language transforms (LLM-powered) |
| Export | E | Download transformed dataset |
Natural Language Transforms
The Ask tool lets you describe transforms in plain English. For example:
- "Drop rows where age is negative"
- "Normalize the price column"
- "Create a BMI column from height and weight"
- "One-hot encode the category column"
MathExec generates pandas code under the hood, shows you a before/after preview, and lets you apply or discard the transform.
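To give a sense of what the generated code might look like, here is a sketch of pandas equivalents for the four example prompts. The exact code MathExec produces may differ. The column names (`height_m`, `weight_kg`) and the reading of "normalize" as min-max scaling are assumptions made for this illustration.

```python
import pandas as pd

# Sample data, made up for illustration
df = pd.DataFrame({
    "age": [34, -1, 52],
    "price": [10.0, 20.0, 30.0],
    "height_m": [1.70, 1.80, 1.60],
    "weight_kg": [65.0, 80.0, 55.0],
    "category": ["a", "b", "b"],
})

# "Drop rows where age is negative"
# Negating the condition keeps rows whose age is missing.
out = df[~(df["age"] < 0)].copy()

# "Normalize the price column"
# Assumption: min-max scaling to [0, 1]; z-score scaling is another common reading.
# A constant column would divide by zero here and needs separate handling.
price = out["price"]
out["price"] = (price - price.min()) / (price.max() - price.min())

# "Create a BMI column from height and weight"
# Assumes height in metres and weight in kilograms.
out["bmi"] = out["weight_kg"] / out["height_m"] ** 2

# "One-hot encode the category column"
out = pd.get_dummies(out, columns=["category"], prefix="category")

print(out)
```

Working on a copy (`out`) rather than mutating `df` mirrors the before/after preview: the original frame stays available for comparison until the transform is applied.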
💡 Tip
Each transform is added to a pipeline that you can review, reorder, or remove. The pipeline persists per dataset within your project.
Transform Pipeline
Every applied transform is recorded as a step in the pipeline. The pipeline panel shows:
- Each step with a description and the generated code
- A Remove button to undo individual steps
The pipeline is saved to localStorage, keyed by project and dataset.
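One way to picture the pipeline is as an ordered list of steps that is replayed against the raw dataset. The sketch below models that idea in Python. It is an illustration, not MathExec's internals: the `Step` class, `run_pipeline`, the `storage_key` format, and the JSON layout are all hypothetical, and removing a step is assumed to mean replaying the remaining steps from the original data.

```python
import json
from dataclasses import dataclass
from typing import Callable

import pandas as pd


@dataclass
class Step:
    description: str  # human-readable summary shown in the panel
    code: str         # generated pandas source shown in the panel
    apply: Callable[[pd.DataFrame], pd.DataFrame]


def run_pipeline(df: pd.DataFrame, steps: list[Step]) -> pd.DataFrame:
    """Replay every step, in order, starting from the raw data."""
    for step in steps:
        df = step.apply(df)
    return df


def storage_key(project_id: str, dataset_id: str) -> str:
    # Hypothetical key format; the real localStorage key is not documented here.
    return f"pipeline:{project_id}:{dataset_id}"


raw = pd.DataFrame({"age": [34, -1, 52, 34]})
steps = [
    Step("Drop rows where age is negative", 'df[~(df["age"] < 0)]',
         lambda df: df[~(df["age"] < 0)]),
    Step("Drop duplicate rows", "df.drop_duplicates()",
         lambda df: df.drop_duplicates()),
]

full = run_pipeline(raw, steps)

# Removing the first step: replay only the remaining steps from the raw data.
without_first = run_pipeline(raw, steps[1:])

# Only the description and code are serializable, so only they are persisted.
saved = json.dumps([{"description": s.description, "code": s.code} for s in steps])
print(storage_key("my-project", "my-dataset"), saved)
```

Replaying from the raw data keeps removal simple: no step needs an inverse operation, at the cost of recomputing the remaining steps.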
Suggestions
MathExec can suggest transforms based on your data. Click Suggest to get LLM-powered recommendations like:
- Handling missing values in specific columns
- Encoding categorical variables
- Scaling numeric features
- Removing highly correlated columns
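As an example of the last suggestion, highly correlated columns can be dropped with a common pandas idiom. This is a sketch under stated assumptions, not the code MathExec generates: the `drop_correlated` helper, the 0.95 threshold, and the sample columns are all chosen for illustration.

```python
import numpy as np
import pandas as pd


def drop_correlated(df: pd.DataFrame, threshold: float = 0.95) -> pd.DataFrame:
    """Drop the later column of each numeric pair whose |correlation| exceeds threshold."""
    corr = df.select_dtypes("number").corr().abs()
    # Keep only the strict upper triangle so each pair is considered once
    # and a column is never compared with itself.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    upper = corr.where(mask)
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop)


# Sample data, made up for illustration
df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0],
    "x_times_2": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with x
    "noise": [3.0, 1.0, 4.0, 1.0],
})
reduced = drop_correlated(df)
print(list(reduced.columns))
```

Of each correlated pair, the column that appears first in the dataset is kept, so column order affects which one survives.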