[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"blog-post-5-ml-models-one-line-of-math":3},{"id":4,"title":5,"slug":6,"excerpt":7,"category":8,"tags":9,"author_name":13,"cover_image":14,"status":15,"view_count":16,"reading_time_minutes":17,"published_at":18,"updated_at":18,"created_at":19,"content":20,"meta_description":21,"og_image":14,"canonical_url":22,"author_uid":22,"previous_slugs":23,"images":24},"69ac3871dfa72009eb0767c0","5 Machine Learning Models You Can Train with One Line of Math","5-ml-models-one-line-of-math","From linear regression to neural networks: 5 models you can train in MathExec just by typing a formula.","tutorial",[8,10,11,12],"beginner","machine-learning","formulas","Kingsley Michael","https:\u002F\u002Fmathexec.com\u002Fblog\u002Fimages\u002F69ac3871dfa72009eb0767c0\u002F9e9d6234-0484-4cb3-866a-7f3c5c2d797b.png","published",45,8,"2026-02-28T14:38:41.373000","2026-02-28T12:38:41.373000","# 5 Machine Learning Models You Can Train with One Line of Math\n\nOne of the things that makes MathExec unique is that you don't write code, you write math. Here are 5 models you can train just by typing a formula, along with when to use each one and what to expect.\n\n## 1. Linear Regression\n\n**Formula:** `y = mx + b`\n\nThe simplest model. Fits a straight line through your data by learning a slope `m` and intercept `b` that minimize the squared error between predictions and actual values.\n\n**When to use it:** Predicting continuous values with a roughly linear relationship. Housing prices based on square footage, sales forecasts from ad spend, temperature trends over time.\n\n**What to expect:** Linear regression trains in seconds, even on large datasets. If your data has a genuinely linear relationship, you'll see low MSE and a loss curve that drops quickly and flattens out. If the relationship isn't linear, the model will converge to the best straight-line approximation, which might not be great. 
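To see that failure mode concretely, here is a rough numpy sketch of what a converged `y = mx + b` fit looks like on deliberately curved data. It uses the closed-form least-squares solution; MathExec itself trains by gradient descent through its PyTorch backend, but both land on essentially the same line:

```python
import numpy as np

# Deliberately curved (quadratic) data that a straight line cannot capture
x = np.linspace(0, 10, 50)
y = 0.5 * x**2 + 3

# Closed-form least-squares fit of y = mx + b
A = np.column_stack([x, np.ones_like(x)])
(m, b), *_ = np.linalg.lstsq(A, y, rcond=None)

mse = np.mean((m * x + b - y) ** 2)
print(m, b, mse)
```

The slope and intercept are the best any straight line can do, yet the mean squared error stays stubbornly large because the underlying data is a parabola.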
That's your cue to try polynomial regression or a neural network.\n\n**Tips:** Linear regression is a great first model to try on any new dataset. Even if it doesn't perform well, it gives you a baseline to compare against more complex models. If linear regression gets 80% of the way there, a more complex model might only buy you a few extra percentage points.\n\nFor multivariate input (multiple columns), use `y = Wx + b` instead. The capital `W` tells MathExec to create a weight matrix that handles all input features at once.\n\n## 2. Polynomial Regression\n\n**Formula:** `y = ax² + bx + c`\n\nFits a curve instead of a line. The model learns coefficients `a`, `b`, and `c` that define a parabola. You can go higher-order too: `y = β₀ + β₁x + β₂x² + β₃x³` fits a cubic curve.\n\n**When to use it:** When the relationship between variables is curved. Growth rates that plateau, diminishing returns on investment, seasonal patterns, physical phenomena like projectile motion.\n\n**What to expect:** Polynomial regression is still fast to train, and the loss curve should drop more steeply than linear regression if there's genuine curvature in the data. Watch out for overfitting with high-degree polynomials: a degree-10 polynomial can fit your training data perfectly while being terrible on new data.\n\n**Tips:** Start with degree 2 (quadratic) and only increase if the fit is clearly insufficient. In most practical cases, degree 2-3 is enough. If you need degree 5+ to fit your data, a neural network will probably generalize better.\n\nAlso, polynomial regression on a single feature doesn't extend easily to multiple features. For multivariate curved relationships, a neural network with ReLU activations is usually a better choice.\n\n## 3. Logistic Regression (Binary Classification)\n\n**Formula:** `y = σ(Wx + b)`\n\nThe sigmoid function `σ` squashes the output to the range [0, 1], making this a binary classifier. 
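Mechanically, sigmoid is `σ(z) = 1 / (1 + e^(-z))`. A minimal numpy sketch of the forward pass, with made-up numbers standing in for learned parameters:

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Made-up parameters standing in for learned weights
W = np.array([1.5, -2.0])
b = 0.5

x = np.array([2.0, 0.5])   # one example with two features
p = sigmoid(W @ x + b)     # probability of the positive class
print(p)                   # about 0.92
```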
Values closer to 1 mean \"yes\" (positive class), values closer to 0 mean \"no\" (negative class). The decision boundary is at 0.5 by default.\n\n**When to use it:** Yes\u002Fno predictions. Spam detection, customer churn prediction, medical diagnosis (disease present or not), fraud detection, pass\u002Ffail classification.\n\n**What to expect:** MathExec automatically selects binary cross-entropy as the loss function when it sees a sigmoid output. You'll see accuracy and loss metrics during training. For balanced datasets, you should see accuracy climb above 70-80% within the first few epochs if the features are informative.\n\n**Tips:** Logistic regression assumes a linear decision boundary. If your classes aren't linearly separable (imagine two concentric circles of data points), logistic regression will struggle. In that case, add a hidden layer to get a neural network that can learn non-linear boundaries.\n\nCheck your class balance before training. If 95% of your data is one class, the model can \"cheat\" by always predicting the majority class and still show 95% accuracy. MathExec shows per-class metrics to help catch this.\n\n## 4. Multi-class Classification\n\n**Formula:** `y = softmax(Wx + b)`\n\nSoftmax generalizes sigmoid to multiple classes. Instead of outputting a single probability, it outputs a probability distribution across all classes. The class with the highest probability is the prediction.\n\n**When to use it:** Categorizing into 3 or more classes. Digit recognition (0-9), sentiment analysis (positive\u002Fneutral\u002Fnegative), species classification, document categorization, product type classification.\n\n**What to expect:** MathExec uses cross-entropy loss for softmax outputs. Training time scales with the number of classes and features. 
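Mechanically, softmax exponentiates each class score and normalizes so the outputs sum to 1. A small numpy sketch with hypothetical logits:

```python
import numpy as np

def softmax(z):
    # Subtracting the max before exponentiating is a standard stability trick
    e = np.exp(z - np.max(z))
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])   # hypothetical raw scores for 3 classes
probs = softmax(logits)
print(probs, probs.sum())            # probabilities summing to 1
```

Class 0 leads by only one point of raw score, yet it ends up with roughly two thirds of the probability mass; the predicted class is simply the argmax.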
For a dataset with 10 classes and a few hundred examples per class, expect training to take 5-15 seconds.\n\n**Tips:** Softmax works well when your classes are mutually exclusive (each example belongs to exactly one class). If examples can belong to multiple classes simultaneously (like tagging a photo with multiple labels), use sigmoid instead of softmax: `y = σ(Wx + b)` with a multi-label binary cross-entropy loss.\n\nThe linear decision boundaries of `softmax(Wx + b)` might not be enough for complex classification tasks. If accuracy plateaus below your target, upgrade to a neural network version: `y = softmax(W₂ · ReLU(W₁x + b₁) + b₂)`.\n\n## 5. Two-Layer Neural Network\n\n**Formula:** `y = σ(W₂ · ReLU(W₁x + b₁) + b₂)`\n\nA proper neural network with a hidden layer. The first layer (`W₁x + b₁`) projects your input into a learned feature space. ReLU activation introduces non-linearity, allowing the network to learn curved decision boundaries and complex patterns. The second layer (`W₂ · ... + b₂`) maps from the hidden space to the output.\n\n**When to use it:** When simpler models aren't capturing the pattern. Most classification and regression tasks benefit from at least one hidden layer, especially when the relationship between features and target is non-linear.\n\n**What to expect:** Neural networks take longer to train than linear models, but for tabular data with a few thousand rows, you're still looking at 10-30 seconds. The loss curve might show more noise (bouncing up and down) compared to linear models. This is normal. What matters is the overall downward trend.\n\n**Tips:** The default hidden layer size in MathExec is 64 units. This is a good starting point for most tabular datasets. If your dataset has very few features (2-5), you might get better results with a smaller hidden layer (16-32). 
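Whatever hidden size you choose, the formula is just two matrix multiplies with a ReLU sandwiched between them. A numpy sketch of the forward pass with random, untrained weights (MathExec builds and trains the real network via PyTorch; this is only to show the shapes):

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_features, n_hidden = 4, 16            # small hidden layer for few features
W1 = rng.normal(size=(n_hidden, n_features))
b1 = np.zeros(n_hidden)
W2 = rng.normal(size=(1, n_hidden))
b2 = np.zeros(1)

x = rng.normal(size=n_features)         # one random example
h = relu(W1 @ x + b1)                   # hidden layer: learned feature space
y = sigmoid(W2 @ h + b2)                # output squashed into (0, 1)
print(h.shape, y)
```

The hidden vector `h` has `n_hidden` entries regardless of the input width, which is why the hidden size is a knob you can tune independently of your data.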
If you have many features (50+), try increasing to 128 or 256.\n\nYou can add more layers for additional capacity: `y = σ(W₃ · ReLU(W₂ · ReLU(W₁x + b₁) + b₂) + b₃)`. But more layers doesn't always mean better results. For tabular data, 2-3 layers is usually sufficient. Going deeper tends to help more with image and text data.\n\nIf you're doing regression instead of classification, drop the outer sigmoid: `y = W₂ · ReLU(W₁x + b₁) + b₂`. This gives you an unbounded output suitable for predicting continuous values.\n\n## How to train any of these\n\n1. Go to [MathExec](https:\u002F\u002Fmathexec.com\u002Fapp)\n2. Type (or draw) the formula\n3. Upload a CSV with your data\n4. Click **Train**\n\nMathExec handles the PyTorch compilation, training loop, loss function selection, and optimization automatically. You can adjust hyperparameters (learning rate, epochs, batch size) in the training panel, but the defaults work well for getting started.\n\n## Choosing the right model\n\nNot sure which formula to start with? Here's a quick guide:\n\n- **Start with linear regression** (`y = mx + b`) to get a baseline\n- **If the data is curved**, try polynomial (`y = ax² + bx + c`)\n- **If it's a yes\u002Fno problem**, use logistic regression (`y = σ(Wx + b)`)\n- **If there are multiple categories**, use softmax (`y = softmax(Wx + b)`)\n- **If nothing else works well enough**, add a hidden layer\n\nThe beauty of MathExec is that switching between these takes about 10 seconds. Delete the formula, type a new one, hit Train. You can iterate through all five models on the same dataset in under five minutes.\n\n## Common mistakes to avoid\n\nA few things to watch out for when choosing and training these models:\n\n**Don't skip normalization.** If your input features are on very different scales (e.g., age ranges 0-100 while income ranges 0-500,000), models with weight matrices will struggle to converge. 
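Standardization fixes this by rescaling each column to zero mean and unit variance. A minimal numpy sketch of the idea (this is the general recipe, not MathExec internals):

```python
import numpy as np

# Two features on wildly different scales: age and income
X = np.array([[25.0,  40_000.0],
              [40.0, 120_000.0],
              [60.0, 300_000.0]])

# Z-score standardization: zero mean, unit variance per column
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
print(X_std.mean(axis=0), X_std.std(axis=0))
```

After rescaling, age and income sit on comparable scales, so no single feature dominates the gradient updates.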
Use Data Studio to normalize your features first, or let MathExec auto-scale.\n\n**Don't over-interpret training accuracy.** A model that gets 99% accuracy on training data might be memorizing, not learning. Look at the validation metrics, which MathExec computes automatically on a held-out portion of your data.\n\n**Don't jump to complex models too quickly.** If linear regression gives you an R² of 0.92, adding two hidden layers probably won't help much and might make your model less interpretable. Start simple, and only add complexity when you can see the simpler model failing.\n\n## Beyond the basics\n\nYou can combine these building blocks into more complex architectures. MathExec's canvas lets you chain formula blocks together visually, creating custom pipelines. And every training run is saved as an experiment, so you can compare results across different formulas to find what works best for your data.\n\n---\n\n*Try these formulas in our [Gallery](https:\u002F\u002Fmathexec.com\u002Fgallery). Each one can be loaded into your workspace with one click.*\n","Learn how to train 5 different ML models by writing a single math formula in MathExec. No code required.",null,[],[25],"blog\u002F69ac3871dfa72009eb0767c0\u002Fb738ac67-dc9d-42dd-b197-94ef00997a86.png"]