Using Sourcetable’s AI (Not Just the Data): A Real Walkthrough from “Import” to Insights (Affiliate Links Inside)
Exactly what the AI did when I dropped in a Kaggle cars CSV—how it read the sheet, summarized it, ran three analyses on command, and surfaced enterprise-grade integrations without leaving the grid.
After writing my previous post, I discovered a product for working with CSVs that I can vouch for: easy to use, genuinely comprehensive, and available today. I wasn’t trying to build a perfect used-car pricing model. I wanted to see what Sourcetable’s AI actually does when you feed it a real dataset and talk to it like a teammate. That’s also why I decided to become their affiliate partner.
So I grabbed the USA Cars dataset from Kaggle (2,499 rows) and opened it in a brand-new Sourcetable workbook. What follows is a blow-by-blow of the AI assistant’s behavior, the UI flows I used (Import ➝ AI Chat ➝ quick checks ➝ prompts ➝ results), and a peek at the integrations dialog to show how this can scale beyond CSVs.
You’ll see exactly what the assistant said and produced, and the features/benefits that matter for a spreadsheet-first analyst.
The on-ramp: Import + AI Chat in one screen
I clicked Import data. The assistant immediately clarified: “I can read CSV and Excel files. I can also integrate your data sources.”
Two useful things already:
Familiar file flow (CSV/XLSX) — no ceremony.
A hint about scale — “integrate your data sources” (we’ll come back to the integrations dialog later).
Then I typed: “understand this dataset.”
What the AI did next (zero setup)
Detected the active table: “I can see you have 2,499 rows of vehicle/car data.”
Pulled sample rows into the chat so I could eyeball columns (price, brand, model, year, title_status, mileage, color, vin, lot, state, country, condition).
Ran a compact EDA and posted a structured summary (“Dataset Overview,” “Key Columns,” “Key Insights,” “Data Quality Notes”).
That’s key: no formula writing, no manual profiling, no leaving the sheet. The assistant created the exact “what am I looking at?” brief every analyst needs in the first five minutes.
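If you want to replicate that first-minute brief outside the grid: I can’t see the assistant’s internal profiling code, but a minimal pandas sketch covers the same ground. The filename and the column list are assumptions based on the Kaggle download and the sample rows above.

```python
import pandas as pd

# Load the Kaggle "USA Cars" CSV (assumed filename; use whatever you downloaded).
df = pd.read_csv("USA_cars_datasets.csv")

print(df.shape)      # expect 2,499 rows and roughly the 12 columns listed above
print(df.sample(5))  # eyeball a few live records, like the assistant's in-chat sample

# Quick profile: types, missing values, unique counts.
print(df.dtypes)
print(df.isna().sum())
print(df.nunique())

# Ranges for the numeric columns (min/max/median mirror the AI's summary).
print(df[["price", "year", "mileage"]].describe())
```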
Features & benefits already in play (even if it felt basic)
AI-readable grid: The assistant can see the active table and reason about its shape.
In-chat sample rows: Trust comes from glancing at live records—and you get that instantly.
One-click profile: Range checks, unique counts, min/max, and domain-specific guesses (the “condition” column as an auction countdown)—this is the “why” in your first minute.
“What else can you do with this data?” — the menu moment
I asked that exact question. The assistant answered with a menu of directions—analytics, ML, dashboards, stats, BI, feature engineering, and even external-data joins.
This matters because it shows how the AI scopes work inside the sheet:
Analytics scopes: depreciation, regional comparisons, brand premium, sweet-spot detection.
ML scopes: price prediction, title-status classification, anomaly detection, segmentation.
Viz scopes: scatter, geo heatmaps, brand comparisons, correlation matrices; dashboards (exec, brand, regional).
BI scopes: undervalued lists, inventory/pricing strategy, regional risk.
Data engineering scopes: age/mileage tiers, regions, composite “value” scores.
Integrations/tech: GA4, GSC, PostgreSQL, Robinhood research, sports data, scraping, TabPFN for tabular ML, ECharts/Plotly for charts, secure credentials.
The takeaway isn’t “all of this is magic.” It’s that the assistant speaks in analyst tasks, and most of those tasks are achievable without leaving the grid.
The three things I asked it to do (and how the AI behaved)
I then ran three concrete asks. Here’s the play-by-play, with capabilities emphasized:
1) Sweet Spot Analysis
Prompt: “Find the best value propositions (low price, low mileage, recent year).”
What happened:
The AI inferred sane thresholds from the dataset itself (e.g., recent year = 2016 or newer, low mileage = below the ~35,365-mile median, reasonable price = below ~$25,556).
It filtered the grid to count and list qualifying rows: 747 vehicles (29.9%). A pandas sketch of the same rule follows this subsection.
It surfaced a brand-level perspective (Hyundai, Jeep, Infiniti, Nissan as strong value families; Ford with the most qualifying options).
Why the feature matters:
Promptable filtering + summarization means you can codify a rule in plain English, and the assistant turns it into reproducible criteria (not vibes).
The AI used the sheet’s distribution (median, upper bounds) rather than arbitrary cutoffs—credible defaults.
Sanity guardrail shown in chat:
It flagged extreme “deals” (e.g., $25 for a 2020 unit) and cross-referenced title_status. This is an important analyst behavior baked into the assistant: pair a filter with a likely risk attribute.
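I don’t have visibility into the exact query the assistant built, but here’s a minimal pandas sketch of the rule as it described it—median-based mileage cutoff, 2016+ year, its ~$25,556 price bound (derivation unknown, so hard-coded here), plus the cheap-listing guardrail. It continues from the df loaded in the earlier sketch:

```python
# Thresholds inferred from the data's own distribution, per the assistant's report.
recent_year = 2016
low_mileage = df["mileage"].median()  # ~35,365 miles in this dataset

# Assumption: I don't know how the AI derived its price cutoff, so it's hard-coded.
fair_price = 25_556

sweet = df[
    (df["year"] >= recent_year)
    & (df["mileage"] < low_mileage)
    & (df["price"] < fair_price)
]
print(len(sweet), f"({len(sweet) / len(df):.1%} of listings)")  # assistant found 747 (29.9%)
print(sweet["brand"].value_counts().head())                     # Ford led on raw count

# The guardrail it showed in chat: suspiciously cheap "deals" vs. title status.
print(df.loc[df["price"] < 500, ["price", "year", "brand", "title_status"]].head(10))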
2) Which brand has the best mileage?
Prompt: “Tell me which brand has the best mileage.”
What happened:
The AI grouped by brand and computed average mileage (a pandas equivalent is sketched at the end of this subsection).
It separated edge cases (e.g., Heartland—likely an RV/trailer fleet with 1-mile averages) from mainstream brands with 10+ vehicles.
It returned a ranked list of traditional brands with the lowest mileage averages (Infiniti, Buick, Jeep, Cadillac, Nissan).
Why the feature matters:
Group-by + summary from a prompt is the core of spreadsheet-first analytics.
The assistant contextualized outliers (commercial/heavy vs consumer), nudging you toward segmenting before deciding—again, analyst-like behavior.
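A minimal pandas equivalent of that group-by, assuming the same 10-vehicle floor the assistant used to keep edge cases out of the ranking:

```python
# Average mileage by brand, keeping only brands with enough listings to rank fairly.
by_brand = (
    df.groupby("brand")["mileage"]
      .agg(avg_mileage="mean", n="count")
      .query("n >= 10")            # 10+ floor drops edge cases like Heartland's trailer fleet
      .sort_values("avg_mileage")  # lowest average mileage first
)
print(by_brand.head())  # my session's leaders: Infiniti, Buick, Jeep, Cadillac, Nissan
```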
3) Title Status Prediction (clean vs salvage)
Prompt: “Predict clean vs salvage based on other features, choose best features only.”
What happened under the hood (and what it reported):
It encoded title_status (0/1), preprocessed for TabPFN-style tabular ML, and trained on all 2,499 complete cases.
It returned metrics by class (not just accuracy):
Accuracy 97.8%; precision/recall for salvage ~84%/82%; for clean ~99%/99%.
It produced feature importance (Price ≫ Mileage ≫ Year ≫ Brand).
It distilled human-readable rules (e.g., price < $10k, year < 2010, mileage > 100k → higher salvage probability).
Why the feature matters:
In-grid ML that returns class metrics and plain-English rules shortens the distance from model to decision.
Importance aligns with intuition (price, mileage, year), so the model is explainable to stakeholders in one screenshot.
Note: With only ~6.5% salvage, class balance matters. The assistant reporting per-class precision/recall is exactly what you want to see.
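The assistant ran TabPFN under the hood; I can’t reproduce its exact pipeline, so here’s a stand-in sketch using scikit-learn’s RandomForestClassifier that yields the same artifacts it reported—per-class precision/recall and feature importances. The title_status label strings are assumptions from the Kaggle data, and df is the frame loaded earlier:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Target: 1 = salvage, 0 = clean. Assumption: the Kaggle column holds strings
# like "clean vehicle" / "salvage insurance"; adjust to your actual labels.
y = (df["title_status"] != "clean vehicle").astype(int)

# "Best features only," per the importance ranking the assistant reported.
X = df[["price", "mileage", "year"]].copy()
X["brand"] = df["brand"].astype("category").cat.codes  # crude ordinal encoding for the sketch

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42  # stratify: only ~6.5% salvage
)

clf = RandomForestClassifier(n_estimators=300, random_state=42).fit(X_train, y_train)

# Per-class precision/recall -- the numbers that matter under class imbalance.
print(classification_report(y_test, clf.predict(X_test), target_names=["clean", "salvage"]))

# Feature importance (the assistant reported Price >> Mileage >> Year >> Brand).
for name, imp in sorted(zip(X.columns, clf.feature_importances_), key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")
```

If you have the tabpfn package, swapping in its classifier for the forest is a small change; the classification_report call works the same, though TabPFN has no feature_importances_, so you’d use a permutation-based method (e.g., sklearn.inspection.permutation_importance) instead.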
The integrations view I opened (and why it changes the game)
I also opened Integrations. The dialog showed both popular and all integrations—clean, searchable, and connectable from inside the workbook.
What I saw at the top (Popular Integrations):
Salesforce • QuickBooks • Postgres • Stripe • Mailchimp
Scrolling through “All Integrations,” examples included:
AdRoll, Airtable, Amazon Ads, Amplitude … (and more in the list)
Why this matters for your workflow:
You start with a CSV—but the same AI + grid loop applies to first-class data sources.
Instead of downloading exports every week, you can point the workbook at systems (CRM, finance, databases, ads, email) and keep running the same prompts (“understand this dataset,” “sweet spot,” “who’s at risk,” “make an exec summary”) on live tables.
It means the AI assistant isn’t a toy; it’s a front door to your stack.
(Earlier in the chat, the assistant also rattled off capabilities like GA4, GSC, Robinhood research, sports data, scraping, and TabPFN modeling—reinforcing that Sourcetable AI is a hub, not a one-off script.)
The real value on display
Across this single session, Sourcetable AI demonstrated:
Context awareness — it read the active table, not just a pasted snippet.
First-minute EDA — sample rows, schema, ranges, uniques, “what this is probably about.”
Promptable summaries — “what else can you do?” returned an analysis menu you could immediately act on.
Analyst-grade defaults — it used medians/percentiles for thresholds and flagged suspicious “deals” with title checks.
One-shot ML — classifier with per-class metrics, feature importance, and human rules.
In-place visual attempts — chart generation within the grid (chart objects don’t serialize into a text transcript, but the action ran).
Scale path — the Integrations modal that lets you connect Salesforce/Stripe/Postgres/etc., so the same prompts work on live tables.
That’s the story: fewer clicks from question → answer, and a believable upgrade path from CSVs to your company’s real systems.
Replicate my exact path (copy/paste prompts)
Open your workbook and use these, in order:
“understand this dataset”
“what else can you do with this data?”
“Find best value: recent year, low mileage, reasonable price—return counts and examples.”
“Which brand has the best mileage? Separate out edge cases (non-consumer/heavy/commercial).”
“Predict title_status (clean vs salvage) using only the strongest features. Report per-class precision/recall and the top 3 importance features. Give 3 human-readable rules.”
(Optional) “Open integrations and show CRM/finance/database connectors.”
Practical cautions (the assistant already hinted at them)
Data sanity before glory: Remove zero/near-zero prices and million-mile odometer outliers if they’re artifacts (a two-line pandas version appears after this list).
Segment before ranking: Separate commercial categories (e.g., Peterbilt) from consumer brands when you care about mileage “leaders.”
Class imbalance matters: Always ask for per-class metrics (precision/recall/F1), not just accuracy.
Flag deals with risk context: Pair “sweet spot” filters with title_status and, if available, VIN/lot metadata.
Charts for humans: If a chart fails to render in export, have the assistant create a tiny summary table and chart that—easier to paste into Slack/Slides.
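Most of those cautions compress into a few lines of pandas—a sketch, with the cutoff values and the heavy-brand list as assumptions you should tune:

```python
# Data sanity before glory: drop artifact prices and impossible odometers.
clean_df = df[(df["price"] > 500) & (df["mileage"].between(1, 500_000))]  # cutoffs are assumptions

# Segment before ranking: keep heavy/commercial marques out of consumer comparisons.
heavy = {"peterbilt", "heartland"}  # assumption: extend as you spot more non-consumer brands
consumer = clean_df[~clean_df["brand"].str.lower().isin(heavy)]
```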
If you connect real systems next
With Salesforce/QuickBooks/Stripe/Postgres/Mailchimp (the ones listed under “Popular”), your prompts become operational:
Salesforce: “Under-touched opps by segment and rep; show conversion odds.”
QuickBooks: “AR aging by customer cohort; spot at-risk invoices.”
Stripe: “Refund/cancel spikes by SKU; alerts when daily rate breaches baseline.” (A toy version of this baseline check is sketched below.)
Postgres: “Run a SQL query, materialize a table, and have the AI brief me.”
Mailchimp: “Campaigns with outlier CTR by segment; write 3 subject lines for the laggard segment.”
Same grid. Same AI. Less swivel-chair.
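To make the Stripe bullet concrete: here’s a toy, entirely hypothetical sketch of “alert when the daily refund rate breaches baseline,” assuming refund events land in a table with date, sku, and refunded columns. Nothing here is Sourcetable’s API—just the shape of the check:

```python
import pandas as pd

# Hypothetical shape: one row per charge with date, sku, and a refunded flag.
charges = pd.DataFrame({
    "date": pd.date_range("2024-01-01", periods=60).repeat(2),
    "sku": ["basic", "pro"] * 60,
    "refunded": [0] * 110 + [1] * 10,  # an end-of-window refund spike
})

# Daily refund rate per SKU, vs. a trailing 28-day baseline.
daily = charges.groupby(["sku", "date"])["refunded"].mean().rename("rate").reset_index()
daily["baseline"] = daily.groupby("sku")["rate"].transform(
    lambda s: s.rolling(28, min_periods=7).mean()
)
print(daily[daily["rate"] > 2 * daily["baseline"]])  # days breaching 2x baseline
```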
Affiliate Disclosure:
I am an affiliate partner of Sourcetable AI. If you like their product, you can sign up for free today using my link here.
TL;DR
I imported a Kaggle cars CSV into Sourcetable.
I asked the AI to understand the dataset; it returned sample rows, schema, ranges, and a credible “auction” interpretation.
I asked what else it could do; it proposed a menu of analyst tasks (EDA, ML, dashboards, BI moves).
I ran three concrete prompts; it delivered filtered value picks, a brand mileage leaderboard (with edge-case caveats), and a title-status classifier with feature importance and per-class metrics.
I opened Integrations and saw Salesforce, QuickBooks, Postgres, Stripe, Mailchimp at the top, plus a catalog (AdRoll, Airtable, Amazon Ads, Amplitude, …)—proof this scales past CSVs.
The through-line: Sourcetable’s AI keeps the whole loop in one place—import, inspect, summarize, chart, model, integrate.

