Monday morning. Your ops lead opens Shopify, exports this week's orders as a CSV, opens QuickBooks, exports the payment data, then spends the next hour reformatting both files in Excel so the columns line up, copies everything into the master reporting sheet, and fixes the dates because one system uses MM/DD/YYYY and the other doesn't.
That's ETL. Manual ETL.
Extract, Transform, Load is the formal name for what your team has been doing by hand for months, maybe years. You don't need to be a data engineer to be running ETL. You just need a business with more than one tool and a person willing to glue them together on a Monday morning.
This article explains what ETL actually is in plain business language, names the three ways it breaks when your team grows, and gives you a practical decision framework for choosing what to do next.
Key takeaways
- You're already doing ETL: Extract, Transform, Load is the formal name for exporting CSVs, cleaning them in Excel, and pasting them into a reporting sheet. You don't need to be a data engineer to be running it by hand.
- It breaks in three predictable ways: Version conflicts (two people, two exports, two different numbers), rising time cost as reporting gets more frequent, and bus-factor risk when the process lives in one person's head.
- Three realistic paths forward: Keep doing it manually, build an automated pipeline with tools like Fivetran or Stitch, or use a conversational analytics platform that skips the warehouse entirely.
- Decide with three questions: How often your data needs updating, how much technical capacity your team has, and whether you need to move data between systems or simply ask questions from it.
- You can skip ETL for the analytics use case: If your goal is answering business questions rather than feeding data into other systems, conversational analytics connects directly to your sources and answers in plain English, no pipeline required.
What ETL actually is
ETL - Extract, Transform, Load - is the three-step process of pulling data from a source system, reformatting it so it can be used, and moving it somewhere useful. For a small business, each of those three steps already happens manually.
Extract is downloading the CSV from Shopify or QuickBooks. Transform is opening that file in Excel and cleaning the columns, correcting date formats, or removing duplicate rows. Load is pasting the cleaned data into your reporting sheet, your Google Sheets dashboard, or wherever the rest of the business looks for numbers. Understanding the term doesn't change what you're already doing - it just names it and opens the question of whether you should still be doing it by hand.
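For readers who want to see the three steps as code, here is a minimal Python sketch of the same Monday routine. Everything in it is an assumption made for illustration: the sample rows, the column names (order_id, created, total, paid_on, amount), and the choice of which export uses MM/DD/YYYY. Real Shopify and QuickBooks exports have different and much longer column sets.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample exports; real files have different columns.
SHOPIFY_CSV = """order_id,created,total
1001,03/14/2024,120.00
1002,03/15/2024,80.50
1002,03/15/2024,80.50
"""

QUICKBOOKS_CSV = """order_id,paid_on,amount
1001,2024-03-14,120.00
1002,2024-03-16,80.50
"""


def extract(csv_text):
    """Extract: read an exported CSV into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows, date_field, date_format):
    """Transform: rewrite dates as YYYY-MM-DD and drop exact duplicate rows."""
    seen = set()
    cleaned = []
    for row in rows:
        row = dict(row)
        parsed = datetime.strptime(row[date_field], date_format)
        row[date_field] = parsed.strftime("%Y-%m-%d")
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            cleaned.append(row)
    return cleaned


def load(orders, payments):
    """Load: combine both sources into one reporting table keyed by order_id."""
    paid = {p["order_id"]: p for p in payments}
    report = []
    for order in orders:
        payment = paid.get(order["order_id"], {})
        report.append({
            "order_id": order["order_id"],
            "order_date": order["created"],
            "order_total": order["total"],
            "paid_on": payment.get("paid_on", ""),
            "paid_amount": payment.get("amount", ""),
        })
    return report


orders = transform(extract(SHOPIFY_CSV), "created", "%m/%d/%Y")
payments = transform(extract(QUICKBOOKS_CSV), "paid_on", "%Y-%m-%d")
report = load(orders, payments)
```

The point of the sketch is not that you should write this yourself. It shows that each manual step maps to a small, repeatable rule, which is why the routine can be automated once it stops being worth doing by hand.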
Why you're already doing ETL manually
Manual ETL is the version most small businesses run without realizing it has a formal name. When your ops lead downloads a CSV from Shopify, cleans the columns in Excel, and pastes it into the master reporting sheet, that is Extract, Transform, Load - performed by a person instead of a system. The process works at low volume: no infrastructure required, no setup, anyone organized enough can run it.
The problem is that manual ETL doesn't stay manageable as the business grows. What works at 50 orders a month starts to strain at 500. What works with two data sources starts to break at five. And the person who knows all the steps becomes a single point of failure for the whole reporting process. Gartner projects that by 2026, developers outside formal IT departments will account for at least 80% of the user base for low-code development tools - which suggests that the people running these manual data processes are often the non-technical operators who have the least time to spare on them.
When manual ETL breaks
Manual ETL fails in three predictable ways as a business grows. Version conflicts, time cost, and the bus factor are each manageable in isolation - but they tend to compound, and any one of them is a signal to reassess the approach.
Version conflicts appear when two team members export the same data on different days and reach different numbers. Your ops lead ran the Shopify report on Friday. The founder ran it on Monday. The numbers don't match. Now there's a conversation about which version is "right" instead of a conversation about what the data means - this is why data silos persist in growing teams.
Time cost grows as your reporting frequency increases. Running the CSV process once a month is a minor inconvenience. Running it weekly starts to eat hours. Running it daily means your ops lead spends a significant chunk of their week on data plumbing instead of analysis. Your ops lead might spend three or more hours each Monday on this ritual alone - hours spent moving numbers around rather than reading them.
The bus factor emerges when the person who runs the process leaves, gets sick, or moves to a different role. Nobody else knows the exact steps: which columns to rename, which rows to delete, which date format Shopify uses versus the one QuickBooks uses. The process lives in one person's head, and when that person isn't available, the reporting stops.
All three appearing together means the manual approach has already run out of road. If you're not sure which stage your team is in, our guide on whether spreadsheets are still working for your business covers the signals in more detail.
Three ways to handle your data integration problem
Small businesses typically have three realistic paths for handling their data integration challenge. Manual ETL - the CSV-and-spreadsheet routine - requires no infrastructure but costs time and breaks under growth. Automated ETL pipelines connect your sources automatically but require developer setup and ongoing maintenance. Conversational analytics platforms skip the warehouse and pipeline entirely - you connect sources and ask questions in plain English, with no transformation step required. Each path fits different team sizes, budgets, and technical tolerance.
How to decide which path is right for your team
Start with three questions to find your path: how often your data needs updating, how much technical capacity your team has, and whether you need to move data between systems or simply ask questions from it. Those three signals point to manual ETL, an automated pipeline, or a conversational analytics platform.
Choose manual ETL if your reporting is infrequent (monthly or less), your data volume is low, you have one reliable person to run the process, and the business isn't growing fast enough to strain it. There's nothing wrong with this option if the constraints fit.
Move to an automated ETL pipeline if you need data updated daily or in real time, you have a developer available to configure and maintain connectors, and you're prepared to manage the infrastructure that comes with it. A data pipeline is a process that automatically moves data between your source tools and a central warehouse. Tools in the no-code ETL space make this more accessible than it used to be, but "no-code" doesn't mean "no maintenance."
Choose conversational analytics if you want answers from your data without building or maintaining a pipeline, your team is non-technical, and your primary need is asking cross-source questions - what does revenue look like when Shopify and QuickBooks are combined? Why is there a gap between orders and payments? These platforms connect your sources and answer questions directly, without requiring you to build a warehouse first.
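The three questions can also be written down as a small decision function. The Python sketch below is one possible reading of the criteria above, not a definitive rule: the function name recommend_path, the "once a month or less" cutoff expressed as updates_per_month, the order in which the checks run, and the fallback to manual ETL when data movement is needed but no developer is available are all assumptions.

```python
MANUAL = "manual ETL"
PIPELINE = "automated ETL pipeline"
CONVERSATIONAL = "conversational analytics"


def recommend_path(updates_per_month, has_developer, needs_data_movement):
    """Suggest a path from the three questions (illustrative thresholds only)."""
    # Infrequent reporting with no need to sync systems: manual still fits.
    if updates_per_month <= 1 and not needs_data_movement:
        return MANUAL
    # Moving data between systems is the pipeline use case, but it needs
    # someone to configure and maintain connectors. Without that capacity,
    # this sketch assumes the team stays manual for now.
    if needs_data_movement:
        return PIPELINE if has_developer else MANUAL
    # Frequent reporting where the goal is asking questions, not moving data.
    return CONVERSATIONAL


monthly_small_team = recommend_path(1, False, False)
daily_with_developer = recommend_path(30, True, True)
weekly_questions_only = recommend_path(4, False, False)
```

Treat the output as a conversation starter. Data volume, growth rate, and budget also matter, and the function leaves them out to keep the three questions visible.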
What this means if you want answers, not infrastructure
Conversational analytics is a category of tools built specifically for teams who need data insights without a data team. Rather than asking you to set up a pipeline, load data into a warehouse, and then query it with SQL (a programming language used to retrieve data from databases), these platforms connect directly to your existing sources and let you ask questions in plain English.
AnalysisGPT is one example of this category. You connect your data sources - Shopify and QuickBooks are both supported - and ask questions the way you'd ask a colleague. "What were our top five products by revenue last month?" "How does this week's order value compare to the same week last year?" Behind the scenes, the LLM sees only column fields and table metadata and uses them to write the query. The query then runs through read-only access, and your data stays in your own systems.
For teams who've been running manual ETL for years, this means you can skip the pipeline entirely for the analytics use case. You connect what you already have and start asking questions.
Get started with AnalysisGPT - 30-day trial
Frequently asked questions
Do I need a data pipeline?
No - not if your primary goal is answering business questions rather than feeding data into other systems. Data pipelines make sense when you need to synchronize data between applications at scale or serve data to an engineering team. For most small businesses whose goal is asking "how are we performing across channels?", conversational analytics platforms can connect sources and answer questions directly, without a pipeline step.
What's the difference between ETL and ELT?
ETL and ELT are two approaches to moving data between systems, and the difference is when the transformation happens. ETL (Extract, Transform, Load) transforms data before loading it into a destination. ELT (Extract, Load, Transform) loads raw data first and transforms it inside the destination data warehouse. ELT depends on a destination, typically a data warehouse, that can run the transformations itself, and automated ETL usually loads into a warehouse as well. For most small businesses, the distinction rarely matters in practice.
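If the ordering difference is easier to see in code, the Python sketch below runs both approaches on the same two rows. It uses an in-memory SQLite database as a stand-in for a data warehouse, which is an assumption made so the example runs anywhere. The sample rows, table names, and the MM/DD/YYYY source format are also hypothetical.

```python
import sqlite3
from datetime import datetime

# Hypothetical raw export rows: (order_id, date in MM/DD/YYYY).
RAW_ORDERS = [("1001", "03/14/2024"), ("1002", "03/15/2024")]


def to_iso(us_date):
    """Convert MM/DD/YYYY to YYYY-MM-DD."""
    return datetime.strptime(us_date, "%m/%d/%Y").strftime("%Y-%m-%d")


def run_etl(conn):
    """ETL: transform in application code, then load the cleaned rows."""
    cleaned = [(order_id, to_iso(date)) for order_id, date in RAW_ORDERS]
    conn.execute("CREATE TABLE orders_etl (order_id TEXT, order_date TEXT)")
    conn.executemany("INSERT INTO orders_etl VALUES (?, ?)", cleaned)


def run_elt(conn):
    """ELT: load raw rows first, then transform with SQL in the destination."""
    conn.execute("CREATE TABLE orders_raw (order_id TEXT, order_date TEXT)")
    conn.executemany("INSERT INTO orders_raw VALUES (?, ?)", RAW_ORDERS)
    conn.execute("""
        CREATE TABLE orders_elt AS
        SELECT order_id,
               substr(order_date, 7, 4) || '-' ||
               substr(order_date, 1, 2) || '-' ||
               substr(order_date, 4, 2) AS order_date
        FROM orders_raw
    """)


conn = sqlite3.connect(":memory:")
run_etl(conn)
run_elt(conn)
etl_rows = conn.execute("SELECT * FROM orders_etl ORDER BY order_id").fetchall()
elt_rows = conn.execute("SELECT * FROM orders_elt ORDER BY order_id").fetchall()
```

Both paths end with the same cleaned table. The only difference is where the date fix happens: in your own code before loading, or in SQL after the raw data has landed.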
Can I skip ETL entirely?
Yes, if your goal is answering questions from your existing business data rather than moving data between systems at scale. Conversational analytics platforms connect directly to your sources - your Shopify store, your QuickBooks account, your spreadsheets - and answer questions in natural language without requiring you to build a pipeline or a warehouse. For the analytics use case, this approach skips ETL entirely.
How do I connect Shopify and QuickBooks?
You have three options. Manually: export CSVs from each and combine them in a spreadsheet, which is the most common approach and the slowest. Via an ETL tool: set up connectors that automatically sync both sources into a data warehouse, then query the warehouse. Via a conversational analytics platform: connect both sources directly and ask cross-platform questions in plain English, with no warehouse or transformation step required.