
Pulling Data from Microsoft Access to CSV: What I Built and Why
I’ve always loved working with plain old CSVs for my analysis and scripting needs. They are simple, portable, and just work everywhere.
So when I got handed a Microsoft Access database that I needed to break out into individual CSV files, I figured I’d whip up a little toolchain to handle it reliably on my Mac.
What I Wanted to Achieve
When I started this little project, I had a few specific requirements in mind.
First, I needed to be able to point the tool at an Access file—either the older .mdb or the newer .accdb format—and have it handle the extraction automatically.
Second, the output had to be clean. I wanted it to spit out one CSV per table, complete with headers.
Finally, this tool had to be quick and easy to script. I didn’t want a heavy UI; I wanted something I could drop into a pipeline.
The Two Approaches I Went With
Database drivers can be tricky, especially when dealing with proprietary formats on non-native operating systems. I built in support for two different backends and let the script pick the best one (or let me override it).
mdbtools
For the legacy .mdb files, I prefer mdbtools. It is a set of command-line tools that are super fast and easy to install on a Mac using Homebrew (brew install mdbtools).
The logic here is simple. I use mdb-tables to list everything in the database, and then mdb-export to dump each specific table to a CSV.
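That shell-out loop is small enough to sketch in Python. This is not the package's actual code, just a minimal illustration assuming mdb-tables and mdb-export are on the PATH; the -1 flag prints one table name per line, and -D sets the date format (more on dates below):

```python
import subprocess
from pathlib import Path

def list_tables(db_path: str) -> list[str]:
    """List table names via mdb-tables (-1 = one name per line)."""
    out = subprocess.run(
        ["mdb-tables", "-1", db_path],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line for line in out.splitlines() if line.strip()]

def export_table(db_path: str, table: str, out_dir: str) -> Path:
    """Dump one table to CSV via mdb-export (header row is on by default)."""
    dest = Path(out_dir) / f"{table}.csv"
    with open(dest, "w", newline="") as fh:
        # -D takes an strftime pattern, so dates come out in a consistent
        # ISO-style format instead of Access's locale-dependent default.
        subprocess.run(
            ["mdb-export", "-D", "%Y-%m-%d %H:%M:%S", db_path, table],
            stdout=fh, check=True,
        )
    return dest
```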
UCanAccess
For the newer .accdb files, mdbtools falls short. That is where UCanAccess comes in. It is a pure-Java JDBC driver that handles .accdb files and has proven solid in my use.
To use this, you need Java installed, along with the UCanAccess JARs and dependencies. I generally stash mine in ~/lib/ucanaccess/. While it relies on Java, it gets the job done without fuss.
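Collecting those JARs into a JDBC classpath is the fiddly part. Here is a small sketch under my own layout assumption: the driver JAR sits at the top of the directory and the bundled dependencies (jackcess, hsqldb, commons-*) live under lib/, which is how the UCanAccess distribution ships:

```python
from pathlib import Path

def ucanaccess_jars(base: str = "~/lib/ucanaccess") -> list[str]:
    """Collect the UCanAccess driver JAR plus its bundled dependency JARs."""
    root = Path(base).expanduser()
    # Driver JAR at the top level, dependencies under lib/.
    jars = sorted(root.glob("*.jar")) + sorted(root.glob("lib/*.jar"))
    return [str(p) for p in jars]
```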
How I Structured the Python Code
I put together a small Python package with a couple of command-line entry points. This allows me to run it directly or import it into other scripts if I need more granular control.
The core export logic lives in access_to_csv/exporter.py. I used Click to handle the CLI interface in access_to_csv/cli.py.
The main function, export_database(), takes the database path, the output folder, the backend choice (auto, mdbtools, or ucanaccess), and an overwrite flag.
I implemented an “auto” mode to simplify the execution:
- For .mdb files, it prefers mdbtools if it is available on the system.
- For .accdb files, it jumps straight to UCanAccess.
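The auto mode boils down to a small dispatch on file extension and tool availability. A sketch (pick_backend is a hypothetical name, not the package's real function; the optional parameter just makes the probe overridable):

```python
import shutil
from pathlib import Path

def pick_backend(db_path: str, mdbtools_available=None) -> str:
    """Choose mdbtools for .mdb when installed; fall back to UCanAccess."""
    if mdbtools_available is None:
        # Probe the PATH for the mdbtools CLI.
        mdbtools_available = shutil.which("mdb-tables") is not None
    suffix = Path(db_path).suffix.lower()
    if suffix == ".mdb" and mdbtools_available:
        return "mdbtools"
    return "ucanaccess"
```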
Behind the scenes, the two routes operate differently. The mdbtools route just shells out to the CLI tools. The UCanAccess route uses JayDeBeApi to connect via JDBC, grabs the real user tables, and writes out CSVs with proper headers.
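The JDBC leg might look roughly like this. The jaydebeapi.connect call and the net.ucanaccess.jdbc.UcanaccessDriver class name are real, but the function names are illustrative, the table list is assumed to be discovered separately, and actually running the connection needs a JVM plus the JARs:

```python
import csv

def write_table_csv(dest, headers, rows):
    """Write one table's header row and data rows to a CSV file."""
    with open(dest, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(headers)
        writer.writerows(rows)

def export_tables_via_jdbc(db_path, tables, out_dir, classpath):
    """Illustrative JDBC route: connect once, then dump each table."""
    import jaydebeapi  # third-party; needs Java + the UCanAccess JARs
    conn = jaydebeapi.connect(
        "net.ucanaccess.jdbc.UcanaccessDriver",
        f"jdbc:ucanaccess://{db_path}",
        jars=classpath,
    )
    try:
        cur = conn.cursor()
        for table in tables:
            # Brackets let table names with spaces survive the SQL.
            cur.execute(f"SELECT * FROM [{table}]")
            headers = [d[0] for d in cur.description]
            write_table_csv(f"{out_dir}/{table}.csv", headers, cur.fetchall())
    finally:
        conn.close()
```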
Handling the Edge Cases
Data extraction is rarely a clean process. I ran into a few edge cases that I had to handle in the code.
- Table Names: Access users often put spaces or weird characters in table names. I added logic to clean them up when naming the resulting CSV files.
- Dates: Dates can be messy. With mdbtools, I ensured they come out in a nice, consistent ISO format.
- Safety: To avoid blowing away existing data, I made sure the script requires you to explicitly pass --overwrite to replace files.
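The name cleanup and the overwrite guard can each be a few lines. A sketch (safe_csv_name and check_overwrite are names I'm making up for illustration):

```python
import re
from pathlib import Path

def safe_csv_name(table: str) -> str:
    """Turn an Access table name into a filesystem-friendly CSV filename."""
    # Collapse runs of spaces and punctuation into single underscores.
    cleaned = re.sub(r"[^\w-]+", "_", table).strip("_")
    return f"{cleaned}.csv"

def check_overwrite(dest: Path, overwrite: bool) -> None:
    """Refuse to clobber an existing file unless --overwrite was passed."""
    if dest.exists() and not overwrite:
        raise FileExistsError(f"{dest} exists; pass --overwrite to replace it")
```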
Why I’m Happy with This Setup
Everything runs locally on my machine and stays mostly in Python. The flexibility is critical; I can switch backends whenever I need to.
For old .mdb files, mdbtools is blazing fast. For .accdb, UCanAccess works.
The best part? I have a dead-simple command I can throw into any script and know it will reliably give me clean CSVs every time.
Do you have a preferred tool for wrangling legacy database formats? Reach out to me on X (@chadmichel).


