DuckDB describes itself as a in-process analytical database. It’s meant to process data for reporting and analysis, rather than being a transactional data store, and is column oriented (As opposed to being row oriented, like most OLTP systems). It doesn’t run as a server process, like SQL Server, but rather it runs in the client application’s process. SQLite runs in the same manner, and DuckDB has a lot of similarities with that system.
DuckDB is ranked as the #82 database system on DB Engines.
DuckDB can run on Windows, Linux, or Mac. DuckDB is open source, and is free to run. It is ACID compliant and supports transactions.
With concurrency in DuckDB, it will support either one process to both read and write to the database, or support multiples processes that read and none that write. But with an analytic systems, most operations should be reads.
Setup:
I’m using Windows, so it’s easy to use winget to install DuckDB:
winget install DuckDB.cli
There are other options available here.
I like to use DBeaver to connect to databases, and it supports DuckDB. When setting up the connection, just fill in the path where you want to store the database file, including the name (I used test.db). Once you try to connect, you’ll be prompted to download the necessary driver files.
There are three databases created: system, temp, and test(the user database).
Each database has three schemas: information_schema, pg_catalog, main
SQL Examples:
The SQL seems pretty close to Postgres PL/SQL.
*) Creating a table:
CREATE OR REPLACE SEQUENCE id_TestTable;
CREATE OR REPLACE TABLE main.TestTable (
ID int DEFAULT nextval('id_Test_Table') PRIMARY KEY,
Name varchar(50) NOT NULL UNIQUE,
Balance numeric(6,2) NOT NULL,
IsActive boolean NOT NULL,
CreatedDate TIMESTAMP WITH TIME ZONE
DEFAULT CURRENT_TIMESTAMP
);
I didn’t see a way to create an auto-increment column inline with the CREATE TABLE statement, but we can create a sequence. DuckDB supports CREATE OR REPLACE for creating a table.
*) Insert records:
INSERT INTO main.TestTable (Name, Balance, IsActive)
VALUES ('TestRecord1', 50.05, TRUE);
INSERT INTO main.TestTable (Name, Balance, IsActive)
VALUES ('TestRecord2', 25.00, TRUE);
*) Comments:
-- Comment /* Comment */
*) Select the first record with Name in alphabetical order:
SELECT * FROM main.TestTable ORDER BY Name LIMIT 1;
*) Get current date/time:
SELECT CURRENT_TIMESTAMP;
We can select without specifying a table.
*) Concatenate strings:
SELECT 'Atlanta' || ' ' || 'Falcons' as team;
SELECT CONCAT('Atlanta',' ', 'Falcons') as team;
We can use double pipes or the CONCAT function.
*) Convert a string to a datetime:
SELECT CAST('2024-01-01' as TIMESTAMP) as date;
*) Get the month from the current date/time:
SELECT EXTRACT(MONTH FROM CURRENT_TIMESTAMP) as current_month;
*) Create view:
CREATE VIEW main.vwTestTable AS SELECT ID, Name, Balance, IsActive, CreatedDate FROM main.TestTable ORDER BY ID;
*) Get List of tables and views:
SELECT table_catalog, table_schema, table_name, table_type FROM INFORMATION_SCHEMA.tables;
INFORMATION_SCHEMA views are available,
*) Drop the table.
The view will have to be dropped as well, since it references the table:
DROP VIEW IF EXISTS main.vwTestTable; DROP TABLE IF EXISTS main.TestTable;
Links: