TPC-H with Postgres

TPC-H is a database benchmark.

git clone https://github.com/gregrahn/tpch-kit.git
cd tpch-kit/dbgen
make -f Makefile.osx

Create the database and load the schema

createdb tpch
psql tpch -f dss.ddl

Generate data

./dbgen -vf -s 1

Load the data

for i in `ls *.tbl`; do
  table=${i/.tbl/}
  echo "Loading $table..."
  sed 's/|$//' $i > /tmp/$i
  psql tpch -q -c "TRUNCATE $table"
  psql tpch -c "\\copy $table FROM '/tmp/$i' CSV DELIMITER '|'"
done

Generate queries

mkdir /tmp/queries
for i in `ls queries/*.sql`; do
  tail -r $i | sed '2s/;//' | tail -r > /tmp/$i
done

DSS_QUERY=/tmp/queries ./qgen | sed 's/limit -1//' | sed 's/day (3)/day/' > queries.sql

Run queries

psql tpch -c "ANALYZE VERBOSE"
psql tpch < queries.sql

Bonus: Add Indexes with Dexter

Install Dexter

gem install pgdexter

And run

for i in `seq 1 5`; do
  dexter tpch queries.sql --input-format sql --create
done

Published April 29, 2018


You might also enjoy

Learn Data Science

Introducing Archer: Rails Console History for Heroku, Docker, and More

Bulk Upsert in Ruby/Rails


All code examples are public domain.
Use them however you’d like (licensed under CC0).