Thursday 2 February 2012

How to optimize search through 1 Terabyte in PostgreSQL

Today I wanted to share my experience of working with a relatively big database: an actively running PostgreSQL 9.0 instance holding nearly 1 TB of data. The most significant part of the data lives in just 2 tables, 650 GB and 350 GB respectively. The write load on these tables is heavy, roughly 15-20 inserts per second, while there are only a few selects per minute. Each row in either table holds up to 50 KB of data. Additionally, a single cleanup query runs once a day to remove the oldest data and keep the database from growing without bound. Well, let's take a look under the hood.
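For illustration, such a daily cleanup typically boils down to a DELETE keyed on a timestamp column. The post doesn't show the actual query, so this is only a minimal sketch: the table name, column name, and retention window here are all assumptions.

    -- Hypothetical daily cleanup; "big_table", "created_at" and the
    -- 30-day retention window are assumed, not taken from the real schema.
    DELETE FROM big_table
    WHERE created_at < now() - interval '30 days';

Note that on tables of this size a plain DELETE leaves behind a large number of dead tuples that VACUUM must later reclaim, which is one of the reasons this workload is interesting to dig into.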