Postgres clustering using multi-column indexes

I have a table which includes a multi-column index defined as

CREATE INDEX tab_a_idx1 ON tab_a USING btree (device, fixtime)

The index was chosen deliberately because the majority of the queries run against this table include selection criteria like this

WHERE device = 'xyz' AND fixtime > 'sometime' AND fixtime <= 'someothertime' ORDER BY fixtime;

The table has been clustered on this index in a effort to improve performance.

CLUSTER tab_a USING tab_a_idx1;

Based on the comments and answers in a previous question I've used this query to list my clustered tables, the indexes they're clustered on, and the definitions of those indexes.

SELECT c.oid, c.relname as tablename, x.relname as indexname, z.indexdef

FROM   pg_class c

JOIN   pg_index i ON i.indrelid = c.oid

JOIN   pg_class x ON i.indexrelid = x.oid

JOIN   pg_indexes z ON x.relname = z.indexname

WHERE  c.relkind = 'r' AND c.relhasindex AND i.indisclustered

And I've been using the pg_stats table to check the correlation of the indexed columns.

The quoted answer states that a correlation close to '1' is good, and as the value get lower the more clustering is indicated.

Immediately after the table was clustered the correlation of the 1st field in the index (device) was low (0.008) and the 2nd one (fixtime) relatively high (0.994).

If these values are supposed to be close to '1' but aren't, does that mean that a table can't (or shouldn't) be clustered on a multi-column index?

There are several versions of the tab_a (it's partitioned on fixtime) and I've noticed that the correlation values don't actually seem to vary much between the clustered and un-clustered versions of the table. Does this mean there's no point in clustering on this index?

Thanks

UPDATE - the parent table was created as follows....

CREATE TABLE tab_a 

  ( device  CHAR(6),

    fixTime TIMESTAMP,

    ....lots more fields.....

  )

PARTITION BY RANGE (fixTime);

The individual partitions were created like this

CREATE TABLE tab_a_201704 PARTITION OF tab_a FOR VALUES FROM ('2017-04-01' ) TO (  '2017-05-01' )

And the index used for the clustering like this....

CREATE INDEX tab_a_201704_idx2 ON tab_a_201704 (device, fixTime);

And the command to do the cluster....

CLUSTER tab_a_201704 USING tab_a_201704_idx2 ;

edited Nov 19 '18 at 18:19

asked Nov 17 '18 at 13:59

Hemel

1948

Since you say that the table is partitioned, can you show CREATE TABLE for the table and its partitions? What was the SQL statement to cluster the table?

– Laurenz Albe
Nov 18 '18 at 13:32

add a comment |

I have a table which includes a multi-column index defined as

CREATE INDEX tab_a_idx1 ON tab_a USING btree (device, fixtime)

The index was chosen deliberately because the majority of the queries run against this table include selection criteria like this

WHERE device = 'xyz' AND fixtime > 'sometime' AND fixtime <= 'someothertime' ORDER BY fixtime;

The table has been clustered on this index in a effort to improve performance.

CLUSTER tab_a USING tab_a_idx1;

Based on the comments and answers in a previous question I've used this query to list my clustered tables, the indexes they're clustered on, and the definitions of those indexes.

SELECT c.oid, c.relname as tablename, x.relname as indexname, z.indexdef

FROM   pg_class c

JOIN   pg_index i ON i.indrelid = c.oid

JOIN   pg_class x ON i.indexrelid = x.oid

JOIN   pg_indexes z ON x.relname = z.indexname

WHERE  c.relkind = 'r' AND c.relhasindex AND i.indisclustered

And I've been using the pg_stats table to check the correlation of the indexed columns.

The quoted answer states that a correlation close to '1' is good, and as the value get lower the more clustering is indicated.

Immediately after the table was clustered the correlation of the 1st field in the index (device) was low (0.008) and the 2nd one (fixtime) relatively high (0.994).

If these values are supposed to be close to '1' but aren't, does that mean that a table can't (or shouldn't) be clustered on a multi-column index?

There are several versions of the tab_a (it's partitioned on fixtime) and I've noticed that the correlation values don't actually seem to vary much between the clustered and un-clustered versions of the table. Does this mean there's no point in clustering on this index?

Thanks

UPDATE - the parent table was created as follows....

CREATE TABLE tab_a 

  ( device  CHAR(6),

    fixTime TIMESTAMP,

    ....lots more fields.....

  )

PARTITION BY RANGE (fixTime);

The individual partitions were created like this

CREATE TABLE tab_a_201704 PARTITION OF tab_a FOR VALUES FROM ('2017-04-01' ) TO (  '2017-05-01' )

And the index used for the clustering like this....

CREATE INDEX tab_a_201704_idx2 ON tab_a_201704 (device, fixTime);

And the command to do the cluster....

CLUSTER tab_a_201704 USING tab_a_201704_idx2 ;

edited Nov 19 '18 at 18:19

asked Nov 17 '18 at 13:59

Hemel

1948

Since you say that the table is partitioned, can you show CREATE TABLE for the table and its partitions? What was the SQL statement to cluster the table?

– Laurenz Albe
Nov 18 '18 at 13:32

add a comment |

I have a table which includes a multi-column index defined as

CREATE INDEX tab_a_idx1 ON tab_a USING btree (device, fixtime)

The index was chosen deliberately because the majority of the queries run against this table include selection criteria like this

WHERE device = 'xyz' AND fixtime > 'sometime' AND fixtime <= 'someothertime' ORDER BY fixtime;

The table has been clustered on this index in a effort to improve performance.

CLUSTER tab_a USING tab_a_idx1;

Based on the comments and answers in a previous question I've used this query to list my clustered tables, the indexes they're clustered on, and the definitions of those indexes.

SELECT c.oid, c.relname as tablename, x.relname as indexname, z.indexdef

FROM   pg_class c

JOIN   pg_index i ON i.indrelid = c.oid

JOIN   pg_class x ON i.indexrelid = x.oid

JOIN   pg_indexes z ON x.relname = z.indexname

WHERE  c.relkind = 'r' AND c.relhasindex AND i.indisclustered

And I've been using the pg_stats table to check the correlation of the indexed columns.

The quoted answer states that a correlation close to '1' is good, and as the value get lower the more clustering is indicated.

Immediately after the table was clustered the correlation of the 1st field in the index (device) was low (0.008) and the 2nd one (fixtime) relatively high (0.994).

If these values are supposed to be close to '1' but aren't, does that mean that a table can't (or shouldn't) be clustered on a multi-column index?

There are several versions of the tab_a (it's partitioned on fixtime) and I've noticed that the correlation values don't actually seem to vary much between the clustered and un-clustered versions of the table. Does this mean there's no point in clustering on this index?

Thanks

UPDATE - the parent table was created as follows....

CREATE TABLE tab_a 

  ( device  CHAR(6),

    fixTime TIMESTAMP,

    ....lots more fields.....

  )

PARTITION BY RANGE (fixTime);

The individual partitions were created like this

CREATE TABLE tab_a_201704 PARTITION OF tab_a FOR VALUES FROM ('2017-04-01' ) TO (  '2017-05-01' )

And the index used for the clustering like this....

CREATE INDEX tab_a_201704_idx2 ON tab_a_201704 (device, fixTime);

And the command to do the cluster....

CLUSTER tab_a_201704 USING tab_a_201704_idx2 ;

edited Nov 19 '18 at 18:19

asked Nov 17 '18 at 13:59

Hemel

1948

I have a table which includes a multi-column index defined as

CREATE INDEX tab_a_idx1 ON tab_a USING btree (device, fixtime)

The index was chosen deliberately because the majority of the queries run against this table include selection criteria like this

WHERE device = 'xyz' AND fixtime > 'sometime' AND fixtime <= 'someothertime' ORDER BY fixtime;

The table has been clustered on this index in a effort to improve performance.

CLUSTER tab_a USING tab_a_idx1;

Based on the comments and answers in a previous question I've used this query to list my clustered tables, the indexes they're clustered on, and the definitions of those indexes.

SELECT c.oid, c.relname as tablename, x.relname as indexname, z.indexdef

FROM   pg_class c

JOIN   pg_index i ON i.indrelid = c.oid

JOIN   pg_class x ON i.indexrelid = x.oid

JOIN   pg_indexes z ON x.relname = z.indexname

WHERE  c.relkind = 'r' AND c.relhasindex AND i.indisclustered

And I've been using the pg_stats table to check the correlation of the indexed columns.

The quoted answer states that a correlation close to '1' is good, and as the value get lower the more clustering is indicated.

Immediately after the table was clustered the correlation of the 1st field in the index (device) was low (0.008) and the 2nd one (fixtime) relatively high (0.994).

If these values are supposed to be close to '1' but aren't, does that mean that a table can't (or shouldn't) be clustered on a multi-column index?

There are several versions of the tab_a (it's partitioned on fixtime) and I've noticed that the correlation values don't actually seem to vary much between the clustered and un-clustered versions of the table. Does this mean there's no point in clustering on this index?

Thanks

UPDATE - the parent table was created as follows....

CREATE TABLE tab_a 

  ( device  CHAR(6),

    fixTime TIMESTAMP,

    ....lots more fields.....

  )

PARTITION BY RANGE (fixTime);

The individual partitions were created like this

CREATE TABLE tab_a_201704 PARTITION OF tab_a FOR VALUES FROM ('2017-04-01' ) TO (  '2017-05-01' )

And the index used for the clustering like this....

CREATE INDEX tab_a_201704_idx2 ON tab_a_201704 (device, fixTime);

And the command to do the cluster....

CLUSTER tab_a_201704 USING tab_a_201704_idx2 ;

postgresql query-performance

edited Nov 19 '18 at 18:19

asked Nov 17 '18 at 13:59

Hemel

1948

edited Nov 19 '18 at 18:19

asked Nov 17 '18 at 13:59

Hemel

1948

edited Nov 19 '18 at 18:19

asked Nov 17 '18 at 13:59

Hemel

1948

asked Nov 17 '18 at 13:59

Hemel

1948

asked Nov 17 '18 at 13:59

Hemel

1948

Since you say that the table is partitioned, can you show CREATE TABLE for the table and its partitions? What was the SQL statement to cluster the table?

– Laurenz Albe
Nov 18 '18 at 13:32

add a comment |

Since you say that the table is partitioned, can you show CREATE TABLE for the table and its partitions? What was the SQL statement to cluster the table?

– Laurenz Albe
Nov 18 '18 at 13:32

Since you say that the table is partitioned, can you show CREATE TABLE for the table and its partitions? What was the SQL statement to cluster the table?

– Laurenz Albe
Nov 18 '18 at 13:32

add a comment |

0

active

oldest

votes

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53351910%2fpostgres-clustering-using-multi-column-indexes%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

0

active

oldest

votes

0

active

oldest

votes

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Cfrgtkky