|
|
 | | From: | Adrian Holovaty | | Subject: | Index on a function and SELECT DISTINCT | | Date: | Fri, 14 Jan 2005 12:32:12 -0600 |
|
|
 | If I have this table, function and index in Postgres 7.3.6 ...
""" CREATE TABLE news_stories ( id serial primary key NOT NULL, pub_date timestamp with time zone NOT NULL, ... ) CREATE OR REPLACE FUNCTION get_year_trunc(timestamp with time zone) returns timestamp with time zone AS 'SELECT date_trunc(\'year\',$1);' LANGUAGE 'SQL' IMMUTABLE; CREATE INDEX news_stories_pub_date_year_trunc ON news_stories( get_year_trunc(pub_date) ); """ ....why does this query not use the index? db=# EXPLAIN SELECT DISTINCT get_year_trunc(pub_date) FROM news_stories; QUERY PLAN --------------------------------------------------------------------------------- Unique (cost=59597.31..61311.13 rows=3768 width=8) -> Sort (cost=59597.31..60454.22 rows=342764 width=8) Sort Key: date_trunc('year'::text, pub_date) -> Seq Scan on news_stories (cost=0.00..23390.55 rows=342764 width=8) (4 rows)
The query is noticably slow (2 seconds) on a database with 150,000+ records. How can I speed it up?
Thanks, Adrian
---------------------------(end of broadcast)--------------------------- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly
|
|
 | | From: | Frank Wiles | | Subject: | Re: Index on a function and SELECT DISTINCT | | Date: | Mon, 17 Jan 2005 10:09:42 -0600 |
|
|
 | On Fri, 14 Jan 2005 12:32:12 -0600 Adrian Holovaty wrote:
> If I have this table, function and index in Postgres 7.3.6 ... > > """ > CREATE TABLE news_stories ( > id serial primary key NOT NULL, > pub_date timestamp with time zone NOT NULL, > ... > ) > CREATE OR REPLACE FUNCTION get_year_trunc(timestamp with time zone) > returns timestamp with time zone AS 'SELECT date_trunc(\'year\',$1);' > LANGUAGE 'SQL' IMMUTABLE; > CREATE INDEX news_stories_pub_date_year_trunc ON > news_stories( get_year_trunc(pub_date) ); > """ > > ...why does this query not use the index? > > db=# EXPLAIN SELECT DISTINCT get_year_trunc(pub_date) FROM > news_stories; > QUERY PLAN > --------------------------------------------------------------------- > ------------ > Unique (cost=59597.31..61311.13 rows=3768 width=8) > -> Sort (cost=59597.31..60454.22 rows=342764 width=8) > Sort Key: date_trunc('year'::text, pub_date) > -> Seq Scan on news_stories (cost=0.00..23390.55 > rows=342764 > width=8) > (4 rows) > > The query is noticably slow (2 seconds) on a database with 150,000+ > records. How can I speed it up?
It's doing a sequence scan because you're not limiting the query in the FROM clause. No point in using an index when you're asking for the entire table. :)
--------------------------------- Frank Wiles http://www.wiles.org ---------------------------------
---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faqs/FAQ.html
|
|
 | | From: | Adrian Holovaty | | Subject: | Re: Index on a function and SELECT DISTINCT | | Date: | Mon, 17 Jan 2005 11:59:24 -0600 |
|
|
 | Frank Wiles wrote: > Adrian Holovaty wrote: > > If I have this table, function and index in Postgres 7.3.6 ... > > > > """ > > CREATE TABLE news_stories ( > > id serial primary key NOT NULL, > > pub_date timestamp with time zone NOT NULL, > > ... > > ) > > CREATE OR REPLACE FUNCTION get_year_trunc(timestamp with time zone) > > returns timestamp with time zone AS 'SELECT date_trunc(\'year\',$1);' > > LANGUAGE 'SQL' IMMUTABLE; > > CREATE INDEX news_stories_pub_date_year_trunc ON > > news_stories( get_year_trunc(pub_date) ); > > """ > > > > ...why does this query not use the index? > > > > db=# EXPLAIN SELECT DISTINCT get_year_trunc(pub_date) FROM > > news_stories; > > QUERY PLAN > > --------------------------------------------------------------------- > > ------------ > > Unique (cost=59597.31..61311.13 rows=3768 width=8) > > -> Sort (cost=59597.31..60454.22 rows=342764 width=8) > > Sort Key: date_trunc('year'::text, pub_date) > > -> Seq Scan on news_stories (cost=0.00..23390.55 > > rows=342764 > > width=8) > > (4 rows) > > > > The query is noticably slow (2 seconds) on a database with 150,000+ > > records. How can I speed it up? > > It's doing a sequence scan because you're not limiting the query in > the FROM clause. No point in using an index when you're asking for > the entire table. :)
Ah, that makes sense. So is there a way to optimize SELECT DISTINCT queries that have no WHERE clause?
Adrian
---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faqs/FAQ.html
|
|
 | | From: | PFC | | Subject: | Re: Index on a function and SELECT DISTINCT | | Date: | Mon, 17 Jan 2005 19:17:57 +0100 |
|
|
 |
Try :
EXPLAIN SELECT get_year_trunc(pub_date) as foo FROM ... GROUP BY foo
Apart from that, you could use a materialized view...
>> > db=# EXPLAIN SELECT DISTINCT get_year_trunc(pub_date) FROM
> Ah, that makes sense. So is there a way to optimize SELECT DISTINCT > queries > that have no WHERE clause? > > Adrian > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faqs/FAQ.html >
---------------------------(end of broadcast)--------------------------- TIP 8: explain analyze is your friend
|
|
|