|
|
 | | From: | Andrei Bintintan | | Subject: | OFFSET impact on Performance??? | | Date: | Thu, 20 Jan 2005 13:13:44 +0200 |
|
|
 | This is a multi-part message in MIME format.
------=_NextPart_000_019A_01C4FEF1.DE1BBDF0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable
Hi to all,=20
I have the following 2 examples. Now, regarding on the offset if it is = small(10) or big(>50000) what is the impact on the performance of the = query?? I noticed that if I return more data's(columns) or if I make = more joins then the query runs even slower if the OFFSET is bigger. How = can I somehow improve the performance on this?=20
Best regards,=20 Andy.
explain analyze SELECT o.id FROM report r=20 INNER JOIN orders o ON o.id=3Dr.id_order AND o.id_status=3D6 ORDER BY 1 LIMIT 10 OFFSET 10
Limit (cost=3D44.37..88.75 rows=3D10 width=3D4) (actual = time=3D0.160..0.275 rows=3D10 loops=3D1) -> Merge Join (cost=3D0.00..182150.17 rows=3D41049 width=3D4) = (actual time=3D0.041..0.260 rows=3D20 loops=3D1) Merge Cond: ("outer".id_order =3D "inner".id) -> Index Scan using report_id_order_idx on report r = (cost=3D0.00..157550.90 rows=3D42862 width=3D4) (actual = time=3D0.018..0.075 rows=3D20 loops=3D1) -> Index Scan using orders_pkey on orders o = (cost=3D0.00..24127.04 rows=3D42501 width=3D4) (actual = time=3D0.013..0.078 rows=3D20 loops=3D1) Filter: (id_status =3D 6) Total runtime: 0.373 ms
explain analyze SELECT o.id FROM report r=20 INNER JOIN orders o ON o.id=3Dr.id_order AND o.id_status=3D6 ORDER BY 1 LIMIT 10 OFFSET 1000000
Limit (cost=3D31216.85..31216.85 rows=3D1 width=3D4) (actual = time=3D1168.152..1168.152 rows=3D0 loops=3D1) -> Sort (cost=3D31114.23..31216.85 rows=3D41049 width=3D4) (actual = time=3D1121.769..1152.246 rows=3D42693 loops=3D1) Sort Key: o.id -> Hash Join (cost=3D2329.99..27684.03 rows=3D41049 width=3D4) = (actual time=3D441.879..925.498 rows=3D42693 loops=3D1) Hash Cond: ("outer".id_order =3D "inner".id) -> Seq Scan on report r (cost=3D0.00..23860.62 = rows=3D42862 width=3D4) (actual time=3D38.634..366.035 rows=3D42864 = loops=3D1) -> Hash (cost=3D2077.74..2077.74 rows=3D42501 width=3D4) = (actual time=3D140.200..140.200 rows=3D0 loops=3D1) -> Seq Scan on orders o (cost=3D0.00..2077.74 = rows=3D42501 width=3D4) (actual time=3D0.059..96.890 rows=3D42693 = loops=3D1) Filter: (id_status =3D 6) Total runtime: 1170.586 ms
------=_NextPart_000_019A_01C4FEF1.DE1BBDF0 Content-Type: text/html; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable
charset=3Diso-8859-1">
Hi to all,
I have the following 2 examples. = Now,=20 regarding on the offset if it is small(10) or big(>50000) what is the = impact=20 on the performance of the query?? I noticed that if I return more=20 data's(columns) or if I make more joins then the query runs = even=20 slower if the OFFSET is bigger. How can I somehow improve the = performance on=20 this?
Best regards, Andy.
explain size=3D2>analyze color=3D#0000ff>SELECT=20 o.id size=3D2>FROM=20 report r size=3D2>INNERsize=3D2> JOIN orders o color=3D#008000>ON=20 o.id=3Dr.id_order AND = o.id_status=3D6 color=3D#6d13b7>ORDER color=3D#008000>BY 1 LIMIT 10 color=3D#6d13b7>OFFSET 10
Limit (cost=3D44.37..88.75 = rows=3D10 width=3D4)=20 (actual time=3D0.160..0.275 rows=3D10 loops=3D1) -> = Merge=20 Join (cost=3D0.00..182150.17 rows=3D41049 width=3D4) (actual = time=3D0.041..0.260=20 rows=3D20 loops=3D1) Merge = Cond:=20 ("outer".id_order =3D = "inner".id) =20 -> Index Scan using report_id_order_idx on report r =20 (cost=3D0.00..157550.90 rows=3D42862 width=3D4) (actual = time=3D0.018..0.075 rows=3D20=20 loops=3D1) -> = Index Scan=20 using orders_pkey on orders o (cost=3D0.00..24127.04 rows=3D42501 = width=3D4)=20 (actual time=3D0.013..0.078 rows=3D20=20 loops=3D1)  = ; =20 Filter: (id_status =3D 6) Total runtime: 0.373 ms
explain size=3D2>analyze color=3D#0000ff>SELECT=20 o.id size=3D2>FROM=20 report r size=3D2>INNERsize=3D2> JOIN orders o color=3D#008000>ON=20 o.id=3Dr.id_order AND = o.id_status=3D6 color=3D#6d13b7>ORDER color=3D#008000>BY 1 LIMIT 10 color=3D#6d13b7>OFFSET 1000000face=3DArial>size=3D2>Limit (cost=3D31216.85..31216.85 rows=3D1 width=3D4) = (actual=20 time=3D1168.152..1168.152 rows=3D0 loops=3D1) -> = Sort =20 (cost=3D31114.23..31216.85 rows=3D41049 width=3D4) (actual = time=3D1121.769..1152.246=20 rows=3D42693 loops=3D1) = Sort Key:=20 o.id -> Hash = Join =20 (cost=3D2329.99..27684.03 rows=3D41049 width=3D4) (actual = time=3D441.879..925.498=20 rows=3D42693=20 loops=3D1)  = ; =20 Hash Cond: ("outer".id_order =3D=20 "inner".id) &nbs= p; =20 -> Seq Scan on report r (cost=3D0.00..23860.62 = rows=3D42862 width=3D4)=20 (actual time=3D38.634..366.035 rows=3D42864=20 loops=3D1)  = ; =20 -> Hash (cost=3D2077.74..2077.74 rows=3D42501 width=3D4) = (actual=20 time=3D140.200..140.200 rows=3D0=20 loops=3D1)  = ; =20 -> Seq Scan on orders o (cost=3D0.00..2077.74 = rows=3D42501 width=3D4)=20 (actual time=3D0.059..96.890 rows=3D42693=20 loops=3D1)  = ; = =20 Filter: (id_status =3D 6) Total runtime: 1170.586=20 ms
------=_NextPart_000_019A_01C4FEF1.DE1BBDF0--
|
|
 | | From: | Alex Turner | | Subject: | Re: [SQL] OFFSET impact on Performance??? | | Date: | Thu, 20 Jan 2005 11:39:16 -0500 |
|
|
 | I am also very interesting in this very question.. Is there any way to declare a persistant cursor that remains open between pg sessions? This would be better than a temp table because you would not have to do the initial select and insert into a fresh table and incur those IO costs, which are often very heavy, and the reason why one would want to use a cursor.
Alex Turner NetEconomist
On Thu, 20 Jan 2005 15:20:59 +0000, Richard Huxton wrote: > Andrei Bintintan wrote: > >> If you're using this to provide "pages" of results, could you use a > >> cursor? > > > > What do you mean by that? Cursor? > > > > Yes I'm using this to provide "pages", but If I jump to the last pages > > it goes very slow. > > DECLARE mycursor CURSOR FOR SELECT * FROM ... > FETCH FORWARD 10 IN mycursor; > CLOSE mycursor; > > Repeated FETCHes would let you step through your results. That won't > work if you have a web-app making repeated connections. > > If you've got a web-application then you'll probably want to insert the > results into a cache table for later use. > > -- > Richard Huxton > Archonet Ltd > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly >
---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
|
|
 | | From: | Ragnar Hafstaš | | Subject: | Re: [SQL] OFFSET impact on Performance??? | | Date: | Thu, 20 Jan 2005 19:23:12 +0000 |
|
|
 | On Thu, 2005-01-20 at 19:12 +0000, Ragnar Hafstaš wrote: > On Thu, 2005-01-20 at 11:59 -0500, Greg Stark wrote: > > > The best way to do pages for is not to use offset or cursors but to use an > > index. This only works if you can enumerate all the sort orders the > > application might be using and can have an index on each of them. > > > > To do this the query would look something like: > > > > SELECT * FROM tab WHERE col > ? ORDER BY col LIMIT 50 > > > > Then you take note of the last value used on a given page and if the user > > selects "next" you pass that as the starting point for the next page. > > this will only work unchanged if the index is unique. imagine , for > example if you have more than 50 rows with the same value of col. > > one way to fix this is to use ORDER BY col,oid
and a slightly more complex WHERE clause as well, of course
gnari
---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
|
|
 | | From: | Andrei Bintintan | | Subject: | Re: [SQL] OFFSET impact on Performance??? | | Date: | Fri, 21 Jan 2005 11:20:48 +0200 |
|
|
 | Now I read all the posts and I have some answers.
Yes, I have a web aplication. I HAVE to know exactly how many pages I have and I have to allow the user to jump to a specific page(this is where I used limit and offset). We have this feature and I cannot take it out.
>> > SELECT * FROM tab WHERE col > ? ORDER BY col LIMIT 50 Now this solution looks very fast, but I cannot implement it, because I cannot jump from page 1 to page xxxx only to page 2. Because I know with this type where did the page 1 ended. And we have some really complicated where's and about 10 tables are involved in the sql query.
About the CURSOR I have to read more about them because this is my first time when I hear about. I don't know if temporary tables are a solution, really I don't think so, there are a lot of users that are working in the same time at the same page.
So... still DIGGING for solutions.
Andy.
----- Original Message ----- From: "Ragnar Hafstaš" To: Cc: "Andrei Bintintan" ; Sent: Thursday, January 20, 2005 9:23 PM Subject: Re: [PERFORM] [SQL] OFFSET impact on Performance???
> On Thu, 2005-01-20 at 19:12 +0000, Ragnar Hafstaš wrote: >> On Thu, 2005-01-20 at 11:59 -0500, Greg Stark wrote: >> >> > The best way to do pages for is not to use offset or cursors but to use >> > an >> > index. This only works if you can enumerate all the sort orders the >> > application might be using and can have an index on each of them. >> > >> > To do this the query would look something like: >> > >> > SELECT * FROM tab WHERE col > ? ORDER BY col LIMIT 50 >> > >> > Then you take note of the last value used on a given page and if the >> > user >> > selects "next" you pass that as the starting point for the next page. >> >> this will only work unchanged if the index is unique. imagine , for >> example if you have more than 50 rows with the same value of col. >> >> one way to fix this is to use ORDER BY col,oid > > and a slightly more complex WHERE clause as well, of course > > gnari > > >
---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
|
|
 | | From: | Ragnar Hafstaš | | Subject: | Re: [SQL] OFFSET impact on Performance??? | | Date: | Thu, 20 Jan 2005 19:12:06 +0000 |
|
|
 | On Thu, 2005-01-20 at 11:59 -0500, Greg Stark wrote:
> The best way to do pages for is not to use offset or cursors but to use an > index. This only works if you can enumerate all the sort orders the > application might be using and can have an index on each of them. > > To do this the query would look something like: > > SELECT * FROM tab WHERE col > ? ORDER BY col LIMIT 50 > > Then you take note of the last value used on a given page and if the user > selects "next" you pass that as the starting point for the next page.
this will only work unchanged if the index is unique. imagine , for example if you have more than 50 rows with the same value of col.
one way to fix this is to use ORDER BY col,oid
gnari
---------------------------(end of broadcast)--------------------------- TIP 7: don't forget to increase your free space map settings
|
|
|