knowledge-database (beta)

Current group: pgsql.performance

OFFSET impact on Performance???

OFFSET impact on Performance???  
Andrei Bintintan
 Re: [SQL] OFFSET impact on Performance???  
Alex Turner
 Re: [SQL] OFFSET impact on Performance???  
Ragnar Hafstaš
 Re: [SQL] OFFSET impact on Performance???  
Andrei Bintintan
 Re: [SQL] OFFSET impact on Performance???  
Ragnar Hafstaš
From:Andrei Bintintan
Subject:OFFSET impact on Performance???
Date:Thu, 20 Jan 2005 13:13:44 +0200
This is a multi-part message in MIME format.

------=_NextPart_000_019A_01C4FEF1.DE1BBDF0
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

Hi to all,=20

I have the following 2 examples. Now, regarding on the offset if it is =
small(10) or big(>50000) what is the impact on the performance of the =
query?? I noticed that if I return more data's(columns) or if I make =
more joins then the query runs even slower if the OFFSET is bigger. How =
can I somehow improve the performance on this?=20

Best regards,=20
Andy.

explain analyze
SELECT o.id
FROM report r=20
INNER JOIN orders o ON o.id=3Dr.id_order AND o.id_status=3D6
ORDER BY 1 LIMIT 10 OFFSET 10


Limit (cost=3D44.37..88.75 rows=3D10 width=3D4) (actual =
time=3D0.160..0.275 rows=3D10 loops=3D1)
-> Merge Join (cost=3D0.00..182150.17 rows=3D41049 width=3D4) =
(actual time=3D0.041..0.260 rows=3D20 loops=3D1)
Merge Cond: ("outer".id_order =3D "inner".id)
-> Index Scan using report_id_order_idx on report r =
(cost=3D0.00..157550.90 rows=3D42862 width=3D4) (actual =
time=3D0.018..0.075 rows=3D20 loops=3D1)
-> Index Scan using orders_pkey on orders o =
(cost=3D0.00..24127.04 rows=3D42501 width=3D4) (actual =
time=3D0.013..0.078 rows=3D20 loops=3D1)
Filter: (id_status =3D 6)
Total runtime: 0.373 ms

explain analyze
SELECT o.id
FROM report r=20
INNER JOIN orders o ON o.id=3Dr.id_order AND o.id_status=3D6
ORDER BY 1 LIMIT 10 OFFSET 1000000

Limit (cost=3D31216.85..31216.85 rows=3D1 width=3D4) (actual =
time=3D1168.152..1168.152 rows=3D0 loops=3D1)
-> Sort (cost=3D31114.23..31216.85 rows=3D41049 width=3D4) (actual =
time=3D1121.769..1152.246 rows=3D42693 loops=3D1)
Sort Key: o.id
-> Hash Join (cost=3D2329.99..27684.03 rows=3D41049 width=3D4) =
(actual time=3D441.879..925.498 rows=3D42693 loops=3D1)
Hash Cond: ("outer".id_order =3D "inner".id)
-> Seq Scan on report r (cost=3D0.00..23860.62 =
rows=3D42862 width=3D4) (actual time=3D38.634..366.035 rows=3D42864 =
loops=3D1)
-> Hash (cost=3D2077.74..2077.74 rows=3D42501 width=3D4) =
(actual time=3D140.200..140.200 rows=3D0 loops=3D1)
-> Seq Scan on orders o (cost=3D0.00..2077.74 =
rows=3D42501 width=3D4) (actual time=3D0.059..96.890 rows=3D42693 =
loops=3D1)
Filter: (id_status =3D 6)
Total runtime: 1170.586 ms

------=_NextPart_000_019A_01C4FEF1.DE1BBDF0
Content-Type: text/html;
charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable



charset=3Diso-8859-1">





Hi to all,

I have the following 2 examples. =
Now,=20
regarding on the offset if it is small(10) or big(>50000) what is the =
impact=20
on the performance of the query?? I noticed that if I return more=20
data's(columns) or if I make more joins then the query runs =
even=20
slower if the OFFSET is bigger. How can I somehow improve the =
performance on=20
this?


Best regards,
Andy.


explain size=3D2>analyze
color=3D#0000ff>SELECT
=20
o.id
size=3D2>FROM=20
report r
size=3D2>INNERsize=3D2> JOIN orders o color=3D#008000>ON=20
o.id=3Dr.id_order AND =
o.id_status=3D6
color=3D#6d13b7>ORDER color=3D#008000>BY 1 LIMIT 10 color=3D#6d13b7>OFFSET 10


 

Limit  (cost=3D44.37..88.75 =
rows=3D10 width=3D4)=20
(actual time=3D0.160..0.275 rows=3D10 loops=3D1)
  ->  =
Merge=20
Join  (cost=3D0.00..182150.17 rows=3D41049 width=3D4) (actual =
time=3D0.041..0.260=20
rows=3D20 loops=3D1)
        Merge =
Cond:=20
("outer".id_order =3D =
"inner".id)
       =20
->  Index Scan using report_id_order_idx on report r =20
(cost=3D0.00..157550.90 rows=3D42862 width=3D4) (actual =
time=3D0.018..0.075 rows=3D20=20
loops=3D1)
        ->  =
Index Scan=20
using orders_pkey on orders o  (cost=3D0.00..24127.04 rows=3D42501 =
width=3D4)=20
(actual time=3D0.013..0.078 rows=3D20=20
loops=3D1)
          =
;   =20
Filter: (id_status =3D 6)
Total runtime: 0.373 ms


explain size=3D2>analyze
color=3D#0000ff>SELECT
=20
o.id
size=3D2>FROM=20
report r
size=3D2>INNERsize=3D2> JOIN orders o color=3D#008000>ON=20
o.id=3Dr.id_order AND =
o.id_status=3D6
color=3D#6d13b7>ORDER color=3D#008000>BY 1 LIMIT 10 color=3D#6d13b7>OFFSET 1000000

face=3DArial>size=3D2>Limit  (cost=3D31216.85..31216.85 rows=3D1 width=3D4) =
(actual=20
time=3D1168.152..1168.152 rows=3D0 loops=3D1)
  ->  =
Sort =20
(cost=3D31114.23..31216.85 rows=3D41049 width=3D4) (actual =
time=3D1121.769..1152.246=20
rows=3D42693 loops=3D1)
        =
Sort Key:=20
o.id
        ->  Hash =
Join =20
(cost=3D2329.99..27684.03 rows=3D41049 width=3D4) (actual =
time=3D441.879..925.498=20
rows=3D42693=20
loops=3D1)
          =
;   =20
Hash Cond: ("outer".id_order =3D=20
"inner".id)
         &nbs=
p;   =20
->  Seq Scan on report r  (cost=3D0.00..23860.62 =
rows=3D42862 width=3D4)=20
(actual time=3D38.634..366.035 rows=3D42864=20
loops=3D1)
          =
;   =20
->  Hash  (cost=3D2077.74..2077.74 rows=3D42501 width=3D4) =
(actual=20
time=3D140.200..140.200 rows=3D0=20
loops=3D1)
          =
;         =20
->  Seq Scan on orders o  (cost=3D0.00..2077.74 =
rows=3D42501 width=3D4)=20
(actual time=3D0.059..96.890 rows=3D42693=20
loops=3D1)
          =
;            =
   =20
Filter: (id_status =3D 6)
Total runtime: 1170.586=20
ms


------=_NextPart_000_019A_01C4FEF1.DE1BBDF0--
From:Alex Turner
Subject:Re: [SQL] OFFSET impact on Performance???
Date:Thu, 20 Jan 2005 11:39:16 -0500
I am also very interesting in this very question.. Is there any way to
declare a persistant cursor that remains open between pg sessions?
This would be better than a temp table because you would not have to
do the initial select and insert into a fresh table and incur those IO
costs, which are often very heavy, and the reason why one would want
to use a cursor.

Alex Turner
NetEconomist


On Thu, 20 Jan 2005 15:20:59 +0000, Richard Huxton wrote:
> Andrei Bintintan wrote:
> >> If you're using this to provide "pages" of results, could you use a
> >> cursor?
> >
> > What do you mean by that? Cursor?
> >
> > Yes I'm using this to provide "pages", but If I jump to the last pages
> > it goes very slow.
>
> DECLARE mycursor CURSOR FOR SELECT * FROM ...
> FETCH FORWARD 10 IN mycursor;
> CLOSE mycursor;
>
> Repeated FETCHes would let you step through your results. That won't
> work if you have a web-app making repeated connections.
>
> If you've got a web-application then you'll probably want to insert the
> results into a cache table for later use.
>
> --
> Richard Huxton
> Archonet Ltd
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly
>

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match
From:Ragnar Hafstaš
Subject:Re: [SQL] OFFSET impact on Performance???
Date:Thu, 20 Jan 2005 19:23:12 +0000
On Thu, 2005-01-20 at 19:12 +0000, Ragnar Hafstaš wrote:
> On Thu, 2005-01-20 at 11:59 -0500, Greg Stark wrote:
>
> > The best way to do pages for is not to use offset or cursors but to use an
> > index. This only works if you can enumerate all the sort orders the
> > application might be using and can have an index on each of them.
> >
> > To do this the query would look something like:
> >
> > SELECT * FROM tab WHERE col > ? ORDER BY col LIMIT 50
> >
> > Then you take note of the last value used on a given page and if the user
> > selects "next" you pass that as the starting point for the next page.
>
> this will only work unchanged if the index is unique. imagine , for
> example if you have more than 50 rows with the same value of col.
>
> one way to fix this is to use ORDER BY col,oid

and a slightly more complex WHERE clause as well, of course

gnari



---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match
From:Andrei Bintintan
Subject:Re: [SQL] OFFSET impact on Performance???
Date:Fri, 21 Jan 2005 11:20:48 +0200
Now I read all the posts and I have some answers.

Yes, I have a web aplication.
I HAVE to know exactly how many pages I have and I have to allow the user to
jump to a specific page(this is where I used limit and offset). We have this
feature and I cannot take it out.


>> > SELECT * FROM tab WHERE col > ? ORDER BY col LIMIT 50
Now this solution looks very fast, but I cannot implement it, because I
cannot jump from page 1 to page xxxx only to page 2. Because I know with
this type where did the page 1 ended. And we have some really complicated
where's and about 10 tables are involved in the sql query.

About the CURSOR I have to read more about them because this is my first
time when I hear about.
I don't know if temporary tables are a solution, really I don't think so,
there are a lot of users that are working in the same time at the same page.

So... still DIGGING for solutions.

Andy.

----- Original Message -----
From: "Ragnar Hafstaš"
To:
Cc: "Andrei Bintintan" ;
Sent: Thursday, January 20, 2005 9:23 PM
Subject: Re: [PERFORM] [SQL] OFFSET impact on Performance???


> On Thu, 2005-01-20 at 19:12 +0000, Ragnar Hafstaš wrote:
>> On Thu, 2005-01-20 at 11:59 -0500, Greg Stark wrote:
>>
>> > The best way to do pages for is not to use offset or cursors but to use
>> > an
>> > index. This only works if you can enumerate all the sort orders the
>> > application might be using and can have an index on each of them.
>> >
>> > To do this the query would look something like:
>> >
>> > SELECT * FROM tab WHERE col > ? ORDER BY col LIMIT 50
>> >
>> > Then you take note of the last value used on a given page and if the
>> > user
>> > selects "next" you pass that as the starting point for the next page.
>>
>> this will only work unchanged if the index is unique. imagine , for
>> example if you have more than 50 rows with the same value of col.
>>
>> one way to fix this is to use ORDER BY col,oid
>
> and a slightly more complex WHERE clause as well, of course
>
> gnari
>
>
>


---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match
From:Ragnar Hafstaš
Subject:Re: [SQL] OFFSET impact on Performance???
Date:Thu, 20 Jan 2005 19:12:06 +0000
On Thu, 2005-01-20 at 11:59 -0500, Greg Stark wrote:

> The best way to do pages for is not to use offset or cursors but to use an
> index. This only works if you can enumerate all the sort orders the
> application might be using and can have an index on each of them.
>
> To do this the query would look something like:
>
> SELECT * FROM tab WHERE col > ? ORDER BY col LIMIT 50
>
> Then you take note of the last value used on a given page and if the user
> selects "next" you pass that as the starting point for the next page.

this will only work unchanged if the index is unique. imagine , for
example if you have more than 50 rows with the same value of col.

one way to fix this is to use ORDER BY col,oid

gnari



---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings
   

Copyright © 2006 knowledge-database   -   All rights reserved