Intro
I always wonder which is the best way to do certain Postgres and PostGIS functions. It's often hard to tell if a certain method is more efficient than another. I'll update this post as I test more functions and methods out to see who's the top dog.
ST_Buffer
ST_Buffer. Pretty simple, it makes a buffer based on your geometry. I often use it to make, say, a 10 meter buffer for every point in a table
geom vs ST_Point
I always create a geom column when I have x and y data for points. But I was wondering if it's worth it. With a large dataset, creating the geom
column can be very time consuming.
I took a table with 1,488,942 rows of data and tested this out.
The geom
column (no index) method:
SELECT *, ST_Buffer(geom, 50) AS buffer
FROM my_table;
The query completed in 163.5 seconds.
The ST_Point method:
SELECT *, ST_Buffer(ST_Point(lon, lat), 50) AS buffer
FROM my_table;
They query completed in 280.1 seconds.
So using geom
was way faster. Probably worth creating the geom
column if you are planning on doing any buffering.
JOIN vs subquery
Using JOIN or a subquery on a table with multiple entries per, for example, user. About 10 million rows.
Subquery
SELECT *
FROM user_data
WHERE
user_id IN
(
SELECT DISTINCT user_id
FROM user_data
WHERE
purchased_item = 'something'
)
AND date = ('2022-02-02')
total time = 58 seconds
JOIN
SELECT *
FROM user_data a
JOIN
(
SELECT DISTINCT user_id
FROM user_data
WHERE
purchased_item = 'something'
) b
ON (a.user_id = b.user_id)
WHERE date = ('2022-02-02')
total time = 92 seconds.
Subquery is substantially faster.