Things To Keep In Mind While Building The Database

Things to Optimize

OR in the Join Predicate/WHERE Clause Across Multiple Columns

SQL Server can efficiently filter a data set using indexes via the WHERE clause or any combination of filters that are separated by an AND operator. By being exclusive, these operations take data and slice it into progressively smaller pieces, until only our result set remains. OR is a different story. Because it is inclusive, SQL Server cannot process it in a single operation. Instead, each component of the OR must be evaluated independently. When this expensive operation is completed, the results can then be concatenated and returned normally. The scenario in which OR performs worst is when multiple columns or tables are involved. We not only need to evaluate each component of the OR clause, but need to follow that path through the other filters and tables within the query. Even if only a few tables or columns are involved, the performance can become mind-bogglingly bad. Here is a very simple example of how an OR can cause performance to become far worse than you’d ever imagine it could be:

SELECT DISTINCT

 PRODUCT.ProductID,

    PRODUCT.Name

FROM Production.Product PRODUCT

INNER JOIN Sales.SalesOrderDetail DETAIL

ON PRODUCT.ProductID = DETAIL.ProductID

OR PRODUCT.rowguid = DETAIL.rowguid;

This is an incredibly bad runtime. Here is a better solution, using UNION SELECT:

SELECT

PRODUCT.ProductID,

PRODUCT.Name

FROM Production.Product PRODUCT

INNER JOIN Sales.SalesOrderDetail DETAIL

ON PRODUCT.ProductID = DETAIL.ProductID

UNION

SELECT

PRODUCT.ProductID,

PRODUCT.Name

FROM Production.Product PRODUCT

INNER JOIN Sales.SalesOrderDetail DETAIL

ON PRODUCT.rowguid = DETAIL.rowguid

Pagination

Reference for implementing pagination in the SQL server - should be fairly similar in the Aurora RDS setting as well (https://www.sqlshack.com/introduction-pagination-sql-server/)

Exporting the Data

Here is the reference for exporting the data - currently thinking of building some sort of streaming API to do the bulk download, but this is something that needs to be fleshed out more https://support.spatialkey.com/export-data-from-database-to-csv-file/


uid: 202007240825 tags: #amazon


Date
February 22, 2023