Things To Keep In Mind While Building The Database
Things to Optimize
OR in the Join Predicate/WHERE Clause Across Multiple Columns
SQL Server can efficiently filter a data set using indexes via the WHERE clause or any combination of filters that are separated by an AND operator. By being exclusive, these operations take data and slice it into progressively smaller pieces, until only our result set remains. OR is a different story. Because it is inclusive, SQL Server cannot process it in a single operation. Instead, each component of the OR must be evaluated independently. When this expensive operation is completed, the results can then be concatenated and returned normally. The scenario in which OR performs worst is when multiple columns or tables are involved. We not only need to evaluate each component of the OR clause, but need to follow that path through the other filters and tables within the query. Even if only a few tables or columns are involved, the performance can become mind-bogglingly bad. Here is a very simple example of how an OR can cause performance to become far worse than you’d ever imagine it could be:
SELECT DISTINCT
PRODUCT.ProductID,
PRODUCT.Name
FROM Production.Product PRODUCT
INNER JOIN Sales.SalesOrderDetail DETAIL
ON PRODUCT.ProductID = DETAIL.ProductID
OR PRODUCT.rowguid = DETAIL.rowguid;
This is an incredibly bad runtime. Here is a better solution, using UNION SELECT
:
SELECT
PRODUCT.ProductID,
PRODUCT.Name
FROM Production.Product PRODUCT
INNER JOIN Sales.SalesOrderDetail DETAIL
ON PRODUCT.ProductID = DETAIL.ProductID
UNION
SELECT
PRODUCT.ProductID,
PRODUCT.Name
FROM Production.Product PRODUCT
INNER JOIN Sales.SalesOrderDetail DETAIL
ON PRODUCT.rowguid = DETAIL.rowguid
Pagination
Reference for implementing pagination in the SQL server - should be fairly similar in the Aurora RDS setting as well (https://www.sqlshack.com/introduction-pagination-sql-server/)
Exporting the Data
Here is the reference for exporting the data - currently thinking of building some sort of streaming API to do the bulk download, but this is something that needs to be fleshed out more https://support.spatialkey.com/export-data-from-database-to-csv-file/
uid: 202007240825 tags: #amazon