Navbar: home
Purge Script 2.0
Problem:
Goal:
Outcome:
The existing purge tool for an Access-based accounting system removed more records than it should have, affecting the accuracy of clients' financial reports.
To build a new tool using SQL that accurately identifies customers, their contracts, and all related transactional records that qualify to be purged. For my own practice, I ported the tool to MySQL as well to utilize more modern functions.
We sped up reporting and processing times.
Process:
Identifying and isolating records to be purged...
01
Return all vacated contracts for location and save them to a temp table.
SELECT Site, Account
     INTO tempCustAcctVacant
FROM CU
WHERE Status = 2
AND CU.Site IN
     (SELECT Site FROM PurgeParameters)
AND CUMoveOutDate < ANY
     (SELECT TOP 1 CutOff FROM PurgeParameters);
02
Then the customers with contracts vacated after the cut off date.
SELECT Site, Account
     INTO tempCustAcctVacantAfter
FROM CU
WHERE Status = 2
AND CU.Site IN
     (SELECT Site FROM PurgeParameters)
AND CUMoveOutDate >= ANY
     (SELECT TOP 1 CutOff FROM PurgeParameters);
03
All customers currently occupying contracts:
SELECT CU.Site, CU.Account
INTO tempCustAcctOccupied
FROM CU
WHERE CU.Status IN (1, 5, 6)
AND CU.Site IN
     (SELECT Site FROM PurgeParameters);
04
Take the list of customers vacated prior to the cut off date and LEFT JOIN with the occupied. If a customer appears in only the "vacated prior" table, it is written to a temp table. This eliminates customers who vacated contracts prior to the cut off date but are still occupying other contracts.
SELECT DISTINCT tempCustAcctVacant.Account, tempCustAcctVacant.Site
INTO tempAcctFiltered
FROM tempCustAcctVacant
LEFT JOIN tempCustAcctOccupied
     ON (tempCustAcctVacant.Account = tempCustAcctOccupied.Account)
     AND (tempCustAcctVacant.Site = tempCustAcctOccupied.Site)
WHERE tempCustAcctOccupied.Account IS NULL
AND tempCustAcctOccupied.Site IS NULL;
05
That new temp table (vacated prior and not still occupied) is LEFT JOINed against the list of customers vacated after the cut off date, and filtered the same way. The results are customers who vacated contracts prior to the cut off date, are not still occupying any contracts, and have not vacated any contracts after the cut off date either.
SELECT DISTINCT tempAcctFiltered.Account, tempAcctFiltered.Site
INTO tempCustAcctPurge
FROM tempAcctFiltered
LEFT JOIN tempCustAcctVacantAfter
     ON (tempAcctFiltered.Site = tempCustAcctVacantAfter.Site)
     AND (tempAcctFiltered.Account = tempCustAcctVacantAfter.Account)
WHERE (tempCustAcctVacantAfter.Account IS NULL)
AND (tempCustAcctVacantAfter.Site IS NULL);
06
The last filter removes customers whose accounts include transactions that occured after vacating and after the cut off date. These results contain the unique customer ID and location ID of the filtered customers, which are written to a temp table. .
SELECT DISTINCT Account, Site
INTO tempVUIAccounts
FROM Tx
INNER JOIN TxDetail
ON Tx.TxID=TxDetail.TxID
WHERE TxDetail.TxCode IN (52,122,57)
AND Tx.Site IN
     (SELECT Site FROM PurgeParameters)
AND TxDetail.TransactionFromDate >=
     (SELECT TOP 1 CutOff FROM PurgeParameters)
ORDER BY Site, Account;
07
 
SELECT DISTINCT tempCustAcctPurge.Account, tempCustAcctPurge.Site
INTO tempCustAcctPurgeFinal
FROM tempCustAcctPurge
LEFT JOIN tempVUIAccounts
     ON (tempCustAcctPurge.Account = tempVUIAccounts.Account)
     AND (tempCustAcctPurge.Site = tempVUIAccounts.Site)
WHERE (tempVUIAccounts.Account IS NULL)
AND (tempVUIAccounts.Site IS NULL);
08
Based on that list of customer IDs, we start building lists of related transaction records, starting with the parent records and corralling the child records from a few other dependent tables.
SELECT Tx.Account, Tx.TxID
INTO tempTXIDsToPurge
FROM Tx
WHERE Tx.Account IN
     (SELECT Account FROM tempCustAcctPurgeFinal)
AND Tx.Site IN
     (SELECT Site FROM PurgeParameters);
09
--
SELECT TxDetailID
INTO tempTxDIDsToPurge
FROM TxDetail
WHERE TxID IN
     (SELECT TxID FROM tempTxIDsToPurge);
10
--
SELECT TxAppID
INTO tempTxAppliedToPurge
FROM TxApp
WHERE TxDetailID IN
     (SELECT TxDetailID FROM tempTxDIDsToPurge);
11
--
SELECT TxBalID
INTO tempTxBalToPurge
FROM TxBal
WHERE TxDetailID IN
     (SELECT TxDetailID FROM tempTxDIDsToPurge);
12
--
SELECT TxPTDID
INTO tempTxBalPTDToPurge
FROM TxBalPaidThru
WHERE TxBalID IN
     (SELECT TxBalID FROM tempTxBalToPurge);
XX
Prior to modifying any data, a temp table is created and displayed, containing the customer and contract information for the about-to-be-purged records. This is copied into Excel and saved for reference.
SELECT C.Site, C.Account, C.CustomerFirstName, C.CustomerLastName, U.UName, CU.Status, CU.CUMoveInDate, CU.CUMoveOutDate
INTO PurgedCustomers
FROM (C INNER JOIN
          (CU INNER JOIN U
           ON CU.UnitID = U.UnitID)
      ON (C.Account=CU.Account)
      AND (C.Site=CU.Site)
      )
INNER JOIN tempCustAcctPurgeFinal
ON (CU.Account=tempCustAcctPurgeFinal.Account)
AND (C.Account=tempCustAcctPurgeFinal.Account)
WHERE C.Account IN
     (SELECT Account FROM tempCustAcctPurgeFinal)
AND C.Site IN
     (SELECT Site FROM PurgeParameters);
XX
Using the final temp table, the data is purged from 35 tables.
XX
Start an Insert statement and filtering accounts through the CTEs
01
Select all vacant leases:
INSERT INTO CustAcctPurgeFinalReport
WITH
CustAcctVacant AS (
     SELECT SID, Account
     FROM CU
     WHERE Status = 2
     AND CU.SID IN (SELECT SID FROM PurgeParameters)
     AND EndDate < ANY (SELECT CutOff FROM PurgeParameters))
02
Select all leases vacated after the CutOff
, CustAcctVacantAfter AS (
     SELECT SID, Account
     FROM CU
     WHERE Status = 2
     AND CU.SID IN (SELECT SID FROM PurgeParameters)
     AND EndDate >= ANY (SELECT CutOff FROM PurgeParameters))
03
Select all occupied leases
, CustAcctOccupied AS (
     SELECT CU.SID, CU.Account
     FROM CU
     WHERE CU.Status IN (1, 5, 6)
     AND CU.SID IN (SELECT SID FROM PurgeParameters))
04
Join CustAcctVacant with CustAcctOccupied and only keep the customers that do NOT exist in CustAcctOccupied
, AcctFiltered AS (
     SELECT DISTINCT CustAcctVacant.Account, CustAcctVacant.SID
     FROM CustAcctVacant
     LEFT JOIN CustAcctOccupied
          ON CustAcctVacant.Account = CustAcctOccupied.Account
          AND CustAcctVacant.SID = CustAcctOccupied.SID
     WHERE CustAcctOccupied.Account IS NULL
          AND CustAcctOccupied.SID IS NULL)
05
Join AcctFiltered with CustAcctVacantAfter and keep the customers that do NOT exist in CustAcctVacantAfter
, CustAcctPurge AS (
     SELECT DISTINCT AcctFiltered.Account, AcctFiltered.SID
     FROM AcctFiltered
     LEFT JOIN CustAcctVacantAfter
          ON AcctFiltered.SID = CustAcctVacantAfter.SID
          AND AcctFiltered.Account = CustAcctVacantAfter.Account
     WHERE CustAcctVacantAfter.Account IS NULL
          AND CustAcctVacantAfter.SID IS NULL)
06
Select all customer records that registered a "Vacant Unit Income," "Reversal," or "Uncollected" transaction AFTER the CutOff
, VUIAccounts AS (
     SELECT DISTINCT Account, SID
     FROM Tx INNER JOIN TxD ON Tx.TxID=TxD.TxID
     WHERE TxD.TxCode IN (52,122,57)
          AND Tx.SID IN (SELECT SID FROM PurgeParameters)
          AND TxD.TransactionFromDate >= (SELECT CutOff FROM PurgeParameters))
07
Join CustAcctPurge with VUIAccounts and keep the customers that do NOT exist in VUIAccounts
, CustAcctPurgeFinal AS (
     SELECT DISTINCT CustAcctPurge.Account, CustAcctPurge.SID
     FROM CustAcctPurge
     LEFT JOIN VUIAccounts ON CustAcctPurge.Account = VUIAccounts.Account
          AND (CustAcctPurge.SID = VUIAccounts.SID)
     WHERE VUIAccounts.Account IS NULL AND VUIAccounts.SID IS NULL)
08
Start to build the output
SELECT C.SID
     , C.Account AS Acct
     , C.CustomerFirstName AS First_Name
     , C.CustomerLastName AS Last_Name
     , Us.UName AS Unit
     , CU.StartDate AS Move_In_Date
     , CU.EndDate AS Move_Out_Date
FROM C
INNER JOIN (CU INNER JOIN Us ON CU.UID = Us.UID)
     ON C.Account=CU.Account
     AND C.SID=CU.SID
JOIN CustAcctPurgeFinal
     ON CU.Account=CustAcctPurgeFinal.Account
     AND C.Account=CustAcctPurgeFinal.Account
09
Adding the category description, as users do not know the IDs
INNER JOIN LookupUnitStatus
     ON CU.Status = LookupUnitStatus.UnitStatusID
WHERE C.SID IN
     (SELECT SID FROM PurgeParameters)
AND C.Account IN
     (SELECT Account FROM CustAcctPurgeFinal)
ORDER BY C.Account, Us.UName;
01
Return all vacated contracts for location and save them to a temp table.
SELECT Site, Account
INTO tempCustAcctVacant
01
Return all vacated contracts for location and save them to a temp table.
SELECT Site, Account
INTO tempCustAcctVacant