Because of its particular features and ‘always parallel’ implementation, Teradata must have a very good distribution amongst AMP.
This will enhance performances, query time execution, and space.
In this example, I’m going to explain how you can dramatically affect your space management with a bad distribution
Here is the postulate (I make it simple purposely in order things to be clear);
- Your system has 10 AMP
- One database on your system is 100Gb large (100 Gb of maxperm).
- There is one table in the database
That means, the system automatically dispatches the space, dividing 100Gb by 10 AMP, so 10Gb by AMP
Now if you choose bad distribution for your table, by choosing badly your PI (Primary Index) that is not reflecting a good selectivity, you can have the following figure
In theory, on 100Gb, only 20Gb are used and you can pick back 80Gb of free space.
In theory, yes you can! But practically, you can not
Just try to perform against your database a move space for 80Gb, and you will receive an error message saying you can not perform this operation…
Why? Because if you try to remove 80GB amongst 10AMP, the system tries to remove equally 80/10 so 8Gb on each AMP. It will work for AMP 1 to 9, but not for AMP 10!!! Data are stored on this AMP.
The real space you can only pick back is the following
- You take the most loaded AMP, take the Maxperm (10Gb) minus currentperm for this AMP (9Gb), so 1Gb
- You multiply this 1Gb by the number of AMP (10) and you obtain the quantity of space you can really pick back, that means, 10 X 1 = 10Gb
If you compare this 10Gb to 80Gb theoretically moveable, what a difference, right?
70GB will stay free, but never really usable, and never moveable…
So here is a query you can perform against your system to know, the amount of space you can really move:
SELECT TRIM(C.databasename) AS DATABASENAME, C.VPROC_COUNT AS VPROC_COUNT, C.BIGGER_SPACE AS AMP_WITH_MORE_USED_SPACE, C.MAXPERM AS MAXPERM_BY_AMP, CASE WHEN C.BIGGER_SPACE = 0 THEN 0 WHEN C.BIGGER_SPACE <> 0 AND C.MAXPERM <>0 THEN (C.MAXPERM - C.BIGGER_SPACE) * VPROC_COUNT ELSE 0 END AS REUSABLE_SPACE FROM ( SELECT DatabaseName, COUNT(VProc) , CAST(MAX(CurrentPerm) AS BIGINT), CAST(MAX(MaxPerm) AS BIGINT) FROM DBC.DiskSPace GROUP BY 1 ) C (DatabaseName, VPROC_COUNT, BIGGER_SPACE, MAXPERM) WHERE REUSABLE_SPACE > 1000000000 AND DatabaseName NOT IN ('DBC', 'Crashdumps', 'viewpoint', 'SystemFe', 'SYSBAR') ORDER BY REUSABLE_SPACE DESC