Abstract

You have 11.256 christmas trees in stock. A customer wants to purchse 1.500 of them, with one condition: the average height of the trees must be 6.50 meters.

Your goal is to keep the remaining trees as close to a normal distribution as possible.

distribute_sample_normally_diagram

How can you achieve this?

A Sample Calculation

distribute_sample_normally

Let’s assume the count and distribution of trees are as shown in the diagram above.

A useful first step is to check whether your original sample is already fairly normally distributed. We can calculate skewness using the function sbSWV. The skewness is approximately =sbSWV(“SKEW.P”;$A$4:$A$21;B$4:B$21) = -0.35. Similarly, we calculate kurtosis with =sbSWV(“KURT”;$A$4:$A$21;B$4:B$21), resulting in approximately 0.95. As shown in the diagram (yellow-orange graph), the original sample is already fairly normally distributed.

However, ideally, the distribution would match the one shown in column C (formula: =TRANSPOSE(RoundToSum(NORM.DIST(A4:A21,B24,B25,FALSE)*B22/10,0))). In this case, skewness and kurtosis would be close to zero, though rounding may lead to slight deviations.

Column H displays the ideal normal distribution after the withdrawal of trees.

The automated withdrawals in column D aim to achieve this ideal distribution. However, this is only possible if there are enough trees of the necessary lengths. If not, you must enter a withdrawal of 0, as trees cannot be added to the sample. For instance, in the diagram, you can see that at a length of 6.10 meters, the ideal distribution exceeds the actual remaining distribution.

The original formulas in column F should read =D4 to =D21.

These formulas are manually overwritten to:

  • Achieve a total withdrawal of 1,500 trees.
  • Ensure an average tree height of 6.5 meters.
  • Maintain a standard deviation in the remaining sample close to the original.
  • Reduce the absolute skewness compared to the original.
  • Reduce the absolute kurtosis compared to the original.

In the provided sample file, significant deviations are highlighted using conditional formatting.

A Note of Caution:

It’s not always possible to achieve a fairly normal distribution in the remaining sample. It might even be impossible to withdraw trees that meet a desired average - for example, asking for 21 trees with an average height of 5.60 meters could be unachievable.

Helper Functions

Excel offers many basic statistical functions, but they don’t handle weighted values. The user-defined function sbSWV (statistics for weighted values) used here provides an easy and quick assessment of how well the samples are normally distributed.

To ensure that the sum of the ideal integer distributions matches the sum of the original samples, the user-defined function RoundToSum was employed. Note that the parameter 2 is used for error type to minimize the relative error, preventing artificial rounding errors, particularly in the tails of the distributions.

Download

Please read my Disclaimer.

distribute_sample_normally.xlsm [41 KB Excel file, open and use at your own risk]