Optimization and the Curse of Dimensionality
On a forum I regularly follow, Yahoo AmiBroker Forum, a message complained that the AmiBroker trading system development platform was unable to handle an optimization that had 10 parameters, each with 20 values. Whether that poster was serious about the numbers 10 and 20 or not, the question is worth considering.
A large number of parameters, say 10, each with a large number of evaluation points, say 20, leads directly to the “curse of dimensionality.” Assuming that it takes one nano second to evaluate each alternative, and that everything works as it should and the run does finish (no power failure, memory failure, cpu failure, hard disk failure, flood, earthquake, premature death, etc), exhaustive evaluation of the n = 10^20 points will have taken about 3100 years. At 50 lines per 12 inch page, the listing of the results will be 65 light years long. (The nearby universe is described as being those bodies within about 15 light years.)
Each result gives the value of an objective function. Sorting the results into ascending or descending order according to the value of the objective function puts the set of values that are “best” at the top of the list. Assuming a sorting algorithm that takes Order (n*ln(n)) is used, it will take an additional 46 times 3100 years to sort them.
Now make an individual run using each of the sets of values near the top to gain some confidence that the alternative ranked first is the one that is preferred. Maybe an hour or two more?
Just in time to give a signal for tomorrow’s trade?
————————
What is a reasonable number of alternatives to evaluate?
I recommend beginning by giving a considerable amount of thought and experimentation to design and testing of the objective function. Have confidence that the alternative with the highest objective function score is the one that is preferred. Or at least that it is acceptable — not letting the search for perfect get in the way of finding a satisfactory solution.
Run some tests using your system, your objective function, your data, on your computer. How long does it take to generate and sort 1000 or 10,000 alternatives? How many of these test runs do you expect to make before deciding on the logic and parameters that will be used? How many hours are you willing to allocate to the process? Working the arithmetic backwards will give an estimate of the number of alternatives that can realistically be tested. Continuing to work backward, you can compute the size of the parameter space that can be exhaustively processed in that amount of time. Make some trial runs using both exhaustive and non-exhaustive methods. Compare the results that rank best from each set of runs. Decide whether you can accept the results of the non-exhaustive method. If so, use it; if not, modify the afl code so that the number of parameters and number of evaluation points for each create an examination space that can be processed in the time you have allocated.
Thanks to Tomasz for creating a superb toolset that enables us to have these choices.
Thanks for listening,
Recent Comments