Inconsistent Portfolio Optimization

Hello all,

I am studying the portfolio optimization demo and ran the code for 3 portfolio scenarios:
  1. 4 stocks = ['AAPL', 'WMT', 'AAL', 'MSFT'] - same as demo
  2. 20 selected S&P 500 stocks including the 4 from item 1
  3. 50 random S&P 500 stocks including the 4 from item 1

Here are the common parameters for all jobs:

  * dates between 2018-10-01 and 2020-01-01 (monthly)
  * t_cost = 0.01
  * alpha = 0.005
  * budget = 1000

Results are attached.

My question: why did the 20- and 50-stock portfolios perform so poorly compared to the 4-stock portfolio and the baseline? My understanding of the problem was that, since the 4 initial stocks were included in scenarios 2 and 3, in the worst case the optimizer would recover the result obtained in scenario 1. Checking the Hamiltonian, I didn't find a penalty term for concentrating the portfolio in a few stocks, so I see no reason why a subset of just 4 could not be selected from the 20 or 50. What am I missing here?

Is any parameter tuning needed for these cases? In principle, 20 stocks should not be such a heavy problem for Pegasus.

Thank you!



1 comment
  • Hello,

    I spoke with a colleague who has worked closely with the portfolio optimization example and this was his response to our inquiry:

    Hi Bruno,

    Thanks for the interesting question and comparison.  The portfolio optimization demo is solving a specific mathematical optimization problem that is shown in the README.  This optimization problem is formulated so that the goal is to choose a portfolio that prioritizes assets that have historically high returns, while penalizing the inclusion of pairs of assets that have historically high covariance (which increases risk).  However, it is important to recognize that this objective function is only a model that is attempting to guide us towards a portfolio that will provide "good" performance in the future (as defined by the risk-returns tradeoff parameter).  In essence, "future" or "actual" performance is what you are seeing in the visual output of the demo, but that is not identical to what the model itself is solving for.
    As the saying goes, "past performance is no guarantee of future results".  The multi-period demo is intended to be realistic in that when you run a simulation from 2018-2020, the rebalancing problem that it solves at each period is only given access to the prior historical data.  The plot that is generated in the end is showing how the performance of the portfolios actually played out over time, but those are actually "future" results from the perspective of the set of portfolio optimization rebalancing problems that were solved.
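
    To make the "only prior data" point concrete, here is a minimal sketch of a walk-forward loop. It is not the demo's code: the selection rule (pick the asset with the best historical mean return) is a deliberately simple stand-in for the optimizer, and the returns, the function name walk_forward_picks, and the warmup parameter are all made up for illustration.

```python
import numpy as np

def walk_forward_picks(returns, warmup=3):
    """For each period t >= warmup, pick the asset with the highest mean
    return over periods [0, t) only, then record what it earned in period t."""
    n_periods, _ = returns.shape
    picks, realized = [], []
    for t in range(warmup, n_periods):
        history = returns[:t]                       # data strictly before period t
        best = int(np.argmax(history.mean(axis=0)))  # stand-in for the optimizer
        picks.append(best)
        realized.append(float(returns[t, best]))     # outcome unknown when choosing
    return picks, realized

# Synthetic monthly returns for 5 hypothetical assets over 15 periods.
rng = np.random.default_rng(0)
returns = rng.normal(0.01, 0.05, size=(15, 5))

picks, realized = walk_forward_picks(returns)
hindsight_best = int(np.argmax(returns.mean(axis=0)))

print("asset picked at each period:", picks)
print("best asset in hindsight:    ", hindsight_best)
```

    The picks at each period depend only on earlier rows of the data, so nothing forces them to match the asset that looks best once the whole window is known.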

    In your example, the 4 individual stocks apparently performed very well over that particular period from 2018-2020.  However, the portfolio optimization scenarios that included 20 or 50 total assets don't necessarily "know" to choose those same 4 stocks, since each one is solving a rebalancing problem that only has access to pricing data for the time periods prior to each given rebalancing period.  The fact that the optimization model based on historical pricing does not necessarily "pick the winners" should not be too surprising given the inherent challenges in predicting future performance.  Moreover, the portfolio optimization model is not only trying to maximize expected returns, it is also trying to avoid risk from concentrating the portfolio in assets whose prices moved similarly historically.  This inherently pushes towards more diversified portfolios, and it should even penalize over-investment in an individual asset due to the "self interaction" term that appears in the objective function.
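
    The effect of the risk term can be sketched with a toy example. This assumes a generic mean-variance objective, mu.w - risk_aversion * w^T Sigma w, which may differ in detail from the formulation in the README. It also uses fractional weights instead of share counts, brute-force search instead of a quantum or hybrid solver, and invented numbers. The name risk_aversion is hypothetical; it plays a role similar to what a risk-returns tradeoff parameter would.

```python
import itertools
import numpy as np

def best_allocation(mu, sigma, risk_aversion, steps=10):
    """Brute-force the allocation (in units of 1/steps of the budget) that
    maximizes mu.w - risk_aversion * w^T sigma w."""
    n = len(mu)
    best_units, best_val = None, -np.inf
    for units in itertools.product(range(steps + 1), repeat=n):
        if sum(units) != steps:            # require the budget to be fully invested
            continue
        w = np.array(units) / steps
        # The diagonal entries of sigma contribute sigma[i, i] * w[i]**2:
        # these are the "self interaction" terms that penalize concentration.
        val = mu @ w - risk_aversion * (w @ sigma @ w)
        if val > best_val:
            best_units, best_val = units, val
    return best_units, best_val

# Invented expected returns and covariances for 3 hypothetical assets.
mu = np.array([0.10, 0.09, 0.08])
sigma = np.array([[0.04, 0.03, 0.00],
                  [0.03, 0.04, 0.00],
                  [0.00, 0.00, 0.04]])

greedy, _ = best_allocation(mu, sigma, risk_aversion=0.0)
averse, _ = best_allocation(mu, sigma, risk_aversion=5.0)

print("no risk term   (tenths of budget):", greedy)
print("with risk term (tenths of budget):", averse)
```

    With the risk term switched off, the whole budget goes to the asset with the highest expected return. With it switched on, the same search spreads the budget across assets, which is why a larger universe does not simply collapse back to the best few stocks.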

    Hopefully this provides some insight to help you make more sense of the results.

    Please let us know if you have any questions.
    Thank you for reaching out to us with your questions.
