Percentile Rank and Value Functions

Earlier this year for our Premium Access Users we introduced the PREVAL function, or Previous Value function, which made it possible to access a prior period’s data.  Whether it was screening for funds with a rising Moving Average or comparing returns from a past low, this function has found widespread use.

Today for our Premium Users we are happy to add the pRANK function for Percentile Ranks and the VALUE function for determining a field value for a specified symbol.  These two functions add tremendous power to our screening and backtesting capabilities, and we will demonstrate samples of their use below.  These functions have been in the works for some time and offered a number of technical challenges to overcome.  In the end, it still takes about 15 to 20 seconds to calculate a full set of pRANK values, but with caching and a few other tricks we have tried to minimize this impact.  That said, you can still expect some lengthy calculations at times.

The pRANK function is of the general form pRANK(expression) with an optional form of pRANK(expression, INVERTED). The ‘expression’ value can be a single field or a mathematical combination of fields.  Percentile ranks are calculated from 0 to 99.99 with lower ‘expression’ values having the lower pRank values, unless the optional INVERTED keyword is included.  This optional keyword reverses the order such that the lower ‘expression’ values have the higher pRank values.  This is convenient when using an expression like volatility where lower is better, compared to a Total Return field where higher is better.  As a note to those who care about the details, any ties are rounded up in the pRank calculations regardless of sort order.
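As a quick illustration, here are two hypothetical User Variable definitions using both forms (the volatility field name here is made up for the example; substitute whatever volatility field your screen uses):

    DEFINE [_uV1] = pRANK([Rtn-3mo])                 <= higher return gets the higher rank
    DEFINE [_uV2] = pRANK([Volatility], INVERTED)    <= lower volatility gets the higher rank

Both variables end up on the same 0 to 99.99 scale with “bigger is better,” which is what makes them easy to combine.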

The VALUE function is of the general form VALUE(expression, symbol).  Like above, the ‘expression’ value can be a single field or a mathematical expression.  The symbol argument is just that: any valid ETF symbol.  Let’s say you want to screen for all funds where 3-month performance is better than that of the QQQ.  The filter could be set up as [Rtn-3mo] > VALUE([Rtn-3mo], QQQ).  Likewise, if you are sorting by a User Variable such as [_uV1] = pRANK([Rtn-3mo] + .25 * [Rtn-6mo]), the QQQ value for that pRANK could be referenced as VALUE([_uV1], QQQ).

As a reminder, the PREVAL function is of the form PREVAL(expression, offset).  Expression is the same as above, and offset is the number of days prior to pull the value.

In case there is any confusion, let us lay out the difference between the PREVAL and VALUE functions.  PREVAL compares a value for a symbol with the same value for the same symbol some number of days earlier.  For example, compare the SMA-50 with the SMA-50 five days earlier to determine whether it is rising or falling.  The VALUE function returns a single value for a date, for a specified field and symbol, which all other funds can be compared against.  For example, compare each fund’s 3-month return to that of QQQ for that same date.
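To make the contrast concrete, here is a hypothetical pair of filter rules ([SMA-50] is an assumed field name for the 50-day moving average):

    WHERE [SMA-50] > PREVAL([SMA-50], 5)       <= same symbol vs. itself five days earlier (rising SMA-50)
    WHERE [Rtn-3mo] > VALUE([Rtn-3mo], QQQ)    <= every symbol vs. one reference symbol (beats QQQ)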

Below are links to some quick examples, not selected for performance reasons, just to demonstrate their use.  Note it makes little sense to use pRank just to merge two “Rtn” values for sorting, since the resulting order will generally be the same.  So, our first example will risk-adjust returns.  This is normally difficult to do because the units are different, but once they are standardized on a 0 to 100 scale there are several ways to do it.

Click image to go to screen

In this example, we are first standardizing a blended return value and then subtracting up to 10 from that based on a standardized volatility measure.

If you are curious about that calculation, remember that both [_mom] and [_vol] have a 0 to 99.99 range, so the maximum amount subtracted would be 9.999 for the most volatile symbol on that date.  Here we created three separate UV’s for this demonstration; therefore, we could see the values in the screen results.  If desired, this could all be combined into a single UV to make the code more compact, thus freeing up remaining UV’s.
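As a rough sketch, the three UV’s described above might look something like the following (the volatility field name and the exact weights are illustrative, not necessarily those of the linked screen):

    DEFINE [_mom] = pRANK([Rtn-3mo] + .25 * [Rtn-6mo])    <= standardized blended return, 0 to 99.99
    DEFINE [_vol] = pRANK([Volatility])                    <= standardized volatility, 0 to 99.99
    DEFINE [_adj] = [_mom] - .1 * [_vol]                   <= subtracts up to 9.999 for the most volatile
    SORT desc [_adj]

Keeping the three UV’s separate lets each value show up in the screen results; the whole thing could equally be collapsed into one DEFINE.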

One additional note on two small changes made to the screener that are evident here. First, the old code sorted the “DEFINE” statements in alphabetical order prior to saving a screen, resulting in situations where a second or third listed UV might be needed to calculate the first UV, which made the logic hard to follow.  Second, the code would delete a UV if it was not used in a filter or sort rule, even if it was referenced in another UV definition.  Both issues have been corrected, and there is no longer any need to add frivolous filter statements.

Click image to go to screen

This next example illustrates defining a custom rs (relative strength) value, filtering for funds where this rs value is greater than that of the QQQ, and then sorting on Rtn-3mo.  Surprisingly, as this is written, only 7 of the 101 funds in the list pass this filter.  Depending on your desired logic, this function lets us set a minimum performance level below which we would rather not put assets at risk.
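A hypothetical version of such a screen (the rs definition here is illustrative; the linked screen may define it differently):

    DEFINE [_rs] = pRANK(.5 * [Rtn-3mo] + .5 * [Rtn-6mo])    <= custom relative-strength value
    WHERE [_rs] > VALUE([_rs], QQQ)                           <= must beat QQQ’s rs value
    SORT desc [Rtn-3mo]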

We hope you will try these functions out and evaluate where they can add value to your decision making.  From this point, we see opportunities to fine tune our processes to further insulate our users from the processing delays.  We also see the possibility to take what we have learned here and use it to create custom versions of any standard indicators.  This could move our screening capabilities up another significant notch.  Please let us know what you would like to see next.

 

29 thoughts on “Percentile Rank and Value Functions”

  1. I confess I find the illustrations of how these new functions work somewhat baffling. The main thing I would like to do is actually very simple: I would like to be able to sort a list of ETFs using two different return criteria, one longer, the other one shorter, taking the top x based on the longer return, and then, as a final sort, the top y from the shorter one. It seems to me that pRANK should be able to do this, but it’s also possible that I just completely misunderstand what it is measuring. Is it possible, for instance, to identify the ETFs “above the 50th percentile” in a given list? I think that would do it.

    Thanks for all the good work. I wish I understood it better.

    Baltassar

    • Thanks for the opportunity to discuss this a little more, because you can do exactly what you mention. The screen linked below calculates a percentile rank on the 6-month return, filters for that percentile rank > 50, and then sorts by 3-month return. I also set a dummy filter on 6-month return just so it would also show up in the screen output and we could visualize the relationship between the numbers.

      Definition:
      GRP: All Funds
      DEFINE _rsLT = pRANK([Rtn-6mo])
      WHERE [Rtn-6mo] > -100   <= Note this does nothing.
      WHERE [_rsLT] > 50       <= Takes top 50% of performers
      SORT desc [Rtn-3mo]

      Link: http://www.etfscreen.com/screener.php?sn=6ee34

      I recognized the write-up could be confusing when I posted it, but didn’t know a way to cover all the details without it being technical. These functions can be used to achieve some complex results, but they can also be used very simply. I apologize for not focusing more on the simple applications first. Thank you for the question and for giving me that opportunity now. (I have an excuse for making things difficult – I’m an engineer.)

      – Hugh

      • I guess I don’t understand what universe percentile rank is using in its calculation. If I replace “all funds” in your example with a specified list of 10 funds, I would expect the top 50% of performers to include exactly five funds. It doesn’t. I just tried it with a list I use, and I get three funds as “current picks”. If I run the same screen with the same ten funds as of October 2011, I get six. So I’m guessing that “percentile rank” is calculating the percentile rank of the funds I’m specifying against the entire universe of funds the site follows?

        Anyway, that’s what’s confusing me.

        Appreciate the help.

        • You are right, percentile ranks are measured across the entire universe, not just the funds in your screen. We considered both and came down on the side of using the entire universe as RSf does. If there is interest we should be able to rank just the symbols in the screen. We could do that with a new function or with an optional argument in the pRank function.

          If we are evaluating only the selected funds would you rather see a percentile rank or a “1 to n” type ranking where n is the number of symbols?

          With the latter we would have a technical issue in handling the situation where not all funds have all data, and how that would affect the remaining funds’ rankings. Otherwise it is easier than a percentile ranking, although we already have the code written for pRank.

          – Hugh

  2. My intuitive approach would be to use a “1 to n” type ranking; but it seems to me that percentile rank could be employed to pretty much the same purpose, if its calculations were confined to the chosen universe. It might even be superior. I always define my own universe of ETFs and screen within that universe. Now that I understand what pRANK is actually measuring, though, I’m definitely going to investigate it as a tool.

    Baltassar

    • Thanks for the feedback and I agree that the two methods should be interchangeable. The rank function can yield some unexpected numbers if, on a given date, some of the funds did not exist, or did not have sufficient activity to pass our filters. Percentile rank will always return a 0-100 scale regardless of these issues, but could still give some unexpected results.

      If anyone else is reading this and has an opinion I would really like to hear it. – Hugh

  3. Hugh, is it possible to use the market timing rule with indexes?

    For example, using VIX or VXV instead of SPY in the market timing data field.

      • Both VXX and VXZ suffer decay and their performance bears little correlation to the VIX index itself, which is mean reverting. Do you plan to implement this in the near future?

        Also, would it be possible to add some fundamental data for ETFs, such as CAPE, for backtesting?

        • You’re right. I wasn’t thinking well when I suggested ETFs in place of the indexes. Unfortunately I don’t see adding indexes, or fundamental data for that matter, anytime in the near term. This would preclude CAPE type analysis. Sorry about that. – Hugh

  4. Hugh,

    I have a request unrelated to your post. I would like the lin/log toggle to be available on the charts given for each individual cycle. Thanks.

  5. Hugh, would it be possible to have both a beginning and ending date for backtesting.

    For example, I would like to test the performance of a strategy in different years: 2008 (bear market), 2011 (drop in bull market) and 2013 (roaring bull market), separately. I would like to be able to confirm it will work in different market environments. When will it be possible?

      • Our current system limits us to evaluating complete cycles only (except for the most recent). We should be able to specify ending dates and then end each cycle with the first full-cycle date after that specified date. The downside to this is that it does not give you specific beginning and ending dates, but the plus side is that you get multiple cycles through the tested time period. I would envision an output page similar to the Results by Cycle page (with a chart). How would that sound? – Hugh

        • I just want to be able to backtest my strategies for every year. For example, first 2008 and then 2013, so as to be able to probe different market environments. When will it be possible?

          An output page would do if it allows me to check the profit/loss output for different markets environments (for example, check the output only for 2008 or only for 2013). Is that possible?

          • That would be possible to the extent that the beginning and ending dates must be on the defined cycle. This would likely mean the dates would not be exactly end of year dates (for example). I do believe they would be close enough to answer the question you are wanting answered. – Hugh

  6. Just an observation about backtesting specific backtest periods: it is already possible to copy the detailed results of any backtest cycle to a spreadsheet, which can then be parsed and probed in detail. Hard to see what else is really required.

  7. You are correct, Baltassar, and that is effectively what we would be doing. The procedure we are looking at is pulling in data from each cycle, and evaluating the periods starting with the first period after a beginning date and ending with the first period that ends after a requested ending date. The data, as you point out, is already shown on the Results for Cycle page. – Hugh

  8. We now have the option to “zoom-in” on a subset of dates from a model run. This option is available from the bottom of the Results by Cycle page, and produces a similar looking page for the range of dates requested. Note the additional line of chart links under each cycle.

    This option will be fully available tomorrow after the weekly update, but is available now for new screens. Comments are welcome. – Hugh

    • Hugh,
      It works. I like the grid.
      But I had hoped that when I clicked the “chart” link on the cycles page, it would open a page charting the new start and end dates. What do you think?
      moises

    • The chart is limited to the new start and end dates now. Is the problem the size of the chart? It could be on a separate page or we could pop up a larger chart on the current page. I’m open to suggestions. – Hugh

    • Good to hear, as it was the easiest way to get a larger chart. The small chart was definitely hard to see. Now back to work on a “rank” function. – Hugh
