The 18th Annual Intercompany Long-Term Care Insurance Conference March 18-21, 2018 - Paris Hotel & Casino - Las Vegas, NV

Pre-Workshop Homework & Resources

Start by downloading the software, then start your journey into predictive modeling with R!

1. Download the core R software here. Choose your option based on your operating system (Mac, Windows, Linux).
2. Go to the RStudio download page (link here) and select the entry under ‘Installers for Supported Platforms’ that matches your own computer’s operating system.
3. Install packages using the attached R program "0001 - Install packages.R". This code needs to be run only once and can take some time, so it is preferable to run it before arriving at the workshop.
4. Get familiar with R through this free introductory course from DataCamp: https://www.datacamp.com/courses/free-introduction-to-r
5. Now explore some helpful functions we will use with the pre-workshop material here: https://www.datacamp.com/courses/claim-terminations-test
6. Download the pre-work presentation, accompanying data, and R code, and run through them to get the background you will need for a more successful workshop experience.
7. Final workshop slides and R markdown files (Penalized GLM) are located here.
8. To gain further knowledge on topics discussed in the workshop, continue through the helpful material listed below.

#1. An Introduction to Statistical Learning, Chapter 2. Statistical Learning, key sections: 2.1 What is statistical learning? and 2.2 Assessing model accuracy (through 2.2.2)
http://www-bcf.usc.edu/~gareth/ISL/
https://www.r-bloggers.com/in-depth-introduction-to-machine-learning-in-15-hours-of-expert-videos/ (video lecture version of the textbook)
Objective: High-level lay of the land and an opportunity to spark interest to dig deeper into the text
Time estimate: 60-75 minutes (21 pages) or 30-35 minutes of video

#2. SOA Long Term Care Experience Basic Table Development, Appendix B. Generalized Linear Modeling Technical Background
https://www.soa.org/Files/Research/Exp-Study/2015-ltc-exp-basic-table-report.pdf
Objective: Short summary description of GLM
Time estimate: <5 minutes (1 page)

#3. Survival Models by Rodríguez, G. (2007), key sections: 7.1 The hazard and survival functions; 7.1.1 The survival function; 7.1.2 The hazard function; 7.3.7 Model fitting; 7.4.3 The equivalent Poisson model; 7.4.4 Time-varying covariates; and 7.4.5 Time-dependent effects
http://data.princeton.edu/wws509/notes/c7.pdf
These are lecture notes from Chapter 7 of Princeton University’s Generalized Linear Models course (http://data.princeton.edu/wws509/notes/)
Objective: The GLM Poisson survival model is equivalent to Cox, but has the benefit of working with aggregated data and accommodating partial exposures. Provides the math and proofs.
Time estimate: 30-45 minutes (11 pages)

#4. Non-Parametric Estimation in Survival Models by Rodríguez, G. (2005), key sections: 1 One sample: Kaplan-Meier; 1.1 Estimation with censored data; 1.2 Non-parametric maximum likelihood; and 1.4 The Nelson-Aalen estimator
http://data.princeton.edu/pop509/NonParametricSurvival.pdf
These are lecture notes from Section 2 of Princeton University’s Survival Analysis course (http://data.princeton.edu/pop509)
Objective: Refresher on survival modeling to lay the foundation for bridging traditional methods and more robust statistical learning methods
Time estimate: 10-15 minutes (4 pages)

#5. Bias-variance Tradeoff podcast by SOA Predictive Analytics and Futurism Section
https://www.soa.org/prof-dev/podcasts/predictive-analytics-podcasts/
Objective: Introduce this key concept of predictive analytics
Time estimate: 12 minutes (podcast)

#6. The Elements of Statistical Learning, Chapter 7. Model Assessment and Selection, key sections: 7.1 Introduction; 7.2 Bias, variance, and model complexity; 7.5 Estimates of in-sample prediction error; and 7.10 Cross-validation
http://statweb.stanford.edu/~tibs/ElemStatLearn/
Objective: Reiterates bias-variance and model complexity and introduces splitting data into calibration, validation, and testing data sets. Discusses using in-sample measures of fit (AIC, BIC) and then moves to cross-validation techniques, the latter becoming common practice in statistical learning due to the use of large datasets and/or advancements in computational power.
Time estimate: 45 minutes (14 pages)

#7. Cross validation and bootstrapping podcast by SOA Predictive Analytics and Futurism Section
https://www.soa.org/prof-dev/podcasts/predictive-analytics-podcasts/
Objective: Audio version to supplement a portion of the reading above, along with discussion of further resampling techniques that aid in training predictive models. Focuses on how to use these techniques to train models that generalize well to new data.
Time estimate: 24 minutes (podcast)

#8. Penalized Regression podcast by SOA Predictive Analytics and Futurism Section
https://www.soa.org/prof-dev/podcasts/predictive-analytics-podcasts/
Objective: Introduce penalized regression
Time estimate: 16 minutes (podcast)

#9. “Case Study: Improving Financial Projections for Long Term Care Insurance with Predictive Analytics”, by Missy Gordon and Joe Long
http://www.milliman.com/uploadedFiles/insight/2018/improving-projections-long-term-care-insurance.pdf
Objective: Introduction to the general concepts of progressing from traditional analytics to predictive analytics for developing LTC projection assumptions
Time estimate: 10-15 minutes (3 pages)

Extra credit

Advanced or post-seminar resources

Materials prepared by Eileen Burns and Matthias Kullowatz of Milliman for the Practical Predictive Analytics May 2016 Seminar that was produced by the SOA Predictive Analytics and Futurism Section. These documents walk through R code for a mini predictive modeling example. This example is unrelated to what will be performed at the LTC workshop, but gives a framework to explore.
http://ppas-2016.s3-website-us-east-1.amazonaws.com/20160516/PPAS_Practical_20160516_MAMK.pdf
http://ppas-2016.s3-website-us-east-1.amazonaws.com/20160516/PPAS_DataPrep_20160513.pdf
http://ppas-2016.s3-website-us-east-1.amazonaws.com/20160516/PPAS_Modeling_Validation_20160513.pdf
http://ppas-2016.s3-website-us-east-1.amazonaws.com/index.html (three datasets for the examples)

Materials prepared by Eileen Burns and Matthias Kullowatz of Milliman for the Practical Predictive Analytics May 2016 Seminar that was produced by the SOA Predictive Analytics and Futurism Section
https://www.rstudio.com/products/rstudio/download/
PPAS_IntrotoR_20160415.pdf
iris2.csv: This file is used to demonstrate table joining functionality in R. Download it to the working directory you intend to use for the introductory exercises.
https://cran.r-project.org/doc/contrib/Short-refcard.pdf
https://www.rstudio.com/resources/cheatsheets (Data Wrangling, in particular)
Objective: Install RStudio and be introduced to R coding
Time estimate: 2-8 hours, depending on programming background

A discussion on credibility and penalized regression, with implications for actuarial work by Hugh Miller, presented to the Actuaries Institute, key sections: 1. Background and 2. Credibility and penalized regression
http://actuaries.asn.au/Library/Events/ASTINAFIRERMColloquium/2015/MillerCredibiliyPaper.pdf
Objective: Connects penalized regression with “traditional” credibility methods
Time estimate: 15-30 minutes (8 pages)

Calibrating Risk Score: Model with Partial Credibility by Shea Parkes and Brad Armstrong
https://www.soa.org/Library/Newsletters/Forecasting-Futurism/2015/July/ffn-2015-iss11-parkes-armstrong.aspx
Objective: Application of penalized regression with an offset to update an existing assumption
Time estimate: 10-15 minutes (3 pages)

Applications of the offset in property-casualty predictive modeling from Casualty Actuarial Society E-Forum Winter 2009, key sections starting on: page 370 Exposure adjustments and the offset and page 376 Sequential modeling
https://www.casact.org/pubs/forum/09wforum/yan_et_al.pdf
Objective: Provides deeper theory on using an offset in a multiplicative model to update an existing assumption
Time estimate: 10-15 minutes (4 pages)

Generalized additive models (GAM)
http://multithreaded.stitchfix.com/blog/2015/07/30/gam/

SOA Predictive Analytics and Futurism Section podcasts: Decision trees; Random forests and gradient boosting machines
https://www.soa.org/prof-dev/podcasts/predictive-analytics-podcasts/

Deep Learning
https://en.wikipedia.org/wiki/Deep_learning

Applied machine learning guide
http://machinelearningmastery.com/start-here/

Info on scaling R
http://blog.revolutionanalytics.com/2016/10/tutorial-scalable-r-on-spark.html

SOA Predictive Analytics and Futurism Section Newsletter, December 2015
Includes articles (1) Getting Started in Predictive Analytics: Books and Courses by Mary Pat Campbell and (2) Johns Hopkins Data Science Specialization courses: A review by Shea Parkes
https://www.soa.org/Library/Newsletters/Predictive-Analytics-and-Futurism/2015/december/paf-iss12.pdf

Johns Hopkins Data Science Specialization courses reviewed in the above-referenced article
https://www.coursera.org/specializations/jhu-data-science
$29-49 per course; 10 courses available

Blogs and newsletters
http://dataelixir.com/
https://www.r-bloggers.com
http://www.statsblogs.com/
http://www.win-vector.com/blog/

Practice
https://www.kaggle.com/

Additional languages
http://www.sas.com/en_us/home.html
https://www.python.org/
http://julialang.org/downloads/
https://www.mathworks.com/products/matlab/?requestedDomain=www.mathworks.com
http://mc-stan.org/
https://www.ruby-lang.org/en/
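
Several of the readings above (reading #3 and the offset papers in particular) revolve around fitting a Poisson GLM to aggregated data, with exposure entering as an offset. As a rough warm-up, the sketch below shows what that can look like in base R. It is not the workshop code: the data are simulated, and the column names (duration, gender, exposure, terminations) and the assumed hazard are hypothetical choices made only for illustration.

```r
# Hypothetical aggregated claim-termination data: one row per cell.
# All names and numbers are made up for illustration only.
set.seed(2018)
cells <- expand.grid(
  duration = 1:6,                 # claim duration band
  gender   = c("F", "M"),
  stringsAsFactors = FALSE
)

# Exposure per cell; fractional values stand in for partial exposures.
cells$exposure <- round(runif(nrow(cells), 200, 800), 1)

# Assumed "true" termination hazard, used only to simulate counts.
true_hazard <- 0.10 * exp(-0.15 * (cells$duration - 1)) *
  ifelse(cells$gender == "M", 1.2, 1.0)
cells$terminations <- rpois(nrow(cells), lambda = true_hazard * cells$exposure)

# Poisson GLM with log link. log(exposure) enters as an offset, so the
# linear predictor describes the termination rate per unit of exposure.
fit <- glm(
  terminations ~ duration + gender + offset(log(exposure)),
  family = poisson(link = "log"),
  data   = cells
)

# Fitted rate per unit of exposure = fitted count / exposure.
cells$fitted_rate <- fitted(fit) / cells$exposure

print(round(coef(fit), 3))
print(head(cells))
```

Because the offset has a fixed coefficient of 1, the same pattern can in principle carry an existing assumption instead of raw exposure, for example offset(log(expected_terminations)) with a hypothetical expected_terminations column. The fitted coefficients then act as multiplicative adjustments to that assumption, which is the idea the offset readings develop.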
© ILTCI Conference 2017-18 - All Rights Reserved.