Predictive Analytics in Higher Education
By Linda Hartford, CIO, Northeast Wisconsin Technical College
Persistence and Completion
So what are persistence and completion? Persistence is the share of students who stay in college from year to year. The biggest drop-off typically happens between the first and second year of college. Completion is the share of students who complete a degree. Completion and persistence rates are important because they measure how well an institution is serving its students. They also indicate the nation’s ability to fill jobs with qualified candidates.
According to the National Student Clearinghouse Research Center (2016), the persistence rate from year one to year two is 72.1 percent. Persistence rates continue to decline during and after the second year of college. Nationally, 54.8 percent of students who start college complete within six years, as indicated in Figure 1. Persistence and completion rates are lower for part-time students. Rates also vary between two-year and four-year colleges, public and private institutions, and full-time and part-time students. Other demographics such as age, gender, socioeconomic status, and ethnicity all play a role in these numbers as well.
Figure 1 - Six-Year Outcomes by Enrollment Intensity (N=2,911,634)
Source: Signature Report 12 Completing College: A National View of Student Attainment Rates—Fall 2010 Cohort, National Student Clearinghouse Research Center
The question of interest is how colleges collect and use big data to predict and improve student success (persistence and completion). More precisely, how do institutions identify students who are less likely to succeed and create interventions that increase a student’s likelihood of success?
This is the tricky part, because many interventions are not related to academics. Typically, students are worried about finances, child care, and balancing work with college, and an increasing number face issues that need counseling. Because many of these areas are very personal, what data can institutions collect, and how, before the line of privacy is crossed?
Data, Data, and More Data
Higher education institutions currently collect a wealth of information on students at various points in the student’s college lifecycle, from prospect to graduate. Data collected includes high school transcripts, financial data, demographic data, progress and success data, course and program data, and data on extracurricular activities. This data can be supplemented with behavioral data collected from social media sites, surveys, text messages, blogs, and similar sources. In addition, there is a wealth of open data from many higher education institutions that can be used for benchmarking. In most institutions, this transactional data is disparate and disaggregated, making it difficult to analyze.
The upside to spending time, money, and resources on collecting data and creating predictive analytics is the ability for organizations to increase enrollments because of improved success rates. Organized data and predictive analytics can be used to:
• Personalize student learning
• Improve outcomes, including graduation and persistence rates
• Monitor a student’s level of engagement
• Notify faculty and advisors when a student is struggling (attendance, grades, participation)
• Aid in predicting student success
The downside to collecting more data, especially non-cognitive and non-academic data, is concern about privacy and security. As more data is collected and stored in the cloud, there is a perceived increase in the risk of data exposure. There is also the question of who is accountable for the security of a student’s data, whether it is stored in the cloud or on-site at an institution. Big data in many instances aggregates data from various data sets and allows institutions to use the data in ways that differ from the original intended purpose. What a student has consented to share can come into question as new ways of combining and analyzing data are developed.
Challenges higher education institutions must tackle include:
• Anonymization and de-identification of data. People are concerned that de-identified data can be traced back to them
• Unintended use of data beyond the original intended purpose
• The resources required to collect, analyze, and publish data
• Making large amounts of data secure and accessible
• Federal and state policy keeping up with practice
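The de-identification challenge above can be made concrete with a minimal sketch. It assumes a hypothetical student record held as a dictionary; the field names (student_id, name, email) and the key handling are illustrative assumptions, not a description of any institution's system. The sketch drops direct identifiers and replaces the student ID with a keyed hash (HMAC-SHA256) so that records can still be linked across data sets without exposing the ID itself.

```python
import hashlib
import hmac


def pseudonymize(student_id: str, secret_key: bytes) -> str:
    """Return a keyed hash of the student ID (same ID + same key -> same pseudonym)."""
    return hmac.new(secret_key, student_id.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(record: dict, secret_key: bytes,
               direct_identifiers=("name", "email")) -> dict:
    """Drop direct identifiers and swap the student ID for a pseudonym.

    The field names here are hypothetical; a real schema will differ.
    """
    cleaned = {
        key: value
        for key, value in record.items()
        if key not in direct_identifiers and key != "student_id"
    }
    cleaned["pseudonym"] = pseudonymize(record["student_id"], secret_key)
    return cleaned


# Hypothetical record and key, for illustration only.
example_key = b"replace-with-a-secret-kept-outside-the-data-set"
example_record = {
    "student_id": "S001",
    "name": "Jane Doe",
    "email": "jane@example.edu",
    "program": "Nursing",
    "credits_earned": 24,
}
deidentified = deidentify(example_record, example_key)
```

This is pseudonymization rather than full anonymization. The remaining fields (program, credits, and any demographics) can still act as quasi-identifiers, which is exactly the re-identification concern raised in the list above, and anyone who holds the key can recompute the mapping.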
Nudging: Art of the Possible
Big data and predictive analytics can be used by higher education institutions to “nudge” students to make better decisions and increase their likelihood of success. The consolidation of academic, demographic, and social data can be used to create interventions that put a student back on track. Having leading indicators instead of waiting for final grades will help improve outcomes. Sending a student with low grades in math a reminder about coaching or tutoring is an easy example. A more complex variation would be tracking a student’s eating habits: a drop from visiting the cafeteria five days a week to one day a week may indicate that an intervention is needed. Access to key data can allow faculty and advisors to intervene more quickly. The ability to tie core academic information to social information is the key to success.
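The two nudges described above can be sketched as simple rules. This is a minimal illustration, not a predictive model: the record fields, the grade floor of 70 percent, and the rule that cafeteria visits falling to half the usual level or less triggers a check-in are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class StudentSnapshot:
    """Hypothetical weekly snapshot of one student's leading indicators."""
    student_id: str
    math_grade_pct: float
    cafeteria_visits_usual: int   # typical visits per week
    cafeteria_visits_recent: int  # visits in the most recent week


def nudges_for(snapshot: StudentSnapshot,
               grade_floor: float = 70.0,
               visit_drop_ratio: float = 0.5) -> list:
    """Return the nudges suggested by two illustrative rules.

    Both thresholds are assumed values, not figures from any study.
    """
    nudges = []
    # Academic rule: a low math grade triggers a tutoring reminder.
    if snapshot.math_grade_pct < grade_floor:
        nudges.append("tutoring_reminder")
    # Behavioral rule: a sharp drop in cafeteria visits triggers an advisor check-in.
    if (snapshot.cafeteria_visits_usual > 0
            and snapshot.cafeteria_visits_recent
            <= snapshot.cafeteria_visits_usual * visit_drop_ratio):
        nudges.append("advisor_check_in")
    return nudges


# A student with a low math grade whose visits fell from five days to one.
at_risk = StudentSnapshot("S001", 62.0, 5, 1)
# A student with a solid grade and steady cafeteria visits.
on_track = StudentSnapshot("S002", 85.0, 5, 4)

at_risk_nudges = nudges_for(at_risk)    # ["tutoring_reminder", "advisor_check_in"]
on_track_nudges = nudges_for(on_track)  # []
```

In practice, thresholds like these would more plausibly come from a model fitted to an institution's own historical data than from fixed values, and the behavioral rule draws on exactly the kind of personal data that raises the privacy questions discussed earlier.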
A more personalized and data-intensive education system is inevitable. The need to improve outcomes warrants looking at approaches that have worked elsewhere, such as analyzing big data and creating predictive analytics. It is incumbent upon higher education institutions to recognize the benefits that come with access to so much personal information about students and to address the privacy and security challenges that come with it.