An Esteemed International Educational Institute

Business Background:

As the number of students for whom English is a second language grows, a "one model fits all" approach fails: the same reading material and teaching style will not work for every student. Reading plays a crucial role in educational development, but finding appropriate reading materials for students at different reading levels is often difficult.

Need For Analytics:

To provide reading materials at different complexity levels, teachers often search various online sources. Unfortunately, this process is difficult and time-consuming. To meet the needs of different students, teachers are often forced to rewrite the material themselves. Applying text mining and machine learning to the reading materials automates the process of determining the complexity of each text.

Technical Solution:

We collected data from the Common Core State Standards. Each data entry has a grade level, which indicates the complexity of the reading passage. With this data as the training set, we applied text mining techniques to extract useful information. Each text document is represented as a graph, and the features that describe the document are extracted by applying social network analysis to that graph.
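
As a rough illustration of the graph step, the sketch below turns a passage into a word co-occurrence graph and computes a few graph measures as features. The source does not say how the graph was built or which measures were used, so the co-occurrence window, the function names (build_cooccurrence_graph, graph_features) and the chosen measures are all assumptions. It is written in Python for brevity and is not the project's Java and R implementation.

```python
import re
from itertools import combinations


def build_cooccurrence_graph(text, window=2):
    """Return an undirected graph as {word: set of neighbouring words}.

    Assumption: two distinct words are linked when they fall inside the
    same sliding window of `window` consecutive tokens.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    graph = {}
    for i, word in enumerate(tokens):
        graph.setdefault(word, set())  # every word becomes a node
        for other in tokens[i + 1 : i + window]:
            if other != word:
                graph[word].add(other)
                graph.setdefault(other, set()).add(word)
    return graph


def local_clustering(graph, node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbours, 2) if b in graph[a])
    return 2 * links / (k * (k - 1))


def graph_features(graph):
    """Summarise a graph as a small dictionary of numeric features."""
    n = len(graph)
    m = sum(len(neighbours) for neighbours in graph.values()) // 2
    return {
        "nodes": n,
        "edges": m,
        "density": 2 * m / (n * (n - 1)) if n > 1 else 0.0,
        "avg_degree": 2 * m / n if n else 0.0,
        "avg_clustering": (
            sum(local_clustering(graph, v) for v in graph) / n if n else 0.0
        ),
    }
```

Calling graph_features(build_cooccurrence_graph(passage)) on each passage would yield one feature row per document, which can then be paired with the passage's grade level.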

A combination of social network measures and graph properties of the text serves as the feature set for each document. After features are extracted from all the text documents, a machine learning classification algorithm is applied. The accuracy of the resulting model was commendable on the test data provided by the university.
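
The classification step can be sketched in the same hedged way. The source does not name the algorithm that was used, so a nearest-centroid classifier stands in here purely as an example. The function names (fit_centroids, predict, accuracy) are hypothetical, and each feature vector is assumed to be a list of numeric graph measures for one document.

```python
import math


def fit_centroids(vectors, labels):
    """Average the feature vectors of each grade level into one centroid."""
    sums, counts = {}, {}
    for vec, label in zip(vectors, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(centroids, vec):
    """Return the grade level whose centroid is closest to `vec`."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vec))


def accuracy(centroids, vectors, labels):
    """Share of held-out documents whose predicted grade matches the label."""
    hits = sum(predict(centroids, v) == y for v, y in zip(vectors, labels))
    return hits / len(labels)
```

In practice the features would usually be scaled to comparable ranges before distances are computed, since measures such as node count and density differ by orders of magnitude.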

Software / Tools:

Quadratyx NLP Toolkit, Gephi, Java, R


The model has been built and deployed successfully. The accuracy on the test data provided by the university is commendable.