Last week I collaborated with the Master of Educational Innovation. I shared with them some examples of the use of computing tools in education. Specifically, I focussed on the use of digital humanities methods to improve the pedagogical approach. So, how can techniques like text mining or sentiment analysis help educational innovation? Since this area lies somewhat outside my discipline, I searched for some papers that deal with the matter.


I found some good papers that summarize the latest advances well. For example: Ferreira‐Mello, R., André, M., Pinheiro, A., Costa, E., & Romero, C. (2019). Text mining in education. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 9(6); and Romero, C., & Ventura, S. (2020). Educational data mining and learning analytics: An updated survey. WIREs Data Mining and Knowledge Discovery, 10(3).

To put it into practice, I chose a public domain dataset related to online education. In this case, I found a dataset on the Kaggle platform about another platform: Medium, which is one of the most popular tools for spreading knowledge about almost any field. It is widely used to publish articles on ML, AI, and data science. The dataset contains articles, their titles, the number of claps each has received, their links and their reading times. I processed this data with the code you can find on GitHub, and obtained the plots you can check below.


The most frequent words correspond to the core topics of the platform: machine learning, data and network.
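A word count like this can be sketched in a few lines. The snippet below is only an illustration, not the code from the GitHub repository: the stop-word list, the `top_words` helper and the sample titles are all assumptions made up for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "with", "on", "is", "how"}

def top_words(texts, n=10):
    """Return the n most frequent non-stop-words across all texts."""
    counts = Counter()
    for text in texts:
        # Lowercase and keep alphabetic tokens only.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)

# Made-up titles standing in for the dataset's title column.
sample_titles = [
    "A Gentle Introduction to Machine Learning",
    "Machine Learning with Neural Network Models",
    "How to Clean Data for Machine Learning",
]
print(top_words(sample_titles, 3))
```

Running the same function over the article bodies instead of the titles would give the frequencies for the full text.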

The best-received articles have a reading time of about 17 minutes. Interest drops quickly once an article exceeds this length.
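One simple way to look for such a peak is to group articles into reading-time bins and compare the average claps per bin. This is a minimal sketch under my own assumptions: the `mean_claps_by_bin` helper, the 5-minute bin width and the sample rows are hypothetical and do not come from the dataset.

```python
from collections import defaultdict
from statistics import mean

def mean_claps_by_bin(rows, bin_width=5):
    """Group (reading_time, claps) pairs into bins and average the claps per bin."""
    bins = defaultdict(list)
    for reading_time, claps in rows:
        # Lower edge of the bin this article falls into, e.g. 17 -> 15.
        start = (reading_time // bin_width) * bin_width
        bins[start].append(claps)
    return {start: mean(values) for start, values in sorted(bins.items())}

# Made-up (reading time in minutes, claps) pairs for illustration only.
sample = [(4, 120), (7, 300), (9, 500), (16, 900), (17, 1100), (25, 200)]
result = mean_claps_by_bin(sample)
best_bin = max(result, key=result.get)
print(result, best_bin)
```

The bin with the highest average is only a rough indicator; with few articles per bin, a single viral post can dominate the mean, so the median may be a safer choice.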

Something very similar happens with the title length. Attention falls off when the title has more than 50 words.
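A threshold effect of this kind can be checked by splitting the articles at a given title length and comparing the typical claps on each side. The sketch below assumes length is measured in whitespace-separated words; the helper names, the sample titles and the toy threshold of 5 are hypothetical.

```python
from statistics import median

def title_length(title):
    """Title length measured in whitespace-separated words."""
    return len(title.split())

def median_claps_by_title_length(articles, threshold):
    """Return (median claps for titles <= threshold, median claps for longer titles)."""
    short = [claps for title, claps in articles if title_length(title) <= threshold]
    long_ = [claps for title, claps in articles if title_length(title) > threshold]
    return (median(short) if short else None, median(long_) if long_ else None)

# Made-up (title, claps) pairs; the threshold is lowered to suit the toy data.
sample_articles = [
    ("Intro to Data Science", 800),
    ("Neural Networks Explained", 600),
    ("Everything You Ever Wanted to Know About Tuning Gradient Boosting Models", 90),
    ("A Very Long and Detailed Walkthrough of Every Step in My Pipeline", 50),
]
print(median_claps_by_title_length(sample_articles, threshold=5))
```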

Finally, I used sentiment analysis. We can see that the more emotionally charged a post is, the more claps it earns. This is quite suggestive: it corroborates the success of polarized discourse, even in educational contexts.
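To make the idea of sentiment charge concrete, here is a toy sketch that scores each text by the share of sentiment-bearing words and then correlates that score with claps. It is not the method used for the plot: the word lists, the `sentiment_intensity` and `pearson` helpers and the sample posts are assumptions for illustration, and a real analysis would rely on an established sentiment lexicon or model.

```python
import re
from math import sqrt

# Hypothetical mini-lexicons, far too small for real use.
POSITIVE = {"amazing", "best", "great", "love", "powerful"}
NEGATIVE = {"awful", "hate", "terrible", "worst", "wrong"}

def sentiment_intensity(text):
    """Share of tokens that carry sentiment, regardless of polarity."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    charged = sum(t in POSITIVE or t in NEGATIVE for t in tokens)
    return charged / len(tokens)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Made-up (text, claps) pairs for illustration only.
posts = [
    ("An overview of regression models", 100),
    ("Why I love this great library", 400),
    ("The worst and most terrible mistakes I hate", 700),
]
intensities = [sentiment_intensity(text) for text, _ in posts]
claps = [c for _, c in posts]
print(pearson(intensities, claps))
```

A positive coefficient on the real data would be in line with the plot, although correlation alone does not show that emotional wording causes the extra claps.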