In text mining, tf-idf (term frequency-inverse document frequency) values measure how often a term appears in a document, offset by how common that term is across the corpus you are searching. In this way, you can obtain the key terms that best distinguish the documents within a corpus (I have presented some of these techniques before). What does this have to do with philosophy?
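To make the idea concrete, here is a minimal sketch of one common tf-idf variant: relative term frequency multiplied by the natural log of (number of documents / number of documents containing the term). The function name `tf_idf`, the toy corpus and the exact weighting are my own assumptions for illustration, not necessarily what was used for the analysis in this post.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute tf-idf scores for a small corpus.

    docs: dict mapping a document name to a list of tokens.
    Returns a dict mapping each document name to {term: score}.
    """
    n_docs = len(docs)

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))

    scores = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        total = len(tokens)
        scores[name] = {
            # tf = share of the document taken by the term,
            # idf = log(N / df), which is 0 for terms found everywhere.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return scores


# Hypothetical toy corpus, already tokenized.
toy_corpus = {
    "a": "the will is the essence".split(),
    "b": "the spirit is reason".split(),
    "c": "the will to power".split(),
}
scores = tf_idf(toy_corpus)
```

In this toy corpus, "the" appears in every document, so its score drops to zero, while a word that appears in only one document, such as "spirit", keeps a positive score.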

Philosophy Text Mining

As a first approach, I decided to perform a tf-idf analysis on a corpus of philosophical works. As everybody knows, philosophy is sometimes hard to read: we do not always manage to finish a book, and we forget many things as days, weeks and months pass. Because of this, and because we are less and less able to keep up with the large number of published and translated books that reach our hands, many people are beginning to use text mining to get closer to them. To be frank, I had already read the books I analyze here, and that made me realize that text mining can help you not just to approach the texts you are working with, but also to learn about them from another point of view. In fact, a similar perspective was proposed some years ago by Berry (2011) as a third wave of the digital humanities.

From my point of view, one of the most interesting topics in philosophy has always been the treatment of the will in modern philosophy. Several philosophers thought about the human being and the way in which we can overcome ourselves, and their different positions gave rise to different philosophical movements. Among them we find, for example, Spinoza, who in his Ethics treats the concept of desire as a conatus; Hegel, who, although he thought about Spirit and rationality, includes in its movement the will, the force and the pain of its tearing; Schopenhauer, who declared, inspired by Eastern religion, that all we see is in appearance an illusion but, ontologically, a will or volition; and finally Nietzsche, who transformed Schopenhauer's nihilistic concept of will into a will to power.

Figure: Philosophy mining. Word cloud with philosophers by color.

In the present visualizations, we can see the words that are frequent in each work and, at the same time, distinctive within the whole selected corpus. That is, we are looking at the words that characterize the philosophies of Spinoza, Hegel, Schopenhauer and Nietzsche and that most set them apart from one another (always taking into account that only some of their works have been analyzed, and only after cleaning). At a quick glance, we can say that each philosopher focuses on different aspects. What do you think?

For this text mining work, I used the open text files available from Project Gutenberg.
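As a rough illustration of the cleaning step, the sketch below strips the license header and footer that Project Gutenberg plain-text files usually carry and then splits the remaining text into lowercase word tokens. It assumes the files use marker lines beginning with `*** START OF` and `*** END OF`; the function names `strip_gutenberg_boilerplate` and `tokenize` are hypothetical, and the cleaning actually applied for this post may have differed (for example, stop-word removal is not shown).

```python
import re

# Assumed marker lines; older files may use slightly different wording.
START_RE = re.compile(r"^\*\*\* ?START OF.*$", re.MULTILINE)
END_RE = re.compile(r"^\*\*\* ?END OF.*$", re.MULTILINE)


def strip_gutenberg_boilerplate(text):
    """Keep only the text between the START and END marker lines.

    If a marker is missing, fall back to the start or end of the file.
    """
    start = START_RE.search(text)
    end = END_RE.search(text)
    begin = start.end() if start else 0
    finish = end.start() if end else len(text)
    return text[begin:finish].strip()


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


# Hypothetical miniature file standing in for a downloaded book.
sample = (
    "Header and license notes\n"
    "*** START OF THE PROJECT GUTENBERG EBOOK ETHICS ***\n"
    "Body text here.\n"
    "*** END OF THE PROJECT GUTENBERG EBOOK ETHICS ***\n"
    "More license text\n"
)
body = strip_gutenberg_boilerplate(sample)
tokens = tokenize(body)
```

The resulting token lists are the kind of input a tf-idf computation expects, one list per work.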