The last two posts dealt with text mining and philosophy (http://man.herm3tica.tv/will-and-desire-along-modern-philosophy/ and http://man.herm3tica.tv/will-and-desire-along-modern-philosophy-ii/). I tried to explore how computational tools can help us understand complex works, such as philosophical ones, from another point of view. In this post, I continue applying text mining to philosophy, focusing on the case of Friedrich Nietzsche.

Nietzsche Topic Modeling

In this case, I combined several of Nietzsche’s works into a single corpus. Specifically, I downloaded the following works from Project Gutenberg: The Birth of Tragedy; The Gay Science; Thus Spoke Zarathustra; Beyond Good and Evil; Human, All Too Human; Ecce Homo; The Antichrist; On the Genealogy of Morals; and The Will to Power. I ran the topicmodels package (https://CRAN.R-project.org/package=topicmodels) over this corpus to separate its terms into 3 groups. The resulting plot is shown below:

Digital humanities: Text mining applied to philosophy

What we see here are the 10 top-ranked words of each topic. As can be seen, the first and second topics are closely related. One of them leans towards time, art, nature and music, while the other contains words like morality, people, truth or form. Finally, the third topic contains words like thou, ye or Zarathustra, among others.
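The post does not show its code, so the following is only a minimal sketch of how a three-topic model and its top terms could be obtained, following the tidy approach of Silge and Robinson cited at the end of this post. The toy corpus, the object names and the seed are assumptions made for illustration; they are not the actual Gutenberg texts or the settings used for the plot above.

```r
# Minimal sketch: fit a 3-topic LDA model and list the top terms per topic.
# The three short "documents" are invented placeholders, NOT Nietzsche's texts.
library(dplyr)
library(tidytext)
library(topicmodels)

toy_corpus <- tibble(
  document = c("doc_a", "doc_b", "doc_c"),
  text = c(
    "art music nature time art music tragedy nature",
    "morality truth people form morality truth people",
    "thou ye zarathustra thou ye zarathustra thou"
  )
)

# One row per (document, word) pair with its count
word_counts <- toy_corpus %>%
  unnest_tokens(word, text) %>%
  count(document, word, sort = TRUE)

# Document-term matrix, the input format expected by topicmodels
dtm <- word_counts %>%
  cast_dtm(document, word, n)

# k = 3 matches the three groups used in the post; the seed is arbitrary
lda_fit <- LDA(dtm, k = 3, control = list(seed = 1234))

# beta is the per-topic-per-word probability; keep the 10 highest per topic
top_terms <- tidy(lda_fit, matrix = "beta") %>%
  group_by(topic) %>%
  slice_max(beta, n = 10, with_ties = FALSE) %>%
  ungroup() %>%
  arrange(topic, desc(beta))

print(top_terms)
```

With a real corpus, stop words would normally be removed before building the document-term matrix, for example with anti_join(stop_words) from tidytext.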

Distance between Topics, Nietzsche

Per-Document classification

If we know Nietzsche’s work a little, it is not very difficult to guess which works are related to each topic: one cluster is centered on Zarathustra, another on The Birth of Tragedy, and the last on the works about morality and Christianity. However, let’s plot these relationships using gamma, which gives the per-document probability of each topic:

Gamma values comparative by topic

Now we can see more clearly how the algorithm distributes the works across the topics. Most works belong to just one topic, which is why they show two zeros and a one across the plots. Only a few works have intermediate values: Human, All Too Human, The Gay Science and Ecce Homo. We should probably consider these works as bridges within Nietzsche’s corpus. The bridging is visible in Human, All Too Human and, even more so, in The Gay Science and Ecce Homo. At the same time, Ecce Homo is the only work that shows similarities with Zarathustra.
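The gamma values discussed above can be extracted along the following lines. This is again only a sketch under stated assumptions: the four toy documents are invented (the last one deliberately mixes vocabulary from two others), and the 0.9 cut-off used to flag a "bridge" is an arbitrary illustrative choice, not a value taken from the analysis in this post.

```r
# Minimal sketch: per-document topic probabilities (gamma) and a simple
# flag for documents without a single dominant topic.
# Toy documents and the 0.9 threshold are assumptions for illustration only.
library(dplyr)
library(tidytext)
library(topicmodels)

toy_corpus <- tibble(
  document = c("doc_a", "doc_b", "doc_c", "doc_mixed"),
  text = c(
    "art music nature time art music tragedy nature",
    "morality truth people form morality truth people",
    "thou ye zarathustra thou ye zarathustra thou",
    "art music morality truth nature people"
  )
)

# Build the document-term matrix from word counts
dtm <- toy_corpus %>%
  unnest_tokens(word, text) %>%
  count(document, word) %>%
  cast_dtm(document, word, n)

lda_fit <- LDA(dtm, k = 3, control = list(seed = 1234))

# gamma: one row per (document, topic); values sum to 1 within a document
doc_topics <- tidy(lda_fit, matrix = "gamma")

# A document is flagged as a bridge when no topic reaches the threshold
bridge_threshold <- 0.9  # hypothetical cut-off
bridges <- doc_topics %>%
  group_by(document) %>%
  summarise(max_gamma = max(gamma), .groups = "drop") %>%
  mutate(is_bridge = max_gamma < bridge_threshold)

print(doc_topics)
print(bridges)
```

Which documents end up flagged depends on the fitted model and on the threshold, so the flag is best read as a starting point for closer reading rather than as a result in itself.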

Other sources

This third post on text mining applied to philosophy builds on previous studies. Among them, I would like to highlight the work of David M. Berry, especially the post Berry, D. M., & Rybicki, J. (2012, December 19). The author signal: Nietzsche’s typewriter and medium theory. Stunlaw. http://stunlaw.blogspot.com/2012/12/the-author-signal-nietzsches-typewriter.html. This post points out how technologies influence our production of knowledge, and not merely as a change of style. The book Silge, J., & Robinson, D. (2017). Text mining with R: A tidy approach (First edition). O’Reilly. https://www.tidytextmining.com/ is, of course, also a key reference for the technical side.