On November 24th at 18:00 (GMT+2) I will teach the session Hegel and Nietzsche from NLP computational analysis, as part of the opening of the doctoral program in Humanities and Digital Society, with the participation of the UNIR research group HDAUNIR (Applied Digital Humanities). More information and registration. You will also find related analyses in previous posts.
Following Aristotle, Hegel reaffirms the grammatical relationship between subject and predicate as a substantial relationship of reality, one which gives the underlying structure of what happens. The predicate, as a variable entity or becoming, is subordinated to the subject, which is truly stable and true. Against this, Nietzsche appeals to relationships with reality that are not strictly linguistic, in which the subject dissolves, stating that we will believe in God as long as we believe in grammar. This conference tries to analyze this interpretive tension through visualizations obtained with text mining and natural language processing (NLP), concluding with a short practical exercise using the udpipe library in R.
Last week I collaborated with the Master of Educational Innovation. I shared with them some examples of the use of computing tools in education. Specifically, I focused on the use of digital humanities methods to improve the pedagogical approach. So, how can techniques like text mining or sentiment analysis help educational innovation? Since this area goes a little beyond my discipline, I searched for some papers which deal with the matter.
I found some good papers which summarize the latest advances well. For example: Ferreira‐Mello, R., André, M., Pinheiro, A., Costa, E., & Romero, C. (2019). Text mining in education. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 9(6). https://doi.org/10.1002/widm.1332; or Romero, C., & Ventura, S. (2020). Educational data mining and learning analytics: An updated survey. WIREs Data Mining and Knowledge Discovery, 10(3). https://doi.org/10.1002/widm.1355.
To put this into practice, I chose a public domain dataset related to online education. In this case, I found a dataset on the Kaggle platform about another platform: Medium, which is one of the most famous tools for spreading knowledge about almost any field. It is widely used to publish articles on ML, AI, and data science. The dataset contains articles with their titles, the number of claps they have received, their links and their reading times. I processed this data with the code you can find on GitHub, and obtained the plots you can check below.
The most frequent words have to do with keywords of the subject of the forum: machine learning, data and network.
The reading time that is best received is 17 minutes. Interest decreases quickly when an article exceeds this length.
Something very similar happens with title length. Attention falls off when the title has more than 50 words.
Finally, I used sentiment analysis. We can see that the more sentiment-charged a post is, the more claps it earns. This is quite suggestive. It corroborates the success of polarized discourse, even in educational contexts.
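The reading-time observation can be checked with a simple grouping: put the articles into reading-time bins and compare the mean claps per bin. The following is only a minimal sketch in Python (standard library only), not the code from the GitHub repository; the rows, the function name `mean_claps_by_bin` and the bin width are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows (reading_time_minutes, claps); NOT the Kaggle data.
articles = [(5, 120), (6, 80), (16, 900), (17, 1100), (18, 950), (30, 60), (32, 40)]

def mean_claps_by_bin(rows, bin_width=5):
    """Group articles into reading-time bins and average the claps in each bin."""
    bins = defaultdict(list)
    for minutes, claps in rows:
        start = (minutes // bin_width) * bin_width  # lower edge of the bin
        bins[start].append(claps)
    return {start: mean(values) for start, values in sorted(bins.items())}

summary = mean_claps_by_bin(articles)
best_bin = max(summary, key=summary.get)  # bin whose articles earn most claps on average
print(summary, best_bin)
```

The same grouping works for title length: replace the first element of each row with the length of the title.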
Indeed, the differences between Hegel and Nietzsche can be expressed in terms of their use of verbs. We can point out that the relation between the subject and the predicate has a metaphysical and ontological interpretation. Since Aristotle, the linguistic and grammatical subject is, at the same time, a metaphysical one. The relations contained in a sentence are relations of reality, in a deep and ontological way.
Hegel gives continuity to this perception, stating that the subject evolves, from his point of view, in the form of spirit. In that evolution, the subject is what remains through the changes that happen throughout history, as substance and even as an immaterial force which animates people and historical events.
Conversely, Nietzsche is suspicious of linguistic and grammatical form. He wants to show that language is not enough to express reality. Moreover, grammar, especially the importance it gives to the subject, stratifies reality and turns into entities the processes which happen beneath appearances.
The NLP verbs analysis
First of all, we can observe which verbs are used most. This analysis gives us an initial, general look at the issues prioritized by each philosopher. To do that, I selected their most relevant works: Phenomenology of Spirit (Hegel, 1807) and Thus Spoke Zarathustra (Nietzsche, 1883). We can see that Hegel uses more abstract verbs, whereas Nietzsche uses more practical ones. We can also conclude that Hegel's use of verbs is more concentrated.
The main verbs used in Hegel are to be and to have, strongly related to nouns such as consciousness, essence, mode or reality. Nietzsche’s graph is articulated through want and say, relating terms such as human, will or eternity; moreover, Hegel's discourse leans much more heavily on a few verbs. Other terms that Nietzsche uses in abundance are thing, truth, time, or heart. It can also be seen that the adpositions in Hegel are more limiting.
Regarding verb forms, in Hegel the third person singular of the present indicative stands out prominently, while in Nietzsche it is more balanced with the gerund, the past and the plural.
As many people probably know, not all computational analyses of texts are the same. There are different approaches. Basically, we can enumerate three levels (you will find more information in the recent publication by Hvitfeldt and Silge (2020)):
the exploratory analysis, where you calculate relations between words, including stopword removal, n-gram analysis, tf-idf measurements or even sentiment analysis.
within the first one, a variant where you work not with the words (or tokens) themselves, but with their stems (stemming). In this case, you use previously created rules which are different for each language. These rules let you identify as the same token the variations in number, gender and tense of the words. They are, for example: removing the final “s” or “es” of a word or the “tation” ending, among many others.
one step beyond the second approach, when we use not mechanical rules but semantic detection of the common tokens behind word variations (lemmatization). That implies more sophisticated work, based entirely on philological studies. So, we need a dictionary of words to compare with our corpus, searching for the roots of our words. As you can imagine, this has a high cost in computational power. In return, you gain not just accuracy. You also have much more information to analyze. For example, you get the part-of-speech (POS) tag of each word.
The last method is Natural Language Processing (NLP) analysis, which is more suitable for academic purposes. That’s also the method I am going to present in this post. For this purpose, I will use the udpipe package (Wijffels, 2019).
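To make the difference between the second and third levels concrete, here is a minimal sketch in Python (standard library only; it does not use udpipe or R). The suffix rules are the toy examples mentioned in the list, and `LEMMA_DICT` is a hypothetical three-entry dictionary standing in for the lexicon a real lemmatizer relies on; both are assumptions for illustration.

```python
# Toy suffix rules in the spirit of the examples in the list; real stemmers
# use far larger, language-specific rule sets.
SUFFIX_RULES = ("tation", "es", "s")

def toy_stem(word, min_stem=3):
    """Strip the first matching suffix, keeping at least `min_stem` characters."""
    for suffix in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word

# Hypothetical mini-dictionary: surface form -> (lemma, POS tag).
LEMMA_DICT = {"was": ("be", "AUX"), "is": ("be", "AUX"), "spoke": ("speak", "VERB")}

def toy_lemma(word):
    """Look the word up; fall back to the word itself with an unknown tag."""
    return LEMMA_DICT.get(word, (word, "X"))

for w in ("words", "boxes", "was", "spoke"):
    print(w, toy_stem(w), toy_lemma(w))
```

The mechanical rules handle regular plurals but leave "was" and "spoke" untouched, while the dictionary lookup maps them to "be" and "speak" and returns a POS tag as well. That extra information is what the dictionary-based approach pays for in computation.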
Practical case: Hegel vs. Nietzsche
If in previous posts I analyzed which words identify each philosophical work, in this one, from the viewpoint of NLP analysis, I will focus more on the types of words and the structure of the texts. In this sense, one of the first features you can observe is the sentence composition of the works. Mainly, we can look at sentence length, as shown in the next plot. As you can see, Hegel's work has much longer sentences than Nietzsche's. The mean sentence length of the first is 33.2, while the second has a mean of 19.1. This difference should give us an idea of how different the styles are; they are even opposite styles.
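The mean sentence length is easy to reproduce in outline. The sketch below is in Python (standard library only) and uses a naive splitter on sentence-final punctuation, which is an assumption and much cruder than the sentence segmentation an NLP tool performs; the sample text is made up for illustration and is not taken from either work.

```python
import re
from statistics import mean

def sentence_lengths(text):
    """Split on ., ! or ? followed by whitespace and count word tokens per sentence."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [len(re.findall(r"\w+", s)) for s in sentences]

# Made-up sample, only to exercise the function.
sample = ("The subject remains. The predicate changes all the time! "
          "Does grammar really mirror reality itself?")
lengths = sentence_lengths(sample)
average = mean(lengths)
print(lengths, average)
```

Applied to the full text of each work, the same computation would give one mean per author, which can then be compared.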
This analysis can go further if we look not just at the sentences themselves, but at the kinds of words which constitute them. These are called POS tags in the NLP approach. In graphic 2, I have plotted the result of exploring both works. As we can see, Hegel’s work has many more determiners and adpositions, a result of the complex sentences he uses. By contrast, Nietzsche needs more punctuation signs. However, what draws more attention is that, even with shorter sentences, Nietzsche’s work has noticeably more verbs. This could make us think about which work has more movement, even though it is Hegel’s work that tries to conceptualize movement.
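Once every token carries a POS tag, the comparison reduces to counting tags. A minimal sketch in Python (standard library only) follows; the hand-tagged tokens are hypothetical and are not udpipe output, and the function name `pos_shares` is an assumption.

```python
from collections import Counter

def pos_shares(tagged_tokens):
    """Return each POS tag's share of all tokens, as a fraction between 0 and 1."""
    counts = Counter(tag for _, tag in tagged_tokens)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}

# Hypothetical hand-tagged tokens with Universal POS labels.
tagged = [("the", "DET"), ("spirit", "NOUN"), ("of", "ADP"), ("the", "DET"),
          ("world", "NOUN"), ("wants", "VERB"), ("itself", "PRON"), (".", "PUNCT")]

shares = pos_shares(tagged)
print(shares)
```

Shares rather than raw counts are used because the two works differ in length; only relative frequencies are comparable across them.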
Finally, we can go a little deeper, analyzing which words appear under each of the POS tags detected. I have selected some of the POS tags I thought were more relevant but, of course, you can select others. In this case, we can see how Hegel uses the verb “to be” very heavily, whereas Nietzsche uses the verb “want” -excuse me, because I did this analysis in Spanish-. Or if you look at the nouns, you also confirm Hegel's orientation toward immaterial and abstract things, against the practical, directed and emotive
Although the so-called exact sciences have had an extensively tested laboratory for centuries, where objective qualities of various phenomena of the world are put to the test quantitatively, the same does not happen with the Humanities. The relative character of their object -human beings and their forms of artistic and philosophical expression, mainly- as well as the subjective nature of the faculties they put into play -sensitivity, affections and even historical awareness- make the modern scientific laboratory method insufficient in this area of research. To respond to this problem, theater may be an appropriate methodology, as philosophers such as Gilles Deleuze or Graham Harman have proposed in recent years. While there are certain disagreements between these two philosophers, this article aims to show important common features as well as the way in which both conceptions of theater and philosophy can be reinforced by the application of new technologies – virtual and / or augmented reality – giving rise to contemporary Digital Humanities laboratories based on simulation, immersion and interaction.
The actor can touch or perceive a sensual object -represented by the green triangle- which, as such, is annulled -tainted- because a virtual object -in this case the flame- arises that we do not perceive as such, but through its virtual qualities. In turn, the immediate object of interaction -the green triangle- receives, through its real qualities -sensors connected online-, the data flows of other external objects, thus opening the theater to what happens outside – the dispassionate relations of knowledge that Harman proposes. In this way, the two interactions Harman describes with respect to objects are coupled in the scenic space: a more theatrical and metaphorical one, based on passionate sensation -which would lead to fascination with the virtual object as a flame of fire-; and a cognitive interaction, more sober, neutral and dispassionate, where fascination as conditioning does not enter -the reception of data flows that modify the virtual or augmented reality generated as information-. Finally, the differences that Harman establishes between the real object, which relates to metaphor, and the sensual object, which relates to knowledge, are unified in this model into a single virtual object that, while seducing and fascinating us, can also inform us of what is happening outside -for example, the size or color of the flame increases or decreases according to a certain flow of data received through social networks-.
The Covid-19 pandemic has undoubtedly changed the reality of our societies. It has forced us to reflect on fundamental aspects of our forms of organization, relationship and production. Among them, the education sector is one of the most affected. There are several reasons:
Education and knowledge are a cornerstone of the new information and communication society. If education has always been considered a key element for the sound development of a society, today it is even more so. Education is continuous: we never stop learning and acquiring new skills and competencies that allow us to adapt to a changing environment such as the one produced by new technologies.
The educational system itself had needed a change for a long time. The university above all, as a secularized institution, has tried in recent decades to make its model more flexible. However, on many occasions it has shown the opposite: it has proven to be the heir of rigid, hierarchical structures, not always transparent in their processes, and slow to manage and administer their resources.
The possibilities of digital media have proven tremendously compatible with the educational environment. It has been one of the sectors that has found the most opportunities for adaptation. Of course, there are many problems and failures but, especially in higher education, most processes are digitized and can be carried out remotely with a conventional computer.
Digital divide or digital explosion?
In the face of this, the possibility of learning from anywhere opens up endless possibilities. These have become especially evident in the number of offers, webinars, online courses, open resources, etc. that have appeared. We should now speak not so much of a digital divide as of a digital explosion. Of course, there are a number of drawbacks derived from the technological and digital turn of education, some of which have to do with:
The difficulty of accessing it, insofar as it requires technological equipment and an Internet connection.
The requirement of a certain literacy in the use of technologies. This often has to do with how familiar we are with technologies: how they are used in our environment and how intensively.
The question is to what extent these problems are intrinsically digital or only tangentially so. Many of these problems already existed in our societies. Therefore, we should rather study how the digital phenomenon has affected them.
While data mining and statistical learning were born as tools of the exact sciences, business, and government agencies, their value to the human and social sciences has increasingly been demonstrated. Due to the growing interest aroused by the large amount of unstructured information shared on the Internet, technical analyses have had to explore texts, images, relational networks or geographic data. It is precisely this type of data that the Humanities work with, benefiting, indirectly, from the very evolution of computing tools and giving rise to the emerging field of Digital Humanities. During this conference we will examine some of the possibilities of the free software tool R for this type of analysis.
Throughout this Digital Humanities webinar, attendees will be able to follow several of the demonstrations through the RStudio Cloud platform (rstudio.cloud); attendees who want to practice with some of the examples should have an account created.
The fact that schizoanalysis is a materialist practice is not always as evident as it should be. Quite a few critics have spoken about Deleuze -as well as about Deleuze & Guattari (D&G from now on)- as postmodernist theorists. They highlighted the fact that their philosophy puts too much emphasis on correlative processes (Harman 2018, Meillassoux 2009). Others have pointed out that their philosophy pays no attention to difference, displaying an excessive proclamation of connections that leads to an extremely open and positive world (Han, 2017). I won’t say that any of those philosophies is totally wrong; they provide new and interesting arguments and approaches. However, they are not totally fair to D&G's philosophy -maybe only Bryant (2014) has recognized this heritage more deeply-. They tend to simplify and forget very important points, which I will try to defend -briefly- here.
In addition, so as not to make this post too academic, I will relate the exposition to my own experience of putting D&G's philosophy into practice in an applied technological project. This project was herm3TIC: cameras, sensors and telepresence, where I managed a team of artists and programmers exploring new uses of technology. We explicitly tried to implement a version of schizoanalysis. Let’s summarize some of the features of this project which bear on the debate I presented at the beginning.
The importance of schizoanalysis itself
Actually, schizoanalysis is quite an interesting proposal. It is timely today, because it develops a study of subjectivity without assuming any origin or identity. On the contrary, it tries to study subjectivity in its social, group dimension. And it is truly suited to the circumstances we live in today: an interconnected world where people spend more time on their social networks sharing likes and concerns than being active in a political party.
The natural link between schizoanalysis and technology
Since D&G understand nature as a production process, any resemblance between schizoanalysis and mysticism or reactive movements against technology has to be discarded. Schizoanalysis is a proposal which embraces technology, above all in its capacity to transform processes, generating difference and heterogeneity. They follow, in this sense, Simondon's (2017) thought: technology has to imply individuation processes, far from the mainstream use of consumer technology. In the herm3TIC project, for example, technologies were developed for specific tasks. The same happens with many other free software & hardware projects, as Lanier (2011) has already pointed out.
The relevance of the body
However, paying attention to technology doesn’t imply forgetting the body. On the contrary, schizoanalysis achieves a deep understanding of the body. With the concept of the body without organs, D&G are able to speak about a materiality which is neither just individual, nor social, nor natural, but exists in all these strata in a transversal way. Also, with this concept, D&G escape the semiotic imperative which dominates schools such as psychoanalysis. It is not a question of interpretation, but of production! -D&G always remind us-.
The production of subjectivities in schizoanalysis is materialist in this transversal way: taking into account the body, but also the social body or even nature. In this sense, we reach the concept of ecology which Guattari (2014) developed in his solo works. Bryant (2014) has also proposed a cartographic ontology method closely allied with D&G and those ideas. On this basis, we can suggest the need to overcome the semiotic view by combining materialist approaches.
The importance of inorganic strata
Related to the last idea of transversal methods, D&G have defended the interconnection of all reality without hierarchies. In particular, schizoanalysis has to be the place to practice all kinds of decentralization. Accordingly, many authors have developed the idea of a flat ontology, where no stratum is more important than another. However, this idea was already present in D&G -as De Landa (2011) has explained-. The idea should let us think about the power we have to stay away or even feel different. Even all the analytical implementations that data mining and Big Data provide, which allow us to interconnect all dimensions of reality, could be understood very well from D&G's point of view. I tried to demonstrate this in Cebral (2019).
The importance of arts and creativity
Actually, Deleuze reflected on sensation in his later works (2004), and with Guattari set out the capacity of artworks to transform our conception of reality (2014). Art is an example of how we can become matter. By creating blocks of sensation, art allows us to transform our conception of reality. Connected with the importance of the body, as well as with operations of decentralization, art should be at the center of technological development, even more so if this technology wants to interact with and transform human subjectivity.
So, here I have set out some reasons which I think give an idea of the relevance schizoanalysis could reach nowadays. The technological becoming of our society has to be understood in a subjective and creative way. We need to respond to psychological disorders but, at the same time, we need to give an answer which does not idealize life, the human being, or history. Schizoanalysis was conceived to do such things; probably its biggest problem is just one: it has hardly been implemented.
You can also consult more projects I developed here.
Bryant, L. R. (2014). Onto-cartography: An ontology of machines and media. Edinburgh University Press.
Cebral Loureda, M. (2019). La revolución cibernética desde la filosofía de Gilles Deleuze: Una revisión crítica de las herramientas de minería de datos y Big Data [Universidade de Santiago de Compostela]. http://hdl.handle.net/10347/20263
De Landa, M. (2011). Intensive science and virtual philosophy (Reprint). Continuum.
Deleuze, G. (2004). Francis Bacon: The logic of sensation. University of Minnesota Press.
In this case, I combined several of Nietzsche’s works into a corpus. Specifically, I downloaded from Project Gutenberg the works: The Birth of Tragedy, The Gay Science, Zarathustra, Beyond Good and Evil, Human, All Too Human, Ecce Homo, The Antichrist, On the Genealogy of Morals and The Will to Power. I ran the topicmodels package – https://CRAN.R-project.org/package=topicmodels – over this corpus to separate terms into 3 groups. The resulting plot is shown below:
What we see here is the classification of the 10 most frequent words. As can be seen, the first and the second topics are quite related. One of them is more oriented to time, art, nature and music; and the other one contains words like morality, people, truth or form. Finally, we have a third topic with words like thou, ye or Zarathustra, among others.
If we know Nietzsche’s work a little, it is not very difficult to guess which work is related to each topic: we find a cluster centered on Zarathustra; another on The Birth of Tragedy; the last on the works about morality and Christianity. However, let’s plot these correlations using the gamma parameter, which indicates the per-document probability for each topic:
Now we can see better how the algorithm distributes the works across the topics. The majority of works belong to just one topic. This is why they have two zeros and a one within each plot. Just a few works have intermediate values: Human, All Too Human, The Gay Science and Ecce Homo. Probably, we should consider these works as bridges within Nietzsche’s corpus. This bridging appears in Human, All Too Human and, even more, in The Gay Science and Ecce Homo. At the same time, Ecce Homo is the only work which has similarities with Zarathustra.
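Reading the gamma values can be made explicit with a small rule: a work whose largest topic probability is close to 1 belongs to a single topic, and a work with no dominant probability is a candidate bridge. The sketch below is in Python (standard library only) and is not the topicmodels code. The work names are placeholders, the gamma values are invented, and the 0.9 threshold is an arbitrary assumption.

```python
# Invented per-document topic probabilities (each row sums to 1);
# these are NOT the values from the plot.
gamma = {
    "work_a": [0.00, 0.00, 1.00],
    "work_b": [1.00, 0.00, 0.00],
    "work_c": [0.45, 0.40, 0.15],
}

def dominant_topic(probs):
    """Return the 1-based index of the most probable topic."""
    return max(range(len(probs)), key=probs.__getitem__) + 1

def is_bridge(probs, threshold=0.9):
    """A work is a 'bridge' when no single topic reaches the threshold."""
    return max(probs) < threshold

bridges = [title for title, probs in gamma.items() if is_bridge(probs)]
print({title: dominant_topic(p) for title, p in gamma.items()}, bridges)
```

The threshold only makes the visual reading of the plot reproducible; where exactly to draw the line between a single-topic work and a bridge remains an interpretive choice.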
This third post on text mining applied to philosophy gives continuity to previous studies. Among them, I would like to highlight the work of David M. Berry, especially the post Berry, D. M., & Rybicki, J. (2012, December 19). The author signal: Nietzsche’s typewriter and medium theory. Stunlaw. http://stunlaw.blogspot.com/2012/12/the-author-signal-nietzsches-typewriter.html. This post points out how technologies influence our production of knowledge, not just as a change of style. Of course, the text Silge, J., & Robinson, D. (2017). Text mining with R: A tidy approach (First edition). O’Reilly. https://www.tidytextmining.com/ is also a key reference for the technical development.