Complexity Explorer Santa Fe Institute

Foundations & Applications of Humanities Analytics (spring 2022)

13.1 Applications: Linkages » Chapter 8 Overview

What you will learn in this chapter

This chapter walks you through a complete miniature project in humanities analytics, from its inception and the posing of a research question, through the choice of methodology, to data collection and the initial analysis of results. The project analyzes the ways in which the interests, concerns, and techniques popular within feminist philosophy have made their way into the mainstream of the discipline, especially within analytic philosophy. The lecture walks through the coding techniques used both for data collection and for the analysis of data through the calculation of linkages, as first discussed in Chapter 5.

Review the Jupyter notebook used in the lecture here: github.com/davidbkinney/FAHAChapter8


Key terms to keep in mind


Feminist Philosophy   The sub-discipline of philosophy that aims to bring to bear the techniques and expertise associated with philosophical training on questions raised by the women's liberation movement of the 1960s and 1970s (e.g., abortion, affirmative action, equal opportunity, and the institutions of marriage, sexuality, and love). 


Scraping  A computer-assisted technique for copying and extracting data (usually text) from websites, where the text is usually stored as HTML code, into a location on the researcher's computer where the text data can be conveniently analyzed.


HTML  Abbreviation for HyperText Markup Language, the standard markup language for documents that are displayed in a web browser (i.e., web pages). In other words, HTML code tells a web browser what should be displayed to a user when that user accesses a particular web page.
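
To make the relationship between scraping and HTML concrete, here is a minimal sketch of pulling readable text out of HTML markup. It uses only Python's standard-library html.parser module and is not necessarily the approach taken in the lecture notebook. The class name TextExtractor and the sample page are hypothetical, and a real scraper would first download the page source from a website.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the visible text of an HTML document, skipping the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for every run of text found between tags.
        text = data.strip()
        if text:
            self.chunks.append(text)


# Hypothetical page source standing in for a downloaded web page.
sample_html = """
<html>
  <body>
    <h1>Example Article</h1>
    <p>This abstract discusses gender and epistemic injustice.</p>
  </body>
</html>
"""

parser = TextExtractor()
parser.feed(sample_html)
parser.close()

# Join the text fragments into one string ready for analysis.
page_text = " ".join(parser.chunks)
print(page_text)
```

The tags tell the browser how to display the page; the extractor keeps only the text between them.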


Frequency   The ratio between the number of times a word in a group appears in a given unit of analysis in a given corpus, and the total number of units of analysis in the same corpus. 
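
As an illustration, the sketch below computes a frequency under one reading of the definition above: a unit of analysis (here, a short tokenized document) counts once if it contains at least one word from the group. That reading, the function name frequency, and the toy corpus are all assumptions made for this example.

```python
def frequency(group, units):
    """Share of units that contain at least one word from `group`."""
    group = set(group)
    hits = sum(1 for unit in units if group & set(unit))
    return hits / len(units)


# Hypothetical corpus: each inner list is one unit of analysis.
corpus = [
    ["gender", "and", "knowledge"],
    ["logic", "and", "language"],
    ["gender", "justice", "oppression"],
    ["metaphysics", "of", "time"],
]
feminist_terms = ["gender", "oppression"]

# Two of the four units contain a word from the group.
print(frequency(feminist_terms, corpus))  # 0.5
```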


Linkage   The ratio between the joint frequency with which words in two groups appear in a given corpus and the product of the two groups' individual frequencies in that corpus. Mathematically, this is written as follows: jointfrequency(group1,group2)/(frequency(group1)*frequency(group2)). It has the same mathematical structure as the R-ratio used in Chapter 5.
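
A small worked example may help fix the formula. The sketch below assumes that a unit counts toward a group's frequency if it contains at least one word from that group, and toward the joint frequency if it contains at least one word from each group. The function name linkage and the toy corpus are hypothetical; the lecture notebook may organize the calculation differently.

```python
def linkage(units, group1, group2):
    """jointfrequency / (frequency(group1) * frequency(group2)) over `units`."""
    g1, g2 = set(group1), set(group2)
    n = len(units)
    has_1 = [bool(g1 & set(u)) for u in units]
    has_2 = [bool(g2 & set(u)) for u in units]
    freq_1 = sum(has_1) / n
    freq_2 = sum(has_2) / n
    joint = sum(a and b for a, b in zip(has_1, has_2)) / n
    # Undefined (division by zero) if either group never appears.
    return joint / (freq_1 * freq_2)


# Hypothetical corpus: each inner list is one unit of analysis.
units = [
    ["gender", "knowledge", "testimony"],
    ["logic", "language"],
    ["oppression", "knowledge"],
    ["metaphysics", "time"],
]
feminist_terms = ["gender", "oppression"]
epistemology_terms = ["knowledge", "testimony"]

value = linkage(units, feminist_terms, epistemology_terms)
print(value)  # 0.5 / (0.5 * 0.5) = 2.0
```

A value above 1 means the two groups appear together more often than they would if their appearances were independent; a value below 1 means they appear together less often.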


Natural Language Toolkit   Python package used for cleaning and analyzing natural language data.


Stopwords   Very common words in a particular language that are uninformative from an analytic perspective (e.g., "the", "it", and "and").
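
The sketch below shows what removing stopwords looks like in practice. To stay self-contained, it uses a tiny hand-written stopword set as a stand-in; with the Natural Language Toolkit one would typically load a much longer list via nltk.corpus.stopwords.words("english") after downloading the "stopwords" resource. The function name remove_stopwords and the sample sentence are hypothetical.

```python
# Hypothetical miniature stopword list; NLTK ships a much longer English list.
STOPWORDS = {"the", "it", "and", "of", "a", "is", "in"}


def remove_stopwords(text):
    """Lowercase the text, split on whitespace, and drop stopwords."""
    tokens = text.lower().split()
    return [token for token in tokens if token not in STOPWORDS]


cleaned = remove_stopwords("The ethics of care and the politics of gender")
print(cleaned)  # ['ethics', 'care', 'politics', 'gender']
```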


For-loop   A computational technique wherein the same code is repeated for every instance in which a parameter satisfies a particular set of logical conditions.
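
For instance, a for-loop can apply the same check to every text in a collection. In the minimal sketch below, the list of abstracts and the search term are hypothetical.

```python
# Hypothetical abstracts standing in for scraped text data.
abstracts = [
    "Gender and the structure of testimony",
    "A puzzle about vagueness",
    "Care ethics and gender justice",
]

count = 0
for abstract in abstracts:
    # The same test runs once for each abstract in the list.
    if "gender" in abstract.lower():
        count += 1

print(count)  # 2
```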


Matrix   Two-dimensional mathematical object in which the number (or "entry") in each row-column pairing represents some quantification of the relationship between what is represented by the row in question and what is represented by the column in question.
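
As a simple illustration, the sketch below builds a matrix as a list of rows using nested for-loops. Each entry counts the units of analysis in which the row's term and the column's term both appear. This is a plain co-occurrence count rather than the linkage calculation from the lecture, and the terms and units are hypothetical.

```python
row_terms = ["gender", "care"]
col_terms = ["ethics", "epistemology", "logic"]

# Hypothetical units of analysis, each reduced to its set of words.
units = [
    {"gender", "ethics"},
    {"care", "ethics"},
    {"gender", "epistemology"},
    {"logic"},
]

matrix = []
for row_term in row_terms:
    row = []
    for col_term in col_terms:
        # Entry = number of units containing both the row and column term.
        both = sum(1 for unit in units if row_term in unit and col_term in unit)
        row.append(both)
    matrix.append(row)

for row_term, row in zip(row_terms, matrix):
    print(row_term, row)
# gender [1, 1, 0]
# care [1, 0, 0]
```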