Figuring out how to ask a research question, how to take one and develop it over time, allow it to evolve, is a crucial part of a successful humanities analytics project. As we'll see, one of the key utilities of a research question, one of the reasons you want one, is that it keeps a humanities analytics project on track. It's not just an epistemically good thing to do; it's not just something that will help you ask better questions and get better forms of knowledge. It's also a way to bring a collaboration together and keep it moving in a way that benefits everybody, that allows everybody's individual ideas and talents to contribute to the success of the project as a whole. So, research questions: if we were to name the primary skill, the skill that precedes all the other skills in humanities analytics, well before we get to the technical problems of a statistical analysis or the computational problem of setting up a data analytics pipeline, it is this much deeper, much harder problem of how to phrase a good research question, and how to develop one over the course of an investigation. So, in this series of videos, we'll do two things: first, I'll tell you what a research question is, and then I'll give you a series of examples of both good questions and bad questions, with a view towards allowing you to develop and, over time, perfect your own skills. First, definitions: what is a research question? A research question is a succinct sentence, literally fewer than 25 words, literally ending in a question mark, that organizes your investigations, directs your attention, and keeps everybody on board. Those three features, organizing, directing, keeping everybody on board, are the critical ones; this is what a research question will do. It will prevent you from wasting too much time; it'll prevent you from spiraling out in too many different directions; it will draw your attention, so it won't just tell you what not to do, it will also tell you where to look; and, the aspect so crucial for collaborations, having the research question be explicit, something out there for everyone to share, is important for keeping everybody on board, for everybody's efforts to be directed in a synergistic fashion. Beyond that, research questions are enormously varied; they're endless; they are, in a way, the meat of scholarship and science itself. If we look at the playful example from a previous lecture, what's going on in the blurbs on the back of Rebecca Spang's book, sitting in there, in a nascent and inchoate form, is a whole bunch of possible research questions one could start trying to answer. At different scales of focus you might say, look, there's a research question in here: what do academics value? Where did the academic blurb come from? What are the genres of praise? If somebody handed you a blurb on the back of a book and started talking about it, you might say: well, sitting in what you brought me, there seems to be a whole bunch of potential research questions. So, let's give some examples.
And one of the key aspects of research questions is that they tend to evolve over time. If we begin with the Rebecca Spang blurb, one possible research question is: where did this thing come from? What's the origin of the 20th-century academic monograph blurb? Sitting behind that question, at a slightly larger scale, is a sociology-of-knowledge, almost philosophical question: what do academics actually think good work is? And so the middle question, the middle-scale question, you might think of as a new lens on that larger problem. So we have the broad research question, what do academics think makes for a good piece of work, and the narrower one, where did the 20th-century blurb come from. Over time, as you try to answer that question, whether through standard or traditional historical methods or through some computational analysis, the question itself might evolve. For example, at some point you might make a discovery: "wow, when we look at the 20th-century academic blurb, something weird happens in all of our metrics in 1965." Now, all of a sudden, you have a research question that has evolved out of the original one, one you would never have thought to ask coming in; simply because you tried to answer the one in the middle, you've uncovered something that now needs to be answered. So, going from a very broad question, one understandable to basically anybody, whether a grant officer, a dean, or a colleague, and getting more and more specific over time, can often, in the end, illuminate the large question. The trajectory of a research question over the course of an investigation often involves narrowing, as you come to understand what your data can do, what the archive can do, what the real points at issue are, and, over time, what the real problems, the surprises, are. But as that question narrows, often near the end of an investigation you find you've made a contribution to a far broader question, one you might have begun with, or one that you realize, almost in a psychoanalytic way, was sitting behind your work all the time. On an even larger scale than the sociology of academia, than the history of a field of knowledge, the study of academic blurbs may in fact be contributing to a really deep, broad, almost literary question: how do people tell stories about excellence? So, what we're getting at here is, first of all, giving you examples of different research questions, but also giving you a sense that the research question you come in with is almost certainly not the research question you leave with. And not only that: even once you've narrowed it down and gotten your answer, the thing you're really answering, the big thing you're making a contribution to, may be quite unexpected. We find this all the time in humanities analytics. We don't quite know where the research question is going to lead us over time, although we certainly have to pretend that we do when we apply for grants.
So, that's one example of a sequence of research questions as they might evolve over the course of an investigation. Here's another. Suppose my research question is: I'm really interested in how people praise things. There are a couple of problems with this research question, and the first is that it's not actually a question. What you've done is say, "well, there's something I'm interested in." But you work this out over time, and you might say: okay, look, I went to the café, I thought about this; given the kind of data I have, given how much time I have, maybe I want to ask this question: what genres of praise can be found in the 20th-century academic blurb? That's a research question, and it's not too bad. In the process of trying to answer it, you may in fact make a discovery. As you deal with the data that's come in, for example your gigantic stack of blurbs from every academic monograph of the last hundred years, as you go through it looking for patterns, for distinct ways in which people praise, you might say: "actually, look, I made a discovery: in 1920 the rhetoric of praise, the genre, was about discovery." Everybody was using metaphors and analogies about uncovering things, or sailing off into a new land and discovering something. And then, around 1980, we no longer have this language of discovery; we have instead all these words and tropes and semantics associated with brilliance and illumination and lenses, references to light and perception. So now you've begun with the question "what genres of praise can be found," and in looking at genres of praise you may discover, "oh look, there's a discovery genre and an enlightenment genre," and if you get really lucky, maybe these are segregated in time, and now you can start to ask how and why the discovery genre gave way to the light genre (a sketch of what that first computational pass might look like follows below). Of course, that very focused research question is now the product of a set of discoveries you've made by asking previous, increasingly focused questions. And the answer to how the language of discovery gave way to the language of enlightenment, if that is indeed what happened (we have a sort of cartoon fantasy discovery here), will not only nail your question; it will also answer research questions that are far broader, far more general, far more of interest to people well beyond the narrow group of your friends who are all interested in academic blurbs. So, these are two examples of the evolution of questions over time: the shift between big general themes, or big general questions, and narrow, direct, focused ones, and the way these coevolve with each other. Let's have a little fun now, because one way to learn how good questions work, how to get better at asking them, is to look at bad questions and see what's gone wrong. So I'm going to give you a series of examples of bad research questions, things that at first glance make you say, "oh, that's great, let's do that," but that, I promise you, will turn out over time to be total disasters.
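Before turning to the disasters, a quick concrete aside on that genre-over-time discovery. Here is a minimal sketch of what a first computational pass might look like, emphatically not a real analysis: the file name blurbs.csv, its columns, and both word lists are hypothetical, invented purely for illustration.

```python
# Hypothetical sketch: track two invented "genre" lexicons across decades
# in a (made-up) CSV of blurbs with columns "year" and "text".
import csv
from collections import defaultdict

DISCOVERY_WORDS = {"discover", "discovery", "uncover", "unearth", "voyage", "frontier"}
LIGHT_WORDS = {"brilliant", "illuminate", "illuminating", "luminous", "lens", "dazzling"}

# decade -> [discovery hits, light hits, total tokens]
counts = defaultdict(lambda: [0, 0, 0])

with open("blurbs.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        decade = (int(row["year"]) // 10) * 10
        tokens = [t.strip(".,;:!?\"'") for t in row["text"].lower().split()]
        counts[decade][0] += sum(t in DISCOVERY_WORDS for t in tokens)
        counts[decade][1] += sum(t in LIGHT_WORDS for t in tokens)
        counts[decade][2] += len(tokens)

for decade in sorted(counts):
    d, l, n = counts[decade]
    if n == 0:
        continue  # skip decades with no text
    print(f"{decade}s: discovery {1000*d/n:.2f}/1k words, light {1000*l/n:.2f}/1k words")
```

If the cartoon discovery were real, the first rate would fall and the second rise as you move from the 1920s toward the 1980s; the point of the sketch is only that a focused question ("what genres of praise can be found?") turns directly into something you can measure.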
So, the standard, most common failure mode of a question in humanities analytics is something along the lines of "can method X do thing Y?" Why do people end up asking this kind of question? Often you're in the faculty club with an English professor and a computer scientist, and the computer scientist says, "oh, we have this new algorithm, and we can spot when people are going to go on a Twitter rant," and the English professor says, "oh, I have all of these angry letters in the Ember Review, where poetry critics are yelling at each other; can your method detect when someone's going to say something mean in a letter to the Ember Review?" This is fun; it's the Lego-brick approach to scholarship. The problem is that it's a terrible question, because the answer is one of two things. Either the answer is yes: okay, cool, cool story, computers are cool, method X can do thing Y. Or the answer is no: oh well, sorry it didn't work, and also your tenure packet is due, and now you're in trouble, because you spent a year on this. So, in contrast to this question, good questions tend to be interesting regardless of the answer. When you have a research question, try to think of all the possible answers, and they had all better be interesting. In this case, neither one is. If method X can do thing Y, it's at best an amusing thing to tell someone over coffee. If the answer is no, it's even less interesting: there was never a reason it should or should not work. It was developed for Twitter; why should it work on the Ember Review? The other thing implicitly missing from this question, which a good question will have in play, is a connection to big ideas in the field. Whatever thing Y is, this question isn't really getting at it; it's looking at some feature of interest to scholars in the humanities, but not in a very deep way. We'll come back to this and give you an example of how a variation on "can method X do thing Y" actually can work, but for now: don't do this. It's the most common failure mode, at least in my experience. It's usually the first thing somebody asks, because oftentimes a humanist has just found a computer scientist, or you're the computer scientist with a method in hand and you think, "well, this clearly has to be useful beyond its original domain." More bad questions: "what's up with X?" This is a bad question too. It's a start, but the problem is that it's unfocused. You've specified a domain of interest, but you haven't gone any further. A good question narrows things down; it puts your colleagues on the same page. A good question directs your attention to something new, something within the domain X you said you're interested in. So, a better variation, a way to unfold the question "what's up with X" (say, "what's up with Shakespeare?"), is, at the very least, "what's up with this feature of X?" What's up with tragedies in Shakespeare?
At the very least, you've gotten a little further than "what's up with Shakespeare?" What's up with the use of high versus low diction in Shakespeare? By drawing your attention to some feature, you make a little progress; your research question becomes better and, in this case, more focused. Another variation on "what's up with X" that often works better is "why did X happen?" Or "how does X happen?" Or "when does X happen?" "What's up with major changes in artistic taste?" Well, who knows. Why do these changes happen? How do they happen? Under what conditions, or when, do they happen? Those are better, more refined versions of a research question that is otherwise too unfocused to give you any purchase on where to begin. Yet more bad questions: "we have this big dataset that cost someone a million dollars?" There's only one answer to this question: "yes, yes we do." It is, of course, a deeply dissatisfying question, and it's a very common one, often because people create archives first; in fact, we often digitize things without really knowing why we want to digitize them. It's a common problem that somebody, hopefully not you, somewhere in some government agency, will have spent a million dollars to create an archive that no one knows what to do with. Good questions go in the other direction: they put the ideas ahead of the evidence. Now, it's okay to be inspired by the creation or the existence of a newly digitized archive; that is the origin of many fantastic pieces of work I've been involved with. Everybody has archive lust; that kind of excitement gets your blood up, and that's wonderful. It's also often how you begin formulating a research question: your attention is drawn to a field simply because the archive exists. However, nobody will love your archive as much as you do. What they will love are the ideas: the big conceptual questions, the big challenges, the big problems, the analyses that the evidence makes possible. So begin with the ideas, not with the evidence; at the very least, when all is said and done and you have to give the talk, begin with the ideas. Yet more bad questions, and this one is another form of "can method X do thing Y?", still bad, but here we go: "X is a well-known fact in our field; every historian of the period knows X; can a computer, potentially with a really sexy archive, detect it?" Rebecca Spang and I have called this the King Lear Problem. The King Lear Problem is that, after three years, you're able to say the following: "I've trained a pattern recognition system on Shakespeare's plays; I have this eight-layer neural network in PyTorch; we stole a hundred thousand dollars from the history department and ran it on the Amazon cloud; and it turns out that King Lear is a tragedy." The problem is that we already knew this. We already knew that King Lear is a tragedy, and no one really cares that a computer can know that thing too. They might be impressed, they might be amused by it, but
there's no benefit, no advance, no deepening of our knowledge of Shakespeare, or, for that matter, of algorithms for text classification, that comes from being able to provide this answer. There's a parallel problem to the King Lear Problem, which is the Problem Play Problem. It's the exact same thing, just from a different angle. In Shakespeare we have comedies, we have tragedies, and we also have these things we call problem plays: are they comedies? They have some features of comedies. Are they tragedies? They have some features of tragedies. The Winter's Tale is an example. So the Problem Play Problem is: "I've trained a pattern recognition system on Shakespeare's plays, and it turns out that The Winter's Tale is a tragedy." In this case, you may have provided an answer that people don't already have. The failure here, however, is that you haven't justified it. You may have given people knowledge, but it isn't justified knowledge; they have a belief, and if you beat them over the head enough they might say, "fine, The Winter's Tale is a tragedy, I believe your computer," but you haven't actually given them any reasons to believe it, and the practice of scholarship is, in large part, the practice of giving reasons to believe things. That said, this is actually not as bad a question as it looks. It's a form of "can method X do thing Y?", and in some ways it's even worse, since we already knew the answer, but it contains the seed of a good question, because you can maybe get to a question like the following: what features of King Lear are tragedy-like? So it isn't just "okay, my text classification system tells you that King Lear is a tragedy." It may be the case that with some work, some technical work and some philosophical, conceptual thinking-through of the problem, you can figure out: "ah, the machine thinks King Lear is a tragedy on the basis of this or that feature." And once we have a sense of what the features are, the features triggering that classification, that detection, we may in fact understand more deeply what a tragedy is. It may open up the question of the nature of tragedy within literary scholarship in a new way. So that may potentially be a source of power. Answering a well-known question can, unfortunately, often turn into a very condescending thing: the computer scientist says, "I've discovered that King Lear is a tragedy," the English professor says, "well, I already knew that," and the computer scientist says, "ah, yes, but you didn't really know; now the computer has validated your knowledge." This is never a successful way to sell anything to anybody: telling them that you're smarter than they are. But if you're able to take that algorithm and pull it apart, and David and I will give examples of how to do this, to say why the algorithm is making those decisions, what the detection pattern is, sometimes called interpretable machine learning, then you may actually be able to make some progress; you may be able to make a contribution to our understanding, our interpretation, our reflection on this example of Shakespearean tragedy.
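To make that "pull the classifier apart" move concrete, here is a minimal sketch, and emphatically not the pipeline described above: it assumes a hypothetical file plays.csv with columns title, text, and genre, and swaps the eight-layer network for a deliberately transparent bag-of-words logistic regression from scikit-learn, whose learned weights you can read off directly.

```python
# Hypothetical sketch of the interpretability move: train a transparent
# classifier, then ask which words push a play toward "tragedy".
import csv
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Assumed (invented) data file: columns "title", "text", "genre",
# where genre is "tragedy" or "comedy".
with open("plays.csv", newline="", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))
texts = [r["text"] for r in rows]
labels = [r["genre"] for r in rows]

# Bag-of-words features: crude, but every feature is a readable word.
vectorizer = CountVectorizer(stop_words="english", min_df=2)
X = vectorizer.fit_transform(texts)
clf = LogisticRegression(max_iter=1000).fit(X, labels)

# The interpretable step: with binary labels, positive coefficients favor
# the alphabetically later class (here "tragedy"). These are the features
# "triggering" the classification.
words = vectorizer.get_feature_names_out()
for weight, word in sorted(zip(clf.coef_[0], words), reverse=True)[:20]:
    print(f"{word}: {weight:+.3f}")
```

Real interpretable machine learning goes well beyond reading off coefficients, but the spirit is the same: the model has to give reasons, not just labels, which is exactly what the practice of scholarship demands.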
So, a shorter way to say this, our kind of heuristic, is that good questions empower and advance the field; they don't just computerize it, they don't just replicate something a scholar in the humanities already knows. The replication may be a crucial step: the fact that your algorithm roughly matches the classifications made over the course of the field's history may be something you want, but it can't be where you end. Something more has to happen. All right, yet more bad questions. Sometimes you'll end up with a research question that looks like this: "in order to understand Machiavelli's Prince, you have to really understand the historiography, not just Machiavelli's context; there have been a couple of eras in our understanding, and they shifted, and there's this thing called the Cambridge School..." This kind of thing, which we in the humanities all do, and scientists do as well, is a great way to begin a lecture. At the end of 17 hours, there may be a question mark that is the world's greatest question mark of all time, and it's like, "hey y'all, this is great, you've now posed this amazing question," but it is not a research question. Technically it fails because it's more than 25 words, but the deeper reason it fails is that it's not, as we might say, exoteric. It won't help you connect with your collaborators. Say you're a political theorist: you might be able to squeeze your question into 25 words, but only through the use of jargon, of narrow technical assumptions shared by others in your particular subdiscipline. And what that means is that when you're sitting around after three exhausting hours with your collaborator (maybe you're the statistician and your collaborator is in literature, maybe you're the historian and your collaborator is in computer science, who knows), and you need to ask, "what question are we actually trying to answer?", you need a 25-word thing that people can all understand, that people can all get on the same page about really quickly, so that you can redirect yourselves, keep that conversation going, and take advantage of all the skills and ideas and interests people have. So, this is probably the most contentious thing I've noticed in teaching people to phrase a research question: in general, every piece of knowledge, every research question, unfolds and ramifies continuously. Every research question sits in a much larger epistemic context. In one sense, that's why a good research question is a good research question; but the question itself has to be phrased in such a way that everybody in the room, not just people in your particular discipline, can grasp it.
So, good questions are short, not to harp on this: 25 words or less. And they're exoteric: they don't need a lot of background or introduction, they don't require somebody to buy into an enormous number of values, and they don't require somebody to have been in your particular PhD program in order to grasp them and be enthused and excited about them. So, that has been the goal of this series of videos: a definition and examples of research questions. An important thing to say is that phrasing a research question in the way we've introduced it here is not a necessary skill for getting a PhD in the humanities. If you're a faculty member or a graduate student in the humanities, you may never, in your career, have phrased one. It's common for people to ask, "well, what's your question?", and research questions in our sense are a sub-case of that, but very often, because of the nature of education and work and scholarship, we're never that explicit. That's partly because work is often solo, so research questions stay implicit in one's mind. It's also partly because most of our education, certainly most of PhD training, happens among people who share an enormous amount of background knowledge. By the time you're ABD, you're surrounded by people who have been in your field for perhaps ten years, from their undergraduate careers through graduate school, people with whom you share an enormous shorthand, and the research questions are so implicit that you may never have drawn them up to the surface to look at, and it's very unlikely you've drawn them up in such a way that you could show them to a friend in a completely different discipline. Probably the time you had the most practice developing research questions was early in your graduate school career, when you were still drinking at the grad student bar. So, don't be embarrassed, and don't worry, if you struggle with formulating a research question in a humanities analytics project. That said, you should struggle, because it's a great skill, an important one to develop over time. It's not just a way to keep yourself on track; it's not just a way to make sure you're asking and answering really important questions that are meaningful to others; it's not just a way to make sure everybody in your collaboration is participating equally. It's also a way to make your research communicable to people outside academia itself. If you develop the skill of asking a research question, framing it, and letting it develop over time, if you're good at this, you'll also be really good at teaching your work, at writing popular articles about it, and at communicating the value of that work to people well outside the academic domain. Being able to say that the last three years of your life were devoted to answering the question "how do people talk about excellence?" is not just something that will make your research deeper; it will also make your research comprehensible to those around you, including, of course, the people who've supported and funded you.
Implicit in all of this, in all these examples of how questions develop, succeed, and fail, is the question just sitting there: "how do you find a good research question?" The short answer is: if you know, call me. Tell me if you've found the answer to this question of all questions. The reason I say that is that 90% of good scholarship, at least in my experience of humanities analytics and related fields, is actually this stage. Finding the good research question, and knowing how to develop it, how to focus it, how to broaden it, is almost all of the work. In the end, a really well-phrased research question makes everything else a technical problem. It may be a tedious one (you have to look through a hundred books) or a technically challenging one (I need to find all the word pairs with these qualities), but having formulated the question, you can almost just parcel out the labor, in a kind of Adam Smith pin-factory fashion. The good research question itself: that is the real challenge. More seriously, okay, no one really knows, but one way to develop your skill in asking and answering good research questions is to practice, and to work with others. The challenge of a research question is often the brevity and the depth, and when you're working with other people and phrasing research questions to each other, others are a far better judge of whether you've brought the implicit interest in your head to the surface in an exoteric, comprehensible way. So, where are we at the end of this series of videos on research questions? We've given you some examples: one related to the history of praise, the other to the genre, the literature, the rhetoric of praise. What are good research questions? They're interesting regardless of the answer. Generally, research questions whose answer is either yes or no are not good, although sometimes the yes can be interesting and the no can be interesting for the same question, in which case, go for it. Good research questions connect to big ideas in your field; at the largest scales, they drive at the things that, if you're a historian, historians care about. They put your colleagues, meaning your collaborators, on the same page: they're comprehensible at whatever stage of the work you're at, so that you and your colleagues can sit down and all say, "okay, in ten words or less, what are we doing? What are we trying to answer this morning?" Good research questions tend to direct your attention to something in the field, something new, something perhaps unexpected. And finally, good research questions empower and advance the field. They don't just computerize it, a common failure mode of humanities analytics; they don't just tell people what they already knew, plus a computer. In the next series of lectures, we're going to talk about one of the ways you answer good research questions, which, in the case of humanities analytics, focuses on the discovery of patterns.
If you take a look at the assignments for these lectures, you'll find examples and challenges: to come up with research questions of your own, to retrospectively identify research questions in projects you've been involved in, and to identify them in projects that form some of the canonical examples of excellence in your own field. Thank you.