EH352-7-SU-CO:
Advanced Methods for Text As Data: Natural Language Processing

The details
2023/24
Essex Summer School in Social Science Data Analysis
Colchester Campus
Summer
Postgraduate: Level 7
Current
Monday 22 April 2024
Friday 28 June 2024
30
03 February 2023

 

Requisites for this module
(none)
(none)
(none)
(none)

 

(none)

Key module for

(none)

Module description

With the recent explosion in the availability of digitized text and expanded access to computing power, social scientists are increasingly leveraging advanced computational tools for the analysis of text as data. In this course, students will explore the application of many advanced approaches for text-as-data research in the social sciences.

The course will begin with an overview of text-as-data research for social scientists, orienting students to the general area and contextualizing the advanced approaches we will explore in the class. Then, we will begin to extend our text-as-data work beyond the "bag of words" to models that better represent the richness of text.
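
As a minimal illustration of what the "bag of words" discards, two sentences with opposite meanings can map to exactly the same word counts. This is only a sketch: the example sentences and the helper name `bag_of_words` are invented here and are not taken from the course materials.

```python
from collections import Counter


def bag_of_words(text):
    """Lowercase, split on whitespace, and count tokens (word order is lost)."""
    return Counter(text.lower().split())


# Two invented sentences that differ only in word order.
a = bag_of_words("the dog bites the man")
b = bag_of_words("the man bites the dog")

# The two count representations are identical even though the meanings differ.
same_representation = (a == b)
print(same_representation)
```

Models that go beyond the bag of words aim to retain the kind of information (here, who bites whom) that this representation throws away.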

Next, the course will turn to embedding-based representations of texts and the underlying distributional theory. We will begin with static embedding models like word2vec and GloVe, and will discuss the benefits and utility of embedding-based representations for social science research.
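
The distributional idea behind embeddings (words that occur in similar contexts receive similar vectors) can be sketched with simple co-occurrence counts. This is a count-based stand-in, not word2vec or GloVe, which learn dense vectors from much larger corpora. The toy corpus, the window size, and the function names below are assumptions made purely for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy corpus, invented for illustration only.
corpus = [
    "the senator voted for the bill",
    "the senator voted against the bill",
    "the deputy voted for the bill",
    "the cat sat on the mat",
]


def cooccurrence_vectors(sentences, window=2):
    """Map each word to counts of the words seen within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            stop = min(len(tokens), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


vectors = cooccurrence_vectors(corpus)
sim_related = cosine(vectors["senator"], vectors["deputy"])
sim_unrelated = cosine(vectors["senator"], vectors["cat"])

# "senator" and "deputy" share more context words than "senator" and "cat".
print(sim_related > sim_unrelated)
```

In this toy corpus, "senator" and "deputy" appear next to the same words ("voted", "for"), so their vectors are closer than those of "senator" and "cat", which share only "the".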

We will then further our work on embeddings by transitioning to contextual embeddings. To inform our understanding of pretrained contextual embedding models like ELMo and BERT, we will explore neural networks and deep learning in NLP, and will learn how to develop and deploy our own deep learning models. In doing so, we will cover feedforward neural networks, recurrent neural networks, and transformers. Then, we will explore transfer learning, or how to leverage pretrained models for application in our own specific domains.

Finally, we will explore an area of increasing interest at the confluence of NLP and social science research: causal inference with text. In this section, we will explore how and where text is being used as part of causal research designs, with a focus on efforts to leverage embedding-based representations in those designs.

Module aims

No information available.

Module learning outcomes

Students will gain an understanding of important concepts and tools at the leading edge of text-as-data research and of how they can be applied in the social sciences. In so doing, the course will equip students to be knowledgeable consumers of advanced text-as-data research and provide them with the tools to design and carry out more advanced text-based approaches in their own work.

Module information

Course prerequisites:

Participants are assumed to have completed a course in text-as-data / quantitative text analysis covering basic text processing, supervised learning (e.g., classification), and unsupervised learning (e.g., scaling, topic modeling), such as 1B. Some facility with Python and/or R is assumed.

Module information will be made available at https://essexsummerschool.com/.

Please contact essexsummerschoolssda@essex.ac.uk and govpgquery@essex.ac.uk with any queries.

Learning and teaching methods

Most days will be split into roughly 2 hours of lecture and 1.5 hours of computing tutorials. Where possible, computing examples will be demonstrated in both Python and R. Students should be aware that some modern NLP models are extremely computationally intensive, requiring GPUs and/or hours or even days for realistic examples to complete. In these cases, tutorials will be limited to "toy examples" or will be demonstrated only partially live in class.

Bibliography

This module does not appear to have a published bibliography for this year.

Assessment items, weightings and deadlines

Coursework / exam Description Deadline Coursework weighting

Exam format definitions

  • Remote, open book: Your exam will take place remotely via an online learning platform. You may refer to any physical or electronic materials during the exam.
  • In-person, open book: Your exam will take place on campus under invigilation. You may refer to any physical materials such as paper study notes or a textbook during the exam. Electronic devices may not be used in the exam.
  • In-person, open book (restricted): The exam will take place on campus under invigilation. You may refer only to specific physical materials such as a named textbook during the exam. Permitted materials will be specified by your department. Electronic devices may not be used in the exam.
  • In-person, closed book: The exam will take place on campus under invigilation. You may not refer to any physical materials or electronic devices during the exam. There may be times when a paper dictionary, for example, may be permitted in an otherwise closed book exam. Any exceptions will be specified by your department.

Your department will provide further guidance before your exams.

Overall assessment

Coursework: 100%
Exam: 0%

Reassessment

Coursework: 100%
Exam: 0%
Module supervisor and teaching staff

 

Availability
No
No
No

External examiner

Dr Anthony Mcgann
Resources
Available via Moodle
No lecture recording information available for this module.

 

Further information

Disclaimer: The University makes every effort to ensure that the information in its Module Directory is accurate and up to date. Exceptionally, it can be necessary to make changes, for example to programmes, modules, facilities or fees. Reasons for such changes might include a change of law or regulatory requirements, industrial action, lack of demand, departure of key personnel, change in government policy, or withdrawal or reduction of funding. Changes to modules may, for example, consist of variations to the content and method of delivery or assessment of modules and other services, the discontinuation of modules and other services, and the merging or combining of modules. The University will endeavour to keep such changes to a minimum, and will also keep students informed appropriately by updating its programme specifications and module directory.

The full Procedures, Rules and Regulations of the University governing how it operates are set out in the Charter, Statutes and Ordinances and in the University Regulations, Policy and Procedures.