Detailed course information

Title

Automated Text Analysis

Dates

20-23 September 2021

Workshop language: English
Organizer(s)

Elisa Volpi, CUSO

Instructor(s)

Nikolay Marinov, Houston, USA

Description

Computational-Text-Analysis-for-PoliticalScience

Session 1

· Github 

· Jupyter Notebooks 

· Python Interactive Shell: Print Function 

· Assigning Variables and Values 

· Understanding Different Types of Objects in Python 

· Mathematical Operations 

· Boolean Operations 

· Transformation of Objects 

· Input function 

Session 2

· Understanding Python Strings 

· Combining Objects 

· Libraries in Python 

· Data Structures and Containers 

· List Comprehension 

Session 3

· Loops and Control Flow: for loops; if, elif, else conditional statements; while loops; continue and break statements 

· Appending Lists 

· Defining Functions 

· Revise list comprehension 

· Join command 

· Opening, Closing, and Updating Text Files: codecs library; Python built-in open function

· Basic Text Processing Operations: look for words; remove punctuation; remove digits 
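
The basic text processing operations listed for this session can be sketched with the standard library alone. This is only an illustration, not the workshop's own material: the function names (`remove_punctuation`, `remove_digits`, `look_for_words`) and the sample sentence are assumptions made for the example.

```python
import string


def remove_punctuation(text):
    """Drop every ASCII punctuation character from the text."""
    return text.translate(str.maketrans("", "", string.punctuation))


def remove_digits(text):
    """Drop every ASCII digit from the text."""
    return text.translate(str.maketrans("", "", string.digits))


def look_for_words(text, targets):
    """Return the target words that occur in the text (case-insensitive)."""
    tokens = set(text.lower().split())
    return [word for word in targets if word.lower() in tokens]


# Hypothetical sample sentence, used only for illustration.
sample = "In 2021, the parliament passed 3 new laws!"
cleaned = remove_digits(remove_punctuation(sample))
found = look_for_words(cleaned, ["parliament", "election"])
print(cleaned.split())
print(found)
```

Splitting on whitespace after cleaning also absorbs the double spaces left behind where digits were removed.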

Session 4

· Loading data into Python in different ways (Codecs, OS, etc.) 

· Manipulating data structures 

· Removing Stopwords

· Understanding and Building a Text-Cleaning Pipeline 
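
A text-cleaning pipeline of the kind named above can be sketched as a chain of small functions applied in a fixed order. The stopword list below is a tiny hypothetical one chosen for the example (real projects typically load a fuller list from an NLP library), and the step names are assumptions rather than the course's code.

```python
import string

# Illustrative stopword list (an assumption); not a complete English list.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "is", "to"}


def lowercase(text):
    return text.lower()


def strip_punctuation(text):
    return text.translate(str.maketrans("", "", string.punctuation))


def tokenize(text):
    # Naive whitespace tokenization; enough for a sketch.
    return text.split()


def remove_stopwords(tokens):
    return [token for token in tokens if token not in STOPWORDS]


def clean(text):
    """Run the steps in order: lowercase, strip punctuation, tokenize, filter."""
    tokens = tokenize(strip_punctuation(lowercase(text)))
    return remove_stopwords(tokens)


tokens = clean("The Minister of Finance is going to resign, and soon.")
print(tokens)
```

Keeping each step as its own function makes it easy to reorder steps or add new ones (for example lemmatization) later.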

Session 5

· Managing Data with Pandas 

· Manipulating Data with Pandas 

· Mini Data Visualization Session 

· Short intro to Numpy 

Session 6

· Natural Language Processing Cleaning Pipeline 

· Adding to the NLP Pipeline: Tokenization Extra 

· Lambda Operator 

· More Visualization Methods 

Session 7

· Adding to the NLP Pipeline: Lemmatization 

· Adding to the NLP Pipeline: Stemming 

Session 8

· Adding to the NLP Pipeline: POS Tagging 

· Adding to the NLP Pipeline: Named Entity Recognition (NER) 

· Senses 

· Entity Linking 

Session 9

• Word-Embeddings and Vector Space Representation 

• Cosine-Similarity 
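
As a small illustration of cosine similarity, the sketch below compares documents represented as word-count vectors. This is an assumption made to keep the example dependency-free: the session topic is word embeddings, which are dense vectors, but the formula (dot product divided by the product of the vector norms) is the same. The three toy documents are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors (dict-like)."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for empty documents
    return dot / (norm_a * norm_b)


# Hypothetical toy documents as bags of words.
doc1 = Counter("tax reform bill".split())
doc2 = Counter("tax reform debate".split())
doc3 = Counter("football match".split())

sim_close = cosine_similarity(doc1, doc2)  # two of three words shared
sim_far = cosine_similarity(doc1, doc3)    # no words shared
print(sim_close, sim_far)
```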

Session 10

Unsupervised Techniques 

· Clustering 

· Topic-Modelling

Session 11

Supervised Machine Learning Techniques

• Classification: Naive Bayes; Random Forest; Logistic; K-Neighbors; Support Vector Machine 

• Evaluation Methods of Classification: Precision, Recall, F1 Score; Confusion Matrices; Cross-Validation 
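
The evaluation metrics listed above can be computed by hand from true and predicted labels, as in the minimal sketch below. The function name `precision_recall_f1` and the toy labels are assumptions for illustration; in practice a library such as scikit-learn would usually be used instead.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical labels: 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

Here there are two true positives, one false positive and one false negative, so precision, recall and F1 all equal 2/3.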

Session 12

• Different Ways to Approach a Sentiment Analysis Task: Rule-Based Sentiment Analysis; Classification
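
One way to picture the rule-based approach to sentiment analysis is a simple lexicon count: positive words add to a score, negative words subtract from it. The word lists below are a hypothetical mini-lexicon invented for this sketch, not a published resource, and the function names are likewise assumptions.

```python
# Hypothetical mini-lexicon, for illustration only.
POSITIVE = {"good", "great", "support", "success"}
NEGATIVE = {"bad", "crisis", "failure", "oppose"}


def sentiment_score(text):
    """Count positive words minus negative words in the text."""
    tokens = text.lower().split()
    positives = sum(1 for token in tokens if token in POSITIVE)
    negatives = sum(1 for token in tokens if token in NEGATIVE)
    return positives - negatives


def label(text):
    """Map the score to a coarse sentiment label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(label("a great success for the government"))
print(label("the reform was a failure"))
print(label("the vote is tomorrow"))
```

A classification-based approach would instead learn these associations from labelled examples rather than from a fixed word list.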

Session 13

• Twitter Scraping and Data Cleaning Mini Session

Programme

Schedule:

Monday September 20th: 10:00-12:00 and 13:00-15:00

Tuesday September 21st: 10:00-12:00 and 13:00-15:00

Wednesday September 22nd: 10:00-12:00 and 13:00-16:00

Thursday September 23rd: 10:00-12:00 and 13:00-16:00

Location

Zoom

Information
Places

20

Registration deadline: 13.09.2021