Welcome to the LingHacks Resources page! Below is a growing collection of resources on natural language processing (NLP) and computational linguistics, primarily for students (though many may be useful to non-students as well). Have a resource you think the community would find helpful? Let us know via this form!
Before (or while) jumping into NLP-related activities, it might be helpful to learn some introductory material. Below are some free resources that you can use to get up to speed on NLP.
Before learning computational linguistics, it would be beneficial to have a baseline level of familiarity with introductory computer science/programming. The maintainers of this resource page mostly use Python, so that's what this page will be centered around, but NLP resources in other programming languages do exist as well. We also include some references to Python software packages commonly used in NLP.
Math and Statistics
For a precise understanding of the quantitative workings of NLP systems, it helps to understand some [multivariable] calculus, linear algebra, probability, and statistics. It is possible to understand some basic algorithms with less mathematical background, but these areas of math and statistics become more important as NLP systems--particularly deep learning systems--become more complex. We recommend doing calculus (single-variable then multivariable), followed by linear algebra, followed by probability, followed by statistics (though high school AP-level statistics is helpful and can be done before or at the same time as single-variable calculus), but this is by no means a strict ordering requirement.
MIT OpenCourseWare's Introduction to Probability and Statistics
Seeing Theory, a visual introduction to probability and statistics by Brown University
NLP can be considered a subfield of machine learning, so it's beneficial to have a general understanding of the machine learning field before diving into the specifics of applying machine learning to language.
Udacity's Introduction to Artificial Intelligence (you'll have to make a free account to access this link)
Understanding linguistics--the science of language--isn't strictly necessary to do many computational linguistics-related things, but we still think it's helpful to grasp some basic terminology and concepts and ground your study of NLP in linguistic principles.
MIT OpenCourseWare's Introduction to Semantics and Pragmatics
Natural Language Processing
At last, some courses and reference sites specific to NLP!
Stanford's Natural Language Processing with Deep Learning course
Dan Jurafsky and James H. Martin's Speech and Language Processing textbook
Natural Language Processing with Dan Jurafsky and Chris Manning
fast.ai's Code-First Introduction to Natural Language Processing
Whether you're interested in pursuing a research career or just want to be familiar with some hot topics in NLP, here are some websites where you can read the latest NLP research:
The Association for Computational Linguistics Anthology (papers from most of the big NLP conferences and journals, including ACL, IJCNLP, TACL, and EMNLP)
arXiv Computation and Language (preprints of NLP-related papers that may or may not have been peer-reviewed yet)
Papers With Code NLP Tasks (NLP papers accompanied by code, organized by computational task)
Connected Papers: a general paper search tool that helps you visually explore papers that are related to each other or to a given topic
Caring About the World
NLP does not exist in a vacuum. Here are some resources that you can use to learn how and why that is, and what you can do about it. Some of these resources extend to AI and tech in general as well.
Queer in AI's guide on how to make virtual conferences queer-friendly
Queer in AI's preprint on a case study of community-led participatory AI
Shirin Ghaffary's article on racism in hate speech detection algorithms
Field et al.'s survey of race, racism, and anti-racism in NLP
Blodgett et al.'s paper on pitfalls in fairness benchmark datasets
Some proceedings of the ACM Conference on Fairness, Accountability, and Transparency
Proceedings of the 3rd Workshop on Gender Bias in Natural Language Processing
Dev et al.'s paper on challenges in representation of non-binary people in language technologies
Hutchinson et al.'s paper on the harms of NLP toward disabled people
Hassan et al.'s paper on the intersectional harms of NLP toward disabled people
Herold et al.'s paper on disability biases in assistive AI technologies, grounded in psychological literature
Caliskan et al.'s paper on human-like biases in machine learning algorithms (source: Race After Technology by Professor Ruha Benjamin)
Data & Society Research Institute's Algorithmic Accountability Primer (source: Race After Technology by Professor Ruha Benjamin)
Allied Media Conference (source: Race After Technology and Anti-Racism Daily)
GramBank (grammatical information for 2400 languages, source: Graham Neubig)
You're probably here to ultimately add to your resumé. So, here's how you can do that.
North American Computational Linguistics Olympiad - open to 6th- through 12th-grade students based in the US and Canada; happens every January
AI4ALL Summer Programs - open to high school students (exact demographics and grade levels vary by site and by year); the organization also periodically hires staff (college-aged or higher) to support student programs
Johns Hopkins Center for Language and Speech Processing Workshops - open to undergraduate+ students (may or may not happen each summer)
Stanford Center for the Study of Language and Information summer internship - open to undergraduate students
National Science Foundation Research Experiences for Undergraduates, CS opportunities
Start your own inclusive CS initiative that may or may not be related to NLP - check out NCWIT's AspireIT Toolkit for some guidance
CodePath - interview prep, iOS, Android, and cybersecurity courses for college students interested in software engineering
Carnegie Mellon University Language Technologies Institute News page - they often post internship listings, research spotlights, and news about resources/courses
EPFL's summer internship programs: Summer@EPFL and EEE - open to undergraduate and master's students
Max Planck Institute summer internships - open to all university students
ETH Zurich summer internships - open to college students
Imperial College London undergraduate research opportunities program - open to college students
And of course, LingHacks (all info on this site)! We host hackathons, provide resources for you to start clubs, and host (or partner to host) workshops in NLP. Sign up here to be notified of future events!