Resources

From FachschaftSprachwissenschaft

This page is for sharing literature and tools to help everybody in their studies. Examples include wordlists, books freely available on the Web, LaTeX style files, software you have written... go ahead!

If you make your own content available, please specify the license, or at least the conditions, under which you would like to release it.

Papers online

This section is for papers that you can cite without worry: they have been published in an official journal and are available online in that same version. Please pay attention to that when adding to this section!

For electronic papers: Make sure you are at a university computer or connected to the university network via VPN. This way, you can download many papers for free that otherwise require registration and payment.

Books

Books freely available on the net. If you do not find what you're looking for, you might want to try WikiBooks.

Linguistics

Literature about Phonology, Morphology, Syntax, Semantics, Pragmatics, Logic from a natural language perspective and all sorts of new research topics (DRT, OT, LFG, *TAG, Tree Languages...)

Computer Science

Books about programming languages, algorithms, theoretical computer science... Please do not confuse this section with the one about Computational Linguistics.

How to Think Like a Computer Scientist: Learning with Python

An introduction to Python 2 programming for people with no previous programming experience. Available for free online: http://ada.rg16.asn-wien.ac.at/~python/how2think/english/

Dive Into Python

"Dive Into Python" by Mark Pilgrim is a thorough introduction to the Python programming language for experienced programmers. It is undergoing constant development and is available for free in several languages: http://ccn.ucla.edu/tutorials/diveintopython/

Blackburn, Bos and Striegnitz: Learn Prolog Now!

A free online book about the logic programming language Prolog.

Covington: Prolog Programming in Depth

Prolog Programming in Depth can serve you as a very nice reference to Prolog, as it is a searchable PDF and also very straightforward and systematically arranged. As an introduction to Prolog it is also more accessible than The Art of Prolog. You can download a draft of it here. (In case it becomes unavailable, you can ask me about providing you with the PDF.)

J. Tang: Write Yourself a Scheme in 48 Hours

This may serve as a nice introduction to functional programming languages and/or parsing routines in Haskell. While Haskell is still not widely used for NLP, that may very well change in the future, as it provides most of what NLP-interested people want: performance, ease of use, robustness, proximity (in syntax and semantics) to mathematical and logical paradigms, functional programming, type signatures, built-in lambda calculus, and so on. In the course of this book you will also write a Scheme parser based on the Parsec library for Haskell. It is available from the author or as a WikiBook.

Computational Linguistics

Books that fall neither into pure computer science nor into pure linguistics, but deal with topics from both fields.

Blackburn and Striegnitz: Natural Language Processing Techniques in Prolog

Another free online book, this time focused not on Prolog itself but on using Prolog for NLP tasks specifically.

Code

Code you've written and would like to share. Please do not forget to add a license for it.

  • BananaSplit: dictionary-based, simplistic compound splitter that works both as a library and as an application, written by Niels Ott.
  • ClusterLib: Java library for hierarchical bottom-up clustering by Ramon Ziai and Niels Ott.
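
To give a rough idea of what "dictionary-based compound splitting" means, here is a minimal sketch in Python. It is not BananaSplit's actual algorithm or API: the function name `split_compound`, the `min_len` parameter and the toy lexicon are assumptions made up for illustration, and linking elements such as the German "-s-" are ignored.

```python
def split_compound(word, lexicon, min_len=3):
    """Split `word` into parts that are all in `lexicon`.

    Returns a list of parts, or None if no complete split exists.
    Hypothetical helper for illustration only.
    """
    word = word.lower()
    if word in lexicon:
        # A known word is never split any further.
        return [word]
    # Try the longest candidate head first, so that long known words win.
    for i in range(len(word) - min_len, min_len - 1, -1):
        head, tail = word[:i], word[i:]
        if head in lexicon:
            rest = split_compound(tail, lexicon, min_len)
            if rest is not None:
                return [head] + rest
    return None


# Toy lexicon (an assumption; a real splitter would load a full word list).
lexicon = {"sprach", "wissenschaft", "wissen", "fach", "schaft"}
print(split_compound("Sprachwissenschaft", lexicon))
print(split_compound("Fachschaft", lexicon))
```

With this toy lexicon the two calls yield `['sprach', 'wissenschaft']` and `['fach', 'schaft']`. A word that cannot be covered completely by lexicon entries yields `None`.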

LaTeX

Tübinger Sprüchle

Style file by Niels Ott for the official statement to be included (and signed) in every Schriftliche Hausarbeit or written summary. Usage: \SpruechleAufsagen{YourName}. If you want another statement to be displayed, use \renewcommand{\DasTuebingerSpruechle}{YourBlaBlaHere}. (See also this blog post.)

Student Projects

Here are some examples of software produced by students. Please feel free to link your own code here.

A Reimplementation of the WERTi System

Aleks' re-implementation of the WERTi system (originally written by Detmar Meurers and his research associates) is currently available as source code on GitHub. WERTi is an ICALL (Intelligent Computer-Assisted Language Learning) system designed to ease the process of second language acquisition through various interactive online activities. The system will soon be deployed on a public server, where you can try it out.

Java Libraries around NLP

Levenshtein Distance

Also called string edit distance.
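
For readers who have not met it before: the Levenshtein distance between two strings is the minimum number of single-character insertions, deletions and substitutions needed to turn one into the other. The following is a minimal sketch of the standard dynamic-programming computation, written in Python for brevity rather than Java. The function name `levenshtein` is our own choice and is not taken from any of the libraries listed here.

```python
def levenshtein(a, b):
    """Return the Levenshtein (string edit) distance between a and b."""
    # Keep the shorter string in the inner loop to save memory.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the empty prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        # current[0] = distance between a[:i] and the empty string
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

Only two rows of the table are kept at any time, so memory use grows with the length of the shorter string, while running time is proportional to the product of both lengths.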

NLP Pipeline Components

  • OpenNLP is a statistics-based toolkit that provides code and model files for several languages (tagging, tokenization, sentence splitting, parsing, entity annotation).

Web and Miscellaneous Resources

This is the section for resources you have stumbled upon on the Web that might be of interest to your fellow students.

Linguistics

Online collections of papers? Tips for a linguist's survival in our world of studies? Here they go.

Semantics Archive

The Semantics Archive is an online collection of various papers concerning natural language semantics. An invaluable resource for anyone interested in the meaning of words, phrases and discourse.

LaTeX for Linguists

This site is simply the single most helpful place on the Web if you are a linguist who wants to get their stuff done in the superior-to-all-other-typesetting-systems typesetting system LaTeX. It covers almost everything, from Chomsky-style examples, trees, discourse representation structures and all sorts of HPSG-esque matrices to linguistic bibliographies, and is surely worth a read.

Computational Linguistics

Academic Writing

This section contains references to style guides for academic (English) writing. You may find this useful if you're currently taking classes in Academic Writing or writing your B.A. thesis, M.A. thesis, some other thesis, or a term paper. Some of the links are rather technical and deal with the intricacies of writing larger documents in LaTeX (you are not going to write your paper in anything else, are you now?), others refer to general guidelines about style, content, and language of such documents.

Thesis in LaTeX

  • Using LaTeX to Write a PhD Thesis seems to be a nice site explaining how to handle your papers (especially larger PhD-thesis-like papers) in LaTeX. Thanks to Katya for pointing us to the link.
  • Many LaTeX document classes were designed for letter page format. The KomaScript classes adapt the layout for European formats such as A4 and provide many other useful functions.

Style Guidelines: Reports

Typically, an ISCL student has to write at least one report during the course of their studies: the internship report. The following links will hopefully help you to get on the right track.

  • Another "handbook" about writing lab reports that may contain nice hints for internship reports.

Some Hints

Thanks to Janina Radó, who taught the Academic Writing course in SS 2008, for these useful hints:

  • The only parts with subjective information are the introduction and the conclusion. The rest is purely factual. That should also be reflected in the language: avoid 1st person if possible, use neutral/impersonal forms ("was necessary", "this was solved by developing ...", "this led to an improvement..."). Don't overdo passive voice, though. 1st person is really only needed if you use it to distinguish things you did alone from those you did as part of a group, and you can even get around that one.
  • The main part of the report is the task results part. That is already done, so it should be described using past tense. In the rest of the paper you'll typically want to use present tense.
  • The report should not read like a diary. There is no need to present the process you went through to arrive at the solution, unless it is important on its own. That is, do not describe dead-ends in great detail simply to justify why you didn't get to complete the project. Mention (or discuss, as appropriate) them only if they are the kind of example that others may want to learn from. Also, the time you spent on a particular subpart does not need to be reflected in the length of the corresponding description.
  • There is no need to talk about obvious "subtasks" such as reading up on the method you were going to use, unless it took up a huge amount of your internship. The report is about how you used your skills and what new skills you learned, demonstrated through the actual thing you "created". It is taken for granted that you can read and learn things that way.
  • Although the presentation is a better place to mention this, you may still want to include a sentence or two in the conclusion about the general atmosphere at the company you were at, how much help you were given and how much you were allowed to try out your own ideas. This is not a requirement and I'm not sure whether students ever look at previous reports before they select a place for their internships, but Dale may remember reading that this place was great (or a great horror) and advise students accordingly.

Computer Nerdery

A section to keep you up-to-date with hacks and interesting stuff that people from the SfS know about computers.

Backups

Everybody should do backups. If you don't, you will eventually, but probably only after you've learned your lesson the hard way. So here are some resources we have gathered and written to reduce the pain of a computer crash two weeks before you're supposed to hand in your thesis.