Tag der Computerlinguistik
The Day of Computational Linguistics will be held sometime in June this year and is meant to attract potential students from nearby high schools and colleges to our degree program. After an introduction to computational linguistics in general and to Tübingen's ISCL in particular, attendees will be free to gather information at several different sections, each devoted to one particular facet of CL. The event is currently being organized by the Fachschaft members; if you are willing to join the preparations, you are very welcome to do so.
- 1 Date, Place and Time
- 2 Poster
- 3 Program
- 4 Sections
- 5 Ideas / Sources
- 6 Misc
Date, Place and Time
The Open Door Day will take place on 21 June in the SfS.
It is probably also sensible to fix a start time and a duration for the event. How about starting at 10 a.m. and running for 4-5 hours at most? (Of course, we can always continue on demand, but it is good to have a plan.)
Poster
Here are both a PNG version of the poster and the original SVG, made with Inkscape. Please use the SVG if you are going to make any changes to the poster, and the PNG if you only want to look at it.
Here is the Java source that was used to generate the background "noise" (which was originally taken from gutenberg.org and is based on an excerpt from Romeo and Juliet).
Program
Prof. Hinrichs will give the visitors an official introduction, presenting Tübingen and the ISCL course of studies.
- Place: Not known yet
Everyone can visit a total of five different sections, each devoted to a particular topic:
- Linguistics - Place: Not known yet
- Mathematics / Logic / Philosophy - Place: Not known yet
- Information Retrieval / Text Mining / Text Technology - Place: Not known yet
- Programming - Place: Not known yet
- Algorithms - Place: Not known yet
We will invite people from EML as well as IBM and probably other companies (Daimler Chrysler?) to give talks. The absolute maximum number of talks is three.
- Place: Not known yet
No content yet.
- Place: Not known yet
We want to invite our teachers to give a sort of introductory lecture. Whom should we invite? Ideas so far are:
- Sam Featherston - Place: Not known yet
- Frank Richter - Place: Not known yet
Food and Drinks
During the whole program, or at least a large part of it, food and drinks will be served in the hall. The faculty will pay for this as well.
- Place: Somewhere in the hall. I think the first floor makes sense.
Sections
Each section will give a short intro and is to be manned by two of us. Please volunteer.
Linguistics
Presentation of intriguing examples, most likely from German, since most attendees are going to be German. Ideas include:
- Collection of marked sentences in Sternefeld 2006 - initiate discussion about their grammaticality (e.g. "weil es wird aufhören können zu regnen" vs. "weil es hätte aufhören müssen zu regnen", "den Kuchen bäckt die Mutter und isst der Franz" vs. "den Kuchen bäckt die Mutter und isst der Franz Kaugummi")
- Presenting ambiguities in languages
- Show how different languages can be (there is an excellent example of Chinese weirdness here). We should only mention languages that people can actually learn in Tübingen. Good candidates for weirdness are certainly Old Irish and Nahuatl. We could actually present one language from every major typological category (e.g. Turkish for agglutinative, Icelandic or Old Irish for inflectional, Chinese for isolating and Nahuatl for (moderately) polysynthetic).
On the basis of the examples, we can try to justify bracketing patterns and tree structures and present them.
Mathematics / Logic / Philosophy
This is the station for convincing the mathematically minded of our program.
At this station, people will be able to play around with a few mathematical concepts and tools that we use every day. It is somewhat hard to assess how much mathematical background people will have, so we should be prepared to explain everything from scratch. Offering a broad overview rather than a few little gems might help to avoid problems if some parts turn out to be less understandable than expected, and it also minimizes the risk of boring the audience.
I know that I am probably proposing way too much here. Please tell me which of these numerous ideas you consider adequate, or provide me with some additional ideas.
On the whole, I suggest concentrating on three major topics:
1. Theoretical Computer Science
- demonstrate finite-state technology by means of a transducer that encodes some fancy morphological rules, preferably something German such as subjunctive inflection or plural forms for certain noun classes; perhaps use some graphical tool to project the FST onto a wall and let it process random strings?
- explain the canonical "S --> NP VP" style toy CFG and discuss how this describes a language (introduce notions such as syntactic structure, derivation, ambiguity etc.)
- take this toy CFG to introduce CYK parsing and let people fool around a bit with it
- explain why it is not wise to simply try out all alternatives until the solution is found; this could be a good way of introducing complexity classes
- mention some undecidable problems and point out intuitively why they are undecidable
- create some confusion and mystery about NP-completeness and the P=NP problem
2. Logic
- introduce the basic set-theoretic notions and state some common sense theorems
- informally introduce basic predicate logic (boolean connectives, quantifiers etc.)
- demonstrate how useful FOL is for expressing facts about objects and their relations ("model theory")
- introduce the canonical scope ambiguity example (ExAy vs AxEy) to motivate its use in formal semantics
- maybe show the Peano axiomatization for natural numbers (not really CL-related, but nice to discuss notions like axioms, models etc.)
3. Discrete Mathematics
- introduce graphs and especially trees, explaining how to formalize them
- introduce the concepts of recursion and induction by proving some trivial property of trees
- combinatorics, e.g. "How many ways are there to bracket an expression?"
- some illustrative example of combinatorial explosion, perhaps with some hints on how to avoid it
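For whoever prepares the combinatorics part, the bracketing question above can be explored with a few lines of Python. This is only a sketch for the organizers, not something to show to visitors; the function names `count_bracketings` and `bracketings` are made up for this illustration. It counts the ways to fully bracket a chain of items into binary groups and, for short chains, lists them. The counts grow very quickly, which also makes this a handy example of combinatorial explosion.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_bracketings(n_items: int) -> int:
    """Count the ways to fully bracket a chain of n_items (n_items >= 1)."""
    if n_items <= 1:
        return 1
    # The outermost bracket splits the chain into a left part of size
    # `left` and a right part with the remaining items.
    return sum(
        count_bracketings(left) * count_bracketings(n_items - left)
        for left in range(1, n_items)
    )


def bracketings(items):
    """Yield every full binary bracketing of a list of strings."""
    if len(items) == 1:
        yield items[0]
        return
    for split in range(1, len(items)):
        for left in bracketings(items[:split]):
            for right in bracketings(items[split:]):
                yield f"({left} {right})"


if __name__ == "__main__":
    for n in range(1, 11):
        print(n, count_bracketings(n))
    for tree in bracketings(["a", "b", "c", "d"]):
        print(tree)
```

The memoization via `lru_cache` is itself a small hint on how to avoid the explosion: each subproblem is computed once instead of being recomputed for every split.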
Information Retrieval / Text Mining / Text Technology
How do search engines work? What is a (linguistic) corpus? Ideas:
- Present an annotated corpus with a cool interface (latest SPLICR alpha version maybe)
- Automatic text mining (possible demo application: WERTi)
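To back up the "How do search engines work?" question with something concrete, a tiny inverted index might be enough. The sketch below is an assumption about what such a demo could look like, not an existing tool of ours; the documents and the function names (`normalize`, `build_index`, `search`) are invented for illustration. It maps every word to the set of documents containing it and answers a query by intersecting those sets.

```python
from collections import defaultdict

PUNCTUATION = ".,;:!?\"'()"


def normalize(text):
    """Lower-case the text and split it into punctuation-free tokens."""
    tokens = (raw.strip(PUNCTUATION) for raw in text.lower().split())
    return [token for token in tokens if token]


def build_index(docs):
    """Map each token to the set of ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in normalize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the ids of the documents that contain every query token."""
    terms = normalize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


if __name__ == "__main__":
    # Hypothetical mini corpus, made up for this sketch.
    docs = {
        1: "The cat sat on the mat.",
        2: "The dog chased the cat.",
        3: "A dog barked.",
    }
    index = build_index(docs)
    print(search(index, "cat"))
    print(search(index, "the dog"))
```

Real search engines add ranking, stemming and much more on top, but the lookup-by-word idea is the part that is easy to explain at the station.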
Programming
This section will give visitors a short introduction to computer science as practised in CL. It will contain an introduction to problem solving using systematic methods (probably algorithms, though people have voted to put that into the mathematics/logic section), including (but not limited to):
- Object-oriented programming
- Presentation of typical homework assignments or projects (passivator)
Algorithms
This was Anas's idea: it makes it possible to explain algorithms without actually showing or using any "scary" code.
- Sorting and search algorithms could actually be used for an activity game. Let two teams of people try to sort a chaotic array of objects with as few steps as possible. People can choose to adhere to one of the standard algorithms or to use human intuition. Starting from the results, one could then introduce notions such as amortized analysis, divide-and-conquer, worst-case behaviour and average-case behaviour.
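For the two of us manning this station (not for the visitors), it may help to know in advance how many steps the standard algorithms need on a given arrangement, so that the teams' results can be compared against them. The following sketch counts comparisons for insertion sort and merge sort; treating "steps" as comparisons is an assumption, and the function names are made up for this illustration.

```python
def insertion_sort(items):
    """Sort a copy of items; return (sorted_list, number_of_comparisons)."""
    a = list(items)
    comparisons = 0
    for i in range(1, len(a)):
        j = i
        while j > 0:
            comparisons += 1
            if a[j - 1] <= a[j]:
                break  # element has reached its place
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
    return a, comparisons


def merge_sort(items):
    """Divide-and-conquer sort; return (sorted_list, number_of_comparisons)."""
    a = list(items)
    if len(a) <= 1:
        return a, 0
    mid = len(a) // 2
    left, left_count = merge_sort(a[:mid])
    right, right_count = merge_sort(a[mid:])
    merged = []
    comparisons = left_count + right_count
    i = j = 0
    # Merge the two sorted halves, counting each comparison.
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, comparisons


if __name__ == "__main__":
    worst_case = [8, 7, 6, 5, 4, 3, 2, 1]  # reversed: worst case for insertion sort
    print("insertion sort:", insertion_sort(worst_case))
    print("merge sort:", merge_sort(worst_case))
```

On a reversed arrangement, insertion sort needs n(n-1)/2 comparisons, whereas merge sort stays within roughly n log n, which gives a concrete starting point for talking about worst-case behaviour and divide-and-conquer.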
Ideas / Sources
It seems that there is a very nice introduction to CL on the web pages of CL in Stuttgart. Anyone willing to share a link? Also, Hubert Truckenbrodt's lecture notes for the introduction to phonology and Ede Zimmermann's lecture notes for the introduction to semantics are very easy to understand and contain a lot of good examples.
Misc
This is the section for all the small things that we can or have to do.
- Guest Book
- Flyers and info materials to take away
- Some "Werbegeschenke" (promotional giveaways) would also be quite nice to have
- Orientation sheets (maps and posters showing the way to the different rooms)