Research Related Information
Research, being the number one cause of creating a better world so
far, is the noble activity Jonas spends his working time
pursuing. Research topics covered so far include:
- Federations of Smart Objects, and Meme Media.
- Humor generation. More on this below, and example movies are available on another page.
- Image Processing (making a computer understand pictures). I did
what is roughly equivalent to my master's project in this field. My
research was on how color information can be used in "feature
detection" (finding some things in pictures), in this case detection
of "blobs" (round things) and "ridges" (oval things). Traditionally,
this type of processing has been done in greyscale images. The results
were not spectacular. Mainly, you can use color to throw away stuff you
are not interested in (if you know the color of what you seek) and to
get better contrast if the greyscale contrast is low.
- Natural Language Processing (making a computer understand or do
interesting things with human languages, sometimes also called
computational linguistics). My Ph.D. thesis is about these things,
mainly automatic grammar checking and summarization, and some low
level language processing tools usable for different types of language
tasks. I am still in this field, currently working on computational
humor, which means making computers recognize whether something is a joke
and automatically generate jokes.
- Japanology (stuff about Japan and Japanese culture). When I
took an introductory course in Japanese (though I did not have time to
actually attend the course very much) I also had to write
essays. These were supposed to contain fresh new research, though the
time allotted was very, very small, so there are no stunning
results. One essay is an overview of the history of Japanese
mathematics and the one I like best is a study of the spare time
interests of Swedish and Japanese women. These are compared by
analyzing the contents of horoscopes from Japanese and Swedish
magazines for women. Since horoscopes can be assumed to tell you stuff
only in areas that interest you and to be made by people whose job it
is to know what this is, I thought this was one of my better research
ideas so far. The teacher agreed, but thought the essay in itself left
a lot to wish for.
Help Make the World a Better Place!
Since human languages are very loosely defined and hard to get a
handle on, one can rarely prove that one's research is correct. This
leads to the need for evaluating if humans think your program is
working or not. This is annoying, but necessary. If you have time and
interest, there might be evaluation projects going on that you as a
human can take part in. Stuff I have had evaluated so far includes:
- Computer made summaries, are they good or bad?
- Puns in Japanese, are they funny or not (three times)
- Jokes in English, funny or not (still going)
Kindhearted and helpful people can go to the evaluations page.
Publications
Here is a list of my research publications.
BibTeX entries
People who want to cite my work (and who wouldn't?) can find the
citation information needed in this BibTeX file of Jonas's publications.
Ph.D. Thesis
My PhD thesis from 2006 on "Language Technology for the Lazy - Avoiding
Work by Using Statistics and Machine Learning", in pdf.
Natural Language Processing
Ongoing projects
- Adding farting to the joking robots. A robot farting is probably also funny.
Upcoming
- "Using Long N-Grams and Skip-N-Grams to Classify E-Mails for Automatic Answering", Jonas Sjöbergh
2011
- Hercules Dalianis, Jonas Sjöbergh, and Eriks Sneiders, "Comparing Manual Text Patterns and
Machine Learning for Classification of E-Mails for
Automatic Answering by a Government Agency", CICLing 2011, pdf
2010
- Jonas Sjöbergh and Kenji Araki, "What Does 3.3 Mean? Using Informal Evaluation Methods to Relate
Formal Evaluation Results and Real World Performance", International Journal of Computational Linguistics Research, pdf
- Jonas Sjöbergh, Micke Kuwahara, and Yuzuru Tanaka, "Using Web-Based Meme Media Technologies to Create an Integrated Visual Environment for Clinical Trials", The IET International Conference on Frontier Computing 2010, pdf
- Jonas Sjöbergh and Kenji Araki, "Evaluation of a Humor Generation System by Real World Application with ¥500,000 to Win", LaCATODA 2010, pdf
2009
- Jonas Sjöbergh and Kenji Araki, "Robots Make Things
Funnier", in LNAI 5447, "New Frontiers in Artificial Intelligence: JSAI2008 Conference and Workshops, Revised Selected Papers", PDF (6.6 Mb)
- Jonas Sjöbergh and Kenji Araki, "A Measure of Funniness, Applied to Finding Funny Things in WordNet", Pacling 2009, pdf
- Jonas Sjöbergh and Kenji Araki, "A Very Modular Humor Enabled Chat-Bot for Japanese", Pacling 2009, pdf
2008
- Jonas Sjöbergh and Kenji Araki, "A Complete and Modestly Funny System for Generating and
Performing Japanese Stand-Up Comedy", COLING 2008, pdf
- Jonas Sjöbergh and Kenji Araki, "Robots Make Things
Funnier", The first international workshop on
laughter in interaction and body movements (LIBM
2008), pdf
- Dai Hasegawa, Jonas Sjöbergh, Rafal Rzepka and Kenji Araki,
"Are Body Movements Really Important for Joke Systems? Comparing
Different Styles of Joke Performance", The first
international workshop on laughter in interaction and body
movements (LIBM 2008), pdf
- Jonas Sjöbergh and Kenji Araki, "What is Poorly Said is a
Little Funny", LREC 2008, pdf
- Jonas Sjöbergh and Kenji Araki, "A Multi-Lingual Dictionary
of Dirty Words", LREC 2008, pdf
- Jonas Sjöbergh and Kenji Araki, "What Types of Translations
Hide in Wikipedia?", LKR 2008 pdf
2007
- Jonas Sjöbergh and Kenji Araki, "Recognizing Humor
Without Recognizing Meaning", WILF (CLIP) 2007 pdf
- Jonas Sjöbergh and Kenji Araki, "Recreating Humorous Split Compound Errors in Swedish by Using Grammaticality", Nodalida 2007 pdf
- Wanwisa Khanaraksombat and Jonas Sjöbergh, "Developing and
Evaluating a Searchable Swedish-Thai Lexicon", Nodalida 2007
pdf
- Martin Hassel and Jonas Sjöbergh, "Widening the HolSum
Search Scope", Nodalida 2007 pdf
- Jonas Sjöbergh and Kenji Araki, "Automatically Creating
Word-Play Jokes in Japanese", NL-178 pdf (abstract in Japanese!)
- Jonas Sjöbergh, "Older versions of the ROUGEeval
summarization evaluation system were easier to fool",
Information Processing & Management, Special Issue on
Summarization (still not printed, but coming out in 2007) pdf
2006
- Jonas Sjöbergh, "Vulgarities are fucking funny, or at
least make things a little bit funnier" Tech report pdf
- Jonas Sjöbergh, "The Internet as a Normative Corpus: Grammar
Checking with a Search Engine" Tech report pdf
- Jonas Sjöbergh and Viggo Kann, "Vad kan statistik avslöja
om svenska sammansättningar?", Språk och stil, vol. 16,
2006. pdf (Swedish)
- Martin Hassel and Jonas Sjöbergh, "Towards Holistic
Summarization: Selecting Summaries, not Sentences", LREC
2006. pdf
- Jonas Sjöbergh and Kenji Araki, "Extraction based
summarization using a shortest path algorithm", 12th Annual Language
Processing Conference NLP2006, Yokohama, Japan, 2006. pdf
2005
- Martin Hassel and Jonas Sjöbergh, "A Reflection of the Whole
Picture Is Not Always What You Want, But That Is What We Give You",
presented at the "Crossing Barriers in Text Summarization
Research" workshop at RANLP'05. pdf
- Jonas Sjöbergh and Ola Knutsson, "Faking Errors to Avoid Making Errors: Very Weakly Supervised
Learning for Error Detection in Writing", presented at RANLP'05. pdf
- Jonas Sjöbergh, "Creating a free digital Japanese-Swedish
dictionary", presented at PACLING 2005. pdf
(see also www.japanska.se where some of the
fruits of this work are being used)
- Jonas Sjöbergh, "Chunking: an Unsupervised Method to Find
Errors in Text", Nodalida 2005. pdf
- Johnny Bigert, Jonas Sjöbergh, Ola Knutsson and Magnus Sahlgren,
"Unsupervised Evaluation of Parser Robustness", presented at CICLing
2005. pdf (received second place
best paper award)
2004
- Johnny Bigert, Viggo Kann, Ola Knutsson and Jonas Sjöbergh,
"Grammar Checking for Swedish Second Language Learners",
CALL (Computer Aided Language Learning) for the Nordic Languages,
2004. pdf
- Jonas Sjöbergh and Viggo Kann, "Finding the Correct Interpretation of Swedish
Compounds, a Statistical Approach", LREC 2004,
Lisbon, Portugal. ps
2003
- Jonas Sjöbergh, "Bootstrapping a free part-of-speech lexicon
using a proprietary corpus", ICON 2003, India. ps pdf
- Jonas Sjöbergh, "Stomp, a POS-tagger with a different view",
RANLP 2003, Borovets, Bulgaria, 2003. ps
- Johnny Bigert, Ola Knutsson and Jonas Sjöbergh, "Automatic
Evaluation of Robustness and Degradation in Tagging and Parsing",
RANLP 2003, Borovets, Bulgaria, 2003. pdf
- Jonas Sjöbergh, "Combining POS-taggers for improved accuracy on Swedish
text", NoDaLiDa 2003, Reykjavik, 2003. ps pdf
Master's Thesis, 2001
I did my master's project at CVAP (Computer Vision and
Active Perception Laboratory), which is a part of Nada, KTH. My thesis is available in pdf
(gzipped) (2 Mb), though everything except the abstract is in
Swedish. Since it is about computer vision, there are a lot of
pictures, though. I studied how you can use the extra information
in colour images (compared to the grey scale images normally used)
when detecting features in images. I concentrated on blob
(circles) and ridge (ellipses) detection, in a scale space model
(so you can find all sizes of blobs/ridges).
The two main things you can use colour for (that I found) are
better contrast (the contrast can be high in one colour channel
while still being low in the grey scale image) and filtering
irrelevant features if they have the wrong colour.
It was tested in an application which did hand following and
gesture recognition. The finger tips and palm are blobs (though on
very different scales) and the fingers are ridges. Anything that
is not skin coloured is probably not a part of the hand. There
used to be a video
(13 Mb) of the hand recognizer running on input from my
feature detection program. Last time I checked, it was still
there.
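As a rough illustration of the two uses of colour described in this section, here is a minimal Python sketch. It is not code from the thesis. The function names, the ITU-R BT.601 grey weights, the contrast measure (largest value minus smallest), the colour distance threshold, and all pixel values are assumptions made up for the example.

```python
# Illustrative sketch only, not the thesis implementation.
# All names, thresholds and pixel values below are hypothetical.

def to_grey(pixel):
    """Convert an (r, g, b) pixel to grey using the ITU-R BT.601 luma weights."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def contrast(values):
    """Crude contrast measure: largest value minus smallest value."""
    return max(values) - min(values)


def channel_contrasts(pixels):
    """Return the contrast of the grey image and of each colour channel."""
    return {
        "grey": contrast([to_grey(p) for p in pixels]),
        "red": contrast([p[0] for p in pixels]),
        "green": contrast([p[1] for p in pixels]),
        "blue": contrast([p[2] for p in pixels]),
    }


def colour_distance(a, b):
    """Euclidean distance between two (r, g, b) colours."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def keep_matching_colour(features, target, max_distance):
    """Keep only features whose colour is close to the target colour."""
    return [f for f in features
            if colour_distance(f["colour"], target) <= max_distance]


# A reddish and a greenish pixel with almost the same grey value.
pixels = [(200, 50, 50), (50, 126, 50)]
contrasts = channel_contrasts(pixels)

# Candidate features (for example detected blobs) with their mean colour.
features = [
    {"name": "fingertip", "colour": (220, 170, 140)},
    {"name": "mug", "colour": (30, 60, 200)},
]
skin = (210, 160, 130)  # assumed skin colour, chosen for the example
kept = keep_matching_colour(features, skin, max_distance=40)

print(contrasts)
print([f["name"] for f in kept])
```

In this toy data the grey contrast is below 1 while the red channel contrast is 150, and only the skin coloured feature survives the filter.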
Demos
- Demo of SnålGranska
from 2004, a very resource lean grammar checker. The demo is
trained on Swedish, but the method works for pretty much any
language. The paper "Faking errors ..." and parts of the paper
"Grammar checking for Swedish second language learners" above
treat the underlying technologies.
Other Publications
2002
- Johnny Bigert, Ola Knutsson, Viggo Kann, Jonas Sjöbergh
(2002). Annotated Clauses and Flat Phrase Structures for Swedish,
Swedish Treebank Symposium, Växjö, November 2002. pdf
Japanology
Here are my student essays in Japanology. With only three
weeks allotted for doing research, writing and attending endless
amounts of seminars, no stunning results were achieved. Everything
is written in Swedish.
- "Japans matematikhistoria, en översikt" (The
Mathematical History of Japan, an Overview), autumn 2004,
pdf (Swedish)
- "Japanskors och svenskors intressen: en studie av
damtidningshoroskop" (The interests of Japanese and Swedish Women:
a Study of Women's Magazine Horoscopes), spring 2005, pdf (Swedish)