NLUlite – An NLP Database
Written by Kay Ewbank   
Thursday, 11 September 2014

A new natural language parsing database that reads English texts and can then answer questions about them has been released as a public alpha.

NLUlite has been created to be developer friendly, and consists of a server and a Python client. You use it by passing texts to it. The text is tagged using the tag frequencies provided in the Open American National Corpus (OANC). Sentences are then parsed using parsing frequencies extracted from the OANC. A “distance” between words is obtained from WordNet (3.1). The parsing is then improved by choosing the sentences that make more sense according to the FrameNet dataset.
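To make the idea of tagging by tag frequencies concrete, here is a minimal sketch of a most-frequent-tag tagger. It is not NLUlite's implementation: the tiny `TAGGED_CORPUS` stands in for the OANC, and the tag names, function names and the `default_tag` fallback are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Toy tagged corpus standing in for the OANC; words and tags are illustrative only.
TAGGED_CORPUS = [
    ("the", "DET"), ("snakes", "NOUN"), ("live", "VERB"), ("in", "ADP"),
    ("trees", "NOUN"), ("live", "VERB"), ("live", "ADJ"), ("snakes", "NOUN"),
]


def build_tag_frequencies(tagged_words):
    """Count how often each word appears with each tag."""
    freqs = defaultdict(Counter)
    for word, tag in tagged_words:
        freqs[word.lower()][tag] += 1
    return freqs


def tag_sentence(sentence, freqs, default_tag="NOUN"):
    """Give each word its most frequent tag; unseen words get default_tag."""
    tagged = []
    for word in sentence.split():
        counts = freqs.get(word.lower())
        tag = counts.most_common(1)[0][0] if counts else default_tag
        tagged.append((word, tag))
    return tagged


freqs = build_tag_frequencies(TAGGED_CORPUS)
print(tag_sentence("the snakes live", freqs))
```

In this toy corpus "live" is seen twice as a verb and once as an adjective, so the tagger picks the verb reading. A real system would combine such counts with context, which is where the parsing frequencies come in.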

As an example of the way it works, if you pass it the Wikipedia text about snakes, it can then answer questions such as:

what are the snakes able to do?

where do most of the snakes live?

what animal has no limbs?


Texts can include simple inference rules such as “If an animal has no limbs it cannot walk”, after which you (or a subsequent user) could ask “what does not walk”, and get an answer given in terms of the text submitted and the inference rules you’ve given.
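The way an inference rule extends what can be answered can be sketched in a few lines of Python. This is only an illustration of the idea, not how NLUlite represents rules internally: the triple format, the relation names and both function names are assumptions.

```python
# Facts as (subject, relation, object) triples; a toy stand-in for parsed text.
FACTS = {
    ("snake", "is_a", "animal"),
    ("snake", "has_no", "limbs"),
    ("dog", "is_a", "animal"),
}


def apply_no_limbs_rule(facts):
    """Rule: if X is an animal and X has no limbs, then X cannot walk."""
    derived = set(facts)
    for subject, relation, obj in facts:
        is_animal = (subject, "is_a", "animal") in facts
        if relation == "has_no" and obj == "limbs" and is_animal:
            derived.add((subject, "cannot", "walk"))
    return derived


def what_does_not_walk(facts):
    """Answer the question 'what does not walk' from the known facts."""
    return sorted(s for s, r, o in facts if r == "cannot" and o == "walk")


print(what_does_not_walk(apply_no_limbs_rule(FACTS)))
```

Without the rule the question has no answer, because nothing in the submitted facts mentions walking. Once the rule is applied, the snake is returned.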


Data sources can include web pages and RSS feeds. The data is kept as objects of the Wisdom class. Your code can set up many Wisdom objects, each of which is a separate knowledge base. Currently, NLUlite can only parse texts that are smaller than a megabyte, though the developer plans to increase this limit in future versions. Once a text is parsed, the information is stored as XML.
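The sketch below illustrates the two points in this paragraph: separate knowledge bases that do not share data, and a size check before a text is accepted. The `KnowledgeBase` class is a hypothetical stand-in and does not reproduce the API of NLUlite's Wisdom class. Reading "a megabyte" as 1024 × 1024 bytes is also an assumption.

```python
MAX_TEXT_BYTES = 1024 * 1024  # assumption: "a megabyte" taken as 1 MiB


class KnowledgeBase:
    """Hypothetical stand-in for one independent knowledge base."""

    def __init__(self, name):
        self.name = name
        self.texts = []  # each instance keeps its own texts

    def add_text(self, text):
        """Accept the text only if it is smaller than the size limit."""
        size = len(text.encode("utf-8"))
        if size >= MAX_TEXT_BYTES:
            raise ValueError(
                f"text is {size} bytes; it must be smaller than {MAX_TEXT_BYTES}"
            )
        self.texts.append(text)


# Two knowledge bases that hold separate data.
snakes = KnowledgeBase("snakes")
birds = KnowledgeBase("birds")
snakes.add_text("Snakes are elongated, legless reptiles.")
```

After this runs, `snakes` holds one text and `birds` holds none, which is the sense in which each object is its own knowledge base.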

NLUlite is available in a single-threaded free version, or in a commercial multi-threaded version that parses pages much faster.

While there are a number of natural language projects, such as those of the Stanford Natural Language Processing Group and the Natural Language Toolkit, this field is still developing.

Related Articles

Handbook of Natural Language Processing, 2nd Ed (book review)

Taming Text (book review)




Last Updated ( Thursday, 11 September 2014 )