spaCy Natural Language Processing Library Released
Written by Alex Denham   
Tuesday, 15 October 2019

There's a new release of spaCy, a natural language processing library for Python that its developers describe as industrial-strength and blazingly fast, with a simple and productive API.

The development team says spaCy excels at large-scale information extraction tasks. It's written from the ground up in memory-managed Cython, the superset of Python that aims to provide C-like performance with code that is written mostly in Python. Independent research in 2015 found spaCy to be the fastest syntactic parser in the world. spaCy can be used to prepare text for deep learning and it interoperates with TensorFlow, PyTorch, scikit-learn, and Gensim.

The new release is described as being leaner, cleaner and even more user-friendly, with new model packages and features for training, evaluation and serialization. Performance on lower-cased text has also improved. Models trained on well-formed data, with consistent casing and a formal style, tend to do worse when they are used on real-world text with inconsistent casing and punctuation. The developers are tackling this with a new data augmentation system, and the first feature to be introduced in the v2.2 models is a word replacement system that also supports paired punctuation marks, such as quote characters.
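
To make the idea concrete, here is a minimal sketch of this kind of augmentation in plain Python. It is not spaCy's implementation: the function names, the `lower_prob` parameter and the table of quote styles are all assumptions made up for illustration. It randomly lower-cases a tokenized sentence and swaps quote characters while keeping each opening and closing mark matched.

```python
import random

# Hypothetical table of (opening, closing) double-quote styles.
# Illustrative only; this is not the table spaCy uses.
DOUBLE_QUOTE_STYLES = [
    ('"', '"'),            # straight quotes
    ("\u201c", "\u201d"),  # curly quotes
    ("``", "''"),          # treebank-style quotes
]
OPENERS = {opening for opening, _ in DOUBLE_QUOTE_STYLES}
CLOSERS = {closing for _, closing in DOUBLE_QUOTE_STYLES}


def replace_paired_quotes(tokens, style):
    """Rewrite quote tokens in one style so each pair stays matched."""
    new_open, new_close = style
    out = []
    inside = False  # True while we are between an opener and its closer
    for tok in tokens:
        if not inside and tok in OPENERS:
            out.append(new_open)
            inside = True
        elif inside and tok in CLOSERS:
            out.append(new_close)
            inside = False
        else:
            out.append(tok)
    return out


def augment(tokens, lower_prob=0.5, rng=None):
    """Return a noisy copy of tokens for use as extra training data."""
    rng = rng or random.Random()
    # Pick a single quote style per sentence so pairs remain consistent.
    out = replace_paired_quotes(tokens, rng.choice(DOUBLE_QUOTE_STYLES))
    # Lower-case the whole sentence with probability lower_prob.
    if rng.random() < lower_prob:
        out = [tok.lower() for tok in out]
    return out


sentence = ["She", "said", ",", "\u201c", "Hello", "\u201d", "."]
print(augment(sentence, rng=random.Random(0)))
```

Choosing the quote style once per sentence, rather than per token, is what keeps the punctuation paired. A model trained on a mix of original and augmented sentences sees the same annotations attached to noisier surface forms.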

New pretrained models have been added for Norwegian and Lithuanian, though the developers say accuracy on both languages should improve in subsequent releases, as the current models use neither pretrained word vectors nor the spaCy pretrain command.

New CLI features for training have been added, especially for text categorization. The text categorizer now has integrated support in the CLI, so you can train it with the same kind of command you would use for the parser, entity recognizer or tagger. Error messages have been improved, the documentation has been updated, and the evaluation metrics are more detailed.

 

More Information

spaCy Homepage

Related Articles

Rule-Based Matching In Natural Language Processing

Transformers Offers NLP For TensorFlow and PyTorch

Facebook Open Sources Natural Language Processing Model  

PyTorch Adds TorchScript API

NVIDIA Updates Free Deep Learning Software

TensorFlow - Google's Open Source AI And Computation Engine

TensorFlow 2 Offers Faster Model Training

Zalando Flair NLP Library Updated

Intel Open Sources NLP Architect

Google SLING: An Open Source Natural Language Parser

Spark Gets NLP Library

Microsoft Expands Cognitive Services APIs

 
