IBM Releases Deep Search For Scientific Discovery
Written by Nikos Vaggalis
Tuesday, 16 August 2022
IBM's Deep Search for Scientific Discovery (DS4SD) toolkit has been made available to the public. It comes out of IBM's research labs and uses NLP to analyze massive amounts of data.
Deep Search is a cloud-based AI research service, offered as SaaS, that lets researchers load large amounts of structured or unstructured data and quickly find useful connections in it. The sources it can consume range from journal articles and patents to technical reports and more. Using AI and NLP, it can ingest 20 pages per second, whereas a typical human expert takes 1–2 minutes per page just to read, which makes it roughly 1,200 to 2,400 times faster. It automatically extracts the semantic units in the documents and the relationships between them, then builds a searchable knowledge graph which enables its users to:
robustly explore information extracted from tens of thousands of documents without having to read a single paper.
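To make the idea of a searchable knowledge graph concrete, here is a minimal sketch in plain Python. It is not how Deep Search is implemented: the `KnowledgeGraph` class, its methods, and the sample entities and file names are all invented for illustration. It only shows how facts extracted from separate documents can be linked so that a connection spanning two papers becomes a simple lookup.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Toy graph of (subject, relation, object) facts, each tagged with its source."""

    def __init__(self):
        # entity -> set of (relation, other_entity, source_document)
        self._edges = defaultdict(set)

    def add(self, subject, relation, obj, source):
        # Store the fact in both directions so either entity can be the entry point.
        self._edges[subject].add((relation, obj, source))
        self._edges[obj].add((f"inverse:{relation}", subject, source))

    def neighbors(self, entity):
        """Entities directly connected to `entity`."""
        return sorted({other for _, other, _ in self._edges.get(entity, ())})

    def sources_linking(self, a, b):
        """Documents that state a direct relationship between `a` and `b`."""
        return sorted({src for _, other, src in self._edges.get(a, ()) if other == b})

    def connect(self, start, goal):
        """Intermediate entities that link `start` and `goal` in two hops."""
        return sorted(set(self.neighbors(start)) & set(self.neighbors(goal)))


# Hypothetical facts, as if extracted from two different papers.
kg = KnowledgeGraph()
kg.add("compound_A", "inhibits", "protein_X", source="paper_1.pdf")
kg.add("protein_X", "associated_with", "disease_Y", source="paper_2.pdf")

print(kg.connect("compound_A", "disease_Y"))  # prints ['protein_X']
```

Neither hypothetical paper mentions both the compound and the disease, yet the graph surfaces the link through the shared protein, which is the kind of cross-document connection the article describes.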
Deep Search has been widely adopted in the scientific field, for instance in Covid research and in work on alternative cancer treatments, where it works out the connections between individual research papers, or in discovering new molecules. The use cases are not confined to medical research, though: it can be applied wherever there is data in the form of documents, such as legal briefs, financial statements, technical specifications, research papers and slide decks.
IBM has made part of the service available in the form of a toolbox, calling it Deep Search for Scientific Discovery (DS4SD). The toolbox is split into two parts: the Deep Search Experience and the Deep Search Toolkit.
The Deep Search Experience is the automatic document conversion service. It lets users upload documents and inspect the quality of their conversion through a simple drag-and-drop interface that is easy for non-experts to use. This part is not open source, but it has been made publicly available online for anyone to use. To work with the Deep Search Experience service, you upload your document and let it work its magic.
The Deep Search Toolkit, on the other hand, is an open source Python package that lets users interact with the Deep Search platform programmatically, uploading and converting documents in bulk. They can point it at a folder and direct the toolkit to upload the documents, convert them, and ultimately analyze the contents of the text, tables, and figures. The Deep Search Toolkit is available as a PyPI package and can be installed using standard Python package managers such as pip.
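The "point to a folder and convert in bulk" workflow can be sketched as follows. This sketch deliberately does not call the Deep Search Toolkit itself: `find_documents`, `batched` and `convert_folder` are hypothetical helper names, and `convert_batch` is a stand-in for whatever upload-and-convert call the toolkit actually provides. It assumes that call accepts a list of file paths and returns a mapping from each path to its converted result.

```python
from pathlib import Path
from typing import Callable, Dict, Iterator, List, Sequence


def find_documents(folder: Path, suffixes: Sequence[str] = (".pdf",)) -> List[Path]:
    """Recursively collect files under `folder` whose extension is in `suffixes`."""
    wanted = {s.lower() for s in suffixes}
    return sorted(
        p for p in folder.rglob("*") if p.is_file() and p.suffix.lower() in wanted
    )


def batched(items: List[Path], size: int) -> Iterator[List[Path]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def convert_folder(
    folder: Path,
    convert_batch: Callable[[List[Path]], Dict[Path, object]],
    batch_size: int = 10,
) -> Dict[Path, object]:
    """Send every document in `folder` to `convert_batch`, one batch at a time.

    `convert_batch` is a hypothetical placeholder for the real conversion call.
    """
    results: Dict[Path, object] = {}
    for batch in batched(find_documents(folder), batch_size):
        results.update(convert_batch(batch))
    return results
```

Passing the converter in as a function keeps the folder walking and batching logic testable without network access or credentials; in real use the placeholder would be replaced by the call documented in the toolkit's own repository.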
The Deep Search Experience is reachable online, while the Python Deep Search Toolkit can be found in its repo.
The wider context is that we are entering an era in which the evolution of AI and advances in computer science will play a crucial role in moving society forward. That is one ingredient necessary for success; the other is democratization, open sourcing those tools so that they are available to as many brains as possible. This multiplies the chances of making a groundbreaking discovery and so changing the world for the better.
Last Updated ( Tuesday, 16 August 2022 )