The NeuroScholar Project is the flagship project for the Biomedical Knowledge Engineering Research Group at the Information Sciences Institute in Marina del Rey. Our group specializes in computational approaches, based on Natural Language Processing and Knowledge Engineering, to working with information drawn from the scientific literature.

The literature is the natural repository for scientific knowledge. Our mission is therefore to find ways to make that information readily available to non-computational biomedical scientists and to help them organize their understanding of their systems of interest.

We have generated two downloadable working systems at present:

  1. The NeuroScholar System
  2. The NeuARt II System

We are developing an NLP architecture for extracting information from large bodies of text. We will also be moving websites soon (watch this space!).

About the NeuroScholar Project

The subject of neuroscience is complex, broad and deep. It uses data from many disciplines: anatomy, physiology, chemistry, physics, molecular biology, cognitive science and ethology, to name a few. It traverses many temporal and spatial scales, from milliseconds to generations and from angstroms to meters. The brain itself has been called 'the most complex object in the known universe' (by Nobel Prize winner James Watson), and the number of individual cells and of connections between them is (literally) astronomical. The biggest challenge to understanding the large-scale organization of the brain across systems, modalities and scales is therefore the complexity of our own data. We contend that knowledge management systems can be built to address this challenge.

The NeuroScholar project provides knowledge engineering software for use by the neuroscience community. The basic framework of our approach is illustrated in the Figure shown below. It shows a typical scenario facing neuroscientists: the information required is scattered across a number of knowledge sources (in the literature, on the web, in local data files in the lab). NeuroScholar permits users to isolate 'fragments' of data from those sources and bring them together to form facts. These facts may then be combined with interpretations and relations to build knowledge representations within NeuroScholar.

Rather than having to remember or keep physical notes about the thousands of individual facts, assumptions and interpretations that underlie a theoretical perspective, scientists can use NeuroScholar to store, retrieve, evaluate and communicate what they think and the reasoning that defines why they think it.
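The workflow described above, in which fragments are combined into facts and facts into interpretations, can be illustrated with a minimal Python sketch. This is not NeuroScholar's actual data model or API: the class names, fields and the `supporting_sources` method are hypothetical, chosen only to show how an interpretation can be traced back to the sources it rests on. Relations between items are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Fragment:
    """A piece of data isolated from a knowledge source (hypothetical type)."""
    source: str   # label for the source, e.g. a citation or a file name
    content: str  # the excerpt itself


@dataclass
class Fact:
    """A statement assembled from one or more fragments (hypothetical type)."""
    statement: str
    fragments: List[Fragment]


@dataclass
class Interpretation:
    """A user's reading of a set of facts (hypothetical type)."""
    claim: str
    facts: List[Fact] = field(default_factory=list)

    def supporting_sources(self) -> List[str]:
        """Return each source behind this claim once, in first-seen order."""
        seen: List[str] = []
        for fact in self.facts:
            for fragment in fact.fragments:
                if fragment.source not in seen:
                    seen.append(fragment.source)
        return seen


# Placeholder values only; no real data is implied.
paper = Fragment(source="journal article (placeholder)", content="excerpt of text")
notes = Fragment(source="lab data file (placeholder)", content="excerpt of data")
fact = Fact(statement="placeholder fact", fragments=[paper, notes])
view = Interpretation(claim="placeholder interpretation", facts=[fact])
print(view.supporting_sources())
```

Keeping the link from each interpretation back to its fragments is what lets a user recover the reasoning behind a claim, rather than only the claim itself.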

The NeuroScholar system is the flagship application of this project, but we have developed other tools to be used in conjunction with it. These deliver specialized functionality to a neuroscience knowledge user, such as the 'Electronic Laboratory Notebook' (ELN), support for schematic diagrams (Diagrammar) and neuroanatomical mapping functions (NeuARt II). The NAWS system is a method for running analyses on NeuroScholar's contents as a remote web service, and the Sangam project is concerned with integrating information between different web services (of which NeuroScholar could be one). We have also built software engineering tools to assist with the construction of NeuroScholar-like knowledge bases. This subsystem is called the 'View-Primitive Data Model framework' (VPDMf).

We currently work in close collaboration with the laboratories of Larry Swanson and Alan Watts at USC, to provide knowledge engineering support for their work concerning detailed neuroanatomical circuitry of the hypothalamus and the effects of glucoprivation as a metabolic form of stress.

We provide the full source code for all aspects of the system, available from SourceForge (http://www.sourceforge.net/projects/neuroscholar). We distribute this code under a license, put together by the legal department at USC, that is very similar in form to the LGPL.

The NeuroScholar system is having some impact in the biological community. We typically announce new releases to four mailing lists: 'comp-neuro@neuroinf.org', 'bio_bulletin_board@bioinformatics.org', our own announcement list ('neuroscholar-announce@lists.sourceforge.net') and a local list for neuroscience graduate students at USC ('ngf@usc.edu'). Announcements of this kind usually cause a significant spike in the number of downloads of our software from the SourceForge website. The figures on the NeuroScholar downloads statistics page are the primary way that we monitor our level of impact within the community.

Research Agenda

The mission of this project is to answer large-scale questions about the organization of the whole brain, across modalities, scales and systems. We strive to accomplish this by developing and delivering stable open-source software in direct collaboration with neuroscientists, so that they can use it in their everyday research. We strive to make our software reliable and general enough to be genuinely useful to the overall community of neuroscientists as well as to our immediate collaborators.

Our strategy can be broken down into the following four general approaches:

  • Build high-throughput tools for knowledge acquisition and management of neuroscience data.
  • Build computational knowledge representations that capture the logical structure of neuroscience experiments.
  • Build modeling/theoretical frameworks for performing computational analysis on our representations.
  • Build working, releasable software! (see our SourceForge project-site for downloads)

The specific aims of our current work may be paraphrased as follows:

  • Develop an ontological framework for neuroscientific experiments in general.
  • Build a knowledge-management system for a laboratory that effectively captures all the knowledge of that laboratory.
  • Use Natural Language Processing (NLP) techniques to extract information efficiently from the current literature, thereby making data entry into the NeuroScholar suite as straightforward as possible.
  • Instantiate these systems in the Watts and Swanson laboratories in a way that empowers and accelerates the work that they are doing.