Predictive Analytics

Probabilistic Modeling, Predictive Analytics & Intelligent Design from Multiple Medical Knowledge Sources

Bioingine.com: a Probabilistic Modeling and Predictive Analytics Platform for A.I.-driven Deep Learning to discover pathways from clinical data, suggesting ontologies for pharmacogenomics, achieving personalization, and driving precision medicine.

Data Integration in the Life Sciences: 11th International Conference, DILS 2015, Los Angeles, CA, USA, July 9-10, 2015, Proceedings

The Feature Diagram from the book above:-

[Figure: Pharmacogenomics pathway]

Pharmacogenomic knowledge representation, reasoning and genome-based clinical decision support based on OWL 2 DL ontologies

Combining multiple knowledge sources and ontologies:-

[Suggested Ontologies for Pharmacogenomics converging to help find a Pathway]
  • Patient Data (HL7, C-CDA)
  • Gene Ontology
  • ChEBI Ontology

Integration of Knowledge for Personalized Medicine:- Pharmacogenomics case-study

Looking Forward: The Case for Intelligent Design (and Infrastructure) in Life Science Biologics R&D Sponsored by: Dassault Systèmes; Alan S. Louie, Ph.D. January 2015

http://gate250.com/tc2/IDC%20Biologics%20White%20Paper.pdf


Semantic Data Lake Delivering Tacit Knowledge – Evidence based Clinical Decision Support

Can the complexity be removed, and tacit knowledge delivered, from the plethora of medical information available in the world?

“Let Doctors be Doctors”

The Semantic Data Lake becomes the Book of Knowledge, ascertained by correlation and causation, resulting in weighted evidence.

Characteristics of Bioingine.com Cognitive Computing Platform

  • Architecture style moves from event-driven to semantics-driven
  • Paradigm shift in defining system behavior – it is no longer predicated and deterministic (Non-Predicated Design)
  • Design is “systemic,” in contrast to techniques such as object-oriented design, development, and component assembly
  • As such, the system is better studied probabilistically.
  • Design is context-driven, where the boundary between context and concept diminishes
  • System capability is probabilistically programmed by machine learning based on A.I., NLP, and algorithms driven by an ensemble of math
  • Design based on semantic mining and engineering takes precedence over complex event processing (CEP). CEP and Event-Driven Architecture (EDA) are part of predicated system design; a business rules engine may be overkill.
  • Ontology is created driven by both information theory and number theory

–Algebra – relationships amongst variables

–Calculus – rate of change in one variable and its impact on another

–Vector Space – study of the states of the variables

Bioingine.com algorithm design driven by Probabilistic Ontology

  • Probabilistic ontology characterizes the ecosystem’s behavior
  • A complex system’s semantic representation evolves generatively
  • The system is better represented by semantic multiples, overcoming the barrier of the RDF triple store
  • Humans interact with the system employing knowledge-inference techniques
  • Inductive knowledge precedes knowledge by deduction
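As a hypothetical illustration of the triple-store barrier mentioned above (the entities and weight below are invented for this sketch), an RDF-style triple binds exactly two entities, so added context must be split into several reified triples, whereas a semantic multiple keeps the whole relationship in one n-tuple:

```python
# Hypothetical sketch: RDF reification spreads one weighted relationship
# across several auxiliary triples about a statement node.
triples = [
    ("stmt1", "subject", "Warfarin"),
    ("stmt1", "predicate", "interacts_with"),
    ("stmt1", "object", "CYP2C9"),
    ("stmt1", "evidence_weight", "0.87"),
]

# A "semantic multiple" carries the same relationship as a single n-tuple,
# including its evidence weight as a first-class element.
semantic_multiple = ("Warfarin", "interacts_with", "CYP2C9", 0.87)

print(len(triples), len(semantic_multiple))  # 4 4
```

The point of the sketch is structural only: one n-tuple versus four triples for the same weighted statement.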

Bioingine.com is a Probabilistic Computing Machine

  • The system’s behavior is better modeled by employing probability, statistics, and vector calculus (statistics based on the HDN, an advancement over the Bayes Net in which the acyclic constraint of the DAG is overcome)
  • Generally the system is characterized by high dimensionality (variability) in its data set, in addition to volume and velocity.
  • Most computing is in-memory
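As a minimal sketch of the bi-directional idea (the counts below are invented toy data, not from any real study), both conditionals P(marker | condition) and P(condition | marker) can be read directly off the same joint counts, with no fixed arrow direction imposed as a DAG edge would require:

```python
from collections import Counter

# Invented toy records: (condition, marker) pairs standing in for clinical rows.
records = (
    [("diabetes", "HbA1c_high")] * 40
    + [("diabetes", "HbA1c_normal")] * 10
    + [("no_diabetes", "HbA1c_high")] * 5
    + [("no_diabetes", "HbA1c_normal")] * 45
)

joint = Counter(records)
by_condition = Counter(c for c, _ in records)
by_marker = Counter(m for _, m in records)

# The same joint counts yield conditional probabilities in both directions.
p_marker_given_condition = joint[("diabetes", "HbA1c_high")] / by_condition["diabetes"]
p_condition_given_marker = joint[("diabetes", "HbA1c_high")] / by_marker["HbA1c_high"]

print(p_marker_given_condition)            # 0.8
print(round(p_condition_given_marker, 3))  # 0.889
```

A Bayes Net would pick one of these directions as an edge; keeping both available from the joint distribution is the essence of the bi-directional approach described here.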

BioIngine.com is designed based on mathematics borrowed from several disciplines, notably Paul A. M. Dirac’s quantum mechanics. The approach overcomes many of the inadequacies of the Bayes Net, which is based on the directed acyclic graph (DAG). Like knowledge relationships in the real world, and as was required for quantum mechanics, our approaches are neither unidirectional nor do they avoid cycles.

Bioingine.com Features –

  • Bi-directional Bayesian probability for knowledge inference and biostatistics (hyperbolic complex).
  • Built upon a medical ontology (in fact, this is discovered by machine learning and A.I. techniques).
  • Can be both hypothesis-driven and non-hypothesis-driven.
  • Quantum probabilities transformed to classical, integrating vector space, Bayesian knowledge inference, and the Riemann zeta function to deal with sparse data, driven by the overarching Hyperbolic Dirac Net.
  • Builds into web semantics employing NLP (integrates both System Dynamics and Systems Thinking).

Framework of Bioingine –Dirac-Ingine Algorithm Ensemble of Math 

Q-UEL & HDN (More Info click the link)

Clinical Data Analytics – Loss of Innocence (Predictive Analytics) in a Large High Dimensional Semantic Data Lake


From Dr. Barry Robson’s notes:-

Is Data Analysis Particularly Difficult in Biomedicine?

Looking for a single strand of evidence in billions of possible semantic multiple combinations by Machine Learning

Of all disciplines, it almost seems that it is clinical genomics, proteomics, and their kin, which are particularly hard on the data-analytic part of science. Is modern molecular medicine really so unlucky? Certainly, the recent explosion of biological and medical data of high dimensionality (many parameters) has challenged available data analytic methods.

In principle, one might point out that a recurring theme in the investigation of bottlenecks to the development of 21st-century information technology relates to the same issues of complexity and very high dimensionality of the data to be transformed into knowledge, whether for scientific, business, governmental, or military decision support. After all, the mathematical difficulties are general, and absolutely any kind of record or statistical spreadsheet of many parameters (e.g., in medicine: age, height, weight, blood pressure, polymorphism at locus Y649B, etc.) could, a priori, imply many patterns, associations, correlations, or eigensolutions to multivariate analysis, expert system statements, or rules, such as |Height:=6 ft, Weight:=210 lbs> or, more obviously, |Gender:=male, Pregnant:=no>. The notation |observation> is the physicists’ ket notation that forms part of a more elaborate “calculus” of observation. It is mainly used here for all such rule-like entities, and they will generally be referred to as “rules”.
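A minimal sketch of treating such ket-like rules computationally (the record values are invented for illustration): a rule like |Gender:=male, Pregnant:=no> can be held as a set of attribute:value pairs, and its probability estimated as the fraction of records that match every pair:

```python
def ket(**attrs):
    # A rule such as |Gender:=male, Pregnant:=no> as a hashable set of
    # attribute:value pairs.
    return frozenset(attrs.items())

# Invented toy records standing in for rows of a clinical spreadsheet.
records = [
    {"Gender": "male", "Pregnant": "no"},
    {"Gender": "male", "Pregnant": "no"},
    {"Gender": "female", "Pregnant": "no"},
    {"Gender": "female", "Pregnant": "yes"},
]

def p(rule, records):
    # Fraction of records containing every attribute:value pair in the rule.
    hits = sum(1 for r in records if rule <= set(r.items()))
    return hits / len(records)

print(p(ket(Gender="male", Pregnant="no"), records))  # 0.5
```

This only shows the bookkeeping; the actual Q-UEL calculus attaches far richer (including complex-valued) weights to such kets than a simple relative frequency.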

As discussed, there are systems that are particularly complex, such that many complicated rules are not reducible to, and not deducible from, simpler rules (at least, not until the future time when we can run a lavish simulation based on physical first principles).

Medicine seems, on the whole, to be such a system. It is an applied area of biology, which is itself classically notorious as a nonreducible discipline.

In other words, nonreducibility may be intrinsically a more common problem for complex interacting systems of which human life is one of our more extreme examples. Certainly there is no guarantee that all aspects of complex diseases such as cardiovascular disease are reducible into independently acting components that we can simply “add up” or deduce from pairwise metrics of distance or similarity.

At the end of the day, however, it may be that such arguments are an illusion and that there is no special scientific case for a mathematical difficulty in biomedicine. Data from many other fields may be similarly intrinsically difficult to data mine. It may simply be that healthcare is peppered with everyday personal impact, life and death situations, public outcries, fevered electoral debates, trillion dollar expenditures, and epidemiological concerns that push society to ask deeper and more challenging questions within the biomedical domain than routinely happen in other domains.

 Large Number of Possible Rules Extractable a Priori from All Types of High-Dimensional Data

For discovery of relationships between N parameters, there are almost always x^N potential basic rules, where x is some positive constant greater than unity, characteristic of the method of data representation and study. For a typical rectangular data input like a spreadsheet of N columns,

2^N − N − 1 = X, the number of tag rules from which evidence requires being established (all joint combinations of two or more of the N variables). For a record with N = 100 variables:

2^100 − 100 − 1 = 1.267650600228229401496703205275 × 10^30
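The count can be checked directly (a sketch of the formula only, which enumerates all joint combinations of two or more of the N variables):

```python
def n_tag_rules(n):
    # All non-empty subsets of n variables (2**n - 1), minus the n
    # single-variable cases, leaves the joint combinations of size >= 2.
    return 2**n - n - 1

# Sanity check on a small case: 3 variables A, B, C give {AB, AC, BC, ABC}.
assert n_tag_rules(3) == 4

print(n_tag_rules(100))  # 1267650600228229401496703205275
```

Python's arbitrary-precision integers make the 31-digit result exact, matching the figure quoted above.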