
The BioIngine.com Platform Beta Release 1.0 on the Anvil

The BioIngine.com™ 

Ingine, Inc.™, The BioIngine.com™, DiracIngine™ and MARPLE™ are all Ingine, Inc. copyright and trademark protected; The BioIngine.com is also Patent Pending IP belonging to Ingine, Inc.™.


High Performance Cloud based Cognitive Computing Platform

The figure below depicts the healthcare analytics challenge as the order of complexity is scaled.

1. Introduction Beta Release 1.0

It is our pleasure to introduce the startup venture Ingine, Inc., which brings to market The BioIngine.com™ Cognitive Computing Platform for the Healthcare market, delivering a Medical Automated Reasoning Programming Language Environment (MARPLE) capability based on mathematics borrowed from several disciplines, notably from the late Prof. Paul A. M. Dirac’s Quantum Mechanics.

The BioIngine.com™ is a High Performance Cloud Computing Platform delivering HealthCare Large-Data Analytics capability derived from an ensemble of bio-statistical computations. The automated bio-statistical reasoning is a combination of “deterministic” and “probabilistic” methods employed against both structured and unstructured large data sets, leading into Cognitive Reasoning.

The BioIngine.com™ delivers Medical Automated Reasoning based on the Medical Automated Reasoning Programming Language Environment (MARPLE) capability, thereby better achieving 2nd-order semantic interoperability in the Healthcare ecosystem. (See Appendix Notes.)

The BioIngine.com™ is the result of several years of effort with Dr. Barry Robson, former Chief Scientific Officer, IBM Global Healthcare, Pharmaceutical and Life Science. His research has been in developing a quantum-math-driven exchange and inference language that achieves semantic interoperability, while also enabling a Clinical Decision Support System that is inherently Evidence Based Medicine (EBM). The solution, besides enabling EBM, also delivers knowledge graphs for Public Health surveys, including those sought by epidemiologists. Based on Dr. Robson’s experience in the biopharmaceutical industry and pioneering efforts in bioinformatics, this has the data-mining-driven potential to advance pathways planning from the clinical level to pharmacogenomics.

The BioIngine.com™ brings the machinery of Quantum Mechanics to Healthcare analytics, delivering a comprehensive data science experience that covers both Patient Health and Population Health (Epidemiology) analytics, driven by a range of bio-statistical methods from descriptive to inferential statistics, leading into evidence-driven medical reasoning.

The BioIngine.com™ transforms the large clinical data sets generated by interoperability architectures, such as a Health Information Exchange (HIE), into a “semantic lake” representing the Health ecosystem that is more amenable to bio-statistical reasoning and knowledge representation. This capability delivers the evidence-based knowledge needed for a Clinical Decision Support System, better achieving Clinical Efficacy by helping to reduce medical errors.

The BioIngine.com™ platform, whether working against large clinical data sets or residing within a large Patient Health Information Exchange (HIE), creates opportunities for Clinical Efficacy, while also facilitating the “Efficiencies in Healthcare Management” that an Accountable Care Organization (ACO) seeks.

Our endeavors have resulted in the development of revolutionary Data Science to deliver Health Knowledge by Probabilistic Inference. The solution developed addresses critical scientific and technical areas, notably the healthcare interoperability challenge of delivering semantically relevant knowledge at both the patient health (clinical) and public health (Accountable Care Organization) levels.

2. Why The BioIngine.com™?

The basic premise in engineering The BioIngine.com™ is acknowledging that, in extracting knowledge from large data sets (both structured and unstructured), one is confronted by very high dimensionality and uncertainty.

Generally, in extracting insights from large data sets, the order of complexity scales as follows:-

A. Insights around :- “what” 

For large data sets, descriptive statistics are adequate to extract a “what” perspective. Descriptive statistics generally deliver a statistical summary of the ecosystem and its probability distribution.
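As an illustrative sketch of this level of analysis (not part of the platform; the readings and bin boundaries are made up), a descriptive summary over a small cohort can be computed with the Python standard library alone:

```python
from statistics import mean, median, pstdev

# Hypothetical systolic blood pressure readings (mmHg) from a small cohort
systolic_bp = [118, 126, 131, 140, 122, 135, 128, 145, 119, 133]

# The "what": a statistical summary of the sample
summary = {
    "n": len(systolic_bp),
    "mean": mean(systolic_bp),
    "median": median(systolic_bp),
    "stdev": round(pstdev(systolic_bp), 2),
}

# A simple frequency distribution over clinically familiar bins
bins = {"<120": 0, "120-139": 0, ">=140": 0}
for bp in systolic_bp:
    if bp < 120:
        bins["<120"] += 1
    elif bp < 140:
        bins["120-139"] += 1
    else:
        bins[">=140"] += 1
```

This already answers “what does the cohort look like”, but nothing about how the variables influence one another.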

B. Univariate Problem :- “what” 

Assuming some simplicity in the variable relationships, i.e. cumulative effects between the independent variables (causes) and the dependent variables (outcomes):-

a) Univariate regression (simple independent variables to dependent variables analysis)

b) Correlation Cluster – shows the impact of a set of variables, or segment analysis.

           https://en.wikipedia.org/wiki/Correlation_clustering

[From above link:- In machine learning, correlation clustering or cluster editing operates in a scenario where the relationships between the objects are known instead of the actual representations of the objects. For example, given a weighted graph G = (V,E), where the edge weight indicates whether two nodes are similar (positive edge weight) or different (negative edge weight), the task is to find a clustering that either maximizes agreements (sum of positive edge weights within a cluster plus the absolute value of the sum of negative edge weights between clusters) or minimizes disagreements (absolute value of the sum of negative edge weights within a cluster plus the sum of positive edge weights across clusters). Unlike other clustering algorithms, this does not require choosing the number of clusters k in advance, because the objective, to minimize the sum of weights of the cut edges, is independent of the number of clusters.]
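The “disagreements” objective quoted above can be sketched in a few lines of plain Python; the signed graph and the two candidate clusterings below are invented purely for illustration:

```python
# Signed graph: edge weight > 0 means "similar", < 0 means "different".
edges = {
    ("a", "b"): +1, ("a", "c"): +1, ("b", "c"): +1,
    ("c", "d"): -1, ("d", "e"): +1, ("b", "e"): -1,
}

def disagreements(edges, clusters):
    """Correlation-clustering cost: |negative weights| inside a cluster
    plus positive weights cut between clusters."""
    label = {v: i for i, cluster in enumerate(clusters) for v in cluster}
    cost = 0
    for (u, v), w in edges.items():
        same = label[u] == label[v]
        if same and w < 0:
            cost += -w          # a "different" pair placed together
        elif not same and w > 0:
            cost += w           # a "similar" pair split apart
    return cost

good = disagreements(edges, [{"a", "b", "c"}, {"d", "e"}])  # respects all signs
bad = disagreements(edges, [{"a", "d"}, {"b", "c", "e"}])   # fights the signs
```

Note that the number of clusters was never specified; it falls out of the objective, exactly as the quoted definition says.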

C. Multivariate Analysis (Complexity increases) :- “what”

a) Multiple regression (considering multiple independent variables to analyze their effect on the outcome)

b) Multivariate regression – where multiple causes and multiple outcomes exist
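A minimal sketch of multiple regression via the normal equations, in plain Python; the cohort data are fabricated and noise-free so the fitted coefficients are easy to check against the generating formula:

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_multiple_regression(X, y):
    """Least squares via the normal equations (X^T X) beta = X^T y."""
    rows = [[1.0] + list(x) for x in X]          # prepend an intercept column
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(XtX, Xty)

# Hypothetical cohort generated by outcome = 1 + 2*x1 + 3*x2 (no noise)
X = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]
y = [1, 3, 4, 6, 8, 9]
beta = fit_multiple_regression(X, y)   # recovers [1.0, 2.0, 3.0]
```

The sketch assumes the causes really are independent variables acting cumulatively; as the next paragraph argues, that assumption is exactly what breaks down as complexity grows.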

All the above still address the “what” aspect. When the complexity increases, the notion of independent and dependent variables becomes non-deterministic, since it is difficult to establish given the interactions, potentially including cyclic paths of influence, in a network of interactions amongst the variables. A very simple example: obesity causes diabetes, but the converse is also true, and we may suspect that obesity causes type 2 diabetes and type 2 diabetes causes obesity… In such a situation, what is best as “subject” and what is best as “object” becomes difficult to establish. Existing inference network methods typically assume that the world can be represented by a Directed Acyclic Graph, more like a tree, but the real world is more complex than that: metabolism, neural pathways, road maps, subway maps and concept maps are not unidirectional; they are more interactive, with cyclic routes. Furthermore, discovering the “how” aspect becomes important in the diagnosis of episodes and in establishing correct pathways, while also extracting the severe cases (chronic cases, which are a multivariate problem). Indeterminism also creates an ontology that can be probabilistic, not crisp.

Most ACO analytics address the above based on the PQRS clinical factors, which are all quantitative. This is barely useful for advancing the ACO toward performance-driven or value-driven outcomes, most of which are qualitative.

D. Neural Net :- “what”

https://www.wolfram.com/language/11/neural-networks/?product=mathematica

The above-discussed challenges of multivariate analysis push us into techniques such as the Neural Net, which is the next level beyond the Multivariate Regression Statistical Approach: multiple regression models feed into the next level of clusters, again an array of multiple regression models.

The Neural Net method still remains inadequate in exposing “how” the human mind is probably organized in discerning the health ecosystem for diagnostic purposes, for which “how”, “why”, “when” etc. become imperative to arrive at an accurate diagnosis and target outcomes efficiently. Its learning is “smudged out”. A little more precisely put: it is hard to interrogate a Neural Net because it is far from easy to see which weights are mixed up in different pooled contributions, or where they come from.

“So we enter Probabilistic Computations, which is as such a Combinatorial Explosion Problem.”

E. Hyperbolic Dirac Net (Inverse or Dual Bayesian technique): – “how”, “why”, “when” in addition to “what”.

Note:- Beta Release 1.0 only addresses HDN transformation and inference queries against structured data sets, covering Features A, B and E. However, features C and D can still be explored as a non-packaged solution.

Release 2.0 will deliver the full A.I.-driven reasoning capability, MARPLE, working against both structured and unstructured data sets. Furthermore, it will be designed to be customized for an EBM-driven “Point of Care” and “Care Planning” productized user experience.

The BioIngine.com™ offers a comprehensive bio-statistical reasoning experience in the application of the data science discussed above, blending descriptive and inferential statistical studies.


Given the challenge of analyzing large data sets, both structured (EHR data) and unstructured, the emerging Healthcare analytics are built around methods D and E discussed above; Ingine Inc is unique in its Hyperbolic Dirac Net proposition.

Q-UEL Toolkit for Medical Decision Making :- Science of Uncertainty and Probabilities


Quantum Universal Exchange Language

Emergent | Interoperability | Knowledge Mining | Blockchain

Q-UEL

  1. It is a toolkit / framework.
  2. It is an Algorithmic Language for constructing Complex Systems.
  3. It results in an Inferential Statistical mechanism suitable for a highly complex system – the “Hyperbolic Dirac Net”.
  4. It involves an approach based on the premise that a Highly Complex System driven by human social structures continuously strives to achieve a higher order in its entropic journey, by continuously discerning the knowledge hidden in the system that is in continuum.
  5. A System in Continuum seeking Higher and Higher Order is a Generative System.
  6. A Generative System brings the System itself as a Method to achieve Transformation. The same is the case for the National Learning Health System.
  7. A Generative System, as such, is based on Distributed Autonomous Agents / Organizations, achieving Syndication driven by Self Regulation or Swarming behavior.
  8. Essentially, Q-UEL as a toolkit / framework algorithmically addresses interoperability, knowledge mining and blockchain, while driving the Healthcare Eco-system into Generative Transformation, achieving higher and higher orders in the National Learning Health System.
  9. It has capabilities to facilitate medical workflow, continuity of care, and medical knowledge extraction and representation from vast sets of structured and unstructured data, automating bio-statistical reasoning that leads into large-data-driven evidence based medicine, which further leads into a clinical decision support system including knowledge management and Artificial Intelligence, as well as public health and epidemiological analysis.

http://www.himss.org/achieving-national-learning-health-system

GENERATIVE SYSTEM :-

https://ingine.wordpress.com/2013/01/09/generative-transformation-system-is-the-method/

A Large Chaotic System driven by Human Social Structures has two contending ways.

a. Natural Selection – Adaptive – Darwinian – Natural Selection – Survival Of Fittest – Dominance

b. Self Regulation – Generative – Innovation – Diversity – Cambrian Explosion – Unique Peculiarities – Co Existence – Emergent

The Accountable Care Organization (ACO), driven by the Affordable Care Act, transforms the present Healthcare System from adaptive (competitive) into generative (collaborative / coordinated), to achieve inclusive success and partake in the savings achieved. This is a generative systemic response, contrasting with the functional and competitive response of an adaptive system.

Natural selection seems to have resulted in functional transformation, where adaptive is the mode; it does not account for diversity.

Self Regulation seems to be a systemic outcome due to an integrative influence (the ecosystem) responding to the system constraints. It accounts for rich diversity.

The observer learns generatively from the system constraints for the type of reflexive response required (Refer – Generative Grammar – Immune System – http://www.ncbi.nlm.nih.gov/pmc/articles/PMC554270/pdf/emboj00269-0006.pdf)

From the above observation, if the theory of self regulation is the more correct one and adheres to the laws of nature, in which generative learning occurs, then the assertion is that the “method” is offered by the system itself. The system’s ontology has an implicate knowledge of the processes required for transformation (David Bohm – Implicate Order).

For a very large complex system,

the System itself is the method – the impetus is the “constraint”.

In the video below, the ability of cells to creatively create the script is discussed, which makes the case for a self-regulated and generative complex system in addition to a complex adaptive system.

 

Further Notes on Q-UEL / HDN :-

  1. It brings Quantum Mechanics (QM) machinery to Medical Science.
  2. It is derived from the Dirac Notation that helped define the framework for describing QM. The resulting framework or language is Q-UEL, and it delivers a mechanism for inferential statistics – the “Hyperbolic Dirac Net”.
  3. It is created from a System Dynamics and Systems Thinking perspective.
  4. It is Systemic in approach, where the System is itself the Method.
  5. It engages probabilistic ontology and semantics.
  6. It creates a mathematical framework to advance Inferential Statistics for studying a highly chaotic complex system.
  7. It is an algorithmic approach that creates the Semantic Architecture of the problem or phenomenon under study.
  8. The algorithmic approach is a blend of linguistic semantics, artificial intelligence and systems theory.
  9. The algorithm creates the Semantic Architecture defined by a Probabilistic Ontology :- representing the Ecosystem Knowledge distribution based on Graph Theory.

To make a decision in any domain, the knowledge compendium of the domain, or the system knowledge, is first of all imperative.

A System Riddled with Complexity is generally a Multivariate System, and as such creates much uncertainty.

A highly complex system being non-deterministic, requires probabilistic approaches to discern, study and model the system.

General Characteristics of Complex System Methods

  • Descriptive statistics are employed to study the “WHAT” aspects of the System.
  • Inferential Statistics are applied to study the “HOW”, “WHEN”, “WHY” and “WHERE”, probing both spatial and temporal aspects.
  • In a highly complex system, causality becomes indeterminable; that is, the correlations or relationships between the independent and dependent variables are not obviously established. They also seem to interchange positions. This creates a dilemma between :- subject vs object, causes vs outcomes.
  • When approaching a highly complex system, since the prior and posterior are not definable, inferential techniques in which hypotheses are fixed before beginning the study of the system become unviable.

Review of Inferential Techniques as the Complexity is Scaled

Step 1:- Simple System (turbulence level:-1)

Frequentist :- the simplest classical or traditional statistics; employed by treating the data as random, with a steady-state hypothesis – the system is considered not uncertain (a simple system). In Frequentist notions of statistics, probability is dealt with as a classical measure based only on the idea of counting and proportion. This technique applies probability to data, where the data sets are rather small.

Increase complexity: larger data sets, multivariate, the hypothesis model is not established, and a large variety of variables, each of which can combine (conditionally and jointly) in many different ways to produce an effect.

Step 2:- Complex System (turbulence level:-2)

Bayesian :- the hypothesis is considered probabilistic, while the data is held at steady state. In Bayesian notions of statistics, the probability is of the hypothesis for a given set of data that is fixed. That is, the hypothesis is random and the data is fixed. The knowledge extracted contains the more subjectivist notions of uncertainty, belief, reliability, or confidence often used in automated inference and decision support systems.

Additionally, the hypothesis can be explored only in an acyclic fashion, creating Directed Acyclic Graphs (DAGs).
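As a sketch of the Bayesian step, here is Bayes’ rule applied to a hypothetical diagnostic test; the prevalence and test characteristics are made-up numbers, not platform data:

```python
def bayes_posterior(prior, sensitivity, false_positive_rate):
    """P(disease | positive test) via Bayes' rule:
    P(H|D) = P(D|H) P(H) / [P(D|H) P(H) + P(D|~H) P(~H)]."""
    num = sensitivity * prior
    den = num + false_positive_rate * (1 - prior)
    return num / den

# Hypothetical numbers: 1% prevalence, 90% sensitivity, 5% false positives
posterior = bayes_posterior(prior=0.01, sensitivity=0.9, false_positive_rate=0.05)
# posterior is about 0.154: even a positive test leaves the hypothesis uncertain
```

Note the asymmetry: the rule runs only one way, from a fixed prior toward a posterior, which is precisely the acyclic, directional character being described.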

Increase the throttle on the complexity: very large data sets, both structured and unstructured; the hypothesis is random; multiple hypotheses are possible; anomalies can exist; there are hidden conditions; and the need arises to discover the “probabilistic ontology”, as it represents the system and the behavior within.

Step 3: Highly Chaotic Complex System (turbulence level:-3)

Certainly the DAG is now inadequate, since we need to check probabilities as correlations and also causations of the variables, and whether they conform to a hypothesis-producing pattern, meaning some ontology is discovered which describes the peculiar intrinsic behavior among a specific combination of the variables representing a hypothesis condition. And there are many such possibilities within the system, hence a very chaotic and complex system.

Now the System itself seems probabilistic, regardless of the hypothesis and the data. This demands a Multi-Lateral Cognitive approach.

Telandic …. “Point – equilibrium – steady state – periodic (oscillatory) – quasiperiodic – Chaotic – and telandic (goal seeking behavior) are examples of behavior here placed in order of increasing complexity”

A Highly Complex System demands a Dragon Slayer – Hyperbolic Dirac Net (HDN) driven Statistics (Bi-directional Bayesian) – for extracting Knowledge from a Chaotic, Uncertain System.

Platform for BigData Driven Medicine and Public Health Studies [ Deep Learning & Biostatistics ]


Bioingine.com: Platform for comprehensive statistical and probability studies for BigData Driven Medicine and Public Health.

Importantly, it helps redefine Data Driven Medicine as:-

Ontology (Semantics) Driven Medicine

Comprehensive Platform that covers Descriptive Statistics and Inferential Probabilities.

The Beta Platform is on the anvil. Sign up for a Demo by sending mail to

“demo@bioingine.com”

Bioingine.com employs an algorithmic approach based on the Hyperbolic Dirac Net that allows inference nets that are a general graph (GC), including cyclic paths, thus surpassing the limitation of the Bayes Net, which is traditionally a Directed Acyclic Graph (DAG) by definition. The Bioingine.com approach thus more fundamentally reflects the nature of probabilistic knowledge in the real world, which has the potential for taking account of the interaction between all things without limitation, and ironically this makes use of Bayes’ rule far more explicitly than does a Bayes Net.

It also allows more elaborate relationships than mere conditional dependencies, as a probabilistic semantics analogous to natural human language but with a more detailed sense of probability. To identify the things and relationships that are important, and to provide the required probabilities, Bioingine.com scouts the large complex data, both structured and of unstructured textual character.

It treats initial raw extracted knowledge rather in the manner of potentially erroneous or ambiguous prior knowledge, and validated and curated knowledge as posterior knowledge, and enables the refinement of knowledge extracted from authoritative scientific texts into an intuitive canonical “deep structure” mental-algebraic form that the Bioingine.com can more readily manipulate.

BigData Driven Medicine Program :-

http://med.stanford.edu/iddm.html

Objectives and Goals

Informatics & Data-Driven Medicine (IDDM) is a foundation area within the Scholarly Concentration program that explores the new transformative paradigm called BIG DATA that is revolutionizing medicine. The proliferation of huge databases of clinical, imaging, and molecular data is driving new biomedical discoveries and informing and enabling precision medical care. The IDDM Scholarly Concentration will provide students insights into this important emerging area of medicine, introducing fundamental topics such as information management, computational methods of structuring and analyzing biomedical data, and large-scale data analysis along the biomedical research pipeline, from the analysis and interpretation of new biological datasets to the integration and management of this information in the context of clinical care.

Requirements

Students who pursue Informatics & Data-Driven Medicine in conjunction with an application area, such as Immunology, are required to complete 6 units including:

Biomedin 205: Precision Practice with Big Data

Hyperbolic Dirac Net (HDN) + Data Mining to Map Clinical Pathways (The Tacit Knowledge)

 


Discussion on employing HDN to map Clinical Pathways (The Tacit Knowledge)


In the articles referenced below on employing a Bayesian Net to model Clinical Pathways as a probabilistic inference net, replace the Bayesian Net with the stress-tested Hyperbolic Dirac Net (HDN), which is a non-acyclic Bayesian method resolving both correlation and causation in both directions: etiology –> outcomes and outcomes –> etiology.

1. Elements of Q-UEL 

Q-UEL is based on the Dirac Notation and its associated algebra. The notation was introduced into later editions of Dirac’s book to facilitate understanding and use of quantum mechanics (QM), and it has been a standard notation in physics and theoretical chemistry since the 1940s.

a) Dirac Notation

In the early days of quantum theory, P. A. M. (Paul Adrien Maurice) Dirac created a powerful and concise formalism for it, which is now referred to as Dirac notation or bra-ket (bracket) notation.

<bra vector exprn* | operator exprn* | ket vector exprn*> 

[exprn* denotes an expression]

It is an algebra for representing observations and measurements, and for drawing probabilistic inference from them.

In Dirac’s notation, what is known is put in a ket, “|>”. So, for example, “|p>” expresses the fact that a particle has momentum p. It could also be more explicit: |p = 2>, the particle has momentum equal to 2; |x = 1.23>, the particle has position 1.23. |Ψ> represents a system in the state Ψ and is therefore called the state vector.

The ket |> can also be interpreted as the initial state in some transition or event.

The bra <| represents the final state, or the language in which you wish to express the content of the ket.

The Hyperbolic Dirac Net has the ket |> as a row vector, and the bra <| as a column vector.
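A toy sketch of the bra-ket bookkeeping, with plain Python lists standing in for state vectors and a matrix for an operator; this is illustrative linear algebra only, not Q-UEL itself:

```python
def braket(bra, ket):
    """<bra|ket>: the inner product of two state vectors."""
    return sum(b * k for b, k in zip(bra, ket))

def bra_op_ket(bra, op, ket):
    """<bra| Op |ket>: apply the operator (a matrix) to the ket, then project
    the result onto the final state described by the bra."""
    op_ket = [sum(row[j] * ket[j] for j in range(len(ket))) for row in op]
    return braket(bra, op_ket)

ket_p = [1, 0]            # an initial state |p>
bra_q = [0, 1]            # a final state <q|
identity = [[1, 0], [0, 1]]
flip = [[0, 1], [1, 0]]   # an operator exchanging the two basis states
```

Here `braket(bra_q, ket_p)` is 0 (the states do not overlap), while `bra_op_ket(bra_q, flip, ket_p)` is 1: the operator carries the initial state into the final one, matching the transition reading of <|…|> above.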

b) hh = +1, the Hyperbolic Imaginary Number

QM is a system for representing observations and measurements, and drawing probabilistic inference from them. The Q in Q-UEL refers to QM, but a simple mathematical transformation of QM gives classical everyday behavior. Q-UEL inherits the machinery of QM by replacing the more familiar imaginary number i (such that ii = -1), responsible for QM as wave mechanics, with the hyperbolic imaginary number h (such that hh = +1). Hence our inference net in general is called the Hyperbolic Dirac Net (HDN).
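The hh = +1 algebra can be sketched as a small split-complex number class; this is a generic illustration of hyperbolic imaginary arithmetic, not the Q-UEL implementation:

```python
class Hyperbolic:
    """Split-complex number x + h*y with h*h = +1 (contrast i*i = -1)."""
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __mul__(self, other):
        # (a + h b)(c + h d) = (ac + bd) + h(ad + bc), using hh = +1
        return Hyperbolic(self.x * other.x + self.y * other.y,
                          self.x * other.y + self.y * other.x)
    def __add__(self, other):
        return Hyperbolic(self.x + other.x, self.y + other.y)
    def conj(self):
        return Hyperbolic(self.x, -self.y)
    def __repr__(self):
        return f"{self.x} + h*{self.y}"

h = Hyperbolic(0, 1)
hh = h * h                      # squares to +1, not -1
z = Hyperbolic(0.7, 0.2)
modulus_sq = z * z.conj()       # (x + hy)(x - hy) = x^2 - y^2, purely real
```

The sign flip relative to ii = -1 is the whole story: multiplication no longer produces the oscillatory (wave) behavior of ordinary complex numbers, which is the sense in which the h-complex net behaves classically.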

In probability theory, A, B, C, etc. represent things, states, events, observations, measurements, qualities, etc. Here we mean medical factors, including demographic factors such as age and clinical factors such as systolic blood pressure value or history of diabetes.

They can also stand for expressions containing many factors; so note that by, e.g., P(A|B) we would usually mean that it also applies to, say, P(A, B | C, D, E). In the text, P(A, B, C, …) with the ellipsis ‘…’ means all combinatorial possibilities: P(A), P(B), P(A, C), P(B, D, H), etc.

2) Employing Q-UEL, a preliminary inference net can be created as the query.

“Will my female patient age 50-59 taking diabetes medication and having a body mass index of 30-39 have very high cholesterol if the systolic BP is 130-139 mmHg and HDL is 50-59 mg/dL and non-HDL is 120-129 mg/dL?”.

This forms a preliminary inference net as the query, which may be refined and to which probabilities must be assigned.

The real answers of interest here are not qualitative statements, but the final probabilities. The protocols involved map to what data miners often seem to see as two main options in mining, although we see them as the two ends of a continuum.

Method (A) may be recognized as unsupervised (or unrestricted) data mining and post-filtering, and is the method mainly used here. In this approach we (1) mine data (“observe”), (2) compute a very large number of the more significant probabilities and render them as tags maintained as a Knowledge Representative Store (KRS) or Semantic Lake (“evaluate”), (3) use a proposed inference net as a query to search amongst the probabilities represented by those tags, looking only for those relevant to completing the net and assigning probabilities to it, assessing what is available and seeing what can be substituted (“interpret”), and (4) compute the overall probability of the final inference net in order to make a decision (“decide”). Unsupervised data mining is preferred because it generates many tags for an SW-like approach, and may uncover new, unexpected relationships that could be included in the net.
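The observe/evaluate/interpret steps might be caricatured as follows; the records, factor names and tag store are all invented for illustration, and real Q-UEL tags are far richer than bare conditional probabilities:

```python
from collections import Counter
from itertools import combinations

# Hypothetical mined records: each is a set of observed medical factors
records = [
    {"obesity", "diabetes"},
    {"obesity", "diabetes", "hypertension"},
    {"obesity"},
    {"diabetes"},
    {"hypertension"},
    {"obesity", "hypertension"},
]

# Steps (1)-(2): "observe" and "evaluate" -- count factors and pairs, then
# store conditional probabilities in BOTH directions as tags
singles, pairs = Counter(), Counter()
for rec in records:
    singles.update(rec)
    pairs.update(frozenset(p) for p in combinations(sorted(rec), 2))

tags = {}
for pair, n_ab in pairs.items():
    a, b = sorted(pair)
    tags[(a, b)] = n_ab / singles[b]   # estimate of P(a|b)
    tags[(b, a)] = n_ab / singles[a]   # estimate of P(b|a)

# Step (3): "interpret" -- a query net just looks up the tags it needs
p_diab_given_obese = tags[("diabetes", "obesity")]
p_obese_given_diab = tags[("obesity", "diabetes")]
```

Because both directions are mined up front, a later query net is free to traverse a relationship either way, which is what distinguishes this store from a purely DAG-oriented one.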

Method (B) uses supervised (or restricted) data mining and pre-filtering. Data mining considers only what appears in the net. The downstream user interested in inference always accesses the raw database, while in (A) he or she may never see it.

The advantage of (B) is that mining is far less computationally demanding, in terms of both processing and memory. It is useful for computing an HDN for a specified Hypothesis.

The Popular Bayes Net (BN) Compared with our Hyperbolic Dirac Net (HDN)

Probabilities of any kind can be manipulated for inference in a variety of ways, according to philosophy (which is a matter of concern). The BN is probably the most popular method, perhaps because it does seem to be based on traditional, conservative principles of probability. However, the BN is traditionally (and, strictly speaking, by definition) confined to a probability network that is a directed acyclic graph (DAG).

In general, reversibility, cyclic paths and feedback abound in the real world, and we need probabilistic knowledge networks that are general graphs, or even more diffuse fields of influence, not DAGs. In our response as the Hyperbolic Dirac Net (HDN), “Dirac” relates to use of Paul A. M. Dirac’s view of quantum mechanics (QM).

QM is not only a standard system for representing probabilistic observation and inference from it in physics, but also it manages and even promotes concepts like reversibility and cycles. The significance of “hyperbolic” is that it relates to a particular type of imaginary number rediscovered by Dirac. Dirac notation entities, Q-UEL tags, and the analogous building blocks of an HDN all have complex probabilities better described as probability amplitudes. This means that they have the form x + jy where x and y are real numbers and j is an imaginary number, though they can also be vectors or matrices with such forms as elements.

Q-UEL is seen as a Lorentz rotation i → h of QM as wave mechanics. The imaginary number involved is now no longer the familiar i such that ii = -1, but the hyperbolic imaginary number, called h in Q-UEL, such that hh = +1.

This renders the HDN classical in behavior. A basic HDN is an h-complex BN.

Both the BN and the basic HDN may use Predictive Odds, in which conditional probabilities (or the HDN’s comparable h-complex notions) are replaced by ratios of these.
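As an illustrative reading of working with ratios of conditional probabilities (the exact Q-UEL definition of Predictive Odds may differ), here is the odds form of Bayes’ rule, reusing the made-up diagnostic-test numbers from the Bayesian step earlier:

```python
def posterior_odds(prior_prob, p_data_given_h, p_data_given_not_h):
    """Odds form of Bayes' rule: posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior_prob / (1 - prior_prob)
    likelihood_ratio = p_data_given_h / p_data_given_not_h
    return prior_odds * likelihood_ratio

# Made-up numbers: 1% prevalence, 90% sensitivity, 5% false-positive rate
odds = posterior_odds(0.01, 0.9, 0.05)
prob = odds / (1 + odds)   # convert the odds back to a probability
```

Working in odds makes the evidential update a single multiplication by the likelihood ratio, which is why ratio forms are attractive for chaining inference steps.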

Discussions on Employing Bayesian Net to Model Clinical Pathways (Replace BN by HDN to achieve Hyperbolic BN)

Development of a Clinical Pathways Analysis System with Adaptive Bayesian Nets and Data Mining Techniques 

D. KOPEC*, G. SHAGAS*, D. REINHARTH**, S. TAMANG

 

Pathway analysis of high-throughput biological data within a Bayesian network framework

Senol Isci, Cengizhan Ozturk, Jon Jones and Hasan H. Otu

Are Standardized Clinical Pathways Stymying Drug Innovation?

HDN :- Need for Agile Clinical Pathways that do not impede Drug Innovation

Oncologists Say Clinical Pathways Are Too Confining

Creating fixed plans for treating common malignancies promises to make the work of nurses and other staff more predictable and practiced, increasing efficiency and reducing errors that could lead to poor outcomes and hospitalization. For payers, pathways also give them another way to insert awareness of costs directly into the examining room.

“The way the pathways are constructed does promote consideration of value-driven practice, which is to say that the pathways vendors all take into account cost of care, but only after considering efficacy and toxicity,” said Michael Kolodziej, MD, national medical director of oncology solutions at Aetna, and a former medical director at US Oncology. “So there is an element here of reduction of cost of care, by trying to encourage physicians to consider the relative value of various treatment options. This has now become the mantra in oncology.”

Studies found that using pathways can indeed cut costs substantially without hurting outcomes.

Probabilistic Modeling, Predictive Analytics & Intelligent Design from Multiple Medical Knowledge Sources

Bioingine.com: Probabilistic Modeling and Predictive Analytics Platform for A.I.-driven Deep Learning to discover Pathways from Clinical Data to a Suggested Ontology for Pharmacogenomics, achieving Personalization and Driving Precision Medicine.

Data Integration in the Life Sciences: 11th International Conference, DILS 2015, Los Angeles, CA, USA, July 9-10, 2015, Proceedings

The Feature Diagram from the book above:-

[Figure: Pharmacogenomics pathway]

Pharmacogenomic knowledge representation, reasoning and genome-based clinical decision support based on OWL 2 DL ontologies

Combining Multiple Knowledge Sources and Ontologies:-

[Suggested Ontologies for Pharmacogenomics converging to help find a Pathway]
  • Patient Data (HL7, C-CDA)
  • Gene Ontology
  • ChEBI Ontology

Integration of Knowledge for Personalized Medicine:- Pharmacogenomics case-study

Looking Forward: The Case for Intelligent Design (and Infrastructure) in Life Science Biologics R&D Sponsored by: Dassault Systèmes; Alan S. Louie, Ph.D. January 2015

http://gate250.com/tc2/IDC%20Biologics%20White%20Paper.pdf

Semantic Data Lake Delivering Tacit Knowledge – Evidence based Clinical Decision Support

Can the complexity be removed and tacit knowledge delivered from the plethora of medical information available in the world?

“Let Doctors be Doctors”

Semantic Data Lake becomes the Book of Knowledge, ascertained by correlation and causation, resulting in Weighted Evidence

Characteristics of Bioingine.com Cognitive Computing Platform

  • Architecture style moves from Event driven to Semantics driven
  • Paradigm shift in defining system behavior – it is no longer predicated and deterministic – Non-Predicated Design
  • Design is “systemic”, contrasting with techniques such as object-oriented design, development and assembling of components
  • Such a system is better studied probabilistically.
  • Design is context driven, where the boundary diminishes between context and concept
  • System capability is probabilistically programmed by machine learning based on A.I., NLP and algorithms driven by an ensemble of Math
  • Design based on Semantic mining and engineering takes precedence over complex event processing (CEP). CEP and Event Driven Architecture (EDA) are part of predicated system design. A business rules engine may be an overkill.
  • Ontology is created driven by both information and numbers theory

–Algebra – relationships amongst variables

–Calculus – rate of change in one variable and its impact on the others

–Vector Space – study of the states of the variables

Bioingine.com algorithm design driven by Probabilistic Ontology

  • Probabilistic Ontology characterizes the ecosystem’s behavior
  • The Complex System’s semantic representation evolves generatively
  • The system is better represented by semantic multiples, overcoming the barrier of the Triple Store (RDF)
  • Humans interact with the system employing knowledge inference techniques
  • Inductive knowledge precedes knowledge by deduction

Bioingine.com is a Probabilistic Computing Machine

  • The system’s behavior is better modeled by the employ of probability, statistics and vector calculus (statistics based on the HDN, an advancement over the Bayes Net in which the acyclic constraint of the DAG is overcome)
  • Generally the system is characterized by high dimensionality in its data set (variability) in addition to volume and velocity.
  • Most computing is in-memory

BioIngine.com is designed based on mathematics borrowed from several disciplines, and notably from Paul A M Dirac’s quantum mechanics. The approach overcomes many of the inadequacies in the Bayes Net, which is based on the directed acyclic graph (DAG). Like knowledge relationships in the real world, and as was required for quantum mechanics, our approaches are neither unidirectional nor do they avoid cycles.

Bioingine.com Features –

  • Bi-directional Bayesian Probability for knowledge Inference and Biostatistics (hyperbolic complex).
  • Built upon a medical ontology (in fact this is discovered by machine learning and AI techniques).
  • Can be both hypothesis and non-hypothesis driven.
  • Quantum probabilities transformed to classical, integrating vector space, Bayesian knowledge inference, and the Riemann zeta function to deal with sparse data, finally driven by the overarching Hyperbolic Dirac Net.
  • Builds into web semantics employing NLP. (Integrates both System Dynamics and Systems Thinking).
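The bi-directional probability in the first bullet can be illustrated with a minimal sketch (illustrative only, not the platform’s implementation): where a Bayes Net edge carries a single conditioning direction, an HDN edge carries both directions as a pair, which classically amounts to holding both P(B|A) and P(A|B) derived from the joint probability and the two marginals. The numbers below are illustrative.

```python
def dual_conditionals(p_ab: float, p_a: float, p_b: float):
    """Forward/backward conditional pair from a joint probability
    and the two marginals: (P(B|A), P(A|B))."""
    return p_ab / p_a, p_ab / p_b

# Illustrative rain/sprinkler-style numbers (assumptions for the sketch):
forward, backward = dual_conditionals(p_ab=0.002, p_a=0.2, p_b=0.32)
print(forward, backward)  # ≈ 0.01 and ≈ 0.00625
```

Holding both directions on every edge is what lets the graph be bidirectional and cyclic rather than a DAG.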

Framework of Bioingine – Dirac-Ingine Algorithm Ensemble of Math

Q-UEL & HDN (More Info click the link)

Clinical Data Analytics – Loss of Innocence (Predictive Analytics) in a Large High Dimensional Semantic Data Lake


From Dr. Barry Robson’s notes:-

Is Data Analysis Particularly Difficult in Biomedicine?

Looking for a single strand of evidence in billions of possible semantic multiple combinations by Machine Learning

Of all disciplines, it almost seems that it is clinical genomics, proteomics, and their kin, which are particularly hard on the data-analytic part of science. Is modern molecular medicine really so unlucky? Certainly, the recent explosion of biological and medical data of high dimensionality (many parameters) has challenged available data analytic methods.

In principle, one might point out that a recurring theme in the investigation of bottlenecks to development of 21st century information technology relates to the same issues of complexity and very high dimensionality of the data to be transformed into knowledge, whether for scientific, business, governmental, or military decision support. After all, the mathematical difficulties are general, and absolutely any kind of record or statistical spreadsheet of many parameters (e.g., in medicine: age, height, weight, blood-pressure, polymorphism at locus Y649B, etc.) could, a priori, imply many patterns, associations, correlations, or eigensolutions to multivariate analysis, expert system statements, or rules, such as |Height:=6ft, Weight:=210 lbs> or more obviously |Gender:=male, Pregnant:=no>. The notation |observation> is the physicists’ ket notation that forms part of a more elaborate “calculus” of observation. It is mainly used here for all such rule-like entities, and they will generally be referred to as “rules”.

As discussed, there are systems which are particularly complex, so that there are many complicated rules not reducible to, and not deducible from, simpler rules (at least, not until the future time when we can run a lavish simulation based on physical first principles).

Medicine seems, on the whole, to be such a system. It is an applied area of biology, which is itself classically notorious as a nonreducible discipline.

In other words, nonreducibility may be intrinsically a more common problem for complex interacting systems of which human life is one of our more extreme examples. Certainly there is no guarantee that all aspects of complex diseases such as cardiovascular disease are reducible into independently acting components that we can simply “add up” or deduce from pairwise metrics of distance or similarity.

At the end of the day, however, it may be that such arguments are an illusion and that there is no special scientific case for a mathematical difficulty in biomedicine. Data from many other fields may be similarly intrinsically difficult to data mine. It may simply be that healthcare is peppered with everyday personal impact, life and death situations, public outcries, fevered electoral debates, trillion dollar expenditures, and epidemiological concerns that push society to ask deeper and more challenging questions within the biomedical domain than routinely happen in other domains.

Large Number of Possible Rules Extractable a Priori from All Types of High-Dimensional Data

For discovery of relationships between N parameters, there are almost always x^N potential basic rules, where x is some positive constant greater than unity that is characteristic of the method of data representation and study. For a typical rectangular data input like a spreadsheet of N columns,

2^N − N − 1 = X, the number of tag rules from which evidence requires being established. A record with 100 variables and x = 2 means:

2^100 − 100 − 1 = 1.267650600228229401496703205275 × 10^30
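The arithmetic above can be checked directly. This small sketch counts the 2^N − N − 1 multi-variable tag rules for a record of N columns: all non-empty subsets of the columns, minus the N single-column entries.

```python
def n_tag_rules(n: int) -> int:
    """Rules involving two or more of n columns: 2^n - n - 1
    (all subsets of the columns, minus the n singletons and the
    empty set)."""
    return 2 ** n - n - 1

print(n_tag_rules(100))
# 1267650600228229401496703205275, i.e. about 1.27 x 10^30,
# matching the figure quoted above.
```

The count is exponential in N, which is why evidence must be established over an astronomically large space of candidate rules.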

Evidence based Medicine driven by Inferential Statistics – Hyperbolic Dirac Net


http://sociology.about.com/od/Statistics/a/Introduction-To-Statistics.htm

From above link

Descriptive Statistics (A quantitative summary)

Descriptive statistics includes statistical procedures that we use to describe the population we are studying. The data could be collected from either a sample or a population, but the results help us organize and describe data. Descriptive statistics can only be used to describe the group that is being studied. That is, the results cannot be generalized to any larger group.

Descriptive statistics are useful and serviceable if you do not need to extend your results to any larger group. However, much of social sciences tend to include studies that give us “universal” truths about segments of the population, such as all parents, all women, all victims, etc.

Frequency distributions, measures of central tendency (mean, median, and mode), and graphs like pie charts and bar charts that describe the data are all examples of descriptive statistics.

Inferential Statistics

Inferential statistics is concerned with making predictions or inferences about a population from observations and analyses of a sample. That is, we can take the results of an analysis using a sample and can generalize it to the larger population that the sample represents. In order to do this, however, it is imperative that the sample is representative of the group to which it is being generalized.

To address this issue of generalization, we have tests of significance. A Chi-square or T-test, for example, can tell us the probability that the results of our analysis on the sample are representative of the population that the sample represents. In other words, these tests of significance tell us the probability that the results of the analysis could have occurred by chance when there is no relationship at all between the variables we studied in the population we studied.

Examples of inferential statistics include linear regression analyses, logistic regression analyses, ANOVA, correlation analyses, structural equation modeling, and survival analysis, to name a few.
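As a concrete sketch of one significance test mentioned above, the Pearson chi-square statistic for a 2×2 contingency table can be computed in a few lines. This is a plain-Python illustration, not a substitute for a statistics library, and the table values are made up.

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) for a
    2x2 contingency table given as [[a, b], [c, d]]."""
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    return stat

# A perfectly independent table gives a statistic of 0:
print(chi_square_2x2([[10, 10], [10, 10]]))  # 0.0
```

A large statistic relative to the chi-square distribution with one degree of freedom indicates the sample relationship is unlikely to have arisen by chance.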

Inferential Statistics:- Bayes Net  [Good for simple Hypothesis]

“Suppose that there are two events which could cause grass to be wet: either the sprinkler is on or it’s raining. Also, suppose that the rain has a direct effect on the use of the sprinkler (namely that when it rains, the sprinkler is usually not turned on)… The joint probability function is: P(G, S, R) = P(G|S, R)P(S|R) P(R)”. The example illustrates features common to homeostasis of biomedical importance, but is of interest here because, unusual in many real world applications of BNs, the above expansion is exact, not an estimate of P(G, S, R).
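The quoted expansion can be exercised numerically. In the sketch below, the conditional probability table values (P(R) = 0.2, P(S|R) = 0.01, P(S|¬R) = 0.4, P(G|S,R) = 0.99, and so on) are the standard textbook numbers for this example; they are assumptions here, chosen because they reproduce the probabilities P(S,R) = 0.002 and P(G,S,R) = 0.00198 used in the HDN illustration that follows.

```python
# Textbook CPT values for the rain/sprinkler/wet-grass example
# (assumed here, not taken from the source text).
P_RAIN = 0.2
P_SPRINKLER_GIVEN_RAIN = {True: 0.01, False: 0.4}
P_GRASS_GIVEN = {(True, True): 0.99, (True, False): 0.90,
                 (False, True): 0.80, (False, False): 0.0}
# keys of P_GRASS_GIVEN are (sprinkler, rain)

def joint(g: bool, s: bool, r: bool) -> float:
    """P(G, S, R) = P(G|S, R) * P(S|R) * P(R) -- the exact expansion."""
    p_r = P_RAIN if r else 1 - P_RAIN
    p_s = P_SPRINKLER_GIVEN_RAIN[r] if s else 1 - P_SPRINKLER_GIVEN_RAIN[r]
    p_g = P_GRASS_GIVEN[(s, r)] if g else 1 - P_GRASS_GIVEN[(s, r)]
    return p_g * p_s * p_r

print(joint(True, True, True))  # 0.99 * 0.01 * 0.2 ≈ 0.00198
```

Summing the joint over all eight truth assignments returns 1, confirming that the expansion is exact rather than an estimate.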

Inferential Statistics: Hyperbolic Dirac Net (HDN) – System contains innumerable Hypotheses

HDN Estimate (forward and backward propagation)

P(A=’rain’) = 0.2 # <A=’rain’ | ?>

P(B=’sprinkler’) = 0.32 # <B=’sprinkler’ | ?>

P(C=’wet grass’) = 0.53 # <? | C=’wet grass’>

Pxx(not A) = 0.8

Pxx(not B) = 0.68

Pxx(not C) = 0.47

# <B=’sprinkler’ | A=’rain’>

P(A, B) = 0.002

Px(A) = 0.2

Px(B) = 0.32

Pxx(A, not B) = 0.198

Pxx(not A, B) = 0.32

Pxx(not A, not B) = 0.48

#<C=’wet grass’|A=’rain’,B=’sprinkler’>

P(A,B,C) = 0.00198

Px(A, B) = 0.002

Px(C=’wet grass’) =0.53

Pxx(A,B,not C) = 0.00002

End
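The listed probabilities are mutually consistent, which a quick check confirms (an illustrative sketch, not part of the original listing): each “not” entry is the complement of its counterpart, and each deeper joint decomposes as P(X) − P(X, Y). Note that Pxx(not A, B) and Pxx(not A, not B) are quoted to two decimal places; the exact values are 0.318 and 0.482.

```python
# Probabilities as listed in the HDN example above.
P_A, P_B, P_C = 0.2, 0.32, 0.53
P_AB, P_ABC = 0.002, 0.00198

# Complements:
assert abs((1 - P_A) - 0.8) < 1e-9        # Pxx(not A)
assert abs((1 - P_B) - 0.68) < 1e-9       # Pxx(not B)
assert abs((1 - P_C) - 0.47) < 1e-9       # Pxx(not C)

# Marginal decompositions P(X) = P(X, Y) + P(X, not Y):
assert abs((P_A - P_AB) - 0.198) < 1e-9       # Pxx(A, not B)
assert abs((P_AB - P_ABC) - 0.00002) < 1e-9   # Pxx(A, B, not C)

# The remaining entries are rounded in the text:
p_notA_B = P_B - P_AB               # = 0.318 (quoted as 0.32)
p_notA_notB = 1 - P_A - P_B + P_AB  # = 0.482 (quoted as 0.48)
assert abs(p_notA_B - 0.318) < 1e-9
assert abs(p_notA_notB - 0.482) < 1e-9
```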

Since the focus in this example is on generating a coherent joint probability, Pif and Pif* are not included in this case, and we obtain {0.00198, 0.00198} = 0.00198. We could use them to dualize the above to give conditional probabilities. Being an exact estimate, it allows us to demonstrate that the total stress after enforced marginal summation (departure from initial specified probabilities) is very small, summing to 0.0005755. More typically, though, a set of input probabilities can be massaged fairly drastically. Using the notation initial -> final, the following transitions occurred after a set of “bad initial assignments”.

P (not A) = P[2][0][0][0][0][0][0][0][0][0] = 0.100 -> 0.100000

P (C) = P[0][0][1][0][0][0][0][0][0][0] = 0.200 -> 0.199805

P ( F,C) = P[0][0][1][0][0][1][0][0][0][0] = 0.700 -> 0.133141

P (C,not B,A) = P[1][2][1][0][0][0][0][0][0][0] = 0.200 -> 0.008345

P (C,I,J,E,not A) = P[2][1][0][1][0][0][0][1][1][0] = 0.020 -> 0.003627

P (B,F,not C,D) = P[0][1][2][1][0][1][0][0][0][0] = 0.300 -> 0.004076


Data Science On Non-Locality, Hidden Problems and Lack of Information

Discourse with Dr. Barry Robson, about something quite bizarre in solving an unknown problem against uncertainty.


Srinidhi Boray (SB) – Hey Barry, Question!!! “Graphing should pave the way for creating tacit knowledge by context for a chosen hypothesis; and infinite varieties of the hypothesis are technically possible in an ecosystem”

Dr. Barry Robson (BR) – Your question does not make sense; it is like John Nash’s lecture while he was having a schizophrenic breakdown in “A Beautiful Mind”:

“And so we see that if the zeroes of the Riemann Zeta Function correspond to singularities in the space-time then conventional number theory breaks down in the face of relativistic exploration… Sometimes, our expectations are betrayed by the numbers… And variables are impossible to assign any rational value…”

SB – What I am struggling with is this:-

High dimensionality creates billions of probabilities; from these billions of probabilities, knowledge is inferred by working through the morass, applying probabilistic ontology and looking for weighted evidence.

As such, knowledge is spread over the system topology – so it is non-local. This means the ecosystem Consciousness is all of the non-local knowledge in a continuum, both as experience and evidence.

With the above proposition, how can one solve a problem that is yet to surface, since non-local knowledge supposes that a problem is lurking in the corner of which I am not aware? Is this a bizarre question?

BR – “But, if you meant that we should emphasize the vast spaces and entropy that we need to overcome, then yes, that is the essence of the problem.”

On Non-Locality, Hidden Problems and Lack of Information

Dr. Barry Robson

We sometimes speak of “non-locality” to mean some business to do with hidden problems, when one does not know even how to get the information to achieve the solution, even assuming that a solution exists at all to allow us to get an exact answer. This can seem odd because location would not seem to have much to do with the solvability of problems, except perhaps in a mathematical or simulated space. In some sense, of course, that may indeed be exactly what we do mean, and we might mean more precisely that we cannot locate something in a descriptive space.

We might have meant “non-locatability”, i.e. the needle-in-a-haystack, and essentially entropic, problem.

But equally and alternatively, we might mean that the shape of a space, real or mathematical, changes so locally, so suddenly, as to have the characteristics of a singularity, making behavior in those regions impossible to integrate, and requiring infinite information to locate and describe it precisely, and with complete certainty. We might speak better of “non-localizability”, or perhaps of “a limit to localizability”.

Alternatively, because we cannot dissect it further, “non-separability” might be a better word, and as discussed briefly below, Schrödinger used that term and it later transformed into non-locality.

We can reach many concepts that seem to have rather little to do with each other and even less to do with our starting point, although there also seems to be a thread of connection. Many and perhaps ultimately all of these may in fact be manifestations of the same core ideas, as follows.

For a long time after Newton, it was envisaged that we could predict the past and future of everything if we had the required information. We speak of not knowing the details of the motions of molecules, relying instead on certain important statistics like temperature and pressure, simply because we cannot look at each particle and its properties and plot the effects of Newton’s laws of motion backward and forward in time.

There are two modern views that say that these things are usually unknowable, that we cannot get all the required information: the Uncertainty Principle of quantum mechanics, and Chaos Theory.

Looking first at the first of these: Albert Einstein, the most famous objector to a fundamentally probabilistic nature for quantum mechanics, nonetheless felt that there had to be hidden variables from which the seemingly probabilistic nature of things emerged. This was rather like particle properties and Newton’s laws, so in that sense he was Newtonian, and certainly a Newtonian revisionist, applying corrections to Newton’s view of the world to allow for the effect of the finite velocity of light and the bending of space-time.

Bohm’s famous interpretation is also all about hidden variables, but it seems somewhat different: it constitutes an implicate order (hidden order) which organizes a particle, and which may itself be the result of yet a higher meta- or super-implicate order which organizes a field.

The flavor of the above seems closer to what we can quite readily understand in the modern world of widespread computer technology. We are very familiar with worlds of people and things that are not really in places and moments of time that they seem to be, nor are they really doing things with probabilities that can only be understood and quantified retrospectively by observations and counting and statistical analysis, but are instead pre-computed to occur with that probability. It is of course true of movies and plays, and stories and myths and legends in general, and perhaps even in the ongoing narrative in our brains that we call consciousness, but it is certainly most clearly linked to precise concepts about information in the world of simulation and computer games.

From that perspective, quantum mechanics is like the processor architecture and machine language, and as in Plato’s “Allegory of the Shadows in the Cave”, what is simulated and perceived has little to do with what is “really” the case. Importantly, like a real computer system, it has to forge reality with a limited information capacity, so that we cannot have infinite velocities, precise singularities, and total separation of the behavior of particles. Even in the wave description, the behavior of distinct particles, unlike the evolution of the wave equation, cannot always be separated into finer-grained parts.

“Non-separability” in that kind of sense was originally, and perhaps still should be, used in place of “non-locality”. As in our computer simulation, we may not have the information available to separate out events, so that they appear entangled. As in our brains, only a few bits per second might in fact reach central working memory, and like the blind spot in our retina, we fill the gaps with the illusion of assumption. We may not have enough information to describe every little bump and crack in a mountain, but resort to an equation to simulate fractals.

Singularities like those we discuss inside black holes stand out as requiring infinite precision for their description and represent a seeming breach of the uncertainty principles: they are just too darned sharp. The singularity of a black hole cannot be observed because there seems to be a principle of cosmic censorship consistent with lack of availability for any information about a singularity. Since anything can classically become a black hole if it has some mass and is small and localized enough, i.e. can reach a certain critical density, black holes for this reason (and other presumably related reasons) meet up with the quantum world.

The information about the “action” of a quantum system, at least as an i-complex quantity as is required for Schrödinger’s wave mechanics, cannot be finer grained than about 1/(4 × pi) nor exceed a change of 2 × pi or more without becoming redundant (the phase problem).

But as noted above, there are two modern views that say how we cannot get all the required information: not just the Uncertainty Principle of quantum mechanics, but also Chaos Theory. We may expect that they are not distinct. Indeed, quantum chaos is a branch of physics which studies chaotic classical dynamical systems in terms of quantum theory, to try and understand the relationship between quantum mechanics and classical Chaos Theory. The correspondence principle sees classical mechanics as the classical limit of quantum mechanics. If so, there must be quantum mechanisms underlying classical chaos. A chaotic system is often taken as synonymous with a non-linear one. In a non-linear system, output is unpredictable, as it is not proportional in any sense to input; there is a deficiency of information between the independent (input) and dependent (output) variables. There are fundamental connections here with Riemann, who worked on number theory and on the curvature of spaces that underlay Einstein’s theories of relativity, though as far as we know, these two topics were distinct, at least as much as anything is distinct from anything else, for Riemann. In the 1950s, though, Atle Selberg was studying the number-theoretic implications of the analytic structure of certain curved spaces, and derived an equation that had the eigenvalues of a differential operator on one side and the lengths of closed curves in the space on the other side. This equation, known as the Selberg Trace Formula, encodes the number-theoretic properties that underlie the structure of the curved space. Dennis Hejhal later computed eigenvalues and closed curves for various curved spaces. In one of his early calculations, Hejhal thought he had solved an important conjecture by Riemann about the location of certain non-trivial zeros in the i-complex space of the Riemann zeta function that controls the location of the prime numbers in the natural number series.
These zeros turned out to really be the zeta zeros, but their presence turned out to be an error, in the sense that the curved space had a singularity that required an adjustment in the differential operator in order to handle it, and once that was done, the zeta zeros disappeared. However, there remains the well known interest of mathematical physicists in connecting quantum chaos with the mysteries of number theory through Riemann’s zeta function, and there may remain considerable mileage in attempting to understand the curvature of spaces in terms of limitations in the amount of information available to describe them.

For example, intriguingly, in 1977 Berry and Tabor made a still unproven mathematical conjecture that for the quantum dynamics of flow on a compact Riemann surface, the spectrum of quantum energy eigenvalues behaves like a sequence of independent random variables, provided that the underlying classical dynamics contain enough information to be completely integrable.

Quantum Theory driven (QEXL Approach) Cognitive Computing Architecture resolving Healthcare Interoperability (BigData – HIE/ ACO )

http://www.BioIngine.com

[healthcare cognitive computing platform]

Conquering Uncertainties Creating Infinite Possibilities

(Possible application :- Achieving Algorithm Driven ACO)

[Figure: HDN Cognitive Computing]

Introduction

The QEXL Approach is a Systems Thinking driven technique that has been designed with the intention of developing “Go To Market” solutions for Healthcare Big Data applications requiring integration between Payor, Provider, Health Management (Hospitals), Pharma, etc.; where systemic complexities teetering on the “edge of chaos” pose enormous challenges in achieving interoperability, owing to the existence of a plethora of healthcare system integration standards and the management of unstructured data in addition to structured data ingested from diverse sources. Additionally, The QEXL Approach targets the creation of Tacit Knowledge Sets by inductive techniques and probabilistic inference from diverse sets of data characterized by volume, velocity and variability. In fact, The QEXL Approach facilitates algorithm-driven Proactive Public Health Management, while rendering business models for achieving an Accountable Care Organization most effective.

The QEXL Approach is an integrative multivariate declarative cognitive-architecture proposition to develop Probabilistic Ontology driven Big Data applications creating interoperability among Healthcare systems, where it is imperative to develop an architecture that enables systemic capabilities such as Evidence Based Medicine, Pharmacogenomics, biologics, etc.; while also creating opportunities for studies such as Complex Adaptive System (CAS). Such an approach is vital to develop the ecosystem as a response to mitigate the Healthcare systemic complexities. Especially, CAS studies make it possible to integrate both macro aspects (such as epidemiology) related to Efficient Healthcare Management Outcomes, and micro aspects (such as Evidence Based Medicine and Pharmacogenomics, which help achieve medicine personalization) delivering Efficacy in Healthcare delivery, to help achieve systemic integrity. In The QEXL Approach, QEXL stands for “Quantum Exchange Language”, and Q-UEL is the initial proposed language. The QEXL Consortium embraces Quantal Semantics, Inc. (NC) and Ingine, Inc. (VA), and collaborates with The Dirac Foundation (UK), which has access to Professor Paul Dirac’s unpublished papers. The original consortium grew as a convergence of responses to four stimuli:

  1. The “re-emerging” interest in Artificial Intelligence (AI) as “computational thinking”, e.g. under the American Recovery Act;
  2. The President’s Council of Advisors on Science and Technology December 2010 call for an “XML-like” “Universal Exchange Language” (UEL) for healthcare;
  3. A desire to respond to the emerging Third World Wide Web (Semantic Web) by an initiative based on generalized probability theory  – the Thinking Web; and
  4. In the early course of these efforts, a greater understanding of what Paul Dirac meant in his Nobel Prize dinner speech, where he stated that quantum mechanics should be applicable to all aspects of human thought.

The QEXL Approach

The QEXL Approach is developed based on considerable experience in Expert Systems, linguistic theory, neurocognitive science, quantum mechanics, mathematical and physics-based approaches in Enterprise Architecture, Internet Topology, Filtering Theory, Semantic Web, Knowledge Lifecycle Management, and principles of Cloud Organization and Integration. The idea for a well-formed probabilistic reasoning programming language is simple. Importantly, also, the more essential features of it for reasoning and prediction are correspondingly simple, such that the programmers are not necessarily humans, but structured and unstructured (text-analytic) “data mining” software robots. We have constructed a research prototype Inference Engine (IE) network (and more generally a program) that “simply” represents a basic Dirac notation and algebra compiler, with the caveat that it extends to Clifford-Dirac algebra; notably, a Lorentz rotation of the imaginary number i (such that ii = −1) to the hyperbolic imaginary number h (such that hh = +1), corresponding to Dirac’s s (γtime, or γ5), is applied.
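The h with hh = +1 mentioned above is the unit of the split-complex (hyperbolic) numbers. A minimal sketch of that arithmetic, illustrative only and not the prototype compiler's implementation:

```python
class SplitComplex:
    """Numbers a + b*h with h*h = +1 (contrast i*i = -1)."""

    def __init__(self, a: float, b: float):
        self.a, self.b = a, b  # real part, h part

    def __mul__(self, other: "SplitComplex") -> "SplitComplex":
        # (a1 + b1 h)(a2 + b2 h) = (a1 a2 + b1 b2) + (a1 b2 + b1 a2) h,
        # using h*h = +1 rather than -1.
        return SplitComplex(self.a * other.a + self.b * other.b,
                            self.a * other.b + self.b * other.a)

    def __repr__(self) -> str:
        return f"{self.a} + {self.b}h"

h = SplitComplex(0, 1)
print(h * h)  # 1 + 0h, i.e. hh = +1, whereas ii = -1
```

It is this sign flip that turns complex wave-mechanical amplitudes into classical but still two-directional quantities.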

[Outside the work of Dr. Barry Robson, this approach has not been tried in the inference and AI fields, with one highly suggestive exception: since the late 1990s it has occasionally been used in the neural network field by T. Nitta and others to solve the XOR problem in a single “neuron” and to reduce the number of “neurons” generally. Also suggestively, in particle physics it may be seen as a generalization of the Wick rotation time → i × time used by Richard Feynman and others to render wave mechanics classical. It retains the mathematical machinery and philosophy of Schrödinger’s wave mechanics but, instead of probability amplitudes as wave amplitudes, it yields classical but complex probability amplitudes encoding two directions of effect: “A acts on B, and B differently on A”. It maps to natural language where words relate to various types of real and imaginary scalar, vector, and matrix quantities. Dirac’s <bra| relationship |ket> becomes the XML-like semantic triple (subject, relationship, object).]

The QEXL Approach involves the following interdependent components.

  • Q-UEL (Probabilistic Inference + Phenomenon Of Interest): Addresses global issues that potentially pervade all human endeavors, and hence universal interoperability is of key importance
  •  (Inference Engine + Semantic Inferencing): Project addressing universal meaning underlying diverse natural languages on the Internet, and the use of that in knowledge representation
  • Inference Engine + Decentralized Infra: A link infrastructure for intra- and inter-cloud interoperability and integration in a coherent high level “metaware” environment. This component can also be explored to be replaced with simpler industry ready solutions such as MarkLogic® Enterprise NoSQL Database on Hadoop Distributed File System.

In an endeavor of this kind the partitions-of-work are inevitably artificial; it is important that this does not impede the integrity of optimal solutions. The most important aspect of The QEXL Approach is, in essence, that architecturally the Probabilistic Inference (PI) and the Data Architecture for the Inference Engine (IE) are designed to be cooperative; software robots are created while PI and IE interact; and the inference knowledge gained by the PI and IE provides rules for solvers (robots) to self-compile and conduct queries, etc. This is therefore the grandeur of the scheme: the approach facilitates programming by compilers so that writing the inference network is easy, but it is not required to write the inference net as input code to compile, with the exception of reusable metarules as Dirac expressions with variables to process other rules by categorical and higher-order logic. The robots are designed and programmed to do the remaining coding required to perform as solvers, so the notion of a compiler disappears under the hood. The robots are provided with well-formed instructions as well-formed queries. Once inferences are formed, different “what-if” questions can be asked: given that probability, or that being the case, what is the chance of… and so on. It is as if, having acquired knowledge, the Phenomenon Of Interest (POI) is in a better state to explore what it means. Hyperbolic Dirac Networks (HDNs) are inference networks capable of overcoming the limitations imposed by Bayesian Nets (and statistics) and creating generative models richly expressing the “Phenomenon Of Interest” (POI) by the action of expressions containing binding variables. This may be thought of as an Expert System, but analogous to Prolog data and Prolog programs that act upon the data, albeit here a “probabilistic Prolog”.
Upfront should be stated the advantages over Bayes Nets as a commonly used inference method; but rather than compete with such methods, the approach may be regarded as extending them. Indeed a Bayes Net, as a static directed acyclic conditional probability graph, is a subset of the Dirac Net as a static or dynamic general bidirectional graph with generalized logic and relationship operators, i.e. empowered by the mathematical machinery of Dirac’s quantum mechanics.

 The QEXL Approach Theory :- Robson Quantitative Semantics Algebra (RQSA)

Developed by Dr. Barry Robson

Theory:- The QEXL Approach is based on Robson Quantitative Semantics Algebra – RQSA (link to the development of the algorithm – overcoming limitations of the Gold Standard Bayesian Network – to solve uncertainty while developing probabilistic ontology)

Impact Of The QEXL Approach

The QEXL Approach, by creating a Probabilistic Ontology based on Clifford-Dirac algebra, offers an immense opportunity to advance architectures that tackle large looming problems involving Systems of Systems, in which vast amounts of uncertain information emerge. Such systems are generally designed and developed by Cartesian methods, and so offer no viable way to deal with vast uncertain information once ridden with complexity; especially when the complexity of the context calls for multiple ontologies, such a system inherently defies Cartesian methods. The QEXL Approach develops into an ecosystem response: it overcomes the Cartesian dilemma (link to another example of the Cartesian Dilemma) and allows generative models to emerge that richly express the POI. The models develop generatively, abstracting the behavior of the POI sufficiently for the IE and the Solvers to support a variety of evidence-based studies, as well as systemic studies of Complex Adaptive Systems and Complex Generative Systems afflicted by multiple cognitive challenges. In particular, The QEXL Approach has the potential to address complex challenges such as evidence based medicine (EBM), a mission that the DoD’s Military Health System envisions as it modernizes its Electronic Health Record system, the Veterans Health Information Systems and Technology Architecture (VistA). Vast potential also exists in addressing the Veterans Administration’s (VA) Million Veteran Program (MVP), an effort by the VA to consolidate genetic, military-exposure, health, and lifestyle information in one single database. By identifying gene-health connections, the program could consequentially advance disease screening, diagnosis, and prognosis, and point the way toward more effective, personalized therapies.

Although The QEXL Approach is currently targeted at the healthcare and pharmaceutical domains, where recognition of uncertainty is vital in observations, measurements and predictions and in the probabilities underlying a variety of medical metrics, the scope of application is much more general. The QEXL Approach is to create a generic multivariate architecture for complex systems, characterized by a Probabilistic Ontology, that employs generative order to model the POI, facilitating the creation of self-regulating “communities of interest” in diverse domains that require the integration of disciplines for complex studies. The metaphor of the “Cambrian Explosion” may aptly represent the enormity of the possibilities The QEXL Approach can stimulate in studies that tackle large systemic concerns riddled with uncertain information and random events.


The inference engine can be realized with solutions such as MarkLogic NoSQL + Hadoop (HDFS). http://www.marklogic.com/resources/marklogic-and-hadoop/

It is interesting to note that, in the genesis of the various NoSQL solutions evolving around Hadoop, a few insights have emerged on the need to design components with their cooperative existence in mind.

The Goal of The QEXL Approach: It Is All About Contextualization

The goal in employing The QEXL Approach is to realize a cognitive multivariate architecture for Probabilistic Ontology, and to advance the Probabilistic Ontology based architecture for context-specific applications such as healthcare. Specifically, The QEXL Approach will develop PI that helps create generative models depicting the systemic behavior of a POI riddled with vast uncertain information. Generally, the uncertainty in this vast information is introduced by the System-of-Systems complexity that must reconcile multiple ontologies, standards, etc., which in turn introduce cognitive challenges. A further goal of The QEXL Approach is to overcome such challenges by addressing interoperability at all levels, including the ability to communicate data and knowledge in a way that recognizes uncertainty in the world, so that automated PI and decision-making become possible. The aim is semiotic portability, i.e. the management of signs and symbols, dealing especially with their function and interactions in both artificially constructed and natural languages. Existing systems for managing semantics and language are mostly systems of symbolic, not quantitative, manipulation, with the primary exception of BayesOWL. RQSA (Robson Quantitative Semantic Algebra, so named by its author Dr. Barry Robson to distinguish it from other analogous systems) underlies Q-UEL. It is the development of (a) the details of those particular aspects of Dirac’s notation and algebra that prove to be of practical importance in generalizing and correctly normalizing Bayes Nets according to Bayes’ Theorem (i.e. controlling coherence, which ironically Bayes Nets usually neglect, as they are unidirectional); (b) a merger with the treatment of probabilities and information based on finite data using the Riemann Zeta function, which he has employed for many years in bioinformatics and data mining (http://en.wikipedia.org/wiki/GOR_method); and (c) the extension to more flavors of hyperbolic imaginary number to encode intrinsic “dimensions of meaning” under a revised Roget’s thesaurus system.
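A toy illustration of point (b), under the assumption that the finite-data estimator is built from partial sums of the Riemann Zeta series at s = 1 (harmonic numbers, the small-sample analogue of the natural logarithm used in the GOR literature); the exact functional form in RQSA may differ:

```python
# Illustrative sketch only: estimating information from small counts
# with partial sums of the Riemann zeta series, zeta(s, n) here
# meaning sum_{k=1..n} 1/k**s. At s = 1 these are harmonic numbers.

import math

def zeta_partial(s, n):
    return sum(1.0 / k**s for k in range(1, n + 1))

def expected_info(observed, expected, s=1):
    """I ~ zeta(s, observed) - zeta(s, expected): a finite-data
    surrogate for ln(observed/expected), i.e. a log-odds in nats."""
    return zeta_partial(s, observed) - zeta_partial(s, expected)

# For large counts the estimator approaches the naive log-ratio:
print(round(expected_info(1000, 500), 3), round(math.log(1000 / 500), 3))
# -> 0.693 0.693
# For tiny counts it stays finite where log(0) would blow up:
print(expected_info(2, 0))  # -> 1.5  (= 1/1 + 1/2)
```

The practical point is robustness: with sparse clinical data, many counts are zero or tiny, and this family of estimators degrades gracefully where a raw log-ratio does not.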

The Layers of the Architecture Created by The QEXL Approach

The QEXL Layered View


Layer 1- Contextualization: Planning, Designing driven by Theories 

A.    Probabilistic Ontology creating Inferencing leading into Evidence Based Medicine

i.     Aspects addressed by Q-UEL Tags and Kodaxil Inferencing

  1. Autonomy / Solidarity
  2. Inferencing (Kodaxil and Q-UEL)
  3. MetaData
  4. Security / Privacy
  5. Consented vs Un-consented Data
  6. Creating Incidence Rules (predicated – Q-UEL and Kodaxil)

ii.     Kodaxil:-  Enforcing Semantics across data sources (global text and data interoperability) – universal meaning underlying diverse natural languages on the Internet

iii.     Fluxology:-  Logical Meta Data Cloud (A link infrastructure for intra- and inter-cloud interoperability and integration in an international setting)

  1. Adaptive
  2. Emergent Data Usage Patterns (networks of networks – enabled by Probabilistic Ontology rules)
  3. Modeless Emergent Hierarchies
  4. Federation and Democratization Rule for Data (contract, trust, certificates, quality)

B.    Development of Probabilistic Model Representing Universal Abstraction of Phenomenon Of Interest

C.   Targeting Architecture to Application

  • Evidence Based Medicine
  • Genomics
  • Systemic Healthcare Studies
  • etc

Layer 2 – A: Operational Architecture (Logical)

A.    Reference Architecture

  1. Business ConOps (Use cases)
  2. Conceptual Target Solution Architecture

Layer 2 – B: Data Management – Data Ingestion and Processing 

  1. The processing of entries in the source data into a form suitable for data mining
  2. The data mining of that processed data to obtain summary rules
  3. The capture of the appropriate released summary rules for inference
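The three steps above can be sketched as follows (a minimal sketch with a hypothetical record format and rule keys; the platform’s actual processing is far richer):

```python
# (1) normalize raw source entries, (2) mine them into summary
# counts, (3) release the counts as conditional-probability rules
# that an inference engine can consume.

from collections import Counter

raw = [
    "smoker=yes; cough=yes",
    "smoker=yes; cough=no",
    "smoker=no;  cough=no",
    "smoker=yes; cough=yes",
]

# Step 1: processing - parse each entry into attribute=value pairs.
records = [dict(kv.strip().split("=") for kv in line.split(";"))
           for line in raw]

# Step 2: data mining - count single items and co-occurring pairs.
items = Counter(kv for r in records for kv in r.items())
pairs = Counter((a, b) for r in records
                for a in r.items() for b in r.items() if a < b)

# Step 3: rule capture - turn counts into summary rules keyed as
# (consequent, antecedent) -> P(consequent | antecedent).
rules = {}
for (a, b), n_ab in pairs.items():
    rules[(b, a)] = n_ab / items[a]
    rules[(a, b)] = n_ab / items[b]

# P(cough=yes | smoker=yes): 2 of the 3 smokers cough.
print(rules[(("cough", "yes"), ("smoker", "yes"))])  # -> 2/3
```

Because both directions of each rule are captured, the released summaries can feed the bidirectional (Dirac Net) inference described earlier as well as an ordinary Bayes Net.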

B.    Data Storage and Retrieval, Transactions

  1. Secure Storage and Retrieval
  2. Enable Secure Transactions
  3. Secure Data Exchange among several stake-holders and data owners

C.    Data Lifecycle, Data Organization Rules, Data Traceability to the Events

  1. Security and privacy by encryption and disaggregation of the EHR in a manner that is balanced against authorized access for extraction of global clinical and biomedical knowledge.
  2. Mechanisms for fine-grained consent permitting sharing and data mining.
  3. Mechanisms for secure alerting of patient or physician by backtrack when an authorized researcher or specialist notes that a patient is at risk.
  4. Structure and format that allows all meaningful use cases to be applied in reasonable time, including large-scale data mining.
  5. Assemblies across sources and data users forming contextual work patterns
  6. Hardened Security Framework

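Items 1 and 3 above (disaggregation of the EHR balanced against authorized backtracking) can be illustrated with a minimal keyed-pseudonym sketch; the key-escrow arrangement and record fields are assumptions for illustration, not the platform’s actual design:

```python
# Sketch: split each EHR into an identity part and a clinical part,
# linked only by a keyed pseudonym (HMAC). Miners see clinical data
# plus the pseudonym; only a key holder can backtrack to the patient.

import hmac
import hashlib

SECRET_KEY = b"held-by-the-trusted-authority"  # assumed key escrow

def pseudonym(patient_id: str) -> str:
    """Deterministic, non-reversible link between the two stores."""
    return hmac.new(SECRET_KEY, patient_id.encode(),
                    hashlib.sha256).hexdigest()[:16]

record = {"patient_id": "MRN-00123", "name": "J. Doe",
          "dx": "type-2 diabetes", "consented_for_mining": True}

identity_store = {}   # access-controlled, never released
mining_store = []     # released for large-scale data mining

pid = pseudonym(record["patient_id"])
identity_store[pid] = {"patient_id": record["patient_id"],
                       "name": record["name"]}
if record["consented_for_mining"]:   # fine-grained consent gate
    mining_store.append({"pseudonym": pid, "dx": record["dx"]})

# Backtrack: an authorized alert service resolves the pseudonym
# when a researcher flags a patient at risk.
assert identity_store[mining_store[0]["pseudonym"]]["patient_id"] == "MRN-00123"
```

The design choice here is that de-identification is reversible only through the key holder, which is what allows mining at scale while preserving the alerting path back to patient or physician.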
D.    Large EHR repository scaling

E.    Data Mining Rules

F.     Extracting and creating Incidence Rules

G.    Experimenting, observing and creating Semantic Inferences

H.    Visualization 

The two layers below can be implemented on a variety of Big Data platforms such as Hortonworks, Pivotal, and Altiscale.

Layer 3 – Application Layer (Schema-less Knowledge Repository for structured and unstructured data – KRS)

Layer 4 – Infrastructure Architecture (Physical) (Hadoop and MapReduce for Large Data File-management and Processing; and Distributed / Concurrent Computations)