METHODS USED IN The BioIngine APPROACH: ROOTS OF THE HYPERBOLIC DIRAC NETWORK (HDN). – Dr. Barry Robson
General Approach: Solving the Representation and Use of Knowledge for the Real World.
Blending Systematically Produced and Unsystematically Existing Information and Synthesizing the Knowledge.
The area of our efforts in the support of healthcare and biomedicine is essentially one in Artificial Intelligence (AI). For us, however, this means a semantic knowledge engineering approach intimately combined with principles of probability theory, information theory, number theory, theoretical physics, data analytic principles, and even linguistic theory. These contributions, and their unification in the manner described briefly below, constitute the general theory of an entity called the Hyperbolic Dirac Net (HDN), a means of representing and probabilistically quantifying networks of knowledge of both a simple probabilistic, and an even more sophisticated probabilistic semantic, nature in a way that has not been possible for previous approaches. It provides the core methodology for making use of medical knowledge in the face of considerable uncertainty and risk in the practice of medicine, and not least the need to manage massive amounts of diverse data, including both structured data and unstructured natural language text. As described here, the HDN and its supporting Q-UEL language can also handle the kind of interactions between things that we describe in natural language by using verbs and propositions, take account of the complex lacework of interactions between things, and do so when our knowledge is of probabilistic character. These abilities are of pressing and crucial importance to the development of a higher level of information technology in many fields, but particularly in medicine.
In a single unified strike, the mathematics of the HDN, adapted in a virtually seamless and natural way from a standard in physics due to Nobel Laureate Paul Dirac as discussed below, addresses several deficiencies (both well-known and less well advertised) in current forms of automated inference. These deficiencies largely relate to assumptions and representations that are not fully representative of the real world. They are touched upon later below, but the general one of most strategic force is as follows. As is emphasized and discussed here, of essential importance to modern developments in many industries and disciplines, and not least in medicine, is the capture of large amounts of knowledge in what we call a Knowledge Representation Store (KRS). Each entry or element in such a store is a statement about the world. Whatever the name, the captured knowledge includes basic facts and definitions about the world in general, but also knowledge about specific cases (looking more like what is often meant by “data”), such as a record about the medical status of a patient or a population. From such a repository of knowledge, general and specific, end users can invoke automated reasoning and inference to predict, aid decision making, and move forward acting on current best evidence. Wide acceptance and pressing need are demonstrated (see below) by numerous efforts from the earliest Expert Systems to the emerging Semantic Web, an international effort to link not just web pages (as with the World Wide Web) but also data and knowledge, and comparable efforts such as the Never-Ending Language Learning system (NELL) at Carnegie Mellon University. The problem is that there is no single agreed way of actually using such a knowledge store in automated reasoning and inference, especially when uncertainty is involved.
This problem arises in part because there is the sense that something deep is still missing in what we mean by “Artificial Intelligence” (AI), and in part from lack of agreement on how to reason with connections of knowledge represented as a general graph. The latter is even to the extent that the popular Bayes Net is, by its original definition, a directed acyclic graph (DAG) that ignores or denies cyclic paths in knowledge networks, in stark contrast to the multiple interactions in a “mind map” or concept map in student study notes, a subway map, biochemical pathways, physiological interactions, the wiring of the human brain, and the network of interactions in ecology. Primarily, however, the difficulty is that the elements of knowledge in the Semantic Web and other KRS-like efforts are for the most part presented as authoritative assertions rather than treated probabilistically. This is despite the fact that the pioneering Expert Systems for medicine needed from the outset to be essentially probabilistic in order to manage uncertainty in the knowledge used to make decisions and in the combining of it, and to deduce the most probable diagnosis and select the best therapy amongst many initial options, although here too there is lack of agreement, and almost every new method represented a different perception and use of uncertainty. Many of these aspects, such as use of a deeper theory and arrangement of knowledge elements into a general graph, might be addressed in the way a standard repository of knowledge is used, i.e. applied after a KRS is formed, but a proper and efficient treatment can only associate probability with the elements of represented knowledge from the outset (even though, like any aspect of knowledge, the probabilities should be allowed to evolve by refinement and updating). One cannot apply a probabilistic logic without probabilities in the axioms, or at least not to any advantage.
Further, it makes no sense to have elements of knowledge, however they are used, that state unequivocally that some things are true, e.g. that obese patients are type 2 diabetics, because it is a matter of probability, in this case describing the scope of applicability of the statement to patients, i.e. only some 20-30% are so. Indeed, in that case, using only certainty or near-certainty, this medically significant association might never have appeared as a statement in the first place. Note that the importance of probabilistic thinking is also exemplified here by the fact that the reader may have been expecting or thinking in terms of “type 2 diabetic patients are obese”, which is not the same thing and has a probability of about 90%, closer to certainty, but noticeably still not 100%. All the above aspects, including the latter “two way” diabetes example, relate to matters that are directly relevant to, and are the differentiating features of, an HDN. The world that humans perceive is full of interactions in all directions, yet full of uncertainty, so we can say not only that
“HDN = Semantic Knowledge + General Graph + Probability = Best Decision Making”
but also that any alternative method runs the risk of being seriously wrong or severely approximate if it ignores any of knowledge, general graph, or probability. For example, the popular Bayes Net as discussed below is probabilistic, but it uses only conditional and prior probabilities as knowledge, and is a very restricted form of graph. Conversely, an approach like that of IBM’s well-known Watson is clearly limited, and leaves a great deal to be sifted, corrected, and reasoned by the user, if it is primarily a matter of “a super search engine” rather than inferring from an intricate lacework of probabilistic interactions. Importantly, even if it might be argued that some areas of science and industry can for the most part avoid such subtleties relating to probability, it is certainly not true in medicine, as the above diabetes example illustrates. From the earliest days of clinical decision support it clearly made no sense to pick, for example, “a most true diagnosis” from a set of possible diagnoses each registered only, on the evidence available so far, as true or false. What is vitally important to medicine is a semantic system that the real world merits, one capable of handling degree of truth and uncertainty in a quantitative way. Our larger approach, additionally building on semantic and linguistic theory, can reasonably be called probabilistic semantics. By knowledge in an HDN we also mean semantic knowledge in general, including that expressed by statements with relationships that are verbs of actions. In order to be able also to draw upon the preexisting Semantic Web and other efforts that contain such statements, however, the HDN approach is capable of making use of knowledge represented as certain.
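The two-way diabetes example above can be made concrete with Bayes' rule, which ties the two directions of a statement together through P(A|B) P(B) = P(B|A) P(A). The following is a minimal sketch in Python, not part of the HDN or Q-UEL software; the function name is hypothetical, and the numbers are the round figures quoted in the text (about 90% one way, and 25% taken as a midpoint of 20-30% the other way).

```python
def implied_prior_ratio(p_a_given_b: float, p_b_given_a: float) -> float:
    """Return P(A)/P(B) implied by the two conditional probabilities.

    Bayes' rule gives P(A|B) * P(B) = P(B|A) * P(A),
    hence P(A) / P(B) = P(A|B) / P(B|A).
    """
    if not (0.0 < p_a_given_b <= 1.0 and 0.0 < p_b_given_a <= 1.0):
        raise ValueError("conditional probabilities must lie in (0, 1]")
    return p_a_given_b / p_b_given_a


# Round figures from the text, used purely for illustration (assumptions).
P_OBESE_GIVEN_T2D = 0.90   # "type 2 diabetics are obese"
P_T2D_GIVEN_OBESE = 0.25   # "obese patients are type 2 diabetics" (20-30%)

ratio = implied_prior_ratio(P_OBESE_GIVEN_T2D, P_T2D_GIVEN_OBESE)
print(f"P(obese) / P(type 2 diabetic) = {ratio:.1f}")
```

Under these assumed figures the two directions differ simply because obesity is several times more prevalent than type 2 diabetes in the population considered, which is why a knowledge element needs both directions to be stated with their own probabilities.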
Knowledge and reasoning from it do not stand alone from the rest of information management in the domain that generates and uses it, and it is a matter to be seriously attended to when, in comparison to many other industries such as finance, interoperability and universally accepted standards are lacking. Importantly, the application of our approach, and our strategy for healthcare and biomedicine, covers a variety of areas in healthcare information technology that we have addressed as proofs-of-concept in software development, welded into a single focus by a unification made possible through the above theoretical and methodological principles. These areas include digital patient records, privacy and consent mechanisms, clinical decision support, and translational research (i.e. getting the results of relevant biomedical research such as new genomics findings to physicians faster). All of these are obviously required to provide information for actions taken by physicians and other medical workers, but the broad sweep is also essential because no aspect stands alone: there has been a need for new semantic principles, based on the core features of the AI approach, to achieve interoperability and universal exchange.
There are various terms for such a knowledge store. “Knowledge Representation Store” is actually our term emphasizing that it is (in our view) analogous to human memory as enabled and utilized by human thought and language, but now in a representation that computers can readily read directly and use efficiently (while in our case also remaining readable directly by humans in a natural way).
In such cases, probability one (P=1) is the obvious assignment, but strictly speaking in our approach this technically means that it is an assertion that awaits refutation, in the manner of the philosophy of Karl Popper, and consistent with information theory in which the information content I of any statement of probability P is I = -ln(P), i.e. we find information I=0 when probability P=1. A definition such as “cats are mammals” seems an exception, but then, as long as it stands as a definition, it will not be refuted.
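The relation I = -ln(P) quoted above can be checked numerically. This is a minimal sketch in Python; the function name is hypothetical, and the information is measured in nats because the natural logarithm is used.

```python
import math


def information_content(p: float) -> float:
    """Information I = -ln(P), in nats, of a statement held with probability P."""
    if not 0.0 < p <= 1.0:
        raise ValueError("P must lie in (0, 1]")
    # Return a clean 0.0 at P = 1 rather than the -0.0 that -log(1.0) gives.
    return -math.log(p) if p < 1.0 else 0.0


for p in (1.0, 0.9, 0.25):
    print(f"P = {p:.2f}  ->  I = {information_content(p):.4f} nats")
```

A statement asserted with P = 1 therefore carries zero information in this sense, while less probable statements carry progressively more.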
These are the rise of medical IT (and AI in general) as the next “Toffler wave of industry”, the urgent need to greatly reduce inefficiency and the high rate of medical error, especially considering the strain placed on healthcare systems by the booming elderly population, the rise of genomics and personalized medicine, their impact on the pharmaceutical industry, belief systems and ethics, and their impact on the increased need for management of privacy and consent.
The Data Deluge Makes the Scientific Method Obsolete
2008 – MapReduce
Large Data Processing – classification
Google created the framework for MapReduce – MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. Many real world tasks are expressible in this model, as shown in the paper.
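The programming model just described can be illustrated with the classic word-count task. The following is a minimal single-process sketch in Python, not Google's or Hadoop's implementation; the helper names (map_words, reduce_counts, run_mapreduce) and the sample records are hypothetical, and a real framework would distribute the map, grouping, and reduce phases across a cluster.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_words(key: str, value: str) -> Iterator[Tuple[str, int]]:
    """Map: emit an intermediate (word, 1) pair for every word in a document."""
    for word in value.lower().split():
        yield word, 1


def reduce_counts(key: str, values: List[int]) -> int:
    """Reduce: merge all intermediate values that share the same key."""
    return sum(values)


def run_mapreduce(
    records: Iterable[Tuple[str, str]],
    mapper: Callable[[str, str], Iterable[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], int],
) -> Dict[str, int]:
    """Single-process stand-in for the framework: map, group by key, reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            grouped[out_key].append(out_value)
    return {k: reducer(k, v) for k, v in grouped.items()}


# Hypothetical input records: (document id, document text).
documents = [
    ("note1", "the patient is obese"),
    ("note2", "the patient is diabetic"),
]
word_counts = run_mapreduce(documents, map_words, reduce_counts)
print(word_counts)
```

The user supplies only the two small functions; the grouping of intermediate values by key is the part the framework takes care of.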
Mahout ML is mostly restricted to set theory. Apache Mahout is a project of the Apache Software Foundation to produce free implementations of distributed or otherwise scalable machine learning algorithms focused primarily in the areas of collaborative filtering, clustering and classification.
2012 – Apache SPARK
Apache Spark was introduced to deal with very large data and in-memory processing. It is an architecture for cluster computing that increases computing speed by up to 100 times compared with the slower MapReduce and also better solves parallelization of the algorithm. Apache Spark is an open-source cluster-computing framework, originally developed at the University of California, Berkeley’s AMPLab.
GraphX is a distributed graph processing framework on top of Apache Spark. Because it is based on RDDs, which are immutable, graphs are immutable, and thus GraphX is unsuitable for graphs that need to be updated, let alone in a transactional manner like a graph database. GraphX can be viewed as being the Spark in-memory version of Apache Giraph, which utilized Hadoop disk-based MapReduce.
Machine learning – the ability of computers to understand data, manage results and infer insights from uncertain information – is the force behind many recent revolutions in computing. Email spam filters, smartphone personal assistants and self-driving vehicles are all based on research advances in machine learning. Unfortunately, even as the demand for these capabilities is accelerating, every new application requires a Herculean effort. Teams of hard-to-find experts must build expensive, custom tools that are often painfully slow and can perform unpredictably against large, complex data sets.
The Probabilistic Programming for Advancing Machine Learning (PPAML) program aims to address these challenges. Probabilistic programming is a new programming paradigm for managing uncertain information.
Ingine responded to DARPA’s RFQ with a detailed architecture based on Barry’s innovation in the algorithm, which to some extent solves the above ask. Importantly, it solves Probabilistic Ontology for Knowledge Extraction from Uncertainty and Semantic Reasoning.
In this era of big data, the rates at which these data sets grow continue to accelerate. The ability to manage and analyze the largest data sets is always severely taxed. The most challenging of these data sets are those containing relational or network data. The HIVE challenge is envisioned to be an annual challenge that will advance the state of the art in graph analytics on extremely large data sets. The primary focus of the challenges will be on the expansion and acceleration of graph analytic algorithms through improvements to algorithms and their implementations, and especially importantly, through special purpose hardware such as distributed and grid computers, and GPUs. Potential approaches to accelerate graph analytic algorithms include such methods as massively parallel computation, improvements to memory utilization, more efficient communications, and optimized data processing units.
2017 Data Science Dealing with Large Data Still Sucks
Despite the emergence of Big Data, Machine Learning, Graphing Techniques and the Semantic Web, their convergence is still far off. In particular, Semantic / Cognitive / Knowledge Extraction techniques are very poorly defined, and there does not exist a framework approach to knowledge engineering leading into Machine Learning and automation in Knowledge Extraction, Representation, Learning and Reasoning. This is what Q-UEL and HDN solve at the algorithmic level.
Is an Algorithmic Language for constructing Complex Systems.
Results in an Inferential Statistical mechanism suitable for a highly complex system – “Hyperbolic Dirac Net”.
Involves an approach based on the premise that a Highly Complex System driven by human social structures continuously strives to achieve a higher order in its entropic journey by continuously discerning the knowledge hidden in the system, which is in continuum.
A System in Continuum seeking Higher and Higher Order is a Generative System.
A Generative System brings the System itself as a Method to achieve Transformation. Similar is the case for the National Learning Health System.
A Generative System, as such, is based on Distributed Autonomous Agents / Organization, achieving Syndication driven by Self Regulation or Swarming behavior.
It has capabilities to facilitate medical workflow, continuity of care, medical knowledge extraction and representation from vast large sets of structured and unstructured data, automating bio-statistical reasoning leading into large data driven evidence based medicine, that further leads into clinical decision support system including knowledge management and Artificial Intelligence; and public health and epidemiological analysis.
A Large Chaotic System driven by Human Social Structures has two contending ways.
a. Natural Selection – Adaptive – Darwinian – Natural Selection – Survival Of Fittest – Dominance
b. Self Regulation – Generative – Innovation – Diversity – Cambrian Explosion – Unique Peculiarities – Co Existence – Emergent
The Accountable Care Organization (ACO), driven by the Affordable Care Act, transforms the present Healthcare System from one that is adaptive (competitive) into one that is generative (collaborative / co-ordinated), to achieve inclusive success and partake in the savings achieved. This is a generative systemic response, contrasting with the functional and competitive response of an adaptive system.
Natural selection seems to have resulted in functional transformation, where adaptive is the mode; it does not account for diversity.
Self Regulation – seems to be a systemic outcome due to integrative influence (ecosystem), responding to the system constraints. It accounts for rich diversity.
From the above observation, if the theory of self regulation seems more correct and adheres to the laws of nature, in which generative learning occurs, then the assertion is that the “method” is offered by the system itself. The System’s ontology has an implicate knowledge of the processes required for transformation (David Bohm – Implicate Order).
For very large complex system,
System itself is the method – impetus is the “constraint”.
In the video below, the ability for the cells to creatively create the script is discussed which makes the case for self regulated and generative complex system in addition to complex adaptive system.
Further Notes on Q-UEL / HDN :-
That brings Quantum Mechanics (QM) machinery to Medical Science.
Is derived from Dirac Notation that helped in defining the framework for describing the QM. The resulting framework or language is Q-UEL and it delivers a mechanism for inferential statistics – “Hyperbolic Dirac Net”
Created from System Dynamics and Systems Thinking Perspective.
It is Systemic in approach; where System is itself the Method.
Engages probabilistic ontology and semantics.
Creates a mathematical framework to advance Inferential Statistics to study highly chaotic complex system.
Is an algorithmic approach that creates Semantic Architecture of the problem or phenomena under study.
The algorithmic approach is a blend of linguistic semantics, artificial intelligence and systems theory.
The algorithm creates the Semantic Architecture defined by Probabilistic Ontology :- representing the Ecosystem Knowledge distribution based on Graph Theory
To make a decision in any domain, first of all the knowledge compendium of the domain or the system knowledge is imperative.
System Riddled with Complexity is generally a Multivariate System, as such creating much uncertainty
A highly complex system being non-deterministic, requires probabilistic approaches to discern, study and model the system.
General Characteristics of Complex System Methods
Descriptive statistics are employed to study “WHAT” aspects of the System
Inferential Statistics are applied to study “HOW”, “WHEN”, “WHY” and “WHERE” probing both spatial and temporal aspects.
In a highly complex system, causality becomes indeterminable, meaning that the correlations or relationships between the independent and dependent variables are not obviously established. Also, they seem to interchange positions. This creates a dilemma between subject vs object, and causes vs outcomes.
In approaching a highly complex system, since the prior and posterior are not definable, inferential techniques in which hypotheses are fixed before beginning the study of the system become unviable.
Review of Inferential Techniques as the Complexity is Scaled
Step 1:- Simple System (turbulence level:-1)
Frequentist :- the simplest classical or traditional statistics, employed treating the data as random with a steady state hypothesis – the system is considered not uncertain (simple system). In Frequentist notions of statistics, probability is dealt with as a classical measure based only on the idea of counting and proportion. The technique assigns probability to the data, and is applied where the data sets are rather small.
Increase complexity: Larger data sets, multivariate, hypothesis model is not established, large variety of variables; each can combine (conditional and joint) in many different ways to produce the effect.
Step 2:- Complex System (turbulence level:-2)
Bayesian :- the hypothesis is considered probabilistic, while the data is held at steady state. In Bayesian notions of statistics, probability is of the hypothesis for a given set of data that is fixed. That is, the hypothesis is random and the data is fixed. The knowledge extracted contains the more subjectivist notions of uncertainty, belief, reliability, or confidence often used in automated inference and decision support systems.
Additionally the hypothesis can be explored only in an acyclic fashion creating Directed Acyclic Graphs (DAG)
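The contrast between Steps 1 and 2 above can be sketched for the simplest possible case, a single proportion. This is an illustrative Python sketch only: the counts are hypothetical, and the Beta prior with its conjugate update is a standard textbook choice assumed here, not something specified in the text.

```python
def frequentist_estimate(successes: int, trials: int) -> float:
    """Classical estimate by counting and proportion: successes / trials."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    return successes / trials


def bayesian_posterior_mean(
    successes: int, trials: int, prior_a: float = 1.0, prior_b: float = 1.0
) -> float:
    """Posterior mean of a proportion under an assumed Beta(prior_a, prior_b) prior.

    With binomial data the posterior is Beta(prior_a + s, prior_b + n - s),
    whose mean is (prior_a + s) / (prior_a + prior_b + n).
    """
    if trials < 0 or not 0 <= successes <= trials:
        raise ValueError("need trials >= 0 and 0 <= successes <= trials")
    post_a = prior_a + successes
    post_b = prior_b + (trials - successes)
    return post_a / (post_a + post_b)


# Hypothetical small data set: 3 patients out of 10 show the finding.
print(frequentist_estimate(3, 10))      # proportion from counting alone
print(bayesian_posterior_mean(3, 10))   # pulled toward the uniform prior
```

With no data at all the frequentist proportion is undefined, whereas the Bayesian estimate falls back on the prior, which is one way of seeing why the hypothesis, rather than the data, is the uncertain quantity in Step 2.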
Increase the throttle on the complexity: Very large data sets, both structured and unstructured, Hypothesis random, multiple Hypothesis possible, Anomalies can exist, There are hidden conditions, need arises to discover the “probabilistic ontology” as they represent the system and the behavior within.
Step 3: Highly Chaotic Complex System (turbulence level:-3)
Certainly the DAG is now inadequate, since we need to check probabilities as correlations and also causations of the variables, and whether they conform to a hypothesis-producing pattern, meaning that some ontology is discovered which describes the peculiar intrinsic behavior among a specific combination of the variables to represent a hypothesis condition. There are many such possibilities within the system, hence a very chaotic and complex system.
Now the System itself seems probabilistic, regardless of the hypothesis and the data. This demands a Multi-Lateral Cognitive approach.
Telandic…. “Point – equilibrium – steady state – periodic (oscillatory) – quasiperiodic – Chaotic – and telandic (goal seeking behavior) are examples of behavior here placed in order of increasing complexity”
A Highly Complex System demands a Dragon Slayer – Hyperbolic Dirac Net (HDN) driven Statistics (Bi-directional Bayesian) for extracting the Knowledge from a Chaotic Uncertain System.
When applied, the probabilities involved in Bayes’ theorem may have different probability interpretations. In one of these interpretations, the theorem is used directly as part of a particular approach to statistical inference. With the Bayesian probability interpretation the theorem expresses how a subjective degree of belief should rationally change to account for evidence: this is Bayesian inference, which is fundamental to Bayesian statistics. However, Bayes’ theorem has applications in a wide range of calculations involving probabilities, not just in Bayesian inference.
The theory should perhaps be distinguished from an existing purely number-theoretic area sometimes also known as Zeta Theory, which focuses on the Riemann Zeta Function and the ways in which it governs the distribution of prime numbers.
The Riemann zeta function is an extremely important special function of mathematics and physics that arises in definite integration and is intimately related with very deep results surrounding the prime number theorem. While many of the properties of this function have been investigated, there remain important fundamental conjectures (most notably the Riemann hypothesis) that remain unproved to this day. The Riemann zeta function is defined over the complex plane for one complex variable, which is conventionally denoted s (instead of the usual z) in deference to the notation used by Riemann in his 1859 paper that founded the study of this function (Riemann 1859). It is implemented in the Wolfram Language as Zeta[s].
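For real s > 1 the zeta function is the sum of the series 1/1^s + 1/2^s + 1/3^s + ..., so truncating the series gives a partial sum. The following is a minimal Python sketch of that truncation, restricted to real s for simplicity; the function name is hypothetical, and the sketch illustrates only the series itself, not any particular data mining method built on it.

```python
import math


def zeta_partial_sum(s: float, n_terms: int) -> float:
    """Partial sum of the zeta series: sum of k**(-s) for k = 1..n_terms."""
    if n_terms < 0:
        raise ValueError("n_terms must be non-negative")
    # math.fsum limits the rounding error accumulated over many small terms.
    return math.fsum(k ** -s for k in range(1, n_terms + 1))


# For s = 2 the full series converges to pi**2 / 6.
for n in (10, 1_000, 100_000):
    print(n, zeta_partial_sum(2.0, n))
print("limit:", math.pi ** 2 / 6)
```

For s = 1 the same partial sums are the harmonic numbers, which grow without bound; the series converges only for s > 1.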
3. 1900. Ramanujan’s mathematical work was primarily in the areas of number theory and classical analysis. In particular, he worked extensively with infinite series, integrals, continued fractions, modular forms, q-series, theta functions, elliptic functions, the Riemann Zeta-Function, and other special functions.
Hardy wrote in Ramanujan’s obituary :
There is always more in one of Ramanujan’s formulae than meets the eye, as anyone who sets to work to verify those which look the easiest will soon discover. In some the interest lies very deep, in others comparatively near the surface; but there is not one which is not curious and entertaining.
In geometry and physics, spinors are elements of a (complex) vector space that can be associated with Euclidean space. Like geometric vectors and more general tensors, spinors transform linearly when the Euclidean space is subjected to a slight (infinitesimal) rotation. When a sequence of such small rotations is composed (integrated) to form an overall final rotation, however, the resulting spinor transformation depends on which sequence of small rotations was used, unlike for vectors and tensors. A spinor transforms to its negative when the space is rotated through a complete turn from 0° to 360°, and it is this property that characterizes spinors. It is also possible to associate a substantially similar notion of spinor to Minkowski space, in which case the Lorentz transformations of special relativity play the role of rotations. Spinors were introduced in geometry by Élie Cartan in 1913. In the 1920s physicists discovered that spinors are essential to describe the intrinsic angular momentum, or “spin”, of the electron and other subatomic particles.
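The Dirac equation is the quantum electrodynamical law which applies to spin-1/2 particles and is the relativistic generalization of the Schrödinger equation. In 3+1 dimensions (three space dimensions and one time dimension), it is given in its standard covariant form by
$$\left(i\hbar\,\gamma^{\mu}\partial_{\mu} - mc\right)\psi = 0,$$
where $\psi$ is the four-component spinor wave function, $\gamma^{\mu}$ are the Dirac gamma matrices, $m$ is the particle mass, $c$ is the speed of light, and $\hbar$ is the reduced Planck constant.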
6. 1930. Dirac publishes his book on his pivotal view of quantum mechanics, including his earliest mentions of an operator with the properties of the hyperbolic h number such that hh = +1. It extends the theory of wave mechanics to particle mechanics. P. A. M. Dirac, The Principles of Quantum Mechanics, First Edition, Oxford University Press, Oxford (1930).
8. 1939. Dirac publishes his braket notation. It is incorporated into the third edition of his book.
P.A.M. Dirac (1939). A new notation for quantum mechanics, Mathematical Proceedings of the Cambridge Philosophical Society 35 (3): 416–418
9. 1974. Robson develops his Expected Information approach that preempts the Bayes Net method.
B. Robson, Analysis of the Code Relating Sequence to Conformation in Globular Proteins: Theory and Application of Expected Information, Biochem. J. 141, 853-867 (1974).
10. 1978. The Expected Information approach crystallizes as the GOR method widely used in bioinformatics.
Garnier, D. J. Osguthorpe, and B. Robson, Analysis of the Accuracy and Implications of Simple Methods for Predicting the Secondary Structure of Globular Proteins, J. Mol. Biol. 120, 97-120 (1978).
11. 1982. Buchanan and Shortliffe describe the first medical Expert System. It is based on probabilistic statements, but sets a tradition of innovation and diverse controversial methods in automated medical inference.
Buchanan, E.H. Shortliffe, (1982) Rule Based Expert Systems. The Mycin Experiments of the Stanford Heuristic Programming Project, Addison-Wesley: Reading, Massachusetts.
12. 1985. Pearl gives a full account of the Bayes Net method.
Pearl, Probabilistic Reasoning in Intelligent Systems. San Francisco CA: Morgan Kaufmann (1985).
13. March 1989. Sir Tim Berners-Lee invented the WWW, introducing non-linear linking of information across systems.
Tim laid out his vision for what would become the Web in a document called “Information Management: A Proposal”. Believe it or not, Tim’s initial proposal was not immediately accepted. In fact, his boss at the time, Mike Sendall, noted the words “Vague but exciting” on the cover. The Web was never an official CERN project, but Mike managed to give Tim time to work on it in September 1990. He began work using a NeXT computer, one of Steve Jobs’ early products.
14. 1997. Clifford Algebra using h becomes more widely recognized as a tool for engineers as well as scientists and physicists.
Gürlebeck, W. Sprössig, Quaternionic and Clifford Calculus for Physicists and Engineers, Wiley, Chichester (1997)
15. 1999. Tim Berners-Lee described the Semantic Web vision in the following terms:
I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web, the content, links, and transactions between people and computers. A Semantic Web, which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The intelligent agents people have touted for ages will finally materialize. (1999)
16. 2000. Khrennikov gives description of a primarily h-complex quantum mechanics.
Khrennikov, Hyperbolic quantum mechanics, Cornell University Library, arXiv:quant-ph/0101002v1 (2000).
17. 2000. Buchholz and Sommer refine work showing that neural networks as inference systems modeled on the brain can usefully use the hypercomplex imaginary number h.
S. Buchholz and G. Sommer, A hyperbolic multilayer perceptron, International Joint Conference on Neural Networks, IJCNN 2000, Como, Italy, Vol. 2, pp. 129-133. S.-I. Amari, C.L. Giles, M. Gori, and V. Piuri, Eds. IEEE Computer Society Press (2000).
18. 2003. Robson points out that the Expected Information method in bioinformatics is really the use of the partially summated Riemann Zeta function, and a best choice for treatment of sparse data in data mining in general.
B. Robson (2003) “Clinical and Pharmacogenomic Data Mining. 1. The generalized theory of expected information and application to the development of tools” J. Proteome Res. (Am. Chem. Soc.) 2, 283-301.
19. 2003. Nitta shows that the power of the h-complex approach in neural nets is primarily due to its ability to solve the notorious exclusive-or logical problem in a single neuron.
T. Nitta, Solving the XOR problem and the detection of symmetry using a single complex-valued neuron, Neural Networks 16:8, 1101-1105 (2003).
20. 2003. Khrennikov consolidates the notion of an extensively h-complex quantum mechanics, but feels that i-complex, h-complex, and real world mechanics are three separate systems.
A. Khrennikov, Hyperbolic quantum mechanics, Adv. in Applied Clifford Algebras, Vol. 13, 1 (2003).
21. 2004. Khrennikov notes possible relation between h-complex quantum mechanics and mental function.
Khrennikov, On Quantum-Like Probabilistic Structure of Mental Information, Open Systems Information Dynamics, Vol. 11, 3, 267-275 (2004).
22. 2004. Rochon shows that the full Riemann Zeta function is both i-complex and h-complex.
Rochon, A Bicomplex Riemann Zeta Function, Tokyo J. of Math.
23. 2004. Robson argues that zeta theory is a solution to high dimensionality problems in data mining.
Robson, The Dragon on the Gold: Myths and Realities for Data Mining in Biotechnology using Digital and Molecular Libraries, J. Proteome Res. (Am. Chem. Soc.) 3 (6), 1113 – 9 (2004).
24. 2005. Robson argues that all statements in zeta theory and in prime number theory are really statements relevant to data and data mining, and describes first link to Dirac’s quantum mechanics and Dirac’s braket notation.
Robson, Clinical and Pharmacogenomic Data Mining: 3. Zeta Theory As a General Tactic for Clinical Bioinformatics, J. Proteome Res. (Am. Chem. Soc.) 4(2); 445-455 (2005)
25. 2005. Code CliniMiner/Fano based on Zeta Theory and prime number theory is used in first pioneering effort in data mining large number of patient records.
Mullins, I. M., M.S. Siadaty, J. Lyman, K. Scully, G.T. Garrett, G. Miller, R. Muller, B. Robson, C. Apte, C., S. Weiss, I. Rigoutsos, D. Platt, and S. Cohen, Data mining and clinical data repositories: Insights from a 667,000 patient data set, Computers in Biology and Medicine, 36(12) 1351 (2006).
26. 2007. Robson recognizes that the imaginary number required to reconcile zeta theory with quantum mechanics and to allow Dirac notation to be used in inference is the hyperbolic imaginary number h, not the imaginary number i. Unaware of the work of Khrennikov, he makes no Khrennikov-like distinction between h-complex quantum mechanics and the everyday world.
Robson, The New Physician as Unwitting Quantum Mechanic: Is Adapting Dirac’s Inference System Best Practice for Personalized Medicine, Genomics and Proteomics, J. Proteome Res. (Am. Chem. Soc.), Vol. 6, No. 8: 3114 – 3126, (2007).
Robson, B. (2007) “Data Mining and Inference Systems for Physician Decision Support in Personalized Medicine” Lecture and Circulated Report at the 1st Annual Total Cancer Care Summit, Bahamas 2007.
28. 2008. Data Mining techniques using the full i-complex and h-complex zeta function are developed.
Robson, Clinical and Pharmacogenomic Data Mining: 4. The FANO Program and Command Set as an Example of Tools for Biomedical Discovery and Evidence Based Medicine, J. Proteome Res., 7 (9), pp 3922–3947 (2008).
29. 2008. Nitta and Bucholtz explore decision process boundaries of h-complex neural nets.
Nitta, and S. Bucholtz, On the Decision Boundaries of Hyperbolic Neurons. In 2008 International Joint Conference on Neural Networks (IJCNN).
30. 2009. Semantic Web starts to emerge but runs into bottleneck regarding the best approach for probabilistic treatment.
31. 2009. Baek and Robson propose that, for reasons of bandwidth limitations and security, the Internet should consist of data-centric computing by smart software robots. Robson indicates that they could be based on h-complex inference systems and link to semantic theory.
Robson, B. and Baek, O. K., The Engines of Hippocrates: From the Dawn of Medicine to Medical and Pharmaceutical Informatics, Wiley, 2009.
Robson B. (2009) “Towards Intelligent Internet-Roaming Agents for Mining and Inference from Medical Data”, Future of Health Technology Congress, Technology and Informatics, Vol. 149, 157-177 IOS Press
Robson, B. (2009) “Links Between Quantum Physics and Thought” (A. I. Applications in Medicine) , Future of Health Technology Congress, Technology and Informatics, Vol. 149, 157-177 IOS Press.
32. 2009. Savitha et al. develop new learning algorithms for complex-valued networks.
S. Savitha, S. Suresh, S. Sundararajan, and P. Saratchandran, A new learning algorithm with logarithmic performance index for complex-valued neural networks, Neurocomputing 72 (16-18), 3771-3781 (2009).
33. 2009. Khrennikov argues for the h-complex Hilbert space as providing the “contextual” (underlying rationale, hidden variables etc.) for all quantum mechanics.
Khrennikov, Contextual Approach to Quantum Formalism, Springer (2009)
34. 2010. Robson and Vaithiligam describe how zeta theory and h-complex probabilistic algebra can resolve challenges in data mining by the pharmaceutical industry.
Robson and A. Vaithiligam, Drug Gold and Data Dragons: Myths and Realities of Data Mining in the Pharmaceutical Industry, pp. 25-85 in Pharmaceutical Data Mining, Ed. Balakin, K. V., John Wiley & Sons (2010).
35. 2010. PCAST. December report by the US President’s Council of Advisors on Science and Technology calls for an XML-like Universal Exchange Language for medicine, including disaggregation of the patient record on the Internet for patient access, security, and privacy.
36. 2011. First description of Q-UEL in response to PCAST 2010.
Robson, B., Balis, U. G. J. and Caruso, T. P. (2011) “Considerations for a Universal Exchange Language for Healthcare.” In Proceedings of 2011 IEEE 13th International Conference on e-Health Networking, Applications and Services (Healthcom 2011), 173-176. Columbus, MO: IEEE, 2011.
37. 2011. Robson and Colleagues develop the method of match-and-edit instructions for extracting
Robson, B., Li, J., Dettinger, R., Peters, A., and Boyer, S.K. (2011), Drug discovery using very large numbers of patents. General strategy with extensive use of match and edit operations. Journal of Computer-Aided Molecular Design 25(5): 427-441
38. 2011. Kuroe et al. consolidate the theory of h-complex neural nets.
Kuroe, T. Shinpei, and H. Iima, Models of Hopfield-Type Clifford Neural Networks and Their Energy Functions – Hyperbolic and Dual Valued Networks, Lecture Notes in Computer Science, 7062, 560 (2011).
39. 2012. Robson argues that h-complex algebra is an appropriate basis for Artificial Intelligence in the Pharmaceutical Industry.
Robson, B. (2012) “Towards Automated Reasoning for Drug Discovery and Pharmaceutical Business Intelligence”, Pharmaceutical Technology and Drug Research, 2012 1: 3 ( 27 March 2012 )
40. 2013. Goodman and Lassiter attempt to reconcile and restore interest in probabilistic semantics after a long period of domination by classical logic.
N. D. Goodman and D. Lassiter, Probabilistic Semantics and Pragmatics: Uncertainty in Language and Thought,
42. 2013. Robson promotes Q-UEL from a public health perspective.
B. Robson, Rethinking Global Interoperability in Healthcare. Reflections and Experiments of an e-Epidemiologist from Clinical Record to Smart Medical Semantic Web, Johns Hopkins Grand Rounds Lectures (last accessed 3/14/2013).
44. 2014. Robson et al. release formal description of consolidated second version of Q-UEL.
Robson, T. P. Caruso and U. G. J. Balis, Suggestions for a Web Based Universal Exchange and Inference Language for Medicine, Computers in Biology and Medicine, 43(12) 2297 (2013).
45. 2013. Moldoveanu expresses the view that hyperbolic quantum mechanics cannot also include wave mechanics. This is a possible attack on Khrennikov’s idea that hyperbolic quantum mechanics can show interference as for waves. There are signs of a growing sense that hyperbolic quantum mechanics is simply the everyday world described in terms of the machinery of traditional quantum mechanics.
Moldoveanu, Non viability of hyperbolic quantum mechanics as a theory of Nature, Cornell University Library, arXiv:1311.6461v2 [quant-ph] (2013).
46. 2013. First full description of the Hyperbolic Dirac Net and its relation to Q-UEL and to Bayes Nets.
Robson, Hyperbolic Dirac Nets for Medical Decision Support. Theory, Methods, and Comparison with Bayes Nets, Computers in Biology and Medicine, 51, 183 (2013).
47. 2014. Kunegis et al. develop h-complex algorithms for dating recommender systems.
Kunegis, G. Gröner, and T. Gottron, On-Line Dating Recommender Systems, the Split Complex Number Approach, (Like/Dislike, Similar/Dissimilar) http://userpages.uni-koblenz.de/~kunegis/paper/kunegis-online-dating-recommender-systems-the-split-complex-number-approach.pdf (last accessed 6/1/2014).
48. 2015. Robson describes extension of the Hyperbolic Dirac Net to semantic reasoning and probabilistic linguistics.
Robson, B. “POPPER, a Simple Programming Language for Probabilistic Semantic Inference in Medicine”, Computers in Biology and Medicine (in press), DOI: 10.1016/j.compbiomed.2014.10.011 (2015).
50. 2015. Robson et al. describe medical records in Q-UEL format and PCAST disaggregation for patient security and privacy.
Robson, B., Caruso, T, and Balis, U. G. J. (2015) “Suggestions for a Web Based Universal Exchange and Inference Language for Medicine. Continuity of Patient Care with PCAST Disaggregation.” Computers in Biology and Medicine (in press) 01/2015; 56:51. DOI: 10.1016/j.compbiomed.2014.10.022
51. 2015. Mathematician Steve Deckelman of U. Wisconsin-Stout and Berkeley validates the theoretical principles of the Hyperbolic Dirac Net.
Deckelman and Robson, B. (2015) “Split-Complex Numbers and Dirac Bra-Kets”, Communications in Information and Systems (CIS), in press.
The inference net on which this dualization is performed is defined as an estimate of a probability, expressed in terms of simpler probabilities and/or association measures, each with fewer attributes (i.e. arguments, events, states, observations or measurements) than the joint probability being estimated. Each attribute corresponds to a node of a general graph, and the probabilities or association measures represent the interdependencies between nodes as edges. It is not required that the inference net be a directed acyclic graph, but the widely used BN, which satisfies that description by definition, is a useful starting point for making use of the given probabilities to address the same or similar problems. Specifically for the estimation of a joint probability, an HDN properly constructed with prior probabilities, whether or not it contains cyclic paths, is purely real valued, and its construction principles represent a generalization of Bayes Theorem. Any imaginary part indicates the degree of departure from Bayes Theorem over the net as a whole, and the direction of conditionality in which the departure occurs; the HDN thus provides an excellent book-keeping tool for checking that Bayes Theorem is satisfied overall. Specifically for the estimation of a conditional probability, it follows conversely that any expression for a joint probability validated by the above means can serve as the generator of an HDN for the estimation of a conditional probability, simply by dividing it by the HDN counterparts of prior probabilities, whence the resulting net is not purely real save by coincidence of probability values.
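The book-keeping role of the imaginary part can be illustrated with a minimal sketch. It assumes a dual form in which the forward-direction product of probabilities is weighted by (1 + h)/2 and the backward-direction product by (1 − h)/2, with hh = +1; that weighting, the function name `hdn_joint`, and the toy numbers are assumptions made for illustration and are not taken from the text. Under this assumption the real part is the mean of the two directions and the h part is half their difference, so the h part vanishes exactly when Bayes Theorem holds over the net.

```python
from math import prod

def hdn_joint(forward, backward):
    """Return (real_part, h_part) of iota*prod(forward) + iota_star*prod(backward).

    Assumes iota = (1 + h)/2 and iota_star = (1 - h)/2 with h*h = +1.
    """
    f = prod(forward)    # e.g. P(A|B) * P(B)
    b = prod(backward)   # e.g. P(B|A) * P(A)
    return (f + b) / 2.0, (f - b) / 2.0

# Hypothetical two-node net: P(A) = 0.3, P(B) = 0.4, P(A|B) = 0.6
p_a, p_b, p_a_given_b = 0.3, 0.4, 0.6
p_b_given_a = p_a_given_b * p_b / p_a   # Bayes Theorem gives 0.8

# Both directions agree, so the h part is zero (up to rounding)
consistent = hdn_joint([p_a_given_b, p_b], [p_b_given_a, p_a])

# A misestimated P(B|A) = 0.7 leaves a nonzero h part; its sign shows
# which direction of conditionality carries the larger product
inconsistent = hdn_joint([p_a_given_b, p_b], [0.7, p_a])
```

In this toy case the consistent net has a real part of about 0.24 and no h part, while the inconsistent one carries a small positive h part, flagging the departure from Bayes Theorem.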
52. 2015. Implementation of a web based universal exchange and inference language for medicine: Sparse data, probabilities and inference in data mining of clinical data repositories
52. 2015. Robson, B., and S. Boray, The Structure of Reasoning in Answering Multiple Choice Medical Licensing Examination Questions. Computer Studies towards Formal Theories of Clinical Decision Support and Setting and Answering Medical Licensing Examinations, Workshop Lecture presentation, Proceedings of the IEEE International conference of Bioinformatics and Biomedicine, 9th-11th November, Washington DC (2015)
Bioingine.com employs an algorithmic approach based on the Hyperbolic Dirac Net that allows inference nets to be general graphs (GG), including cyclic paths, thus surpassing the limitation of the Bayes Net, which is traditionally a Directed Acyclic Graph (DAG) by definition.
The Bioingine.com approach thus more fundamentally reflects the nature of probabilistic knowledge in the real world, which has the potential for taking account of the interaction between all things without limitation, and ironically this more explicitly makes use of Bayes rule far more than does a Bayes Net.
It also allows more elaborate relationships than mere conditional dependencies, in the form of a probabilistic semantics analogous to natural human language but with a more detailed sense of probability. To identify the things and relationships that are important, and to provide the required probabilities, the Bioingine.com scouts large, complex data, both structured data and information of unstructured textual character.
It treats initial raw extracted knowledge rather in the manner of potentially erroneous or ambiguous prior knowledge, and validated and curated knowledge as posterior knowledge, and enables the refinement of knowledge extracted from authoritative scientific texts into an intuitive canonical “deep structure” mental-algebraic form that the Bioingine.com can more readily manipulate.
Discussion on employing HDN to map Clinical Pathways (The Tacit Knowledge)
In the articles referenced below, which employ a Bayesian Net to model Clinical Pathways as a probabilistic inference net, replace the Bayesian Net with the stress-tested Hyperbolic Dirac Net (HDN), a non-acyclic Bayesian net that resolves both correlation and causation in both directions: etiology –> outcomes and outcomes –> etiology.
1. Elements of Q-UEL
Q-UEL is based on the Dirac notation and its associated algebra. The notation was introduced into later editions of Dirac’s book to facilitate understanding and use of quantum mechanics (QM), and it has been a standard notation in physics and theoretical chemistry since the 1940s.
a) In the Hyperbolic Dirac Net, as in standard Dirac notation, the bra <| is a row vector and the ket |> is a column vector.
b) The hyperbolic imaginary number h satisfies hh = +1.
QM is a system for representing observations and measurements, and drawing probabilistic inference from them. The Q in Q-UEL refers to QM, but a simple mathematical transformation of QM gives classical everyday behavior. Q-UEL inherits the machinery of QM by replacing the more familiar imaginary number i (such that ii = -1), responsible for QM as wave mechanics, by the hyperbolic imaginary number h (such that hh=+1). Hence our inference net in general is called the Hyperbolic Dirac Net (HDN)
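The difference between i and h can be made concrete with a few lines of arithmetic. The following is a minimal sketch only: the class name `HComplex` is hypothetical and is not part of Q-UEL. It implements numbers of the form x + hy with hh = +1 and contrasts them with Python's built-in complex numbers, for which ii = -1.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HComplex:
    """A number x + h*y where h*h = +1 (hyperbolic, or split-complex)."""
    x: float
    y: float

    def __add__(self, other):
        return HComplex(self.x + other.x, self.y + other.y)

    def __mul__(self, other):
        # (a + hb)(c + hd) = (ac + bd) + h(ad + bc), because h*h = +1
        return HComplex(self.x * other.x + self.y * other.y,
                        self.x * other.y + self.y * other.x)

h = HComplex(0.0, 1.0)
h_squared = h * h        # the real number +1, stored as HComplex(1.0, 0.0)
i_squared = 1j * 1j      # the ordinary imaginary unit squares to -1

# A known property of split-complex numbers: (1 + h)/2 and (1 - h)/2
# each square to themselves, and their product is zero.
iota = HComplex(0.5, 0.5)
iota_star = HComplex(0.5, -0.5)
```

Because `iota * iota` equals `iota` and `iota * iota_star` is zero, products of h-complex quantities do not oscillate in sign the way products involving i do, which is one way to see why the h-complex form behaves classically.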
In probability theory A, B, C, etc. represent things, states, events, observations, measurements, qualities etc. In this paper we mean medical factors, including demographic factors such as age and clinical factors such as systolic blood pressure value or history of diabetes.
They can also stand for expressions containing many factors, so note that a statement about, e.g., P(A|B) would usually also apply to, say, P(A, B | C, D, E). In text, P(A, B, C, …) with ellipsis ‘…’ means all combinatorial possibilities: P(A), P(B), P(A, C), P(B, D, H) etc.
2) Using Q-UEL, a preliminary inference net can be created as the query.
“Will my female patient age 50-59 taking diabetes medication and having a body mass index of 30-39 have very high cholesterol if the systolic BP is 130-139 mmHg and HDL is 50-59 mg/dL and non-HDL is 120-129 mg/dL?”.
This forms a preliminary inference net as the query, which may be refined and to which probabilities must be assigned
The real answers of interest here are not qualitative statements, but the final probabilities. The protocols involved map to what data miners often seem to see as two main options in mining, although we see them as the two ends of a continuum.
Method (A) may be recognized as unsupervised (or unrestricted) data mining with post-filtering, and is the method mainly used here. In this approach we (1) mine data (“observe”), (2) compute a very large number of the more significant probabilities, render them as tags, and maintain them as a Knowledge Representation Store (KRS) or Semantic Lake (“evaluate”), (3) use a proposed inference net as a query to search amongst the probabilities represented by those tags, looking only for those needed to complete the net and assign probabilities to it, assessing what is available, and seeing what can be substituted (“interpret”), and (4) compute the overall probability of the final inference net in order to make a decision (“decide”). Unsupervised data mining is preferred because it generates many tags for an SW-like approach, and may uncover new unexpected relationships that could be included in the net.
Method (B) uses supervised (or restricted) data mining with prefiltering. Data mining considers only what appears in the net. The downstream user interested in inference always accesses the raw database, while in (A) he or she may never see it.
The advantage of (B) is that mining is far less computationally demanding in terms of both processing and memory. It is useful for computing an HDN for a specified hypothesis.
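The four steps of Method (A) can be sketched on toy data. Everything in this example is an assumption made for illustration: the records are invented, the plain dictionary `store` merely stands in for a Knowledge Representation Store, the link format is not Q-UEL tag syntax, and the probabilities are ordinary real-valued conditional probabilities rather than h-complex ones.

```python
from collections import Counter
from itertools import combinations
from math import prod

# (1) "observe": toy patient records as sets of attribute=value items
records = [
    {"sex=F", "bmi=30-39", "chol=very_high"},
    {"sex=F", "bmi=30-39", "chol=normal"},
    {"sex=F", "bmi=30-39", "chol=very_high"},
    {"sex=M", "bmi=20-29", "chol=normal"},
    {"sex=F", "bmi=20-29", "chol=normal"},
]

# (2) "evaluate": count single items and pairs, then keep every
# pairwise conditional probability in a store keyed by (target, condition)
single = Counter(item for r in records for item in r)
pair = Counter(frozenset(p) for r in records for p in combinations(sorted(r), 2))

def build_store():
    store = {}
    for key, n_ab in pair.items():
        a, b = sorted(key)
        store[(a, b)] = n_ab / single[b]   # P(a | b)
        store[(b, a)] = n_ab / single[a]   # P(b | a)
    return store

store = build_store()

# (3) "interpret": the query net lists the links it needs; look up only those
query_net = [("chol=very_high", "bmi=30-39"), ("bmi=30-39", "sex=F")]
found = [store[link] for link in query_net if link in store]
missing = [link for link in query_net if link not in store]

# (4) "decide": chain the link probabilities with the prior P(sex=F)
prior = single["sex=F"] / len(records)
estimate = prod(found) * prior
```

In Method (B), by contrast, step (2) would compute only the links named in `query_net`, trading the chance of finding unexpected relationships for lower processing and memory cost.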
The Popular Bayes Net BN Compared with our Hyperbolic Dirac Net HDN.
Probabilities of any kind can also be manipulated for inference in a variety of ways, according to philosophy (which is a matter of concern). The BN is probably the most popular method, perhaps because it does seem to be based on traditional, conservative principles of probability. However, the BN is traditionally (and, strictly speaking, by definition) confined to a probability network that is a directed acyclic graph (DAG).
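The DAG restriction is easy to check mechanically. The sketch below uses Kahn's topological-sort algorithm, a standard graph technique and not something specific to the BN or HDN literature; the function name `is_dag` and the example edges are hypothetical.

```python
from collections import deque

def is_dag(edges):
    """Return True if the directed graph given as (parent, child) pairs has no cycle."""
    nodes = {n for edge in edges for n in edge}
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
    # Repeatedly remove nodes that have no remaining incoming edges
    queue = deque(n for n in nodes if indegree[n] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    # Nodes on a cycle never reach indegree 0, so they are never removed
    return removed == len(nodes)

bn_like = [("age", "bp"), ("bp", "stroke"), ("age", "stroke")]
with_feedback = bn_like + [("stroke", "bp")]   # adds a feedback loop bp -> stroke -> bp
```

Here `is_dag(bn_like)` is true, so the net is admissible as a BN, while `is_dag(with_feedback)` is false: a net with that feedback edge falls outside the BN definition.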
In general, reversibility, cyclic paths and feedback abound in the real world, and we need probabilistic knowledge networks that are general graphs, or even more diffuse fields of influence, not DAGs. In our response as the Hyperbolic Dirac Net (HDN), “Dirac” relates to use of Paul A. M. Dirac’s view of quantum mechanics (QM).
QM is not only a standard system for representing probabilistic observation and inference from it in physics, but also it manages and even promotes concepts like reversibility and cycles. The significance of “hyperbolic” is that it relates to a particular type of imaginary number rediscovered by Dirac. Dirac notation entities, Q-UEL tags, and the analogous building blocks of an HDN all have complex probabilities better described as probability amplitudes. This means that they have the form x + jy where x and y are real numbers and j is an imaginary number, though they can also be vectors or matrices with such forms as elements.
Q-UEL is seen as a Lorentz rotation i → h of QM as wave mechanics. The imaginary number involved is now no longer the familiar i such that ii = -1, but the hyperbolic imaginary number, called h in Q-UEL, such that hh = +1.
This makes the HDN behave classically. A basic HDN is an h-complex BN.
Both BN and basic HDN may use Predictive Odds in which conditional probabilities (or the HDN’s comparable h-complex notions) are replaced by ratios of these.
Discussions on Employing Bayesian Net to Model Clinical Pathways (Replace BN by HDN to achieve Hyperbolic BN)
Ingine, Inc. is a startup in the early stages of developing the BioIngine platform, which advances data science around interoperability, particularly healthcare data mining and analytics for medical knowledge extraction. Below are some of the lessons learned while dealing with healthcare transformation concerns, especially the ONC’s Interoperability vision.
As an introduction, we include the following passage from the book
The Engines of Hippocrates: From the Dawn of Medicine to Medical and Pharmaceutical Informatics
Healthcare administration has often been viewed as one of the most conservative of institutions. This is not simply a matter of the inertia of any complex bureaucratic system. A serious body with an impressive history and profound responsibilities cannot risk unexpected disruptions to public service by changing with every fashionable new convenience, just for the sake of modernity. A strong motivation is needed to change a system on which lives depend and which, for all its faults, is still for the most part an improvement on anything that went before. However, this is also to be balanced against the obligation of healthcare, as an application of science and evolving human wisdom, to make appropriate use of the new findings and technologies available. This is doubly indicated when significant inefficiencies and accidents look as if they can be greatly relieved by upgrading the system. Sooner or later something has to give, and the pressure of many such accumulating factors can sometimes force a relatively entrenched system to change in a sudden way, just as geological pressures can precipitate an earthquake. An Executive Forum on Personalized Medicine organized by the American College of Surgeons in New York City in October 2002 similarly warned of the increasingly overwhelming accumulation of arguments demanding reform of the current healthcare system…if there is to be pain in making changes to an established system, then it makes sense to operate quickly, to incorporate all that needs to be incorporated and not spin out too much the phases of the transitions, and lay a basis for ultimately assimilating less painfully all that scientific vision can now foresee. But scientific vision is of course not known for its lack of imagination and courage, and is typically very far from conservative, still making an element of future shock inevitable in the healthcare industry.
Complicated vs Complexity
A) Generally, there are two views in characterizing a system: complicated and complex. Complicated problems concern system operations and population management, while complex problems concern the multi-variability involved in an individual patient diagnosis.
Below link discusses providing better scenarios regarding complicated vs complexity
Generally, all management concerns around operations, payment models, healthcare ecosystem interactions, etc deal with delivering the systemic efficiencies. These are basically complicated problems residing in the system, which when resolved yield the hidden efficiencies.
All concerns that affect the delivery of clinical efficacy have to deal with complex problems, mostly owing to the high dimensionality (multi-variability) of the longitudinal patient data.
When both complicated and complex concerns are addressed, healthcare as an overarching complex system will begin to yield the desired performance-driven outcomes.
B) Standards around interoperability have generally dealt with the following three levels of health information technology interoperability:
1 – “Foundational” interoperability allows data exchange from one information technology system to be received by another and does not require the ability for the receiving information technology system to interpret the data.
2 – “Structural” interoperability is an intermediate level that defines the structure or format of data exchange (i.e., the message format standards) where there is uniform movement of healthcare data from one system to another such that the clinical or operational purpose and meaning of the data is preserved and unaltered. Structural interoperability defines the syntax of the data exchange. It ensures that data exchanges between information technology systems can be interpreted at the data field level.
3 – “Semantic” interoperability provides interoperability at the highest level, which is the ability of two or more systems or elements to exchange information and to use the information that has been exchanged. Semantic interoperability takes advantage of both the structuring of the data exchange and the codification of the data including vocabulary so that the receiving information technology systems can interpret the data. This level of interoperability supports the electronic exchange of patient summary information among caregivers and other authorized parties via potentially disparate electronic health record (EHR) systems and other systems to improve quality, safety, efficiency, and efficacy of healthcare delivery.
The above levels of interoperability deal only with achieving semantic compatibility between systems in the data transacted from the large number of myriad systems (EHRs) as they converge into a heterogeneous architecture (HIE / IoT). This addresses only the complicated concerns within the system. These levels do not necessarily deal with the extraction and discernment of the knowledge hidden in the complex health ecosystem. To achieve this, for simplicity’s sake, let us define the need for a second-order semantic interoperability concerned with the data mining approaches required to represent systemic medical knowledge. It is this medical knowledge, implicit, explicit and tacit, that together forms the evidence based medicine much desired to facilitate any clinical decision support system.
C) The present efforts around interoperability, which center mostly on data standards (HL7v2, HL7v3, FHIR, C-CDA, ICD-10, LOINC, SNOMED etc.) and clinical quality measures (QRDA), have addressed only complicated concerns and not necessarily the complex problems. This is the vexation in quality measures reporting. While this has advanced the adoption of EHRs by hospitals, the EHR is still far from becoming an effective decision support tool for physicians.
It must be noted that the MU2 criteria suggest that, besides achieving the health information exchange pivotal to the creation of an Accountable Care Organization (ACO), at least five health priorities or critical health risk conditions must be addressed employing a clinical decision support system. This point creates a need for addressing clinical efficacy, in addition to achieving the best possible system efficiencies leading to systemic performance-driven outcomes. A much deeper perspective is therefore required in the interoperability efforts, to better drive data science around data mining that can help engage physicians in the realization of performance-driven outcomes, rather than allowing physicians to be encumbered by EHRs driven by the reimbursement model. Also, although most EHR vendors employ C-CDA to frame the longitudinal patient view, they do not necessarily send all the data to the Health Information Exchange, which truncates the physician’s full view of the longitudinal patient record.
D) Physicians, Primary Care, Cost of Rendering Care, and the Physician Workforce Shortage
In primary care, it is desired that today’s over-burdened physicians, moving forward, work as team leads engaging a variety of healthcare professionals, while also better enabling trained nurse practitioners, and that care be rendered in lower-cost environments, moving away from higher-cost environments such as hospitals and emergency care facilities. This also means that moving away from service-based models to performance-based payment models becomes imperative.
It must be noted that dealing with the way an organization generally works reassigning responsibilities both horizontally and vertically, has to do only with the complicated concerns of the system, not the complex problem. Again it must be emphasized that data mining related to evidence based medicine, which is in a way knowledge culled from the experiences of the cohorts within the health ecosystem, will play a vital role in improving the much desired clinical efficacy leading ultimately to better health outcomes. This begins to address the complex systemic problems, while also better engaging the physicians who find the mere data entry into the EHR cumbersome and intrusive; and not able to derive any clinical decision support from the integration of the System of systems (SoS).
Correlation vs Causation
A) While we make a case for better enabling evidence based medicine (EBM) driven by data mining as a high priority in the interoperability scheme of things, we also would like to point out the need for creating thorough systematic review aided by automation which is vital to EBM. This also means dealing with Receiver-Operating Characteristic (ROC) Curves http://www.ncbi.nlm.nih.gov/pubmed/15222906
“The consensus of expert opinion based upon systematic reviews can either result in a solid and confident unanimous opinion, a reliable opinion with serious minority objections, a genuine controversy with no objective resolution, or simply the conclusion that we currently lack sufficient evidence and do not know the answer.”
Also, another reference to:-
Reflections on the Nature and Future of Systematic Review in Healthcare. By:- Dr. Barry Robson
In recent times Bayesian statistics has emerged as a gold standard for developing curated EBM (http://www.ncbi.nlm.nih.gov/pubmed/10383350). In this context we would like to draw attention to the fact that, while correlation is important as discussed in the above linked article, which is developed from the consensus of the cohorts in the medical community, it is also important to ascertain causation. This demands a holistic Bayesian statistics as proposed in the new algorithms developed by Dr. Barry Robson, including those built on proven ideas in physics, which advance the scope of Bayesian statistics. The approach and its impact on healthcare interoperability and analytics are discussed in the link provided below.
We extend Q-UEL, our universal exchange language for interoperability and inference in healthcare and biomedicine, to the more traditional fields of public health surveys. These are the type associated with screening, epidemiological and cross-sectional studies, and cohort studies in some cases similar to clinical trials. “There is the challenge that there is some degree of split between frequentist notions of probability as (a) classical measures based only on the idea of counting and proportion and on classical biostatistics as used in the above conservative disciplines, and (b) more subjectivist notions of uncertainty, belief, reliability, or confidence often used in automated inference and decision support systems. Samples in the above kind of public health survey are typically small compared with our earlier “Big Data” mining efforts. An issue addressed here is how much impact on decisions should sparse data have.”
B) Biostatistics, Algebra, Healthcare Analytics and Cognitive Computing
Another interesting aspect that emerges is the need for biostatistics; many doctors with an MD qualification are additionally qualifying in Public Health Management, which also deals with biostatistics. Dealing with population health on the one hand and clinical efficacy on the other, interoperability via biostatistics has to deliver both views: the macro view with respect to systemic outcomes and the micro view of clinical efficacy. Developing such capabilities means a much grander vision for interoperability, as discussed in OSEHRA, the VA-sponsored open source effort to make VistA available to the world market at a fraction of the cost. More discussion is on the OSEHRA forum in the link below.
“Tom Munnecke – The Original Architect of VistA – This move to a higher level of abstraction is a bit like thinking of things in terms of algebra, instead of arithmetic. Algebra gives us computational abilities far beyond what we can do with arithmetic. Yet, those who are entrenched in grinding through arithmetic problems have a disdain for the abstract facilities of algebra.”
An interesting point in the discussions at the above link is that a case is being made for the role of data science (called Knowledge Engineering during the last three decades) in driving better new algorithms for healthcare interoperability, including those built on proven ideas in physics. This helps advance the next generations of EHR capabilities, eventually emerging as a medical-science-driven cognitive computing platform. The recommendation is to employ advances in data science to move the needle from developing a deterministic or predicated System of systems (SoS) based on schemas such as FHIM (http://www.fhims.org), which is laborious to design and outmoded, to harnessing the data locked in heterogeneous systems by employing advanced Bayesian statistics and new algorithms, including those built on proven ideas in physics, and especially by exploiting the algebra. Delivered on a BigData architecture as a cognitive computing platform with schema-less approaches, this approach has a huge benefit in terms of cost, business capability and time to market, delivering medical reasoning from the healthcare ecosystem as realized by the interoperability architectures.