Author: Benjamin Paul Rode
Editorial Contributions: Garima Gujral, Joaquin Melara, Stratos Kontopoulos
Strengths, Weaknesses, and Complements
Symbolic and subsymbolic systems feature respective strengths and weaknesses that are to some degree complementary. Large Language Models (LLMs) have strong natural language facility and are good at conjecture, but their internal functionality is opaque and hard to interpret; symbolic AI has explicit and compositional functionality but is constrained to mostly deductive reasoning. Both forms have scaling and context-management issues and carry high development and maintenance costs.
In this edition of AI Infused, we introduce the concept and model of hybrid AI systems and answer the following questions:
What are the strengths and weaknesses of probabilistic AI?
What are the strengths and weaknesses of deterministic AI?
What are ensemble architectures?
What is the epistemic loop model?
What are some open questions regarding ensemble architectures?
How can you get involved in the discussion?

The Strengths and Weaknesses of Probabilistic AI
Strengths of Probabilistic AI: Hypothesis Generation and Linguistic Competence
The great strengths of LLM architectures consist in their flexibility and facility of use, as well as their capacity to encapsulate and integrate information. Their inputs and outputs are in natural language, and are thus human readable. They can accommodate a certain measure of idiomatic noise in their inputs and are capable of dealing with nuances in their source data. Just as important, they’re able to produce output in controlled syntax, including writing computer code as well as summarizing information. They can generate essays of arbitrary length on virtually any specified topic in any writing style reflected in their training data.
One of the most interesting capabilities of LLMs is their capacity for informed speculation. The recursive autocomplete algorithm of an LLM may be best viewed as reporting the semantic mean of the model’s training data for any given prompt topic: effectively, the LLM is reporting the results of an internet opinion poll on the topic in question, or, perhaps equivalently, treating the training data as a futures market for identifying promising bets. To this extent, generative AI may be considered as approximating a hypothesis generation engine.
Weaknesses of Probabilistic AI: Over-generation, Explanatory Opacity, Context, and Cost
The problems of generative models derive partly from their tendency to over-generate in ways that make them unreliable, and for reasons that resist explanation. Next-token probability estimation collates statistics based on the state of the training data at the time of training: if the probability of the model producing an output string given the actual training data and prompt is higher than it would have been had the training data been cleaned of all hidden confounds, errors can result. The extent to which LLMs engage in scalable reasoning processes that apply domain-general rules, as opposed to simply recapitulating inferences reflected in the training data, is uncertain.
Related issues are the so-called 'jagged performance frontier', wherein scaling is not always predictable across logically related domains, and the interpretability problem, which consists in the fact that there is no forensically unimpeachable 'audit trail' showing how the model produces its result. Generative systems also seem to have limited understanding of the contexts in which they are used, and work on 'alignment' safety indicates a possible trade-off between playing to confirmation bias on the one hand and producing contextually problematic output on the other.
On the side of implementation, data centers are high-overhead in terms of both energy expenditure and water usage, and pre-training also requires large amounts of curated data. Finally, some analyses suggest that LLMs in isolation may have provable performance limits as a strict function of the training data and input prompt.
Highly creative products that are paradigm-breaking or otherwise 'disruptive' feature the property that their conditional probability given current background knowledge is low (they carry high surprisal, in Shannon's sense), while their utility is high. In LLMs, the probability of an output given the training data is governed by the sampling temperature: lowering the probability of the output, and thereby increasing novelty, requires raising the temperature, which enlarges the possibility space to include a very large set of outputs of very low utility.
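The temperature trade-off can be made concrete with a minimal temperature-scaled softmax. The logits below are illustrative numbers, not drawn from any real model; the sketch only shows the qualitative effect: cooling sharpens the distribution around the modal token, while heating flattens it, handing probability mass to many low-utility alternatives.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw logits to a probability distribution, scaled by temperature."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative logits for four hypothetical next tokens, best to worst
logits = [4.0, 2.0, 1.0, 0.5]

p_low = softmax_with_temperature(logits, 0.5)   # sharpens: modal token dominates
p_high = softmax_with_temperature(logits, 2.0)  # flattens: tail tokens gain mass

print(p_low[0], p_high[0])  # modal probability shrinks as temperature rises
```

The point of the sketch is that there is no knob that lowers output probability without simultaneously admitting the low-utility tail.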

The Strengths and Weaknesses of Deterministic AI
Strengths: Compositionality, Reusability, and Determinacy of Justification
In contrast, symbolic AI is deterministic and semantically compositional in ways that subsymbolic AI is not. In a symbolic AI system, predicate definitions are not subject to indeterminacy of interpretation: the definition just is the formal specification of the predicate in the ontology and its entailments as per the inference support. Likewise, as long as the semantic model is complete, atomic terms have unambiguous denotations. These features in turn ensure strict compositionality: that is, the truth value of any expression in a symbolic system with complete semantics will always be a computable function of the truth values of the atomic components and the definitions of the predicate operators.
Compositionality in turn ensures that reasoning is semantically complete. So long as the system's inference engine works by strictly deductive methods (e.g., resolution theorem proving) or even by defeasible methods with a well-defined truth maintenance system, every inference will be equivalent to a mathematical proof, and there will be no 'synthetic' or otherwise-unsourceable garbage: any error will be due to an incorrect specification on the part of the developer, and correcting it will be a matter of debugging code. Another major advantage of compositionality is that, because entailment is logically deterministic, ontologies can be extended in ways that require no modification to the more general level of the representation: upper- and mid-level representations are reusable as long as their functionality is serviceable for the current purpose.
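Strict compositionality, as described above, can be sketched in a few lines: the truth value of a complex formula is a computable function of the truth values of its atoms and the definitions of its operators. The nested-tuple encoding of formulas here is an illustrative assumption, not a standard representation.

```python
def evaluate(expr, model):
    """Recursively compute the truth value of a formula from its parts.

    A formula is a nested tuple: ("atom", name), ("not", f),
    ("and", f, g), or ("or", f, g). `model` maps atom names to booleans,
    playing the role of a complete semantic model with unambiguous denotations.
    """
    op = expr[0]
    if op == "atom":
        return model[expr[1]]   # atomic terms denote unambiguously
    if op == "not":
        return not evaluate(expr[1], model)
    if op == "and":
        return evaluate(expr[1], model) and evaluate(expr[2], model)
    if op == "or":
        return evaluate(expr[1], model) or evaluate(expr[2], model)
    raise ValueError(f"unknown operator: {op}")

formula = ("and", ("atom", "p"), ("not", ("atom", "q")))
print(evaluate(formula, {"p": True, "q": False}))  # True
```

Note that nothing probabilistic enters at any point: given the same model, evaluation always returns the same value, which is exactly the determinacy the article contrasts with subsymbolic systems.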
Weaknesses: Inflexibility, Scalability, and Quality Assessment
However, symbolic AI has its own suite of costs. Although it does not require the compute-intensive, pan-informatic pretraining of large language models, it does require an ontology infrastructure and an inference engine, which must be built with the aid of skilled human experts and demand intensive iterative engagement with subject matter experts, adding development cycles that contribute to time and cost. The lack of relevant expertise is a non-trivial issue: not only is training in ontology design hard to come by, ontological engineering is in truth as much art as science, and native talent is scarce. And although ontology development is sometimes characterized as a purely up-front cost, this is not completely true.
All of the advantages previously identified (unambiguous denotations and definitions, strict compositionality, definitive audit trails, and reusability) depend fundamentally on the completeness of the semantic model, but it is not always possible to ensure this. The very features which recommend symbolic AI can become a bottleneck in circumstances where new data schemas or paradigm-shifted models surface, and there is no way to precompute what kind of revision may be needed to accommodate the new conceptual framework. No systematic methodology exists at present for auto-evolving ontologies in response to novel semantic requirements: adjudicating indexing, context, and inference optimization can only be formally modeled with higher-order logic, which risks opening the door to semantic indeterminacy and introduces additional inference cost.

Ensemble Architectures and the Epistemic Loop Model
The Promise of Ensemble Architectures
The complementarity of strengths and weaknesses makes it natural to wonder about bi-directional synergies between symbolic and sub-symbolic AI. On the one hand, LLMs may be resources for ontology development. While there’s reason to doubt that contemporary large language models can build good ontologies from scratch, extending existing ontologies may be another matter. For example, it may be possible to identify well-ontologized leaf node elements with undeveloped siblings and turn them into template structures which natural language generation can in turn render as natural language questions to be submitted to an LLM. The usual caveats concerning confabulation apply, and the more technical or controversial the subject, the higher the risk of misinformation, but LLMs aided by human expert curation may assist the ontologist and lighten the ontology development workload.
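The template idea sketched above might look like the following; the concept names, relation label, and question wording are all hypothetical, and a real pipeline would draw the template from the structure of the well-ontologized sibling rather than a hand-written string.

```python
def sibling_question(developed, undeveloped, relation):
    """Render a natural-language question from an ontology-derived template.

    `developed` is a leaf concept whose `relation` is already filled in;
    `undeveloped` is a sibling missing that relation. The resulting question
    would be submitted to an LLM, with human expert curation of the answer.
    """
    return (f"The concept '{developed}' has a known value for the relation "
            f"'{relation}'. What is the corresponding '{relation}' of "
            f"'{undeveloped}'?")

print(sibling_question("oak", "maple", "typical leaf shape"))
```

Even a sketch this simple makes the division of labor visible: the ontology supplies the structure of the question, the LLM supplies a candidate answer, and the human curator decides whether it enters the ontology.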
At the same time, if an LLM can be trained to produce single-sentence outputs in a 'controlled' grammar, it is often possible to rely on lexification to turn them into expressions in a formal language. Integrating them into an ontology that reflects a measure of ground truth can serve as useful validation. However, this bilateral architecture may be missing a key component in the form of verification, which is not the same thing as validation as that term is used here. Validation consists in checking for consistency with background knowledge. Verification consists in checking whether the empirically observable entailments that follow from adding the hypothesis to the ontology are confirmed or falsified: simply put, are the LLM's predictions borne out?
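A toy sketch of the lexification-and-validation step, assuming a tiny hand-written controlled grammar and background ontology; the facts, the disjointness set, and the sentence pattern are all illustrative assumptions, not a real knowledge base.

```python
import re

# Toy background ontology: known facts and assertions known to be false.
FACTS = {("penguin", "is_a", "bird"), ("bird", "has", "feathers")}
DISJOINT = {("penguin", "can", "fly")}

def lexify(sentence):
    """Turn a controlled-grammar sentence 'X can/is a/has Y.' into a triple."""
    m = re.match(r"^(\w+) (can|is a|has) (\w+)\.$", sentence.strip())
    if not m:
        raise ValueError(f"not in controlled grammar: {sentence!r}")
    subj, verb, obj = m.groups()
    return (subj.lower(), verb.replace(" ", "_"), obj.lower())

def validate(triple):
    """Check an LLM-proposed assertion for consistency with background knowledge."""
    if triple in DISJOINT:
        return "inconsistent"
    if triple in FACTS:
        return "already known"
    return "novel candidate"  # would be queued for empirical verification

print(validate(lexify("Penguin can fly.")))  # inconsistent
```

The "novel candidate" branch is exactly where validation stops and verification would have to begin: the triple is consistent with background knowledge, but only empirical checking can tell whether it is true.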
The Epistemic Loop: Prospects and Questions
Ex hypothesi, a fully realized trilateral ‘epistemic loop’ architecture for artificial intelligences would involve three loci with significant optimization by re-entrant feedback between each pair: hypothesis generation, hypothesis validation, and hypothesis verification. Generative AI would exercise its capacity for hypothesis generation, symbolic AI would be used for validation against a developed ontology, and a combination of deductive entailment discovery and empirical checking would serve the purposes of verification, with systemic update on the basis of performance. While presenting cybernetic control issues that remain open research problems, such an architecture offers some promise of mitigating cost while maximizing performance and efficiency.
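The control flow of such a loop might be skeletonized as follows. All four components are stand-in callables, not implementations: in a real system `generate` would be an LLM, `validate` a symbolic consistency check, `verify` an empirical test of entailments, and `update` the systemic feedback step.

```python
def epistemic_loop(generate, validate, verify, update, seed, rounds=3):
    """Skeleton of the trilateral loop: generation -> validation -> verification,
    with reentrant feedback to the generator after every round."""
    context = seed
    accepted = []
    for _ in range(rounds):
        hypothesis = generate(context)       # subsymbolic: propose a hypothesis
        if not validate(hypothesis):         # symbolic: check against ontology
            context = update(context, hypothesis, "invalid")
            continue
        if verify(hypothesis):               # empirical: test its entailments
            accepted.append(hypothesis)
            context = update(context, hypothesis, "verified")
        else:
            context = update(context, hypothesis, "falsified")
    return accepted

# Toy run: hypotheses are integers; 2 fails validation, odd numbers verify.
accepted = epistemic_loop(
    generate=lambda c: c + 1,
    validate=lambda h: h != 2,
    verify=lambda h: h % 2 == 1,
    update=lambda c, h, status: h,
    seed=0,
)
print(accepted)  # [1, 3]
```

The open research problems the article mentions live inside the stand-ins: how `update` should retrain or reprompt the generator, and where a human operator intervenes in each feedback edge, are precisely the cybernetic control questions left open.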
The model's intensive use of reentrant feedback implies continuous incremental learning as opposed to up-front training on every possible use case, and the fact that the hypothesis validation and hypothesis verification components are primarily symbolic in nature implies the deployment of small models, i.e., models with modest numbers of parameters. The traditional problems of symbolic AI (context management, meta-knowledge, inflexibility, and up-front development cost) are partly mitigated by the dynamic adjustment that continuous learning affords. Perhaps most importantly, one or more human operators may play a governance role in any of the indicated feedback loops, in addition to identifying statistical correlations in the data that require causal explanation. In this respect humans remain situated in the system, in addition to being the primary consumers of the causal explanations that the system helps produce.
Whether or not this and like frameworks can mitigate any of the issues cited in this article is itself a matter for empirical investigation. Indeed, if there is one overall lesson to be drawn, it is that more research is needed into the subjects this article deals with. We need better testable performance criteria for ontologies, better understanding of the performance envelope of LLMs, and better understanding of the cybernetic control problems and cost management issues posed by retraining from the validation and verification of LLM outputs. Only by addressing such questions can we arrive at a scientifically informed assessment of the potential, limits, and risks of ensemble architectures, and of the pathway to affording human users executive authority in the concept of operations.

How can you get involved in the discussion?
If you found this article interesting and want to provide us with feedback or dive deeper into the discussion, feel free to reach out to me for a conversation.
How can you contribute to the newsletter?
To contribute to the newsletter, please fill out the following Google Form:
Your responses will help shape future editions, guide the topics we investigate, and inform the kinds of conversations we facilitate.
And if you would like to be more deeply involved in a community of practitioners navigating the responsible adoption of AI technologies, feel free to visit swarmcommunity.org to access resources and/or book a call to join the SWARM community.
