Exploring the ability to Enhance Scalability For Knowledge Graphs For Enterprise Solutions

Funding Agency

DOD

USAF

Year: 2025

Topic Number: AF25B-T002

Solicitation Number: 25.B

Tagged as:

STTR

BOTH

Solicitation Status: Open

NOTE: The Solicitations and topics listed on this site are copies from the various SBIR agency solicitations and are not necessarily the latest and most up-to-date. For this reason, you should use the agency link listed below which will take you directly to the appropriate agency server where you can read the official version of this solicitation and download the appropriate forms and rules.

View Official Solicitation

Release Schedule

  1. Release Date
    April 2, 2025

  2. Open Date
    April 2, 2025

  3. Due Date(s)

  4. Close Date
    May 21, 2025

Description

TECHNOLOGY AREAS: Trusted AI and Autonomy; Advanced Computing and Software; Biotechnology; Hypersonics; Integrated Network System-of-Systems

OBJECTIVE: The objective of this Phase I STTR project is to advance the applicability and optimization of Knowledge Graph technologies for large-scale predictive analytics, suitable for adoption in the Defense Industrial Base (DIB)'s manufacturing operations. Such manufacturing efforts yield enormous quantities of heterogeneous data, both in structured and unstructured formats, with contributions from multiple participants across multiple supply chain tiers. The focus of Phase I is to demonstrate the technical merit and feasibility of solutions in two key areas: the provision of efficient, versatile plugins for automating graph construction from diverse input data types, which can maintain the same access permissions defined for the source data, and the design of a scalable, high-performance approach that can combine multiple graphs while managing increased data loads and preserving data integrity and query performance.

A successful Phase I project will demonstrate that a Knowledge Graph based approach can meet the scalability, access control, and data security requirements crucial to underpinning large-scale predictive analytics based on DIB data and form the basis of an AI-based analytics platform to support DAF operations. The foundational work provided in Phase I can be further extended and validated in Phase II and applied to real-world use cases such as the KC-46 platform and its DIB supply chain. In the longer term, this project will be integrated within the Earth 616 framework and underpin a supply chain analytics platform that can ingest and analyze high volumes of heterogeneous data from multiple diverse sources, including participants from the DIB and open source feeds, whilst maintaining data integrity, traceability and information security.
The ability of enhanced knowledge graph architectures to support the provision of efficient, accurate, and timely analytics capabilities, including risk analysis and decision making, will provide critical capabilities to the DoD and address operational gaps in supply chain management.

DESCRIPTION: This STTR project aims to provide novel software architectures and components that will underpin the DoD’s ability to ingest, trust, and utilize large volumes of heterogeneous data from the DIB, such that the DoD can perform advanced analytics on DAF operations, such as the supply chain. This project will provide necessary enhancements to the capabilities of Earth 616 by ensuring that knowledge graph technologies can scale to the required data volumes while providing data security and managing access control constraints. This will ensure that the analytics platform can be applied to real-world use cases and make significant inroads into understanding and mitigating risks in the DAF supply chain. The work proposed to meet these objectives is as follows:

Efficient Ingestion of Supply Chain Data: Create an architecture that supports automated ingestion and processing of supply chain data from diverse sources, including partners from multiple tiers of the supply chain. The goal is to provide the capability of automatically converting from existing data formats into a graph format to drastically simplify supply chain data ingestion.

Real-world Knowledge Graph Interfaces: This effort aims to dramatically improve the interface with real-world data while simultaneously maintaining the integrity and permissions for data. The approach will demonstrate sophisticated methods for ingesting data from diverse sources, including various file formats and relational databases, and for representing both structured and unstructured data in a graph database architecture.
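As an illustration only, the file-to-graph conversion described above might be sketched as follows. This is a minimal sketch under stated assumptions, not a design required by the topic: the `Graph` container, the `parse_records` and `ingest` functions, the `part` and `supplier` field names, and the `allowed_roles` permission label are all hypothetical. The sketch copies the source file's permission label onto every node it creates and records a SHA-256 digest of the file so that later tampering with the source could be detected.

```python
import csv
import hashlib
import io
import json
from dataclasses import dataclass, field


@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)  # node_id -> properties
    edges: list = field(default_factory=list)  # (source_id, relation, target_id)


def parse_records(raw: str, fmt: str) -> list[dict]:
    """Turn raw file text into flat records (hypothetical plugin point)."""
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(raw)))
    if fmt == "json":
        return json.loads(raw)
    raise ValueError(f"unsupported format: {fmt}")


def ingest(graph: Graph, raw: str, fmt: str, allowed_roles: frozenset) -> str:
    """Add Part and Supplier nodes from one file; return the file's SHA-256 digest."""
    digest = hashlib.sha256(raw.encode("utf-8")).hexdigest()
    # Assumption: permissions are defined per source file and inherited by its nodes.
    provenance = {"source_sha256": digest, "allowed_roles": allowed_roles}
    for rec in parse_records(raw, fmt):
        part_id = f"part:{rec['part']}"
        supplier_id = f"supplier:{rec['supplier']}"
        # A node seen in several files keeps the provenance of the last file read.
        graph.nodes[part_id] = {"kind": "Part", **provenance}
        graph.nodes[supplier_id] = {"kind": "Supplier", **provenance}
        graph.edges.append((supplier_id, "SUPPLIES", part_id))
    return digest


g = Graph()
csv_text = "part,supplier\nP-100,Acme\nP-200,Acme\n"
json_text = '[{"part": "P-300", "supplier": "Bolt Co"}]'
ingest(g, csv_text, "csv", frozenset({"tier1"}))
ingest(g, json_text, "json", frozenset({"tier2"}))
print(len(g.nodes), len(g.edges))  # 5 3
```

Both inputs end up in the same node and edge structure regardless of file type, which is the property the plugin approach is after. A real implementation would map fields through a defined ontology instead of hard-coded column names.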
Furthermore, an advanced system to ensure data integrity will utilize distributed ledger technology, creating a tamper-evident environment. This system will be complemented by a comprehensive mapping framework that replicates the original data permissions structure within the graph database. The required solution will incorporate a sophisticated identity and credential-based management system, capable of seamlessly bridging disparate data ecosystems. This will enable cohesive access control across heterogeneous data environments, ensuring that appropriate permissions are maintained throughout the data lifecycle, from ingestion to analysis and reporting. These enhancements will deliver a secure, scalable, and highly interoperable data management solution that addresses the complex requirements of modern data-driven organizations while maintaining the highest standards of data integrity and access control.

Scalable Design: This project further aims to thoroughly investigate a distributed cluster architecture for Knowledge Graphs to determine how best to support growing data volumes. As Earth 616 moves forward and tackles more data from different platforms, it is going to need multiple graph databases, each capable of processing trillions of nodes. What is also needed is a common schema that overlays a distributed network of Graph databases to form a seamlessly connected set of Graph nodes that can be queried in a common, efficient way in a distributed setting. The system architecture will scale seamlessly with data, reducing costs and hardware requirements while optimizing performance across interconnected datasets. The solution will maintain data integrity and deliver high performance, capable of handling billions of nodes and trillions of relationships with millisecond response times.

PHASE I: DAF requires a robust predictive analytics system to proactively augment supply chain operations.
The primary focus of the Phase I efforts will be to develop and demonstrate the feasibility of ingesting high volumes of heterogeneous data and maintaining data integrity and security, which are essential components of a viable future AI-based supply chain predictive analytics platform operating in an ecosystem with the DIB. The Phase I project will lay the groundwork by thoroughly assessing current systems, identifying gaps, evaluating technological solutions, and developing a preliminary design and architecture to feed into a focused roadmap for implementation in Phase II. The analysis provided in Phase I will identify inefficiencies, bottlenecks, and areas for improvement and will engage with stakeholders to gather specific requirements and align the program's objectives with operational goals. Small-scale tests conducted during Phase I will help to establish the feasibility and scalability of the proposed solutions. It is vital to identify and address technological and operational challenges early. An initial security assessment is vital to ensure that data permission and compliance requirements are met. A preliminary cost-benefit analysis will provide insights into the potential efficiency gains, cost savings, and return on investment.

To this end, this Phase I project is designed to create an architecture that enhances the capabilities of Knowledge Graph (KG) databases so that scalable predictive analytics can be performed by DAF. The objective of this effort, therefore, is to explore KG functionality that can support large-scale Graph capabilities. To accomplish this, Phase I focuses on three main tasks:

Automation: Develop a high-efficiency plugin to automate the file-to-graph processes, accommodating various input formats and enhancing data ingestion efficiency.
Security: Introduce a state-of-the-art distributed permission layer that uses user credentials and "node verifiers" to enforce fine-grained, context-sensitive data access control.
Scalability: Architect and demonstrate the feasibility of a sophisticated distributed cluster infrastructure to effectively handle vast data volumes while preserving exceptional query performance and data integrity.

The expected outcome of Phase I is a solution that is ready for development and provides the underlying infrastructure needed to ingest the required data and scale the system. Phase I deliverables shall include:

● Comprehensive system design documentation using SysML.
● Architecture and design demonstrating how enhanced performance metrics can be achieved.
● Regular progress reports and technical documentation.
● Final architecture to include the following features: (1) continuous operation under various data loads; (2) advanced scalability capabilities; (3) significant improvements in supply chain resilience and efficiency.
● Phase II Planning: A detailed roadmap for the Phase II research and development effort, identifying key milestones, deliverables, and resource requirements for Phase II.

Throughout the project, the provider will work in close partnership with the DAF technical point of contact (TPOC), ensuring regular communication through scheduled meetings and comprehensive technical reports. These collective efforts will result in a robust, scalable, and secure Knowledge Graph system with advanced predictive analytics. The success criteria for Phase I are:

  1. Demonstrated integration feasibility of Graph Technologies into DAF supply chain operations
  2. Documentation of the approach for the in-graph permissioning and how it can be implemented
  3. Demonstration of how the distributed scalability approach can be realized
  4. Positive feedback from DAF stakeholders on proposed designs and concepts
  5. Delivery of a comprehensive and actionable plan for Phase II development

PHASE II: Building upon a successful Phase I outcome that demonstrates the technical feasibility of approaches to scalability and data security within a graph database, Phase II will implement the architecture and design created in Phase I, advancing the capabilities of a Graph-based predictive analytics system. Phase II will build on the current state of the art to advance Technology Readiness Level (TRL) in all technology areas by delivering designs and physical prototypes that demonstrate enhanced performance. It will focus on R&D, Quality Assurance and Testing, along with refinement based on regular meetings with the DAF TPOC. The outcome will be a well-defined prototype that meets the specified requirements and expectations and is ready to integrate within the Earth 616 platform. The period of performance for Phase II is 18 months. Objectives are as follows:

Objective 1: Ingestion of Data Files. Performers shall develop a generic plugin approach to parse and build KGs for different file types, e.g. CSV, RDF, JSON, XML. This should incorporate an efficient Extract, Transform and Load (ETL) process and not rely on slower out-of-the-box tools [1]. The performer shall demonstrate that files can be stored off-chain and associated with an NFT on a Blockchain, which supports one file or multiple files, in order to provide a tamper-evident approach. The files should use a defined ontology that is used to map from each file type to the KG for use in predictive analytics.

Objective 2: Permissioning Graph Queries. The performer should assess Neo4j RBAC capabilities but create a more distributed approach to graph permissioning. Future USAF applications will have data coming from many disparate sources, including data from supply chain OEMs, which need to maintain privacy across multiple stakeholders that may want to query data.
Extending concepts in [2], we need a distributed permissioning approach using credentials and attribute policies, which will lead to a new permission method for KGs.

Objective 3: Distributed Queries. Performers should provide the capability to scale out as applications scale up to handle the increased load as data volumes continue to grow. Data integrity and performance must be maintained at scale, providing the foundation for performant applications. Neo4j’s high-performance distributed cluster architecture is reported to scale with the data across billions of nodes and trillions of relationships. This distributed architecture should be investigated and used to address the issue of scalability [3].

Objective 4: System Integration and Interoperability. Performers should integrate the Graph extensions with existing DAF systems, ensuring seamless data flow and interoperability. Performers must conduct end-to-end testing to validate the integrated system’s functionality, performance, and reliability.

Objective 5: User Training and Support. Performers need to develop comprehensive training materials to equip supply chain users with the skills to use these new extensions, along with hands-on training sessions and workshops.

Objective 6: QA Testing and Evaluation. Performers should implement a comprehensive test plan and build QA tools to automate testing, and use this to test results against predefined success criteria.

Objective 7: Ongoing Performance Metrics. Performers should define key performance indicators (KPIs) to measure the system’s effectiveness and gather feedback from end users to ensure the platform meets requirements.

A successful Phase II effort will deliver developments of the graph plugins, ensuring they meet the scientific, technical, and commercial merits required for successful deployment. This phase aims to deliver a well-defined, operational prototype that offers DAF innovative new large-scale graph capabilities.
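To make Objectives 2 and 3 more concrete, a credential and attribute-based check applied inside a scatter-gather query might look like the sketch below. It is a hedged illustration and not the method of reference [2] or a Neo4j API: the `Credential` and `Node` types, the `required` attribute set, the `can_read` policy, and the in-memory shards are all assumptions made for the example. Each shard filters its own nodes against the caller's attributes before results are merged, so unauthorized nodes never leave the shard that holds them.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    subject: str
    attributes: frozenset  # attributes asserted for the caller (hypothetical format)


@dataclass(frozen=True)
class Node:
    node_id: str
    kind: str
    required: frozenset  # attributes a caller must hold to read this node


def can_read(cred: Credential, node: Node) -> bool:
    """Assumed policy: the caller must hold every attribute the node requires."""
    return node.required <= cred.attributes


def query_shard(shard: list, cred: Credential, kind: str) -> list:
    """Run the query on one shard, enforcing the policy locally."""
    return [n for n in shard if n.kind == kind and can_read(cred, n)]


def distributed_query(shards: list, cred: Credential, kind: str) -> list:
    """Fan the query out to all shards in parallel and merge the partial results."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda shard: query_shard(shard, cred, kind), shards))
    return sorted((n for part in partials for n in part), key=lambda n: n.node_id)


shard_a = [
    Node("part:P-100", "Part", frozenset({"org:acme"})),
    Node("part:P-200", "Part", frozenset({"org:acme", "clearance:cui"})),
]
shard_b = [
    Node("part:P-300", "Part", frozenset()),
    Node("supplier:Bolt", "Supplier", frozenset()),
]
analyst = Credential("analyst-1", frozenset({"org:acme"}))
visible = distributed_query([shard_a, shard_b], analyst, "Part")
print([n.node_id for n in visible])  # ['part:P-100', 'part:P-300']
```

In this toy setup the analyst cannot see `part:P-200` because the credential lacks `clearance:cui`. In a real deployment the shards would be separate graph database instances and the credential would be cryptographically verified rather than trusted as given.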
A successful Phase II effort will deliver the following:

● Ingestion of Data Files: An efficient ingestion mechanism for files, capable of integrating with a blockchain; a defined ontology for file type mapping.
● Permissioned Graph Queries: A distributed credential and attribute-based policy approach to graph permissioning.
● Distributed Queries: A distributed solution architecture that can scale KGs.
● Evaluation: Automated QA testing; customer feedback; meeting the requirements of the TPOC.

Example Project Timeline and Milestones

Months 1-2: Requirements gathering; updated architecture and design documents; dataset proposal and data gathering
Months 2-9: Implement the blockchain off-chain file approach, design the KG-based permissioning method for queries, and implement the distributed KG architecture
Months 10-12: QA testing and customer feedback
Months 13-17: Full-scale development and deployment of the final system, along with comprehensive user training and performance evaluation
Months 18-20: Final QA testing, demonstration of system capabilities, and project review

PHASE III DUAL USE APPLICATIONS: Phase III will take the capabilities designed, developed and tested in Phase II and further align the technology and approach with the goals of Earth 616. The resulting output should be designed to slot into Earth 616 and help that effort transition into its own program. As a result, the Phase III effort will greatly extend the existing Earth 616 effort surrounding the KC-46 platform, and will have access to the Earth 616 framework for integration. The overarching goal is to connect data from suppliers, internal data systems and public data sources, such as news and social media. Such a data fabric layer needs to be scalable to automatically convert, cleanse and aggregate data across systems using Graph Databases to connect databases seamlessly.
The wider Phase III effort will increase the confidence in data, enabling better decision making and supply chain readiness with a Technology Readiness Level (TRL) of at least 7 for transitioning to a program. This effort will feed into the wider vision to develop solutions consistent with Executive Orders, NDAA, OSD, and DoD guidance to support the Digital Defense and Resiliency of the Supply Chain. Specifically, the Phase III effort will address the following goals:

Maturing the Technology: This effort should start at TRL 6 and undergo final refinements and validation in operational environments. The focus will be on ensuring that the system can scale to real-world data ingestion and querying and then verifying that it meets all operational requirements without significant modifications.

Integration with Earth 616: The Strategic Funding Increase (STRATFI) effort, called Earth 616, is an existing effort designed to support the DIB and enhance the operational readiness of platforms including the KC-46. The plugins and tools developed should be easy to integrate so that the required scalability of Earth 616 can be achieved seamlessly, in order to support the critical supply chain visibility and predictive analytics capabilities.

Transition: After iterative improvements and integration efforts, the Phase III outcome should reach TRL 9, and the project will transition to the Air Force Futures Command’s Integrated Capabilities Command (ICC). The ICC will oversee its full-scale deployment and integration across the Air Force Materiel Command (AFMC) and potentially other Department of Air Force platforms, ensuring the widest possible use of this technology to enhance operational readiness.
Scalability: Phase III will further focus on operational deployment across multiple Air Force bases and platforms, and across contributors from the DIB, and the technology will be required to be scalable to meet operational needs and environments.

Government Approvals and Regulations: The Phase III effort will be required to meet government approvals and regulations to ensure compliance with federal and military standards, including securing certifications for cybersecurity, data handling, and privacy, critical for operational deployment within military networks.

Additional DAF Customers: Beyond the initial KC-46 integration within Earth 616 STRATFI, there are multiple opportunities to expand the technology’s application to other Air Force Materiel Command platforms, including Foreign Military Sales (FMS). This expansion will leverage the plugins for scaling the KG infrastructure developed with this STTR to enhance logistical support and operational planning.

REFERENCES:

  1. H. Feng and M. Huang, "An Approach to Converting Relational Database to Graph Database: from MySQL to Neo4j," 2022 IEEE 2nd International Conference on Power, Electronics and Computer Applications (ICPECA), Shenyang, China, 2022, pp. 674-680, doi: 10.1109/ICPECA53709.2022.9719151
  2. Efficient Authorization of Graph-database Queries in an Attribute-supporting ReBAC Model, https://doi.org/10.1145/340102
  3. Scaling Neo4j, https://neo4j.com/product/neo4j-graph-database/scalability/

KEYWORDS: Knowledge Graphs; Scalability; Distributed Graphs; Trusted AI; Data Integrity; Data Security; Blockchain; Distributed Ledger Technology