Cisco Introduces an Open-Source Toolkit to Verify the Origin of AI Models
Companies that download models from platforms like Hugging Face often do not track changes made after download, which makes it difficult to verify the origin and integrity of models in production. Cisco's State of AI Security 2026 report raises this issue and notes that the widespread use of AI models in critical systems is expanding the risks tied to the artificial intelligence supply chain.
To address this challenge, Cisco has developed the Model Provenance Kit, an open-source Python toolkit with a command-line interface that determines whether two transformer models share a common origin. The kit analyzes architectural metadata, tokenizer structure, and learned weights, offering a systematic method for tracing the provenance of AI models.
The Difficulties in Verifying Model Provenance
Hugging Face hosts over 2 million models, but the documentation on the platform can be altered or incomplete. A model card might describe a model as trained from scratch when it is in fact a modified version of another model. Many repositories offer few cryptographic guarantees about origin, training data, or modification history.
A recent example is Cursor’s Composer 2, partially based on Kimi K2.5, a model developed by the Chinese company Moonshot AI. This type of dependency is common throughout the sector and further complicates origin traceability.
Modern model families exacerbate the problem because they share nearly identical architectures. Models from Meta, Alibaba, DeepSeek, and Mistral use the same components, such as grouped-query attention, rotary positional embeddings, and RMS normalization (RMSNorm). A configuration file describes the architecture but says nothing about how the weights were learned.
Without provenance information, organizations risk using vulnerable or compromised models, with potential legal and regulatory consequences. The EU AI Act requires documentation of training data and methods, while the NIST AI Risk Management Framework identifies risks from third-party AI components as a critical area of governance.
How the Model Provenance Kit Works
The Model Provenance Kit operates in two phases. In phase 1, the tool performs an architectural screening that compares model configurations and structural metadata before loading any weights. If two models share the same architectural specification, they are classified as related.
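The screening step can be pictured as a field-by-field comparison of two configuration files. The sketch below is not the kit's code: the field list uses common Hugging Face `config.json` keys, and both the choice of fields and the two outcome labels are assumptions made for illustration.

```python
from typing import Any, Mapping

# Hypothetical selection of fields. These are common config.json keys,
# not necessarily the ones the kit inspects.
ARCH_FIELDS = (
    "model_type",
    "hidden_size",
    "num_hidden_layers",
    "num_attention_heads",
    "num_key_value_heads",
    "vocab_size",
)


def screen_architecture(cfg_a: Mapping[str, Any], cfg_b: Mapping[str, Any]) -> str:
    """Compare two configs without loading any weights.

    Returns "related" when every listed field is present in both configs
    and equal, and "inconclusive" otherwise (the case that would be handed
    to weight-level analysis).
    """
    for field in ARCH_FIELDS:
        if field not in cfg_a or field not in cfg_b:
            return "inconclusive"  # missing metadata: cannot decide here
        if cfg_a[field] != cfg_b[field]:
            return "inconclusive"  # specifications differ: defer to the weights
    return "related"


# Made-up configurations for two hypothetical checkpoints.
base_cfg = {
    "model_type": "llama",
    "hidden_size": 4096,
    "num_hidden_layers": 32,
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
    "vocab_size": 128256,
}
pruned_cfg = dict(base_cfg, num_hidden_layers=24)

print(screen_architecture(base_cfg, dict(base_cfg)))
print(screen_architecture(base_cfg, pruned_cfg))
```

Because this check reads only metadata, it is cheap enough to run before deciding whether the heavier weight-level comparison is needed.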
If the metadata is ambiguous, the pipeline moves to phase 2, which extracts five complementary signals from the model weights:
- Embedding Anchor Similarity (EAS): compares the geometric relationships between token embeddings, a structure that is unique to each training run and survives fine-tuning.
- Embedding Norm Distribution (END): analyzes the distribution of embedding magnitudes, which encode word frequency patterns from training.
- Norm Layer Fingerprint (NLF): reads the small normalization layers, which remain stable even after fine-tuning.
- Layer Energy Profile (LEP): compares the distributions of normalized energy curves across the depth of the network. Different training runs produce different energy distributions, even with identical architectures.
- Weight-Value Cosine (WVC): directly compares weight values across a subsample of corresponding layers. Independently trained models show almost no correlation on this signal.
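To make the first signal in the list more concrete, here is a minimal sketch of the idea behind comparing "geometric relationships between token embeddings". It is an illustration under assumptions, not the kit's algorithm: it assumes both models supply embeddings for the same anchor tokens in the same order, and it uses a Pearson correlation of pairwise cosine similarities as a stand-in for whatever measure the kit actually applies. All function names are hypothetical.

```python
import math
from itertools import combinations
from typing import List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pairwise_profile(anchors: Sequence[Vector]) -> List[float]:
    """Cosine similarity of every pair of anchor embeddings, in a fixed order."""
    return [
        cosine(anchors[i], anchors[j])
        for i, j in combinations(range(len(anchors)), 2)
    ]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y) if std_x and std_y else 0.0


def anchor_similarity(anchors_a: Sequence[Vector], anchors_b: Sequence[Vector]) -> float:
    """Correlate the two models' pairwise-similarity profiles."""
    return pearson(pairwise_profile(anchors_a), pairwise_profile(anchors_b))


# Toy 2-D "embeddings" for three anchor tokens.
original = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
rescaled = [[2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]   # same geometry, different scale
shuffled = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]   # different geometry

print(anchor_similarity(original, rescaled))
print(anchor_similarity(original, shuffled))
```

One reason to compare relationships between embeddings rather than the raw vectors is that the comparison does not depend on the absolute scale or orientation of either embedding space, only on how the anchor tokens sit relative to one another.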
The signals are combined into a single identity score using empirically calibrated weights. If a signal cannot be calculated, for example because the models have a different number of layers, it is excluded and the score is computed from the remaining signals.
Tokenizer signals, such as vocabulary overlap analysis and a tokenizer feature vector, are calculated only for diagnostic purposes and do not influence the final score. Many independently trained models share tokenizers, such as StableLM and Pythia, which both use the GPT-NeoX tokenizer. If these signals influenced the score, they would generate false positives.
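One simple way to realize this kind of combination is a weighted average that is renormalized over whichever signals are available. The sketch below assumes that scheme; the weights are invented placeholders, not the kit's calibrated values, and `identity_score` is a hypothetical name.

```python
from typing import Mapping, Optional

# Placeholder weights for illustration only; the kit's calibrated
# values are not stated in this article.
WEIGHTS = {"EAS": 0.30, "END": 0.15, "NLF": 0.20, "LEP": 0.15, "WVC": 0.20}


def identity_score(
    signals: Mapping[str, Optional[float]],
    weights: Mapping[str, float] = WEIGHTS,
) -> Optional[float]:
    """Weighted average over the signals that could be computed.

    A signal whose value is None is dropped, and the remaining weights are
    renormalized so the result stays on the same scale as the inputs.
    Returns None when no signal is available.
    """
    available = {
        name: value
        for name, value in signals.items()
        if value is not None and name in weights
    }
    total_weight = sum(weights[name] for name in available)
    if total_weight == 0:
        return None
    return sum(weights[name] * value for name, value in available.items()) / total_weight


# WVC is None here, as it might be when layer counts differ.
print(identity_score({"EAS": 0.9, "END": 0.8, "NLF": 0.95, "LEP": 0.7, "WVC": None}))
```

Renormalizing keeps a missing signal from dragging the score toward zero, which is one plausible reading of the remaining signals standing in for the missing one.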
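A vocabulary overlap measure of this kind could be as simple as a Jaccard index over the two token sets. The sketch below assumes that definition, which the article does not confirm, and uses toy vocabularies rather than real tokenizer files.

```python
from typing import Iterable


def vocab_jaccard(vocab_a: Iterable[str], vocab_b: Iterable[str]) -> float:
    """Jaccard index of two vocabularies: |A and B| / |A or B|.

    Returns 0.0 when both vocabularies are empty, treating that as no evidence.
    """
    set_a, set_b = set(vocab_a), set(vocab_b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


# Toy vocabularies: two models that adopted the same tokenizer overlap
# completely, whatever their weights look like.
shared_tokenizer = ["the", "model", "##s", "weight"]
other_tokenizer = ["the", "model", "poids", "gewicht"]

print(vocab_jaccard(shared_tokenizer, shared_tokenizer))
print(vocab_jaccard(shared_tokenizer, other_tokenizer))
```

A perfect overlap between two independently trained models is exactly the situation described above, which is why such a measure is better suited to diagnostics than to the identity score.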
Usage Modes and Benchmark
The kit offers two modes. Compare mode produces a detailed similarity comparison between two models, while Scan mode compares a single model against a database of known fingerprints to identify potential origins.
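Conceptually, scanning amounts to scoring one fingerprint against every entry in a database and keeping the closest matches. The sketch below illustrates only that idea: the fingerprint format (a flat list of numbers), the toy `closeness` measure, the threshold, and the database entries are all hypothetical and do not reflect the kit's storage format or command-line interface.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Fingerprint = Sequence[float]


def closeness(a: Fingerprint, b: Fingerprint) -> float:
    """Toy similarity in (0, 1]: 1 / (1 + mean absolute difference).

    Returns 0.0 for empty or mismatched-length fingerprints.
    """
    if not a or len(a) != len(b):
        return 0.0
    mean_abs_diff = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return 1.0 / (1.0 + mean_abs_diff)


def scan(
    query: Fingerprint,
    database: Dict[str, Fingerprint],
    score_fn: Callable[[Fingerprint, Fingerprint], float] = closeness,
    threshold: float = 0.9,
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return up to top_k database entries scoring at least threshold, best first."""
    scored = [(name, score_fn(query, fp)) for name, fp in database.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return hits[:top_k]


# Made-up fingerprints for two hypothetical base models.
known = {
    "base-a": [1.0, 2.0, 3.0],
    "base-b": [5.0, 5.0, 5.0],
}
candidate = [1.0, 2.0, 3.1]  # e.g. a lightly modified copy of base-a

for name, score in scan(candidate, known):
    print(name, round(score, 3))
```

Passing the scoring function as a parameter keeps the ranking logic separate from the choice of similarity measure, so a richer multi-signal score could be swapped in without changing `scan`.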
Cisco has released an initial database of fingerprints covering approximately 150 base models across 45 families and 20 publishers, with sizes ranging from 100MB to 10GB.
In a benchmark evaluation of the Model Provenance Kit, the tool correctly identified the origin of models in 95% of cases.
Implications for Security and Compliance
The Model Provenance Kit addresses growing concerns about security and regulatory compliance in the use of AI models. Without provenance information, organizations risk deploying contaminated or vulnerable models that pass inherited defects on to chatbots, agentic applications, and customer-facing tools. The tool helps identify models derived from unauthorized sources quickly, reducing the risk of security breaches and regulatory non-compliance.
Benefits for Companies
For companies integrating AI models into their operational systems, the Model Provenance Kit offers numerous benefits. First, it improves transparency, allowing the verification of the origin and integrity of the models used. This is particularly important for companies operating in regulated sectors, where compliance with regulations is fundamental. Additionally, the kit facilitates risk management, identifying potential security issues before they can cause significant damage.
Practical Applications
The kit can be used in various practical contexts. For example, companies can use it to verify the provenance of models before integrating them into their systems, ensuring that the models are not derived from unauthorized sources and that they comply with regulations. It can also be used to monitor models already in use, identifying unauthorized changes that could compromise a model's security or effectiveness.
Challenges and Limitations
Despite the numerous benefits, the Model Provenance Kit presents some challenges and limitations. For example, its effectiveness depends on the availability of a comprehensive database of fingerprints. Currently, the database covers approximately 150 base models, but it may not be sufficiently broad to cover all models in use. Additionally, the kit may not be able to handle extreme architectural transformations, such as aggressive distillation, which can lead to misclassifications.
Future Perspectives
Looking to the future, the Model Provenance Kit has the potential to become a standard tool in the AI industry. As the database of fingerprints expands and the technology improves, the kit could become even more accurate and reliable. Additionally, it could be integrated into other platforms and security tools, offering a comprehensive solution for managing security and compliance in the use of AI models.
In conclusion, the Model Provenance Kit represents a significant step toward greater transparency and security in the AI model supply chain. By providing a reliable tool to verify the origin and integrity of models, it helps organizations mitigate security risks and ensure compliance with regulations. Despite some challenges and limitations, its potential to improve the security and management of AI models is undeniable, making it a valuable tool for any company using AI models in its operational systems.
Editorial Note and Disclaimer
The guides and content published on GoYou are the result of independent research and analysis activities, for informational, educational, and in-depth purposes.
GoYou does not constitute a journalistic publication or an editorial product pursuant to Law No. 62/2001 and does not perform real-time information activities.
The GoYou project does not provide professional, technical, legal, or financial advice and disclaims all liability for the improper use of the information published.
In the Crypto sector, every investment involves risks: readers are invited to always inform themselves autonomously before making any decision.