SeedPrints: Fingerprints Can Even Tell Which Seed Your Large Language Model Was Trained From
2025-10-01
Summary
The article introduces SeedPrints, a method for fingerprinting large language models (LLMs) based on biases left by random initialization, which act as seed-specific identifiers present from the moment a model is created. Unlike traditional fingerprinting methods that rely on properties emerging during training, SeedPrints can distinguish models from the outset and remains effective across all training stages, even under domain shifts or parameter changes.
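To make the core intuition concrete, here is a minimal toy sketch (not the paper's actual statistical test): a fixed random seed deterministically fixes a model's initial weights, so a simple seed-dependent signature such as the sign pattern of those weights matches perfectly for the same seed and agrees only at chance level for a different one. The function names (`init_weights`, `fingerprint`, `similarity`) and the Gaussian toy initialization are illustrative assumptions, not from the paper.

```python
import random

def init_weights(seed, n=256):
    """Deterministically initialize a toy weight vector from a seed
    (a stand-in for an LLM's random parameter initialization)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 0.02) for _ in range(n)]

def fingerprint(weights):
    """Toy fingerprint: the sign pattern of the initial weights.
    Seed-dependent biases like this exist before any training."""
    return tuple(w > 0 for w in weights)

def similarity(fp_a, fp_b):
    """Fraction of positions where two fingerprints agree."""
    return sum(a == b for a, b in zip(fp_a, fp_b)) / len(fp_a)

fp1 = fingerprint(init_weights(seed=42))
fp2 = fingerprint(init_weights(seed=42))   # same seed
fp3 = fingerprint(init_weights(seed=7))    # different seed

print(similarity(fp1, fp2))  # 1.0: same seed, identical fingerprint
print(similarity(fp1, fp3))  # roughly 0.5: different seed agrees only by chance
```

The actual SeedPrints method detects far subtler statistical traces that survive full training; this sketch only shows why a seed can, in principle, act as a birthmark.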
Why This Matters
This research matters because it strengthens the ability to verify and attribute LLMs, offering reliable lineage tracking from a model's initialization through its entire lifecycle. That capability is crucial for model ownership verification, intellectual property protection, and ethical AI usage, particularly as more models are trained and deployed in diverse environments.
How You Can Use This Info
Professionals involved in AI model development can use SeedPrints to ensure their models are uniquely identifiable and protected against unauthorized use. Organizations can implement this method as part of their compliance and auditing processes to maintain control over model distribution and use. Additionally, it can be a valuable tool for researchers focusing on AI ethics and data provenance.