Artificial intelligence (AI) is rapidly transforming the world, and the field of non-clinical Computed Tomography (CT) is no exception. With AI, we can analyze and interpret CT data in ways never before possible. This technology has the potential to revolutionize how we use CT for various purposes, from archaeology and paleontology to materials science and security.
The Findability NoCTURNals wanted to have a little holiday fun and mischievously prompted ChatGPT, Google’s “Gemini”, and Microsoft’s “Copilot” to create a Naughty vs. Nice list of persistent identifiers (PIDs) that are not DOI-centric, so that you can see firsthand how different AI companions produce varied outputs from the same prompt. How an AI companion is trained shapes its ability to provide meaningful interactions and personalized responses, and different companions may respond differently to the same prompt because of their training data, algorithms, and intended personality.
These “Naughty vs. Nice” lists aim to provide some insight into the world of non-DOI PIDs, where some systems offer universal, reliable resolution and persistence, while others are niche, lacking in infrastructure, or simply not as widely adopted or reliable. Keep in mind that the best PID for your needs will depend on a variety of factors, including the type of resource you’re identifying, the level of persistence required, and the technical expertise of your organization.
Below are three holiday-themed “Naughty vs. Nice” lists of persistent identifiers (PIDs) that are non-DOI centric. The idea is to distinguish between PIDs that are “nice” (well-established, broadly used, or helpful) and “naughty” (less reliable, problematic, or limited in scope).
ChatGPT results:
💀 Naughty List 💀
These are PIDs that either lack widespread adoption, have inherent limitations, or create issues for long-term persistence.
- URN (Uniform Resource Name)
  - A type of URI (Uniform Resource Identifier) that can theoretically provide persistence but requires proper infrastructure and management, making it unreliable in many contexts.
  - Issue: Many implementations are not well-maintained or linked to resolution services.
  - Example: urn:example:12345
- UUIDs Without a Resolution System
  - While UUIDs themselves are globally unique, without an associated resolution mechanism or system, they can’t be reliably used to retrieve or access a resource.
  - Issue: Lacks resolution mechanisms, making them non-persistent in the practical sense.
  - Example: e11f8b5d-5ac9-4cfc-a5b6-d5b8e89fbe1d (without a reliable resolver system)
- IATI Identifier
  - A persistent identifier used in the International Aid Transparency Initiative, but only applicable in a niche context.
  - Issue: Primarily used within the aid and development sector, limiting its application.
  - Example: iati.org/12345
- Git Commit Hashes
  - While a form of persistent identifier for specific commits in version-controlled repositories, they are not globally unique across different repositories and are tied to the software’s internal structure.
  - Issue: Limited to code and doesn’t provide a uniform method of resolution across systems.
  - Example: abcdef1234567890
- ISBN (International Standard Book Number)
  - Though useful for identifying books, it is not persistent for other types of content (e.g., datasets) and is increasingly considered outdated in some contexts.
  - Issue: ISBNs are limited to books and don’t support other types of research outputs.
  - Example: 978-3-16-148410-0
- DOI-like System with Limited Scope
  - A DOI-like identifier used in certain small-scale systems or repositories (not managed by established institutions), which may lack robust governance or resolution.
  - Issue: May lack the universal reach or resolution infrastructure seen in the DOI system.
  - Example: xyz.org/12345 (non-standard)
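The UUID entry above comes down to the difference between being unique and being resolvable. Here is a minimal Python sketch of that distinction, assuming a hypothetical in-memory registry. The `registry` dict, the `mint` and `resolve` helpers, and the example.com URL are our own illustration and not part of any real PID service.

```python
import uuid
from typing import Dict, Optional

# Hypothetical registry: maps an identifier to where the resource currently lives.
registry: Dict[str, str] = {}


def mint(location: str) -> str:
    """Create a new random UUID and record the resource's current location."""
    pid = str(uuid.uuid4())
    registry[pid] = location
    return pid


def resolve(pid: str) -> Optional[str]:
    """Return the recorded location, or None if this UUID was never registered."""
    return registry.get(pid)


# A UUID that was registered can be looked up...
registered = mint("https://www.example.com/scans/specimen-001")
# ...while a bare UUID is just as unique, but nothing knows where it points.
unregistered = str(uuid.uuid4())

print(resolve(registered))
print(resolve(unregistered))
```

A real resolution service would also have to keep the registry online and update locations as resources move, which is where most of the long-term effort goes.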
🎅 Nice List 🎅
These are the PIDs that are widely adopted, trusted, and beneficial for ensuring persistence and access to data.
- ARK (Archival Resource Key)
  - A persistent identifier system primarily used for digital objects and resources, including data, images, and other content.
  - Use case: High use in archives, libraries, and digital repositories.
  - Example: ark:/12345/abcd
- Handle System
  - Managed by the Corporation for National Research Initiatives (CNRI), it’s the technology behind DOI but can be used outside the DOI context to identify digital objects, datasets, and more.
  - Use case: Common in research and higher education for identifying anything from software to documents.
  - Example: hdl.handle.net/1234/abcd
- UUID (Universally Unique Identifier)
  - A globally unique identifier used in many systems for objects that need a consistent ID across various databases and applications.
  - Use case: Widely used in software development and databases.
  - Example: e11f8b5d-5ac9-4cfc-a5b6-d5b8e89fbe1d
- LSID (Life Science Identifier)
  - A global unique identifier for life science data, allowing for the linking of data to a universal identity.
  - Use case: Particularly useful in the life sciences, such as genetics and biodiversity databases.
  - Example: urn:lsid:bioinformatics.org:sequence:1234
- ORCID (Open Researcher and Contributor ID)
  - A persistent identifier that uniquely identifies authors and researchers, linking their works and contributions.
  - Use case: Used to ensure proper attribution in academic publishing.
  - Example: https://orcid.org/0000-0002-1825-0097
- EPI (Entity Persistence Identifier)
  - A flexible identifier used to point to entities in any domain.
  - Use case: Used in various fields, including research, repositories, and even non-scientific contexts like public records.
  - Example: https://www.example.com/epid/1234567890
- PURL (Persistent Uniform Resource Locator)
  - A type of URL that can be redirected and remains functional even if the target location changes, providing long-term access to digital objects.
  - Use case: Frequently used by digital libraries and archives.
  - Example: http://purl.org/abc/12345
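Several of the identifiers above only become clickable once they are placed behind a resolver address. Here is a minimal Python sketch of that step, assuming the public resolvers n2t.net (for ARKs), hdl.handle.net (for Handles), and orcid.org. The `RESOLVERS` table and the `to_url` helper are our own illustration, and many institutions run their own resolvers instead.

```python
# Assumed public resolver prefixes; an institution may use different ones.
RESOLVERS = {
    "ark": "https://n2t.net/",
    "hdl": "https://hdl.handle.net/",
    "orcid": "https://orcid.org/",
}


def to_url(scheme: str, identifier: str) -> str:
    """Prefix an identifier with the resolver assumed for its scheme."""
    if scheme not in RESOLVERS:
        raise ValueError(f"no resolver known for scheme {scheme!r}")
    return RESOLVERS[scheme] + identifier


# The ARK and Handle values are the placeholder examples from the list above,
# so the resulting URLs are not expected to lead to real objects.
print(to_url("ark", "ark:/12345/abcd"))
print(to_url("hdl", "1234/abcd"))
print(to_url("orcid", "0000-0002-1825-0097"))
```

The sketch only builds the URL and makes no network request; whether a link actually resolves depends on the organization that maintains the identifier.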
Gemini results:
💀 Naughty List 💀
- ARKs: While ARKs can be persistent, their resolution and long-term maintenance can be unreliable if not managed properly.
- Handles: Handles can be a bit tricky to implement and require technical expertise. Plus, their resolution can sometimes be slow.
- URLs: While URLs can be persistent, they’re often fragile and can change over time, especially for websites that undergo updates or redesigns.
🎅 Nice List 🎅
- DOIs: While technically not non-DOI, DOIs are the gold standard for persistent identifiers and are widely recognized and supported.
- URIs: Well-structured URIs, particularly those using protocols like HTTP or HTTPS, can be highly persistent and reliable.
- PURLs: PURLs are easy to use and can be customized to point to specific resources. They’re a great option for individuals and small organizations.
- LSIDs: LSIDs are a good choice for scientific data and can be used to identify specific data objects within larger datasets.
Copilot results:
💀 Naughty List 💀
- Proprietary Database IDs:
  - Description: Identifiers used only within specific proprietary systems.
  - Why Naughty: Lack of interoperability and limited accessibility outside their native platforms.
- Temporary Session IDs:
  - Description: IDs created for temporary use during online sessions.
  - Why Naughty: Lack of persistence and utility beyond the immediate session.
- URL Shorteners (e.g., bit.ly):
  - Description: Tools that create shortened versions of URLs.
  - Why Naughty: Prone to link rot and lack persistence, making them unreliable for long-term citation.
- Internal Project IDs:
  - Description: Identifiers used solely within internal systems of organizations.
  - Why Naughty: No standardization and limited usefulness outside the organization.
- Ad Hoc Identifiers:
  - Description: Randomly assigned identifiers without any consistent structure.
  - Why Naughty: Lack of standardization and difficulty in tracking or referencing consistently.
🎅 Nice List 🎅
- ORCID (Open Researcher and Contributor ID):
  - Description: An alphanumeric code to uniquely identify scientific and other academic authors.
  - Why Nice: Ensures researchers receive proper credit for their work and helps with accurate author identification.
- ISBN (International Standard Book Number):
  - Description: A unique identifier for books.
  - Why Nice: Facilitates easy tracking and purchasing of books worldwide.
- Handle System:
  - Description: A system for assigning unique identifiers to digital objects.
  - Why Nice: Ensures persistent access to digital resources, even if URLs change.
- VIAF (Virtual International Authority File):
  - Description: An international authority file for the names of persons, organizations, etc.
  - Why Nice: Helps unify multiple name variations and improves information retrieval across databases.
- ISNI (International Standard Name Identifier):
  - Description: A unique identifier for the public identities of individuals and organizations.
  - Why Nice: Assists in the disambiguation of names across various databases and supports rights management.
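Two identifiers that appear on these lists, ORCID and ISBN, end in a check character, so a typo can often be caught before anyone tries to look the identifier up. Here is a minimal Python sketch, assuming the ISO 7064 MOD 11-2 scheme for ORCID iDs and the alternating 1 and 3 weights for ISBN-13. The function names are ours and do not come from any library.

```python
def orcid_is_valid(orcid: str) -> bool:
    """Check the last character of an ORCID iD (assumes ISO 7064 MOD 11-2)."""
    chars = orcid.replace("-", "").upper()
    body, check = chars[:15], chars[15:]
    if len(chars) != 16 or not (body.isascii() and body.isdigit()):
        return False
    total = 0
    for ch in body:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return check == expected


def isbn13_is_valid(isbn: str) -> bool:
    """Check an ISBN-13: digits weighted 1, 3, 1, 3, ... must sum to a multiple of 10."""
    digits = isbn.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(digits))
    return total % 10 == 0


# The example identifiers used in the lists above.
print(orcid_is_valid("0000-0002-1825-0097"))
print(isbn13_is_valid("978-3-16-148410-0"))
```

A passing check only means the identifier is well formed; it says nothing about whether the identifier was ever assigned or what it points to.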
We hope these lists bring some holiday cheer and helpful insights into which PIDs to consider and why some PIDs can be problematic to use. Stay tuned for more holiday cheer this week brought to you by the Findability NoCTURNals.