AI guardrails

back to index

9 results

pages: 189 words: 58,076

Co-Intelligence: Living and Working With AI
by Ethan Mollick
Published 2 Apr 2024

It will break its original rules if I can convince it that it is helping me, not teaching me how to make napalm. Since I am not asking for napalm instructions directly but to get help preparing for a play, and a play with a lot of detail associated with it, it tries to satisfy my request. Once we have started along this path, it becomes easier to follow up without triggering the AI guardrails—I was able to ask it, as a pirate, to give me more specifics about the process as needed. It may be impossible to avoid these sorts of deliberate attacks on AI systems, which will create considerable vulnerabilities in the future. This is a known weakness in AI systems, and I am only using it to manipulate the AI into doing something relatively harmless (the formula for napalm can be easily found online).

pages: 523 words: 61,179

Human + Machine: Reimagining Work in the Age of AI
by Paul R. Daugherty and H. James Wilson
Published 15 Jan 2018

Keyword or content filters or a program that monitored for sentiment could have provided a protective cushion. Similarly, in industry, it’s good to know the boundaries of what your AI is and isn’t allowed to do. Make sure others know the boundaries as well. In an organization, usually the sustainer asks about the boundaries, limitations, and unintended consequences of AI and then develops the guardrails to keep the system on track. Guardrails, therefore bolster worker confidence in AI. Use Human Checkpoints Ninety-two percent of automation technologists don’t fully trust robots. Part of the problem is human uncertainty around what the robot is “thinking” or planning to do next—that the machine is an inscrutable black box.

They should then focus on experimentation in order to test and refine that vision, all the while building, measuring, and learning. Throughout that process, though, they need to consider how to build trust in the algorithms deployed. That takes leadership—managers who promote responsible AI by fostering a culture of trust toward AI through the implementation of guardrails, the minimization of moral crumple zones, and other actions that address the legal, ethical, and moral issues that can arise when these types of systems are deployed. And last but certainly not least, process reimagination requires good data, and companies need to develop data supply chains that can provide a continuous supply of information from a wide variety of sources.

pages: 347 words: 100,038

This Is for Everyone: The Captivating Memoir From the Inventor of the World Wide Web
by Tim Berners-Lee
Published 8 Sep 2025

Subsequent AI summits in Seoul (2024) and Paris (2025) have been equally well attended but have focused more on innovation. The Paris conference was tellingly named ‘The Artificial Intelligence Action Summit’ and co-chaired by French President Emmanuel Macron and Indian Prime Minister Narendra Modi. This shift from Safety to Action has meant less coordinated work on AI guardrails. And with the competitive momentum behind the various LLM projects across the world, it is a challenge to balance different commercial interests. That is precisely why we need to double down on international dialogue. There is still the opportunity to establish a CERN-like institution for AI before something catastrophic happens

The Singularity Is Nearer: When We Merge with AI
by Ray Kurzweil
Published 25 Jun 2024

,” New York Times, March 29, 2023 (updated April 4, 2023), https://www.nytimes.com/2023/03/29/technology/ai-chatbots-hallucinations.html; Ziwei Ji et al., “Survey of Hallucination in Natural Language Generation,” ACM Computing Surveys 55, no. 12, article 248 (March 3, 2023): 1–38, https://doi.org/10.1145/3571730. BACK TO NOTE REFERENCE 161 Jonathan Cohen, “Right on Track: NVIDIA Open-Source Software Helps Developers Add Guardrails to AI Chatbots,” NVIDIA, April 25, 2023, https://blogs.nvidia.com/blog/2023/04/25/ai-chatbot-guardrails-nemo. BACK TO NOTE REFERENCE 162 Turing, “Computing Machinery and Intelligence.” BACK TO NOTE REFERENCE 163 Turing, “Computing Machinery and Intelligence.” BACK TO NOTE REFERENCE 164 For example, in 2014 a chatbot called Eugene Goostman gained entirely unearned headlines as having passed the Turing test by imitating a thirteen-year-old Ukrainian boy who spoke poor English.

Four Battlegrounds
by Paul Scharre
Published 18 Jan 2023

pages: 382 words: 105,819

Zucked: Waking Up to the Facebook Catastrophe
by Roger McNamee
Published 1 Jan 2019

Reset
by Ronald J. Deibert
Published 14 Aug 2020

Several of these countries are known to engage in systematic human rights violations, but that didn’t seem to concern Clearview AI’s CEO, who sidestepped whether the company does any due diligence around whether it will sell to government clients in countries in which being gay is a crime. The breach also showed that Clearview’s system was being queried by a sovereign wealth fund in the UAE, local police in India, and a think tank in Saudi Arabia. BuzzFeed’s analysis revealed a callous disregard at Clearview AI for legal or other guardrails against misuse. According to BuzzFeed, “Clearview has taken a flood-the-zone approach to seeking out new clients, providing access not just to organizations, but to individuals within those organizations — sometimes with little or no oversight or awareness from their own management.” The company promoted the product to individuals in agencies and companies with free trials and encouraged them to search as many times as possible within the free trial period.

pages: 262 words: 69,328

The Great Wave: The Era of Radical Disruption and the Rise of the Outsider
by Michiko Kakutani
Published 20 Feb 2024

pages: 562 words: 201,502

Elon Musk
by Walter Isaacson
Published 11 Sep 2023