백악관의 A는 어떤가요.I. 약속이 쌓이는가?

[NYT]How Do the White House’s A.I. Commitments Stack Up?

이번 주, 백악관은 7명의 주요 A로부터 “자발적 약속”을 확보했다고 발표했습니다.I. 인공지능이 야기하는 위험을 관리하기 위한 기업.

회사 구하기 – 아마존, 인류학, 구글, 변곡, 메타, 마이크로소프트 및 오픈인공지능 – 무엇이든 동의하는 것은 한 걸음 더 나아가는 것입니다. 그들은 그들이 A에 접근하는 방식에 있어서 미묘하지만 중요한 차이를 가진 앙숙들을 포함합니다.I. 연구 개발.

예를 들어, 메타는 그것의 A를 얻기를 매우 열망합니다.I. 개발자의 손에 모델링을 적용하여 많은 개발자가 사용할 수 있도록 코드를 공개합니다. Antheric과 같은 다른 연구소들은 기술을 제한적인 방법으로 출시하면서 보다 신중한 접근법을 취했습니다.

하지만 이러한 약속은 실제로 무엇을 의미합니까? 그리고 그들은 A가 어떻게 되는지에 대해 많은 것을 바꿀 것 같습니다.I. 기업들은 법의 지원을 받지 않는 상황에서 운영합니까?

A의 잠재적인 지분을 고려할 때.I. 규정, 세부 사항이 중요합니다. 그럼 여기서 합의된 내용을 자세히 살펴보고 잠재적인 영향을 파악해 보겠습니다.

약속 1: 회사는 A의 내부 및 외부 보안 테스트에 전념합니다.I. 출시 전 시스템.

이 A들은 각각.I. 기업들은 이미 출시되기 전에 모델에 대한 보안 테스트(흔히 “레드 팀”)를 수행합니다. 한 차원에서 보면, 이것은 새로운 약속이 아닙니다. 그리고 그것은 모호한 약속입니다. 어떤 종류의 테스트가 필요한지, 누가 테스트를 할 것인지에 대한 많은 세부 정보가 제공되지 않습니다.

백악관은 약속과 함께 제출한 성명에서 A에 대한 테스트만을 언급했습니다.I. 모델은 “부분적으로 독립적인 전문가에 의해 수행될 것”이며 A에 집중합니다.I. “생물 보안 및 사이버 보안과 같은 위험 및 더 넓은 사회적 영향”

A를 받는 것은 좋은 생각입니다.I. 기업은 이러한 종류의 테스트를 계속 수행하고 테스트 과정의 투명성을 장려하기 위해 공개적으로 약속합니다. 그리고 A에는 몇 가지 유형이 있습니다.I. 위험 – A의 위험과 같은 위험.I. 모델은 바이오 무기를 개발하는 데 사용될 수 있습니다. 정부와 군 관계자들이 평가하는 것보다 더 적합할 것입니다.

저는 A를 보고 싶습니다.I. 업계는 Alignment Research Center가 Open에 의해 사전 공개된 모델에 대해 수행하는 “자율 복제” 테스트와 같은 안전 테스트의 표준 배터리에 동의합니다인공지능과 인류학. 저는 또한 연방정부가 비용이 많이 들고 상당한 기술적 전문성을 가진 엔지니어를 필요로 하는 이런 종류의 테스트에 자금을 지원하는 것을 보고 싶습니다. 현재, 많은 안전 테스트는 회사가 자금을 지원하고 감독하는데, 이것은 명백한 이해 상충 문제를 제기합니다.

약속 2: 기업은 업계 전반에 걸쳐 정보를 공유하고 정부, 시민 사회 및 학계와 A를 관리하기로 약속합니다.I. 위험.

이 약속은 또한 약간 모호합니다. 이 회사들 중 몇몇은 이미 그들의 A에 대한 정보를 발표하고 있습니다.I. 모델 – 일반적으로 학술 논문이나 기업 블로그 게시물에 있습니다. 오픈(Open)을 포함한 몇 가지 항목AI와 인류학은 또한 “시스템 카드”라고 불리는 문서를 발표하는데, 이 문서는 모델을 더 안전하게 만들기 위해 취한 조치를 설명합니다.

하지만 그들은 또한 안전 문제를 이유로 가끔 정보를 자제해왔습니다. 열 때AI가 최신 A를 출시했습니다.I. 모델인 GPT-4는 올해 업계 관습을 깨고 얼마나 많은 데이터에 대해 교육을 받았는지 또는 모델의 크기(“파라미터”로 알려진 메트릭)를 공개하지 않기로 결정했습니다. 그것은 경쟁과 안전에 대한 우려 때문에 이 정보를 공개하는 것을 거부했다고 말했습니다. 또한 기술 회사들이 경쟁자들로부터 멀리하기를 좋아하는 종류의 데이터이기도 합니다.

이러한 새로운 약속에 따라, A는 그렇게 될 것입니다.I. 기업들은 그런 종류의 정보를 공개해야 합니까? 그렇게 하면 A가 가속화될 위험이 있습니다.I. 군비 경쟁?

저는 백악관의 목표가 기업이 매개 변수 수를 공개하도록 강요하는 것보다는 모델이 제기하는 (또는 제기하지 않는) 위험에 대한 정보를 서로 교환하도록 장려하는 것에 있다고 의심합니다.

하지만 그런 종류의 정보 공유도 위험할 수 있습니다. 구글이 A라면.I. 팀은 출시 전 테스트에서 새로운 모델이 치명적인 바이오 무기를 엔지니어링하는 데 사용되지 않도록 방지했습니다. 구글 외부에서 해당 정보를 공유해야 합니까? 그것이 나쁜 배우들에게 어떻게 그들이 같은 일을 수행하도록 덜 보호받는 모델을 얻을 수 있는지에 대한 아이디어를 줄 위험이 있습니까?

약속 3: 회사는 독점 모델 및 미공개 모델 가중치를 보호하기 위해 사이버 보안 및 내부자 위협 안전 조치에 투자하기로 약속합니다.

이것은 매우 간단하며, A사이에서 논란의 여지가 없습니다.I. 내가 얘기했던 내부자들. “모델 가중치”는 A를 주는 수학적 명령어의 전문 용어입니다.I. 기능하는 능력을 모델링합니다. 가중치는 당신이 당신만의 버전의 ChatGPT 또는 다른 A를 만들고 싶어하는 외국 정부(또는 경쟁 회사)의 에이전트라면 훔치고 싶은 것입니다.I. 제품. 그리고 그것은 A입니다.I. 기업은 엄격한 통제를 유지하는 데 기득권을 가지고 있습니다.

이미 널리 알려진 모델 가중치 유출 문제가 있었습니다. 예를 들어, 메타의 원래 LLAMA 언어 모델의 가중치는 모델이 공개된 지 며칠 만에 4chan 및 기타 웹 사이트에서 유출되었습니다. 더 많은 유출의 위험과 다른 나라들이 미국 회사들로부터 이 기술을 훔치는 데 관심이 있을 수 있다는 점을 고려하여 A에게 요청합니다.I. 기업은 자체 보안에 더 많은 투자를 하는 것이 쉬운 방법처럼 느껴집니다.

약속 4: 회사는 타사가 자사 A의 취약점을 발견하고 보고할 수 있도록 지원하기로 약속합니다.I. 시스템.

이게 무슨 뜻인지 잘 모르겠어요. A마다.I. 회사는 모델을 출시한 후 모델의 취약성을 발견했습니다. 일반적으로 사용자가 모델로 나쁜 일을 하거나 회사가 예상하지 못한 방식으로 가드레일을 우회하려고 하기 때문입니다.

백악관의 약속은 기업들이 이러한 취약성에 대한 “강력한 보고 메커니즘”을 구축할 것을 요구하고 있지만, 그것이 무엇을 의미하는지는 명확하지 않습니다. 페이스북 및 트위터 사용자가 규칙을 위반하는 게시물을 보고할 수 있는 것과 유사한 인앱 피드백 버튼? 오픈과 같은 버그 현상금 프로그램AI는 올해 시스템의 결함을 발견한 사용자에게 보상하기 위해 시작되었습니다? 다른 거? 자세한 사항은 더 기다려야 할 것 같습니다.

약속 5: 회사는 사용자가 콘텐츠가 A인 경우를 확실히 알 수 있도록 강력한 기술 메커니즘을 개발하는 데 전념합니다.I. 워터마킹 시스템과 같은 생성된 I.

이것은 흥미로운 생각이지만 해석의 여지를 많이 남깁니다. 지금까지 AI 회사들은 사람들이 A를 보고 있는지 아닌지를 구분할 수 있는 도구를 고안하기 위해 애를 썼습니다.I. 생성된 콘텐츠. 여기에는 좋은 기술적인 이유가 있지만, 사람들이 AI가 만들어내는 일을 자신의 것으로 치부할 수 있을 때는 정말 문제입니다. (어떤 고등학교 선생님이든 물어보세요.) 그리고 현재 많은 도구들이 A를 감지할 수 있다고 홍보하고 있습니다.I. 출력은 어느 정도의 정확도로는 그렇게 할 수 없습니다.

저는 이 문제가 완전히 해결될 수 있다고 낙관하지 않습니다. 하지만 저는 회사들이 그것을 위해 노력하겠다고 약속하고 있어서 기쁩니다.

약속 6: 회사는 A를 공개적으로 보고할 것을 약속합니다.I. 시스템의 기능, 한계 및 적절하고 부적절한 사용 영역.

많은 공간이 있는 또 다른 합리적인 소리의 서약. 회사는 얼마나 자주 시스템의 기능과 제한 사항을 보고해야 합니까? 그 정보는 얼마나 상세해야 합니까? 그리고 많은 회사들이 A를 짓는 것을 감안하면.I. 시스템은 이후 자체 시스템의 기능에 놀라움을 금치 못했습니다. 사전에 시스템을 얼마나 잘 설명할 것으로 기대할 수 있습니까?

약속 7: 회사는 A의 사회적 위험에 대한 연구의 우선 순위를 정하는 데 전념합니다.I. 시스템은 유해한 편견과 차별을 피하고 사생활을 보호하는 것을 포함하여 포즈를 취할 수 있습니다.

“연구의 우선순위를 정하는 것”을 약속하는 것은 약속만큼 모호합니다. 그럼에도 불구하고, 저는 이 약속이 A에서 많은 사람들에게 잘 받아들여질 것이라고 확신합니다.I. 윤리 군중, A를 원하는 사람들.I. 기업은 A와 같이 최후의 날 시나리오에 대한 걱정보다 편견과 차별과 같은 단기적인 피해를 예방하는 것을 우선시합니다.I. 안전 담당자들은 그렇습니다.

만약 당신이 “A”의 차이 때문에 혼란스럽다면요.I. 윤리”와 “A.I. safety,” 단지 A 내에 두 개의 교전 파벌이 있다는 것을 알고 있을 뿐입니다.I. 연구 공동체는 서로가 잘못된 종류의 해를 예방하는 데 초점을 맞추고 있다고 생각합니다.

약속 8: 회사는 고급 A를 개발하고 배치하기로 약속합니다.I. 사회의 가장 큰 도전 과제를 해결하는 데 도움이 되는 시스템.

저는 많은 사람들이 고급 A라고 주장할 것이라고 생각하지 않습니다.I. 사회의 가장 큰 도전을 해결하는 데 도움을 주는 데 사용되어서는 안 됩니다. 백악관은 A를 좋아할 만한 분야 중 두 가지로 “암 예방”과 “기후 변화 완화”를 꼽았습니다.I. 기업들은 그들의 노력을 집중해야 하며, 그것은 거기에서 나의 반대를 받지 않을 것입니다.

그러나 이 목표를 다소 복잡하게 만드는 것은 A에서입니다.I. 연구, 하찮은 것으로 시작하는 것은 종종 더 심각한 영향을 미치는 것으로 밝혀집니다. 딥마인드의 알파고에 들어간 기술 중 일부는 A입니다.보드 게임 바둑을 두도록 훈련된 I. 시스템은 단백질의 3차원 구조를 예측하는 데 유용한 것으로 밝혀졌는데, 이것은 기초 과학 연구를 부흥시킨 주요한 발견입니다.

전체적으로, 백악관과 A의 거래.I. 기업은 실질적이라기보다는 상징적인 것 같습니다. 기업들이 이러한 약속을 준수하도록 보장하기 위한 이행 메커니즘은 없으며, 그 중 많은 부분이 A의 주의사항을 반영합니다.I. 회사들은 이미 인수하고 있습니다.

그래도 합리적인 첫걸음입니다. 그리고 이 규칙들을 따르는데 동의하는 것은 A를 보여줍니다.I. 기업들은 이전 기술 기업들의 실패로부터 교훈을 얻었고, 그들은 정부와 관계를 맺기를 기다렸으며, 그들이 문제에 휘말릴 때까지 말이죠. 워싱턴에서는 적어도 기술 규제와 관련된 곳에서는 일찍 나타나는 것이 비용이 듭니다.

This week, the White House announced that it had secured “voluntary commitments” from seven leading A.I. companies to manage the risks posed by artificial intelligence.

Getting the companies — Amazon, Anthropic, Google, Inflection, Meta, Microsoft and OpenAI — to agree to anything is a step forward. They include bitter rivals with subtle but important differences in the ways they’re approaching A.I. research and development.

Meta, for example, is so eager to get its A.I. models into developers’ hands that it has open-sourced many of them, putting their code out into the open for anyone to use. Other labs, such as Anthropic, have taken a more cautious approach, releasing their technology in more limited ways.

But what do these commitments actually mean? And are they likely to change much about how A.I. companies operate, given that they aren’t backed by the force of law?

Given the potential stakes of A.I. regulation, the details matter. So let’s take a closer look at what’s being agreed to here and size up the potential impact.

Commitment 1: The companies commit to internal and external security testing of their A.I. systems before their release.

Each of these A.I. companies already does security testing — what is often called “red-teaming” — of its models before they’re released. On one level, this isn’t really a new commitment. And it’s a vague promise. It doesn’t come with many details about what kind of testing is required, or who will do the testing.

In a statement accompanying the commitments, the White House said only that testing of A.I. models “will be carried out in part by independent experts” and focus on A.I. risks “such as biosecurity and cybersecurity, as well as its broader societal effects.”

It’s a good idea to get A.I. companies to publicly commit to continue doing this kind of testing, and to encourage more transparency in the testing process. And there are some types of A.I. risk — such as the danger that A.I. models could be used to develop bioweapons — that government and military officials are probably better suited than companies to evaluate.

I’d love to see the A.I. industry agree on a standard battery of safety tests, such as the “autonomous replication” tests that the Alignment Research Center conducts on prereleased models by OpenAI and Anthropic. I’d also like to see the federal government fund these kinds of tests, which can be expensive and require engineers with significant technical expertise. Right now, many safety tests are funded and overseen by the companies, which raises obvious conflict-of-interest questions.

Commitment 2: The companies commit to sharing information across the industry and with governments, civil society and academia on managing A.I. risks.

This commitment is also a bit vague. Several of these companies already publish information about their A.I. models — typically in academic papers or corporate blog posts. A few of them, including OpenAI and Anthropic, also publish documents called “system cards,” which outline the steps they’ve taken to make those models safer.

But they have also held back information on occasion, citing safety concerns. When OpenAI released its latest A.I. model, GPT-4, this year, it broke with industry customs and chose not to disclose how much data it was trained on, or how big the model was (a metric known as “parameters”). It said it declined to release this information because of concerns about competition and safety. It also happens to be the kind of data that tech companies like to keep away from competitors.

Under these new commitments, will A.I. companies be compelled to make that kind of information public? What if doing so risks accelerating the A.I. arms race?

I suspect that the White House’s goal is less about forcing companies to disclose their parameter counts and more about encouraging them to trade information with one another about the risks that their models do (or don’t) pose.

But even that kind of information-sharing can be risky. If Google’s A.I. team prevented a new model from being used to engineer a deadly bioweapon during prerelease testing, should it share that information outside Google? Would that risk giving bad actors ideas about how they might get a less guarded model to perform the same task?

Commitment 3: The companies commit to investing in cybersecurity and insider-threat safeguards to protect proprietary and unreleased model weights.

This one is pretty straightforward, and uncontroversial among the A.I. insiders I’ve talked to. “Model weights” is a technical term for the mathematical instructions that give A.I. models the ability to function. Weights are what you’d want to steal if you were an agent of a foreign government (or a rival corporation) who wanted to build your own version of ChatGPT or another A.I. product. And it’s something A.I. companies have a vested interest in keeping tightly controlled.

There have already been well-publicized issues with model weights leaking. The weights for Meta’s original LLaMA language model, for example, were leaked on 4chan and other websites just days after the model was publicly released. Given the risks of more leaks — and the interest that other nations may have in stealing this technology from U.S. companies — asking A.I. companies to invest more in their own security feels like a no-brainer.

Commitment 4: The companies commit to facilitating third-party discovery and reporting of vulnerabilities in their A.I. systems.

I’m not really sure what this means. Every A.I. company has discovered vulnerabilities in its models after releasing them, usually because users try to do bad things with the models or circumvent their guardrails (a practice known as “jailbreaking”) in ways the companies hadn’t foreseen.

The White House’s commitment calls for companies to establish a “robust reporting mechanism” for these vulnerabilities, but it’s not clear what that might mean. An in-app feedback button, similar to the ones that allow Facebook and Twitter users to report rule-violating posts? A bug bounty program, like the one OpenAI started this year to reward users who find flaws in its systems? Something else? We’ll have to wait for more details.

Commitment 5: The companies commit to developing robust technical mechanisms to ensure that users know when content is A.I. generated, such as a watermarking system.

This is an interesting idea but leaves a lot of room for interpretation. So far, A.I. companies have struggled to devise tools that allow people to tell whether or not they’re looking at A.I. generated content. There are good technical reasons for this, but it’s a real problem when people can pass off A.I.-generated work as their own. (Ask any high school teacher.) And many of the tools currently promoted as being able to detect A.I. outputs really can’t do so with any degree of accuracy.

I’m not optimistic that this problem is fully fixable. But I’m glad that companies are pledging to work on it.

Commitment 6: The companies commit to publicly reporting their A.I. systems’ capabilities, limitations, and areas of appropriate and inappropriate use.

Another sensible-sounding pledge with lots of wiggle room. How often will companies be required to report on their systems’ capabilities and limitations? How detailed will that information have to be? And given that many of the companies building A.I. systems have been surprised by their own systems’ capabilities after the fact, how well can they really be expected to describe them in advance?

Commitment 7: The companies commit to prioritizing research on the societal risks that A.I. systems can pose, including on avoiding harmful bias and discrimination and protecting privacy.

Committing to “prioritizing research” is about as fuzzy as a commitment gets. Still, I’m sure this commitment will be received well by many in the A.I. ethics crowd, who want A.I. companies to make preventing near-term harms like bias and discrimination a priority over worrying about doomsday scenarios, as the A.I. safety folks do.

If you’re confused by the difference between “A.I. ethics” and “A.I. safety,” just know that there are two warring factions within the A.I. research community, each of which thinks the other is focused on preventing the wrong kinds of harms.

Commitment 8: The companies commit to develop and deploy advanced A.I. systems to help address society’s greatest challenges.

I don’t think many people would argue that advanced A.I. should not be used to help address society’s greatest challenges. The White House lists “cancer prevention” and “mitigating climate change” as two of the areas where it would like A.I. companies to focus their efforts, and it will get no disagreement from me there.

What makes this goal somewhat complicated, though, is that in A.I. research, what starts off looking frivolous often turns out to have more serious implications. Some of the technology that went into DeepMind’s AlphaGo — an A.I. system that was trained to play the board game Go — turned out to be useful in predicting the three-dimensional structures of proteins, a major discovery that boosted basic scientific research.

Overall, the White House’s deal with A.I. companies seems more symbolic than substantive. There is no enforcement mechanism to make sure companies follow these commitments, and many of them reflect precautions that A.I. companies are already taking.

Still, it’s a reasonable first step. And agreeing to follow these rules shows that the A.I. companies have learned from the failures of earlier tech companies, which waited to engage with the government until they got into trouble. In Washington, at least where tech regulation is concerned, it pays to show up early.

📰 관련 뉴스

댓글 남기기 취소