What happens when toxic online behavior enters the metaverse?


360/Open Summit: Contested Realities | Connected Futures

June 6-7, 2022

The Atlantic Council’s Digital Forensic Research Lab (DFRLab) hosts 360/Open Summit: Contested Realities | Connected Futures in Brussels, Belgium.

Event transcript

Uncorrected transcript: Check against delivery

Speakers

Daniel Castaño
Founding Partner, Mokzy

Katherine Lo
Content Moderation Lead, Meedan (US)

Kimberly Voll
Co-Founder, Fair Play Alliance

Moderator

Brittan Heller
Nonresident Fellow, Digital Forensic Research Lab, Atlantic Council

BRITTAN HELLER: So welcome to the last panel of the day. I stand between you and beer, chocolate, and waffles. And so we’re going to have a good time, we’re going to talk about this, we’re going to get excited, and then we’re going to go talk about it afterwards together.

My name is Brittan Heller. I am a fellow at the Digital Forensic Research Lab focusing on AR, VR, and technology…

I have brought some panelists with me here today, one of them virtually when we can get her up here. And I’m going to ask the panelists to introduce themselves so that this is more conversational. So, Kat, why don’t you go first?

KATHERINE LO: I’m Kat Lo. I’m content moderation lead at Meedan and I work with fact-checkers, human rights defenders, journalists, and targets of harassment and hate to think about what the product implication of content moderation decisions are and how to translate, you know, policy decisions into making—designing a product that would actually protect people because it often doesn’t happen despite our best attempts.

DANIEL CASTAÑO: Hi, everyone. My name is Daniel Castaño. I’m from Colombia. I’m a law professor at the Universidad Nacional de Colombia, and I’m also a consultant working for big tech and emerging technologies privacy policy and digital ethics.

BRITTAN HELLER: And Kim, why don’t you go now.

KIMBERLY VOLL: Hello. Can you hear me OK?

BRITTAN HELLER: Yeah.

KIMBERLY VOLL: Excellent. Hello from Vancouver, Canada. Thanks for having me from all the way over here.

As mentioned, my name is Kimberly Voll. I co-founded and today co-run the Fair Play Alliance, which is a cross industry initiative of over 250 gaming companies around the world. We focus on using game development to encourage healthier behavior, reduce disruptive or harmful behavior online—in games and in online spaces more broadly.

By day I’m also the studio head at Brace Yourself Games here in Vancouver, Canada, and my kind of background is a mix of—I’m a researcher, designer, developer, and a long-time game maker. I focus a lot on digital social dynamics; what it means to thrive in digital spaces. And on the formal side, I have a PhD in Computer Science, specializing in artificial intelligence (AI), as well as an Honors Degree in Cognitive Science. I’m very happy to be here.

BRITTAN HELLER: Thank you. So, I’m going to give you a trigger warning, where when you talk about content moderation and online harms, sometimes it can involve very sensitive issues. So, if you feel like you are upset, please feel free to go get some water. Please feel free to step out, and please feel free to talk to us afterwards about it.

So, with that, we’re going to dive right in. My first questions for the panelists are—what is the metaverse? There’s many definitions going—going around. What is the metaverse? What are the major differences between AR and VR technology, and what does the hardware scape look like?

So, Kim, do you want to start?

KIMBERLY VOLL: Sure, yeah. I mean, for me I think it’s important to talk from the perspective of what I think about as metaversal technology. I think there’s a lot of oversimplification of things like saying the metaverse is VR or vice versa. But I think, zooming back, there’s really sort of three key pieces to what we’re seeing in terms of metaversal tech and our move toward this concept of the metaverse.

The first is interactivity. So, I think we’re seeing a dramatic increase of the fidelity of the ways in which we can interact with each other in digital spaces. So, notably, this is where VR plays a very important role, but I think it is not exclusive to VR. I think we’re going to see a lot of broadening of technologies and ways in which we interact. It’s just that right now this sort of frontier of fidelity sits with the modern VR technology.

I think the second one is economic. So, obviously, we talk a lot about NFTs and those sorts of technologies kind of pushing the boundaries of economies, and regardless of where you fall on that scale—I don’t want to derail us going down that road—but I think what we’re seeing is a shift in how we think about the economics in online spaces and a more fully formed concept of digital ownership, or lack thereof, depending upon how we want to take it.

And then, the third one is just scale. So, we’re talking platform scalability. You know, taking us from, say, 10,000 concurrent users to, you know, hundreds of millions or, even by some conceptions, infinite sized communities.

So, I think this really has shifted—we’re no longer broken up into small communities, but we have the potential to interact at huge, huge scales that we just haven’t really seen before.

BRITTAN HELLER: So, how many here have you—how many people here have used virtual reality? We have a smattering. How many people have used augmented reality? How many people have used snap lenses on your phone? How many people have used Instagram filters? How many people have used QR codes?

So, everyone who put their hand up at least once has used immersive technology, or metaversal technology, or digital worlds, AR and VR. The age of this immersive technology is here. It’s just many people don’t realize it yet.

So, Daniel, do you want to talk a little bit about the hardware scape, what the hardware looks like now?

DANIEL CASTAÑO: Sure. Well, that’s a big difference that we need to make when we talk about the metaverse and immersive technology.

So, we have basically two different technologies that can be combined. We have augmented reality. We have virtual reality, and we have mixed reality.

Basically, like a very broad definition would be a combination between hardware and software, whose synergy is able to produce immersive ecosystems. And when we have such a broad definition, it might lie at the intersection with two other concepts that come from the 1990s. So that’s something very interesting, Brittan.

The metaverse has been around for over thirty years. The thing is that the gateways are unequally distributed. And that’s what’s happening right now. But we have two other concepts. The first one is cyberspace, and that takes us to the famous discussion, Larry Lessig, that gave rise to the Law of the Horse. So how different, really, is cyberspace from what’s happening right now? And now we have a more recent concept, coined by Luciano Floridi from Oxford, which is the infosphere.

So, basically, we’re living in an infosphere where our reality is augmented by different technologies. So every time we use Waze to go from a point A to a point B, can we say that we’re navigating through the metaverse? And I want to end with that question.

BRITTAN HELLER: When I think about the metaverse, I actually think about this. The metaverse is a pervasive social-computing-based platform designed to replace the functionality of your cell phone. It will be constantly on. You will not be able to turn it off. And so if you think about the type of interfaces that are coming out, they’re similar to Apple watches or smart glasses, with the capability to have you take pictures without hands, to use your voice to control calls, to post things to social media with just a touch. This is what the metaverse is going to be. It’s going to be the next generation of hardware that we use to access online spaces.

And so, Kat, what do you think the major differences between AR and VR are?

KATHERINE LO: That’s a good question. I think, for me, VR—at least how people conceptualize it is very distinct from what you see as, like, your embodiment in real life. There’s a separate sense of embodiment in VR and AR. Yeah, I think AR to many people is how you feel embodied in the real world and how you use technology as, like, a lens to see, like, the real world.

I don’t think that’s necessarily what they are differently, but I think in conversations that people have, there is this very clear distinction for some that I guess will be kind of combined with a lot of new products coming out, I guess, like Project Cambria and things like that. And so I guess it’ll be interesting to see when they become a lot more matched together.

But, yeah, I think what’s interesting here is the distinctions between how people are talking about the metaverse, say, on Twitter or something and how we’re conceiving of it as this very all-encompassing experience where, yeah, AR and VR aren’t necessarily distinct concepts.

BRITTAN HELLER: For me, the most interesting thing about the metaverse is that the hardware is not set yet. And so what this means is that you can access augmented reality, which I define as a digital overlay onto present space, which is why I said a QR code is augmented reality. You know, an Instagram filter is augmented reality. The type of functions you used during the pandemic to try on clothes, and then you get it and think, oh, that doesn’t look quite right, does it, but it lets you try it on virtually, that’s augmented reality.

Virtual reality is more what you access through an all-encompassing headset at this point. And that’s characterized by things researchers call immersion and presence, which means you feel like you are really there. If you haven’t tried this, I put my 101-year-old grandmother in my headset over Thanksgiving and I asked her what happened. And she said it’s incredible. I was at the bottom of the ocean and a blue whale came by and we made eye contact, and there were these swarms of turtles and fish. It was one of the most amazing things of my life. And I said, Grandma, that’s fascinating, because what happened was I put a helmet on you and you listened to a soundtrack coming out of the straps and you watched images go before your eyes. That’s what happened. But the way she described the content was like she was really there.

So immersion is created by all of these elements that make it feel like your real, actual reality. Mixed reality is going to be when you can blink between the two. And so I think that they’re converging to that point, but we’re not quite there yet. The reason I bring this up is because it’s going to have very substantial distinctions when we get to questions like online harms and safety risks and challenges—something that’s an overlay on your real world will have very different threat factors than something that you are in comprehensively, if that makes any sense.

So if you haven’t tried virtual reality and augmented reality, I highly recommend trying it. You don’t really understand the persuasiveness and magic of this stuff until you do it. One of the first things that I did was flying through a redwood forest and being able to look at it above and through and going down to the roots and seeing it from all those perspectives, and then I jumped off a building, so—and I have to tell you, that felt really real. So let’s now move from setting the scene to talking about the problems.

Kat, what are some of the safety risks and challenges that you see emerging from users in VR?

KATHERINE LO: You know, I think talking about the distinction between AR and VR has reminded me of a major risk that I think a lot of people discount. So earlier I tweeted about going into VRChat, which is a social VR world—

BRITTAN HELLER: Don’t start there.

KATHERINE LO: Yeah. Don’t go there. Don’t go there first. Play the fun music games; those are great. But I went into VRChat, which is known for being, like, very customizable, very open world, and within the first minute or so, the first thing I saw were people running around and chanting the N-word, and the second thing, about ten minutes later, were a bunch of people swarming a girl, and I clicked on her profile and she said she was fifteen. And that tweet, for some reason, went viral and the responses kind of were twofold; it was a bunch of people saying that happened to me too, this is why I don’t go in that space anymore, I don’t feel safe, and then a bunch of people saying, well, just take the VR—just take the headset off; why don’t you just take it off? And I think a lot of people don’t really recognize VR as being such a real, embodied experience, and as a result they don’t take it seriously the same way.

I think—so it’s like the difference between being in a park—like, are people yelling a bunch of slurs at you versus being in VR or being on social media? People seem to often treat the VR space as being on social media more often where they’re just sort of yelling at you and they say, well, just log off. Now it’s, just take the VR headset off. And I think that kind of is a blanket concept for a lot of these issues where you have things like grooming of children where now with the Quest 2, which is a headset that is much cheaper—now that it has come out, parents are buying these for their kids like in the thousands and you have a bunch of kids hopping on VRChat, which is—and many other platforms that are technically [for adults] but kids can go anywhere. Like, kids will find a way to get on any platform and there’s just no regulations around it. You see people groping women in these spaces and, yeah, the problem is that people have bodies in these spaces but they don’t have autonomy, like they can’t push back, necessarily. I mean, some platforms have now instituted, like, boundaries, like, by default, that make it so that people can’t get into your space, although simultaneously one of the social norms in VR is asking people to take that boundary off so that you can properly socialize.

So there are just a lot of challenges. I’m trying not to enumerate too many.

BRITTAN HELLER: Yeah, in VRChat your avatar can be anything. Mine is a flying toaster for a Windows 95 callback but—

KATHERINE LO: Nice.

BRITTAN HELLER:—you can be anything and let your mind go. That’s where people go.

KATHERINE LO: Yeah. And unfortunately, people use it also for extremist imagery or even less extremist but things that are a bit more innocuous, and unfortunately, since things aren’t, like, in text form, it’s very hard to detect it, to have a paper trail to even prove it. So people—you know, people don’t even seem to report things that often on these platforms, but if they do report things, what kind of evidence do you provide? And a lot of platforms have a lot of answers to it and none of them are terribly effective, necessarily. But yeah, so you have, I guess, the whole gamut.

BRITTAN HELLER: Kim, can you explain for us some of the risks that are endemic to AR as opposed to VR?

KIMBERLY VOLL: Yeah. I mean, you know, plus-one to everything Kat just said in terms of those difficulties and the huge range of challenges that we see in these spaces. And VR is interesting because it brings that huge amount of fidelity to the experience. And so it mimics a lot of our human-to-human interactions in ways that we don’t have the social infrastructure or social protocols, or all of the fixings, if you will, that we have in meat space. We don’t have those in these high-fidelity experiences. And yet, the fidelity prompts us to behave, in a way, as if we do. And so that’s one of the big fundamental, I think, breakdowns in VR.

When you take a look at AR, though, you know, these are not as high-fidelity experiences, in a sense that what we are trying to do is take something that is, you know, artificial or digital, and supplant it on our otherwise offline reality, and mix those two things together. And that gives rise to, I think, a bunch of different interesting other problems. So, you know, some, I think, overlap, but they take different forms. So in AR, you know, thinking about things like privacy and profiling, because we’re mixing our realities in ways that point to us in very specific senses, so people can get more information about us, for example.

And on the flipside, we can actually present ourselves in ways where, you know, I might have an advantage over you because I have more information about you, I just happen to be wearing something that is feeding me some level of information that you don’t necessarily know about. So it can create these power imbalances in strange and interesting ways. I think we’re just not used to thinking about that in our—in our day-to-day lives. And then I think, like, all of the other things from VR, even though the fidelity of the experience is different in nature, I think those come over as well.

So, you know, you see the harassment. You see the abuse. You see potential for predatory conduct—grooming, extremism, those more extreme things. And then you also see the wide range of just, if you will, social foibles. You know, like, I think one of—at the heart of a lot of this is the ambiguity that exists between figuring out people’s intent, the intersection of cultures or subcultures or norms, even just recognizing what’s happening, that situations can be fundamentally more ambiguous in these spaces. So I think there’s a lot more care that needs to go into just how we architect these spaces, how we moderate these spaces, and how we train ourselves as human beings now operating in these hybridized, digitized, non-digitized spaces.

BRITTAN HELLER: That’s really astute. Some of the work that I’ve done has focused on location-based stalking through augmented reality, because a lot of times when you play these games on your phone, you’re actually creating a real-time map of where you are and broadcasting it out to the world. So be a little wary. It’s fun, but think about the dynamics of how the game works. Also, stalking and impersonation-based harms, because there’s no way to really authenticate yet that you are who you are, even if your avatar looks like that person.

I think there’s two points that I wanted to bring up before moving on. One—and I have a bit of momnesia, so we may just stick with one—one is that you don’t have to be photorealistic in AR and VR for it to feel real to you. Researchers actually use AR and VR for PTSD treatment. And they do that for veterans. And they find that it’s more effective to have it be representative or cartoony, because your brain fills in the gaps. So when people think about it being real, it doesn’t have to look real for it to feel like it’s actually happening to you.

And, second, the way that you experience events or people in AR and VR is like your reality. These experiences, like when I talked about my grandmother, are processed through your hippocampus. And what that means is that it’s imprinted on your brain in the same way that you create memories. So when people say that they were sexually assaulted or sexually harassed in AR and VR, it’s because it feels like they were. It feels like somebody came into their living room and groped them. It does not feel like they read a harsh Twitter thread. It’s very, very different. So I am a former prosecutor. And the way that people describe these experiences, for me, are the same way that people who were physically or sexually assaulted would describe it, when I’d be doing cases.

So let’s talk a little bit about content moderation now that people are looking at me with eyes like this. Daniel, what do you—what do you think are the important differences between social media content moderation and AR/VR moderation, other than the—like, the risks of it feeling real?

DANIEL CASTAÑO: Well, I think it hits really different because this kind of technology has a direct incidence on the plasticity of the human mind, so you feel it directly.

The other big problem is: What kind of tools are you going to use? What kind of tools are you going to develop to moderate content? Is it enough to have, like, human content moderators, or do we need to use AI? But I won’t jump into the AI question yet.

So, first of all, it’s very difficult to define the rules because it is a totally different world. So even if you recap on the definition of the metaverse, it is a digital world that is beyond our analog world and that is unbound from any social norm or value that we have in our current cultures.

So our first approach will be: What kind of methodology, what kind of instruments should we develop and should we devise to convey standards of conduct? Is it enough just to publish, you know, like a laundry list when you start your immersive experience, or should we come up with a different way to convey those standards?

And then the second problem will be: How should we enforce them? Should we use AI? Should we just—we were just looking at a video before coming to our talk. In the United Arab Emirates, they are putting police officers in the metaverse. So is that a way to enforce that—like beyond the community standards, just moving to, you know, state law?

So I would say that those are the biggest challenges of content moderation in social media and then in VR.

BRITTAN HELLER: Kim, would you be able to go into more detail about how content moderation currently works in some of the larger experiences that many people go to, so Horizon Worlds or Ray-Ban Stories?

KIMBERLY VOLL: Yeah. I mean, I think that—like, Daniel covered a lot of I think what the key challenges are of these spaces. It’s not the same as being able to just—to just do, if you will, text moderation, not to imply that that’s easy. It’s a very difficult space and we’re struggling as an industry to get traction. It’s getting better, but we’ve got a lot to do there. So, you know, the high fidelity of the experiences means that it is very, very difficult for us to track things. And even if—the flipside—we were to get really good at tracking high-fidelity experiences, there is a whole question that arises around privacy and surveillance because now we’re very good at detecting the things that we do in our worlds, and we don’t necessarily want to go down that road. So figuring out sort of what that happy medium is, I think, is one of the key pieces.

A lot of what we see today is taking advantage of things that we use in flat spaces. So if there are text-based experiences or increasingly voice chats, you know, trying to use the flat-experience moderation tools in these more high-fidelity experiences, that’s very limited because there’s so much more going on in these spaces, as was mentioned earlier. You know, like your proximity to people, how you’re moving about that world is a vector for causing harassment and stress in other individuals.

So probably the most effective form that we have today is actually having some form of human observation. So chaperones in spaces or bouncers, if you like, like people that are there with the express purpose of monitoring what is transpiring in these spaces and taking action to, you know, cut off someone’s access if they think they’re doing something inappropriate, et cetera.

That’s, obviously, very difficult because it doesn’t scale. You know, we actually need a human who’s in the rig who’s monitoring these things. It also doesn’t scale in some respects because it can be very difficult to differentiate what’s happening in these spaces. You know, part of the richness of these worlds is the ability to take on different identities. This is actually a really important part of this space and the cultures evolving around these, and it also attracts a wide range of people who have more or less experience in these spaces, come from a variety of different cultures. That mix of things means that we’re stumbling all over ourselves. Like, things are going wrong at a pretty frequent basis, and it can sometimes be hard to differentiate what is an innocent mistake versus something that is more intentional or even more sinister.

And so, like, that scaling question, then, is one of having enough experience to start to detect the nuances to know that, oh, that person is doing what they’re doing because their headset crashed or, you know, they clearly don’t know what they’re doing and they’re bumbling about. Versus, you know, that person has now clicked into that person’s space so many times, and at a particular angle, that they’re definitely doing something inappropriate there. So there’s a lot of those, I think, really interesting challenges that make this space so incredibly difficult.

BRITTAN HELLER: Yeah. When I advise AR and VR companies, the one thing I wish they remembered is that this is not social media. You cannot just take community standards or codes of conduct from a 2D content and conduct code and transfer it into a 3D environment, where you’re going to need content, conduct, and environment. There’s kind of a joke that when Lego made a virtual world it almost bankrupted itself trying to keep teenage boys from terraforming phalluses into the world. And it’s kind of foreseeable, but it illustrates that you really need to think about what user-generated content is going to look like in this space. And it’s going to be environmental. And how are you going to do that?

Right now content moderation in social media is predicated on AI classifiers. We do not have classifiers for human-to-human behavioral interactions, for environmental interactions and, as Kim said, the talk-to-text transcription is lagging. So it can’t be done in real time in the same way. This is—this is the Rubicon that AR and VR will have to cross before content moderation is effective. We might have the rules, but we won’t have the enforcement regime that needs to follow.

So I talked a little bit about the “Children of the Corn” and the Lego world. Daniel, is this just a part of online culture? And Kim mentioned global communities. How are we supposed to create a metaverse for all people?

DANIEL CASTAÑO: Well, that’s actually a great question. I think we should start from global rules, but they should be locally informed. And what should be the language that the metaverse should be speaking? I suggest that it’s something that I call the language of legality. The enlightened language of legality, which is a combination of different things. First thing, we need a constitution for the Metaverse—a binding constitution for the Metaverse. The second thing, we need that constitution to have a separation of powers. We need someone that makes the rules. We need someone enforcing the rules. And then if controversy arises, we need someone that adjudicates those kinds of disputes.

Third, we need the legality. So we need to bind the behavior of everyone that is partaking in immersive spaces, and afford, of course, human rights protection. So I think that if we speak the language of legality, then we can have a common ground for building a universal metaverse. But then one thing is to coin the principles. That’s very easy and that’s something that is happening today with AI. We have a bunch of AI principles, but then it’s very difficult to bring them to practice.

So we need to figure out ways in which we can actually build a metaverse that conveys the language of legality. And that’s something that we should call, for example, constitutionality by design or rule of law by design. And it’s basically teaching everyone—engineers, developers, lawyers, policymakers, civil society—how we can actually embed those principles in the architecture of the metaverse. I think that’s the only way in which we can build a metaverse for everyone…


Image: Francis Mwangi, 13, uses an Oculus virtual reality (VR) headset, to virtually visit Buckingham Palace during the celebration of Britain's Queen Elizabeth's Platinum Jubilee, in Nyeri, Kenya June 2, 2022. REUTERS/Thomas Mukoya TPX IMAGES OF THE DAY