Exploring the Future of Human Interaction

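"Roblox's fundamental research on real-time solutions for distributed systems, generative AI, and voice processing is a groundbreaking investment in innovation. Through collaboration among scientists, engineers, and academia, we are leading the technology for positive and civil online communities of billions of people." - Morgan McGuire, Chief Scientist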

Research Publications

Shortest Path to Boundary for Self-Intersecting Meshes

He Chen (U. Utah), Elie Diaz (U. Utah), Cem Yuksel (U. Utah, Roblox), SIGGRAPH 2023

We introduce a method for efficiently computing the exact shortest path to the boundary of a mesh from a given internal point in the presence of self-intersections. We provide a formal definition of shortest boundary paths for self-intersecting objects and present a robust algorithm for computing the actual shortest boundary path. The resulting method offers an effective solution for collision and self-collision handling while simulating deformable volumetric objects, using fast simulation techniques that provide no guarantees on collision resolution. Our evaluation includes complex self-collision scenarios with a large number of active contacts, showing that our method can successfully handle them by introducing a relatively minor computational overhead.

Surface Simplification using Intrinsic Error Metrics

Hsueh-Ti Derek Liu (U. Toronto, Roblox), Mark Gillespie (Carnegie Mellon), Benjamin Chislett (U. Toronto), Nicholas Sharp (U. Toronto, NVIDIA), Alec Jacobson (U. Toronto, Adobe Research), Keenan Crane (Carnegie Mellon), SIGGRAPH 2023

This paper describes a method for fast simplification of surface meshes. Whereas past methods focus on visual appearance, our goal is to solve equations on the surface. Hence, rather than approximate the extrinsic geometry, we construct a coarse intrinsic triangulation of the input domain. In the spirit of the quadric error metric (QEM), we perform greedy decimation while agglomerating global information about approximation error. In lieu of extrinsic quadrics, however, we store intrinsic tangent vectors that track how far curvature “drifts” during simplification. This process also yields a bijective map between the fine and coarse mesh, and prolongation operators for both scalar- and vector-valued data. Moreover, we obtain hard guarantees on element quality via intrinsic retriangulation – a feature unique to the intrinsic setting. The overall payoff is a “black box” approach to geometry processing, which decouples mesh resolution from the size of matrices used to solve equations. We show how our method benefits several fundamental tasks, including geometric multigrid, all-pairs geodesic distance, mean curvature flow, geodesic Voronoi diagrams, and the discrete exponential map.

Composite Motion Learning with Task Control

Pei Xu (Clemson, Roblox), Xiumin Shang (U. California, Merced), Victor Zordan (Roblox, Clemson), Ioannis Karamouzas (Clemson, U. California, Riverside), SIGGRAPH 2023

We present a deep learning method for composite and task-driven motion control for physically simulated characters. In contrast to existing data-driven approaches using reinforcement learning that imitate full-body motions, we learn decoupled motions for specific body parts from multiple reference motions simultaneously and directly by leveraging the use of multiple discriminators in a GAN-like setup. In this process, there is no need of any manual work to produce composite reference motions for learning. Instead, the control policy explores by itself how the composite motions can be combined automatically. We further account for multiple task-specific rewards and train a single, multi-objective control policy. To this end, we propose a novel framework for multi-objective learning that adaptively balances the learning of disparate motions from multiple sources and multiple goal-directed control objectives. In addition, as composite motions are typically augmentations of simpler behaviors, we introduce a sample-efficient method for training composite control policies in an incremental manner, where we reuse a pre-trained policy as the meta policy and train a cooperative policy that adapts the meta one for new composite tasks. We show the applicability of our approach on a variety of challenging multi-objective tasks involving both composite motion imitation and multiple goal-directed control.

StarCoder: may the source be with you!

Raymond Li (ServiceNow Research), Loubna Ben Allal (Hugging Face), Yangtian Zi (Northeastern), Niklas Muennighoff (Hugging Face), Denis Kocetkov (ServiceNow Research), Chenghao Mou (Independent), Marc Marone (Johns Hopkins), Christopher Akiki (Leipzig U., ScaDS.AI), Jia Li (Independent), Jenny Chim (Queen Mary U. of London), Qian Liu (Sea AI Lab), Evgenii Zheltonozhskii (Technion IIT), Terry Yue Zhuo (Monash U., CSIRO's Data61), Thomas Wang (Hugging Face), Olivier Dehaene (Hugging Face), Mishig Davaadorj (Hugging Face), Joel Lamy-Poirier (ServiceNow Research), João Monteiro (ServiceNow Research), Oleh Shliazhko (ServiceNow Research), Nicolas Gontier (ServiceNow Research), Nicholas Meade (Mila, McGill), Armel Zebaze (Hugging Face), Ming-Ho Yee (Northeastern), Logesh Kumar Umapathi (Saama AI), Jian Zhu (U. British Columbia), Benjamin Lipkin (MIT), Muhtasham Oblokulov (Technical U. of Munich), Zhiruo Wang (Carnegie Mellon), Rudra Murthy (IBM Research), Jason Stillerman (U. Vermont), Siva Sankalp Patel (IBM Research), Dmitry Abulkhanov (Independent), Marco Zocca (UnfoldML), Manan Dey (SAP), Zhihan Zhang (U. Notre Dame), Nour Fahmy (Columbia U.), Urvashi Bhattacharyya (Discover Dollar), Wenhao Yu (U. Notre Dame), Swayam Singh (U. Allahabad), Sasha Luccioni (Hugging Face), Paulo Villegas (Telefonica I+D), Maxim Kunakov (Toloka), Fedor Zhdanov (Toloka), Manuel Romero (Independent), Tony Lee (Stanford U.), Nadav Timor (Weizmann Institute of Science), Jennifer Ding (Alan Turing Institute), Claire Schlesinger (Northeastern), Hailey Schoelkopf (Eleuther AI), Jan Ebert (Forschungszentrum Julich), Tri Dao (Stanford U.), Mayank Mishra (IBM Research), Alex Gu (MIT), Jennifer Robinson (ServiceNow), Carolyn Jane Anderson (Wellesley College), Brendan Dolan-Gavitt (NYU), Danish Contractor (Independent), Siva Reddy (ServiceNow Research, Mila), Daniel Fried (Carnegie Mellon), Dzmitry Bahdanau (ServiceNow Research), Yacine Jernite (Hugging Face), Carlos Muñoz Ferrandis (Hugging Face), Sean Hughes (ServiceNow), Thomas Wolf (Hugging Face), Arjun Guha (Northeastern, Roblox), Leandro von Werra (Hugging Face), Harm de Vries (ServiceNow Research), arXiv

The BigCode community, an open-scientific collaboration working on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder and StarCoderBase: 15.5B parameter models with 8K context length, infilling capabilities and fast large-batch inference enabled by multi-query attention. StarCoderBase is trained on 1 trillion tokens sourced from The Stack, a large collection of permissively licensed GitHub repositories with inspection tools and an opt-out process. We fine-tuned StarCoderBase on 35B Python tokens, resulting in the creation of StarCoder. We perform the most comprehensive evaluation of Code LLMs to date and show that StarCoderBase outperforms every open Code LLM that supports multiple programming languages and matches or outperforms the OpenAI code-cushman-001 model. Furthermore, StarCoder outperforms every model that is fine-tuned on Python, can be prompted to achieve 40% pass@1 on HumanEval, and still retains its performance on other programming languages. We take several important steps towards a safe open-access model release, including an improved PII redaction pipeline and a novel attribution tracing tool, and make the StarCoder models publicly available under a more commercially viable version of the Open Responsible AI Model license.
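The abstract reports a pass@1 score on HumanEval without spelling out how such a score is computed. A minimal sketch of the unbiased pass@k estimator commonly used with HumanEval follows. It assumes n samples are generated per problem and c of them pass the unit tests; the function names and the sample counts in the usage lines are illustrative and are not taken from the paper.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which passed.

    Computes 1 - C(n - c, k) / C(n, k): one minus the probability that
    k samples drawn without replacement are all failures.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures exist, so any k samples include a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results, k: int) -> float:
    """Average pass@k over (n, c) pairs, one pair per benchmark problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Hypothetical per-problem results: (samples drawn, samples that passed).
    made_up_results = [(20, 8), (20, 0), (20, 20)]
    print(mean_pass_at_k(made_up_results, k=1))
```

For k = 1 the estimator reduces to the fraction of passing samples, c / n, averaged over problems.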

A Dataset and Explorer for 3D Signed Distance Functions

Towaki Takikawa (NVIDIA and University of Waterloo), Andrew Glassner (Unity/Weta Digital), Morgan McGuire (Roblox and University of Waterloo), Journal of Computer Graphics Techniques / i3D 2022

Reference datasets are a key tool in the creation of new algorithms. They allow us to compare different existing solutions and identify problems and weaknesses during the development of new algorithms. The signed distance function (SDF) is enjoying a renewed focus of research activity in computer graphics, but until now there has been no standard reference dataset of such functions. We present a database of 63 curated, optimized, and regularized functions of varying complexity. Our functions are provided as analytic expressions that can be efficiently evaluated on a GPU at any point in space. We also present a viewing and inspection tool and software for producing SDF samples appropriate for both traditional graphics and training neural networks.
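To illustrate what "analytic expressions that can be evaluated at any point in space" means, here is a small sketch of two textbook signed distance functions and a union operator. These are standard formulas, not functions drawn from the dataset, and the names `sd_sphere`, `sd_box`, and `sd_union` are hypothetical.

```python
import math


def sd_sphere(p, radius):
    """Signed distance from point p to a sphere centered at the origin."""
    return math.hypot(*p) - radius


def sd_box(p, half_extents):
    """Signed distance from p to an axis-aligned box centered at the origin."""
    q = [abs(pi) - hi for pi, hi in zip(p, half_extents)]
    # Distance to the box when p is outside along at least one axis.
    outside = math.hypot(*(max(qi, 0.0) for qi in q))
    # Negative depth below the nearest face when p is inside the box.
    inside = min(max(q), 0.0)
    return outside + inside


def sd_union(d1, d2):
    """Union of two shapes: the smaller of the two distances.

    The result is exact outside both shapes and a bound inside them.
    """
    return min(d1, d2)


if __name__ == "__main__":
    p = (2.0, 0.0, 0.0)
    d = sd_union(sd_sphere(p, 1.0), sd_box(p, (0.5, 0.5, 0.5)))
    print(d)  # positive outside, zero on the surface, negative inside
```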

Acceleration Skinning: Kinematics-Driven Cartoon Effects for Articulated Characters

Niranjan Kalyanasundaram (Clemson), Damien Rohmer (LIX, Ecole Polytechnique/CNRS, IP Paris), Victor Zordan (Roblox and Clemson), Graphics Interface 2022

Cartoon effects described in animation principles are key to adding fluidity and style to animated characters. This paper extends the existing framework of Velocity Skinning to use skeletal acceleration, in addition to velocity, for cartoon-style effects on rigged characters. This Acceleration Skinning is able to produce a variety of cartoon effects from highly efficient closed-form deformers while remaining compatible with standard production pipelines for rigged characters. This paper introduces the framework and demonstrates applications through three new deformers. Specifically, a followthrough effect is obtained from the combination of skeletal acceleration and velocity. Also, centrifugal stretch and centrifugal lift effects are introduced using rotational acceleration to model radial stretching and lifting effects. The paper also explores the application of effect-specific time filtering when combining deformations, allowing for more stylization and artist control over the results.

Opportunities for the National Assessment of Educational Progress in an Age of AI and Pervasive Computation: A Pragmatic Vision

Karen J. Mitchell (Association of American Medical Colleges), Isaac I. Bejar (Educational Testing Service), Jack Buckley (Roblox), Brian Gong (Center for Assessment), Andrew D. Ho (Harvard), Stephen Lazer (Questar Assessment), Susan M. Lottridge (Cambium Assessment), Richard M. Luecht (University of North Carolina, Greensboro), Rochelle S. Michel (Curriculum Associates), Scott Norton (Council of Chief State School Officers), John Whitmer (Federation of American Scientists), National Academies of Sciences, Engineering, and Medicine Panel Consensus Report 2022

The National Academies of Sciences, Engineering, and Medicine will appoint an ad hoc panel to consider several innovations that could substantially reduce the cost structure of NAEP while maintaining its technical quality and value in informing the public about education progress. The panel will review the major cost components of NAEP and related assessment programs and consider the following possible changes to the NAEP program: 1) automatic item generation; 2) remote test administration; 3) computer adaptive testing; and 4) consolidation and elimination of substantive overlaps between NAEP assessments and between NAEP and other assessments, such as PISA, TIMSS, and PIRLS. The panel will also solicit and consider suggestions of other major changes that reflect modern methods of assessment and that could substantially reduce NAEP costs while largely preserving its technical quality and informative value. The panel will review relevant research and industry practice to draw conclusions about the likely effects of these potential changes on the cost, technical quality, and informative value of NAEP.

The panel will produce a short and broadly accessible report that summarizes its findings and conclusions about these potential changes to NAEP and recommends potential assessment or programmatic changes and research needed for NAEP to explore innovations while balancing the competing objectives of cost reduction, technical quality and informative value.

Fast Facial Animation from Video

Iñaki Navarro, Dario Kneubuehler, Tijmen Verhulsdonck, Eloi du Bois, William Welch, Vivek Verma, Ian Sachs, Kiran Bhat, ACM SIGGRAPH 2021 Talk

Real-time facial animation for virtual 3D characters has important applications such as AR/VR, interactive 3D entertainment, pre-visualization and video conferencing. Yet despite important research breakthroughs in facial tracking and performance capture, there are very few commercial examples of real-time facial animation applications in the consumer market. Mass adoption requires real-time performance on commodity hardware and visually pleasing animation that is robust to real-world conditions, without requiring manual calibration. We present an end-to-end deep learning framework for regressing facial animation weights from video that addresses most of these challenges. Our formulation is fast (3.2 ms), utilizes images of real human faces along with millions of synthetic rendered frames to train the network on real-world scenarios, and produces jitter-free visually pleasing animations.

How We Scaled Bert to Serve 1+ Billion Daily Requests on CPU

Quoc Le and Kip Kaehler, Data + AI Summit 2021

Machine learning is a key part of our ability to scale important services to our massive community. In this talk, we share our journey of scaling our deep learning text classifiers to process 50k+ requests per second at latencies under 20ms. We will share how we were able to not only make BERT fast enough for our users, but also economical enough to run in production at a manageable cost on CPU.

Dynamic Diffuse Global Illumination Resampling

Zander Majercik (NVIDIA), Thomas Muller (NVIDIA), Alexander Keller (NVIDIA), Derek Nowrouzezahrai (McGill), Morgan McGuire (Roblox and McGill), ACM SIGGRAPH 2021 Talk

Interactive global illumination remains a challenge in radiometrically- and geometrically-complex scenes. Specialized sampling strategies are effective for specular and near-specular transport because the scattering has relatively low directional variance per scattering event. In contrast, the high variance from transport paths comprising multiple rough glossy or diffuse scattering events remains notoriously difficult to resolve with a small number of samples. We extend unidirectional path tracing to address this by combining screen-space reservoir resampling and sparse world-space probes, significantly improving sample efficiency for transport contributions that terminate on diffuse scattering events. Our experiments demonstrate a clear improvement — at equal time and equal quality — over purely path traced and purely probe-based baselines. Moreover, when combined with commodity denoisers, we are able to interactively render global illumination in complex scenes.
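The reservoir resampling mentioned in the abstract builds on weighted reservoir sampling, which keeps a single sample from a stream of candidates with probability proportional to each candidate's weight. The sketch below shows only that generic building block under simplifying assumptions: the `Reservoir` class and `resample` helper are hypothetical names, and this is not the screen-space or probe-based method of the talk itself.

```python
import random


class Reservoir:
    """Holds one sample chosen with probability proportional to its weight."""

    def __init__(self):
        self.sample = None      # currently selected candidate
        self.weight_sum = 0.0   # sum of all weights seen so far
        self.count = 0          # number of candidates seen so far

    def update(self, candidate, weight, rng):
        """Stream in one candidate; keep it with probability weight / weight_sum."""
        self.count += 1
        if weight <= 0.0:
            return  # zero-weight candidates can never be selected
        self.weight_sum += weight
        if rng.random() * self.weight_sum < weight:
            self.sample = candidate

    def merge(self, other, rng):
        """Combine with another reservoir as if both streams had been one."""
        total = self.count + other.count
        if other.sample is not None:
            # The other reservoir's sample stands in for its whole stream.
            self.update(other.sample, other.weight_sum, rng)
        self.count = total


def resample(candidates, weights, rng):
    """Select one of the candidates with probability proportional to weight."""
    reservoir = Reservoir()
    for candidate, weight in zip(candidates, weights):
        reservoir.update(candidate, weight, rng)
    return reservoir


if __name__ == "__main__":
    rng = random.Random(1)
    # Made-up candidate light paths and target-function weights.
    picked = resample(["path_a", "path_b", "path_c"], [0.1, 2.0, 0.4], rng)
    print(picked.sample, picked.weight_sum, picked.count)
```

Because only one sample and a running weight sum are stored, the memory cost per pixel or probe stays constant no matter how many candidates are streamed through.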

Game-based Assessment for Education

Jack Buckley, Laura Colosimo, Rebecca Kantar, Marty McCall and Erica Snow, OECD Digital Education Outlook 2021

This chapter discusses how recent advancements in digital technology could lead to a new generation of game-based standardised assessments in education, providing education systems with assessments that can test more complex skills than traditional standardised tests can. After highlighting some of the advantages of game-based standardised assessment compared to traditional ones, this chapter discusses how these tests are built, how they work, but also some of their limitations. While games have strong potential to improve the quality of testing and expand assessment to complex skills in the future, they will likely supplement traditional tests, which also have their advantages. Three examples of game-based assessments integrating a range of advanced technologies illustrate this perspective.

Robust Vision-Based Cheat Detection in Competitive Gaming

Aditya Jonnalagadda (University of California, Santa Barbara), Iuri Frosio (NVIDIA), Seth Schneider (NVIDIA), Morgan McGuire (NVIDIA; now at Roblox), and Joohwan Kim (NVIDIA)

Game publishers and anti-cheat companies have been unsuccessful in blocking cheating in online gaming. We propose a novel, vision-based approach that captures the final state of the frame buffer and detects illicit overlays. To this end, we train and evaluate a DNN detector on a new dataset, collected using two first-person shooter games and three cheating software packages. We study the advantages and disadvantages of different DNN architectures operating on a local or global scale. We use output confidence analysis to avoid unreliable detections and inform when network retraining is required. In an ablation study, we show how to use Interval Bound Propagation to build a detector that is also resistant to potential adversarial attacks and study its interaction with confidence analysis. Our results show that robust and effective anti-cheating through machine learning is practically feasible and can be used to guarantee fair play in online gaming.

Goals of the Luau Type System

Lily Brown, Andy Friesen, and Alan Jeffrey, Human Aspects of Types and Reasoning Assistants 2021

Luau is the scripting language that powers user-generated experiences on the Roblox platform. It is a statically-typed language, based on the dynamically-typed Lua language, with type inference. These types are used for providing editor assistance in Roblox Studio, the IDE for authoring Roblox experiences. Due to Roblox’s uniquely heterogeneous developer community, Luau must operate in a somewhat different fashion than a traditional statically-typed language. In this paper, we describe some of the goals of the Luau type system, focusing on where the goals differ from those of other type systems.

A Distributed, Decoupled System for Losslessly Streaming Dynamic Light Probes to Thin Clients

Michael Stengel (NVIDIA), Zander Majercik (NVIDIA), Benjamin Boudaoud (NVIDIA), Morgan McGuire (NVIDIA; now at Roblox), ACM Multimedia Systems Conference 2021

We present a networked, high-performance graphics system that combines dynamic, high-quality, ray traced global illumination computed on a server with direct illumination and primary visibility computed on a client. This approach provides many of the image quality benefits of real-time ray tracing on low-power and legacy hardware, while maintaining a low latency response and mobile form factor.

As opposed to streaming full frames from rendering servers to end clients, our system distributes the graphics pipeline over a network by computing diffuse global illumination on a remote machine. Diffuse global illumination is computed using a recent irradiance volume representation combined with a new lossless, HEVC-based, hardware-accelerated encoding, and a perceptually-motivated update scheme.

Our experimental implementation streams thousands of irradiance probes per second and requires less than 50 Mbps of throughput, reducing the consumed bandwidth by 99.4% when streaming at 60 Hz compared to traditional lossless texture compression.

The bandwidth reduction achieved with our approach allows higher quality and lower latency graphics than state-of-the-art remote rendering via video streaming. In addition, our split-rendering solution decouples remote computation from local rendering and so does not limit local display update rate or display resolution.

Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes

Towaki Takikawa (University of Toronto, Vector Institute, and NVIDIA), Joey Litalien (NVIDIA and McGill), Kangxue Yin (NVIDIA), Karsten Kreis (NVIDIA), Charles Loop (NVIDIA), Derek Nowrouzezahrai (McGill), Alec Jacobson (University of Toronto), Morgan McGuire (McGill and NVIDIA; now at Roblox), Sanja Fidler (University of Toronto, Vector Institute, and NVIDIA), IEEE Computer Vision and Pattern Recognition 2021

Neural signed distance functions (SDFs) are emerging as an effective representation for 3D shapes. SDFs encode 3D surfaces with a function of position that returns the closest distance to a surface. State-of-the-art methods typically encode the SDF with a large, fixed-size neural network to approximate complex shapes with implicit surfaces. Rendering these large networks is, however, computationally expensive since it requires many forward passes through the network for every pixel, making these representations impractical for real-time graphics applications.

We introduce an efficient neural representation that, for the first time, enables real-time rendering of high-fidelity neural SDFs, while achieving state-of-the-art geometry reconstruction quality. We represent implicit surfaces using an octree-based feature volume which adaptively fits shapes with multiple discrete levels of detail (LODs), and enables continuous LOD with SDF interpolation. We further develop an efficient algorithm to directly render our novel neural SDF representation in real-time by querying only the necessary LODs with sparse octree traversal. We show that our representation is 2-3 orders of magnitude more efficient in terms of rendering speed compared to previous works. Furthermore, it produces state-of-the-art reconstruction quality for complex shapes under both 3D geometric and 2D image-space metrics.
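To make the cost argument concrete, the sketch below shows sphere tracing, a common way to render a signed distance function: each step along a ray needs one SDF evaluation, which for a neural SDF would mean one network forward pass. An analytic sphere stands in for the network here, and the function names, step limits, and scene are assumptions for illustration rather than the paper's renderer.

```python
import math


def sphere_sdf(p, center=(0.0, 0.0, 3.0), radius=1.0):
    """Analytic stand-in for a (much more expensive) neural SDF query."""
    return math.dist(p, center) - radius


def sphere_trace(origin, direction, sdf, max_steps=64, eps=1e-4, t_max=100.0):
    """March along a ray, stepping by the distance the SDF reports.

    Returns (t, evaluations): t is the hit distance along the ray, or
    None on a miss; evaluations counts the SDF queries that were made.
    The direction is assumed to be a unit vector.
    """
    t = 0.0
    evaluations = 0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)
        evaluations += 1
        if dist < eps:
            return t, evaluations  # close enough to the surface: hit
        t += dist  # safe step: no surface lies closer than dist
        if t > t_max:
            break  # ray left the scene
    return None, evaluations


if __name__ == "__main__":
    hit_t, n_evals = sphere_trace((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), sphere_sdf)
    print(hit_t, n_evals)
```

Multiplying the per-ray evaluation count by the number of pixels gives the number of SDF queries per frame, which is why reducing the cost of each query matters for real-time use.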

The Leaky Semicolon

Mark Batty and Simon Cooksey (UKC), Alan Jeffrey (Roblox), Ilya Kaysin and Anton Podkopaev (JetBrains), James Riely (DePaul U.)

Program logics and semantics tell a pleasant story about sequential composition: when executing (S1; S2), we first execute S1 then S2. To improve performance, however, processors execute instructions out of order, and compilers reorder programs even more dramatically. By design, single-threaded systems cannot observe these reorderings; however, multiple-threaded systems can, making the story considerably less pleasant. A formal attempt to understand the resulting mess is known as a “relaxed memory model.” Prior models either fail to address sequential composition directly, or overly restrict processors and compilers, or permit nonsense thin-air behaviors which are unobservable in practice.

To support sequential composition while targeting modern hardware, we enrich the standard event-based approach with preconditions and families of predicate transformers. When calculating the meaning of (S1;S2), the predicate transformer applied to the precondition of an event e from S2 is chosen based on the set of events in S1 upon which e depends. We apply this approach to two existing memory models.
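The claim that multi-threaded programs can observe reorderings is often illustrated with the classic store-buffering litmus test. The sketch below enumerates every interleaving of two tiny threads, first keeping each thread's program order and then also allowing the two independent instructions within a thread to be swapped. This is a toy model with hypothetical names; it does not implement the paper's predicate-transformer semantics.

```python
from itertools import permutations

# Thread 1: x = 1; r1 = y        Thread 2: y = 1; r2 = x
THREAD_1 = [("store", "x", 1), ("load", "y", "r1")]
THREAD_2 = [("store", "y", 1), ("load", "x", "r2")]


def run(schedule):
    """Execute one global order of instructions on shared memory."""
    memory = {"x": 0, "y": 0}
    registers = {}
    for op, var, arg in schedule:
        if op == "store":
            memory[var] = arg
        else:
            registers[arg] = memory[var]
    return registers["r1"], registers["r2"]


def interleavings(a, b):
    """Yield every merge of sequences a and b that keeps each one's order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def outcomes(orders_1, orders_2):
    """Collect every (r1, r2) result over the allowed per-thread orders."""
    return {
        run(schedule)
        for a in orders_1
        for b in orders_2
        for schedule in interleavings(a, b)
    }


# Program order is preserved inside each thread.
in_order = outcomes([THREAD_1], [THREAD_2])
# Each thread's two independent instructions may also be swapped.
reordered = outcomes(list(permutations(THREAD_1)), list(permutations(THREAD_2)))

if __name__ == "__main__":
    print(sorted(in_order))       # [(0, 1), (1, 0), (1, 1)]
    print(reordered - in_order)   # {(0, 0)}
```

Run alone, either thread produces the same result in both instruction orders, since its store and load touch different variables. Only the combination of threads exposes the extra outcome r1 = r2 = 0.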

We're Hiring

Roblox is always looking for talented people to join its research team. As a member of the team, you will solve unique problems, pursue open science and collaborative inquiry, and help shape the future of human interaction.