Exploring the future of human interaction
"Roblox's fundamental research on real-time solutions for distributed systems, generative AI, and voice processing is a transformative investment in innovation. Collaboration among our scientists, engineers, and academics drives technology for a positive and civil online community of billions of people." -Morgan McGuire, Chief Scientist
Seeking brilliant minds
We are always looking for brilliant minds to join our research team. As a team member, you will work on unique problems and chart the future of human interaction through open scientific research and collaborative exploration.
We introduce a method for efficiently computing the exact shortest path to the boundary of a mesh from a given internal point in the presence of self-intersections. We provide a formal definition of shortest boundary paths for self-intersecting objects and present a robust algorithm for computing the actual shortest boundary path. The resulting method offers an effective solution for collision and self-collision handling while simulating deformable volumetric objects, using fast simulation techniques that provide no guarantees on collision resolution. Our evaluation includes complex self-collision scenarios with a large number of active contacts, showing that our method can successfully handle them by introducing a relatively minor computational overhead.
This paper describes a method for fast simplification of surface meshes. Whereas past methods focus on visual appearance, our goal is to solve equations on the surface. Hence, rather than approximate the extrinsic geometry, we construct a coarse intrinsic triangulation of the input domain. In the spirit of the quadric error metric (QEM), we perform greedy decimation while agglomerating global information about approximation error. In lieu of extrinsic quadrics, however, we store intrinsic tangent vectors that track how far curvature “drifts” during simplification. This process also yields a bijective map between the fine and coarse mesh, and prolongation operators for both scalar- and vector-valued data. Moreover, we obtain hard guarantees on element quality via intrinsic retriangulation – a feature unique to the intrinsic setting. The overall payoff is a “black box” approach to geometry processing, which decouples mesh resolution from the size of matrices used to solve equations. We show how our method benefits several fundamental tasks, including geometric multigrid, all-pairs geodesic distance, mean curvature flow, geodesic Voronoi diagrams, and the discrete exponential map.
We present a deep learning method for composite and task-driven motion control for physically simulated characters. In contrast to existing data-driven approaches using reinforcement learning that imitate full-body motions, we learn decoupled motions for specific body parts from multiple reference motions simultaneously and directly by using multiple discriminators in a GAN-like setup. In this process, no manual work is needed to produce composite reference motions for learning. Instead, the control policy explores by itself how the composite motions can be combined automatically. We further account for multiple task-specific rewards and train a single, multi-objective control policy. To this end, we propose a novel framework for multi-objective learning that adaptively balances the learning of disparate motions from multiple sources and multiple goal-directed control objectives. In addition, as composite motions are typically augmentations of simpler behaviors, we introduce a sample-efficient method for training composite control policies in an incremental manner, where we reuse a pre-trained policy as the meta policy and train a cooperative policy that adapts the meta one for new composite tasks. We show the applicability of our approach on a variety of challenging multi-objective tasks involving both composite motion imitation and multiple goal-directed control.
The BigCode community, an open-scientific collaboration working on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder and StarCoderBase: 15.5B parameter models with 8K context length, infilling capabilities and fast large-batch inference enabled by multi-query attention. StarCoderBase is trained on 1 trillion tokens sourced from The Stack, a large collection of permissively licensed GitHub repositories with inspection tools and an opt-out process. We fine-tuned StarCoderBase on 35B Python tokens, resulting in the creation of StarCoder. We perform the most comprehensive evaluation of Code LLMs to date and show that StarCoderBase outperforms every open Code LLM that supports multiple programming languages and matches or outperforms the OpenAI code-cushman-001 model. Furthermore, StarCoder outperforms every model that is fine-tuned on Python, can be prompted to achieve 40% pass@1 on HumanEval, and still retains its performance on other programming languages. We take several important steps towards a safe open-access model release, including an improved PII redaction pipeline and a novel attribution tracing tool, and make the StarCoder models publicly available under a more commercially viable version of the Open Responsible AI Model license.
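For readers unfamiliar with the metric quoted above, pass@1 is commonly computed with the unbiased pass@k estimator introduced alongside the HumanEval benchmark: draw n samples per problem, count the c samples that pass the unit tests, and evaluate 1 - C(n-c, k) / C(n, k). The sketch below is illustrative only. The function names and the per-problem counts are hypothetical and are not StarCoder results.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem: n samples, c of them correct."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # If fewer than k samples fail, every size-k subset contains a passing sample.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results, k: int) -> float:
    """Average pass@k over problems; results is a list of (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical per-problem counts, made up for illustration (not benchmark data).
example_results = [(20, 8), (20, 0), (20, 20)]
example_score = mean_pass_at_k(example_results, k=1)
```

For k = 1 the estimator reduces to the fraction of correct samples, c / n, averaged over problems.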
Reference datasets are a key tool in the creation of new algorithms. They allow us to compare different existing solutions and identify problems and weaknesses during the development of new algorithms. The signed distance function (SDF) is enjoying a renewed focus of research activity in computer graphics, but until now there has been no standard reference dataset of such functions. We present a database of 63 curated, optimized, and regularized functions of varying complexity. Our functions are provided as analytic expressions that can be efficiently evaluated on a GPU at any point in space. We also present a viewing and inspection tool and software for producing SDF samples appropriate for both traditional graphics and training neural networks.
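To make "analytic expressions that can be evaluated at any point in space" concrete, here is a minimal sketch of two textbook signed distance functions and their union. These are standard formulas chosen for illustration, not functions taken from the database, and all names are hypothetical.

```python
import math


def sdf_sphere(p, center, radius):
    """Signed distance from point p to a sphere: negative inside, positive outside."""
    return math.dist(p, center) - radius


def sdf_box(p, half_extents):
    """Signed distance from point p to an axis-aligned box centered at the origin."""
    q = [abs(pi) - hi for pi, hi in zip(p, half_extents)]
    # Distance to the box when p is outside it (zero when p is inside).
    outside = math.hypot(*(max(qi, 0.0) for qi in q))
    # Negative distance to the nearest face when p is inside (zero when outside).
    inside = min(max(q), 0.0)
    return outside + inside


def sdf_union(d1, d2):
    """Union of two shapes: exact outside both shapes, only a bound inside."""
    return min(d1, d2)


# Evaluate at an arbitrary query point.
query = (2.0, 0.0, 0.0)
d = sdf_union(sdf_sphere(query, (0.0, 0.0, 0.0), 1.0),
              sdf_box(query, (0.5, 0.5, 0.5)))
```

Because each function is a closed-form expression of the query point, it can be evaluated independently at every sample, which is what makes this style of function a good fit for GPU evaluation.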
Cartoon effects described in animation principles are key to adding fluidity and style to animated characters. This paper extends the existing framework of Velocity Skinning to use skeletal acceleration, in addition to velocity, for cartoon-style effects on rigged characters. This Acceleration Skinning is able to produce a variety of cartoon effects from highly efficient closed-form deformers while remaining compatible with standard production pipelines for rigged characters. This paper introduces the framework and demonstrates it through three new deformers. Specifically, a follow-through effect is obtained from the combination of skeletal acceleration and velocity. Also, centrifugal stretch and centrifugal lift effects are introduced using rotational acceleration to model radial stretching and lifting effects. The paper also explores effect-specific time filtering when combining deformations, which allows for more stylization and artist control over the results.
The National Academies of Sciences, Engineering, and Medicine will appoint an ad hoc panel to consider several innovations that could substantially reduce the cost structure of NAEP while maintaining its technical quality and value in informing the public about education progress. The panel will review the major cost components of NAEP and related assessment programs and consider the following possible changes to the NAEP program: 1) automatic item generation; 2) remote test administration; 3) computer adaptive testing; and 4) consolidation and elimination of substantive overlaps between NAEP assessments and between NAEP and other assessments, such as PISA, TIMSS, and PIRLS. The panel will also solicit and consider suggestions of other major changes that reflect modern methods of assessment and that could substantially reduce NAEP costs while largely preserving its technical quality and informative value. The panel will review relevant research and industry practice to draw conclusions about the likely effects of these potential changes on the cost, technical quality, and informative value of NAEP.
The panel will produce a short and broadly accessible report that summarizes its findings and conclusions about these potential changes to NAEP and recommends potential assessment or programmatic changes and research needed for NAEP to explore innovations while balancing the competing objectives of cost reduction, technical quality and informative value.
Real-time facial animation for virtual 3D characters has important applications such as AR/VR, interactive 3D entertainment, pre-visualization and video conferencing. Yet despite important research breakthroughs in facial tracking and performance capture, there are very few commercial examples of real-time facial animation applications in the consumer market. Mass adoption requires real-time performance on commodity hardware and visually pleasing animation that is robust to real-world conditions, without requiring manual calibration. We present an end-to-end deep learning framework for regressing facial animation weights from video that addresses most of these challenges. Our formulation is fast (3.2 ms), utilizes images of real human faces along with millions of synthetic rendered frames to train the network on real-world scenarios, and produces jitter-free visually pleasing animations.
Machine learning is a key part of our ability to scale important services to our massive community. In this talk, we share our journey of scaling our deep learning text classifiers to process 50k+ requests per second at latencies under 20ms. We will share how we were able to not only make BERT fast enough for our users, but also economical enough to run in production at a manageable cost on CPU.
Interactive global illumination remains a challenge in radiometrically- and geometrically-complex scenes. Specialized sampling strategies are effective for specular and near-specular transport because the scattering has relatively low directional variance per scattering event. In contrast, the high variance from transport paths comprising multiple rough glossy or diffuse scattering events remains notoriously difficult to resolve with a small number of samples. We extend unidirectional path tracing to address this by combining screen-space reservoir resampling and sparse world-space probes, significantly improving sample efficiency for transport contributions that terminate on diffuse scattering events. Our experiments demonstrate a clear improvement — at equal time and equal quality — over purely path traced and purely probe-based baselines. Moreover, when combined with commodity denoisers, we are able to interactively render global illumination in complex scenes.
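Reservoir resampling builds on weighted reservoir sampling, which picks one candidate from a stream with probability proportional to its weight while storing only a single sample. The sketch below shows that core update in isolation. It is a generic illustration rather than the screen-space method described above, and the class, function, and candidate names are hypothetical.

```python
import random


class Reservoir:
    """Single-sample weighted reservoir over a stream of candidates."""

    def __init__(self):
        self.sample = None
        self.weight_sum = 0.0
        self.count = 0

    def update(self, candidate, weight, rng=random):
        """Consider one candidate; keep it with probability weight / weight_sum."""
        self.count += 1
        self.weight_sum += weight
        if weight > 0.0 and rng.random() * self.weight_sum < weight:
            self.sample = candidate


def pick(stream, rng=random):
    """Stream (candidate, weight) pairs through a reservoir and return the survivor."""
    reservoir = Reservoir()
    for candidate, weight in stream:
        reservoir.update(candidate, weight, rng)
    return reservoir.sample


# Hypothetical light samples with unnormalized weights.
chosen = pick([("light_a", 1.0), ("light_b", 3.0)], random.Random(0))
```

The appeal of this structure for rendering is its constant memory footprint: each pixel can examine many candidates while storing only the current sample, the running weight sum, and a count.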
This chapter discusses how recent advancements in digital technology could lead to a new generation of game-based standardised assessments in education, providing education systems with assessments that can test more complex skills than traditional standardised tests can. After highlighting some of the advantages of game-based standardised assessments compared to traditional ones, this chapter discusses how these tests are built, how they work, and some of their limitations. While games have strong potential to improve the quality of testing and expand assessment to complex skills in the future, they will likely supplement traditional tests, which also have their advantages. Three examples of game-based assessments integrating a range of advanced technologies illustrate this perspective.
Game publishers and anti-cheat companies have been unsuccessful in blocking cheating in online gaming. We propose a novel, vision-based approach that captures the final state of the frame buffer and detects illicit overlays. To this end, we train and evaluate a DNN detector on a new dataset, collected using two first-person shooter games and three cheating programs. We study the advantages and disadvantages of different DNN architectures operating on a local or global scale. We use output confidence analysis to avoid unreliable detections and inform when network retraining is required. In an ablation study, we show how to use Interval Bound Propagation to build a detector that is also resistant to potential adversarial attacks and study its interaction with confidence analysis. Our results show that robust and effective anti-cheating through machine learning is practically feasible and can be used to guarantee fair play in online gaming.
Luau is the scripting language that powers user-generated experiences on the Roblox platform. It is a statically-typed language, based on the dynamically-typed Lua language, with type inference. These types are used for providing editor assistance in Roblox Studio, the IDE for authoring Roblox experiences. Due to Roblox’s uniquely heterogeneous developer community, Luau must operate in a somewhat different fashion than a traditional statically-typed language. In this paper, we describe some of the goals of the Luau type system, focusing on where the goals differ from those of other type systems.
We present a networked, high-performance graphics system that combines dynamic, high-quality, ray traced global illumination computed on a server with direct illumination and primary visibility computed on a client. This approach provides many of the image quality benefits of real-time ray tracing on low-power and legacy hardware, while maintaining a low latency response and mobile form factor.
As opposed to streaming full frames from rendering servers to end clients, our system distributes the graphics pipeline over a network by computing diffuse global illumination on a remote machine. Diffuse global illumination is computed using a recent irradiance volume representation combined with a new lossless, HEVC-based, hardware-accelerated encoding, and a perceptually-motivated update scheme.
Our experimental implementation streams thousands of irradiance probes per second and requires less than 50 Mbps of throughput, reducing the consumed bandwidth by 99.4% when streaming at 60 Hz compared to traditional lossless texture compression.
The bandwidth reduction achieved with our approach allows higher quality and lower latency graphics than state-of-the-art remote rendering via video streaming. In addition, our split-rendering solution decouples remote computation from local rendering and so does not limit local display update rate or display resolution.
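As a rough sanity check on the figures above, a 99.4% reduction that lands at about 50 Mbps implies a baseline on the order of 8 Gbps. This back-of-the-envelope calculation treats 50 Mbps as the exact compressed rate, which is an assumption, since the text gives it only as an upper bound. The function name is hypothetical.

```python
def implied_baseline_mbps(compressed_mbps: float, reduction_fraction: float) -> float:
    """Baseline bandwidth implied by a compressed rate and a fractional reduction."""
    if not 0.0 <= reduction_fraction < 1.0:
        raise ValueError("reduction_fraction must be in [0, 1)")
    # compressed = baseline * (1 - reduction), solved for baseline.
    return compressed_mbps / (1.0 - reduction_fraction)


# Assumes the compressed stream uses exactly 50 Mbps (the stated upper bound).
baseline_mbps = implied_baseline_mbps(50.0, 0.994)
```

Under that assumption the baseline works out to roughly 8,300 Mbps, which gives a sense of why uncompressed or traditionally compressed probe streaming is impractical over typical consumer connections.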
Neural signed distance functions (SDFs) are emerging as an effective representation for 3D shapes. SDFs encode 3D surfaces with a function of position that returns the closest distance to a surface. State-of-the-art methods typically encode the SDF with a large, fixed-size neural network to approximate complex shapes with implicit surfaces. Rendering these large networks is, however, computationally expensive since it requires many forward passes through the network for every pixel, making these representations impractical for real-time graphics applications.
We introduce an efficient neural representation that, for the first time, enables real-time rendering of high-fidelity neural SDFs, while achieving state-of-the-art geometry reconstruction quality. We represent implicit surfaces using an octree-based feature volume which adaptively fits shapes with multiple discrete levels of detail (LODs), and enables continuous LOD with SDF interpolation. We further develop an efficient algorithm to directly render our novel neural SDF representation in real-time by querying only the necessary LODs with sparse octree traversal. We show that our representation is 2-3 orders of magnitude more efficient in terms of rendering speed compared to previous works. Furthermore, it produces state-of-the-art reconstruction quality for complex shapes under both 3D geometric and 2D image-space metrics.
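The rendering cost described above comes from how SDFs are usually rendered: sphere tracing advances each ray by the distance the SDF returns, so every step costs one SDF evaluation (one network forward pass in the neural case). The sketch below uses an analytic unit sphere as a stand-in for the network and counts evaluations per ray. It does not implement the octree-based representation, and all names and parameter defaults are hypothetical.

```python
import math


def sphere_sdf(p, radius=1.0):
    """Analytic stand-in for a neural SDF: a sphere centered at the origin."""
    return math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2) - radius


def sphere_trace(sdf, origin, direction, max_steps=64, eps=1e-4, t_max=100.0):
    """March a ray through an SDF; direction is assumed to be unit length.

    Returns (hit_distance or None, number_of_sdf_evaluations).
    """
    t = 0.0
    for step in range(1, max_steps + 1):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)  # one evaluation per step
        if dist < eps:
            return t, step
        # The SDF value is a safe step size: no surface is closer than dist.
        t += dist
        if t > t_max:
            return None, step
    return None, max_steps


# A ray fired straight at the sphere from three units away.
hit_t, evaluations = sphere_trace(sphere_sdf, (0.0, 0.0, -3.0), (0.0, 0.0, 1.0))
```

Rays that graze a surface take many small steps, so the per-pixel evaluation count varies widely. That is the cost a cheaper per-query representation aims to reduce.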
Program logics and semantics tell a pleasant story about sequential composition: when executing (S1; S2), we first execute S1 then S2. To improve performance, however, processors execute instructions out of order, and compilers reorder programs even more dramatically. By design, single-threaded systems cannot observe these reorderings; however, multiple-threaded systems can, making the story considerably less pleasant. A formal attempt to understand the resulting mess is known as a “relaxed memory model.” Prior models either fail to address sequential composition directly, or overly restrict processors and compilers, or permit nonsense thin-air behaviors which are unobservable in practice.
To support sequential composition while targeting modern hardware, we enrich the standard event-based approach with preconditions and families of predicate transformers. When calculating the meaning of (S1;S2), the predicate transformer applied to the precondition of an event e from S2 is chosen based on the set of events in S1 upon which e depends. We apply this approach to two existing memory models.