Apparatuses, systems, and techniques to generate an image. In at least one embodiment, one or more neural networks are to generate a second image based, at least in part, on a first image and information indicating zero or more differences between the first and second image.
G06V 20/40 - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING Scene-specific elements in video content
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
2.
PERFORMING SCRAMBLING AND/OR DESCRAMBLING ON PARALLEL COMPUTING ARCHITECTURES
Apparatuses, systems, and techniques to scramble or descramble data using a graphics processing unit (GPU). For example, in at least one embodiment, generation of a descrambling sequence is distributed among GPU threads for parallel calculation of the descrambling sequence, and/or the descrambling itself is distributed among GPU threads for parallel execution.
G06F 7/58 - Random or pseudo-random number generators
H04N 21/8352 - Generation of protective data, e.g. certificates involving content or source identification data, e.g. Unique Material Identifier [UMID]
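The parallel descrambling described above can be sketched as follows. This is only an illustrative model, not the patented method: the counter-based byte generator and the chunking scheme are assumptions standing in for the Gold-sequence generators a real 5G implementation would use, and Python worker threads stand in for GPU threads. The key property shown is that each thread computes its slice of the descrambling sequence independently from a seed and an index, with no dependence on other threads' output.

```python
from concurrent.futures import ThreadPoolExecutor

def sequence_element(seed: int, i: int) -> int:
    # Counter-based generator (illustrative): mix seed and index into one byte,
    # so element i can be computed without computing elements 0..i-1.
    x = (seed ^ (i * 0x9E3779B1)) & 0xFFFFFFFF
    x ^= x >> 16
    x = (x * 0x85EBCA6B) & 0xFFFFFFFF
    return (x >> 8) & 0xFF

def descramble(data: bytes, seed: int, num_threads: int = 4) -> bytes:
    # Each worker handles an independent index range of the sequence.
    def work(chunk):
        lo, hi = chunk
        return bytes(data[i] ^ sequence_element(seed, i) for i in range(lo, hi))
    step = max(1, (len(data) + num_threads - 1) // num_threads)
    chunks = [(i, min(i + step, len(data))) for i in range(0, len(data), step)]
    with ThreadPoolExecutor(num_threads) as pool:
        return b"".join(pool.map(work, chunks))
```

Because the sequence is applied by XOR, scrambling and descrambling are the same operation, so applying `descramble` twice with the same seed recovers the original data.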
3.
APPLICATION PROGRAMMING INTERFACE TO IDENTIFY LOCATION OF PROGRAM PORTIONS
Apparatuses, systems, and techniques to selectively load data required to use one or more functions. In at least one embodiment, selective loading for one or more functions to be used is performed by one or more application programming interfaces for efficient use of memory on a system comprising a processor and a graphics processor.
Apparatuses, systems, and techniques to cause data to be selectively stored in one or more memory locations. In at least one embodiment, a processor is to cause data to be selectively stored in one or more memory locations based, at least in part, on one or more threads to use the data.
Systems and methods for cooling a datacenter are disclosed. In at least one embodiment, a number of multi-dimensional column-based heat dissipation features enable cooling by a cooling media flowing therethrough, so that an individual heat dissipation column having a first dimension and a second dimension may be supported, with the first dimension normal to an axial flow path of the cooling media, the second dimension parallel or offset from parallel to the axial flow path, and the second dimension greater than the first dimension.
Various embodiments include techniques for managing cache memory in a computing system. The computing system includes a sectored cache memory that provides a mechanism for software applications to directly invalidate data items stored in the cache memory on a sector-by-sector basis, where a sector is smaller than a cache line. When all sectors in a cache line have been invalidated, the cache line is implicitly invalidated, freeing the cache line to be reallocated for other purposes. In cases where the data items to be invalidated can be aligned to sector boundaries, the disclosed techniques effectively use status indicators in the cache tag memory to track which sectors, and corresponding data items, have been invalidated by the software application. Thus, the disclosed techniques thereby enable a low-overhead solution for invalidating individual data items that are smaller than a cache line without additional tracking data structures or consuming additional memory transfer bandwidth.
G06F 12/0802 - Addressing of a memory level in which the access to the desired data or data blocks requires associative addressing means, e.g. caches
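The sector-granular invalidation scheme above can be sketched with a small model. The class below is an illustrative assumption, not the disclosed hardware: each cache line carries one valid bit per sector in its tag state, software clears sectors individually, and the line is implicitly freed once every sector valid bit is clear.

```python
class SectoredLine:
    """Illustrative model of a cache line split into sectors, each with its
    own valid bit; clearing the last sector implicitly invalidates the line."""

    def __init__(self, tag: int, num_sectors: int = 4):
        self.tag = tag
        self.valid = [True] * num_sectors  # per-sector status indicators

    def invalidate_sector(self, s: int) -> bool:
        # Software directly invalidates one sector; returns whether the
        # line as a whole is still valid afterward.
        self.valid[s] = False
        return self.line_valid()

    def line_valid(self) -> bool:
        # The line is valid while any sector remains valid.
        return any(self.valid)
```

No separate tracking structure is needed: the per-sector bits already present in the tag state carry all the information.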
8.
APPLICATION PROGRAMMING INTERFACE TO PERFORM OPERATION WITH REUSABLE THREAD
Apparatuses, systems, and techniques to perform collective operations using parallel processing. In at least one embodiment, a non-blocking application programming interface allows programs to improve performance of one or more collective operations on a GPU.
A target image corresponding to a novel view may be synthesized from two source images, corresponding source camera poses, and pixel attribute correspondences between the two source images. A particular object in the target image need only be visible in one of the two source images for successful synthesis. Each pixel in the target image is defined according to an identified pixel in one of the two source images. The identified source pixel provides attributes such as color, texture, and feature descriptors for the target pixel. The source and target camera poses are used to define geometric relationships for identifying the source pixels. In an embodiment, the pixel attribute correspondences are optical flow that defines movement of attributes from a first image of the two source images to a second image of the two source images.
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06T 7/70 - Determining position or orientation of objects or cameras
G06T 1/20 - Processor architectures; Processor configuration, e.g. pipelining
Apparatuses, systems, and techniques to perform collective operations using parallel processing. In at least one embodiment, a non-blocking application programming interface allows programs to improve performance of one or more collective operations on a GPU.
Apparatuses, systems, and techniques to selectively load data required to use one or more functions. In at least one embodiment, selective loading for one or more functions to be used is performed by one or more application programming interfaces for efficient use of memory on a system comprising a processor and a graphics processor.
Various techniques for accelerating dynamic programming algorithms are provided. For example, a fused addition and comparison instruction, a three-operand comparison instruction, and a two-operand comparison instruction are used to accelerate a Needleman-Wunsch algorithm that determines an optimized global alignment of subsequences over two entire sequences. In another example, the fused addition and comparison instruction is used in an innermost loop of a Floyd-Warshall algorithm to reduce the number of instructions required to determine shortest paths between pairs of vertices in a graph. In another example, a two-way single instruction multiple data (SIMD) floating point variant of the three-operand comparison instruction is used to reduce the number of instructions required to determine the median of an array of floating point values.
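The Floyd-Warshall use of the fused addition-and-comparison instruction described above can be sketched as follows. The `add_min` helper is a Python stand-in for the hardware instruction; in the disclosed approach the fusion happens in a single GPU instruction rather than a function call, but the placement in the innermost loop is the same.

```python
def add_min(a: float, b: float, c: float) -> float:
    # Stand-in for a fused add-and-compare instruction: min(a + b, c)
    # computed as one operation instead of separate add and compare steps.
    return min(a + b, c)

def floyd_warshall(dist):
    # All-pairs shortest paths; dist is an n x n matrix of edge weights,
    # with float("inf") for missing edges and 0 on the diagonal.
    n = len(dist)
    for k in range(n):
        for i in range(n):
            dik = dist[i][k]
            for j in range(n):  # innermost loop: one fused op per element
                dist[i][j] = add_min(dik, dist[k][j], dist[i][j])
    return dist
```

Replacing the separate add and compare with one fused operation halves the instruction count of the innermost loop, which dominates the algorithm's cost.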
Apparatuses, systems, and techniques to selectively load data required to use one or more functions. In at least one embodiment, selective loading for one or more functions to be used is performed by one or more application programming interfaces for efficient use of memory on a system comprising a processor and a graphics processor.
Apparatuses, systems, and techniques to select streams to run inference based at least in part on heuristics. In at least one embodiment, heuristics are generated based at least in part on information inferred using one or more machine learning models applied to streams.
Systems and methods for cooling a datacenter are disclosed. In at least one embodiment, a cold plate includes folded heat dissipation features to be cooled by at least one two-phase fluid, the folded heat dissipation features having first and second channels of different widths, first mechanical couplings joining top portions of the folded heat dissipation features to an upper section of the cold plate, and second mechanical couplings joining bottom portions of the folded heat dissipation features to a lower section of the cold plate.
Apparatuses, systems, and techniques to perform one or more APIs. In at least one embodiment, a processor is to perform an API to deselect storage selected to be used to transfer information between a plurality of fifth generation new radio (5G-NR) computing resources.
Apparatuses, systems, and techniques to perform one or more APIs. In at least one embodiment, a processor is to perform an API to store data in storage selected to be used to transfer information between a plurality of fifth generation new radio (5G-NR) computing resources using different transport protocols.
Apparatuses, systems, and techniques to perform one or more APIs. In at least one embodiment, a processor is to perform an API to obtain data from storage selected to be used to transfer information between a plurality of fifth generation new radio (5G-NR) computing resources.
Systems and methods implement a technique for altering the shape of the cells by shifting coordinates of points along cell boundaries using a set of periodic functions. To avoid having cell boundaries along the scene surfaces, wavelengths of those periodic functions are selected so they are not a multiple of an original discretization. The coordinates may be shifted along different axes of the cells and may generate different cells having a variety of different outlines to reduce a likelihood of a cell boundary being positioned along a scene boundary.
G06T 19/00 - Manipulating 3D models or images for computer graphics
G06T 7/586 - Depth or shape recovery from multiple images from multiple light sources, e.g. photometric stereo
G06F 30/20 - Design optimisation, verification or simulation
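The boundary-shifting idea above can be sketched with a single coordinate transform. The sine function, amplitude, and the choice of an irrational multiple of the cell size for the wavelength are illustrative assumptions; the point they demonstrate is that a periodic shift whose wavelength is not a multiple of the original discretization keeps shifted cell boundaries from repeatedly landing on the same axis-aligned scene surfaces.

```python
import math

def shifted_coordinate(x: float, cell_size: float = 1.0,
                       amplitude: float = 0.2) -> float:
    # Wavelength chosen as an irrational multiple of the cell size, so the
    # shift pattern never repeats exactly at cell-size intervals.
    wavelength = cell_size * math.e
    return x + amplitude * math.sin(2.0 * math.pi * x / wavelength)
```

Applying such a shift independently along each axis, with different wavelengths, yields cells with a variety of outlines, as the abstract describes.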
20.
REACTIVE INTERACTIONS FOR ROBOTIC APPLICATIONS AND OTHER AUTOMATED SYSTEMS
Approaches presented herein provide for predictive control of a robot or automated assembly in performing a specific task. A task to be performed may depend on the location and orientation of the robot performing that task. A predictive control system can determine a state of a physical environment at each of a series of time steps, and can select an appropriate location and orientation at each of those time steps. At individual time steps, an optimization process can determine a sequence of future motions or accelerations to be taken that comply with one or more constraints on that motion. For example, at individual time steps, a respective action in the sequence may be performed, then another motion sequence predicted for a next time step, which can help drive robot motion based upon predicted future motion and allow for quick reactions.
G05B 19/4155 - Numerical control (NC), i.e. automatically operating machines, in particular machine tools, e.g. in a manufacturing environment, so as to execute positioning, movement or co-ordinated operations by means of programme data, characterised by programme execution, i.e. part programme or machine function execution, e.g. selection of a programme
21.
HAZARD DETECTION USING OCCUPANCY GRIDS FOR AUTONOMOUS SYSTEMS AND APPLICATIONS
In various examples, a hazard detection system plots hazard indicators from multiple detection sensors to grid cells of an occupancy grid corresponding to a driving environment. For example, as the ego-machine travels along a roadway, one or more sensors of the ego-machine may capture sensor data representing the driving environment. A system of the ego-machine may then analyze the sensor data to determine the existence and/or location of the one or more hazards within an occupancy grid—and thus within the environment. When a hazard is detected using a respective sensor, the system may plot an indicator of the hazard to one or more grid cells that correspond to the detected location of the hazard. Based, at least in part, on a fused or combined confidence of the hazard indicators for each grid cell, the system may predict whether the corresponding grid cell is occupied by a hazard.
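The per-cell fusion step above can be sketched as follows. The fusion rule is an illustrative assumption: treating each sensor's confidence as an independent detection probability and combining them with a "noisy-or", then comparing against an illustrative 0.5 threshold. The abstract only requires some fused or combined confidence per grid cell.

```python
def fuse_cell(confidences: list[float]) -> float:
    # Noisy-or fusion: the cell is missed only if every sensor misses it.
    miss = 1.0
    for c in confidences:
        miss *= (1.0 - c)
    return 1.0 - miss

def cell_occupied(confidences: list[float], threshold: float = 0.5) -> bool:
    # Predict occupancy when the fused confidence clears the threshold.
    return fuse_cell(confidences) >= threshold
```

Two weak detections of the same cell from different sensors can thus cross the threshold even though neither would on its own, which is the benefit of plotting all sensors' hazard indicators into one grid.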
Various embodiments include techniques for utilizing resources on a processing unit. Thread groups executing on a processor begin execution with specified resources, such as a number of registers and an amount of shared memory. During execution, one or more thread groups may determine that the thread groups have excess resources needed to execute the current functions. Such thread groups can deallocate the excess resources to a free pool. Similarly, during execution, one or more thread groups may determine that the thread groups have fewer resources needed to execute the current functions. Such thread groups can allocate the needed resources from the free pool. Further, producer thread groups that generate data for consumer thread groups can deallocate excess resources prior to completion. The consumer thread groups can allocate the excess resources and initiate execution while the producer thread groups complete execution, thereby decreasing latency between producer and consumer thread groups.
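The allocate/deallocate flow above can be sketched with a shared free pool. This is an illustrative model, not the disclosed hardware: the lock stands in for whatever arbitration the processor uses, and the pool tracks a single resource type (e.g. registers) as a count.

```python
import threading

class ResourcePool:
    """Illustrative free pool shared by thread groups."""

    def __init__(self, total: int):
        self.free = total
        self.lock = threading.Lock()

    def deallocate(self, amount: int) -> None:
        # A thread group returns excess resources to the free pool,
        # e.g. a producer releasing registers before it finishes.
        with self.lock:
            self.free += amount

    def allocate(self, amount: int) -> bool:
        # A thread group claims resources from the pool if available,
        # e.g. a consumer starting while the producer still runs.
        with self.lock:
            if self.free >= amount:
                self.free -= amount
                return True
            return False
```

A producer calling `deallocate` before completion lets a consumer's `allocate` succeed earlier, which is the latency reduction the abstract describes.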
Apparatuses, systems, and techniques to perform one or more APIs. In at least one embodiment, a processor is to perform an API to prevent deselection of storage to be used to transfer information between a plurality of fifth generation new radio (5G-NR) computing resources using different transport protocols.
In various embodiments, a comparison circuit compares voltages within an integrated circuit. The comparison circuit includes a comparison capacitor, an inverter, and multiple switches. A first terminal of the comparison capacitor is coupled to both a first terminal of a first switch and a first terminal of a second switch. A second terminal of the comparison capacitor is coupled to both a first terminal of a third switch and an input of the inverter. An output of the inverter is coupled to both a second terminal of the third switch and a first terminal of a fourth switch. A second terminal of the fourth switch is coupled to a first terminal of a fifth switch and a first output of the comparison circuit. At least a portion of the switches are turned on during a comparison mode and are turned off during a reset mode.
H03K 5/24 - Circuits having more than one input and one output for comparing pulses or pulse trains with each other according to input signal characteristics, e.g. slope, integral, the characteristic being amplitude
25.
APPLICATION PROGRAMMING INTERFACE TO SELECT STORAGE
Apparatuses, systems, and techniques to perform one or more APIs. In at least one embodiment, a processor is to perform an API to transfer information between a plurality of fifth generation new radio (5G-NR) computing resources using different transport protocols.
Apparatuses, systems, and techniques to cool computing devices. In at least one embodiment, a system includes a heatsink including one or more connector pins to laterally couple the heatsink to one or more computing devices.
Apparatuses, systems, and techniques are presented to recognize speech in an audio signal. In particular, various embodiments can indicate an end of one or more speech segments based, at least in part, on one or more characters predicted to be within these one or more speech segments.
Data bits are encoded in an eleven bit seven pulse amplitude modulated three-level (PAM-3) symbol format on a plurality of data channels and two auxiliary data channels, and one or more of a cyclic redundancy check (CRC) value, a poison value, and a severity value are encoded as PAM-3 symbols on an error correction channel.
G06F 11/10 - Error detection or correction by redundancy in data representation, e.g. by using checking codes, by adding special digits or symbols to the data expressed according to a code, e.g. parity check, casting out 9s or 11s
G06F 11/07 - Responding to the occurrence of a fault, e.g. fault tolerance
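The eleven-bit, seven-symbol packing above works because 3**7 = 2187 >= 2**11 = 2048, so seven three-level symbols can carry any 11-bit value. The plain base-3 mapping below demonstrates the counting argument; the actual symbol mapping used on the wire is not specified by the abstract and may differ.

```python
def encode_pam3(bits11: int) -> list[int]:
    # Pack an 11-bit value into 7 ternary symbols, reported as levels
    # -1, 0, +1 (least significant base-3 digit first).
    assert 0 <= bits11 < 2**11
    symbols = []
    for _ in range(7):
        symbols.append(bits11 % 3 - 1)  # map digit {0,1,2} -> level {-1,0,+1}
        bits11 //= 3
    return symbols

def decode_pam3(symbols: list[int]) -> int:
    # Invert the mapping: rebuild the base-3 value from the levels.
    value = 0
    for s in reversed(symbols):
        value = value * 3 + (s + 1)
    return value
```

The 139 leftover codewords (2187 - 2048) are spare capacity a real line code could spend on control or error-signaling symbols.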
29.
NON-RECTANGULAR MATRIX COMPUTATIONS AND DATA PATTERN PROCESSING USING TENSOR CORES
Matrix multiplication operations can be implemented, at least in part, on one or more tensor cores of a parallel processing unit. An efficiency of the matrix multiplication operations can be improved in cases where one of the input operands or the output operand of the matrix multiplication operation is a square matrix having a triangular data pattern. In such cases, the number of computations performed by the tensor cores of the parallel processing unit can be reduced by dropping computations and/or masking out elements of the square matrix input operand on one side of the main diagonal of the square matrix. In other cases where the output operand exhibits the triangular data pattern, computations can be dropped or masked out for the invalid side of the main diagonal of the square matrix. In an embodiment, a library implementing the matrix multiplication operations is provided.
G06F 7/483 - Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers
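The work-dropping idea above can be sketched for the case where the output of C = A x B is known to be lower-triangular. The plain-Python kernel is only an illustration of the masking logic; the disclosed approach performs the equivalent masking and computation-dropping on tensor cores.

```python
def tril_matmul(a, b):
    # Matrix multiply whose output is known lower-triangular: computations
    # for elements above the main diagonal are dropped entirely rather than
    # computed and discarded.
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):  # only j <= i: the valid (lower) triangle
            c[i][j] = sum(a[i][k] * b[k][j] for k in range(n))
    return c
```

Roughly half the multiply-accumulate work disappears, which mirrors the efficiency gain the abstract claims for triangular data patterns.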
30.
ENVIRONMENT RECONSTRUCTION AND PATH PLANNING FOR AUTONOMOUS SYSTEMS AND APPLICATIONS
Approaches for environment reconstruction and path planning for autonomous machine systems and applications are described. An iterative volumetric mapping function for an ego-machine may compute a distance field, and from the distance field derive a cost map representing a volumetric reconstruction of the physical environment around the ego-machine. The cost map may be used for collision avoidance and path planning. The iterative volumetric mapping function may also optionally compute a color integration map and visualization mesh from the distance field that can be used for visualization of the physical environment around the ego-machine. The cost map may be computed as a Euclidean Signed Distance Field (ESDF) and the distance field from which the cost map is computed may include a Truncated Signed Distance Field (TSDF). The distance field, cost map, color integration map and visualization mesh may each be stored in memory as maps of a plurality of map layers.
G06T 17/20 - Wire-frame description, e.g. polygonalisation or tessellation
G06V 20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
G06T 7/50 - Depth or shape recovery
G06T 1/20 - Processor architectures; Processor configuration, e.g. pipelining
B60W 30/09 - Taking automatic action to avoid collision, e.g. by braking or steering
B60W 60/00 - Drive control systems specially adapted for autonomous road vehicles
G06F 16/22 - Indexing; Data structures therefor; Storage structures
31.
CONFIDENTIAL COMPUTING USING MULTI-INSTANCING OF PARALLEL PROCESSORS
In examples, trusted execution environments (TEE) are provided for an instance of a parallel processing unit (PPU) as PPU TEEs. Different instances of a PPU correspond to different PPU TEEs, and provide accelerated confidential computing to a corresponding TEE. The processors of each PPU instance have separate and isolated paths through the memory system of the PPU which are assigned uniquely to an individual PPU instance. Data in device memory of the PPU may be isolated and access controlled amongst the PPU instances using one or more hardware firewalls. A GPU hypervisor assigns hardware resources to runtimes and performs access control and context switching for the runtimes. A PPU instance uses a cryptographic key to protect data for secure communication. Compute engines of the PPU instance are prevented from writing outside of a protected memory region. Access to a write protected region in PPU memory is blocked from other computing devices and/or device instances.
G06F 9/455 - Arrangements for executing specific programs Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
32.
SIMULATING PHYSICAL INTERACTIONS FOR AUTOMATED SYSTEMS
Approaches presented herein provide for simulation of human motion for human-robot interactions, such as may involve a handover of an object. Motion capture can be performed for a hand grasping and moving an object to a location and orientation appropriate for a handover, without a need for a robot to be present or an actual handover to occur. This motion data can be used to separately model the hand and the object for use in a handover simulation, where a component such as a physics engine may be used to ensure realistic modeling of the motion or behavior. During a simulation, a robot control model or algorithm can predict an optimal location and orientation to grasp an object, and an optimal path to move to that location and orientation, using a control model or algorithm trained, based at least in part, using the motion models for the hand and object.
In various examples, image space coordinates of an image from a video may be labeled, projected to determine 3D vehicle space coordinates, then transformed to 3D world space coordinates using known 3D world space coordinates and relative positioning between the coordinate spaces. For example, 3D vehicle space coordinates may be temporally correlated with known 3D world space coordinates measured while capturing the video. The known 3D world space coordinates and known relative positioning between the coordinate spaces may be used to offset or otherwise define a transform for the 3D vehicle space coordinates to world space. Resultant 3D world space coordinates may be used for one or more labeled frames to generate ground truth data. For example, 3D world space coordinates for left and right lane lines from multiple frames may be used to define lane lines for any given frame.
G06V 20/56 - Context or environment of the image exterior to a vehicle from on-board sensors
G06V 10/774 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration and reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation; Bootstrap methods, e.g. "bagging" or "boosting"
G06V 10/94 - Hardware or software architectures specially adapted for image or video understanding
G06T 7/73 - Determining position or orientation of objects or cameras using feature-based methods
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Apparatuses, systems, and techniques to predict a probability of an error or anomaly in processing units, such as those of a data center. In at least one embodiment, the probability of an error occurring in a processing unit is identified using multiple trained machine learning models, in which the trained machine learning models each output, for example, the probability of an error occurring within a different predetermined time period.
Approaches provide for performance of a complex (e.g., compound) task that may involve multiple discrete tasks not obvious from an instruction to perform the complex task. A set of conditions for an environment can be determined using captured image data, and the instruction analyzed to determine a set of final conditions to exist in the environment after performance of the instruction. These initial and end conditions are used to determine a sequence of discrete tasks to be performed to cause a robot or automated device to perform the instruction. This can involve use of a symbolic or visual planner in at least some embodiments, as well as a search of possible sequences of actions available for the robot or automated device. A robot can be caused to perform the sequence of discrete tasks, and feedback provided such that the sequence of tasks can be modified as appropriate.
G05B 19/4155 - Numerical control (NC), i.e. automatically operating machines, in particular machine tools, e.g. in a manufacturing environment, so as to execute positioning, movement or co-ordinated operations by means of programme data, characterised by programme execution, i.e. part programme or machine function execution, e.g. selection of a programme
Various embodiments include a system for generating performance monitoring data in a computing system. The system includes a unit level counter with a set of counters, where each counter increments during each clock cycle in which a corresponding electronic signal is at a first state, such as a high or low logic level state. Periodically, the unit level counter transmits the counter values to a corresponding counter collection unit. The counter collection unit includes a set of counters that aggregates the values of the counters in multiple unit level counters. Based on certain trigger conditions, the counter collection unit transmits records to a reduction channel. The reduction channel includes a set of counters that aggregates the values of the counters in multiple counter collection units. Each virtual machine executing on the system can access a different corresponding reduction channel, providing secure performance metric data for each virtual machine.
Approaches in accordance with various embodiments can perform spatial hash map updates while ensuring the atomicity of the updates for arbitrary data structures. A hash map can be generated for a dataset where entries in the hash map may correspond to multiple independent values, such as pixels of an image to be rendered. Update requests for independent values may be received on multiple concurrent threads, but change requests for independent values corresponding to a hash map entry can be aggregated from a buffer and processed iteratively in a single thread for a given hash map entry. In the case of multi-resolution spatial hashing where data can be stored at various discretization levels, this operation can be repeated to propagate changes from one level to another.
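The aggregate-then-apply pattern above can be sketched as follows. The dictionary stands in for the spatial hash map and the additive update rule is an illustrative assumption; the property demonstrated is that buffered requests from many concurrent threads are grouped by entry, so each entry is mutated by exactly one logical writer and the update stays atomic per entry.

```python
from collections import defaultdict

def apply_buffered_updates(hash_map: dict, buffered_requests):
    # buffered_requests: iterable of (key, value) change requests that
    # arrived on concurrent threads and were parked in a buffer.
    per_entry = defaultdict(list)
    for key, value in buffered_requests:
        per_entry[key].append(value)       # aggregate requests by entry
    for key, values in per_entry.items():  # single writer per entry
        hash_map[key] = hash_map.get(key, 0.0) + sum(values)
    return hash_map
```

For multi-resolution hashing, this pass would simply be repeated per discretization level to propagate changes from one level to the next.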
A mapper within a single-level memory system may facilitate memory localization to reduce the energy and latency of memory accesses within the single-level memory system. The mapper may translate a memory request received from a processor for implementation at a data storage entity, where the translating identifies a data storage entity and a starting location within the data storage entity where the data associated with the memory request is located. This data storage entity may be co-located with the processor that sent the request, which may enable the localization of memory and significantly improve the performance of memory usage by reducing an energy of data access and increasing data bandwidth.
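The translation step above can be sketched with a toy address map. The interleaved layout, entity count, and entity size are illustrative assumptions; the point is that the mapper turns a flat address into a (data storage entity, starting location) pair, which is what lets requests be routed to an entity co-located with the requesting processor.

```python
def translate(address: int, num_entities: int = 4,
              entity_size: int = 1024) -> tuple[int, int]:
    # Identify which data storage entity holds the address and the
    # starting location (offset) within that entity.
    entity = (address // entity_size) % num_entities
    offset = address % entity_size
    return entity, offset
```

A locality-aware allocator would then place a processor's hot data at addresses that this map sends to the entity physically nearest that processor.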
In examples, a parallel processing unit (PPU) operates within a trusted execution environment (TEE) implemented using a central processing unit (CPU). A virtual machine (VM) executing within the TEE is provided access to the PPU by a hypervisor. However, data of an application executed by the VM is inaccessible to the hypervisor and other untrusted entities outside of the TEE. To protect the data in transit, the VM and the PPU may encrypt or decrypt the data for secure communication between the devices. To protect the data within the PPU, a protected memory region may be created in PPU memory where compute engines of the PPU are prevented from writing outside of the protected memory region. A write protect memory region is generated where access to the PPU memory is blocked from other computing devices and/or device instances.
G06F 9/455 - Arrangements for executing specific programs Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
G06F 21/57 - Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
40.
APPLICATION PROGRAMMING INTERFACE TO SELECT STORAGE
Apparatuses, systems, and techniques to perform one or more APIs. In at least one embodiment, a processor is to perform an API to select storage to be used to transfer information between a plurality of fifth generation new radio (5G-NR) computing resources using different transport protocols.
H04L 67/1097 - Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
A hardware-based traversal coprocessor provides acceleration of tree traversal operations searching for intersections between primitives represented in a tree data structure and a ray. The primitives may include triangles used in generating a virtual scene. The hardware-based traversal coprocessor is configured to properly handle numerically challenging computations at or near edges and/or vertices of primitives and/or ensure that a single intersection is reported when a ray intersects a surface formed by primitives at or near edges and/or vertices of the primitives.
Apparatuses, systems, and techniques to generate a robust representation of an image. In at least one embodiment, input tokens of an input image are received, and an inference about the input image is generated based on a vision transformer (ViT) system comprising at least one self-attention module to perform token mixing and a channel self-attention module to perform channel processing.
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 10/77 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration and reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
G06V 10/778 - Active pattern-learning, e.g. online learning of image or video features
In one embodiment, a system receives pixel data from a pair of regions of an image generated by an imaging device, the pair of regions includes a first region and a second region, where the first region includes a first plurality of pixels and the second region includes a second plurality of pixels. The system determines a plurality of pixel pairs, where a pixel pair includes a first pixel from the first plurality of pixels and a second pixel from the second plurality of pixels. The system calculates a plurality of contrasts based on the plurality of pixel pairs. The system determines a contrast distribution based on the plurality of contrasts. The system calculates a value representative of a capability of the imaging device to detect contrast based on the contrast distribution. The system determines a reduction in contrast detectability of the imaging device based on the value.
G05D 1/02 - Control of position or course in two dimensions
G05D 1/00 - Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. automatic pilot
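The contrast-detectability pipeline described above (pixel pairs across two regions, per-pair contrasts, a contrast distribution, and a single summary value) can be sketched in a few lines of Python; the Michelson-style contrast and the mean as summary statistic are illustrative assumptions, not the claimed method:

```python
import itertools
import statistics

def contrast_detectability(region_a, region_b, threshold=0.05):
    # Form pixel pairs across the two regions.
    pairs = itertools.product(region_a, region_b)
    # One contrast per pair (Michelson-style contrast is an assumption here).
    contrasts = [abs(a - b) / (a + b) for a, b in pairs if a + b > 0]
    # Reduce the contrast distribution to a single value; the mean is one
    # simple choice of summary statistic.
    value = statistics.mean(contrasts)
    # A value below the threshold indicates reduced contrast detectability.
    return value, value < threshold

value, degraded = contrast_detectability([100, 110, 105], [90, 95, 92])
```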
Various approaches are disclosed to temporally and spatially filter noisy image data—generated using one or more ray-tracing effects—in a graphically rendered image. Rather than fully sampling data values using spatial filters, the data values may be sparsely sampled using filter taps within the spatial filters. To account for the sparse sampling, locations of filter taps may be jittered spatially and/or temporally. For filtering efficiency, a size of a spatial filter may be reduced when historical data values are used to temporally filter pixels. Further, data values filtered using a temporal filter may be clamped to avoid ghosting. For further filtering efficiency, a spatial filter may be applied as a separable filter in which the filtering for a filter direction may be performed over multiple iterations using reducing filter widths, decreasing the chance of visual artifacts when the spatial filter does not follow a true Gaussian distribution.
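As a rough illustration of the sparse, jittered filter taps and the history clamping described above (the tap layout, jitter scheme, and list-of-lists images are all simplifying assumptions):

```python
import random

def sparse_spatial_filter(image, x, y, tap_offsets, jitter=1, seed=0):
    # Sparsely sample the spatial filter: only the given taps are read,
    # and each tap location is jittered to compensate for the sparsity.
    rng = random.Random(seed)
    h, w = len(image), len(image[0])
    total, count = 0.0, 0
    for dx, dy in tap_offsets:
        jx = x + dx + rng.randint(-jitter, jitter)
        jy = y + dy + rng.randint(-jitter, jitter)
        if 0 <= jx < w and 0 <= jy < h:
            total += image[jy][jx]
            count += 1
    return total / count if count else image[y][x]

def clamp_history(history_value, neighborhood):
    # Clamp a temporally filtered value to the current neighborhood's
    # range to avoid ghosting from stale history.
    return max(min(history_value, max(neighborhood)), min(neighborhood))
```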
A three-dimensional (3D) object reconstruction neural network system learns to predict a 3D shape representation of an object from a video that includes the object. The 3D reconstruction technique may be used for content creation, such as generation of 3D characters for games, movies, and 3D printing. When 3D characters are generated from video, the content may also include motion of the character, as predicted based on the video. The 3D object reconstruction technique exploits temporal consistency to reconstruct a dynamic 3D representation of the object from an unlabeled video. Specifically, an object in a video has a consistent shape and consistent texture across multiple frames. Texture, base shape, and part correspondence invariance constraints may be applied to fine-tune the neural network system. The reconstruction technique generalizes well—particularly for non-rigid objects.
In various examples, a method includes computing a current keyframe, the current keyframe being representative of an area around an autonomous vehicle at a current time based on map data. The method includes transforming a preceding keyframe to a coordinate frame of the autonomous vehicle at a first time prior to completing computation of the current keyframe to generate a first world model frame. The method includes transforming the preceding keyframe to the coordinate frame of the autonomous vehicle at a second time after the first time and prior to completing computation of the current keyframe to generate a second world model frame.
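Transforming a keyframe into the vehicle's coordinate frame reduces, in the planar case, to a rigid 2D transform; the following sketch assumes keyframe content as (x, y) points and the vehicle pose as a position plus a scalar heading:

```python
import math

def to_vehicle_frame(points, vehicle_xy, vehicle_heading):
    # Transform world-space keyframe points into the vehicle's coordinate
    # frame at a given time: translate by the vehicle position, then rotate
    # by the negative heading. Real world-model frames carry far richer
    # map content than bare points.
    vx, vy = vehicle_xy
    c, s = math.cos(-vehicle_heading), math.sin(-vehicle_heading)
    out = []
    for px, py in points:
        tx, ty = px - vx, py - vy                       # translate
        out.append((c * tx - s * ty, s * tx + c * ty))  # rotate
    return out
```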
This specification describes a programmatic multicast technique enabling one thread (for example, in a cooperative group array (CGA) on a GPU) to request data on behalf of one or more other threads (for example, executing on respective processor cores of the GPU). The multicast is supported by tracking circuitry that interfaces between multicast requests received from processor cores and the available memory. The multicast is designed to reduce cache (for example, layer 2 cache) bandwidth utilization enabling strong scaling and smaller tile sizes.
G06F 13/16 - Handling requests for interconnection or transfer for access to memory bus
H04L 49/101 - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION Packet switching elements characterised by the switching fabric construction using a crossbar or matrix
G06T 1/20 - Processor architectures; Processor configuration, e.g. pipelining
A new level of hierarchy—Cooperative Group Arrays (CGAs)—and an associated new hardware-based work distribution/execution model are described. A CGA is a grid of thread blocks (also referred to as cooperative thread arrays (CTAs)). CGAs provide co-scheduling, e.g., control over where CTAs are placed/executed in a processor (such as a GPU), relative to the memory required by an application and relative to each other. Hardware support for such CGAs guarantees concurrency and enables applications to see more data locality, reduced latency, and better synchronization between all the threads in tightly cooperating collections of CTAs programmably distributed across different (e.g., hierarchical) hardware domains or partitions.
Processing hardware of a processor is virtualized to provide a façade between a consistent programming interface and specific hardware instances. Hardware processor components can be permanently or temporarily disabled when not needed to support the consistent programming interface and/or to balance hardware processing across a hardware arrangement such as an integrated circuit. Executing software can be migrated from one hardware arrangement to another without need to reset the hardware.
A processor supports new thread group hierarchies by centralizing work distribution to provide hardware-guaranteed concurrent execution of thread groups in a thread group array through speculative launch and load balancing across processing cores. Efficiencies are realized by distributing grid rasterization among the processing cores.
Apparatuses, systems, and techniques to generate a robust representation of an image. Input tokens (104) of an input image are received, and an inference (110) about the input image is generated based on a vision transformer (ViT) system comprising at least one self-attention (106) module to perform token mixing and a channel self-attention (108) module to perform channel processing.
Apparatuses, systems, and techniques for supporting fairness of multiple contexts sharing cryptographic hardware. An accelerator circuit includes a copy engine (CE) with AES-GCM hardware configured to perform both encryption and authentication of data transfers for multiple applications, or for multiple data streams in a single application or belonging to a single user. The CE splits a data transfer of a specified size into a set of partial transfers. The CE sequentially executes the set of partial transfers using a context for a period of time (e.g., a timeslice) for an application. The CE stores, in a secure memory for the application, one or more data for encryption or decryption (e.g., a hash key, a block counter, etc.) computed from a last partial transfer. The one or more data for encryption or decryption are retrieved and used when data transfers for the application are resumed by the CE.
G06F 21/79 - Protection of specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer, to assure secure storage of data in semiconductor storage media, e.g. directly-addressable memories
H04L 9/06 - Arrangements for secret or secure communications; Network security protocols; the encryption apparatus using shift registers or memories for block-wise coding, e.g. DES systems
G06F 13/16 - Handling requests for interconnection or transfer for access to memory bus
G06F 13/28 - Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access, cycle steal
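The timesliced partial-transfer scheme used by the copy engine above can be illustrated as follows; the context dictionary fields (`next_part`, `block_counter`) are stand-ins for the per-application state the CE saves in secure memory, not its actual layout:

```python
def split_transfer(total_size, chunk_size):
    # Split a transfer of total_size bytes into (offset, length) partial
    # transfers, mirroring how one large transfer is timesliced.
    parts, offset = [], 0
    while offset < total_size:
        length = min(chunk_size, total_size - offset)
        parts.append((offset, length))
        offset += length
    return parts

def run_timeslice(parts, context, budget):
    # Execute up to `budget` partial transfers, updating the saved context
    # so a later timeslice can resume where this one stopped.
    done = 0
    while context["next_part"] < len(parts) and done < budget:
        _, length = parts[context["next_part"]]
        context["block_counter"] += length // 16  # AES blocks are 16 bytes
        context["next_part"] += 1
        done += 1
    return context

parts = split_transfer(100, 32)
context = {"next_part": 0, "block_counter": 0}
run_timeslice(parts, context, budget=2)  # first timeslice, then suspend
```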
In various examples, map data or geospatial data is used to identify a subset of sensor data having a higher likelihood of including representations of a target object of interest from a larger set of sensor data. Feature vectors corresponding to the subset of sensor data may then be compared to template feature vectors corresponding to the target object in order to confirm the depiction of the target object in the sensor data. The identified sensor data may be used to train one or more machine learning models to compute outputs that correspond to object identification. The trained machine learning models may be used to identify objects in order to aid an autonomous or semi-autonomous machine in a surrounding environment.
A parallel processing unit comprises a plurality of processors, each coupled to memory access hardware circuitry. Each memory access hardware circuitry is configured to receive, from the coupled processor, a memory access request specifying a coordinate of a multidimensional data structure, wherein the memory access hardware circuitry is one of a plurality of such circuits, each coupled to a respective one of the processors; and, in response to the memory access request, to translate the coordinate of the multidimensional data structure into plural memory addresses and, using the plural memory addresses, to asynchronously transfer at least a portion of the multidimensional data structure for processing by at least the coupled processor. The memory locations may be in the shared memory of the coupled processor and/or in an external memory.
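In the row-major 2D case, translating a tile coordinate into the plural linear addresses might look like the following sketch (parameter names are illustrative; the hardware works from a richer descriptor):

```python
def tile_addresses(base, row_stride, tile_origin, tile_shape, elem_size=4):
    # Expand one (row, col) tile coordinate into every linear byte address
    # the tile covers, assuming a row-major layout with `row_stride`
    # elements per row.
    addresses = []
    for r in range(tile_shape[0]):
        for c in range(tile_shape[1]):
            row, col = tile_origin[0] + r, tile_origin[1] + c
            addresses.append(base + (row * row_stride + col) * elem_size)
    return addresses
```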
A new transaction barrier synchronization primitive enables executing threads and asynchronous transactions to synchronize across parallel processors. The asynchronous transactions may include transactions resulting from, for example, hardware data movement units such as direct memory units, etc. A hardware synchronization circuit may provide for the synchronization primitive to be stored in a cache memory so that barrier operations may be accelerated by the circuit. A new wait mechanism reduces software overhead associated with waiting on a barrier.
This specification describes techniques for implementing matrix multiply and add (MMA) operations in graphics processing units (GPUs) and other processors. The implementations provide for a plurality of warps of threads to collaborate in generating the result matrix by enabling each thread to share its respective register files to be accessed by the datapaths associated with other threads in the group of warps. A state machine circuit controls MMA execution among the warps executing on asynchronous computation units. A group MMA (GMMA) instruction provides for a descriptor to be passed as a parameter, where the descriptor may include information regarding the size and formats of input data to be loaded into shared memory and/or the datapath.
G06F 9/30 - Arrangements for executing machine instructions, e.g. instruction decode
G06F 7/544 - Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary or decimal representation, using unspecified devices for evaluation of functions by calculation
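The semantics being accelerated are simply D = A × B + C; a plain-Python reference, with none of the warp-level collaboration, helps pin them down:

```python
def mma(A, B, C):
    # Matrix multiply-add D = A @ B + C over lists of lists — the core
    # computation the GMMA instruction accelerates across warps.
    m, k, n = len(A), len(B), len(B[0])
    return [[C[i][j] + sum(A[i][p] * B[p][j] for p in range(k))
             for j in range(n)] for i in range(m)]
```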
58.
METHOD AND APPARATUS FOR EFFICIENT ACCESS TO MULTIDIMENSIONAL DATA STRUCTURES AND/OR OTHER LARGE DATA BLOCKS
A parallel processing unit comprises a plurality of processors, each coupled to memory access hardware circuitry. Each memory access hardware circuitry is configured to receive, from the coupled processor, a memory access request specifying a coordinate of a multidimensional data structure, wherein the memory access hardware circuitry is one of a plurality of such circuits, each coupled to a respective one of the processors; and, in response to the memory access request, to translate the coordinate of the multidimensional data structure into plural memory addresses and, using the plural memory addresses, to asynchronously transfer at least a portion of the multidimensional data structure for processing by at least the coupled processor. The memory locations may be in the shared memory of the coupled processor and/or in an external memory.
G06F 12/0875 - Addressing a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches, with dedicated cache, e.g. instruction or stack
Distributed shared memory (DSMEM) comprises blocks of memory that are distributed or scattered across a processor (such as a GPU). Threads executing on a processing core local to one memory block are able to access a memory block local to a different processing core. In one embodiment, shared access to these DSMEM allocations distributed across a collection of processing cores is implemented by communications between the processing cores. Such distributed shared memory provides very low latency memory access for processing cores located in proximity to the memory blocks, and also provides a way for more distant processing cores to access the memory blocks in a manner, and using interconnects, that do not interfere with the processing cores' access to main or global memory such as that backed by an L2 cache. Such distributed shared memory supports cooperative parallelism and strong scaling across multiple processing cores by permitting data sharing and communications previously possible only within the same processing core.
One or more machine learning models (MLMs) may learn implicit 3D representations of geometry of an object and of dynamics of the object from performing an action on the object. Implicit neural representations may be used to reconstruct high-fidelity full geometry of the object and predict a flow-based dynamics field from one or more images, which may provide a partial view of the object. Correspondences between locations of an object may be learned based at least on distances between the locations on a surface corresponding to the object, such as geodesic distances. The distances may be incorporated into a contrastive learning loss function to train one or more MLMs to learn correspondences between locations of the object, such as a correspondence embedding field. The correspondences may be used to evaluate state changes when evaluating one or more actions that may be performed on the object.
G06T 17/10 - Volume description, e.g. cylinders, cubes or using CSG [constructive solid geometry]
G06T 19/20 - Manipulating 3D models or images for computer graphics; Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
G06N 20/20 - Ensemble learning techniques in machine learning
During testing of a circuit design, an adaptive clock model and a voltage noise model are utilized within the computer-implemented testing environment in order to determine the dynamic effects of voltage variation and adaptive clocking on the timing of the circuit design. A hybrid stage that incorporates both a graph-based approach and a path-based approach may also be incorporated into the testing environment in order to maximize the performance of the testing of the circuit design.
G06F 30/367 - Design verification, e.g. using simulation, simulation program with integrated circuit emphasis [SPICE], direct methods or relaxation
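A minimal graph-based timing stage propagates latest arrival times through the circuit graph, as sketched below; a path-based stage would then re-examine the worst paths more precisely. Node names and delays are illustrative:

```python
def arrival_times(fanin, delays):
    # Graph-based static timing sketch: the latest arrival time at a node
    # is the max over its fan-in arrivals plus the edge delay; nodes with
    # no fan-in (primary inputs) arrive at time 0.
    arrival = {}

    def at(node):
        if node not in arrival:
            preds = fanin.get(node, [])
            arrival[node] = max((at(p) + delays[(p, node)] for p in preds),
                                default=0.0)
        return arrival[node]

    for node in set(fanin) | {n for ps in fanin.values() for n in ps}:
        at(node)
    return arrival

fanin = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}
delays = {("a", "b"): 1.0, ("a", "c"): 2.0, ("b", "d"): 3.0, ("c", "d"): 1.0}
times = arrival_times(fanin, delays)
```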
62.
PARALLEL MASK RULE CHECKING ON EVOLVING MASK SHAPES IN OPTICAL PROXIMITY CORRECTION FLOWS
Embodiments of the present disclosure relate to parallel mask rule checking on evolving mask shapes in optical proximity correction (OPC) flows for integrated circuit designs. Systems and methods are disclosed that perform mask (manufacturing) rule checks (MRC) in parallel, sharing information to maintain symmetry when violations are corrected. In an embodiment the shared information is also used to minimize changes to the geometric area of proposed mask shapes resulting from the OPC. In contrast to conventional systems, MRC is performed for multiple edges in parallel, sharing information between the different edges to encourage symmetry. In an embodiment, all edges may be adjusted in parallel to reduce mask-edge traversal bias.
G03F 1/36 - Masks having proximity correction features; Preparation thereof, e.g. optical proximity correction [OPC] design processes
In a method for encryption of sensitive data, an encrypted user private key is received in a Trusted Execution Environment (TEE) in a worker node in a container management system, the encrypted user private key being an encrypted version of a user private key for decrypting a message from a user in the container management system. The user private key is obtained in the TEE by decrypting the encrypted user private key with a provider private key that is received from an encryption manager for managing the container management system. The user private key may thus be transmitted to the worker node safely, such that the worker node may use the user private key to decrypt messages from the user. Therefore, the security level of the container management system may be increased.
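The key-unwrapping flow can be illustrated with a toy XOR keystream standing in for real provider-key decryption; a production system would use an authenticated cipher such as AES-GCM, and the key values here are placeholders:

```python
import hashlib

def keystream_xor(key: bytes, data: bytes) -> bytes:
    # Toy symmetric cipher: XOR the data with a SHA-256-derived keystream.
    # This is NOT secure cryptography — it only illustrates the wrap/unwrap
    # flow, since XOR with the same keystream inverts itself.
    stream = hashlib.sha256(key).digest()
    stream = (stream * (len(data) // len(stream) + 1))[:len(data)]
    return bytes(a ^ b for a, b in zip(data, stream))

provider_key = b"provider-secret"   # held by the encryption manager
user_key = b"user-private-key"      # the key to be delivered to the TEE
# Outside the TEE, only the wrapped (encrypted) user key is visible.
wrapped_user_key = keystream_xor(provider_key, user_key)
# Inside the TEE: unwrap with the provider key, then decrypt user messages.
unwrapped = keystream_xor(provider_key, wrapped_user_key)
message = keystream_xor(unwrapped, keystream_xor(user_key, b"hello worker"))
```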
A datacenter liquid cooling system is disclosed. A processor detects whether one or more liquid distribution units (LDUs) of the datacenter cooling system can be deactivated based on one or more workloads of one or more servers cooled by the datacenter cooling system.
Embodiments of the present disclosure are directed to apparatuses, systems, and techniques of offloading shader program compilation at a computing system. A detection is made that a set of shader programs is to be compiled for an application executing at a computing system using a first set of processing devices. A second set of processing devices to compile the set of shader programs is identified. Each of the second set of processing devices is different from any processing device of the first set of processing devices. The set of shader programs is provided for compilation using the second set of processing devices in view of state data associated with the computing system to obtain a set of compiled shader programs. The set of compiled shader programs is executed using the first set of processing devices.
Apparatuses, systems, and techniques to perform one or more APIs. In at least one embodiment, a processor is to perform an API to indicate a number of 5G-NR cells that are able to be performed concurrently by one or more processors; a processor is to perform an API to indicate whether one or more processors are able to perform a first number of 5G-NR cells concurrently; a processor comprising one or more circuits is to perform an API to indicate whether one or more resources of one or more processors are allocated to perform 5G-NR cells; and/or a processor comprises one or more circuits to perform an API to indicate one or more techniques to be used by one or more processors in performing one or more 5G-NR cells.
The technology disclosed herein involves using a machine learning model (e.g., CNN) to expand lower dynamic-range image content (e.g., SDR images) into higher dynamic-range image content (e.g., HDR images). The machine learning model can take as input the lower dynamic-range image and can output multiple expansion maps that are used to make the expanded image appear more natural. The expansion maps may be used by image operators to smooth color banding and to dim overexposed regions or user interface elements in the expanded image. The expanded content (e.g., HDR image content) may then be provided to one or more devices for display or storage.
Apparatuses, systems, and techniques to allocate memory based on a part of a sequence of items. In at least one embodiment, memory is allocated based on the size of a sliding window used to analyze images with neural networks.
Systems and methods relate to the determination of accurate motion vectors, for rendering situations such as a noisy Monte Carlo integration where image object surfaces are at least partially translucent. To optimize the search for “real world” positions, this invention defines the background as first path vertices visible through multiple layers of refractive interfaces. To find matching world positions, the background is treated as a single layer morphing in a chaotic way, permitting the optimized algorithm to be executed only once. Further improving performance over the prior linear gradient descent, the present techniques can apply a cross function and numerical optimization, such as Newton's quadratic target or other convergence function, to locate pixels via a vector angle minimization. Determined motion vectors can then serve as input for services including image denoising.
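The numerical-optimization step can be illustrated with scalar Newton iteration; the quadratic objective below is a stand-in for the actual vector-angle cross function:

```python
def newton_minimize(f, df, d2f, x0, iters=20):
    # Newton's method on a scalar objective: step by f'(x) / f''(x)
    # until the step size is negligible.
    x = x0
    for _ in range(iters):
        step = df(x) / d2f(x)
        x -= step
        if abs(step) < 1e-12:
            break
    return x

# Minimize f(x) = (x - 3)^2 + 1; the minimum is at x = 3.
xmin = newton_minimize(lambda x: (x - 3) ** 2 + 1,
                       lambda x: 2 * (x - 3),
                       lambda x: 2.0,
                       x0=0.0)
```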
Systems and methods for cooling a datacenter are disclosed. In at least one embodiment, one or more outlet reservoirs are associated with a stabilizing subsystem and a rack, so that the outlet reservoirs can receive two-phase fluid output from a plurality of cold plates of the rack, and so that the stabilizing subsystem can stabilize a quality factor of the two-phase fluid to a predetermined quality factor before heat is removed from the two-phase fluid and it is cycled back to the cold plates.
A first intermediate representation of a first portion of source code implementing an application and a second intermediate representation of a second portion of the source code are received by a processing device. The first intermediate representation and the second intermediate representation are merged, at run-time, into a merged intermediate representation, wherein the first intermediate representation includes a reference to a function in the second intermediate representation. An execution flow transfer instruction within the merged intermediate representation is identified based on a run-time value of a parameter of the application. The execution flow transfer instruction references the function. A set of executable instructions implementing the function is identified within the merged intermediate representation. The execution flow transfer instruction is replaced with a copy of the set of executable instructions implementing the function.
G06F 9/455 - Arrangements for executing specific programs; Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
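Replacing an execution flow transfer instruction with a copy of the callee's instructions can be sketched over a toy list-of-tuples IR (instruction and function names are illustrative):

```python
def inline_calls(merged_ir, functions):
    # Walk the merged IR; wherever a "call" instruction references a known
    # function, splice a copy of that function's instructions in its place.
    out = []
    for instr in merged_ir:
        if instr[0] == "call" and instr[1] in functions:
            out.extend(functions[instr[1]])  # copy of the callee's body
        else:
            out.append(instr)
    return out

functions = {"f": [("load", "r1"), ("add", "r1", 1)]}
merged = [("mov", "r0", 0), ("call", "f"), ("ret",)]
flat = inline_calls(merged, functions)
```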
According to various embodiments, a processing subsystem includes: a processor mounted on a first printed circuit board that is oriented parallel to a first plane; a heat sink thermally coupled to the processor; a second printed circuit board that is communicatively coupled to the first printed circuit board and oriented parallel to a second plane, wherein the second plane is not parallel with the first plane; and at least one cooling fan that is positioned to direct a cooling fluid through the heat sink in a direction parallel to the first plane.
Disclosed are apparatuses, systems, and techniques that may perform methods of pyramid optical flow processing with efficient identification and handling of object boundary pixels. In pyramid optical flow, motion vectors for pixels of image layers having a coarse resolution may be used as hints for identification of motion vectors for pixels of image layers having a higher resolution. Pixels that are located near apparent boundaries between foreground and background objects may receive multiple hints from lower-resolution image layers, for more accurate identification of matching pixels across different image levels of the pyramid.
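Propagating coarse-level motion vectors as hints to the next finer pyramid level can be sketched as follows; each fine pixel simply inherits its coarse parent's doubled vector, whereas real boundary handling gathers several hints per pixel:

```python
def upscale_hints(coarse_flow):
    # Upscale a coarse motion-vector field to the next finer pyramid level:
    # resolution doubles, so each inherited vector is doubled as well.
    h, w = len(coarse_flow), len(coarse_flow[0])
    fine = [[None] * (2 * w) for _ in range(2 * h)]
    for y in range(2 * h):
        for x in range(2 * w):
            dx, dy = coarse_flow[y // 2][x // 2]
            fine[y][x] = (2 * dx, 2 * dy)
    return fine

fine = upscale_hints([[(1, 0), (0, 2)]])
```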
In various examples, methods and systems are provided for estimating depth values for images (e.g., from a monocular sequence). Disclosed approaches may define a search space of potential pixel matches between two images using one or more depth hypothesis planes based at least on a camera pose associated with one or more cameras used to generate the images. A machine learning model(s) may use this search space to predict likelihoods of correspondence between one or more pixels in the images. The predicted likelihoods may be used to compute depth values for one or more of the images. The predicted depth values may be transmitted and used by a machine to perform one or more operations.
G06T 7/55 - Depth or shape recovery from multiple images
G06T 7/70 - Determining position or orientation of objects or cameras
G06V 10/46 - Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
Apparatuses, systems, and methods to perform diagnostic evaluation of data center data. In at least one embodiment, one or more processors determine one or more diagnostic results based, at least in part, on a trained application model that receives homomorphically encrypted log data and is executed with the homomorphically encrypted log data.
A manipulation task may include operations performed by one or more manipulation entities on one or more objects. This manipulation task may be broken down into a plurality of sequential sub-tasks (policies). These policies may be fine-tuned so that a terminal state distribution of a given policy matches an initial state distribution of another policy that immediately follows the given policy within the plurality of policies. The fine-tuned plurality of policies may then be chained together and implemented within a manipulation environment.
Apparatuses, systems, and techniques to perform one or more APIs. In at least one embodiment, a processor is to perform an API to indicate a number of 5G-NR cells that are able to be performed concurrently by one or more processors; a processor is to perform an API to indicate whether one or more processors are able to perform a first number of 5G-NR cells concurrently; a processor comprising one or more circuits is to perform an API to indicate whether one or more resources of one or more processors are allocated to perform 5G-NR cells; and/or a processor comprises one or more circuits to perform an API to indicate one or more techniques to be used by one or more processors in performing one or more 5G-NR cells.
H04L 67/61 - Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources, taking into account quality of service [QoS] or priority requirements
H04W 24/02 - Arrangements for optimising operational condition
H04L 67/63 - Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources, whereby a service request is routed depending on the content or context of the request
80.
INTERFACING FLOW CONTROLLERS FOR DATACENTER COOLING SYSTEMS
Systems and methods for cooling a datacenter are disclosed. In at least one embodiment, a first interfacing flow controller includes a sensor and is associated with a first server tray of a rack, so that the first interfacing flow controller can receive sensor inputs and can communicate with a second interfacing flow controller over a communication line between them. The second interfacing flow controller can be associated with a coolant distribution unit (CDU) to cause a balance of coolant flow to be provided from the CDU to one or more second server trays, based in part on a change in coolant flow to the first server tray as indicated by the sensor inputs.
Disclosed are apparatuses, systems, and techniques to perform and facilitate secure ladder computational operations whose iterative execution depends on secret values associated with input data. Disclosed embodiments balance execution of various iterations in a way that is balanced for different secret values, significantly reducing vulnerability of ladder computations to adversarial side-channel attacks.
Disclosed are apparatuses, systems, and techniques that may perform methods of pyramid optical flow processing with efficient identification and handling of object boundary pixels. In pyramid optical flow, motion vectors for pixels of image layers having a coarse resolution may be used as hints for identification of motion vectors for pixels of image layers having a higher resolution. Pixels that are located near apparent boundaries between foreground and background objects may receive multiple hints from lower-resolution image layers, for more accurate identification of matching pixels across different image levels of the pyramid.
In various examples, a multi-sensor fusion machine learning model – such as a deep neural network (DNN) – may be deployed to fuse data from a plurality of individual machine learning models. As such, the multi-sensor fusion network may use outputs from a plurality of machine learning models as input to generate a fused output that represents data from fields of view or sensory fields of each of the sensors supplying the machine learning models, while accounting for learned associations between boundary or overlap regions of the various fields of view of the source sensors. In this way, the fused output may be less likely to include duplicate, inaccurate, or noisy data with respect to objects or features in the environment, as the fusion network may be trained to account for multiple instances of a same object appearing in different input representations.
G06V 20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
Automated detection of events in content can be performed using regions of information associated with various user interface or display elements. Certain elements can be indicative of a type of event, and regions associated with these elements can be analyzed on a per-frame basis. If one of these primary regions shows a state or transition that is indicative of one of these events, one or more secondary regions can be analyzed as well to attempt to verify whether that event occurred, as well as whether that event qualifies for selection for additional use. Selected events can be used for purposes such as to generate highlight montages, training videos, or user profiles. These events may be positioned at different layers of an event hierarchy, where child regions are only analyzed for frames where a parent region is indicative of a type of event.
A63F 13/77 - Game security or game management aspects involving data related to game devices or game servers, e.g. configuration data, software version or amount of memory
A63F 13/537 - Controlling the output signals based on the game progress involving additional visual information provided to the game scene, e.g. by overlay to simulate a head-up display [HUD] or to display a laser sight in a shooting game, using indicators, e.g. showing the condition of a game character on screen
H04N 21/478 - Supplemental services, e.g. displaying phone caller identification or shopping application
H04N 21/439 - Processing of audio elementary streams
85.
MOTION GENERATION USING ONE OR MORE NEURAL NETWORKS
Apparatuses, systems, and techniques are presented to generate one or more images. In at least one embodiment, one or more neural networks are used to generate one or more images of one or more objects based, at least in part, on a model of the one or more objects and texture information.
A time-to-digital converter (TDC) circuit includes self-referenced delay cell circuits each including: a first inverter coupled with a second inverter, the first inverter receiving a positive time signal representative of an incoming up signal; a third inverter coupled with a fourth inverter, the third inverter receiving a negative time signal representative of an incoming down signal; a first bank of capacitors coupled to a first node between the first and second inverters; and a second bank of capacitors coupled to a second node between the third and fourth inverters. Control logic generates first control signals, each with an up value, to selectively control the first bank of capacitors, and second control signals, each with a down value, to selectively control the second bank of capacitors. The up values vary relative to the down values across the first control signals and the second control signals.
H03L 7/089 - Automatic control of frequency or phase; Synchronisation using a reference signal which is applied to a frequency- or phase-locked loop - Details of the phase-locked loop concerning mainly the arrangement for detection of phase or frequency, including filtering or amplification of its output signal, the phase or frequency detector generating up-down pulses
H03L 7/099 - Automatic control of frequency or phase; Synchronisation using a reference signal which is applied to a frequency- or phase-locked loop - Details of the phase-locked loop concerning mainly the controlled oscillator of the loop
87.
TEXTURE TRANSFER AND SYNTHESIS USING ALIGNED MAPS IN IMAGE GENERATION SYSTEMS AND APPLICATIONS
Approaches presented herein can utilize a network that learns to embed three-dimensional (3D) coordinates on a surface of one or more 3D shapes into an aligned two-dimensional (2D) texture space, where corresponding parts of different 3D shapes can be mapped to the same location in a texture image. Alignment can be performed using a texture alignment module that generates a set of basis images for synthesizing textures. A trained network can generate a basis shared by all shape textures, and can predict input-specific coefficients to construct the output texture for each shape as a linear combination of the basis images, then deform the texture to match the pose of the input. Such an approach can ensure alignment of textures, even in situations with at least somewhat limited network capacity. To unwrap shapes of complex structure or topology, a masking network can be utilized that cuts the shape into multiple pieces to reduce the distortion in the 2D mapping.
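The core synthesis step above, constructing each shape's texture as a linear combination of shared basis images, can be sketched numerically. In the described system the basis and coefficients come from trained networks; here both are plain arrays for illustration.

```python
# Hedged sketch: output texture = sum_k coefficients[k] * basis[k],
# i.e. a per-shape weighted combination of a basis shared by all shapes.

import numpy as np

def synthesize_texture(basis, coefficients):
    """basis: (K, H, W, C) stack of K shared basis images.
    coefficients: (K,) input-specific weights predicted for one shape.
    Returns: (H, W, C) synthesized texture image."""
    return np.tensordot(coefficients, basis, axes=1)
```

A subsequent deformation step (not shown) would warp this texture to match the pose of the input shape.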
A method for forming a printed circuit board includes: forming on a substrate a first conductive layer for a first edge connector pin and a first conductive layer for a second edge connector pin, wherein the first conductive layer for the first edge connector pin and the first conductive layer for the second edge connector pin are electrically coupled to one another via a first conductive layer for an electrical bridging element; electroplating a second conductive layer onto both the first conductive layer for the first edge connector pin and the first conductive layer for the second edge connector pin via a plating current conductor; and removing at least a portion of the electrical bridging element to electrically separate the first edge connector pin from the second edge connector pin.
A plurality of virtual processing units associated with a physical processing unit is identified. Each of the plurality of virtual processing units is associated with a virtual machine of a plurality of virtual machines that run on respective virtual processing units in round-robin order using respective assigned execution time periods. For a first virtual machine of the plurality of virtual machines, a first overhead time value associated with running the first virtual machine on a first virtual processing unit of the plurality of virtual processing units is obtained. For a second virtual machine of the plurality of virtual machines, a second overhead time value associated with running the second virtual machine on a second virtual processing unit of the plurality of virtual processing units is obtained. The first overhead time value and the second overhead time value are compared. Based on the comparing, it is determined whether the second overhead time value satisfies a compensation threshold criterion. Responsive to determining that the second overhead time value satisfies the compensation threshold criterion, the running of the second virtual machine is repeated prior to running any other of the plurality of virtual machines.
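The round-robin-with-compensation scheme above can be sketched as follows. The specific threshold rule (overhead exceeding the previous VM's overhead by a fixed amount) is an assumption for illustration; the abstract only specifies that a compensation threshold criterion is evaluated from the comparison.

```python
# Hedged sketch: VMs run in round-robin order; after each slice, the VM's
# observed overhead is compared with the previously run VM's overhead, and
# if the difference meets the (assumed) threshold, that VM runs again
# before the rotation advances to any other VM.

def schedule_round(vms, run_vm, threshold):
    """vms: list of VM identifiers, in round-robin order.
    run_vm(vm) -> overhead time observed while running that VM's slice.
    Returns the actual execution order, including compensating repeats."""
    order = []
    prev_overhead = None
    for vm in vms:
        overhead = run_vm(vm)
        order.append(vm)
        if prev_overhead is not None and overhead - prev_overhead >= threshold:
            run_vm(vm)          # compensating repeat, before any other VM runs
            order.append(vm)
        prev_overhead = overhead
    return order
```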
G06F 9/455 - Arrangements for executing specific programs Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
G06F 9/48 - Program initiating; Program switching, e.g. by interrupt
A computer-based system and method for sending data packets over a data network may include: preparing data packets and packet descriptors on one or more graphics processing units (GPUs); associating packets with a packet descriptor, which may determine a desired transmission time of the packets associated with that descriptor; receiving an indication of a clock time; and physically transmitting packets via an output interface, at a clock time corresponding to the desired transmission time. A computer-based system and method for GPU-initiated communication over a 5G data network may include allocating one or more memory buffers in GPU memory; performing at least one 5G signal processing procedure by a GPU; preparing descriptors for a plurality of packets, where each packet includes allocated memory buffers, and where the descriptors provide scheduling instructions for the packets; and triggering the sending of packets over the network based on prepared descriptors.
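The descriptor-scheduled transmission above can be sketched as a release loop: each descriptor carries a desired transmission time, and a packet is handed to the output interface once the clock reaches that time. The names (`transmit`, `send`) and the heap-based queue are illustrative assumptions, not an actual GPU/NIC API.

```python
# Hedged sketch: packets are held in a priority queue keyed by desired
# transmission time; on each clock reading, all packets whose time has
# arrived are physically transmitted via the output interface.

import heapq

def transmit(descriptors, clock_ticks, send):
    """descriptors: list of (desired_time, packet) pairs.
    clock_ticks: iterable of monotonically increasing clock readings.
    send(packet, time): callback that physically transmits the packet."""
    pending = list(descriptors)
    heapq.heapify(pending)                 # earliest desired time first
    for now in clock_ticks:
        while pending and pending[0][0] <= now:
            _, packet = heapq.heappop(pending)
            send(packet, now)              # transmit at its desired time
```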
Approaches presented herein provide for a framework to integrate human provided feedback in natural language to update a robot planning cost or value. The natural language feedback may be modeled as a cost or value associated with completing a task assigned to the robot. This cost or value may then be added to an initial task cost or value to update one or more actions to be performed by the robot. The framework can be applied to both real work and simulated environments where the robot may receive instructions, in natural language, that either provide a goal, modify an existing goal, or provide constraints to actions to achieve an existing goal.
Apparatuses, systems, and techniques are presented to generate one or more images. One or more neural networks are used to generate one or more images of one or more objects based, at least in part, on a model of the one or more objects and texture information.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
93.
CIRCUIT STRUCTURES TO MEASURE FLIP-FLOP TIMING CHARACTERISTICS
A ring oscillator circuit has a frequency that is sensitive to the timing of a clock-to-Q (clk2Q) propagation delay of one or more flip-flops utilized in the ring oscillator. The clk2Q delay is the delay between the clock signal arriving at the clock pin of the flip-flop and the Q output reflecting the state of the input data signal to the flip-flop. Clk2Q delay measurements are made based on measurement of the ring oscillator frequency, leading to more accurate estimates of clk2Q for different types of flip-flops and flip-flop combinations, which may in turn enable improvements in circuit layouts, performance, and area.
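The frequency-to-delay extraction can be sketched arithmetically: if the flop contributes its clk2Q delay once per oscillation lap, clk2Q can be estimated by comparing the flop-bearing ring's period against a reference ring built from the same combinational stages. This is an illustrative model under that assumption, not the patented measurement circuit.

```python
# Hedged sketch: clk2Q estimated as the period difference between a ring
# oscillator containing the flip-flop and an otherwise-matched reference
# ring, divided by the number of clk2Q traversals per lap.

def estimate_clk2q(f_ring_hz, f_ref_hz, traversals_per_lap=1):
    """f_ring_hz: measured frequency of the ring containing the flip-flop.
    f_ref_hz: measured frequency of the matched reference ring.
    Returns the estimated clk2Q delay in seconds."""
    period_ring = 1.0 / f_ring_hz
    period_ref = 1.0 / f_ref_hz
    return (period_ring - period_ref) / traversals_per_lap
```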
In various examples, the decoding and upscaling capabilities of a client device are analyzed to determine encoding parameters and operations used by a content streaming server to generate encoded video streams. The quality of the upscaled content of the client device may be monitored by the streaming servers such that the encoding parameters may be updated based on the monitored quality. In this way, the encoding operations of one or more streaming servers may be more effectively matched to the decoding and upscaling abilities of one or more client devices such that an increased number of client devices may be served by the streaming servers.
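The quality-driven parameter update above can be sketched as a feedback rule: when the client's monitored upscale quality is high, the server can encode at lower resolution and let the client's upscaler do more work; when quality drops, it encodes more detail. The specific metric, thresholds, and step size here are assumptions for illustration.

```python
# Hedged sketch: adjust the server's encode height from a monitored
# upscale-quality score reported for the client's output.

def update_encode_height(current_height, upscale_quality,
                         lo=0.80, hi=0.95, step=180,
                         min_height=360, max_height=1080):
    """upscale_quality: score in [0, 1] for the client's upscaled output.
    Returns the encode height (pixels) to use for the next segment."""
    if upscale_quality < lo:
        # Quality too low: encode more detail server-side.
        current_height = min(max_height, current_height + step)
    elif upscale_quality > hi:
        # Quality comfortably high: offload more work to the client upscaler.
        current_height = max(min_height, current_height - step)
    return current_height
```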
In various examples, a diagnostic circuit is connected to a target system to automatically trigger the target system to enter a diagnostic mode. The diagnostic circuit receives diagnostic data from the target system when the target system performs a diagnostic operation in the diagnostic mode.
Embodiments of the present disclosure relate to memory stacked on a processor for high bandwidth. Systems and methods are disclosed for providing a one-level memory for a processing system by stacking bulk memory on a processor die. In an embodiment, one or more memory dies are stacked on the processor die. The processor die includes multiple processing tiles, where each tile includes a processing unit, mapper, and tile network. Each memory die includes multiple memory tiles. The processing tile is coupled to each memory tile that is above or below the processing tile. The vertically aligned memory tiles comprise the local memory block for the processing tile. The ratio of memory bandwidth (bytes) to floating-point operations (B:F) may improve by 50× for accessing the local memory block compared with conventional memory. Additionally, the energy consumed to transfer each bit may be reduced by 10×.
H01L 25/065 - Assemblies consisting of a plurality of semiconductor or other solid-state devices, the devices being all of a type provided for in the same subgroup of groups , or in a single subclass of , , e.g. assemblies of rectifier diodes, the devices not having separate containers, the devices being of a type provided for in the group
A computer-based system and method for sending data packets over a data network may include: preparing data packets and packet descriptors on one or more graphics processing units (GPUs); associating packets with a packet descriptor, which may determine a desired transmission time of the packets associated with that descriptor; receiving an indication of a clock time; and physically transmitting packets via an output interface, at a clock time corresponding to the desired transmission time. A computer-based system and method for GPU-initiated communication over a 5G data network may include allocating one or more memory buffers in GPU memory; performing at least one 5G signal processing procedure by a GPU; preparing descriptors for a plurality of packets, where each packet includes allocated memory buffers, and where the descriptors provide scheduling instructions for the packets; and triggering the sending of packets over the network based on prepared descriptors.
Systems and methods for operating a datacenter are disclosed. In at least one embodiment, an apparatus comprises a controller to control a proportion of coolant provided by an air cooling unit and a liquid cooling unit based, at least in part, on temperature of one or more electronic components.
A transceiver circuit includes a receiver front end utilizing a ring oscillator, and a transmitter front end utilizing a pass-gate circuit in a first feedback path across a last-stage driver circuit. The transceiver circuit provides low impedance at low frequency and high impedance at high frequency, and desirable peaking behavior.
A method dynamically selects one of a first sampling order and a second sampling order for a ray trace of pixels in a tile where the selection is based on a motion vector for the tile. The sampling order may be a bowtie pattern or an hourglass pattern.
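The per-tile selection above can be sketched as a simple decision rule on the tile's motion vector. The abstract does not specify how the vector maps to a pattern, so the dominant-axis rule below is an assumption for illustration; the two patterns are represented only by name.

```python
# Hedged sketch: choose between the bowtie and hourglass sampling orders
# for a tile based on its motion vector (assumed rule: horizontal-dominant
# motion -> bowtie, otherwise hourglass).

def select_sampling_order(motion_vector):
    """motion_vector: (mx, my) motion vector for the tile.
    Returns 'bowtie' or 'hourglass'."""
    mx, my = motion_vector
    return "bowtie" if abs(mx) >= abs(my) else "hourglass"
```

The selected order would then determine the sequence in which the tile's pixels are ray traced.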
H04N 19/132 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding Sampling, masking or truncating of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
H04N 19/423 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation, characterised by memory arrangements
H04N 19/182 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the unit, i.e. the structural or semantic part of the video signal being the object or the subject of the adaptive coding, the unit being a pixel