According to one implementation of the present disclosure, a method includes: receiving, by a hardware design generation circuit, a plurality of input signals of a software workload on a processing unit; training a power prediction model based on a toggling of the input signals accumulated over a training interval range; determining, by the hardware design generation circuit, a plurality of prediction proxies and respective weightings for the plurality of prediction proxies based at least partially on the trained power prediction model, wherein the plurality of weighted prediction proxies correspond to a power output of the hardware design generation circuit; and generating an updated circuit design of the processing unit based on the power output.
G06F 30/327 - Synthèse logique; Synthèse de comportement, p.ex. logique de correspondance, langage de description de matériel [HDL] à liste d’interconnections [Netlist], langage de haut niveau à langage de transfert entre registres [RTL] ou liste d’interconnections [Netlist]
G06F 30/27 - Optimisation, vérification ou simulation de l’objet conçu utilisant l’apprentissage automatique, p.ex. l’intelligence artificielle, les réseaux neuronaux, les machines à support de vecteur [MSV] ou l’apprentissage d’un modèle
G06F 30/3308 - Vérification de la conception, p.ex. simulation fonctionnelle ou vérification du modèle par simulation
2.
A DATA PROCESSING APPARATUS AND METHOD FOR ADDRESS TRANSLATION
An apparatus and method are provided for storing a plurality of translation entries in a cache, each translation entry corresponding to one of a plurality of page table entries and defining a translation between a first address and a second address, and encoding control information indicative of an attribute of each page table entry; returning, in response to a lookup querying a first lookup address, a corresponding second address when the first lookup address corresponds to one of the plurality of translation entries stored in the cache; modifying at least some of the control information in response to notification of a modification of the attribute in a page table entry; and retaining in the cache at least one translation entry corresponding to the page table entry for use in a subsequent address lookup querying a corresponding first lookup address in response to the notification of the modification of the attribute in the page table entry.
G06F 12/1009 - Traduction d'adresses avec tables de pages, p.ex. structures de table de page
G06F 12/1027 - Traduction d'adresses utilisant des moyens de traduction d’adresse associatifs ou pseudo-associatifs, p.ex. un répertoire de pages actives [TLB]
There is provided a graphics primitive assembly circuit comprising an early primitive assembly data generator operable to supply primitive input to a shader and a buffer operable to store early primitive assembly data during operation of the shader and to supply the early primitive assembly data to a late primitive assembly circuit element responsive to completion of operation of the shader. The circuit may also include a compressor that compresses the early primitive assembly data to reduce the amount of storage taken up by the buffer and the bandwidth required to transfer the early primitive assembly data.
A data processing apparatus includes control flow prediction circuitry that generates a control flow prediction in respect of a group of one or more instructions. Storage circuitry used by the control flow prediction circuitry stores data in association with groups of instructions used to generate the control flow prediction for each of the groups of instructions. Control flow prediction update circuitry inserts new data into the storage circuitry in association with a new group of one or more instructions in dependence on one or more conditions being met when a miss occurs for the group of one or more instructions in the storage circuitry.
A multiple-outer-product instruction specifies multiple first source vector operands, at least one second source vector operand and correlation information associated with the second source vector operand(s), each vector operand comprising multiple data elements and the correlation information indicating, for each data element of a given second source vector operand, a corresponding first source vector operand. In response to the multiple-outer-product instruction, instruction decoder circuitry (50) controls processing circuitry (60) to perform computations to implement outer product operations, the outer product operations comprising, for a given first source vector operand, performing an associated outer product of that first source vector operand with a subset of data elements of the second source vector operand(s). The processing circuitry selects, for each data element of the second source vector operand(s), a corresponding first source vector operand to be used when performing the associated outer product operation, in dependence on the correlation information.
A method of operating a network-reachable initiator computing entity, comprising: determining a requirement, at the initiator computing entity, for offloading data processing; initiating a message, to be sent on a network to a destination, requesting a response from network-reachable recipient computing entities along the network, from the network-reachable initiator computing entity to the destination, indicating a capacity to perform offloaded data processing; and receiving a response from a first network-reachable recipient computing entity indicating a capacity to perform offloaded data processing.
H04L 67/289 - Traitement intermédiaire fonctionnellement situé à proximité de l'application consommatrice de données, p.ex. dans la même machine, dans le même domicile ou dans le même sous-réseau
G06F 9/50 - Allocation de ressources, p.ex. de l'unité centrale de traitement [UCT]
H04L 67/51 - Découverte ou gestion de ceux-ci, p.ex. protocole de localisation de service [SLP] ou services du Web
H04L 67/59 - Approvisionnement des services mandataires en fournissant une assistance opérationnelle aux appareils terminaux par déchargement dans le réseau ou par émulation, p.ex. lorsqu'ils ne sont pas disponibles
H04L 43/10 - Surveillance active, p.ex. battement de cœur, utilitaire Ping ou trace-route
7.
ISSUING A SEQUENCE OF INSTRUCTIONS INCLUDING A CONDITION-DEPENDENT INSTRUCTION
An apparatus, method and computer program, the apparatus comprising processing circuitry to execute instructions, issue circuitry to issue the instructions for execution by the processing circuitry, and candidate instruction storage circuitry to store a plurality of condition-dependent instructions, each specifying at least one condition. The issue circuitry is configured to issue a given condition-dependent instruction in response to a determination or a prediction of the at least one condition specified by the given condition-dependent instruction being met, and when the given condition-dependent instruction is a sequence-start instruction, the issue circuitry is responsive to the determination or prediction to issue a sequence of instructions comprising the sequence-start instruction and at least one subsequent instruction.
An apparatus has processing circuitry with execution units to perform operations, physical registers to store data, and forwarding circuitry to forward the data from the physical registers to the execution units. The forwarding circuitry provides an incomplete set of connections between the physical registers and the execution units such that, for each of at least some of the physical registers, the physical register is connected to only a subset of the execution units. The apparatus also has register renaming circuitry to map logical registers identified by the operations to respective physical registers and register reorganisation circuitry to monitor upcoming operations and to determine, based on the upcoming operations and the connections provided by the forwarding circuitry, whether to perform a register reorganisation procedure to change a mapping between the logical registers and the physical registers. The register reorganisation circuitry is also configured to perform the register reorganisation procedure.
A data processing system that comprises a processing unit and a communications bus over which bus transactions to access memory can be performed is disclosed. The system includes a codec, and the processing unit can initiate over the communications bus, bus transactions that comprise the codec accessing the memory.
G06F 13/16 - Gestion de demandes d'interconnexion ou de transfert pour l'accès au bus de mémoire
10.
AN APPARATUS, A METHOD OF OPERATING AN APPARATUS, AND A NON-TRANSITORY COMPUTER READABLE MEDIUM TO STORE COMPUTER-READABLE CODE FOR FABRICATION OF AN APPARATUS
There is provided an apparatus provided with counter control circuitry to maintain counters associated with data items including: minor counters, middle counters, and a major counter. The apparatus is also provided with a memory protection unit configured, in response to a transfer of a data item from secure storage to off-chip storage, to modify a minor counter associated with the data item, and to encrypt the data item based on counters associated with the data item. The memory protection unit is also responsive to an overflowing minor counter, to perform a middle re-encryption process comprising modifying a middle counter associated with the data item and re-encrypting data items associated with the middle counter. The memory protection unit is also responsive to an overflowing middle counter, to perform a major re- encryption process comprising modifying the major counter, and re-encrypting each of the data items.
G06F 12/14 - Protection contre l'utilisation non autorisée de mémoire
G06F 21/64 - Protection de l’intégrité des données, p.ex. par sommes de contrôle, certificats ou signatures
G06F 21/79 - Protection de composants spécifiques internes ou périphériques, où la protection d'un composant mène à la protection de tout le calculateur pour assurer la sécurité du stockage de données dans les supports de stockage à semi-conducteurs, p.ex. les mémoires adressables directement
An apparatus comprises an instruction decoder 20 and processing circuitry 22. Monitoring circuitry 36 monitors one or more events indicative of a potential update to data associated with any of a monitored set of addresses, and makes accessible to software executing on the processing circuitry 22 a monitoring reporting indication indicative of whether any events has occurred for at least one of the monitored set of addresses. In response to decoding of an exclusive status setting instruction specifying a given address, the processing circuitry 22 sets an exclusive status associated with the given address. The exclusive status is cleared in response to detecting an event indicative of a conflicting memory access to the given address. In response to decoding of a monitor exclusive instruction, the processing circuitry 22: determines whether the exclusive status is associated with a target address, and if so allocates the target address to be one of the monitored set of addresses.
Key capability storage circuitry 90 is provided to store a key capability specifying key bounds indicating information indicative of permissible bounds for information specified by any one or more of: a non-capability operand, a capability, or the key capability itself. For a given software compartment executed by the processing circuitry, which lacks a key capability operating privilege associated with at least a portion of the key capability storage circuitry, the processing circuitry is configured to prohibit certain manipulations of the key capability, including a transfer between key capability storage and a memory location selected by the given software compartment. This can help to support temporal safety.
G06F 21/80 - Protection de composants spécifiques internes ou périphériques, où la protection d'un composant mène à la protection de tout le calculateur pour assurer la sécurité du stockage de données dans les supports de stockage magnétique ou optique, p.ex. disques avec secteurs
G06F 21/52 - Contrôle des usagers, programmes ou dispositifs de préservation de l’intégrité des plates-formes, p.ex. des processeurs, des micrologiciels ou des systèmes d’exploitation au stade de l’exécution du programme, p.ex. intégrité de la pile, débordement de tampon ou prévention d'effacement involontaire de données
Data communication apparatus comprises a receiver comprising message receiver circuitry to receive payload messages and sender control messages from message sender circuitry, the message receiver circuitry comprising: communication circuitry to send receiver control messages to the message sender circuitry, the receiver control messages relating to actions by the message receiver circuitry in response to payload messages or sender control messages from the message sender circuitry; in which the communication circuitry is configured to selectively associate a respective indication with at least some of the receiver control messages sent to the message sender circuitry, the indication indicating whether a given receiver control message with which the indication is associated is a first receiver control message sent by the communication circuitry to the message sender circuitry after a reset of circuitry in the receiver.
There is provided an apparatus, method and medium for data processing. The apparatus comprises a register file comprising a plurality of data registers, and frontend circuitry responsive to an issued instruction, to control processing circuitry to perform a processing operation to process an input data item to generate an output data item. The processing circuitry is responsive to a first encoding of the issued instruction specifying a data register, to read the input data item from the data register, and/or write the output data item to the data register. The processing circuitry is responsive to a second encoding of the issued instruction specifying a buffer-region of the register file for storing a queue of data items, to perform the processing operation and to perform a dequeue operation to dequeue the input data item from the queue, and/or perform an enqueue operation to enqueue the output data item to the queue.
One or more triggered-instruction processing elements are provided, a given triggered-instruction processing element comprising execution circuitry to execute processing operations in response to instructions according to a triggered instruction architecture. Input channel processing circuitry receives a given tagged data item (comprising a data value and a tag value) for a given input channel, and in response controls enqueuing of the data value of the given tagged data item to a selected buffer structure selected from among at least two buffer structures mapped onto register storage accessible to one or more of the triggered-instruction processing elements in response to a computation instruction for controlling performance of a computation operation. The selected buffer structure is selected based at least on the tag value, so data values of tagged data items specifying different tag values for the given input channel are allocatable to different buffer structures.
A computer-implemented method of operating a device is provided. The method comprises operating a sensor to capture a data input, individuating an element of the data input, tagging an individuated element with metadata, matching the metadata with an associated permission set, and applying a restricting function defined in the associated permission set to the individuated element during a process flow to produce augmented reality output data restricted as required by the associated permission set. A device is also provided, comprising a sensor, an individuating component to individuate an element of sensor data from the sensor, a tagging component to tag the individuated element, a matching component to match a tag of the individuated element with a permission of a permission set, and a restricting function component to restrict an application's interaction with the individuated element.
Apparatus, methods, and software for protecting a plurality of memory locations are disclosed. Logical addresses are translated into physical addresses in dependence on one of a first translation function and a second translation function. A transitional logical address and an associated transitional value are locally held in circuitry which applies the translation functions. A remapping of first to second translation function usage is performed by determining a new transitional physical address by applying the second translation function to the transitional logical address; determining a new transitional logical address by applying an inverse of the first translation function to the new transitional physical address; retrieving a new transitional value using the new transitional physical address; storing the old transitional value to the memory location indicated by the new transitional physical address; and locally storing the new transitional value. This remapping can be interleaved with normal memory accesses.
Aspects of the present disclosure relate to apparatus comprising prediction circuitry comprising a plurality of prediction units, said plurality comprising a plurality of types of prediction unit. Each prediction unit is configured to perform a corresponding type of prediction in respect of operations that are to be executed by the apparatus. Shared prediction resource circuitry comprises shared prediction resources configurable to perform said types of prediction. Resource allocation circuitry is configured to determine an allocation of said shared prediction resources to one or more of said plurality of prediction units, and allocate the shared prediction resources according to the determination.
An apparatus has processing circuitry with one or more execution units to perform operations in response to instructions. The apparatus also has registers to store data accessed by the processing circuitry and forwarding circuitry to forward results of the operations from the execution units to be written back to the registers and to the execution units for use as operands of further operations. The apparatus also has write-back reschedule circuitry which for each operation causes an execution unit performing the operation to stall the operation prior to a write-back stage of the execution unit and determine, based on monitoring subsequent operations whether to forward the result of the operation to be written back to a register or to forward the result to an execution unit. The write-back reschedule circuitry also controls the forwarding circuitry to forward the result according to the determination.
There is provided an apparatus, method and medium. The apparatus comprises a store buffer to store a plurality of store requests, where each of the plurality of store requests identifies a storage address and a data item to be transferred to storage beginning at the storage address, where the data item comprises a predetermined number of bytes. The apparatus is responsive to a memory access instruction indicating a store operation specifying storage of N data items, to determine an address allocation order of N consecutive store requests based on a copy direction hint indicative of whether the memory access instruction is one of a sequence of memory access instructions each identifying one of a sequence of sequentially decreasing addresses, and to allocate the N consecutive store requests to the store buffer in the address allocation order.
An apparatus comprises counter tree circuitry configured to store, in a first node of a counter tree, a representation of a parent counter value and in a second node of the counter tree, wherein the second node is a child node of the first node, an encrypted representation of two or more counter values. The encryption operation for forming the encrypted representation of the two or more counter values takes as an input the parent counter value. The apparatus also comprises integrity checking circuitry to check the integrity of an item of data retrieved from memory based on a comparison between a stored authentication code and a generated authentication code generated based on the item of data and a decrypted counter value determined from an encrypted representation of a counter value retrieved from the second node, decrypted using a parent counter value retrieved from the first node.
G06F 21/74 - Protection de composants spécifiques internes ou périphériques, où la protection d'un composant mène à la protection de tout le calculateur pour assurer la sécurité du calcul ou du traitement de l’information opérant en mode dual ou compartimenté, c. à d. avec au moins un mode sécurisé
An apparatus comprises counter integrity tree circuitry to maintain a counter integrity tree having a plurality of nodes. The counter integrity tree circuitry is configured to store, in a first node of the counter integrity tree, an encrypted representation of two or more non-repeating counters and in a second, parent, node, an indication of a function value equal to a non-repeating function of the two or more non-repeating counters of the first node. The apparatus comprises integrity checking circuitry configured to check the integrity of the first node using the function value retrieved from the second node.
H04L 9/32 - Dispositions pour les communications secrètes ou protégées; Protocoles réseaux de sécurité comprenant des moyens pour vérifier l'identité ou l'autorisation d'un utilisateur du système
A method of operating a cache system is disclosed. When it is desired to evict a cache entry from the cache, a cache entry to evict from the cache is selected using an age of any compression block that the cache is caching data for, and the selected cache entry is evicted from the cache.
G06F 12/0891 - Adressage d’un niveau de mémoire dans lequel l’accès aux données ou aux blocs de données désirés nécessite des moyens d’adressage associatif, p.ex. mémoires cache utilisant des moyens d’effacement, d’invalidation ou de réinitialisation
Disclose herein is a method of operating a graphics processor when performing ray tracing. During a traversal of the nodes of an acceleration data structure, when a parent node that encompasses multiple child node volumes is encountered, a group of rays is tested against the child node volumes to determine which child nodes may need to be visited next. Rather than simply visiting the nodes based on the order in which they are found to be interested, the node traversal order is instead determined based on the group of rays.
For a predetermined class of load/store operations, load/store processing circuitry buffers store data of predetermined-class store operations in a predetermined-class store buffer, and controls store-to-load forwarding of store data from that buffer to predetermined- class load operations. A predetermined-class-load/store synchronization instruction controls the load/store processing circuitry to enforce that, for a hazarding younger non -predetermined-class load/store operation occurring after the predetermined-class-load/store synchronization instruction in program order and a hazarding older predetermined-class store operation occurring before the predetermined-class-load/store synchronization instruction in program order, for which address ranges overlap, the hazarding younger non-predetermined-class load/store operation observes a result of the hazarding older predetermined-class store operation. In absence of any intervening predetermined-class-load/store synchronization instruction between a given older predetermined-class store operation and a given younger non-predetermined-class load/store operation with overlapping address range, the given younger non-predetermined-class load/store operation is permitted to fail to observe a result of the given older predetermined-class store operation.
According to one implementation of the present disclosure, a circuit structure is configured to store charge in a charge-based storage element, where the charge-based storage element is disposed at least partially in a shallow-trench-isolation (STI) region of the circuit. According to one implementation of the present disclosure, a method includes: providing a circuit structure disposed on a substrate and a shallow-trench-isolation (STI) region of a circuit; forming an opening of the substrate and the STI region by removing a portion of the substrate and STI region; placing a first liner material in the opening and on remaining portions of the substrate and the STI region; and depositing a first metal layer in the opening on the first liner material.
An apparatus and method are described for providing a trusted execution environment. The apparatus comprises processing circuitry to execute program code, and interrupt controller circuitry, responsive to receipt of one or more interrupt requests, to select a given interrupt request from amongst the one or more interrupt requests, and to issue an interrupt signal to the processing circuitry identifying a given interrupt service routine providing program code to be executed by the processing circuitry to service the given interrupt request. The interrupt controller circuitry is responsive to the given interrupt request being a trusted execution environment (TEE) interrupt request, to issue the interrupt signal to identify as the given interrupt service routine a TEE interrupt service routine, and to inhibit issuance of any further interrupt signal until the TEE interrupt service routine has been executed by the processing circuitry. The interrupt controller circuitry comprises code protection circuitry to inhibit unauthorised modification of the TEE interrupt service routine, and data protection circuitry to inhibit unauthorised access to confidential data processed by the TEE interrupt service routine.
G06F 21/53 - Contrôle des usagers, programmes ou dispositifs de préservation de l’intégrité des plates-formes, p.ex. des processeurs, des micrologiciels ou des systèmes d’exploitation au stade de l’exécution du programme, p.ex. intégrité de la pile, débordement de tampon ou prévention d'effacement involontaire de données par exécution dans un environnement restreint, p.ex. "boîte à sable" ou machine virtuelle sécurisée
28.
SYSTEM, DEVICES AND/OR PROCESSES FOR ADAPTIVE IMAGE RESOLUTION SCALING
Example methods, apparatuses, and/or articles of manufacture are disclosed that may implement, in whole or in part, techniques to process portions of an image frame according to a level of diminished signal information. Portions of an image frame experiencing diminished signal information may be sampled a lower rate/more sparsely to reduce impacts to downstream image processing resources.
H04N 19/59 - Procédés ou dispositions pour le codage, le décodage, la compression ou la décompression de signaux vidéo numériques utilisant le codage prédictif mettant en œuvre un sous-échantillonnage spatial ou une interpolation spatiale, p.ex. modification de la taille de l’image ou de la résolution
G06V 10/44 - Extraction de caractéristiques locales par analyse des parties du motif, p.ex. par détection d’arêtes, de contours, de boucles, d’angles, de barres ou d’intersections; Analyse de connectivité, p.ex. de composantes connectées
H04N 19/176 - Procédés ou dispositions pour le codage, le décodage, la compression ou la décompression de signaux vidéo numériques utilisant le codage adaptatif caractérisés par l’unité de codage, c. à d. la partie structurelle ou sémantique du signal vidéo étant l’objet ou le sujet du codage adaptatif l’unité étant une zone de l'image, p.ex. un objet la zone étant un bloc, p.ex. un macrobloc
A context-information-dependent instruction causes a context-information-dependent operation to be performed based on specified context information indicative of a specified execution context. A context information translation cache 10 stores context information translation entries each specifying untranslated context information and translated context information. Lookup circuitry 14 performs a lookup of the context information translation cache based on the specified context information, to identify whether the context information translation cache includes a matching context information translation entry which is valid and which specifies untranslated context information corresponding to the specified context information. When the matching context information translation entry is identified, the context-information-dependent operation is performed based on the translated context information specified by the matching context information translation entry.
G06F 12/0811 - Systèmes de mémoire cache multi-utilisateurs, multiprocesseurs ou multitraitement avec hiérarchies de mémoires cache multi-niveaux
G06F 12/0837 - Protocoles de cohérence de mémoire cache avec commande par logiciel, p.ex. données ne pouvant pas être mises en mémoire cache
G06F 12/0875 - Adressage d’un niveau de mémoire dans lequel l’accès aux données ou aux blocs de données désirés nécessite des moyens d’adressage associatif, p.ex. mémoires cache avec mémoire cache dédiée, p.ex. instruction ou pile
30.
METHOD, APPARATUS AND PROGRAM FOR ADJUSTING AN EXPOSURE LEVEL
A method for adjusting an exposure level of an image sensor. The method comprises receiving intensity values of a current image captured by the image sensor, the image sensor having an initial exposure level when the current image is captured. The method comprises determining an intensity status based on comparing at least one characteristic of at least the current intensity values to one or more criteria. The method comprises selecting an exposure convergence mode, from a plurality of exposure convergence modes, based on the intensity status. The method comprises calculating, based on the current intensity values and the exposure convergence mode, a new exposure level for use by the image sensor in capturing a subsequent image.
A data processing apparatus includes receive circuitry that receives an indication of a trigger block of instructions. Branch prediction circuitry provides, in response to the trigger block of instructions, branch predictions in respect of a branch within: a subsequent block of instructions subsequent to the trigger block of instructions in execution order, when in a 1-taken mode of operation and a later block of instructions subsequent to the subsequent block of instructions in execution order, when in a 2-taken mode of operation
Circuitry comprises memory address translation circuitry to access memory circuitry storing translation information defining memory address translations from input memory addresses to respective output memory addresses; in which the translation information stored by the memory circuitry comprises a hierarchy of page table levels from a highest page table level to a lowest page table level, each page table level having one or more level tables each comprising two or more entries, in which an entry of a level table at a page table level other than a last page table level of the hierarchy points to a level table at a next lower page table level in the hierarchy; the memory address translation circuitry being configured to select an entry of a level table at each page table level according to a selection value, the selection value being dependent upon a portion, applicable to that page table level, of a given input memory address; in which the memory circuitry is configured to store entries as groups of entries, a group of entries being accessible by a single memory retrieval operation; and in which, for at least a subset of the page table levels, a group of entries stored by the memory circuitry comprises a set of entries from two or more respective level tables.
A cache is provided having a plurality of entries for storing data. In response to a given access request, lookup circuitry performs a lookup operation in the cache to determine whether one of the entries in the cache is allocated to store data associated with the memory address indicated by the given access request, with a hit indication or a miss indication being generated dependent on the outcome of that lookup operation. During a single lookup period, the lookup circuitry is configured to perform lookup operations in parallel for up to N access requests. In addition, allocation circuitry is provided that is able to determine, during the single lookup period, at least N candidate entries for allocation from amongst the plurality of entries, and to cause one of the candidate entries to be allocated for each of the up to N access requests for which the lookup circuitry generates a miss indication.
Processing circuitry (16) and an instruction decoder (9) supports a load chunk instruction and a store chunk instruction which can be useful for implementing memory copy functions and other library functions for manipulating or comparing blocks of memory. Number of bytes to load or store in response to these instructions is determined based on an implementation specific condition. As well as loading or storing bytes of data, the load chunk instruction and (10) store chunk instruction also designated a load/store length value as data corresponding to an architecturally visible register, which provides an indication of a number of bytes loaded or stored.
A method for processing an image captured by an image sensor. The method comprises receiving first pre-tonemapped intensity values of a first image captured by the image sensor. The method comprises applying a first tonemapping function having a first tonemapping strength to the first pre-tonemapped intensity values to generate first tonemapped intensity values. The method comprises receiving second pre-tonemapped intensity values of a second image captured by the image sensor, the second image captured subsequent to the first image. The method comprises, based on a difference between the first tonemapped intensity values and a target for the first tonemapped intensity values, determining a second tonemapping function having a second tonemapping strength. The method comprises applying the second tonemapping function to the second pre-tonemapped intensity values to generate second tonemapped intensity values.
H04N 23/741 - Circuits de compensation de la variation de luminosité dans la scène en augmentant la plage dynamique de l'image par rapport à la plage dynamique des capteurs d'image électroniques
H04N 23/71 - Circuits d'évaluation de la variation de luminosité
H04N 23/76 - Circuits de compensation de la variation de luminosité dans la scène en agissant sur le signal d'image
36.
METHOD, APPARATUS AND PROGRAM FOR PROCESSING AN IMAGE
A method for processing an image. The method comprises receiving intensity values for each of a plurality of images captured by an image sensor during a detection window. The method comprises identifying, using the intensity values, one or more transitions occurring during the detection window, the one or more transitions each comprising a transition between an image in which a maximum clipping criterion is not satisfied and an image in which the maximum clipping criterion is satisfied. The method comprises, based on the identified transitions, at least one of: adjusting an exposure level of the image sensor; and determining a tonemapping function having a tonemapping strength, and applying the tonemapping function to the intensity values of a current image to generate tonemapped intensity values, wherein the image sensor has a current exposure level when the current image is captured.
Data processing systems, methods, and storage medium for implementing convolutional processes are provided. The data processing system includes a convolution engine and a set of weight decoders including a first weight decoder and a second weight decoder that implement a first decompression function and a second decompression function respectively. A weight decoder selection module for selecting a weight decoder from the set of weight decoders is provided. The data processing system, receives a compressed set of weight values, selects a weight decoder using the weight decoder selection module, and processes the compressed set of weight values using the selected weight decoder to obtain an uncompressed set of weight values. The uncompressed set of weight values are provided to the convolution engine. A corresponding data processing system, method, and storage medium for generating the compressed set of weight values is also provided.
There is provided an apparatus, method, and computer-readable medium. The apparatus comprises interconnect circuitry to couple a device to one or more processing elements and to one or more storage structures. The apparatus also comprises stashing circuitry configured to receive stashing transactions from the device, each stashing transaction comprising payload data and control data. The stashing circuitry is responsive to a given stashing transaction whose control data identifies a plurality of portions of the payload data, to perform a plurality of independent stashing decision operations, each of the plurality of independent stashing decision operations corresponding to a respective portion of the plurality of portions of payload data and comprising determining, with reference to the control data, whether to direct the respective portion to one of the one or more storage structures or whether to forward the respective portion to memory.
A behavioral sensor for creating consumable events can include: a feature extractor coupled to receive an event stream of events performed by a circuit, wherein the feature extractor identifies features of a particular event of the event stream and associates the particular event with a time; and a classifier coupled to receive the features of the particular event from the feature extractor, wherein the classifier classifies the particular event into a classified event associated with the time using predefined categories based on the received features of the particular event; whereby the classified event and subsequent classified events extracted from the event stream within a time frame are appended in a time series forming the consumable events.
An apparatus comprising a cache comprising a plurality of cache entries, cache access circuitry responsive to a cache access request to perform, based on a target memory address associated with the cache access request, a cache lookup operation, tracking circuitry to track pending requests to modify cache entries of the cache, and prediction circuitry responsive to the cache access request to make a prediction of whether the pending requests tracked by the tracking circuitry include a pending request to modify a cache entry associated with the target memory address, wherein the cache access circuitry is responsive to the cache access request to determine, based on the prediction, whether to perform an additional lookup of the tracking circuitry. A method and a non-transitory computer-readable medium to store computer-readable code for fabrication of the apparatus are also provided.
G06F 12/0864 - Adressage d’un niveau de mémoire dans lequel l’accès aux données ou aux blocs de données désirés nécessite des moyens d’adressage associatif, p.ex. mémoires cache utilisant des moyens pseudo-associatifs, p.ex. associatifs d’ensemble ou de hachage
41.
SYSTEM, METHOD AND/OR APPARATUS FOR CONTROLLING A POWER SIGNAL
Briefly, embodiments, such as methods, systems and/or circuits for controlling a power signal to be supplied to a processing device. In one aspect, a magnitude of a power supplied to a processing device may be changed based, at least in part on an estimated and/or predicted load.
An on-chip memory is provided. The memory includes wordline sections, input/output (I/O) circuitry, and control circuitry. Each wordline section includes a number of wordlines, and each wordline section is coupled to a different wordline control circuitry. The control circuitry is configured to, in response to receiving an access request including an address, decode the address including determine, based on the address, an associated wordline, and determine, based on the associated wordline, an associated wordline section. The control circuitry is further configured to apply power to wordline control circuitry coupled to the associated wordline section, and access the address.
G11C 8/08 - Circuits de commande de lignes de mots, p.ex. circuits d'attaque, de puissance, de tirage vers le haut, d'abaissement, circuits de précharge, pour lignes de mots
G11C 7/10 - Dispositions d'interface d'entrée/sortie [E/S, I/O] de données, p.ex. circuits de commande E/S de données, mémoires tampon de données E/S
G11C 8/18 - Circuits de synchronisation ou d'horloge; Génération ou gestion de signaux de commande d'adresse, p.ex. pour des signaux d'échantillonnage d'adresse de ligne [RAS] ou d'échantillonnage d'adresse de colonne [CAS]
A behavioral sensor for creating consumable events can include: a feature extractor coupled to receive an event stream of events performed by a circuit, wherein the feature extractor identifies features of a particular event of the event stream and associates the particular event with a time; and a classifier coupled to receive the features of the particular event from the feature extractor, wherein the classifier classifies the particular event into a classified event associated with the time using predefined categories based on the received features of the particular event; whereby the classified event and subsequent classified events extracted from the event stream within a time frame are appended in a time series forming the consumable events.
A burst read with flexible burst length for on-chip memory, such as, for example, system cache memory, hierarchical cache memory, system memory, etc. is provided. Advantageously, successive burst reads are performed with less signal toggling and fewer bitline swings.
G11C 11/4096 - Circuits de commande ou de gestion d'entrée/sortie [E/S, I/O] de données, p.ex. circuits pour la lecture ou l'écriture, circuits d'attaque d'entrée/sortie ou commutateurs de lignes de bits
G11C 11/4094 - Circuits de commande ou de gestion de lignes de bits
G11C 11/4091 - Amplificateurs de lecture ou de lecture/rafraîchissement, ou circuits de lecture associés, p.ex. pour la précharge, la compensation ou l'isolation des lignes de bits couplées
Dynamic power management for an on-chip memory, such as a system cache memory as well as other memories, is provided. The memory includes wordline sections, input/output (I/O) circuitry, and control circuitry. Each wordline section includes a number of wordlines, and each wordline section is coupled to a different wordline control circuitry. The control circuitry is configured to, in response to receiving an access request including an address, decode the address including determine, based on the address, an associated wordline, and determine, based on the associated wordline, an associated wordline section. The control circuitry is further configured to apply power to wordline control circuitry coupled to the associated wordline section.
G11C 8/08 - Circuits de commande de lignes de mots, p.ex. circuits d'attaque, de puissance, de tirage vers le haut, d'abaissement, circuits de précharge, pour lignes de mots
Circuitry including cache storage and control circuitry is provided. The cache storage includes an array of random access memory storage elements, and is configured to store data in multiple cache sectors, each cache sector including a number of cache storage data units. The control circuitry is configured to control access to the cache storage including, for example, accessing the cache storage data units in the cache sectors. After accessing a cache storage data unit in a cache sector, the energy requirement and/or latency for the next access to a cache storage data unit in the same sector is lower than the energy requirement and/or latency for the next access to a cache storage data unit in a different same sector.
G06F 12/0802 - Adressage d’un niveau de mémoire dans lequel l’accès aux données ou aux blocs de données désirés nécessite des moyens d’adressage associatif, p.ex. mémoires cache
47.
MECHANISM FOR NEURAL NETWORK PROCESSING UNIT SKIPPING
A system and computer-implemented method to train and use a neural network is disclosed. For each group of elements of a feature map in a layer in the neural network, a record is accessed to determine if at least one element of the group is active. When at least one element of the group is active, a gradient is determined for each active element of the group, copied to a 5 group element position indicated by the entry for the group in record, and the group is sent to a dot product unit to update weights in the layer based on the group. When no element of the group is active, the dot product unit is signaled to prevent update of weights based on the group. The record is set during the forward path of the feature map through the network.
G06N 3/063 - Réalisation physique, c. à d. mise en œuvre matérielle de réseaux neuronaux, de neurones ou de parties de neurone utilisant des moyens électroniques
Example methods, apparatuses, and/or articles of manufacture are disclosed that may be implemented, in whole or in part, using one or more computing devices to determine options for decisions in connection with design features of a computing device. In a particular implementation, design options for two or more design decisions of neural network processing device may be identified based, at least in part, on combination of a definition of available computing resources and one or more predefined performance constraints.
Systems and methods for processing data for a neural network are described. The system comprises non-transitory memory configured to receive data bits defining a kernel of weights, the data bits being suitable for processing input data; and a data processing unit, configured to: receive bits defining a kernel of weights for the neural network, the kernel of weights comprising one or more non-zero value weights and one or more zero-valued weights; generate a set of mask bits, a position of each bit in the set of mask bits corresponds to a position within the kernel of weights and the value of each bit indicates whether a weight in the corresponding position is a zero-valued weight or a non-zero value weight; and transmit the non-zero value weights and the set of mask bits for storage, the non-zero value weights and the set of mask bits represent the kernel of weights.
H03M 7/30 - Compression; Expansion; Elimination de données inutiles, p.ex. réduction de redondance
G06F 7/544 - Méthodes ou dispositions pour effectuer des calculs en utilisant exclusivement une représentation numérique codée, p.ex. en utilisant une représentation binaire, ternaire, décimale utilisant des dispositifs non spécifiés pour l'évaluation de fonctions par calcul
50.
APPARATUS, METHOD OF OPERATING AN APPARATUS AND A COMPUTER PROGRAM
There is provided an apparatus, a method of operating the apparatus and a computer program for controlling a host data processing apparatus to provide an instruction execution environment equivalent to the apparatus. The apparatus comprises processing circuitry configured to execute a sequence of program instructions to process data items. The processing circuitry is configured to generate a signature indicative of executed instructions in the sequence of program instructions and indicative of the data items. The apparatus is also provided with validation circuitry configured to implement a validation procedure. The validation procedure comprises the steps of evaluating the signature against a predefined policy to verify that the processing circuitry has processed the data items using the sequence of program instructions and, in response to a match between the signature and the predefined policy, generating confirmation information to indicate the match.
G06F 21/52 - Contrôle des usagers, programmes ou dispositifs de préservation de l’intégrité des plates-formes, p.ex. des processeurs, des micrologiciels ou des systèmes d’exploitation au stade de l’exécution du programme, p.ex. intégrité de la pile, débordement de tampon ou prévention d'effacement involontaire de données
G06F 11/28 - Détection d'erreurs; Correction d'erreurs; Contrôle de fonctionnement en vérifiant que l'ordre du traitement est correct
An apparatus comprises an instruction decoder to decode instructions; processing circuitry to perform data processing in response to decoding of the instructions by the instruction decoder; and at least one control register to specify instruction-function-selecting information. In response to a no-operation-compatible instruction, the instruction decoder is configured to control the processing circuitry to: treat the no-operation-compatible instruction as a no-operation instruction, when the instruction-function-selecting information specified by the at least one control register is in a first state; perform both a first operation and a second operation, when the instruction-function-selecting information specified by the at least one control register is in a second state; and perform the first operation but not the second operation, when the instruction-function-selecting information specified by the at least one control register is in a third state.
G06F 9/30 - Dispositions pour exécuter des instructions machines, p.ex. décodage d'instructions
G06F 21/00 - Dispositions de sécurité pour protéger les calculateurs, leurs composants, les programmes ou les données contre une activité non autorisée
52.
TECHNIQUE FOR TRACKING MODIFICATION OF CONTENT OF REGIONS OF MEMORY
Address translation circuitry (20) converts virtual addresses into physical addresses with reference to intermediate level and final level page tables. Final level descriptors within final level page tables identify address translation data for an associated region of memory. Intermediate level descriptors within intermediate level page tables identify intermediate address translation data used to identify an associated page table at a next level of the page tables. Page table update circuitry (35) maintains state information within each final and intermediate level descriptor, and updates the state information from a clean state to a dirty state: in the final level descriptors to indicate that a modification of content of the associated memory region is permitted; in the intermediate level descriptors to indicate occurrence of an update from the clean state to the dirty state within the state information of any final level descriptors that are accessed via that intermediate level descriptor.
An apparatus and method of converting data into an Enhanced Block Floating Point (EBFP) format with a shared exponent is provided. The EBFP format enables data within a wide range of values to be stored using a reduced number of bits compared with conventional floating-point or fixed-point formats. The data to be converted may be in any other format, such as fixed-point, floating-point, block floating-point or EBFP.
G06F 5/01 - Procédés ou dispositions pour la conversion de données, sans modification de l'ordre ou du contenu des données maniées pour le décalage, p.ex. la justification, le changement d'échelle, la normalisation
54.
SYSTEM, DEVICES AND/OR PROCESSES FOR IMAGE ANTI-ALIASING
Example methods, apparatuses, and/or articles of manufacture are disclosed that may be implemented, in whole or in part, techniques to apply an image anti-aliasing operation to an image frame.
Various implementations described herein are related to a device having bitline drivers coupled to passgates of bitcells via bitlines and buried metal lines formed within a substrate including a buried enable signal line and a buried ground line coupled to ground connections of the bitline drivers. The buried enable signal line transfers a negative bias to a selected bitline of the bitlines via the buried ground line that is coupled to the ground connections of the bitline drivers so as to increase gate-source bias of the passgates of the selected bitcell to thereby enhance write capability of the selected bitcell.
G11C 11/412 - Mémoires numériques caractérisées par l'utilisation d'éléments d'emmagasinage électriques ou magnétiques particuliers; Eléments d'emmagasinage correspondants utilisant des éléments électriques utilisant des dispositifs à semi-conducteurs utilisant des transistors formant des cellules avec réaction positive, c. à d. des cellules ne nécessitant pas de rafraîchissement ou de régénération de la charge, p.ex. multivibrateur bistable, déclencheur de Schmitt utilisant uniquement des transistors à effet de champ
H01L 27/11 - Structures de mémoires statiques à accès aléatoire
An apparatus supports decoding and execution of a bulk memory instruction specifying a block size parameter. The apparatus comprises control circuitry to determine whether the block size corresponding to the block size parameter exceeds a predetermined threshold, and performs a micro-architectural control action to influence the handling of at least one bulk memory operation by memory operation processing circuitry. The micro-architectural control action varies depending on whether the block size exceeds the predetermined threshold, and further depending on the states of other components and operations within or coupled with the apparatus. The micro-architectural control action could include an alignment correction action, cache allocation control action, or processing circuitry selection action.
G06F 3/06 - Entrée numérique à partir de, ou sortie numérique vers des supports d'enregistrement
G06F 12/0802 - Adressage d’un niveau de mémoire dans lequel l’accès aux données ou aux blocs de données désirés nécessite des moyens d’adressage associatif, p.ex. mémoires cache
A data processing apparatus is configured to determine a product of two operands stored in an Extended Block Floating-Point format. The operands are decoded, based on their tags and payloads, to generate exponent differences and at least the fractional parts of significands. The significands are multiplied to generate an output significand and shared exponents and exponent differences of the operands are combined to generate an output exponent. Signs of the operands may also be combined to provide an output sign. The apparatus may be combined with an accumulator having one or more lanes to provide an apparatus for determining dot products.
In a data processor, an input datum, having a sign, a tag and a payload, is decoded by first determining a format of the payload based on the tag. For a first format, an exponent difference and an output fraction are decoded from the payload. For a second format, an exponent difference is decoded from the payload and the output fraction may be assumed to be zero. The exponent difference is subtracted from a shared exponent to produce the output exponent. The decoded output may be stored in a standard format for floating-point numbers.
G06F 7/483 - Calculs avec des nombres représentés par une combinaison non linéaire de nombres codés, p.ex. nombres rationnels, système de numération logarithmique ou nombres à virgule flottante
When performing tile-based rendering in a graphics processing system, lists indicative of fragments to be processed are maintained for respective sub-regions of tiles to be rendered, with each list entry including, inter alia, at least an indication of the coverage within the tile sub-region of the group of fragments that the list entry represents, and an indication of whether the group of fragments that the list entry represents is eligible to undergo particular processing operations. The coverage information and eligibility information for the list entries is then used to control the processing of fragments for sub-regions of a tile, in such a way as to ensure that processing order dependencies are enforced and met.
When performing tile-based rendering in a graphics processing system, lists indicative of fragments to be processed are maintained for respective sub-regions of tiles to be rendered, with each list entry representing a group of one or more fragments and including an indication of the coverage within the tile sub-region of the group of fragments that the list entry represents. The coverage information for the list entries is then used to set for entries in the list indicative of fragments to be processed for a sub-region, information indicating whether one or more processing operations are eligible to be performed for fragments that entries in the list represent.
A method and processor comprising a command processing unit to receive, from a host processor, a sequence of commands to be executed; and generate based on the sequence of commands a plurality of tasks. The processor also comprises a plurality of compute units each having a first processing module for executing tasks of a first task type, a second processing module for executing tasks of a second task type, different from the first task type, and a local cache shared by at least the first processing module and the second processing module. The command processing unit issues the plurality of tasks to at least one of the plurality of compute units, and wherein at least one of the plurality of compute units is to process at least one of the plurality of tasks.
In a data processing system comprising a first cache operable to store data for use when performing a data processing operation, and a second cache operable to store data required for fetching data into the first cache from memory, when it is determined that there is no entry for data for a data processing operation in the first cache, an entry in the first cache is allocated for the required data, and information that indicates an entry in the second cache for data required for fetching the required data is stored in the tag portion of the allocated entry. Then, once a request has been sent to a memory system for the required data, the information in the tag portion for the allocated entry in the first cache that indicates an entry in the second cache is replaced with information indicative of an address for the data required for fetching the required data.
G06F 12/0895 - Mémoires cache caractérisées par leur organisation ou leur structure de parties de mémoires cache, p.ex. répertoire ou matrice d’étiquettes
G06F 12/0842 - Systèmes de mémoire cache multi-utilisateurs, multiprocesseurs ou multitraitement pour multitraitement ou multitâche
There is provided a processor configured to transfer data to a plurality of processor circuits. The apparatus includes broadcast circuitry that broadcasts first machine learning data to at least a subset of the plurality of processor circuits.
In a data processor, an input value having a sign, an exponent and a significand is encoded by determining an exponent difference between a base exponent and the exponent. When the exponent difference is not less than a first threshold, only the exponent difference, or a designated value, is encoded to a payload of the output value and one or more tag bits of the output value are set to a first value. When the exponent difference is less than the first threshold, the significand and exponent difference are encoded to the payload of an output value and, optionally, the one or more tag bits of the output value. A sign bit in the output value is set corresponding to the sign of the input value, and the output value is stored.
A method to operate a head-mountable processing system, is provided. The head-mountable processing system comprising generating one or more control signals based upon a visual motion of a sequence of images for display by the head-mountable processing system, and transmitting the generated one or more control signals to a plurality of transducers to stimulate a wearer's vestibular system.
A61H 23/02 - Massage par percussion ou vibration, p.ex. en utilisant une vibration ultrasonique; Massage par succion-vibration; Massage avec des membranes mobiles à entraînement électrique ou magnétique
G06F 3/01 - Dispositions d'entrée ou dispositions d'entrée et de sortie combinées pour l'interaction entre l'utilisateur et le calculateur
H04R 3/12 - Circuits pour transducteurs pour distribuer des signaux à plusieurs haut-parleurs
H04R 1/40 - Dispositions pour obtenir la fréquence désirée ou les caractéristiques directionnelles pour obtenir la caractéristique directionnelle désirée uniquement en combinant plusieurs transducteurs identiques
66.
APPARATUS AND METHOD OF OPTIMISING DIVERGENT PROCESSING IN THREAD GROUPS
A data processor is disclosed in which groups of execution threads comprising a thread group can execute a set of instructions in lockstep, and in which a plurality of execution lanes can perform processing operations for the execution threads. In response to an execution thread issuing circuit determining whether a portion of active threads of a first thread group and a portion of active threads of a second thread group use different execution lanes of the plurality of execution lanes, the execution thread issuing circuit issuing both the portion of active threads of a first thread group and a portion of active threads of a second thread group for execution. This can have the effect of increasing data processor efficiency, thereby increasing throughput and reducing latency.
Disclosed herein is a graphics processor that comprises a programmable execution unit operable to execute programs to perform graphics processing operations. The graphics processor further comprises a dedicated machine learning processing circuit operable to perform processing operations for machine learning processing tasks. The machine learning processing circuit is in communication with the programmable execution unit internally to the graphics processor. In this way, the graphics processor can be configured such that machine learning processing tasks can be performed by the programmable execution unit, the machine learning processing circuit, or a combination of both, with the different units being able to message each other accordingly to control the processing.
There is provided an apparatus configured to operate as a shader core, the shader core configured to perform a complex rendering process comprising a rendering process and a machine learning process, the shader core comprising: one or more tile buffers configured to store data locally to the shader core, wherein during the rendering process, the one or more tile buffers are configured to store rendered fragment data relating to a tile; and during the machine learning process, the one or more tile buffers are configured to store an input feature map, kernel weights or an output feature map relating to the machine learning process.
Aspects of the present disclosure relate to an apparatus comprising a plurality of processing elements having a spatial layout, and control circuitry to assign workloads to said plurality of processing elements. The control circuitry is configured to, based on a timing parameter, determine one or more active processing elements to deactivate; determine, based on the spatial layout, one or more inactive processing elements to activate; and deactivate said one or more active processing elements and activate said one or more inactive processing elements.
Briefly, embodiments, such as methods and/or systems for operations and/or procedures to test magnetic memory devices. In a particular implementation, a bit error rate of a magnetic memory device may be estimated based, at least in part, on an observed bit error rate in the presence of an externally applied magnetic field.
There is provided a neural processing unit for calculating an attention matrix during machine learning inference. The neural processing unit is configured to calculate: a first score matrix based on differences between a query matrix and a key matrix; a second score matrix based on differences between the key matrix and a learned key matrix; a similarity matrix based on a combination of the first score matrix and second score matrix; and an attention matrix comprising applying a normalisation function to the similarity matrix. Also provided is an apparatus comprising at least one said neural processing unit and at least one memory, the memory configured to pass, on demand, a learned key matrix to the neural processing unit. Also provided is a computer program product having computer readable program code stored thereon which, when executed by said neural processing unit, causes the unit to perform said calculations.
G06N 3/063 - Réalisation physique, c. à d. mise en œuvre matérielle de réseaux neuronaux, de neurones ou de parties de neurone utilisant des moyens électroniques
Various implementations described herein are directed to a method that tests and repairs memory fabricated on a wafer or a package. The method may generate and store a reuse table based on memory repair results. The method may manufacture the memory after repairing the memory. The method may access and reuse data stored in the reuse table to repair the memory after manufacturing the memory.
A masked-vector-comparison instruction specifies a source vector operand comprising a plurality of source data elements, a mask value, and a comparison target operand. In response to the masked-vector-comparison instruction, an instruction decoder 10 controls processing circuitry 16 to: for each active source data element of the source vector operand, determine whether the active source data element satisfies a comparison condition, based on a masked comparison between one or more compared bits of the active source data element and one or more compared bits of the comparison target operand, the mask value specifying a pattern of compared bits and non-compared bits within the comparison target operand and the active source data element; and generate a result value indicative of which of the source data elements of the source vector operand, if any, is an active source data element satisfying the comparison condition. This instruction is useful for variable length decoding operations.
Example methods, apparatuses, and/or articles of manufacture are disclosed that may be implemented, in whole or in part, techniques to process image signal values sampled from a multi color channel imaging device. In particular, methods and/or techniques disclosed herein are directed to synthesizing a temporally upsampled image frame to be in a temporal sequence of images frames.
G06T 3/40 - Changement d'échelle d'une image entière ou d'une partie d'image
G06T 3/00 - Transformation géométrique de l'image dans le plan de l'image
G06T 1/20 - Architectures de processeurs; Configuration de processeurs p.ex. configuration en pipeline
G06V 10/56 - Extraction de caractéristiques d’images ou de vidéos relative à la couleur
G06V 10/60 - Extraction de caractéristiques d’images ou de vidéos relative aux propriétés luminescentes, p.ex. utilisant un modèle de réflectance ou d’éclairage
G06V 10/80 - Fusion, c. à d. combinaison des données de diverses sources au niveau du capteur, du prétraitement, de l’extraction des caractéristiques ou de la classification
G06V 10/82 - Dispositions pour la reconnaissance ou la compréhension d’images ou de vidéos utilisant la reconnaissance de formes ou l’apprentissage automatique utilisant les réseaux neuronaux
A spiking neural network is described that comprises a plurality of neurons in a first layer connected to at least one neuron in a second layer, each neuron in the first layer being connected to the at least one neuron in the second layer via a respective variable delay path. The at least one neuron in the second layer comprises one or more logic components configured to generate an output signal in dependence upon signals received along the variable delay paths from the plurality of neurons in the first layer. A timing component is configured to determine a timing value in response to receiving the output signal from the one or more logic components, and an accumulate component is configured to accumulate a value based timing values from the timing component. A neuron fires in a case that a value accumulated at the accumulate component reaches a threshold value.
A data processing apparatus is provided. Prefetch circuitry generates a prefetch request for a cache line prior to the cache line being explicitly requested. The cache line is predicted to be required for a store operation in the future. Issuing circuitry issues the prefetch request to a memory hierarchy and filter circuitry filters the prefetch request based on at least one other prefetch request made to the cache line, to control whether the prefetch request is issued by the issuing circuitry.
G06F 12/0862 - Adressage d’un niveau de mémoire dans lequel l’accès aux données ou aux blocs de données désirés nécessite des moyens d’adressage associatif, p.ex. mémoires cache avec pré-lecture
G06F 12/0875 - Adressage d’un niveau de mémoire dans lequel l’accès aux données ou aux blocs de données désirés nécessite des moyens d’adressage associatif, p.ex. mémoires cache avec mémoire cache dédiée, p.ex. instruction ou pile
78.
SYSTEM, DEVICES AND/OR PROCESSES FOR APPLICATION OF KERNEL COEFFICIENTS
Example methods, apparatuses, and/or articles of manufacture are disclosed that may be implemented, in whole or in part, techniques to process image signal intensity values sampled from a multi color channel imaging device. In particular, methods and/or techniques disclosed herein are directed to processing image signal intensity values by application of kernel coefficients to the image signal intensity values.
G06V 10/82 - Dispositions pour la reconnaissance ou la compréhension d’images ou de vidéos utilisant la reconnaissance de formes ou l’apprentissage automatique utilisant les réseaux neuronaux
G06V 10/77 - Dispositions pour la reconnaissance ou la compréhension d’images ou de vidéos utilisant la reconnaissance de formes ou l’apprentissage automatique utilisant l’intégration et la réduction de données, p.ex. analyse en composantes principales [PCA] ou analyse en composantes indépendantes [ ICA] ou cartes auto-organisatrices [SOM]; Séparation aveugle de source
G06V 10/40 - Extraction de caractéristiques d’images ou de vidéos
Briefly, embodiments, such as methods and/or systems for employing external memory devices in the execution of activation function such as activation functions implemented in a neural network. In one aspect, a first activation input tensor may be partitioned as a plurality of tensor segments stored in one or more external memory devices. Individual stored tensor segments may be sequentially loaded to memories local to processing circuitry to apply activation functions associated with the stored tensor segments.
There is provided a data processing apparatus in which decode circuitry receives a memory copy instruction containing an indication of a source area of memory, an indication of a destination area of memory, and an indication of a remaining copy length. In response to receiving the memory copy instruction, the decode circuitry generates at least one active memory copy operation or a null memory copy operation. The active memory copy operation causes one or more execution units to perform a memory copy from part of the source area of memory to part of the destination area of memory and the null memory copy operation leaves the destination area of memory unmodified.
Methods and systems for detecting errors when performing a convolutional operation is provided. Predicted checksum data, corresponding to input checksum data and kernel checksum data, is obtained. The convolutional operation is performed to obtain an output feature map. Output checksum data is generated and the predicted checksum data and the output checksum data are compared, the comparing taking account of partial predicted checksum data configured to correct for a lack of padding when performing the convolution operation, wherein the partial predicted checksum data corresponds to input checksum data for a subset of the values in the input feature map and kernel checksum data for a subset of the values in the kernel.
There is provided a data processing apparatus in which receive circuitry receives a result signal from a lower level cache and a higher level cache in respect of a first instruction block. The lower level cache and the higher level cache are arranged hierarchically and transmit circuitry transmits, to the higher level cache, a query for the result signal. In response to the result signal originating from the higher level cache containing requested data, the transmit circuitry transmits a further query to the higher level cache for a subsequent instruction block at an earlier time than the further query is transmitted to the higher level cache when the result signal containing the requested data originates from the lower level cache.
According to one implementation of the present disclosure, a cache memory includes: a plurality of cache-lines, wherein each row of cache-lines comprises: tag bits of a tag-random access memory (tag-RAM); data bits of a data-random access memory (data-RAM), and a single set of retention bits corresponding to the tag-RAM. According to one implementation of the present disclosure, a method includes: sampling a single set of retention bits of a cache-line of a cache memory, where the cache-line comprises the single set of retention bits, tag-RAM and data-RAM, and where at least the single set of retention bits comprise eDRAM bitcells; and performing a refresh cycle of at least the data-RAM corresponding to the tag-RAM based on the sampled single set of retention bits.
Methods and systems for detecting errors when performing a convolutional operation is provided. Predicted checksum data, corresponding to input checksum data and kernel checksum data, is obtained. The convolutional operation is performed to obtain an output feature map. Output checksum data is generated and the predicted checksum data and the output checksum data are compared, the comparing taking account of partial predicted checksum data configured to correct for a lack of padding when performing the convolution operation, wherein the partial predicted checksum data corresponds to input checksum data for a subset of the values in the input feature map and kernel checksum data for a subset of the values in the kernel.
G06N 3/063 - Réalisation physique, c. à d. mise en œuvre matérielle de réseaux neuronaux, de neurones ou de parties de neurone utilisant des moyens électroniques
85.
METHODS AND HARDWARE FOR INTER-LAYER DATA FORMAT CONVERSION IN NEURAL NETWORKS
The present disclosure relates to a method of inter-layer format conversion for a neural network, the neural network comprising at least two computation layers including a first layer to process first data in a first data format and a second layer to process second data in a second data format, the method comprising: extracting data statistics from data output by the first layer, said data statistics being representative of the data output by the first layer; determining one or more conversion parameters based on the extracted data statistics and the second data format; and generating the second data for the second layer by modifying said data output by the first layer using the one or more conversion parameters.
According to one implementation of the present disclosure, an integrated circuit includes comparator circuitry coupled to peripheral circuitry of a multiport memory and configured to transmit one or more data input signals or one or more write enable signals to respective memory outputs when a memory address collision is detected for one or more respective bitcells of the multi-port memory. In another implementation, a method comprises: detecting a read operation and a write operation to a same memory bitcell of a multiport memory in one clock cycle and in response to the detection, performing the read operation of a data input signal or a write enable signal of the multiport memory.
A data processing system (1) comprises a plurality of processing units (11) and a controller (30) operable to allocate processing units of the plurality of processing units into respective groups of the processing units, wherein each group of processing units comprises a set of one or more of the processing units of the plurality of processing units. The data processing system further comprises an arbiter (31, 32) for each group of processing units for controlling access by virtual machines (33, 34) that require processing operations to the processing units of the group of processing units that the arbiter has been allocated.
G06F 9/455 - Dispositions pour exécuter des programmes spécifiques Émulation; Interprétation; Simulation de logiciel, p.ex. virtualisation ou émulation des moteurs d’exécution d’applications ou de systèmes d’exploitation
G06F 11/07 - Réaction à l'apparition d'un défaut, p.ex. tolérance de certains défauts
88.
SECURITY MEASURES FOR SIGNAL PATHS WITH TREE STRUCTURES
Security measures for signal paths with tree structures can be implemented at design phase using an EDA software program or tool with security feature functionality that, when executed by a computing system, directs the computing system to: display a canvas through which components of a circuit are arranged; and provide a menu of commands, including an option to add components from a library to the canvas and an option to secure a tree. In response to receiving a selection of the option to secure the tree, the system can be directed to add a hardware countermeasure coupled to at least two lines or terminal nodes of a tree structure identified from components on the canvas or in a netlist corresponding to a circuit's design.
G06F 21/75 - Protection de composants spécifiques internes ou périphériques, où la protection d'un composant mène à la protection de tout le calculateur pour assurer la sécurité du calcul ou du traitement de l’information par inhibition de l’analyse de circuit ou du fonctionnement, p.ex. pour empêcher l'ingénierie inverse
G06F 30/398 - Vérification ou optimisation de la conception, p.ex. par vérification des règles de conception [DRC], vérification de correspondance entre géométrie et schéma [LVS] ou par les méthodes à éléments finis [MEF]
Apparatuses and methods for monitoring sensor data are provided. Sensor data is captured from an environment and a locality-sensitive hash is generated from the sensor data. Hashes which have been generated are stored in association with a significance value, which periodically decrease. A similarity metric is generated for the locality-sensitive hash with respect to the hashes stored in the hash storage and when it exceeds a similarity threshold the significance value stored in association with the similar hash is increased. When the significance value stored in association with the similar hash exceeds an alert threshold an alert signal is generated.
G06F 21/74 - Protection de composants spécifiques internes ou périphériques, où la protection d'un composant mène à la protection de tout le calculateur pour assurer la sécurité du calcul ou du traitement de l’information opérant en mode dual ou compartimenté, c. à d. avec au moins un mode sécurisé
G06F 16/50 - Recherche d’informations; Structures de bases de données à cet effet; Structures de systèmes de fichiers à cet effet de données d’images fixes
G06F 16/783 - Recherche de données caractérisée par l’utilisation de métadonnées, p.ex. de métadonnées ne provenant pas du contenu ou de métadonnées générées manuellement utilisant des métadonnées provenant automatiquement du contenu
H04L 9/06 - Dispositions pour les communications secrètes ou protégées; Protocoles réseaux de sécurité l'appareil de chiffrement utilisant des registres à décalage ou des mémoires pour le codage par blocs, p.ex. système DES
H04L 9/32 - Dispositions pour les communications secrètes ou protégées; Protocoles réseaux de sécurité comprenant des moyens pour vérifier l'identité ou l'autorisation d'un utilisateur du système
G06F 21/62 - Protection de l’accès à des données via une plate-forme, p.ex. par clés ou règles de contrôle de l’accès
A method, system and apparatus provide bit-sparse neural network optimization. Rather than quantizing and pruning weight and activation elements at the word level, weight and activation elements are pruned at the bit level, which reduces the density of effective “set” bits in weight and activation data, which, advantageously, reduces the power consumption of the neural network inference process by reducing the degree of bit-level switching during inference.
There is provided a computer-implemented method of defining bounding boxes for a primitive in a tile-based graphics processing pipeline comprising determining a part-way point on the primitive, wherein, for each pair of vertices, a part-way point is part-way between that pair of vertices, and defining a plurality of bounding boxes, wherein each bounding box intersects a part-way point. Also provided is a bounding box generation circuit comprising a part-way point calculation circuit to determine a part-way point on the primitive, wherein, for each pair of vertices, a part-way point is part-way between that pair of vertices, wherein the bounding box generation circuit to define a plurality of bounding boxes based upon the determined part-way point, wherein each bounding box intersects a part-way point. A method of defining bounding boxes for a point primitive is also provided.
An apparatus comprises an non-inclusive cache (14) configured to cache data and coherency control circuitry (16). The coherency control circuitry is configured to look up the non-inclusive cache in response to a coherent access request from a first requestor (4). In response to determining that the coherent access request can be serviced using data stored in a matching entry of the non-inclusive cache, the coherency control circuitry references snoop-filter information associated with the matching entry to determine whether the first requestor can use the data stored in the matching entry without waiting for a response to a snoop of a coherent cache (8).
Aspects of the present disclosure relate to interface circuitry to receive a pointer comprising a plurality of address bits, and pointer processing circuitry. The pointer processing circuitry is configured to extract and encrypt plurality of address bits from the pointer, to produce a plurality of encrypted address bits. The pointer processing circuitry determines, based at least in part on the plurality of address bits, a pointer authentication value. It then combines the pointer authentication value with the plurality of encrypted address bits, to produce a signed encrypted pointer.
G06F 9/30 - Dispositions pour exécuter des instructions machines, p.ex. décodage d'instructions
G06F 12/14 - Protection contre l'utilisation non autorisée de mémoire
G06F 21/44 - Authentification de programme ou de dispositif
G06F 21/52 - Contrôle des usagers, programmes ou dispositifs de préservation de l’intégrité des plates-formes, p.ex. des processeurs, des micrologiciels ou des systèmes d’exploitation au stade de l’exécution du programme, p.ex. intégrité de la pile, débordement de tampon ou prévention d'effacement involontaire de données
G06F 21/54 - Contrôle des usagers, programmes ou dispositifs de préservation de l’intégrité des plates-formes, p.ex. des processeurs, des micrologiciels ou des systèmes d’exploitation au stade de l’exécution du programme, p.ex. intégrité de la pile, débordement de tampon ou prévention d'effacement involontaire de données par ajout de routines ou d’objets de sécurité aux programmes
G06F 21/56 - Détection ou gestion de programmes malveillants, p.ex. dispositions anti-virus
G06F 21/64 - Protection de l’intégrité des données, p.ex. par sommes de contrôle, certificats ou signatures
94.
METHODS AND APPARATUS FOR BRANCH INSTRUCTION SECURITY
Aspects of the present disclosure relate to an apparatus. Instruction receiving circuitry receives, as part of a program flow, a branch instruction, said branch instruction identifying a function. Instruction authentication circuitry determines, based at least in part on the function, an instruction authentication value. The instruction authentication circuitry then combines the instruction authentication value with the branch instruction to produce an authenticatable branch instruction. Branch circuitry authenticates the authenticatable branch instruction based on a function authentication value. Responsive to a successful authentication of the authenticatable branch instruction, the branch circuitry executes a jump in the program flow to said function.
G06F 21/44 - Authentification de programme ou de dispositif
G06F 21/54 - Contrôle des usagers, programmes ou dispositifs de préservation de l’intégrité des plates-formes, p.ex. des processeurs, des micrologiciels ou des systèmes d’exploitation au stade de l’exécution du programme, p.ex. intégrité de la pile, débordement de tampon ou prévention d'effacement involontaire de données par ajout de routines ou d’objets de sécurité aux programmes
G06F 21/64 - Protection de l’intégrité des données, p.ex. par sommes de contrôle, certificats ou signatures
G06F 9/30 - Dispositions pour exécuter des instructions machines, p.ex. décodage d'instructions
G06F 9/34 - Adressage de l'opérande d'instruction ou du résultat ou accès à l'opérande d'instruction ou au résultat
G06F 12/14 - Protection contre l'utilisation non autorisée de mémoire
An apparatus has processing circuitry to execute instructions. The processing circuitry has calculation circuitry which is responsive to one or more instructions requiring a calculation to be performed to compute the result of the calculation and approximation circuitry which is responsive to said one or more instructions to calculate an approximate result of the calculation independently of the calculation circuitry. The processing circuitry also has integrity checking circuitry to perform an integrity check by comparing the result of the calculation performed by the calculation circuitry and the approximate result of the calculation performed by the approximation circuity. The integrity checking circuitry detects an error in the processing circuitry if it is determined that a difference between the result of the calculation and the approximate result of the calculation is greater than a deviation threshold.
G06F 21/54 - Contrôle des usagers, programmes ou dispositifs de préservation de l’intégrité des plates-formes, p.ex. des processeurs, des micrologiciels ou des systèmes d’exploitation au stade de l’exécution du programme, p.ex. intégrité de la pile, débordement de tampon ou prévention d'effacement involontaire de données par ajout de routines ou d’objets de sécurité aux programmes
G06F 21/55 - Détection d’intrusion locale ou mise en œuvre de contre-mesures
G06F 21/74 - Protection de composants spécifiques internes ou périphériques, où la protection d'un composant mène à la protection de tout le calculateur pour assurer la sécurité du calcul ou du traitement de l’information opérant en mode dual ou compartimenté, c. à d. avec au moins un mode sécurisé
G06F 11/14 - Détection ou correction d'erreur dans les données par redondance dans les opérations, p.ex. en utilisant différentes séquences d'opérations aboutissant au même résultat
G06F 11/16 - Détection ou correction d'erreur dans une donnée par redondance dans le matériel
A method of managing network-attachable computing entities comprising: training a machine-learning model to detect a bottleneck process segment in a process flow performed by a network-attachable computing entity; deploying a trained model to monitor a network-attachable computing entity in operation; responsive to detecting an instance of the bottleneck process segment, analyzing to determine a cause of the bottleneck; responsive to determining the cause of the bottleneck, generating an augmented functional unit to address the cause of the bottleneck; and deploying the augmented functional unit to at least one of the network-attached computing entities that has an instance of a process comprising the bottleneck process segment.
A data processing system (1) comprises a plurality of, e.g. graphics, processing units (11), and a management circuit (12) associated with the processing units and operable to configure the processing units of the plurality of processing units into respective groups of the processing units. The management circuit (12) is configured to always operate with a high level of fault protection, but the groups of the processing units can be selectively operated with either a higher level of fault protection or a lower level of fault protection, by selectively subjecting them to fault detection testing (60).
G06F 11/22 - Détection ou localisation du matériel d'ordinateur défectueux en effectuant des tests pendant les opérations d'attente ou pendant les temps morts, p.ex. essais de mise en route
Various implementations described herein are related to a device having memory architecture having multiple bitcell arrays. The device may include column multiplexer circuitry coupled to the memory architecture via multiple bitlines for read access operations. The column multiplexer circuitry may perform read access operations in the multiple bitcell arrays via the bitlines based on a sense amplifier enable signal and a read multiplexer signal. The device may include control circuitry that provides the read multiplexer signal to the column multiplexer circuitry based on a clock signal and the sense amplifier enable signal so that the column multiplexer circuitry is able to perform the read access operations.
Processing circuitry performs a processing operation to generate a two's complement result value representing a positive or negative number in two's complement representation. Normalization-and-rounding circuitry converts the two's complement result value to a normalized-and-rounded floating-point result value represented using sign-magnitude representation. The normalization-and-rounding circuitry comprises incrementing circuitry to perform an increment addition (e.g. a rounding increment or a conversion increment) to generate a fraction of the normalized-and-rounded floating-point result value. For an operation where the increment addition is required to be performed, tininess detection circuitry detects the after-rounding tininess status based on a still-to-be-incremented version of the normalized-and-rounded floating-point result value prior to the increment addition by the increment circuitry.
G06F 7/483 - Calculs avec des nombres représentés par une combinaison non linéaire de nombres codés, p.ex. nombres rationnels, système de numération logarithmique ou nombres à virgule flottante
Various implementations described herein are directed to a device having memory with banks of bitcells with each bank having a bitcell array. The device may have header circuitry that powers-up a selected bank and powers-down unselected banks during a wake-up mode of operation. In some instances, only the selected bank of the memory is powered-up with the header circuitry during the wake-up mode of operation.
G11C 11/412 - Mémoires numériques caractérisées par l'utilisation d'éléments d'emmagasinage électriques ou magnétiques particuliers; Eléments d'emmagasinage correspondants utilisant des éléments électriques utilisant des dispositifs à semi-conducteurs utilisant des transistors formant des cellules avec réaction positive, c. à d. des cellules ne nécessitant pas de rafraîchissement ou de régénération de la charge, p.ex. multivibrateur bistable, déclencheur de Schmitt utilisant uniquement des transistors à effet de champ