DeepMind Technologies Limited

United Kingdom

Results 1-100 of 413 for DeepMind Technologies Limited
Query: Patent
United States - USPTO

Aggregations
Date
New (last 4 weeks) 23
2024 April (MTD) 14
2024 March 13
2024 February 10
2024 January 5
IPC Class
G06N 3/08 - Learning methods 270
G06N 3/04 - Architecture, e.g. interconnection topology 210
G06N 3/045 - Combinations of networks 52
G06K 9/62 - Methods or arrangements for recognition using electronic means 48
G06N 20/00 - Machine learning 31
Status
Pending 188
Registered / In Force 225

1.

LEVERAGING OFFLINE TRAINING DATA AND AGENT COMPETENCY MEASURES TO IMPROVE ONLINE LEARNING

      
Application Number 18492415
Status Pending
Filing Date 2023-10-22
First Publication Date 2024-04-25
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Wen, Zheng
  • Van Roy, Benjamin
  • Jain, Rahul Anant
  • Hao, Botao

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a target action selection policy to control a target agent interacting with an environment. In one aspect, a method comprises: obtaining a set of offline training data, wherein the offline training data characterizes interaction of a baseline agent with an environment as the baseline agent performs actions selected in accordance with a baseline action selection policy; generating a set of online training data that characterizes interaction of the target agent with the environment as the target agent performs actions selected in accordance with the target action selection policy; and training the target action selection policy on both: (i) the offline training data, and (ii) the online training data, wherein the training of the target action selection policy on the offline training data is conditioned on a measure of competency of the baseline agent.
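
As a rough illustration of the weighting idea in this abstract, the sketch below mixes an offline and an online loss, scaling the offline term by a competency score for the baseline agent. Everything here (the toy squared-error loss, the finite-difference gradient, the learning rate) is a hypothetical stand-in, not the patented method.

```python
import numpy as np

def toy_policy_loss(params, batch):
    """Hypothetical squared-error behaviour-cloning loss on (obs, action) pairs."""
    obs, actions = batch
    return np.mean((obs @ params - actions) ** 2)

def mixed_update(params, offline_batch, online_batch, competency, lr=0.1):
    """Gradient step on the online loss plus a competency-weighted offline loss."""
    weight = float(np.clip(competency, 0.0, 1.0))  # trust offline data less for a weak baseline

    def total_loss(p):
        return toy_policy_loss(p, online_batch) + weight * toy_policy_loss(p, offline_batch)

    # Finite-difference gradient keeps the sketch dependency-free.
    grad = np.zeros_like(params)
    eps = 1e-5
    for i in range(params.size):
        bump = np.zeros_like(params)
        bump[i] = eps
        grad[i] = (total_loss(params + bump) - total_loss(params - bump)) / (2 * eps)
    return params - lr * grad

rng = np.random.default_rng(0)
params = rng.normal(size=4)
offline = (rng.normal(size=(32, 4)), rng.normal(size=32))
online = (rng.normal(size=(8, 4)), rng.normal(size=8))
params = mixed_update(params, offline, online, competency=0.7)
```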


2.

GENERATING AUDIO USING NEURAL NETWORKS

      
Application Number 18519986
Status Pending
Filing Date 2023-11-27
First Publication Date 2024-04-25
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Van Den Oord, Aaron Gerard Antonius
  • Dieleman, Sander Etienne Lea
  • Kalchbrenner, Nal Emmerich
  • Simonyan, Karen
  • Vinyals, Oriol

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an output sequence of audio data that comprises a respective audio sample at each of a plurality of time steps. One of the methods includes, for each of the time steps: providing a current sequence of audio data as input to a convolutional subnetwork, wherein the current sequence comprises the respective audio sample at each time step that precedes the time step in the output sequence, and wherein the convolutional subnetwork is configured to process the current sequence of audio data to generate an alternative representation for the time step; and providing the alternative representation for the time step as input to an output layer, wherein the output layer is configured to: process the alternative representation to generate an output that defines a score distribution over a plurality of possible audio samples for the time step.
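
A minimal sketch of the autoregressive loop described above: at each step a (here untrained, randomly initialised) stand-in for the convolutional subnetwork summarises the samples generated so far, and an output layer turns that summary into a score distribution over a small set of quantised audio values to sample from. The network sizes and the 16-level quantisation are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
num_levels = 16          # quantised audio sample values
receptive_field = 8
hidden = 12

# Random stand-ins for the trained convolutional subnetwork and output layer.
conv_weights = rng.normal(size=(receptive_field, hidden)) * 0.1
out_weights = rng.normal(size=(hidden, num_levels)) * 0.1

def alternative_representation(history):
    """Toy 'convolutional subnetwork': a transform of the most recent samples."""
    window = np.zeros(receptive_field)
    recent = history[-receptive_field:]
    window[-len(recent):] = recent
    return np.tanh(window @ conv_weights)

def sample_next(history):
    scores = alternative_representation(history) @ out_weights
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()                       # softmax score distribution over samples
    return rng.choice(num_levels, p=probs)

sequence = [0]
for _ in range(100):                           # generate 100 audio samples, one per time step
    sequence.append(int(sample_next(np.array(sequence, dtype=float))))
```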

IPC Classes

  • G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
  • G06N 3/04 - Architecture, e.g. interconnection topology
  • G06N 3/045 - Combinations of networks
  • G06N 3/048 - Activation functions
  • G10L 13/06 - Elementary speech units used in speech synthesisers; Concatenation rules

3.

DISTRIBUTIONAL REINFORCEMENT LEARNING USING QUANTILE FUNCTION NEURAL NETWORKS

      
Application Number 18542476
Status Pending
Filing Date 2023-12-15
First Publication Date 2024-04-25
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Ostrovski, Georg
  • Dabney, William Clinton

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting an action to be performed by a reinforcement learning agent interacting with an environment. In one aspect, a method comprises: receiving a current observation; for each action of a plurality of actions: randomly sampling one or more probability values; for each probability value: processing the action, the current observation, and the probability value using a quantile function network to generate an estimated quantile value for the probability value with respect to a probability distribution over possible returns that would result from the agent performing the action in response to the current observation; determining a measure of central tendency of the one or more estimated quantile values; and selecting an action to be performed by the agent in response to the current observation using the measures of central tendency for the actions.
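
The action selection loop above can be illustrated with a toy quantile function: for each action, sample probability values, query quantile estimates of the return distribution, average them, and pick the action with the best average. The quantile "network" below is a random linear map used purely as a placeholder.

```python
import numpy as np

rng = np.random.default_rng(0)
num_actions, obs_dim, num_taus = 4, 6, 8

# Placeholder for a trained quantile function network.
w_obs = rng.normal(size=obs_dim)
w_action = rng.normal(size=num_actions)

def quantile_value(action, observation, tau):
    """Toy estimated quantile of the return distribution at probability tau."""
    base = observation @ w_obs + w_action[action]
    return base + (tau - 0.5) * 2.0            # spread of the return distribution

def select_action(observation):
    means = []
    for action in range(num_actions):
        taus = rng.uniform(size=num_taus)      # randomly sampled probability values
        values = [quantile_value(action, observation, t) for t in taus]
        means.append(np.mean(values))          # measure of central tendency
    return int(np.argmax(means))

observation = rng.normal(size=obs_dim)
print(select_action(observation))
```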


4.

FAST EXPLORATION AND LEARNING OF LATENT GRAPH MODELS

      
Application Number 18373870
Status Pending
Filing Date 2023-09-27
First Publication Date 2024-04-18
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Swaminathan, Sivaramakrishnan
  • Dave, Meet Kirankumar
  • Lazaro-Gredilla, Miguel
  • George, Dileep

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a graph model representing an environment being interacted with by an agent. In one aspect, one of the methods include: obtaining experience data; using the experience data to update a visitation count for each of one or more state-action pairs represented by the graph model; and at each of multiple environment exploration steps: computing a utility measure for each of the one or more state-action pairs represented by the graph model; determining, based on the utility measures, a sequence of one or more planned actions that have an information gain that satisfies a threshold; and controlling the agent to perform the sequence of one or more planned actions to cause the environment to transition from a state characterized by a last observation received after a last action in the experience data into a different state.

IPC Classes

  • G06F 16/901 - Indexing; Data structures therefor; Storage structures
  • G06F 17/12 - Simultaneous equations

5.

GENERATING A MODEL OF A TARGET ENVIRONMENT BASED ON INTERACTIONS OF AN AGENT WITH SOURCE ENVIRONMENTS

      
Application Number 18379988
Status Pending
Filing Date 2023-10-13
First Publication Date 2024-04-18
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Bellot, Alexis
  • Malek, Alan John
  • Chiappa, Silvia

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions for an agent in a target environment. In particular, the actions are selected using an environment model for the target environment that is parameterized using interactions of the agent with the target environment and one or more source environments.

IPC Classes

  • G06F 30/20 - Design optimisation, verification or simulation
  • G06F 16/901 - Indexing; Data structures therefor; Storage structures

6.

OPTIMIZING ALGORITHMS FOR HARDWARE DEVICES

      
Application Number 17959210
Status Pending
Filing Date 2022-10-03
First Publication Date 2024-04-18
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Hubert, Thomas Keisuke
  • Huang, Shih-Chieh
  • Novikov, Alexander
  • Fawzi, Alhussein
  • Romera-Paredes, Bernardino
  • Silver, David
  • Hassabis, Demis
  • Swirszcz, Grzegorz Michal
  • Schrittwieser, Julian
  • Kohli, Pushmeet
  • Barekatain, Mohammadamin
  • Balog, Matej
  • Rodriguez Ruiz, Francisco Jesus

Abstract

A method performed by one or more computers for obtaining an optimized algorithm that (i) is functionally equivalent to a target algorithm and (ii) optimizes one or more target properties when executed on a target set of one or more hardware devices. The method includes: initializing a target tensor representing the target algorithm; generating, using a neural network having a plurality of network parameters, a tensor decomposition of the target tensor that parametrizes a candidate algorithm; generating target property values for each of the target properties when executing the candidate algorithm on the target set of hardware devices; determining a benchmarking score for the tensor decomposition based on the target property values of the candidate algorithm; generating a training example from the tensor decomposition and the benchmarking score; and storing, in a training data store, the training example for use in updating the network parameters of the neural network.

IPC Classes

  • G06N 3/08 - Learning methods
  • G06N 3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

7.

NEURAL NETWORKS WITH ADAPTIVE GRADIENT CLIPPING

      
Application Number 18275087
Status Pending
Filing Date 2022-02-02
First Publication Date 2024-04-18
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Brock, Andrew
  • De, Soham
  • Smith, Samuel Laurence
  • Simonyan, Karen

Abstract

There is disclosed a computer-implemented method for training a neural network. The method comprises determining a gradient associated with a parameter of the neural network. The method further comprises determining a ratio of a gradient norm to parameter norm and comparing the ratio to a threshold. In response to determining that the ratio exceeds the threshold, the value of the gradient is reduced such that the ratio is equal to or below the threshold. The value of the parameter is updated based upon the reduced gradient value.
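
The clipping rule in this abstract translates directly: compare the ratio of gradient norm to parameter norm against a threshold and rescale the gradient whenever the ratio is too large. The sketch below is a per-parameter-tensor numpy version under that reading; the threshold value is an arbitrary example.

```python
import numpy as np

def adaptive_clip(gradient, parameter, threshold=0.02, eps=1e-6):
    """Rescale the gradient so that ||grad|| / ||param|| does not exceed the threshold."""
    grad_norm = np.linalg.norm(gradient)
    param_norm = max(np.linalg.norm(parameter), eps)
    ratio = grad_norm / param_norm
    if ratio > threshold:
        gradient = gradient * (threshold * param_norm / max(grad_norm, eps))
    return gradient

def sgd_step(parameter, gradient, lr=0.1):
    clipped = adaptive_clip(gradient, parameter)
    return parameter - lr * clipped            # parameter update uses the reduced gradient

rng = np.random.default_rng(0)
p = rng.normal(size=10)
g = rng.normal(size=10) * 5.0                  # deliberately large gradient
p = sgd_step(p, g)
```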

IPC Classes

  • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
  • G06V 10/776 - Validation; Performance evaluation

8.

META-LEARNED EVOLUTIONARY STRATEGIES OPTIMIZER

      
Application Number 18475859
Status Pending
Filing Date 2023-09-27
First Publication Date 2024-04-18
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Lange, Robert Tjarko
  • Schaul, Tom
  • Chen, Yutian
  • Zahavy, Tom Ben Zion
  • Dalibard, Valentin Clement
  • Lu, Christopher Yenchuan
  • Baveja, Satinder Singh
  • Flennerhag, Johan Sebastian

Abstract

There is provided a computer-implemented method for updating a search distribution of an evolutionary strategies optimizer using an optimizer neural network comprising one or more attention blocks. The method comprises receiving a plurality of candidate solutions, one or more parameters defining the search distribution that the plurality of candidate solutions are sampled from, and fitness score data indicating a fitness of each respective candidate solution of the plurality of candidate solutions. The method further comprises processing, by the one or more attention neural network blocks, the fitness score data using an attention mechanism to generate respective recombination weights corresponding to each respective candidate solution. The method further comprises updating the one or more parameters defining the search distribution based upon the recombination weights applied to the plurality of candidate solutions.
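
A toy sketch of the structure described here: fitness scores pass through an attention block to produce one recombination weight per candidate, and those weights update the mean of the Gaussian search distribution. The attention projections below are random placeholders for the meta-learned optimizer network, so the update is only structurally, not behaviourally, faithful.

```python
import numpy as np

rng = np.random.default_rng(0)
pop_size, dim, d_k = 16, 5, 4

# Random placeholders for the meta-learned attention block's projections.
w_query = rng.normal(size=(1, d_k))
w_key = rng.normal(size=(1, d_k))

def recombination_weights(fitness):
    """Attention over (standardised) fitness scores -> one weight per candidate."""
    z = (fitness - fitness.mean()) / (fitness.std() + 1e-8)
    queries = z[:, None] @ w_query             # (pop, d_k)
    keys = z[:, None] @ w_key
    logits = (queries * keys).sum(axis=1) / np.sqrt(d_k)
    weights = np.exp(logits - logits.max())
    return weights / weights.sum()

def es_step(mean, std, objective):
    candidates = mean + std * rng.normal(size=(pop_size, dim))   # sample the search distribution
    fitness = np.array([objective(c) for c in candidates])
    w = recombination_weights(fitness)
    new_mean = (w[:, None] * candidates).sum(axis=0)             # weighted recombination
    return new_mean, std

mean, std = np.zeros(dim), 0.5
for _ in range(20):
    mean, std = es_step(mean, std, objective=lambda x: -np.sum(x ** 2))
```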

IPC Classes

  • G06N 3/086 - Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming

9.

DISTRIBUTED TRAINING USING ACTOR-CRITIC REINFORCEMENT LEARNING WITH OFF-POLICY CORRECTION FACTORS

      
Application Number 18487428
Status Pending
Filing Date 2023-10-16
First Publication Date 2024-04-18
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Soyer, Hubert Josef
  • Espeholt, Lasse
  • Simonyan, Karen
  • Doron, Yotam
  • Firoiu, Vlad
  • Mnih, Volodymyr
  • Kavukcuoglu, Koray
  • Munos, Remi
  • Ward, Thomas
  • Harley, Timothy James Alexander
  • Dunning, Iain Robert

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a plurality of actor computing units and a plurality of learner computing units. The actor computing units generate experience tuple trajectories that are used by the learner computing units to update learner action selection neural network parameters using a reinforcement learning technique. The reinforcement learning technique may be an off-policy actor critic reinforcement learning technique.


10.

PREDICTING PROTEIN AMINO ACID SEQUENCES USING GENERATIVE MODELS CONDITIONED ON PROTEIN STRUCTURE EMBEDDINGS

      
Application Number 18275933
Status Pending
Filing Date 2022-01-27
First Publication Date 2024-04-11
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Senior, Andrew W.
  • Kohl, Simon
  • Yim, Jason
  • Bates, Russell James
  • Ionescu, Catalin-Dumitru
  • Nash, Charlie Thomas Curtis
  • Razavi-Nematollahi, Ali
  • Pritzel, Alexander
  • Jumper, John

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing protein design. In one aspect, a method comprises: processing an input characterizing a target protein structure of a target protein using an embedding neural network having a plurality of embedding neural network parameters to generate an embedding of the target protein structure of the target protein; determining a predicted amino acid sequence of the target protein based on the embedding of the target protein structure, comprising: conditioning a generative neural network having a plurality of generative neural network parameters on the embedding of the target protein structure; and generating, by the generative neural network conditioned on the embedding of the target protein structure, a representation of the predicted amino acid sequence of the target protein.

IPC Classes

  • G16B 15/00 - ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
  • G16B 30/20 - Sequence assembly
  • G16B 40/20 - Supervised data analysis

11.

DISCRETE TOKEN PROCESSING USING DIFFUSION MODELS

      
Application Number 18374447
Status Pending
Filing Date 2023-09-28
First Publication Date 2024-04-11
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Strudel, Robin
  • Leblond, Rémi
  • Sifre, Laurent
  • Dieleman, Sander Etienne Lea
  • Savinov, Nikolay
  • Grathwohl, Will S.
  • Tallec, Corentin
  • Altché, Florent
  • Ganin, Iaroslav
  • Mensch, Arthur
  • Du, Yilin

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an output sequence of discrete tokens using a diffusion model. In one aspect, a method includes generating, by using the diffusion model, a final latent representation of the sequence of discrete tokens that includes a determined value for each of a plurality of latent variables; applying a de-embedding matrix to the final latent representation of the output sequence of discrete tokens to generate a de-embedded final latent representation that includes, for each of the plurality of latent variables, a respective numeric score for each discrete token in a vocabulary of multiple discrete tokens; selecting, for each of the plurality of latent variables, a discrete token from among the multiple discrete tokens in the vocabulary that has a highest numeric score; and generating the output sequence of discrete tokens that includes the selected discrete tokens.
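
The final decoding step described above is simple to illustrate: apply a de-embedding matrix to the final latent representation and take the highest-scoring token for each latent variable. The latent below is random rather than produced by a diffusion model's reverse process, and the sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, latent_dim, vocab_size = 6, 8, 20

# Stand-ins: a final latent representation (normally produced by the diffusion
# model) and a learned de-embedding matrix.
final_latent = rng.normal(size=(seq_len, latent_dim))
de_embedding = rng.normal(size=(latent_dim, vocab_size))

scores = final_latent @ de_embedding          # numeric score for every vocabulary token
token_ids = scores.argmax(axis=1)             # highest-scoring token per latent variable
print(token_ids.tolist())                     # the output sequence of discrete tokens
```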


12.

PROGRESSIVE NEURAL NETWORKS

      
Application Number 18479775
Status Pending
Filing Date 2023-10-02
First Publication Date 2024-04-11
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Rabinowitz, Neil Charles
  • Desjardins, Guillaume
  • Rusu, Andrei-Alexandru
  • Kavukcuoglu, Koray
  • Hadsell, Raia Thais
  • Pascanu, Razvan
  • Kirkpatrick, James
  • Soyer, Hubert Josef

Abstract

Methods and systems for performing a sequence of machine learning tasks. One system includes a sequence of deep neural networks (DNNs), including: a first DNN corresponding to a first machine learning task, wherein the first DNN comprises a first plurality of indexed layers, and each layer in the first plurality of indexed layers is configured to receive a respective layer input and process the layer input to generate a respective layer output; and one or more subsequent DNNs corresponding to one or more respective machine learning tasks, wherein each subsequent DNN comprises a respective plurality of indexed layers, and each layer in a respective plurality of indexed layers with index greater than one receives input from a preceding layer of the respective subsequent DNN, and one or more preceding layers of respective preceding DNNs, wherein a preceding layer is a layer whose index is one less than the current index.
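
A minimal two-column sketch of the lateral wiring described above: each layer of the second column receives input from its own preceding layer and from the preceding layer of the first column. The weights are random placeholders; in the actual scheme the first column would be trained on task 1 and then frozen.

```python
import numpy as np

rng = np.random.default_rng(0)
layer_sizes = [4, 8, 8, 3]                     # input, two hidden layers, output

def make_column(sizes):
    return [rng.normal(size=(m, n)) * 0.3 for m, n in zip(sizes[:-1], sizes[1:])]

col1 = make_column(layer_sizes)                # first DNN (task 1)
col2 = make_column(layer_sizes)                # subsequent DNN (task 2)
lateral = make_column(layer_sizes)             # connections from column 1 into column 2

def forward(x):
    h1, h2 = x, x
    for w1, w2, wl in zip(col1, col2, lateral):
        new_h1 = np.tanh(h1 @ w1)
        new_h2 = np.tanh(h2 @ w2 + h1 @ wl)    # input from the preceding layer of the preceding column
        h1, h2 = new_h1, new_h2
    return h2                                  # task-2 output

print(forward(rng.normal(size=4)))
```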


13.

EVALUATING REPRESENTATIONS WITH READ-OUT MODEL SWITCHING

      
Application Number 18475972
Status Pending
Filing Date 2023-09-27
First Publication Date 2024-04-11
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Li, Yazhe
  • Bornschein, Jorg
  • Hutter, Marcus

Abstract

A method of automatically selecting a neural network from a plurality of computer-implemented candidate neural networks, each candidate neural network comprising at least an encoder neural network trained to encode an input value as a latent representation. The method comprises: obtaining a sequence of data items, each of the data items comprising an input value and a target value; and determining a respective score for each of the candidate neural networks, comprising evaluating the encoder neural network of the candidate neural network using a plurality of read-out heads. Each read-out head comprises parameters for predicting a target value from a latent representation of an input value of a data item encoded using the encoder neural network of the candidate neural network. The method further comprises selecting the neural network from the plurality of candidate neural networks using the respective scores.


14.

CONTROLLING AGENTS USING REPORTER NEURAL NETWORKS

      
Application Number 18475157
Status Pending
Filing Date 2023-09-26
First Publication Date 2024-04-04
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Dasgupta, Ishita
  • Chen, Shiqi
  • Marino, Kenneth Daniel
  • Shang, Wenling
  • Ahuja, Arun

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for controlling agents using reporter neural networks.


15.

REWARD-MODEL BASED REINFORCEMENT LEARNING FOR PERFORMING REASONING TASKS

      
Application Number 18475743
Status Pending
Filing Date 2023-09-27
First Publication Date 2024-03-28
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Higgins, Irina
  • Uesato, Jonathan Ken
  • Kushman, Nathaniel Arthur
  • Kumar, Ramana

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a language model for performing a reasoning task. The system obtains a plurality of training examples. Each training example includes a respective sample query text sequence characterizing a respective sample query and a respective reference response text sequence that includes a reference final answer to the respective sample query. The system trains a reward model on the plurality of training examples. The reward model is configured to receive an input including a query text sequence characterizing a query and one or more reasoning steps that have been generated in response to the query and process the input to compute a reward score indicating how successful the one or more reasoning steps are in yielding a correct final answer to the query. The system trains the language model using the trained reward model.


16.

GENERATING NEURAL NETWORK OUTPUTS BY ENRICHING LATENT EMBEDDINGS USING SELF-ATTENTION AND CROSS-ATTENTION OPERATIONS

      
Application Number 18271611
Status Pending
Filing Date 2022-02-03
First Publication Date 2024-03-28
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Jaegle, Andrew Coulter
  • Carreira, Joao

Abstract

This specification describes a method for using a neural network to generate a network output that characterizes an entity. The method includes: obtaining a representation of the entity as a set of data element embeddings, obtaining a set of latent embeddings, and processing: (i) the set of data element embeddings, and (ii) the set of latent embeddings, using the neural network to generate the network output characterizing the entity. The neural network includes: (i) one or more cross-attention blocks, (ii) one or more self-attention blocks, and (iii) an output block. Each cross-attention block updates each latent embedding using attention over some or all of the data element embeddings. Each self-attention block updates each latent embedding using attention over the set of latent embeddings. The output block processes one or more latent embeddings to generate the network output that characterizes the entity.
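
A single-head numpy sketch of that block structure: cross-attention updates each latent embedding using attention over the data element embeddings, self-attention updates the latents over themselves, and an output block pools the latents. All weights are random placeholders and the residual, single-head layout is a simplification.

```python
import numpy as np

rng = np.random.default_rng(0)
num_data, num_latents, dim = 32, 8, 16

def attend(queries, keys, values):
    """Single-head scaled dot-product attention."""
    logits = queries @ keys.T / np.sqrt(keys.shape[1])
    weights = np.exp(logits - logits.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ values

data_embeddings = rng.normal(size=(num_data, dim))      # the entity's data element embeddings
latents = rng.normal(size=(num_latents, dim))           # set of latent embeddings
w_out = rng.normal(size=(dim, 3)) * 0.1                 # output block (3-way output, say)

for _ in range(2):                                      # a couple of processing blocks
    latents = latents + attend(latents, data_embeddings, data_embeddings)  # cross-attention
    latents = latents + attend(latents, latents, latents)                  # self-attention

network_output = latents.mean(axis=0) @ w_out           # output block pools the latents
print(network_output)
```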


17.

SEQUENCE-TO SEQUENCE NEURAL NETWORK SYSTEMS USING LOOK AHEAD TREE SEARCH

      
Application Number 18274748
Status Pending
Filing Date 2022-02-08
First Publication Date 2024-03-28
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Leblond, Rémi Bertrand Francis
  • Alayrac, Jean-Baptiste
  • Sifre, Laurent
  • Pîslar, Miruna
  • Lespiau, Jean-Baptiste
  • Antonoglou, Ioannis
  • Simonyan, Karen
  • Silver, David
  • Vinyals, Oriol

Abstract

A computer-implemented method for generating an output token sequence from an input token sequence. The method combines a look ahead tree search, such as a Monte Carlo tree search, with a sequence-to-sequence neural network system. The sequence-to-sequence neural network system has a policy output defining a next token probability distribution, and may include a value neural network providing a value output to evaluate a sequence. An initial partial output sequence is extended using the look ahead tree search guided by the policy output and, in implementations, the value output, of the sequence-to-sequence neural network system until a complete output sequence is obtained.

IPC Classes

  • G06N 3/0455 - Auto-encoder networks; Encoder-decoder networks

18.

GENERATING IMAGES USING SPARSE REPRESENTATIONS

      
Application Number 18275048
Status Pending
Filing Date 2022-02-07
First Publication Date 2024-03-28
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Nash, Charlie Thomas Curtis
  • Battaglia, Peter William

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating compressed representations of synthetic images. One of the methods is a method of generating a synthetic image using a generative neural network, and includes: generating, using the generative neural network, a plurality of coefficients that represent the synthetic image after the synthetic image has been encoded using a lossy compression algorithm; and decoding the synthetic image by applying the lossy compression algorithm to the plurality of coefficients.

IPC Classes

  • G06T 9/00 - Image coding
  • G06T 7/11 - Region-based segmentation
  • G06T 7/73 - Determining position or orientation of objects or cameras using feature-based methods

19.

TEMPORAL DIFFERENCE SCALING WHEN CONTROLLING AGENTS USING REINFORCEMENT LEARNING

      
Application Number 18275145
Status Pending
Filing Date 2022-02-04
First Publication Date 2024-03-28
Owner DeepMind Technologies Limited (United Kingdom)
Inventor Schaul, Tom

Abstract

A reinforcement learning neural network system configured to manage rewards on scales that can vary significantly. The system determines the value of a scale factor that is applied to a temporal difference error used for reinforcement learning. The scale factor depends at least upon a variance of the rewards received during the reinforcement learning.
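
One way to read the scaling idea, as a toy sketch: divide the temporal difference error by a running estimate of the standard deviation of observed rewards. The running estimate, epsilon floor, and discount are illustrative assumptions rather than the patented scheme.

```python
import numpy as np

class ScaledTD:
    """Toy temporal-difference error scaled by a running estimate of reward variance."""

    def __init__(self, eps=1e-4):
        self.rewards = []
        self.eps = eps

    def scaled_td_error(self, reward, value, next_value, discount=0.99):
        self.rewards.append(reward)
        scale = 1.0 / np.sqrt(np.var(self.rewards) + self.eps)  # scale factor from reward variance
        td_error = reward + discount * next_value - value
        return scale * td_error

scaler = ScaledTD()
rng = np.random.default_rng(0)
for _ in range(10):
    r = rng.normal(loc=100.0, scale=30.0)     # rewards on a large scale
    print(round(scaler.scaled_td_error(r, value=90.0, next_value=95.0), 3))
```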


20.

NEURAL NETWORK REINFORCEMENT LEARNING WITH DIVERSE POLICIES

      
Application Number 18275511
Status Pending
Filing Date 2022-02-04
First Publication Date 2024-03-28
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Zahavy, Tom Ben Zion
  • O'Donoghue, Brendan Timothy
  • Da Motta Salles Barreto, Andre
  • Flennerhag, Johan Sebastian
  • Mnih, Volodymyr
  • Baveja, Satinder Singh

Abstract

In one aspect there is provided a method for training a neural network system by reinforcement learning. The neural network system may be configured to receive an input observation characterizing a state of an environment interacted with by an agent and to select and output an action in accordance with a policy aiming to satisfy an objective. The method may comprise obtaining a policy set comprising one or more policies for satisfying the objective and determining a new policy based on the one or more policies. The determining may include one or more optimization steps that aim to maximize a diversity of the new policy relative to the policy set under the condition that the new policy satisfies a minimum performance criterion based on an expected return that would be obtained by following the new policy.


21.

GUIDED DIALOGUE USING LANGUAGE GENERATION NEURAL NETWORKS AND SEARCH

      
Application Number 18471257
Status Pending
Filing Date 2023-09-20
First Publication Date 2024-03-28
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Irving, Geoffrey
  • Glaese, Amelia Marita Claudia
  • Mcaleese-Park, Nathaniel John
  • Hendricks, Lisa Anne Marie

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enabling a user to conduct a dialogue. Implementations of the system learn when to rely on supporting evidence, obtained from an external search system via a search system interface, and are also able to generate replies for the user that align with the preferences of a previously trained response selection neural network. Implementations of the system can also use a previously trained rule violation detection neural network to generate replies that take account of previously learnt rules.

IPC Classes

  • G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
  • G06F 40/284 - Lexical analysis, e.g. tokenisation or collocates
  • G06F 40/35 - Discourse or dialogue representation
  • G06N 3/0455 - Auto-encoder networks; Encoder-decoder networks
  • G06N 3/092 - Reinforcement learning

22.

AGENT CONTROL THROUGH IN-CONTEXT REINFORCEMENT LEARNING

      
Application Number 18477492
Status Pending
Filing Date 2023-09-28
First Publication Date 2024-03-28
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Laskin, Michael
  • Mnih, Volodymyr
  • Wang, Luyu
  • Baveja, Satinder Singh

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for controlling agents. In particular, an agent can be controlled using an action selection neural network that performs in-context reinforcement learning when controlling an agent on a new task.


23.

Image processing of an environment to select an action to be performed by an agent interacting with the environment

      
Application Number 17737544
Grant Number 11941088
Status In Force
Filing Date 2022-05-05
First Publication Date 2024-03-26
Grant Date 2024-03-26
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Mnih, Volodymyr
  • Kavukcuoglu, Koray

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing images using recurrent attention. One of the methods includes determining a location in the first image; extracting a glimpse from the first image using the location; generating a glimpse representation of the extracted glimpse; processing the glimpse representation using a recurrent neural network to update a current internal state of the recurrent neural network to generate a new internal state; processing the new internal state to select a location in a next image in the image sequence after the first image; and processing the new internal state to select an action from a predetermined set of possible actions.

IPC Classes

  • G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
  • G06F 18/2431 - Multiple classes
  • G06V 20/80 - Recognising image objects characterised by unique random patterns
  • G06V 30/194 - References adjustable by an adaptive method, e.g. learning
  • G06V 30/413 - Classification of content, e.g. text, photographs or tables

24.

ATTENTION NEURAL NETWORKS WITH SHORT-TERM MEMORY UNITS

      
Application Number 18275052
Status Pending
Filing Date 2022-02-07
First Publication Date 2024-03-21
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Banino, Andrea
  • Badia, Adrià Puigdomènech
  • Walker, Jacob Charles
  • Scholtes, Timothy Anthony Julian
  • Mitrovic, Jovana
  • Blundell, Charles

Abstract

A system for controlling an agent interacting with an environment to perform a task. The system includes an action selection neural network configured to generate action selection outputs that are used to select actions to be performed by the agent. The action selection neural network includes an encoder sub network configured to generate encoded representations of the current observations; an attention sub network configured to generate attention sub network outputs with the use of an attention mechanism; a recurrent sub network configured to generate recurrent sub network outputs; and an action selection sub network configured to generate the action selection outputs that are used to select the actions to be performed by the agent in response to the current observations.

IPC Classes

  • G06N 3/0442 - Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
  • G06N 3/092 - Reinforcement learning

25.

DETERMINING PRINCIPAL COMPONENTS USING MULTI-AGENT INTERACTION

      
Application Number 18275045
Status Pending
Filing Date 2022-02-07
First Publication Date 2024-03-14
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Gemp, Ian Michael
  • Mcwilliams, Brian

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining principal components of a data set using multi-agent interactions. One of the methods includes obtaining initial estimates for a plurality of principal components of a data set; and generating a final estimate for each principal component by repeatedly performing operations comprising: generating a reward estimate using the current estimate of the principal component, wherein the reward estimate is larger if the current estimate of the principal component captures more variance in the data set; generating, for each parent principal component of the principal component, a punishment estimate, wherein the punishment estimate is larger if the current estimate of the principal component and the current estimate of the parent principal component are not orthogonal; and updating the current estimate of the principal component according to a difference between the reward estimate and the punishment estimates.
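
A simplified numpy sketch of that reward/punishment update: each component estimate ascends a "reward" gradient that grows with the variance it captures, minus a "punishment" gradient that grows when it fails to be orthogonal to its parent components, and is then renormalised. The specific gradient forms and step size here are one plausible instantiation, not the claimed method.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 5)) @ np.diag([3.0, 2.0, 1.0, 0.5, 0.1])
cov = data.T @ data / len(data)

k, lr, steps = 3, 0.05, 500
V = rng.normal(size=(5, k))
V /= np.linalg.norm(V, axis=0)                 # initial estimates of the principal components

for _ in range(steps):
    for i in range(k):
        v = V[:, i]
        reward_grad = 2 * cov @ v              # larger when more variance is captured
        punish_grad = np.zeros_like(v)
        for j in range(i):                     # parent principal components
            parent = V[:, j]
            coeff = (v @ cov @ parent) / (parent @ cov @ parent)
            punish_grad += 2 * coeff * cov @ parent   # penalise non-orthogonality
        v = v + lr * (reward_grad - punish_grad)      # difference of reward and punishments
        V[:, i] = v / np.linalg.norm(v)

print(V.T @ V)                                 # should be close to the identity matrix
```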

IPC Classes

  • G06N 7/01 - Probabilistic graphical models, e.g. probabilistic networks

26.

PREDICTING COMPLETE PROTEIN REPRESENTATIONS FROM MASKED PROTEIN REPRESENTATIONS

      
Application Number 18273594
Status Pending
Filing Date 2022-01-27
First Publication Date 2024-03-14
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Pritzel, Alexander
  • Ionescu, Catalin-Dumitru
  • Kohl, Simon

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for unmasking a masked representation of a protein using a protein reconstruction neural network. In one aspect, a method comprises: receiving the masked representation of the protein; and processing the masked representation of the protein using the protein reconstruction neural network to generate a respective predicted embedding corresponding to one or more masked embeddings that are included in the masked representation of the protein, wherein a predicted embedding corresponding to a masked embedding in a representation of the amino acid sequence of the protein defines a prediction for an identity of an amino acid at a corresponding position in the amino acid sequence, wherein a predicted embedding corresponding to a masked embedding in a representation of the structure of the protein defines a prediction for a corresponding structural feature of the protein.

IPC Classes

  • G16B 40/20 - Supervised data analysis
  • G16B 15/20 - Protein or domain folding
  • G16B 15/30 - Drug targeting using structural data; Docking or binding prediction
  • G16B 20/50 - Mutagenesis
  • G16B 30/00 - ICT specially adapted for sequence analysis involving nucleotides or amino acids

27.

CONTROLLING AGENTS USING STATE ASSOCIATIVE LEARNING FOR LONG-TERM CREDIT ASSIGNMENT

      
Application Number 18275542
Status Pending
Filing Date 2022-02-04
First Publication Date 2024-03-14
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Ritter, Samuel
  • Raposo, David Nunes

Abstract

A computer-implemented reinforcement learning neural network system that learns a model of rewards in order to relate actions by an agent in an environment to their long-term consequences. The model learns to decompose the rewards into components explainable by different past states. That is, the model learns to associate when being in a particular state of the environment is predictive of a reward in a later state, even when the later state, and reward, is only achieved after a very long time delay.


28.

PREDICTING EXCHANGE-CORRELATION ENERGIES OF ATOMIC SYSTEMS USING NEURAL NETWORKS

      
Application Number 18260182
Status Pending
Filing Date 2022-01-07
First Publication Date 2024-02-29
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Kirkpatrick, James
  • Mcmorrow, Brendan Charles
  • Turban, David Herbert Phlipp
  • Gaunt, Alexander Lloyd
  • Spencer, James
  • Matthews, Alexander Graeme De Garis
  • Cohen, Aron Jonathan

Abstract

Methods, computer systems, and apparatus, including computer programs encoded on computer storage media, for predicting an exchange-correlation energy of an atomic system. The system obtains respective electron-orbital features of the atomic system at each of a plurality of grid points; generates, for each of the plurality of grid points, a respective input feature vector for the electron-orbital features at the grid point; and processes the respective input feature vectors for the plurality of grid points using a neural network to generate a predicted exchange-correlation energy of the atomic system.

IPC Classes

  • G16C 20/30 - Prediction of properties of chemical compounds, compositions or mixtures
  • G06N 3/08 - Learning methods
  • G16C 20/70 - Machine learning, data mining or chemometrics

29.

RENDERING NEW IMAGES OF SCENES USING GEOMETRY-AWARE NEURAL NETWORKS CONDITIONED ON LATENT VARIABLES

      
Application Number 18275332
Status Pending
Filing Date 2022-02-04
First Publication Date 2024-02-29
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Kosiorek, Adam Roman
  • Strathmann, Heiko
  • Rezende, Danilo Jimenez
  • Zoran, Daniel
  • Moreno Comellas, Pol

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for rendering a new image that depicts a scene from a perspective of a camera at a new camera location. In one aspect, a method comprises: receiving a plurality of observations characterizing the scene; generating a latent variable representing the scene from the plurality of observations characterizing the scene; conditioning a scene representation neural network on the latent variable representing the scene, wherein the scene representation neural network conditioned on the latent variable representing the scene defines a geometric model of the scene as a three-dimensional (3D) radiance field; and rendering the new image that depicts the scene from the perspective of the camera at the new camera location using the scene representation neural network conditioned on the latent variable representing the scene.


30.

DATA-EFFICIENT REINFORCEMENT LEARNING FOR CONTINUOUS CONTROL TASKS

      
Application Number 18351440
Status Pending
Filing Date 2023-07-12
First Publication Date 2024-02-22
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Riedmiller, Martin
  • Hafner, Roland
  • Vecerik, Mel
  • Lillicrap, Timothy Paul
  • Lampe, Thomas
  • Popov, Ivaylo
  • Barth-Maron, Gabriel
  • Heess, Nicolas Manfred Otto

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for data-efficient reinforcement learning. One of the systems is a system for training an actor neural network used to select actions to be performed by an agent that interacts with an environment by receiving observations characterizing states of the environment and, in response to each observation, performing an action selected from a continuous space of possible actions, wherein the actor neural network maps observations to next actions in accordance with values of parameters of the actor neural network, and wherein the system comprises: a plurality of workers, wherein each worker is configured to operate independently of each other worker, wherein each worker is associated with a respective agent replica that interacts with a respective replica of the environment during the training of the actor neural network.

IPC Classes

  • G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
  • G06N 3/08 - Learning methods
  • G06N 3/088 - Non-supervised learning, e.g. competitive learning
  • G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
  • G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
  • G06N 3/045 - Combinations of networks

31.

SOLVING MIXED INTEGER PROGRAMS USING NEURAL NETWORKS

      
Application Number 18267363
Status Pending
Filing Date 2021-12-20
First Publication Date 2024-02-22
Owner DeepMind Technologies Limited (USA)
Inventor
  • Bartunov, Sergey
  • Gimeno Gil, Felix Axel
  • Von Glehn, Ingrid Karin
  • Lichocki, Pawel
  • Lobov, Ivan
  • Nair, Vinod
  • O'Donoghue, Brendan Timothy
  • Sonnerat, Nicolas
  • Tjandraatmadja, Christian
  • Wang, Pengming

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for solving mixed integer programs (MIPs) using neural networks. One of the methods includes obtaining data specifying parameters of a MIP; generating, from the parameters of the MIP, an input representation; processing the input representation using an encoder neural network to generate a respective embedding for each of the integer variables; generating a plurality of partial assignments by selecting a respective second, proper subset of the integer variables; and for each of the variables in the respective second subset, generating, using at least the respective embedding for the variable, a respective additional constraint on the value of the variable; generating, for each of the partial assignments, a corresponding candidate final assignment that assigns a respective value to each of the plurality of variables; and selecting, as a final assignment for the MIP, one of the candidate final assignments.

IPC Classes

  • G06N 3/08 - Learning methods
  • G06F 17/11 - Complex mathematical operations for solving equations

32.

Selecting actions from large discrete action sets using reinforcement learning

      
Application Number 17131500
Grant Number 11907837
Status In Force
Filing Date 2020-12-22
First Publication Date 2024-02-20
Grant Date 2024-02-20
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Dulac-Arnold, Gabriel
  • Evans, Richard Andrew
  • Coppin, Benjamin Kenneth

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for selecting actions from large discrete action sets. One of the methods includes receiving a particular observation representing a particular state of an environment; and selecting an action from a discrete set of actions to be performed by an agent interacting with the environment, comprising: processing the particular observation using an actor policy network to generate an ideal point; determining, from the points that represent actions in the set, the k nearest points to the ideal point; for each nearest point of the k nearest points: processing the nearest point and the particular observation using a Q network to generate a respective Q value for the action represented by the nearest point; and selecting the action to be performed by the agent from the k actions represented by the k nearest points based on the Q values.
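
The selection procedure in this abstract lends itself to a short sketch: an actor network maps the observation to an ideal point in an action-embedding space, the k nearest discrete-action points are retrieved, a Q network scores each of them, and the best is chosen. The actor and Q network below are random linear placeholders and the action embeddings are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
num_actions, embed_dim, obs_dim, k = 1000, 8, 12, 10

action_points = rng.normal(size=(num_actions, embed_dim))   # one point per discrete action
actor_w = rng.normal(size=(obs_dim, embed_dim)) * 0.3        # placeholder actor policy network
q_w = rng.normal(size=(obs_dim + embed_dim,)) * 0.3          # placeholder Q network

def select_action(observation):
    ideal = observation @ actor_w                            # ideal point
    distances = np.linalg.norm(action_points - ideal, axis=1)
    nearest = np.argsort(distances)[:k]                      # k nearest candidate actions
    q_values = [np.concatenate([observation, action_points[i]]) @ q_w for i in nearest]
    return int(nearest[int(np.argmax(q_values))])            # best of the k by Q value

print(select_action(rng.normal(size=obs_dim)))
```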


33.

GRAPH NEURAL NETWORK SYSTEMS FOR GENERATING STRUCTURED REPRESENTATIONS OF OBJECTS

      
Application Number 18144810
Status Pending
Filing Date 2023-05-08
First Publication Date 2024-02-15
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Li, Yujia
  • Dyer, Christopher James
  • Vinyals, Oriol

Abstract

There is described a neural network system for generating a graph, the graph comprising a set of nodes and edges. The system comprises one or more neural networks configured to represent a probability distribution over sequences of node generating decisions and/or edge generating decisions, and one or more computers configured to sample the probability distribution represented by the one or more neural networks to generate a graph.

IPC Classes

  • G06N 3/047 - Probabilistic or stochastic networks
  • G06F 16/901 - Indexing; Data structures therefor; Storage structures
  • G06F 17/18 - Complex mathematical operations for evaluating statistical data
  • G06N 3/08 - Learning methods
  • G06N 3/045 - Combinations of networks

34.

CONTROLLING AGENTS USING AUXILIARY PREDICTION NEURAL NETWORKS THAT GENERATE STATE VALUE ESTIMATES

      
Application Number 18230056
Status Pending
Filing Date 2023-08-03
First Publication Date 2024-02-08
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Zaheer, Muhammad
  • Modayil, Joseph Varughese

Abstract

Method, system, and non-transitory computer storage media for selecting actions to be performed by an agent to interact with an environment to perform a main task by, for each time step in a sequence of time steps: receiving a set of features representing an observation; for each of one or more auxiliary prediction neural networks, generating a state value estimate for the current state of the environment relative to a corresponding auxiliary reward that measures values of a corresponding target feature from the set of features representing the observations for the sequence of time steps; processing an input comprising a respective intermediate output generated by each auxiliary neural network at the time step using an action selection neural network to generate an action selection output; and selecting the action to be performed by the agent at the time step using the action selection output.


35.

JOINTLY UPDATING AGENT CONTROL POLICIES USING ESTIMATED BEST RESPONSES TO CURRENT CONTROL POLICIES

      
Application Number 18275881
Status Pending
Filing Date 2022-02-07
First Publication Date 2024-02-08
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Marris, Luke Christopher
  • Muller, Paul Fernand Michel
  • Lanctot, Marc
  • Graepel, Thore Kurt Hartwig

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating control policies for controlling agents in an environment. One of the methods includes, at each of a plurality of iterations: obtaining a current joint control policy for a plurality of agents, the current joint control policy specifying a respective current control policy for each agent; and updating the current joint control policy, comprising, for each agent: generating a respective reward estimate for each of a plurality of alternate control policies that is an estimate of a reward received by the agent if the agent is controlled using the alternate control policy while the other agents are controlled using the respective current control policies; computing a best response for the agent from the respective reward estimates; and updating the respective current control policy for the agent using the best response for the agent.


36.

AUGMENTING ATTENTION-BASED NEURAL NETWORKS TO SELECTIVELY ATTEND TO PAST INPUTS

      
Application Number 18486060
Status Pending
Filing Date 2023-10-12
First Publication Date 2024-02-08
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Rae, Jack William
  • Potapenko, Anna
  • Lillicrap, Timothy Paul

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing a machine learning task on a network input that is a sequence to generate a network output. In one aspect, one of the methods includes, for each particular sequence of layer inputs: for each attention layer in the neural network: maintaining episodic memory data; maintaining compressed memory data; receiving a layer input to be processed by the attention layer; and applying an attention mechanism over (i) the compressed representation in the compressed memory data for the layer, (ii) the hidden states in the episodic memory data for the layer, and (iii) the respective hidden state at each of the plurality of input positions in the particular network input to generate a respective activation for each input position in the layer input.

IPC Classes

  • G06N 3/084 - Backpropagation, e.g. using gradient descent
  • G06N 3/08 - Learning methods
  • G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
  • G06N 3/047 - Probabilistic or stochastic networks

37.

MULTI-TASK NEURAL NETWORKS WITH TASK-SPECIFIC PATHS

      
Application Number 18487707
Status Pending
Filing Date 2023-10-16
First Publication Date 2024-02-08
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Wierstra, Daniel Pieter
  • Fernando, Chrisantha Thomas
  • Pritzel, Alexander
  • Banarse, Dylan Sunil
  • Blundell, Charles
  • Rusu, Andrei-Alexandru
  • Zwols, Yori
  • Ha, David

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using multi-task neural networks. One of the methods includes receiving a first network input and data identifying a first machine learning task to be performed on the first network input; selecting a path through the plurality of layers in a super neural network that is specific to the first machine learning task, the path specifying, for each of the layers, a proper subset of the modular neural networks in the layer that are designated as active when performing the first machine learning task; and causing the super neural network to process the first network input using (i) for each layer, the modular neural networks in the layer that are designated as active by the selected path and (ii) the set of one or more output layers corresponding to the identified first machine learning task.

IPC Classes

  • G06N 3/086 - Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming
  • G06N 3/04 - Architecture, e.g. interconnection topology
  • G06N 3/044 - Recurrent networks, e.g. Hopfield networks
  • G06N 3/045 - Combinations of networks

38.

DATA-DRIVEN ROBOT CONTROL

      
Application Number 18331632
Status Pending
Filing Date 2023-06-08
First Publication Date 2024-02-08
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Cabi, Serkan
  • Wang, Ziyu
  • Novikov, Alexander
  • Konyushkova, Ksenia
  • Gomez Colmenarejo, Sergio
  • Reed, Scott Ellison
  • Denil, Misha Man Ray
  • Scholz, Jonathan Karl
  • Sushkov, Oleg O.
  • Jeong, Rae Chan
  • Barker, David
  • Budden, David
  • Vecerik, Mel
  • Aytar, Yusuf
  • Gomes De Freitas, Joao Ferdinando

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for data-driven robotic control. One of the methods includes maintaining robot experience data; obtaining annotation data; training, on the annotation data, a reward model; generating task-specific training data for the particular task, comprising, for each experience in a second subset of the experiences in the robot experience data: processing the observation in the experience using the trained reward model to generate a reward prediction, and associating the reward prediction with the experience; and training a policy neural network on the task-specific training data for the particular task, wherein the policy neural network is configured to receive a network input comprising an observation and to generate a policy output that defines a control policy for a robot performing the particular task.


39.

Reinforcement learning with scheduled auxiliary control

      
Application Number 16289531
Grant Number 11893480
Status In Force
Filing Date 2019-02-28
First Publication Date 2024-02-06
Grant Date 2024-02-06
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Riedmiller, Martin
  • Hafner, Roland

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for reinforcement learning with scheduled auxiliary tasks. In one aspect, a method includes maintaining data specifying parameter values for a primary policy neural network and one or more auxiliary neural networks; at each of a plurality of selection time steps during a training episode comprising a plurality of time steps: receiving an observation, selecting a current task for the selection time step using a task scheduling policy, processing an input comprising the observation using the policy neural network corresponding to the selected current task to select an action to be performed by the agent in response to the observation, and causing the agent to perform the selected action.
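
A toy sketch of the scheduling loop described above: keep one policy per task (a primary task plus auxiliary tasks), pick a task at each selection time step with a scheduling policy, and act with the chosen task's policy. The uniform scheduler and the random linear policies are placeholders, not the trained components.

```python
import numpy as np

rng = np.random.default_rng(0)
obs_dim, num_actions = 6, 4
tasks = ["main", "aux_reach", "aux_touch"]     # primary task plus hypothetical auxiliary tasks

# Placeholder policy networks, one per task.
policy_weights = {t: rng.normal(size=(obs_dim, num_actions)) for t in tasks}

def schedule_task():
    """Toy task-scheduling policy: pick a task uniformly at random."""
    return tasks[rng.integers(len(tasks))]

def act(observation, task):
    logits = observation @ policy_weights[task]
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return int(rng.choice(num_actions, p=probs))

for step in range(6):                          # selection time steps within a training episode
    obs = rng.normal(size=obs_dim)
    task = schedule_task()
    action = act(obs, task)
    print(step, task, action)
```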

IPC Classes

  • G06N 3/08 - Learning methods
  • G06N 3/04 - Architecture, e.g. interconnection topology
  • G06N 7/01 - Probabilistic graphical models, e.g. probabilistic networks

40.

JOINTLY LEARNING EXPLORATORY AND NON-EXPLORATORY ACTION SELECTION POLICIES

      
Application Number 18334112
Status Pending
Filing Date 2023-06-13
First Publication Date 2024-01-25
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Badia, Adrià Puigdomènech
  • Sprechmann, Pablo
  • Vitvitskyi, Alex
  • Guo, Zhaohan
  • Piot, Bilal
  • Kapturowski, Steven James
  • Tieleman, Olivier
  • Blundell, Charles

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network that is used to select actions to be performed by an agent interacting with an environment. In one aspect, the method comprises: receiving an observation characterizing a current state of the environment; processing the observation and an exploration importance factor using the action selection neural network to generate an action selection output; selecting an action to be performed by the agent using the action selection output; determining an exploration reward; determining an overall reward based on: (i) the exploration importance factor, and (ii) the exploration reward; and training the action selection neural network using a reinforcement learning technique based on the overall reward.
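
A brief sketch of two pieces of that recipe: the action-selection network receives the observation together with an exploration importance factor, and the overall reward combines the extrinsic and exploration rewards using that factor. The additive combination and the placeholder linear policy are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
obs_dim, num_actions = 5, 3
w = rng.normal(size=(obs_dim + 1, num_actions)) * 0.3   # policy also sees the importance factor

def select_action(observation, exploration_factor):
    logits = np.concatenate([observation, [exploration_factor]]) @ w
    return int(np.argmax(logits))

def overall_reward(extrinsic_reward, exploration_reward, exploration_factor):
    """Overall reward based on the exploration importance factor and exploration reward."""
    return extrinsic_reward + exploration_factor * exploration_reward

obs = rng.normal(size=obs_dim)
beta = 0.3                                     # exploration importance factor for this actor
action = select_action(obs, beta)
r = overall_reward(extrinsic_reward=1.0, exploration_reward=0.2, exploration_factor=beta)
print(action, r)
```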

IPC Classes

  • G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
  • G06N 3/04 - Architecture, e.g. interconnection topology
  • G06N 3/084 - Backpropagation, e.g. using gradient descent
  • G06F 18/22 - Matching criteria, e.g. proximity measures
  • G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
  • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

41.

ACTION CLASSIFICATION IN VIDEO CLIPS USING ATTENTION-BASED NEURAL NETWORKS

      
Application Number 18375941
Status Pending
Filing Date 2023-10-02
First Publication Date 2024-01-25
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Carreira, Joao
  • Doersch, Carl
  • Zisserman, Andrew

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying actions in a video. One of the methods includes obtaining a feature representation of a video clip; obtaining data specifying a plurality of candidate agent bounding boxes in a key video frame of the video clip; and for each candidate agent bounding box: processing the feature representation through an action transformer neural network.

IPC Classes

  • G06V 20/40 - Scenes; Scene-specific elements in video content
  • G06V 10/25 - Determination of region of interest [ROI] or a volume of interest [VOI]
  • G06N 3/045 - Combinations of networks

42.

NEURAL NETWORKS IMPLEMENTING ATTENTION OVER OBJECT EMBEDDINGS FOR OBJECT-CENTRIC VISUAL REASONING

      
Application Number 18029980
Status Pending
Filing Date 2021-10-01
First Publication Date 2024-01-18
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Ding, Fengning
  • Santoro, Adam Anthony
  • Hill, Felix George
  • Botvinick, Matthew
  • Piloto, Luis

Abstract

A video processing system configured to analyze a sequence of video frames to detect objects in the video frames and provide information relating to the detected objects in response to a query. The query may comprise, for example, a request for a prediction of a future event, or of the location of an object, or a request for a prediction of what would happen if an object were modified. The system uses a transformer neural network subsystem to process representations of objects in the video.

IPC Classes

  • G06V 20/40 - Scenes; Scene-specific elements in video content
  • G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
  • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
  • G06V 10/776 - Validation; Performance evaluation

43.

Selecting reinforcement learning actions using a low-level controller

      
Application Number 17541186
Grant Number 11875258
Status In Force
Filing Date 2021-12-02
First Publication Date 2024-01-16
Grant Date 2024-01-16
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Heess, Nicolas Manfred Otto
  • Lillicrap, Timothy Paul
  • Wayne, Gregory Duncan
  • Tassa, Yuval

Abstract

Methods, systems, and apparatus for selecting actions to be performed by an agent interacting with an environment. One system includes a high-level controller neural network, low-level controller network, and subsystem. The high-level controller neural network receives an input observation and processes the input observation to generate a high-level output defining a control signal for the low-level controller. The low-level controller neural network receives a designated component of an input observation and processes the designated component and an input control signal to generate a low-level output that defines an action to be performed by the agent in response to the input observation. The subsystem receives a current observation characterizing a current state of the environment, determines whether criteria are satisfied for generating a new control signal, and based on the determination, provides appropriate inputs to the high-level and low-level controllers for selecting an action to be performed by the agent.

IPC Classes  ?

  • G06N 3/08 - Learning methods
  • G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
  • G06N 3/044 - Recurrent networks, e.g. Hopfield networks
  • G06N 3/045 - Combinations of networks

44.

VOCABULARY SELECTION FOR TEXT PROCESSING TASKS USING POWER INDICES

      
Application Number 18038631
Status Pending
Filing Date 2021-11-22
First Publication Date 2024-01-11
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Gemp, Ian Michael
  • Bachrach, Yoram
  • Patel, Roma
  • Dyer, Christopher James

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for selecting an input vocabulary for a machine learning model using power indices. One of the methods includes computing a respective score for each of a plurality of text tokens in an initial vocabulary and then selecting the text tokens in the input vocabulary based on the respective scores.
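
To illustrate the select-by-score step only, here is a rough sketch in which a plain corpus frequency count stands in for the power-index score the abstract refers to; `corpus_tokens`, `initial_vocabulary`, and the vocabulary size are assumptions:

```python
from collections import Counter

# Hedged sketch: frequency counts stand in for the power-index scores; the
# names below are illustrative, not from the patent.
def select_vocabulary(corpus_tokens, initial_vocabulary, vocab_size=32_000):
    counts = Counter(t for t in corpus_tokens if t in initial_vocabulary)
    # Respective score for each text token in the initial vocabulary.
    scores = {token: counts[token] for token in initial_vocabulary}
    # Select the highest-scoring tokens as the input vocabulary.
    ranked = sorted(initial_vocabulary, key=lambda t: scores[t], reverse=True)
    return ranked[:vocab_size]
```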

IPC Classes  ?

  • G10L 13/047 - Architecture of speech synthesisers
  • G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination

45.

RECURRENT NEURAL NETWORKS FOR DATA ITEM GENERATION

      
Application Number 18367305
Status Pending
Filing Date 2023-09-12
First Publication Date 2023-12-28
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Gregor, Karol
  • Danihelka, Ivo

Abstract

Methods and systems, including computer programs encoded on computer storage media, for generating data items. A method includes reading a glimpse from a data item using a decoder hidden state vector of a decoder for a preceding time step, providing, as input to an encoder, the glimpse and the decoder hidden state vector for the preceding time step for processing, receiving, as output from the encoder, a generated encoder hidden state vector for the time step, generating a decoder input from the generated encoder hidden state vector, providing the decoder input to the decoder for processing, receiving, as output from the decoder, a generated decoder hidden state vector for the time step, generating a neural network output update from the decoder hidden state vector for the time step, and combining the neural network output update with a current neural network output to generate an updated neural network output.
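
A compact sketch of this read/encode/decode/update loop follows, assuming a DRAW-style setup with LSTM cells, a whole-image "glimpse", and no latent sampling or attention; all dimensions and names are illustrative rather than taken from the patent:

```python
import torch
import torch.nn as nn

# Hedged sketch of the read/encode/decode/update loop; the glimpse here is the
# full image plus an error image, and latent sampling/attention are omitted.
class DrawLikeGenerator(nn.Module):
    def __init__(self, x_dim, enc_dim=256, dec_dim=256, z_dim=64, steps=8):
        super().__init__()
        self.steps = steps
        self.encoder = nn.LSTMCell(2 * x_dim + dec_dim, enc_dim)
        self.decoder = nn.LSTMCell(z_dim, dec_dim)
        self.to_z = nn.Linear(enc_dim, z_dim)       # decoder input from the encoder state
        self.to_update = nn.Linear(dec_dim, x_dim)  # output update from the decoder state

    def forward(self, x):
        canvas = torch.zeros_like(x)                # current neural network output
        h_enc = c_enc = x.new_zeros(x.shape[0], self.encoder.hidden_size)
        h_dec = c_dec = x.new_zeros(x.shape[0], self.decoder.hidden_size)
        for _ in range(self.steps):
            # Simplified "read": the data item and an error image; an attention
            # glimpse conditioned on the previous decoder state is omitted.
            glimpse = torch.cat([x, x - torch.sigmoid(canvas)], dim=-1)
            h_enc, c_enc = self.encoder(torch.cat([glimpse, h_dec], dim=-1), (h_enc, c_enc))
            z = self.to_z(h_enc)                    # decoder input for this time step
            h_dec, c_dec = self.decoder(z, (h_dec, c_dec))
            canvas = canvas + self.to_update(h_dec) # combine the update with the current output
        return torch.sigmoid(canvas)
```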

IPC Classes  ?

  • G06N 3/04 - Architecture, e.g. interconnection topology
  • G06N 3/044 - Recurrent networks, e.g. Hopfield networks
  • G06N 3/045 - Combinations of networks

46.

PREDICTING PROTEIN STRUCTURES USING PROTEIN GRAPHS

      
Application Number 18034989
Status Pending
Filing Date 2021-11-23
First Publication Date 2023-12-21
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Pritzel, Alexander
  • Figurnov, Mikhail
  • Jumper, John

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining a predicted structure of a protein. According to one aspect, there is provided a method comprising maintaining graph data representing a graph of the protein; obtaining a respective pair embedding for each edge in the graph; processing the pair embeddings using a sequence of update blocks, wherein each update block performs operations comprising, for each edge in the graph: generating a respective representation of each of a plurality of cycles in the graph that include the edge by, for each cycle, processing embeddings for edges in the cycle in accordance with the values of the update block parameters of the update block to generate the representation of the cycle; and updating the pair embedding for the edge using the representations of the cycles in the graph that include the edge.

IPC Classes  ?

47.

Distributional reinforcement learning for continuous control tasks

      
Application Number 18303117
Grant Number 11948085
Status In Force
Filing Date 2023-04-19
First Publication Date 2023-12-21
Grant Date 2024-04-02
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Budden, David
  • Hoffman, Matthew William
  • Barth-Maron, Gabriel

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network that is used to select actions to be performed by a reinforcement learning agent interacting with an environment. In particular, the actions are selected from a continuous action space and the system trains the action selection neural network jointly with a distribution Q network that is used to update the parameters of the action selection neural network.

IPC Classes  ?

48.

PREDICTING PROTEIN STRUCTURES OVER MULTIPLE ITERATIONS USING RECYCLING

      
Application Number 18034280
Status Pending
Filing Date 2021-11-23
First Publication Date 2023-12-14
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Jumper, John
  • Figurnov, Mikhail

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for predicting a structure of a protein comprising one or more chains. In one aspect, a method comprises, at each subsequent iteration after a first iteration in a sequence of iterations: obtaining a network input for the subsequent iteration that characterizes the protein; generating, from (i) structure parameters generated at a preceding iteration that precedes the subsequent iteration in the sequence, (ii) one or more intermediate outputs generated by the protein structure prediction neural network while generating the structure parameters at the last iteration, or (iii) both, features for the subsequent iteration; and processing the features and the network input for the subsequent iteration using the protein structure prediction neural network to generate structure parameters for the subsequent iteration that define another predicted structure for the protein.

IPC Classes  ?

49.

TRAINING A SPEAKER NEURAL NETWORK USING ONE OR MORE LISTENER NEURAL NETWORKS

      
Application Number 18199896
Status Pending
Filing Date 2023-05-19
First Publication Date 2023-12-14
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Singh, Aaditya K.
  • Ding, Fengning
  • Hill, Felix George
  • Lampinen, Andrew Kyle

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a speaker neural network using one or more listener neural networks.

IPC Classes  ?

  • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
  • G06V 20/62 - Text, e.g. of license plates, overlay texts or captions on TV images

50.

Learning abstractions using patterns of activations of a neural network hidden layer

      
Application Number 16916939
Grant Number 11842270
Status In Force
Filing Date 2020-06-30
First Publication Date 2023-12-12
Grant Date 2023-12-12
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Lerchner, Alexander
  • Hassabis, Demis

Abstract

We describe an artificial neural network comprising: an input layer of input neurons, one or more hidden layers of neurons in successive layers of neurons above the input layer, and at least one further, concept-identifying layer of neurons above the hidden layers. The neural network includes an activation memory coupled to an intermediate, hidden layer of neurons between the input and concept-identifying layers to store a pattern of activation of the intermediate layer. The neural network further includes a system to determine an overlap between a plurality of the stored patterns of activation and to activate in the intermediate hidden layer an overlap pattern such that the concept-identifying layer of neurons is configured to identify features of the overlap patterns. We also describe related methods, processor control code, and computing systems for the neural network. Optionally, further, higher-level concept-identifying layers of neurons may be included.

IPC Classes  ?

51.

PREDICTING PROTEIN STRUCTURES USING AUXILIARY FOLDING NETWORKS

      
Application Number 18034006
Status Pending
Filing Date 2021-11-23
First Publication Date 2023-12-07
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Kohl, Simon
  • Ronneberger, Olaf
  • Figurnov, Mikhail
  • Pritzel, Alexander

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a structure prediction neural network that comprises an embedding neural network and a main folding neural network. According to one aspect, a method comprises: obtaining a training network input characterizing a training protein; processing the training network input using the embedding neural network and the main folding neural network to generate a main structure prediction; for each auxiliary folding neural network in a set of one or more auxiliary folding neural networks, processing at least a corresponding intermediate output of the embedding neural network to generate an auxiliary structure prediction; determining a gradient of an objective function that includes a respective auxiliary structure loss term for each of the auxiliary folding neural networks; and updating the current values of the embedding network parameters and the main folding parameters based on the gradient.

IPC Classes  ?

52.

TRAINING REINFORCEMENT LEARNING AGENTS USING AUGMENTED TEMPORAL DIFFERENCE LEARNING

      
Application Number 18029979
Status Pending
Filing Date 2021-10-01
First Publication Date 2023-11-23
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Gulcehre, Caglar
  • Pascanu, Razvan
  • Gomez, Sergio

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network used to select actions performed by an agent interacting with an environment by performing actions that cause the environment to transition states. One of the methods includes maintaining a replay memory storing a plurality of transitions; selecting a plurality of transitions from the replay memory; and training the neural network on the plurality of transitions, comprising, for each transition: generating an initial Q value for the transition; determining a scaled Q value for the transition; determining a scaled temporal difference learning target for the transition; determining an error between the scaled temporal difference learning target and the scaled Q value; determining an update to the current values of the Q network parameters; and determining an update to the current value of the scaling term.

IPC Classes  ?

  • G06N 3/092 - Reinforcement learning
  • G06N 3/0442 - Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]

53.

TRAINING MACHINE LEARNING MODELS BY DETERMINING UPDATE RULES USING NEURAL NETWORKS

      
Application Number 18180754
Status Pending
Filing Date 2023-03-08
First Publication Date 2023-11-23
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Denil, Misha Man Ray
  • Schaul, Tom
  • Andrychowicz, Marcin
  • Gomes De Freitas, Joao Ferdinando
  • Colmenarejo, Sergio Gomez
  • Hoffman, Matthew William
  • Pfau, David Benjamin

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media for training machine learning models. One method includes obtaining a machine learning model, wherein the machine learning model comprises one or more model parameters, and the machine learning model is trained using gradient descent techniques to optimize an objective function; determining an update rule for the model parameters using a recurrent neural network (RNN); and applying a determined update rule for a final time step in a sequence of multiple time steps to the model parameters.

IPC Classes  ?

  • G06N 3/084 - Backpropagation, e.g. using gradient descent
  • G06N 3/044 - Recurrent networks, e.g. Hopfield networks
  • G06N 3/045 - Combinations of networks

54.

SELECTION-INFERENCE NEURAL NETWORK SYSTEMS

      
Application Number 18317878
Status Pending
Filing Date 2023-05-15
First Publication Date 2023-11-16
Owner DeepMind Technologies Limited (United Kingdom)
Inventor Creswell, Antonia Phoebe Nina

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a response to a query input using a selection-inference neural network.

IPC Classes  ?

  • B60W 50/06 - Improving the dynamic response of the control system, e.g. improving the speed of regulation or avoiding hunting or overshoot
  • G05B 13/02 - Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
  • G06F 40/20 - Natural language analysis
  • B60W 60/00 - Drive control systems specially adapted for autonomous road vehicles
  • B60W 50/02 - Ensuring safety in case of control system failures, e.g. by diagnosing, circumventing or fixing failures

55.

CONSTRAINED REINFORCEMENT LEARNING NEURAL NETWORK SYSTEMS USING PARETO FRONT OPTIMIZATION

      
Application Number 18029992
Status Pending
Filing Date 2021-10-01
First Publication Date 2023-11-16
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Huang, Sandy Han
  • Abdolmaleki, Abbas

Abstract

A system and method that controls an agent to perform a task subject to one or more constraints. The system trains a preference neural network that learns which preferences produce constraint-satisfying action selection policies. The system thus optimizes a hierarchical policy that is a product of a preference policy and a preference-conditioned action selection policy. In this way, the system learns to jointly optimize a set of objectives relating to rewards and costs received during the task whilst also learning preferences, i.e. trade-offs between the rewards and costs, that are most likely to produce policies that satisfy the constraints.

IPC Classes  ?

56.

SIMULATING PHYSICAL ENVIRONMENTS USING GRAPH NEURAL NETWORKS

      
Application Number 18027174
Status Pending
Filing Date 2021-10-01
First Publication Date 2023-11-09
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Sanchez, Alvaro
  • Godwin, Jonathan William
  • Ying, Rex
  • Pfaff, Tobias
  • Fortunato, Meire
  • Battaglia, Peter William

Abstract

This specification describes a simulation system that performs simulations of physical environments using a graph neural network. At each of one or more time steps in a sequence of time steps, the system can process a representation of a current state of the physical environment at the current time step using the graph neural network to generate a prediction of a next state of the physical environment at the next time step. Some implementations of the system are adapted for hardware acceleration. As well as performing simulations, the system can be used to predict physical quantities based on measured real-world data. Implementations of the system are differentiable and can also be used for design optimization, and for optimal control tasks.

IPC Classes  ?

  • G06F 30/27 - Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model

57.

TRAINING PROTEIN STRUCTURE PREDICTION NEURAL NETWORKS USING REDUCED MULTIPLE SEQUENCE ALIGNMENTS

      
Application Number 18025689
Status Pending
Filing Date 2021-08-12
First Publication Date 2023-11-09
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Evans, Richard Andrew
  • Jumper, John
  • Green, Timothy Frederick Goldie
  • Reiman, David

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training neural networks to predict the structure of a protein. In one aspect, a method comprises: obtaining, for each of a plurality of proteins, a full multiple sequence alignment for the protein; generating, for each of the plurality of proteins, target structure parameters characterizing a structure of the protein from the full multiple sequence alignment for the protein, comprising processing a representation of the full multiple sequence alignment for the protein using the structure prediction neural network to generate output structure parameters characterizing a structure of the protein, and determining the target structure parameters for the protein based on the output structure parameters for the protein; determining, for each of the plurality of proteins, a reduced multiple sequence alignment for the protein, comprising removing or masking data from the full multiple sequence alignment for the protein.

IPC Classes  ?

58.

LANGUAGE MODEL FOR PROCESSING A MULTI-MODE QUERY INPUT

      
Application Number 18141337
Status Pending
Filing Date 2023-04-28
First Publication Date 2023-11-02
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Alayrac, Jean-Baptiste
  • Donahue, Jeffrey
  • Lenc, Karel
  • Simonyan, Karen
  • Reynolds, Malcolm Kevin Campbell
  • Luc, Pauline
  • Mensch, Arthur
  • Barr, Iain
  • Miech, Antoine
  • Hasson, Yana Elizabeth
  • Millican, Katherine Elizabeth
  • Ring, Roman

Abstract

A query processing system is described which receives a query input comprising an input token string and also at least one data item having a second, different modality, and generates a corresponding output token string.

IPC Classes  ?

59.

PRIVACY-SENSITIVE NEURAL NETWORK TRAINING USING DATA AUGMENTATION

      
Application Number 18141273
Status Pending
Filing Date 2023-04-28
First Publication Date 2023-11-02
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • De, Soham
  • De Balle Pigem, Borja
  • Hayes, Jamie
  • Smith, Samuel Laurence
  • Berrada Lancrey Javal, Leonard Alix Jean Eric

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for privacy-sensitive training of a neural network. In one aspect, a method includes training a set of neural network parameters of the neural network on a set of training data over multiple training iterations to optimize an objective function. Each training iteration includes: sampling a batch of network inputs from the set of training data; determining a clipped gradient for each network input in the batch of network inputs; and updating the neural network parameters using the clipped gradients for the network inputs in the batch of network inputs.
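
A minimal sketch of one such training iteration appears below, assuming a hypothetical per-example gradient function `grad_fn(params, x, y)`; any noise addition used for differential privacy is not mentioned in the abstract and is omitted here:

```python
import numpy as np

# Hedged sketch of one iteration with per-example gradient clipping; all names
# and constants are illustrative, not from the patent.
def clipped_gradient_step(params, batch, grad_fn, clip_norm=1.0, lr=0.1):
    clipped = []
    for x, y in batch:                                  # sampled batch of network inputs
        g = grad_fn(params, x, y)                       # per-example gradient
        scale = min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
        clipped.append(g * scale)                       # clipped gradient
    update = np.mean(clipped, axis=0)                   # aggregate over the batch
    return params - lr * update                         # update the parameters
```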

IPC Classes  ?

  • G06F 21/62 - Protecting access to data via a platform, e.g. using keys or access control rules

60.

CLASSIFYING INPUT EXAMPLES USING A COMPARISON SET

      
Application Number 18211085
Status Pending
Filing Date 2023-06-16
First Publication Date 2023-10-19
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Blundell, Charles
  • Vinyals, Oriol

Abstract

Methods, systems, and apparatus for classifying a new example using a comparison set of comparison examples. One method includes maintaining a comparison set, the comparison set including comparison examples and a respective label vector for each of the comparison examples, each label vector including a respective score for each label in a predetermined set of labels; receiving a new example; determining a respective attention weight for each comparison example by applying a neural network attention mechanism to the new example and to the comparison examples; and generating a respective label score for each label in the predetermined set of labels from, for each of the comparison examples, the respective attention weight for the comparison example and the respective label vector for the comparison example, in which the respective label score for each of the labels represents a likelihood that the label is a correct label for the new example.
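
To make the attention step concrete, here is a sketch that uses cosine similarity followed by a softmax as the attention mechanism (one common choice; the abstract does not fix a particular mechanism), operating on pre-computed embeddings:

```python
import numpy as np

# Hedged sketch of attention over a comparison set: similarity, softmax
# weights, then a weighted combination of the label vectors.
def classify_with_comparison_set(new_embedding, comparison_embeddings, label_vectors):
    a = new_embedding / np.linalg.norm(new_embedding)
    b = comparison_embeddings / np.linalg.norm(comparison_embeddings, axis=1, keepdims=True)
    similarities = b @ a                        # one similarity per comparison example
    weights = np.exp(similarities - similarities.max())
    weights /= weights.sum()                    # attention weight per comparison example
    return weights @ label_vectors              # label score per label in the set
```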

IPC Classes  ?

  • G06N 3/044 - Recurrent networks, e.g. Hopfield networks
  • G06N 3/08 - Learning methods
  • G06F 18/2413 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
  • G06F 18/22 - Matching criteria, e.g. proximity measures
  • G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation

61.

REINFORCEMENT AND IMITATION LEARNING FOR A TASK

      
Application Number 18306711
Status Pending
Filing Date 2023-04-25
First Publication Date 2023-10-19
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Tunyasuvunakool, Saran
  • Zhu, Yuke
  • Merel, Joshua
  • Kramár, János
  • Wang, Ziyu
  • Heess, Nicolas Manfred Otto

Abstract

A neural network control system for controlling an agent to perform a task in a real-world environment, operates based on both image data and proprioceptive data describing the configuration of the agent. The training of the control system includes both imitation learning, using datasets generated from previous performances of the task, and reinforcement learning, based on rewards calculated from control data output by the control system.

IPC Classes  ?

  • B25J 9/16 - Programme controls
  • G06N 3/08 - Learning methods
  • G06N 3/008 - Artificial life, i.e. computing arrangements simulating life based on physical entities controlled by simulated intelligence so as to replicate intelligent life forms, e.g. based on robots replicating pets or humans in their appearance or behaviour
  • G06N 3/084 - Backpropagation, e.g. using gradient descent
  • G06N 3/044 - Recurrent networks, e.g. Hopfield networks
  • G06N 3/045 - Combinations of networks

62.

CROSS-DOMAIN IMITATION LEARNING USING GOAL CONDITIONED POLICIES

      
Application Number 18028966
Status Pending
Filing Date 2021-10-01
First Publication Date 2023-10-19
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Zhou, Yuxiang
  • Aytar, Yusuf
  • Bousmalis, Konstantinos

Abstract

A system is described, implemented as computer programs on one or more computers in one or more locations, that trains a policy neural network used to control a robot, i.e., to select actions to be performed by the robot while the robot is interacting with an environment, through imitation learning, in order to cause the robot to perform particular tasks in the environment.

IPC Classes  ?

63.

RATE CONTROL MACHINE LEARNING MODELS WITH FEEDBACK CONTROL FOR VIDEO ENCODING

      
Application Number 18030182
Status Pending
Filing Date 2021-11-03
First Publication Date 2023-10-19
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Gu, Chenjie
  • Mao, Hongzi
  • Chiang, Ching-Han
  • Chen, Cheng
  • Han, Jingning
  • Pang, Ching Yin Derek
  • Claus, Rene Andre
  • Hechtman, Marisabel Guevara
  • Visentin, Daniel James
  • Fougner, Christopher Sigurd
  • Schaff, Charles Booth
  • Patil, Nishant
  • Bellido, Alejandro Ramirez

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for encoding video comprising a sequence of video frames. In one aspect, a method comprises for one or more of the video frames: obtaining a feature embedding for the video frame; processing the feature embedding using a rate control machine learning model to generate a respective score for each of multiple quantization parameter values; selecting a quantization parameter value using the scores; determining a cumulative amount of data required to represent: (i) an encoded representation of the video frame and (ii) encoded representations of each preceding video frame; determining, based on the cumulative amount of data, that a feedback control criterion for the video frame is satisfied; updating the selected quantization parameter value; and processing the video frame using an encoding model to generate the encoded representation of the video frame.
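
A rough sketch of this per-frame loop with a simple bit-budget feedback rule follows; `rate_model`, `encode_frame`, and the budget and adjustment constants are all assumptions rather than details from the patent:

```python
# Hedged sketch: `rate_model` maps a frame's feature embedding to a score per
# quantization parameter (QP) value; `encode_frame` returns a bytes object.
def encode_video(frames, features, rate_model, encode_frame,
                 qp_values=range(64), budget_bits_per_frame=200_000):
    cumulative_bits, encoded = 0, []
    for i, (frame, feat) in enumerate(zip(frames, features)):
        scores = rate_model(feat)                        # score for each QP value
        qp = max(qp_values, key=lambda q: scores[q])     # select QP from the scores
        # Feedback control: if the stream is over budget so far, coarsen quantization.
        if i > 0 and cumulative_bits > i * budget_bits_per_frame:
            qp = min(qp + 4, max(qp_values))
        bitstream = encode_frame(frame, qp)
        cumulative_bits += len(bitstream) * 8            # cumulative amount of data
        encoded.append(bitstream)
    return encoded
```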

IPC Classes  ?

  • H04N 19/149 - Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
  • H04N 19/126 - Quantisation - Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
  • H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field

64.

CONTROLLING AGENTS USING RELATIVE VARIATIONAL INTRINSIC CONTROL

      
Application Number 18025304
Status Pending
Filing Date 2021-09-10
First Publication Date 2023-10-12
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Warde-Farley, David Constantine Patrick
  • Hansen, Steven Stenberg
  • Mnih, Volodymyr
  • Baumli, Kate Alexandra

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a policy neural network for use in controlling an agent using relative variational intrinsic control. In one aspect, a method includes: selecting a skill from a set of skills; generating a trajectory by controlling the agent using the policy neural network while the policy neural network is conditioned on the selected skill; processing an initial observation and a last observation using a relative discriminator neural network to generate a relative score; processing the last observation using an absolute discriminator neural network to generate an absolute score; generating a reward for the trajectory from the absolute score corresponding to the selected skill and the relative score corresponding to the selected skill; and training the policy neural network on the reward for the trajectory.

IPC Classes  ?

65.

ALLOCATING COMPUTING RESOURCES BETWEEN MODEL SIZE AND TRAINING DATA DURING TRAINING OF A MACHINE LEARNING MODEL

      
Application Number 18127551
Status Pending
Filing Date 2023-03-28
First Publication Date 2023-10-05
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Hoffmann, Jordan
  • Borgeaud Dit Avocat, Sebastian
  • Sifre, Laurent
  • Mensch, Arthur

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a machine learning model to perform a machine learning task. In one aspect, a method performed by one or more computers is described. The method includes: obtaining data defining a compute budget that characterizes an amount of computing resources allocated for training a machine learning model to perform a machine learning task; processing the data defining the compute budget using an allocation mapping, in accordance with a set of allocation mapping parameters, to generate an allocation tuple defining: (i) a target model size for the machine learning model, and (ii) a target amount of training data for training the machine learning model; instantiating the machine learning model, where the machine learning model has the target model size; and obtaining the target amount of training data for training the machine learning model.
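
An illustrative allocation mapping of this kind might look like the sketch below, where the power-law form and the FLOPs ≈ 6 × params × tokens approximation are common modelling assumptions rather than details taken from the patent:

```python
# Hedged sketch of an allocation mapping from a compute budget (in FLOPs) to a
# (target model size, target training tokens) tuple; the exponent and
# coefficient stand in for learned allocation mapping parameters.
def allocation_mapping(compute_budget_flops, exponent=0.5, coefficient=0.1):
    target_params = coefficient * compute_budget_flops ** exponent   # target model size
    target_tokens = compute_budget_flops / (6 * target_params)       # target training data
    return target_params, target_tokens
```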

IPC Classes  ?

  • G06F 9/50 - Allocation of resources, e.g. of the central processing unit [CPU]

66.

TRAINING NEURAL NETWORKS

      
Application Number 17711951
Status Pending
Filing Date 2022-04-01
First Publication Date 2023-10-05
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Calian, Dan-Andrei
  • Gowal, Sven Adrian
  • Mann, Timothy Arthur
  • György, András

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media for processing a network input using a trained neural network with network parameters to generate an output for a machine learning task. The training includes: receiving a set of training examples each including a training network input and a reference output; for each training iteration, generating a corrupted network input for each training network input using a corruption neural network; updating perturbation parameters of the corruption neural network using a first objective function based on the corrupted network inputs; generating an updated corrupted network input for each training network input based on the updated perturbation parameters; and generating a network output for each updated corrupted network input using the neural network; for each training example, updating the network parameters using a second objective function based on the network output and the reference output.

IPC Classes  ?

  • G06V 10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
  • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
  • G06V 10/776 - Validation; Performance evaluation

67.

TRAINING VIDEO DATA GENERATION NEURAL NETWORKS USING VIDEO FRAME EMBEDDINGS

      
Application Number 18020856
Status Pending
Filing Date 2021-09-08
First Publication Date 2023-09-28
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Assael, Ioannis Alexandros
  • Shillingford, Brendan

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a video data generation neural network having a plurality of video generation network parameters. In one aspect, a method includes generating one or more sequences of training video frames using the video data generation neural network in accordance with current values of the video data generation network parameters; obtaining one or more sequences of target video frames; and training the video data generation neural network using training signals derived from a similarity between respective embeddings of the training and target video frames. The embeddings are generated by a video data embedding neural network.

IPC Classes  ?

68.

PREDICTING PROTEIN STRUCTURES BY SHARING INFORMATION BETWEEN MULTIPLE SEQUENCE ALIGNMENTS AND PAIR EMBEDDINGS

      
Application Number 18026376
Status Pending
Filing Date 2021-11-23
First Publication Date 2023-09-21
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Figurnov, Mikhail
  • Pritzel, Alexander
  • Evans, Richard Andrew
  • Bates, Russell James
  • Ronneberger, Olaf
  • Kohl, Simon
  • Jumper, John

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for predicting a structure of a protein comprising one or more chains. In one aspect, a method comprises: obtaining an initial multiple sequence alignment (MSA) representation; obtaining a respective initial pair embedding for each pair of amino acids in the protein; processing an input comprising the initial MSA representation and the initial pair embeddings using an embedding neural network to generate an output that comprises a final MSA representation and a respective final pair embedding for each pair of amino acids in the protein; and determining a predicted structure of the protein using the final MSA representation, the final pair embeddings, or both.

IPC Classes  ?

69.

TRAINING ACTION SELECTION NEURAL NETWORKS USING AUXILIARY TASKS OF CONTROLLING OBSERVATION EMBEDDINGS

      
Application Number 18016746
Status Pending
Filing Date 2021-07-27
First Publication Date 2023-09-14
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Wulfmeier, Markus
  • Hertweck, Tim
  • Riedmiller, Martin

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment to accomplish a goal. In one aspect, a method comprises: obtaining an observation characterizing a state of the environment, processing the observation using an embedding model to generate a lower-dimensional embedding of the observation, determining an auxiliary task reward based on a value of a particular dimension of the embedding, determining an overall reward based at least in part on the auxiliary task reward, and determining an update to values of multiple parameters of an action selection neural network based on the overall reward using a reinforcement learning technique.

IPC Classes  ?

  • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
  • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning

70.

AUGMENTED RECURRENT NEURAL NETWORK WITH EXTERNAL MEMORY

      
Application Number 18174394
Status Pending
Filing Date 2023-02-24
First Publication Date 2023-09-14
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Grefenstette, Edward Thomas
  • Hermann, Karl Moritz
  • Suleyman, Mustafa
  • Blunsom, Philip

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for augmenting neural networks with an external memory. One of the methods includes providing an output derived from the neural network output for the time step as a system output for the time step; maintaining a current state of the external memory; determining, from the neural network output for the time step, memory state parameters for the time step; updating the current state of the external memory using the memory state parameters for the time step; reading data from the external memory in accordance with the updated state of the external memory; and combining the data read from the external memory with a system input for the next time step to generate the neural network input for the next time step.

IPC Classes  ?

  • G06N 3/08 - Learning methods
  • G06N 3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
  • G06N 3/044 - Recurrent networks, e.g. Hopfield networks
  • G06N 3/082 - Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

71.

SEMI-SUPERVISED KEYPOINT BASED MODELS

      
Application Number 18006229
Status Pending
Filing Date 2021-07-28
First Publication Date 2023-09-07
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Vecerik, Mel
  • Scholz, Jonathan Karl
  • Regli, Jean-Bapiste

Abstract

A method for training a neural network to predict keypoints of unseen objects using a training data set including labeled and unlabeled training data is described. The method comprises: receiving the training data set comprising a plurality of training samples, each training sample comprising a set of synchronized images of one or more objects from a respective scene, wherein each image in the set is synchronously taken by a respective camera from a different point of view, and wherein a subset of the set of synchronized images is labeled with ground-truth keypoints and the remaining images in the set are unlabeled; and for each of one or more training samples of the plurality of training samples: training the neural network on the training sample by updating current values of parameters of the neural network to minimize a loss function which is a combination of a supervised loss function and an unsupervised loss function.

IPC Classes  ?

  • G06V 10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
  • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

72.

Population based training of neural networks

      
Application Number 18120715
Grant Number 11941527
Status In Force
Filing Date 2023-03-13
First Publication Date 2023-09-07
Grant Date 2024-03-26
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Jaderberg, Maxwell Elliot
  • Czarnecki, Wojciech
  • Green, Timothy Frederick Goldie
  • Dalibard, Valentin Clement

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. A method includes training a neural network having a plurality of network parameters to perform a particular neural network task and to determine trained values of the network parameters using an iterative training process having a plurality of hyperparameters. The method comprises: maintaining a plurality of candidate neural networks and, for each of the candidate neural networks, data specifying: (i) respective values of the network parameters for the candidate neural network, (ii) respective values of the hyperparameters for the candidate neural network, and (iii) a quality measure that measures a performance of the candidate neural network on the particular neural network task; and, for each of the plurality of candidate neural networks, repeatedly performing additional training operations.

IPC Classes  ?

73.

LEARNING OBSERVATION REPRESENTATIONS BY PREDICTING THE FUTURE IN LATENT SPACE

      
Application Number 18090243
Status Pending
Filing Date 2022-12-28
First Publication Date 2023-08-31
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Van Den Oord, Aaron Gerard Antonius
  • Li, Yazhe
  • Vinyals, Oriol

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an encoder neural network that is configured to process an input observation to generate a latent representation of the input observation. In one aspect, a method includes: obtaining a sequence of observations; for each observation in the sequence of observations, processing the observation using the encoder neural network to generate a latent representation of the observation; for each of one or more given observations in the sequence of observations: generating a context latent representation of the given observation; and generating, from the context latent representation of the given observation, a respective estimate of the latent representations of one or more particular observations that are after the given observation in the sequence of observations.
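
A small sketch of this encode/aggregate/predict structure follows, with stand-in architectures (an MLP encoder, a GRU aggregator, and one linear predictor per future offset); the contrastive objective that typically trains such a model is omitted and everything here is illustrative:

```python
import torch
import torch.nn as nn

# Hedged sketch: encode observations to latents, aggregate them into a context
# representation, and predict the latents of future observations.
class FuturePredictor(nn.Module):
    def __init__(self, obs_dim, latent_dim=128, context_dim=128, horizon=3):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, latent_dim), nn.ReLU(),
                                     nn.Linear(latent_dim, latent_dim))
        self.aggregator = nn.GRU(latent_dim, context_dim, batch_first=True)
        # One linear predictor per future offset k = 1..horizon.
        self.predictors = nn.ModuleList(
            [nn.Linear(context_dim, latent_dim) for _ in range(horizon)])

    def forward(self, observations):              # observations: [batch, time, obs_dim]
        latents = self.encoder(observations)       # latent representation per observation
        context, _ = self.aggregator(latents)      # context latent representation per position
        # Estimates of the latent representations k steps after each position.
        return latents, [predict(context) for predict in self.predictors]
```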

IPC Classes  ?

  • G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
  • G06F 17/16 - Matrix or vector computation
  • G06N 3/08 - Learning methods
  • G06F 18/22 - Matching criteria, e.g. proximity measures
  • G06N 3/045 - Combinations of networks
  • G06N 3/048 - Activation functions
  • G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
  • G06V 10/77 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
  • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

74.

ALIGNING ENTITIES USING NEURAL NETWORKS

      
Application Number 18016124
Status Pending
Filing Date 2021-07-16
First Publication Date 2023-08-17
Owner DeepMind Technologies Limited (United Kingdom)
Inventor Creswell, Antonia Phoebe Nina

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for aligning entities across time.

IPC Classes  ?

  • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
  • G06V 20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads

75.

OFF-LINE LEARNING FOR ROBOT CONTROL USING A REWARD PREDICTION MODEL

      
Application Number 18018421
Status Pending
Filing Date 2021-07-27
First Publication Date 2023-08-17
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Zolna, Konrad
  • Reed, Scott Ellison

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for off-line learning using a reward prediction model. One of the methods includes obtaining robot experience data; training, on a first subset of the robot experience data, a reward prediction model that receives a reward input comprising an input observation and generates as output a reward prediction that is a prediction of a task-specific reward for the particular task that should be assigned to the input observation; processing experiences in the robot experience data using the trained reward prediction model to generate a respective reward prediction for each of the processed experiences; and training a policy neural network on (i) the processed experiences and (ii) the respective reward predictions for the processed experiences.

IPC Classes  ?

76.

REINFORCEMENT LEARNING USING DISTRIBUTED PRIORITIZED REPLAY

      
Application Number 18131753
Status Pending
Filing Date 2023-04-06
First Publication Date 2023-08-10
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Budden, David
  • Barth-Maron, Gabriel
  • Quan, John
  • Horgan, Daniel George

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. One of the systems includes (i) a plurality of actor computing units, in which each of the actor computing units is configured to maintain a respective replica of the action selection neural network and to perform a plurality of actor operations, and (ii) one or more learner computing units, in which each of the one or more learner computing units is configured to perform a plurality of learner operations.

IPC Classes  ?

77.

Training more secure neural networks by using local linearity regularization

      
Application Number 18079791
Grant Number 11775830
Status In Force
Filing Date 2022-12-12
First Publication Date 2023-08-10
Grant Date 2023-10-03
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Qin, Chongli
  • Gowal, Sven Adrian
  • De, Soham
  • Stanforth, Robert
  • Martens, James
  • Dvijotham, Krishnamurthy
  • Krishnan, Dilip
  • Fawzi, Alhussein

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. One of the methods includes processing each training input using the neural network and in accordance with the current values of the network parameters to generate a network output for the training input; computing a respective loss for each of the training inputs by evaluating a loss function; identifying, from a plurality of possible perturbations, a maximally non-linear perturbation; and determining an update to the current values of the parameters of the neural network by performing an iteration of a neural network training procedure to decrease the respective losses for the training inputs and to decrease the non-linearity of the loss function for the identified maximally non-linear perturbation.

IPC Classes  ?

  • G06N 3/08 - Learning methods
  • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
  • G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
  • G06F 18/2135 - Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
  • G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
  • G06V 10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting

78.

TRAINING NEURAL NETWORKS USING A PRIORITIZED EXPERIENCE MEMORY

      
Application Number 18103416
Status Pending
Filing Date 2023-01-30
First Publication Date 2023-08-03
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Schaul, Tom
  • Quan, John
  • Silver, David

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a neural network used to select actions performed by a reinforcement learning agent interacting with an environment. In one aspect, a method includes maintaining a replay memory, where the replay memory stores pieces of experience data generated as a result of the reinforcement learning agent interacting with the environment. Each piece of experience data is associated with a respective expected learning progress measure that is a measure of an expected amount of progress made in the training of the neural network if the neural network is trained on the piece of experience data. The method further includes selecting a piece of experience data from the replay memory by prioritizing for selection pieces of experience data having relatively higher expected learning progress measures and training the neural network on the selected piece of experience data.
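
A minimal sketch of such a replay memory is shown below, using the last absolute TD error as a proxy for the expected learning progress measure (a common choice, assumed here) and sampling with probability proportional to priority:

```python
import random

# Hedged sketch of a prioritized replay memory; the absolute TD error stands
# in for the expected learning progress measure, and sampling probability is
# proportional to priority ** alpha.
class PrioritizedReplay:
    def __init__(self, capacity=100_000, alpha=0.6, eps=1e-6):
        self.capacity, self.alpha, self.eps = capacity, alpha, eps
        self.data, self.priorities = [], []

    def add(self, experience, td_error=1.0):
        if len(self.data) >= self.capacity:        # drop the oldest experience
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(experience)
        self.priorities.append((abs(td_error) + self.eps) ** self.alpha)

    def sample(self, batch_size):
        indices = random.choices(range(len(self.data)),
                                 weights=self.priorities, k=batch_size)
        return indices, [self.data[i] for i in indices]

    def update_priorities(self, indices, td_errors):
        for i, e in zip(indices, td_errors):       # refresh priorities after training
            self.priorities[i] = (abs(e) + self.eps) ** self.alpha
```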

IPC Classes  ?

79.

AUGMENTING MACHINE LEARNING LANGUAGE MODELS USING SEARCH ENGINE RESULTS

      
Application Number 18104210
Status Pending
Filing Date 2023-01-31
First Publication Date 2023-08-03
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Lazaridou, Angeliki
  • Gribovskaya, Elena
  • Grigorev, Nikolai
  • Stokowiec, Wojciech Jan

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for augmenting machine learning language models using search engine results. One of the methods includes obtaining question data representing a question; generating, from the question data, a search engine query for a search engine; obtaining a plurality of documents identified by the search engine in response to processing the search engine query; generating, from the plurality of documents, a plurality of conditioning inputs each representing at least a portion of one or more of the obtained documents; for each of a plurality of the generated conditioning inputs, processing a network input generated from (i) the question data and (ii) the conditioning input using a neural network to generate a network output representing a candidate answer to the question; and generating, from the network outputs representing respective candidate answers, answer data representing a final answer to the question.
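
A sketch of this retrieve-then-answer pipeline follows, with hypothetical `search`, `language_model`, and `score_answer` callables; the prompt formats and the final-answer rule (pick the highest-scoring candidate) are assumptions, not details from the patent:

```python
# Hedged sketch of the pipeline: generate a query, retrieve documents, condition
# the model on each document, then aggregate the candidate answers.
def answer_with_search(question, search, language_model, score_answer, k=4):
    query = language_model(f"Write a search query for: {question}")
    documents = search(query)[:k]                       # documents from the search engine
    candidates = []
    for doc in documents:
        conditioning = doc[:2000]                       # a portion of the retrieved document
        prompt = f"Evidence:\n{conditioning}\n\nQuestion: {question}\nAnswer:"
        candidates.append(language_model(prompt))       # candidate answer to the question
    # Final answer: the highest-scoring candidate under some scoring function.
    return max(candidates, key=lambda ans: score_answer(question, ans))
```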

IPC Classes  ?

80.

MULTI-AGENT REINFORCEMENT LEARNING WITH MATCHMAKING POLICIES

      
Application Number 18131567
Status Pending
Filing Date 2023-04-06
First Publication Date 2023-08-03
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Silver, David
  • Vinyals, Oriol
  • Jaderberg, Maxwell Elliot

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a policy neural network having a plurality of policy parameters and used to select actions to be performed by an agent to control the agent to perform a particular task while interacting with one or more other agents in an environment. In one aspect, the method includes: maintaining data specifying a pool of candidate action selection policies; maintaining data specifying a respective matchmaking policy; and training the policy neural network using a reinforcement learning technique to update the policy parameters. The policy parameters define policies to be used in controlling the agent to perform the particular task.

IPC Classes  ?

  • G06N 3/08 - Learning methods
  • H04L 9/40 - Network security protocols
  • G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting

81.

LEARNING FROM DELAYED OUTCOMES USING NEURAL NETWORKS

      
Application Number 18131580
Status Pending
Filing Date 2023-04-06
First Publication Date 2023-08-03
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Hu, Huiyi
  • Jiang, Ray
  • Mann, Timothy Arthur
  • Gowal, Sven Adrian
  • Lakshminarayanan, Balaji
  • György, András

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for learning from delayed outcomes using neural networks. One of the methods includes receiving an input observation; generating, from the input observation, an output label distribution over possible labels for the input observation at a final time, comprising: processing the input observation using a first neural network configured to process the input observation to generate a distribution over possible values for an intermediate indicator at a first time earlier than the final time; generating, from the distribution, an input value for the intermediate indicator; and processing the input value for the intermediate indicator using a second neural network configured to process the input value for the intermediate indicator to determine the output label distribution over possible labels for the input observation at the final time; and providing an output derived from the output label distribution.

IPC Classes  ?

82.

GENERATING SEQUENCES OF DATA ELEMENTS USING CROSS-ATTENTION OPERATIONS

      
Application Number 18102985
Status Pending
Filing Date 2023-01-30
First Publication Date 2023-08-03
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Hawthorne, Curtis Glenn-Macway
  • Jaegle, Andrew Coulter
  • Cangea, Catalina-Codruta
  • Borgeaud Dit Avocat, Sebastian
  • Nash, Charlie Thomas Curtis
  • Malinowski, Mateusz
  • Dieleman, Sander Etienne Lea
  • Vinyals, Oriol
  • Botvinick, Matthew
  • Simon, Ian Stuart
  • Sheahan, Hannah Rachel
  • Zeghidour, Neil
  • Alayrac, Jean-Baptiste
  • Carreira, Joao
  • Engel, Jesse

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a sequence of data elements that includes a respective data element at each position in a sequence of positions. In one aspect, a method includes: for each position after a first position in the sequence of positions: obtaining a current sequence of data element embeddings that includes a respective data element embedding of each data element at a position that precedes the current position, obtaining a sequence of latent embeddings, and processing: (i) the current sequence of data element embeddings, and (ii) the sequence of latent embeddings, using a neural network to generate the data element at the current position. The neural network includes a sequence of neural network blocks including: (i) a cross-attention block, (ii) one or more self-attention blocks, and (iii) an output block.

IPC Classes  ?

  • G06N 3/044 - Recurrent networks, e.g. Hopfield networks

83.

LEARNED COMPUTER CONTROL USING POINTING DEVICE AND KEYBOARD ACTIONS

      
Application Number 18103309
Status Pending
Filing Date 2023-01-30
First Publication Date 2023-08-03
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Humphreys, Peter Conway
  • Lillicrap, Timothy Paul
  • Pohlen, Tobias Markus
  • Santoro, Adam Anthony

Abstract

A computer-implemented method for controlling a particular computer to execute a task is described. The method includes receiving a control input comprising a visual input, the visual input including one or more screen frames of a computer display that represent at least a current state of the particular computer; processing the control input using a neural network to generate one or more control outputs that are used to control the particular computer to execute the task, in which the one or more control outputs include an action type output that specifies at least one of a pointing device action or a keyboard action to be performed to control the particular computer; determining one or more actions from the one or more control outputs; and executing the one or more actions to control the particular computer.

IPC Classes  ?

  • G06F 3/033 - Pointing devices displaced or positioned by the user; Accessories therefor
  • G06F 3/023 - Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
  • G06F 40/284 - Lexical analysis, e.g. tokenisation or collocates

84.

COMPUTER CODE GENERATION FROM TASK DESCRIPTIONS USING NEURAL NETWORKS

      
Application Number 18105211
Status Pending
Filing Date 2023-02-02
First Publication Date 2023-08-03
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Li, Yujia
  • Choi, David Hugo
  • Chung, Junyoung
  • Kushman, Nathaniel Arthur
  • Schrittwieser, Julian
  • Leblond, Rémi
  • Eccles, Thomas Edward
  • Keeling, James Thomas
  • Gimeno Gil, Felix Axel
  • Dal Lago, Agustín Matías
  • Hubert, Thomas Keisuke
  • Choy, Peter
  • De Masson D'Autume, Cyprien
  • Sutherland Robson, Esme
  • Vinyals, Oriol

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating computer code using neural networks. One of the methods includes receiving description data describing a computer programming task; receiving a first set of inputs for the computer programming task; generating a plurality of candidate computer programs by sampling a plurality of output sequences from a set of one or more generative neural networks; for each candidate computer program in a subset of the candidate computer programs and for each input in the first set: executing the candidate computer program on the input to generate an output; and selecting, from the candidate computer programs, one or more computer programs as synthesized computer programs for performing the computer programming task based at least in part on the outputs generated by executing the candidate computer programs in the subset on the inputs in the first set of inputs.
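
The selection step can be illustrated by a simple filter over the sampled candidates; `run_program` is a hypothetical sandboxed executor, and checking candidate outputs against expected outputs for the example inputs is an assumption beyond what the abstract states:

```python
# Hedged sketch: keep candidate programs whose outputs on the example inputs
# match the (assumed) expected outputs, up to a maximum number of selections.
def select_programs(candidates, examples, run_program, max_selected=10):
    selected = []
    for program in candidates:
        if all(run_program(program, x) == y for x, y in examples):
            selected.append(program)                  # program passes all examples
        if len(selected) == max_selected:
            break
    return selected
```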

IPC Classes  ?

  • G06F 8/30 - Creation or generation of source code

85.

Dynamic placement of computation sub-graphs

      
Application Number 18094308
Grant Number 11861474
Status In Force
Filing Date 2023-01-06
First Publication Date 2023-07-27
Grant Date 2024-01-02
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Foerster, Jakob Nicolaus
  • Sharifi, Matthew

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for assigning operations of a computational graph to a plurality of computing devices are disclosed. Data characterizing a computational graph is obtained. Context information for a computational environment in which to perform the operations of the computational graph is received. A model input is generated, which includes at least the context information and the data characterizing the computational graph. The model input is processed using a machine learning model to generate an output defining placement assignments of the operations of the computational graph to the plurality of computing devices. The operations of the computational graph are assigned to the plurality of computing devices according to the defined placement assignments.

IPC Classes  ?

86.

PERFORMANCE OF A NEURAL NETWORK USING AUTOMATICALLY UNCOVERED FAILURE CASES

      
Application Number 18160860
Status Pending
Filing Date 2023-01-27
First Publication Date 2023-07-27
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Perez, Ethan Josean
  • Huang, Saffron Shan
  • Mcaleese-Park, Nathaniel John
  • Irving, Geoffrey

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for adjusting a target neural network using automatically generated test cases before deployment of the target neural network in a deployment environment. One of the methods may include generating a plurality of test inputs by using a test case generation neural network; processing the plurality of test inputs using a target neural network to generate one or more test outputs for each test input; and identifying, from the one or more test outputs generated by the target neural network for each test input, failing test inputs that result in generation of test outputs by the target neural network that fail one or more criteria.

IPC Classes  ?

87.

TRAINING AN ACTION SELECTION SYSTEM USING RELATIVE ENTROPY Q-LEARNING

      
Application Number 18008838
Status Pending
Filing Date 2021-07-27
First Publication Date 2023-07-06
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Jeong, Rae Chan
  • Springenberg, Jost Tobias
  • Kay, Jacqueline Ok-Chan
  • Zheng, Daniel Hai Huan
  • Galashov, Alexandre
  • Heess, Nicolas Manfred Otto
  • Nori, Francesco

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection system using reinforcement learning techniques. In one aspect, a method comprises at each of multiple iterations: obtaining a batch of experience, each experience tuple comprising: a first observation, an action, a second observation, and a reward; for each experience tuple, determining a state value for the second observation, comprising: processing the first observation using a policy neural network to generate an action score for each action in a set of possible actions; sampling multiple actions from the set of possible actions in accordance with the action scores; processing the second observation using a Q neural network to generate a Q value for each sampled action; and determining the state value for the second observation; and determining an update to current values of the Q neural network parameters using the state values.
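
The sketch below shows one way the per-tuple state value could be computed: sample actions from the policy's score distribution, query a Q network for each sampled action, and aggregate. The log-mean-exp aggregation with temperature eta is an assumption consistent with a relative-entropy (KL-regularised) objective; the abstract does not fix the exact formula, and both networks here are random stand-ins.

    import numpy as np

    rng = np.random.default_rng(0)
    num_actions, num_samples, eta = 5, 16, 0.5

    def policy_scores(observation):
        logits = rng.normal(size=num_actions)          # stand-in policy network
        return np.exp(logits) / np.exp(logits).sum()   # action probabilities

    def q_values(observation, actions):
        return rng.normal(size=len(actions))           # stand-in Q network

    def state_value(observation):
        probs = policy_scores(observation)
        actions = rng.choice(num_actions, size=num_samples, p=probs)
        q = q_values(observation, actions)
        # Soft (KL-regularised) value: eta * log mean exp(Q / eta).
        return eta * np.log(np.mean(np.exp(q / eta)))

    print(state_value(observation=None))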

IPC Classes  ?

88.

REINFORCEMENT LEARNING USING A RELATIONAL NETWORK FOR GENERATING DATA ENCODING RELATIONSHIPS BETWEEN ENTITIES IN AN ENVIRONMENT

      
Application Number 18168123
Status Pending
Filing Date 2023-02-13
First Publication Date 2023-06-22
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Li, Yujia
  • Bapst, Victor Constant
  • Zambaldi, Vinicius
  • Raposo, David Nunes
  • Santoro, Adam Anthony

Abstract

A neural network system is proposed, including an input network for extracting, from state data, respective entity data for each of a plurality of entities which are present, or at least potentially present, in the environment. The entity data describes the entity. The neural network contains a relational network for parsing this data, which includes one or more attention blocks that may be stacked to perform successive operations on the entity data. The attention blocks each include a respective transform network for each of the entities. The transform network for each entity is able to transform data which the transform network receives for the entity into modified entity data for the entity, based on data for a plurality of the other entities. An output network is arranged to receive data output by the relational network, and use the received data to select a respective action.
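
A toy numpy rendering of one such attention block: each entity embedding is updated using attention over all entities, two blocks are stacked, and a simple output head scores actions from the pooled result. The shapes, the residual update, and the mean-pooling output head are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(1)
    num_entities, dim, num_actions = 6, 8, 4
    entities = rng.normal(size=(num_entities, dim))        # from the input network
    Wq, Wk, Wv = (rng.normal(size=(dim, dim)) for _ in range(3))
    Wout = rng.normal(size=(dim, num_actions))

    def attention_block(x):
        q, k, v = x @ Wq, x @ Wk, x @ Wv
        scores = q @ k.T / np.sqrt(dim)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return x + weights @ v                              # residual update per entity

    updated = attention_block(attention_block(entities))    # two stacked blocks
    action_scores = updated.mean(axis=0) @ Wout             # output network
    print(int(action_scores.argmax()))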

IPC Classes  ?

  • G06N 5/043 - Distributed expert systems; Blackboards
  • G06F 17/16 - Matrix or vector computation
  • G06N 3/04 - Architecture, e.g. interconnection topology
  • G06N 3/08 - Learning methods
  • G06N 7/01 - Probabilistic graphical models, e.g. probabilistic networks

89.

Distributional reinforcement learning using quantile function neural networks

      
Application Number 18169803
Grant Number 11887000
Status In Force
Filing Date 2023-02-15
First Publication Date 2023-06-22
Grant Date 2024-01-30
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Ostrovski, Georg
  • Dabney, William Clinton

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting an action to be performed by a reinforcement learning agent interacting with an environment. In one aspect, a method comprises: receiving a current observation; for each action of a plurality of actions: randomly sampling one or more probability values; for each probability value: processing the action, the current observation, and the probability value using a quantile function network to generate an estimated quantile value for the probability value with respect to a probability distribution over possible returns that would result from the agent performing the action in response to the current observation; determining a measure of central tendency of the one or more estimated quantile values; and selecting an action to be performed by the agent in response to the current observation using the measures of central tendency for the actions.
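
In miniature, the selection step looks like the sketch below: for each action, sample a few probability values (quantile levels), map each through a quantile function to an estimated return, average the estimates, and act greedily on the averages. The per-action uniform-distribution quantile functions are stand-ins for the quantile function network.

    import numpy as np

    rng = np.random.default_rng(2)
    # Stand-in quantile functions: uniform return distributions per action.
    action_ranges = {"left": (-1.0, 3.0), "right": (0.5, 2.5)}

    def quantile_value(action, observation, tau):
        low, high = action_ranges[action]   # stand-in for the quantile network
        return low + tau * (high - low)

    def select_action(observation, num_samples=32):
        means = {}
        for action in action_ranges:
            taus = rng.uniform(size=num_samples)
            estimates = [quantile_value(action, observation, t) for t in taus]
            means[action] = float(np.mean(estimates))
        return max(means, key=means.get)

    print(select_action(observation=None))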

IPC Classes  ?

90.

Parallel video processing systems

      
Application Number 18108873
Grant Number 11967150
Status In Force
Filing Date 2023-02-13
First Publication Date 2023-06-15
Grant Date 2024-04-23
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Osindero, Simon
  • Carreira, Joao
  • Patraucean, Viorica
  • Zisserman, Andrew

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for parallel processing of video frames using neural networks. One of the methods includes receiving a video sequence comprising a respective video frame at each of a plurality of time steps; and processing the video sequence using a video processing neural network to generate a video processing output for the video sequence, wherein the video processing neural network includes a sequence of network components, wherein the network components comprise a plurality of layer blocks each comprising one or more neural network layers, wherein each component is active for a respective subset of the plurality of time steps, and wherein each layer block is configured to, at each time step at which the layer block is active, receive an input generated at a previous time step and to process the input to generate a block output.
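
The data flow can be sketched as a pipeline in which block i at time step t consumes the output that block i-1 produced at time step t-1, so that in a real system all blocks could run concurrently. The string-transform blocks below exist only to make that schedule visible; the lag of len(blocks) - 1 steps before the first output is a consequence of the pipelining, not an extra assumption.

    def make_block(name):
        return lambda x: f"{name}({x})"

    blocks = [make_block(f"b{i}") for i in range(3)]
    frames = [f"frame{t}" for t in range(6)]

    # pending[i] holds the value block i produced at the current time step.
    pending = [None] * len(blocks)
    outputs = []
    for t, frame in enumerate(frames):
        previous = list(pending)                 # snapshot of last step's outputs
        pending[0] = blocks[0](frame)            # block 0 sees the new frame
        for i in range(1, len(blocks)):
            pending[i] = blocks[i](previous[i - 1]) if previous[i - 1] else None
        if pending[-1] is not None:
            outputs.append((t, pending[-1]))     # lags by len(blocks) - 1 steps

    print(outputs[0])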

IPC Classes  ?

  • G06V 20/40 - Scenes; Scene-specific elements in video content
  • G06N 3/044 - Recurrent networks, e.g. Hopfield networks
  • G06N 3/045 - Combinations of networks
  • G06N 3/049 - Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
  • G06T 1/20 - Processor architectures; Processor configuration, e.g. pipelining

91.

SCENE UNDERSTANDING AND GENERATION USING NEURAL NETWORKS

      
Application Number 18164021
Status Pending
Filing Date 2023-02-03
First Publication Date 2023-06-08
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Rezende, Danilo Jimenez
  • Eslami, Seyed Mohammadali
  • Gregor, Karol
  • Besse, Frederic Olivier

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for image rendering. In one aspect, a method comprises receiving a plurality of observations characterizing a particular scene, each observation comprising an image of the particular scene and data identifying a location of a camera that captured the image. In another aspect, the method comprises receiving a plurality of observations characterizing a particular video, each observation comprising a video frame from the particular video and data identifying a time stamp of the video frame in the particular video. In yet another aspect, the method comprises receiving a plurality of observations characterizing a particular image, each observation comprising a crop of the particular image and data characterizing the crop of the particular image. The method processes each of the plurality of observations using an observation neural network to determine a numeric representation as output.
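
A minimal sketch of the aggregation step for the scene case: a stand-in observation network maps each (image, camera location) pair to a numeric representation, and the per-observation representations are combined into a single scene representation. The linear observation network and the permutation-invariant sum are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(3)
    dim = 16
    W_img = rng.normal(size=(64, dim))
    W_loc = rng.normal(size=(3, dim))

    def observation_network(image, camera_location):
        # Stand-in for the observation neural network.
        return image @ W_img + camera_location @ W_loc

    observations = [(rng.normal(size=64), rng.normal(size=3)) for _ in range(5)]
    scene_representation = sum(observation_network(img, loc) for img, loc in observations)
    print(scene_representation.shape)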

IPC Classes  ?

  • G06N 3/084 - Backpropagation, e.g. using gradient descent
  • G06T 7/90 - Determination of colour characteristics
  • G06T 7/70 - Determining position or orientation of objects or cameras
  • G06T 11/00 - 2D [Two Dimensional] image generation
  • G06V 30/262 - Techniques for post-processing, e.g. correcting the recognition result using context analysis, e.g. lexical, syntactic or semantic context
  • G06V 20/00 - Scenes; Scene-specific elements
  • G06V 20/40 - Scenes; Scene-specific elements in video content
  • G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
  • G06N 3/044 - Recurrent networks, e.g. Hopfield networks
  • G06N 3/045 - Combinations of networks
  • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

92.

TRAINING CONDITIONAL COMPUTATION NEURAL NETWORKS USING REINFORCEMENT LEARNING

      
Application Number 18076978
Status Pending
Filing Date 2022-12-07
First Publication Date 2023-06-08
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Clark, Aidan
  • Mensch, Arthur

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a neural network having one or more conditional computation layers, where each conditional computation layer includes a gating sub-layer having multiple gating parameters and an expert sub-layer having multiple expert neural networks. In one aspect, a method comprises: sampling a batch of target output sequences that comprises a respective ground truth output token at each of multiple output positions; for each target output sequence, processing the target output sequence using the neural network to generate a network output that includes respective score distributions over the vocabulary of output tokens for the output positions in the target output sequence; and training each gating sub-layer using respective rewards for the gating sub-layer for the output positions through reinforcement learning to optimize a reinforcement learning objective function that measures an expected reward received by the gating sub-layer.
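
One step of such a reinforcement-learning update for a single gating sub-layer could look like the sketch below: the gate samples an expert for a token, observes a reward, and the gating parameters are nudged along grad log p(chosen expert) * reward (a REINFORCE-style estimator). The reward values, the single-token setup, and the plain score-function gradient are all illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(4)
    dim, num_experts, lr = 8, 4, 0.1
    gate_weights = rng.normal(size=(dim, num_experts)) * 0.1

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    def reinforce_step(token_embedding, reward_per_expert):
        probs = softmax(token_embedding @ gate_weights)
        expert = rng.choice(num_experts, p=probs)
        reward = reward_per_expert[expert]
        # grad of log p(expert) wrt the logits is (one_hot - probs); chain rule
        # through the linear gate gives an outer product with the token embedding.
        grad_logits = (np.eye(num_experts)[expert] - probs) * reward
        return lr * np.outer(token_embedding, grad_logits), expert, reward

    update, expert, reward = reinforce_step(rng.normal(size=dim), [0.0, 1.0, 0.2, 0.5])
    gate_weights += update
    print("chose expert", int(expert), "reward", reward)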

IPC Classes  ?

  • G06N 3/04 - Architecture, e.g. interconnection topology

93.

LARGE SCALE RETRIEVAL FOR SEQUENCE GENERATION

      
Application Number 18076984
Status Pending
Filing Date 2022-12-07
First Publication Date 2023-06-08
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Borgeaud Dit Avocat, Sebastian
  • Sifre, Laurent
  • Mensch, Arthur
  • Hoffmann, Jordan

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a final output sequence. In one aspect, a method comprises: receiving a current output sequence comprising one or more current output segments; receiving a set of reference segments and a respective reference segment embedding of each reference segment that has been generated using an embedding neural network; for each current output segment: processing the current output segment using the embedding neural network to generate a current output segment embedding of the current output segment; and selecting k most similar reference segments to the current output segment using the reference segment embeddings and the current output segment embedding; and processing the current output sequence and the k most similar reference segments for each current output segment to generate an additional output segment that follows the current output sequence in the final output sequence.
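
The retrieval step amounts to a nearest-neighbour lookup in embedding space, as in the sketch below: embed the current output segment and return the k reference segments whose precomputed embeddings score highest under dot-product similarity. The hash-seeded embed function is a stand-in for the shared embedding neural network.

    import numpy as np

    rng = np.random.default_rng(5)
    dim, num_refs, k = 32, 1000, 4
    reference_segments = [f"reference text {i}" for i in range(num_refs)]
    reference_embeddings = rng.normal(size=(num_refs, dim))

    def embed(segment_text):
        # Stand-in for the embedding neural network (deterministic per text).
        local = np.random.default_rng(abs(hash(segment_text)) % (2 ** 32))
        return local.normal(size=dim)

    def k_nearest(segment_text):
        query = embed(segment_text)
        scores = reference_embeddings @ query
        top = np.argsort(-scores)[:k]
        return [reference_segments[i] for i in top]

    print(k_nearest("the current output segment"))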

IPC Classes  ?

94.

CONTROLLING INTERACTIVE AGENTS USING MULTI-MODAL INPUTS

      
Application Number 18077194
Status Pending
Filing Date 2022-12-07
First Publication Date 2023-06-08
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Abramson, Joshua Simon
  • Ahuja, Arun
  • Carnevale, Federico Javier
  • Georgiev, Petko Ivanov
  • Hung, Chia-Chun
  • Lillicrap, Timothy Paul
  • Muldal, Alistair Michael
  • Santoro, Adam Anthony
  • Von Glehn, Tamara Louise
  • Landon, Jessica Paige
  • Wayne, Gregory Duncan
  • Yan, Chen
  • Zhu, Rui

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for controlling agents. In particular, an interactive agent can be controlled based on multi-modal inputs that include both an observation image and a natural language text sequence.

IPC Classes  ?

  • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
  • G10L 15/16 - Speech classification or search using artificial neural networks
  • G10L 13/02 - Methods for producing synthetic speech; Speech synthesisers
  • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
  • G06V 20/50 - Context or environment of the image
  • G06F 40/284 - Lexical analysis, e.g. tokenisation or collocates
  • G06F 40/40 - Processing or translation of natural language
  • G06V 10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
  • G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice

95.

DISTRIBUTED TOP K COMPUTATION

      
Application Number 17990578
Status Pending
Filing Date 2022-11-18
First Publication Date 2023-05-18
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Elsen, Erich Konrad
  • Abercrombie, Stuart Christopher Benedict

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for distributing a top k computation across multiple computing units of an integrated circuit. One of the methods includes computing, by each of the plurality of computing units and for each candidate vector in a respective subset of the candidate vectors assigned to the computing unit, a respective distance between the query vector and the candidate vector; initializing, by the integrated circuit, a cut-off distance value; determining, by the integrated circuit, a final cut-off distance value; and providing, by the integrated circuit and as an output of a top k computation for the query vector and the set of candidate vectors, the candidate vectors that have respective distances that satisfy the final cut-off distance value.
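
One straightforward way to realise the cut-off scheme is sketched below: each "computing unit" scores its own shard of candidates against the query, a final cut-off is derived from the per-shard local top-k results, and the answer is every candidate whose distance satisfies that cut-off. The Euclidean distance and the local-top-k merge are assumptions for illustration.

    import numpy as np

    rng = np.random.default_rng(6)
    dim, num_candidates, num_units, k = 16, 200, 4, 5
    candidates = rng.normal(size=(num_candidates, dim))
    query = rng.normal(size=dim)
    shards = np.array_split(np.arange(num_candidates), num_units)

    def unit_distances(shard_indices):
        diffs = candidates[shard_indices] - query
        return np.sqrt((diffs ** 2).sum(axis=1))

    per_shard = [unit_distances(s) for s in shards]                # run in parallel
    cutoff = min(d.min() for d in per_shard)                       # initial cut-off
    merged = np.sort(np.concatenate([np.sort(d)[:k] for d in per_shard]))
    cutoff = merged[k - 1]                                         # final cut-off
    all_distances = np.concatenate(per_shard)
    top_indices = np.concatenate(shards)[all_distances <= cutoff]
    print(len(top_indices), "candidates within the final cut-off")

The appeal of a cut-off value is that each unit only needs to report candidates that can still matter, rather than shipping every distance to a coordinator.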

IPC Classes  ?

  • G06F 11/36 - Preventing errors by testing or debugging of software

96.

Distributed training using actor-critic reinforcement learning with off-policy correction factors

      
Application Number 18149771
Grant Number 11868894
Status In Force
Filing Date 2023-01-04
First Publication Date 2023-05-18
Grant Date 2024-01-09
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Soyer, Hubert Josef
  • Espeholt, Lasse
  • Simonyan, Karen
  • Doron, Yotam
  • Firoiu, Vlad
  • Mnih, Volodymyr
  • Kavukcuoglu, Koray
  • Munos, Remi
  • Ward, Thomas
  • Harley, Timothy James Alexander
  • Dunning, Iain Robert

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a plurality of actor computing units and a plurality of learner computing units. The actor computing units generate experience tuple trajectories that are used by the learner computing units to update learner action selection neural network parameters using a reinforcement learning technique. The reinforcement learning technique may be an off-policy actor critic reinforcement learning technique.
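
The actor/learner split can be sketched structurally as below: actor units roll out trajectories with a possibly stale snapshot of the parameters and push experience onto a shared queue, while the learner consumes batches and updates the parameters. The environment, policy, and update rule are trivial stand-ins, and the off-policy correction itself is not shown.

    import random
    from collections import deque

    params = {"version": 0, "bias": 0.0}
    queue = deque()

    def actor_step(actor_params, trajectory_length=5):
        # Roll out a short trajectory under the actor's parameter snapshot.
        return [{"reward": random.random() + actor_params["bias"],
                 "behaviour_version": actor_params["version"]}
                for _ in range(trajectory_length)]

    def learner_update(batch, trajectory_length=5):
        mean_reward = sum(step["reward"] for t in batch
                          for step in t) / (len(batch) * trajectory_length)
        params["bias"] += 0.01 * mean_reward     # stand-in gradient step
        params["version"] += 1

    for iteration in range(20):
        snapshot = dict(params)                  # actors may lag the learner
        for _ in range(4):                       # four actor units
            queue.append(actor_step(snapshot))
        if len(queue) >= 8:
            learner_update([queue.popleft() for _ in range(8)])

    print(params)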

IPC Classes  ?

97.

LEARNING OPTIONS FOR ACTION SELECTION WITH META-GRADIENTS IN MULTI-TASK REINFORCEMENT LEARNING

      
Application Number 17918365
Status Pending
Filing Date 2021-06-07
First Publication Date 2023-05-11
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Jeya Veeraiah, Vivek Veeriah
  • Zahavy, Tom Ben Zion
  • Hessel, Matteo
  • Xu, Zhongwen
  • Oh, Junhyuk
  • Kemaev, Iurii
  • Van Hasselt, Hado Philip
  • Silver, David
  • Baveja, Satinder Singh

Abstract

A reinforcement learning system, method, and computer program code for controlling an agent to perform a plurality of tasks while interacting with an environment. The system learns options, where an option comprises a sequence of primitive actions performed by the agent under control of an option policy neural network. In implementations the system discovers options which are useful for multiple different tasks by meta-learning rewards for training the option policy neural network whilst the agent is interacting with the environment.

IPC Classes  ?

98.

RATING TASKS AND POLICIES USING CONDITIONAL PROBABILITY DISTRIBUTIONS DERIVED FROM EQUILIBRIUM-BASED SOLUTIONS OF GAMES

      
Application Number 17963113
Status Pending
Filing Date 2022-10-10
First Publication Date 2023-05-11
Owner DeepMind Technologies Limited (United Kingdom)
Inventor Marris, Luke Christopher

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for rating tasks and policies using conditional probability distributions derived from equilibrium-based solutions of games. One of the methods includes: determining, for each action selection policy in a pool of action selection policies, a respective performance measure of the action selection policy on each task in a pool of tasks, processing the performance measures of the action selection policies on the tasks to generate data defining a joint probability distribution over a set of action selection policy — task pairs, and processing the joint probability distribution over the set of action selection policy — task pairs to generate a respective rating for each action selection policy in the pool of action selection policies, where the respective rating for each action selection policy characterizes a utility of the action selection policy in performing tasks from the pool of tasks.
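
The rating step can be illustrated as below: given a performance matrix over (policy, task) pairs, form a joint distribution over the pairs and rate each policy by its expected performance under the task distribution conditioned on that policy. Using a softmax of the performance matrix as the joint distribution is only a placeholder for the equilibrium-based solution named in the title.

    import numpy as np

    performance = np.array([[0.9, 0.2, 0.5],      # rows: policies, cols: tasks
                            [0.4, 0.8, 0.6],
                            [0.3, 0.3, 0.9]])

    joint = np.exp(performance)
    joint /= joint.sum()                                      # joint P(policy, task)

    conditional = joint / joint.sum(axis=1, keepdims=True)    # P(task | policy)
    ratings = (conditional * performance).sum(axis=1)         # expected performance
    for i, rating in enumerate(ratings):
        print(f"policy {i}: rating {rating:.3f}")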

IPC Classes  ?

  • A63F 13/798 - Game security or game management aspects involving player-related data, e.g. identities, accounts, preferences or play histories for assessing skills or for ranking players, e.g. for generating a hall of fame
  • G06F 17/18 - Complex mathematical operations for evaluating statistical data
  • A63F 13/822 - Strategy games; Role-playing games 

99.

GENERATING NEURAL NETWORK OUTPUTS BY ENRICHING LATENT EMBEDDINGS USING SELF-ATTENTION AND CROSS-ATTENTION OPERATIONS

      
Application Number 18095925
Status Pending
Filing Date 2023-01-11
First Publication Date 2023-05-11
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Jaegle, Andrew Coulter
  • Carreira, Joao

Abstract

This specification describes a method for using a neural network to generate a network output that characterizes an entity. The method includes: obtaining a representation of the entity as a set of data element embeddings, obtaining a set of latent embeddings, and processing: (i) the set of data element embeddings, and (ii) the set of latent embeddings, using the neural network to generate the network output characterizing the entity. The neural network includes: (i) one or more cross-attention blocks, (ii) one or more self-attention blocks, and (iii) an output block. Each cross-attention block updates each latent embedding using attention over some or all of the data element embeddings. Each self-attention block updates each latent embedding using attention over the set of latent embeddings. The output block processes one or more latent embeddings to generate the network output that characterizes the entity.
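
A toy numpy version of the latent update is given below: a cross-attention block lets each latent attend over the data element embeddings, a self-attention block lets the latents attend over each other, and the output block averages the latents. The shapes, the single block of each kind, and the mean-pool output are assumptions for illustration.

    import numpy as np

    rng = np.random.default_rng(7)
    num_data, num_latents, dim = 100, 8, 16
    data = rng.normal(size=(num_data, dim))        # data element embeddings
    latents = rng.normal(size=(num_latents, dim))  # latent embeddings

    def attend(queries, keys_values):
        scores = queries @ keys_values.T / np.sqrt(dim)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return queries + weights @ keys_values     # residual update of the queries

    latents = attend(latents, data)                # cross-attention block
    latents = attend(latents, latents)             # self-attention block
    network_output = latents.mean(axis=0)          # output block
    print(network_output.shape)

Because the number of latents is small and fixed, the cross-attention cost grows only linearly with the number of data element embeddings, which is the main point of routing the input through latents rather than attending over the data directly.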

IPC Classes  ?

100.

DEEP NEURAL NETWORK SYSTEM FOR SIMILARITY-BASED GRAPH REPRESENTATIONS

      
Application Number 18087704
Status Pending
Filing Date 2022-12-22
First Publication Date 2023-05-04
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Li, Yujia
  • Gu, Chenjie
  • Dullien, Thomas
  • Vinyals, Oriol
  • Kohli, Pushmeet

Abstract

There is described a neural network system implemented by one or more computers for determining graph similarity. The neural network system comprises one or more neural networks configured to process an input graph to generate a node state representation vector for each node of the input graph and an edge representation vector for each edge of the input graph; and process the node state representation vectors and the edge representation vectors to generate a vector representation of the input graph. The neural network system further comprises one or more processors configured to: receive a first graph; receive a second graph; generate a vector representation of the first graph; generate a vector representation of the second graph; determine a similarity score for the first graph and the second graph based upon the vector representations of the first graph and the second graph.
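
A toy sketch of the pipeline: a few rounds of neighbour message passing stand in for the graph neural network, node vectors are pooled into a graph-level vector, and cosine similarity of the two graph vectors is the similarity score. The propagation rule, sum pooling, and the omission of explicit edge representation vectors are illustrative simplifications.

    import numpy as np

    rng = np.random.default_rng(8)
    dim = 8

    def embed_graph(num_nodes, edges, rounds=3):
        nodes = rng.normal(size=(num_nodes, dim))
        for _ in range(rounds):
            messages = np.zeros_like(nodes)
            for u, v in edges:
                messages[u] += nodes[v]
                messages[v] += nodes[u]
            nodes = np.tanh(nodes + messages)      # node state update
        return nodes.sum(axis=0)                   # graph-level vector

    def similarity(vec_a, vec_b):
        return float(vec_a @ vec_b / (np.linalg.norm(vec_a) * np.linalg.norm(vec_b)))

    g1 = embed_graph(4, [(0, 1), (1, 2), (2, 3)])
    g2 = embed_graph(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
    print(f"similarity score: {similarity(g1, g2):.3f}")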

IPC Classes  ?

  • G06F 21/56 - Computer malware detection or handling, e.g. anti-virus arrangements
  • G06F 21/57 - Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
  • G06N 3/04 - Architecture, e.g. interconnection topology
  • G06F 17/16 - Matrix or vector computation
  • G06F 16/901 - Indexing; Data structures therefor; Storage structures
  • G06F 18/22 - Matching criteria, e.g. proximity measures
  • G06V 30/196 - Recognition using electronic means using sequential comparisons of the image signals with a plurality of references
  • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
  • G06V 10/426 - Graphical representations