DeepMind Technologies Limited

United Kingdom


1-100 of 788 for DeepMind Technologies Limited
Aggregations

IP Type
  • Patent 732
  • Trademark 56

Jurisdiction
  • United States 432
  • World 292
  • Canada 38
  • Europe 26

Date
  • New (last 4 weeks) 37
  • 2024 April (MTD) 27
  • 2024 March 20
  • 2024 February 13
  • 2024 January 7

IPC Class
  • G06N 3/08 - Learning methods 434
  • G06N 3/04 - Architecture, e.g. interconnection topology 386
  • G06N 3/00 - Computing arrangements based on biological models 98
  • G06N 3/045 - Combinations of networks 83
  • G06K 9/62 - Methods or arrangements for recognition using electronic means 61

NICE Class
  • 42 - Scientific, technological and industrial services, research and design 54
  • 41 - Education, entertainment, sporting and cultural services 49
  • 09 - Scientific and electric apparatus and instruments 47
  • 16 - Paper, cardboard and goods made from these materials 34
  • 28 - Games; toys; sports equipment 34

Status
  • Pending 215
  • Registered / In Force 573

1.

LEVERAGING OFFLINE TRAINING DATA AND AGENT COMPETENCY MEASURES TO IMPROVE ONLINE LEARNING

      
Application Number 18492415
Status Pending
Filing Date 2023-10-22
First Publication Date 2024-04-25
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Wen, Zheng
  • Van Roy, Benjamin
  • Jain, Rahul Anant
  • Hao, Botao

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a target action selection policy to control a target agent interacting with an environment. In one aspect, a method comprises: obtaining a set of offline training data, wherein the offline training data characterizes interaction of a baseline agent with an environment as the baseline agent performs actions selected in accordance with a baseline action selection policy; generating a set of online training data that characterizes interaction of the target agent with the environment as the target agent performs actions selected in accordance with the target action selection policy; and training the target action selection policy on both: (i) the offline training data, and (ii) the online training data, wherein the training of the target action selection policy on the offline training data is conditioned on a measure of competency of the baseline agent.
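The conditioning step in this abstract can be sketched as a competency-weighted loss. Everything below (the `competency_weight` normalisation and the additive loss combination) is a hypothetical illustration of the idea, not the claimed method:

```python
def competency_weight(baseline_returns, max_return):
    """Hypothetical competency measure: mean baseline return,
    normalised to [0, 1] against the best achievable return."""
    mean_return = sum(baseline_returns) / len(baseline_returns)
    return max(0.0, min(1.0, mean_return / max_return))

def combined_loss(offline_loss, online_loss, competency):
    """Condition the offline contribution on baseline-agent competency:
    a weak baseline agent contributes little to the update."""
    return competency * offline_loss + online_loss
```

A baseline agent with competency 0 contributes nothing from the offline data; a fully competent baseline contributes its full offline loss.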


2.

SELECTIVE ACQUISITION FOR MULTI-MODAL TEMPORAL DATA

      
Application Number EP2023079389
Publication Number 2024/084097
Status In Force
Filing Date 2023-10-21
Publication Date 2024-04-25
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Kossen, Jannik Lukas
  • Belgrave, Danielle Charlotte Mary
  • Tomasev, Nenad
  • Cangea, Catalina-Codruta
  • Ktena, Sofia Ira
  • Vértes, Eszter
  • Patraucean, Viorica
  • Jaegle, Andrew Coulter

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a prediction characterizing an environment. In one aspect, a method includes obtaining a respective observation characterizing a state of an environment for each time step in a sequence of multiple time steps, comprising, for each time step after a first time step in the sequence of time steps: processing a network input that comprises observations obtained for one or more preceding time steps to generate a plurality of acquisition decisions; obtaining an observation for the time step, wherein the observation includes data corresponding to modalities that are selected for acquisition at the time step, and does not include data corresponding to modalities that are not selected for acquisition at the time step; and processing a model input that includes the observation for each time step in the sequence of time steps to generate the prediction.


3.

GENERATING AUDIO USING NEURAL NETWORKS

      
Application Number 18519986
Status Pending
Filing Date 2023-11-27
First Publication Date 2024-04-25
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Van Den Oord, Aaron Gerard Antonius
  • Dieleman, Sander Etienne Lea
  • Kalchbrenner, Nal Emmerich
  • Simonyan, Karen
  • Vinyals, Oriol

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an output sequence of audio data that comprises a respective audio sample at each of a plurality of time steps. One of the methods includes, for each of the time steps: providing a current sequence of audio data as input to a convolutional subnetwork, wherein the current sequence comprises the respective audio sample at each time step that precedes the time step in the output sequence, and wherein the convolutional subnetwork is configured to process the current sequence of audio data to generate an alternative representation for the time step; and providing the alternative representation for the time step as input to an output layer, wherein the output layer is configured to: process the alternative representation to generate an output that defines a score distribution over a plurality of possible audio samples for the time step.
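The per-step generation described here is standard autoregressive sampling. The following toy sketch substitutes trivial stand-ins for the convolutional subnetwork and the output layer; only the loop structure mirrors the abstract:

```python
import math
import random

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def toy_subnetwork(current):
    # Stand-in for the convolutional subnetwork: summarise the causal
    # context as a single feature (mean of the most recent samples).
    window = current[-4:] if current else [0.0]
    return sum(window) / len(window)

def toy_output_layer(rep, n_values=8):
    # Stand-in output layer: score each candidate sample value by
    # closeness to the context feature, normalised to a distribution.
    scores = [-abs(v / (n_values - 1) - rep) for v in range(n_values)]
    return softmax(scores)

def generate(n_steps, n_values=8, seed=0):
    rng = random.Random(seed)
    out = []
    for _ in range(n_steps):
        rep = toy_subnetwork(out)                 # alternative representation
        dist = toy_output_layer(rep, n_values)    # score distribution
        sample = rng.choices(range(n_values), weights=dist)[0]
        out.append(sample / (n_values - 1))       # store normalised sample
    return out
```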

IPC Classes

  • G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
  • G06N 3/04 - Architecture, e.g. interconnection topology
  • G06N 3/045 - Combinations of networks
  • G06N 3/048 - Activation functions
  • G10L 13/06 - Elementary speech units used in speech synthesisers; Concatenation rules

4.

DISTRIBUTIONAL REINFORCEMENT LEARNING USING QUANTILE FUNCTION NEURAL NETWORKS

      
Application Number 18542476
Status Pending
Filing Date 2023-12-15
First Publication Date 2024-04-25
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Ostrovski, Georg
  • Dabney, William Clinton

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting an action to be performed by a reinforcement learning agent interacting with an environment. In one aspect, a method comprises: receiving a current observation; for each action of a plurality of actions: randomly sampling one or more probability values; for each probability value: processing the action, the current observation, and the probability value using a quantile function network to generate an estimated quantile value for the probability value with respect to a probability distribution over possible returns that would result from the agent performing the action in response to the current observation; determining a measure of central tendency of the one or more estimated quantile values; and selecting an action to be performed by the agent in response to the current observation using the measures of central tendency for the actions.
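The selection procedure can be illustrated with a stand-in quantile function (in the patent this is a trained quantile function network); the mean over sampled quantile values serves as the measure of central tendency:

```python
import random

def select_action(quantile_fn, observation, actions, n_samples=8, seed=0):
    """Pick the action whose return distribution has the highest mean
    over randomly sampled quantile levels (probability values)."""
    rng = random.Random(seed)
    best_action, best_value = None, float("-inf")
    for action in actions:
        taus = [rng.random() for _ in range(n_samples)]
        quantiles = [quantile_fn(action, observation, tau) for tau in taus]
        value = sum(quantiles) / len(quantiles)  # mean of quantile values
        if value > best_value:
            best_action, best_value = action, value
    return best_action
```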


5.

FAST EXPLORATION AND LEARNING OF LATENT GRAPH MODELS

      
Application Number 18373870
Status Pending
Filing Date 2023-09-27
First Publication Date 2024-04-18
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Swaminathan, Sivaramakrishnan
  • Dave, Meet Kirankumar
  • Lazaro-Gredilla, Miguel
  • George, Dileep

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a graph model representing an environment being interacted with by an agent. In one aspect, one of the methods include: obtaining experience data; using the experience data to update a visitation count for each of one or more state-action pairs represented by the graph model; and at each of multiple environment exploration steps: computing a utility measure for each of the one or more state-action pairs represented by the graph model; determining, based on the utility measures, a sequence of one or more planned actions that have an information gain that satisfies a threshold; and controlling the agent to perform the sequence of one or more planned actions to cause the environment to transition from a state characterized by a last observation received after a last action in the experience data into a different state.
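A minimal sketch of the utility and planning steps, assuming a hypothetical inverse-square-root visitation-count bonus as the utility measure (the abstract does not specify this form):

```python
import math

def utility(visit_counts):
    """Hypothetical utility measure: less-visited state-action pairs
    promise more information gain (inverse-sqrt count bonus)."""
    return {sa: 1.0 / math.sqrt(1 + n) for sa, n in visit_counts.items()}

def plan_exploratory_actions(visit_counts, gain_threshold):
    """Stand-in for the planning step: keep the state-action pairs whose
    utility exceeds the information-gain threshold, most useful first."""
    u = utility(visit_counts)
    return [sa for sa, g in sorted(u.items(), key=lambda kv: -kv[1])
            if g > gain_threshold]
```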

IPC Classes

  • G06F 16/901 - Indexing; Data structures therefor; Storage structures
  • G06F 17/12 - Simultaneous equations

6.

GENERATING A MODEL OF A TARGET ENVIRONMENT BASED ON INTERACTIONS OF AN AGENT WITH SOURCE ENVIRONMENTS

      
Application Number 18379988
Status Pending
Filing Date 2023-10-13
First Publication Date 2024-04-18
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Bellot, Alexis
  • Malek, Alan John
  • Chiappa, Silvia

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions for an agent in a target environment. In particular, the actions are selected using an environment model for the target environment that is parameterized using interactions of the agent with the target environment and one or more source environments.

IPC Classes

  • G06F 30/20 - Design optimisation, verification or simulation
  • G06F 16/901 - Indexing; Data structures therefor; Storage structures

7.

OPTIMIZING ALGORITHMS FOR HARDWARE DEVICES

      
Application Number 17959210
Status Pending
Filing Date 2022-10-03
First Publication Date 2024-04-18
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Hubert, Thomas Keisuke
  • Huang, Shih-Chieh
  • Novikov, Alexander
  • Fawzi, Alhussein
  • Romera-Paredes, Bernardino
  • Silver, David
  • Hassabis, Demis
  • Swirszcz, Grzegorz Michal
  • Schrittwieser, Julian
  • Kohli, Pushmeet
  • Barekatain, Mohammadamin
  • Balog, Matej
  • Rodriguez Ruiz, Francisco Jesus

Abstract

A method performed by one or more computers for obtaining an optimized algorithm that (i) is functionally equivalent to a target algorithm and (ii) optimizes one or more target properties when executed on a target set of one or more hardware devices. The method includes: initializing a target tensor representing the target algorithm; generating, using a neural network having a plurality of network parameters, a tensor decomposition of the target tensor that parametrizes a candidate algorithm; generating target property values for each of the target properties when executing the candidate algorithm on the target set of hardware devices; determining a benchmarking score for the tensor decomposition based on the target property values of the candidate algorithm; generating a training example from the tensor decomposition and the benchmarking score; and storing, in a training data store, the training example for use in updating the network parameters of the neural network.

IPC Classes

  • G06N 3/08 - Learning methods
  • G06N 3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

8.

NEURAL NETWORKS WITH ADAPTIVE GRADIENT CLIPPING

      
Application Number 18275087
Status Pending
Filing Date 2022-02-02
First Publication Date 2024-04-18
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Brock, Andrew
  • De, Soham
  • Smith, Samuel Laurence
  • Simonyan, Karen

Abstract

There is disclosed a computer-implemented method for training a neural network. The method comprises determining a gradient associated with a parameter of the neural network. The method further comprises determining a ratio of a gradient norm to parameter norm and comparing the ratio to a threshold. In response to determining that the ratio exceeds the threshold, the value of the gradient is reduced such that the ratio is equal to or below the threshold. The value of the parameter is updated based upon the reduced gradient value.
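The clipping rule in this abstract is concrete enough to state directly. This sketch treats the gradient and parameter as flat vectors of floats:

```python
import math

def adaptive_clip(gradient, parameter, threshold):
    """If the ratio of gradient norm to parameter norm exceeds the
    threshold, rescale the gradient so the ratio equals the threshold;
    otherwise return the gradient unchanged."""
    g_norm = math.sqrt(sum(g * g for g in gradient))
    p_norm = math.sqrt(sum(p * p for p in parameter))
    if p_norm == 0.0 or g_norm == 0.0:
        return list(gradient)
    ratio = g_norm / p_norm
    if ratio > threshold:
        scale = threshold / ratio
        return [g * scale for g in gradient]
    return list(gradient)
```

After clipping, the parameter update proceeds with the (possibly reduced) gradient as usual.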

IPC Classes

  • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
  • G06V 10/776 - Validation; Performance evaluation

9.

META-LEARNED EVOLUTIONARY STRATEGIES OPTIMIZER

      
Application Number 18475859
Status Pending
Filing Date 2023-09-27
First Publication Date 2024-04-18
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Lange, Robert Tjarko
  • Schaul, Tom
  • Chen, Yutian
  • Zahavy, Tom Ben Zion
  • Dalibard, Valentin Clement
  • Lu, Christopher Yenchuan
  • Baveja, Satinder Singh
  • Flennerhag, Johan Sebastian

Abstract

There is provided a computer-implemented method for updating a search distribution of an evolutionary strategies optimizer using an optimizer neural network comprising one or more attention blocks. The method comprises receiving a plurality of candidate solutions, one or more parameters defining the search distribution that the plurality of candidate solutions are sampled from, and fitness score data indicating a fitness of each respective candidate solution of the plurality of candidate solutions. The method further comprises processing, by the one or more attention neural network blocks, the fitness score data using an attention mechanism to generate respective recombination weights corresponding to each respective candidate solution. The method further comprises updating the one or more parameters defining the search distribution based upon the recombination weights applied to the plurality of candidate solutions.
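A minimal sketch of the recombination update, with a plain softmax over fitness scores standing in for the attention-generated recombination weights (the patent uses attention neural network blocks for this):

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def es_update(mean, candidates, fitness, lr=1.0):
    """Move the search-distribution mean toward the recombination-weighted
    average of the candidate solutions."""
    w = softmax(fitness)  # stand-in for attention-derived weights
    dims = len(mean)
    target = [sum(wi * c[d] for wi, c in zip(w, candidates))
              for d in range(dims)]
    return [m + lr * (t - m) for m, t in zip(mean, target)]
```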

IPC Classes

  • G06N 3/086 - Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming

10.

DISTRIBUTED TRAINING USING ACTOR-CRITIC REINFORCEMENT LEARNING WITH OFF-POLICY CORRECTION FACTORS

      
Application Number 18487428
Status Pending
Filing Date 2023-10-16
First Publication Date 2024-04-18
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Soyer, Hubert Josef
  • Espeholt, Lasse
  • Simonyan, Karen
  • Doron, Yotam
  • Firoiu, Vlad
  • Mnih, Volodymyr
  • Kavukcuoglu, Koray
  • Munos, Remi
  • Ward, Thomas
  • Harley, Timothy James Alexander
  • Dunning, Iain Robert

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a plurality of actor computing units and a plurality of learner computing units. The actor computing units generate experience tuple trajectories that are used by the learner computing units to update learner action selection neural network parameters using a reinforcement learning technique. The reinforcement learning technique may be an off-policy actor critic reinforcement learning technique.


11.

PATHOGENICITY PREDICTION FOR PROTEIN MUTATIONS USING AMINO ACID SCORE DISTRIBUTIONS

      
Application Number EP2023078227
Publication Number 2024/079204
Status In Force
Filing Date 2023-10-11
Publication Date 2024-04-18
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Avsec, Ziga
  • Novati, Guido
  • Cheng, Jun

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a pathogenicity score characterizing a likelihood that a mutation to a protein is a pathogenic mutation, wherein the mutation modifies an amino acid sequence of the protein by replacing an original amino acid by a substitute amino acid at a mutation position in the amino acid sequence of the protein. In one aspect, a method comprises: generating a network input to a pathogenicity prediction neural network, wherein the network input comprises a multiple sequence alignment (MSA) representation that represents an MSA for the protein; processing the network input using the pathogenicity prediction neural network to generate a score distribution over a set of amino acids; and generating the pathogenicity score using the score distribution over the set of amino acids.
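One way the final step could look, assuming (hypothetically) a log-likelihood-ratio read-out of the score distribution over amino acids at the mutation position; the abstract does not commit to this form:

```python
import math

def pathogenicity_score(score_distribution, original, substitute):
    """Hypothetical read-out: a substitute amino acid that the model
    finds much less plausible than the original at the mutation
    position scores as more likely pathogenic."""
    p_orig = score_distribution[original]
    p_sub = score_distribution[substitute]
    return math.log(p_orig / p_sub)
```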


12.

PREDICTING PROTEIN AMINO ACID SEQUENCES USING GENERATIVE MODELS CONDITIONED ON PROTEIN STRUCTURE EMBEDDINGS

      
Application Number 18275933
Status Pending
Filing Date 2022-01-27
First Publication Date 2024-04-11
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Senior, Andrew W.
  • Kohl, Simon
  • Yim, Jason
  • Bates, Russell James
  • Ionescu, Catalin-Dumitru
  • Nash, Charlie Thomas Curtis
  • Razavi-Nematollahi, Ali
  • Pritzel, Alexander
  • Jumper, John

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing protein design. In one aspect, a method comprises: processing an input characterizing a target protein structure of a target protein using an embedding neural network having a plurality of embedding neural network parameters to generate an embedding of the target protein structure of the target protein; determining a predicted amino acid sequence of the target protein based on the embedding of the target protein structure, comprising: conditioning a generative neural network having a plurality of generative neural network parameters on the embedding of the target protein structure; and generating, by the generative neural network conditioned on the embedding of the target protein structure, a representation of the predicted amino acid sequence of the target protein.

IPC Classes

  • G16B 15/00 - ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
  • G16B 30/20 - Sequence assembly
  • G16B 40/20 - Supervised data analysis

13.

DISCRETE TOKEN PROCESSING USING DIFFUSION MODELS

      
Application Number 18374447
Status Pending
Filing Date 2023-09-28
First Publication Date 2024-04-11
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Strudel, Robin
  • Leblond, Rémi
  • Sifre, Laurent
  • Dieleman, Sander Etienne Lea
  • Savinov, Nikolay
  • Grathwohl, Will S.
  • Tallec, Corentin
  • Altché, Florent
  • Ganin, Iaroslav
  • Mensch, Arthur
  • Du, Yilun

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an output sequence of discrete tokens using a diffusion model. In one aspect, a method includes generating, by using the diffusion model, a final latent representation of the sequence of discrete tokens that includes a determined value for each of a plurality of latent variables; applying a de-embedding matrix to the final latent representation of the output sequence of discrete tokens to generate a de-embedded final latent representation that includes, for each of the plurality of latent variables, a respective numeric score for each discrete token in a vocabulary of multiple discrete tokens; selecting, for each of the plurality of latent variables, a discrete token from among the multiple discrete tokens in the vocabulary that has a highest numeric score; and generating the output sequence of discrete tokens that includes the selected discrete tokens.
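The de-embedding and argmax selection steps can be sketched directly; the diffusion model that produces the final latent representation is elided here:

```python
def decode_latents(latents, de_embedding):
    """Apply the de-embedding matrix to each latent vector to get a score
    per vocabulary token, then select the highest-scoring token per
    latent variable. `de_embedding[d][t]` maps latent dim d to token t."""
    vocab_size = len(de_embedding[0])
    tokens = []
    for z in latents:
        scores = [sum(zi * row[t] for zi, row in zip(z, de_embedding))
                  for t in range(vocab_size)]
        tokens.append(max(range(vocab_size), key=scores.__getitem__))
    return tokens
```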


14.

PROGRESSIVE NEURAL NETWORKS

      
Application Number 18479775
Status Pending
Filing Date 2023-10-02
First Publication Date 2024-04-11
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Rabinowitz, Neil Charles
  • Desjardins, Guillaume
  • Rusu, Andrei-Alexandru
  • Kavukcuoglu, Koray
  • Hadsell, Raia Thais
  • Pascanu, Razvan
  • Kirkpatrick, James
  • Soyer, Hubert Josef

Abstract

Methods and systems for performing a sequence of machine learning tasks. One system includes a sequence of deep neural networks (DNNs), including: a first DNN corresponding to a first machine learning task, wherein the first DNN comprises a first plurality of indexed layers, and each layer in the first plurality of indexed layers is configured to receive a respective layer input and process the layer input to generate a respective layer output; and one or more subsequent DNNs corresponding to one or more respective machine learning tasks, wherein each subsequent DNN comprises a respective plurality of indexed layers, and each layer in a respective plurality of indexed layers with index greater than one receives input from a preceding layer of the respective subsequent DNN, and one or more preceding layers of respective preceding DNNs, wherein a preceding layer is a layer whose index is one less than the current index.
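The lateral-connection wiring can be sketched with layers as plain functions of a list of inputs; how a real layer combines its inputs (learned weights, nonlinearities) is abstracted away:

```python
def progressive_forward(columns, x):
    """Run each column (DNN) in turn. Layer i (i > 0) of column k receives
    the output of layer i-1 of its own column plus layer i-1 of every
    earlier (frozen) column. Returns all activations per column."""
    activations = []  # activations[k][i] = output of layer i in column k
    for k, layers in enumerate(columns):
        acts = []
        for i, layer in enumerate(layers):
            if i == 0:
                inputs = [x]
            else:
                # own preceding layer plus preceding layers of earlier columns
                inputs = [acts[i - 1]] + [activations[j][i - 1]
                                          for j in range(k)]
            acts.append(layer(inputs))
        activations.append(acts)
    return activations
```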


15.

EVALUATING REPRESENTATIONS WITH READ-OUT MODEL SWITCHING

      
Application Number 18475972
Status Pending
Filing Date 2023-09-27
First Publication Date 2024-04-11
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Li, Yazhe
  • Bornschein, Jorg
  • Hutter, Marcus

Abstract

A method of automatically selecting a neural network from a plurality of computer-implemented candidate neural networks, each candidate neural network comprising at least an encoder neural network trained to encode an input value as a latent representation. The method comprises: obtaining a sequence of data items, each of the data items comprising an input value and a target value; and determining a respective score for each of the candidate neural networks, comprising evaluating the encoder neural network of the candidate neural network using a plurality of read-out heads. Each read-out head comprises parameters for predicting a target value from a latent representation of an input value of a data item encoded using the encoder neural network of the candidate neural network. The method further comprises selecting the neural network from the plurality of candidate neural networks using the respective scores.


16.

OPTIMIZING ALGORITHMS FOR HARDWARE DEVICES

      
Application Number EP2023077237
Publication Number 2024/074452
Status In Force
Filing Date 2023-10-02
Publication Date 2024-04-11
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Hubert, Thomas Keisuke
  • Huang, Shih-Chieh
  • Novikov, Alexander
  • Fawzi, Alhussein
  • Romera-Paredes, Bernardino
  • Silver, David
  • Hassabis, Demis
  • Swirszcz, Grzegorz Michal
  • Schrittwieser, Julian
  • Kohli, Pushmeet
  • Barekatain, Mohammadamin
  • Balog, Matej
  • Rodriguez Ruiz, Francisco Jesus

Abstract

A method performed by one or more computers for obtaining an optimized algorithm that (i) is functionally equivalent to a target algorithm and (ii) optimizes one or more target properties when executed on a target set of one or more hardware devices. The method includes: initializing a target tensor representing the target algorithm; generating, using a neural network having a plurality of network parameters, a tensor decomposition of the target tensor that parametrizes a candidate algorithm; generating target property values for each of the target properties when executing the candidate algorithm on the target set of hardware devices; determining a benchmarking score for the tensor decomposition based on the target property values of the candidate algorithm; generating a training example from the tensor decomposition and the benchmarking score; and storing, in a training data store, the training example for use in updating the network parameters of the neural network.

IPC Classes

  • G06F 16/901 - Indexing; Data structures therefor; Storage structures
  • G06N 3/092 - Reinforcement learning
  • G06N 3/0985 - Hyperparameter optimisation; Meta-learning; Learning-to-learn
  • G06N 5/01 - Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound

17.

CONTROLLING AGENTS USING REPORTER NEURAL NETWORKS

      
Application Number EP2023076516
Publication Number 2024/068610
Status In Force
Filing Date 2023-09-26
Publication Date 2024-04-04
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Dasgupta, Ishita
  • Chen, Shiqi
  • Marino, Kenneth Daniel
  • Shang, Wenling
  • Ahuja, Arun

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for controlling agents using reporter neural networks.


18.

SCORE MODELLING FOR SIMULATION-BASED INFERENCE

      
Application Number EP2023076529
Publication Number 2024/068622
Status In Force
Filing Date 2023-09-26
Publication Date 2024-04-04
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Mnih, Andriy
  • Geffner, Tomas

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for using simulation-based inference to infer a set of parameters, such as measurements, from observations, e.g. real-world observations. The method uses a score generation neural network to determine scores for individual observations, or for groups of observations, that are combined and used to iteratively adjust values of the parameters.

IPC Classes

  • G06F 16/906 - Clustering; Classification
  • G06N 3/00 - Computing arrangements based on biological models

19.

GRAPH NEURAL NETWORKS THAT MODEL FACE-FACE INTERACTIONS BETWEEN MESHES

      
Application Number EP2023076797
Publication Number 2024/068788
Status In Force
Filing Date 2023-09-27
Publication Date 2024-04-04
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Allen, Kelsey Rebecca
  • Rubanova, Yulia
  • Lopez Guevara, Tatiana
  • Whitney, William Fairclough
  • Sanchez, Alvaro
  • Battaglia, Peter William
  • Pfaff, Tobias

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for simulating a state of an environment over a sequence of time steps. In one aspect, a method comprises, at each of one or more time steps: obtaining an environment mesh representing the state of the environment at the time step; generating a graph representing the state of the environment at the time step, comprising: determining that a first face of a first object mesh is within a collision distance of a second face of a second object mesh; and in response, instantiating a face-face edge in the graph that connects: (i) a first set of graph nodes in the graph that represent the first face in the first object mesh, and (ii) a second set of graph nodes in the graph that represent the second face in the second object mesh.

IPC Classes

  • G06F 30/20 - Design optimisation, verification or simulation
  • B25J 9/00 - Programme-controlled manipulators

20.

LEARNING TASKS USING SKILL SEQUENCING FOR TEMPORALLY-EXTENDED EXPLORATION

      
Application Number EP2023076798
Publication Number 2024/068789
Status In Force
Filing Date 2023-09-27
Publication Date 2024-04-04
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Vezzani, Giulia
  • Tirumala Bukkapatnam, Dhruva
  • Wulfmeier, Markus
  • Riedmiller, Martin
  • Heess, Nicolas Manfred Otto

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for controlling an agent that is interacting with an environment. Implementations of the system use previously learned skills to explore states of the environment to collect and store training data, which is then used to train an action selection system. The system includes a set of skill action selection subsystems, each configured to select actions for the agent to perform for a respective skill. The set of skill action selection subsystems is used to explore states of the environment to collect the training data, keeping their individual action selection policies unchanged. A scheduler neural network selects the skill neural networks to use. The action selection system is trained on the stored training data.


21.

REINFORCEMENT LEARNING USING DENSITY ESTIMATION WITH ONLINE CLUSTERING FOR EXPLORATION

      
Application Number EP2023076893
Publication Number 2024/068841
Status In Force
Filing Date 2023-09-28
Publication Date 2024-04-04
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Saade, Alaa
  • Kapturowski, Steven James
  • Calandriello, Daniele
  • Blundell, Charles
  • Valko, Michal
  • Sprechmann, Pablo
  • Piot, Bilal

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network used to select actions to be performed by an agent interacting with an environment. Implementations of the described techniques can learn to explore the environment efficiently by storing and updating state embedding cluster centers based on observations characterizing states of the environment.
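A minimal sketch of the cluster-center bookkeeping, assuming a hypothetical distance-based exploration bonus and an exponential-moving-average center update; the abstract only states that centers are stored and updated from observation embeddings:

```python
import math

def exploration_bonus(embedding, centers, update_rate=0.1):
    """Intrinsic reward grows with distance to the nearest stored cluster
    center; the nearest center then drifts toward the new embedding
    (a stand-in for the online clustering / density estimate)."""
    nearest = min(centers, key=lambda c: math.dist(embedding, c))
    bonus = math.dist(embedding, nearest)
    idx = centers.index(nearest)
    centers[idx] = [ci + update_rate * (ei - ci)
                    for ci, ei in zip(nearest, embedding)]
    return bonus
```

Familiar embeddings (near an existing center) earn little bonus, so the agent is pushed toward under-explored states.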


22.

AGENT CONTROL THROUGH IN-CONTEXT REINFORCEMENT LEARNING

      
Application Number EP2023076897
Publication Number 2024/068843
Status In Force
Filing Date 2023-09-28
Publication Date 2024-04-04
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Laskin, Michael
  • Mnih, Volodymyr
  • Wang, Luyu
  • Baveja, Satinder Singh

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for controlling agents. In particular, an agent can be controlled using an action selection neural network that performs in-context reinforcement learning when controlling an agent on a new task.

IPC Classes

  • G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
  • G06N 3/0442 - Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
  • G06N 3/092 - Reinforcement learning

23.

CONTROLLING AGENTS USING REPORTER NEURAL NETWORKS

      
Application Number 18475157
Status Pending
Filing Date 2023-09-26
First Publication Date 2024-04-04
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Dasgupta, Ishita
  • Chen, Shiqi
  • Marino, Kenneth Daniel
  • Shang, Wenling
  • Ahuja, Arun

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for controlling agents using reporter neural networks.


24.

DISCRETE TOKEN PROCESSING USING DIFFUSION MODELS

      
Application Number EP2023076788
Publication Number 2024/068781
Status In Force
Filing Date 2023-09-27
Publication Date 2024-04-04
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Strudel, Robin
  • Leblond, Rémi
  • Sifre, Laurent
  • Dieleman, Sander Etienne Lea
  • Savinov, Nikolay
  • Grathwohl, Will S.
  • Tallec, Corentin
  • Altché, Florent
  • Ganin, Iaroslav
  • Mensch, Arthur
  • Du, Yilun

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an output sequence of discrete tokens using a diffusion model. In one aspect, a method includes generating, by using the diffusion model, a final latent representation of the sequence of discrete tokens that includes a determined value for each of a plurality of latent variables; applying a de-embedding matrix to the final latent representation of the output sequence of discrete tokens to generate a de-embedded final latent representation that includes, for each of the plurality of latent variables, a respective numeric score for each discrete token in a vocabulary of multiple discrete tokens; selecting, for each of the plurality of latent variables, a discrete token from among the multiple discrete tokens in the vocabulary that has a highest numeric score; and generating the output sequence of discrete tokens that includes the selected discrete tokens.


25.

REWARD-MODEL BASED REINFORCEMENT LEARNING FOR PERFORMING REASONING TASKS

      
Application Number EP2023076792
Publication Number 2024/068784
Status In Force
Filing Date 2023-09-27
Publication Date 2024-04-04
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Higgins, Irina
  • Uesato, Jonathan Ken
  • Kushman, Nathaniel Arthur
  • Kumar, Ramana

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a language model to perform a reasoning task. The system obtains a plurality of training examples. Each training example includes a respective sample query text sequence characterizing a respective sample query and a respective reference response text sequence that includes a reference final answer to the respective sample query. The system trains a reward model on the plurality of training examples. The reward model is configured to receive an input including a query text sequence characterizing a query and one or more reasoning steps that have been generated in response to the query, and to process the input to compute a reward score indicating how successful the one or more reasoning steps are in yielding a correct final answer to the query. The system trains the language model using the trained reward model.

26.

SYSTEM AND METHOD FOR REINFORCEMENT LEARNING BASED ON PRIOR TRAJECTORIES

      
Application Number EP2023076793
Publication Number 2024/068785
Status In Force
Filing Date 2023-09-27
Publication Date 2024-04-04
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Bruce, Jacob
  • Anand, Ankit
  • Fergus, Robert David

Abstract

A reinforcement learning system is proposed in which a policy model neural network is trained to control an agent to perform a task in successive time steps, by training a control system including the policy model neural network to select a respective action for each time step which gives a high value for a reward function based on the action, and which indicates the contribution of the action to solving the task. The reward function includes a term based on a progress value output by a progress model. The progress model generates the progress value upon receiving a first observation of the state of the environment at a time step before the performance of the action, and a second observation of the state of the environment at a time step following the performance of the action. The progress value is an estimate of the average time which an ensemble of experts who produced the demonstrations would have taken to transform the environment from how it appears in the first observation to how it appears in the second observation.

27.

NEURAL NETWORKS WITH REGULARIZED ATTENTION LAYERS

      
Application Number EP2023076794
Publication Number 2024/068786
Status In Force
Filing Date 2023-09-27
Publication Date 2024-04-04
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • He, Bobby Boyi
  • Martens, James

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing a network input using a neural network that includes one or more regularized attention layers. In one aspect, a method comprises: receiving a layer input to a regularized attention layer, wherein the layer input to the regularized attention layer comprises a set of input embeddings; and applying a regularized attention operation over the set of input embeddings to generate a set of output embeddings, comprising: transforming intermediate attention scores using a set of shaping constants to generate a set of transformed attention scores, wherein: values of the shaping constants are initialized prior to training of the neural network and are not adjusted during the training of the neural network; and the values of the shaping constants are selected to regularize the set of output embeddings.
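
The key mechanism here is transforming the intermediate attention scores with fixed shaping constants. The abstract does not give the exact transformation, so the sketch below assumes one plausible form: mixing the softmax score matrix with identity and uniform matrices, weighted by constants that are set before training and never updated.

```python
import numpy as np

def regularized_attention(q, k, v, alpha=1.0, beta=0.5, gamma=0.5):
    """Attention with fixed shaping constants (illustrative values).

    alpha, beta, gamma are chosen before training and not adjusted during
    training; the exact shaping rule is an assumption, not the patented one.
    """
    t, d = q.shape
    logits = q @ k.T / np.sqrt(d)                       # intermediate scores
    attn = np.exp(logits - logits.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)
    # Transform the scores: identity + softmax - uniform, per shaping constants.
    shaped = alpha * np.eye(t) + beta * attn - gamma * np.full((t, t), 1.0 / t)
    return shaped @ v                                   # output embeddings

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
out = regularized_attention(x, x, x)
```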

28.

REWARD-MODEL BASED REINFORCEMENT LEARNING FOR PERFORMING REASONING TASKS

      
Application Number 18475743
Status Pending
Filing Date 2023-09-27
First Publication Date 2024-03-28
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Higgins, Irina
  • Uesato, Jonathan Ken
  • Kushman, Nathaniel Arthur
  • Kumar, Ramana

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a language model for performing a reasoning task. The system obtains a plurality of training examples. Each training example includes a respective sample query text sequence characterizing a respective sample query and a respective reference response text sequence that includes a reference final answer to the respective sample query. The system trains a reward model on the plurality of training examples. The reward model is configured to receive an input including a query text sequence characterizing a query and one or more reasoning steps that have been generated in response to the query and process the input to compute a reward score indicating how successful the one or more reasoning steps are in yielding a correct final answer to the query. The system trains the language model using the trained reward model.

29.

GUIDED DIALOGUE USING LANGUAGE GENERATION NEURAL NETWORKS AND SEARCH

      
Application Number EP2023075931
Publication Number 2024/061963
Status In Force
Filing Date 2023-09-20
Publication Date 2024-03-28
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Irving, Geoffrey
  • Glaese, Amelia Marita Claudia
  • Mcaleese-Park, Nathaniel John
  • Hendricks, Lisa Anne Marie

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enabling a user to conduct a dialogue. Implementations of the system learn when to rely on supporting evidence, obtained from an external search system via a search system interface, and are also able to generate replies for the user that align with the preferences of a previously trained response selection neural network. Implementations of the system can also use a previously trained rule violation detection neural network to generate replies that take account of previously learnt rules.

IPC Classes

  • G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
  • G06N 3/092 - Reinforcement learning
  • G06N 3/0455 - Auto-encoder networks; Encoder-decoder networks
  • G06N 3/042 - Knowledge-based neural networks; Logical representations of neural networks
  • G06N 3/084 - Backpropagation, e.g. using gradient descent
  • G06N 3/094 - Adversarial learning
  • G06N 5/01 - Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound

30.

GENERATING NEURAL NETWORK OUTPUTS BY ENRICHING LATENT EMBEDDINGS USING SELF-ATTENTION AND CROSS-ATTENTION OPERATIONS

      
Application Number 18271611
Status Pending
Filing Date 2022-02-03
First Publication Date 2024-03-28
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Jaegle, Andrew Coulter
  • Carreira, Joao

Abstract

This specification describes a method for using a neural network to generate a network output that characterizes an entity. The method includes: obtaining a representation of the entity as a set of data element embeddings, obtaining a set of latent embeddings, and processing: (i) the set of data element embeddings, and (ii) the set of latent embeddings, using the neural network to generate the network output characterizing the entity. The neural network includes: (i) one or more cross-attention blocks, (ii) one or more self-attention blocks, and (iii) an output block. Each cross-attention block updates each latent embedding using attention over some or all of the data element embeddings. Each self-attention block updates each latent embedding using attention over the set of latent embeddings. The output block processes one or more latent embeddings to generate the network output that characterizes the entity.
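
The flow of this architecture can be sketched with plain dot-product attention: a small fixed set of latent embeddings is repeatedly enriched by cross-attention over the data-element embeddings and self-attention among the latents, then pooled by an output block. The shapes, layer count, and mean-pooling output block below are illustrative assumptions, not details from the filing.

```python
import numpy as np

def attend(q, k, v):
    """Single-head dot-product attention (no learned projections, for brevity)."""
    w = np.exp(q @ k.T / np.sqrt(q.shape[-1]))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 16))    # data-element embeddings (e.g. pixels)
latents = rng.normal(size=(8, 16))   # small fixed set of latent embeddings

for _ in range(2):
    latents = attend(latents, data, data)        # cross-attention block:
                                                 # latents read the input
    latents = attend(latents, latents, latents)  # self-attention block:
                                                 # latents attend to each other

output = latents.mean(axis=0)  # output block: pool latents into one output
```

Because the attention cost scales with the small latent set rather than the (possibly very large) input, the same loop works for long inputs.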

31.

SEQUENCE-TO-SEQUENCE NEURAL NETWORK SYSTEMS USING LOOK AHEAD TREE SEARCH

      
Application Number 18274748
Status Pending
Filing Date 2022-02-08
First Publication Date 2024-03-28
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Leblond, Rémi Bertrand Francis
  • Alayrac, Jean-Baptiste
  • Sifre, Laurent
  • Pîslar, Miruna
  • Lespiau, Jean-Baptiste
  • Antonoglou, Ioannis
  • Simonyan, Karen
  • Silver, David
  • Vinyals, Oriol

Abstract

A computer-implemented method for generating an output token sequence from an input token sequence. The method combines a look ahead tree search, such as a Monte Carlo tree search, with a sequence-to-sequence neural network system. The sequence-to-sequence neural network system has a policy output defining a next token probability distribution, and may include a value neural network providing a value output to evaluate a sequence. An initial partial output sequence is extended using the look ahead tree search guided by the policy output and, in implementations, the value output, of the sequence-to-sequence neural network system until a complete output sequence is obtained.
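
A toy sketch of the idea: extend partial output sequences with a look-ahead search guided by a policy's next-token distribution and a value estimate for each partial sequence. The fixed toy policy, value function, and beam-style expansion below are assumptions for illustration; the filing describes a Monte Carlo tree search with learned networks.

```python
import math

def policy(seq):
    # Hypothetical next-token distribution over a 3-token vocabulary
    # (token 2 plays the role of end-of-sequence).
    return {0: 0.5, 1: 0.3, 2: 0.2}

def value(seq):
    # Hypothetical value output used to evaluate a partial sequence.
    return -0.1 * len(seq)

def lookahead_search(max_len=3, width=2):
    frontier = [((), 0.0)]              # (partial sequence, cumulative log-prob)
    complete = []
    for _ in range(max_len):
        nxt = []
        for seq, lp in frontier:
            probs = policy(seq)
            top = sorted(probs, key=probs.get, reverse=True)[:width]
            for tok in top:
                cand = (seq + (tok,), lp + math.log(probs[tok]))
                (complete if tok == 2 else nxt).append(cand)
        # Keep the expansions that policy log-prob plus value rank highest.
        frontier = sorted(nxt, key=lambda c: c[1] + value(c[0]),
                          reverse=True)[:width]
    complete.extend(frontier)
    return max(complete, key=lambda c: c[1] + value(c[0]))[0]
```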

IPC Classes

  • G06N 3/0455 - Auto-encoder networks; Encoder-decoder networks

32.

GENERATING IMAGES USING SPARSE REPRESENTATIONS

      
Application Number 18275048
Status Pending
Filing Date 2022-02-07
First Publication Date 2024-03-28
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Nash, Charlie Thomas Curtis
  • Battaglia, Peter William

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating compressed representations of synthetic images. One of the methods is a method of generating a synthetic image using a generative neural network, and includes: generating, using the generative neural network, a plurality of coefficients that represent the synthetic image after the synthetic image has been encoded using a lossy compression algorithm; and decoding the synthetic image by applying the lossy compression algorithm to the plurality of coefficients.

IPC Classes

  • G06T 9/00 - Image coding
  • G06T 7/11 - Region-based segmentation
  • G06T 7/73 - Determining position or orientation of objects or cameras using feature-based methods

33.

TEMPORAL DIFFERENCE SCALING WHEN CONTROLLING AGENTS USING REINFORCEMENT LEARNING

      
Application Number 18275145
Status Pending
Filing Date 2022-02-04
First Publication Date 2024-03-28
Owner DeepMind Technologies Limited (United Kingdom)
Inventor Schaul, Tom

Abstract

A reinforcement learning neural network system configured to manage rewards on scales that can vary significantly. The system determines the value of a scale factor that is applied to a temporal difference error used for reinforcement learning. The scale factor depends at least upon a variance of the rewards received during the reinforcement learning.
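
The abstract only states that the scale factor depends at least on the variance of the rewards; one natural choice, assumed here purely for illustration, is the inverse standard deviation, which shrinks the temporal difference error when rewards span a large range:

```python
import numpy as np

def scaled_td_error(reward, value, next_value, rewards_seen,
                    gamma=0.99, eps=1e-8):
    """Temporal-difference error scaled by the observed reward variance.

    The inverse-standard-deviation rule is an assumption; the filing says
    only that the scale depends on the variance of received rewards.
    """
    td_error = reward + gamma * next_value - value
    scale = 1.0 / np.sqrt(np.var(rewards_seen) + eps)
    return scale * td_error
```

With rewards spread over [0, 2] the error passes through roughly unchanged; with rewards spread over [0, 200] the same error is scaled down by a factor of about 100.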

34.

NEURAL NETWORK REINFORCEMENT LEARNING WITH DIVERSE POLICIES

      
Application Number 18275511
Status Pending
Filing Date 2022-02-04
First Publication Date 2024-03-28
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Zahavy, Tom Ben Zion
  • O'Donoghue, Brendan Timothy
  • Da Motta Salles Barreto, Andre
  • Flennerhag, Johan Sebastian
  • Mnih, Volodymyr
  • Baveja, Satinder Singh

Abstract

In one aspect there is provided a method for training a neural network system by reinforcement learning. The neural network system may be configured to receive an input observation characterizing a state of an environment interacted with by an agent and to select and output an action in accordance with a policy aiming to satisfy an objective. The method may comprise obtaining a policy set comprising one or more policies for satisfying the objective and determining a new policy based on the one or more policies. The determining may include one or more optimization steps that aim to maximize a diversity of the new policy relative to the policy set under the condition that the new policy satisfies a minimum performance criterion based on an expected return that would be obtained by following the new policy.

35.

GUIDED DIALOGUE USING LANGUAGE GENERATION NEURAL NETWORKS AND SEARCH

      
Application Number 18471257
Status Pending
Filing Date 2023-09-20
First Publication Date 2024-03-28
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Irving, Geoffrey
  • Glaese, Amelia Marita Claudia
  • Mcaleese-Park, Nathaniel John
  • Hendricks, Lisa Anne Marie

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enabling a user to conduct a dialogue. Implementations of the system learn when to rely on supporting evidence, obtained from an external search system via a search system interface, and are also able to generate replies for the user that align with the preferences of a previously trained response selection neural network. Implementations of the system can also use a previously trained rule violation detection neural network to generate replies that take account of previously learnt rules.

IPC Classes

  • G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
  • G06F 40/284 - Lexical analysis, e.g. tokenisation or collocates
  • G06F 40/35 - Discourse or dialogue representation
  • G06N 3/0455 - Auto-encoder networks; Encoder-decoder networks
  • G06N 3/092 - Reinforcement learning

36.

AGENT CONTROL THROUGH IN-CONTEXT REINFORCEMENT LEARNING

      
Application Number 18477492
Status Pending
Filing Date 2023-09-28
First Publication Date 2024-03-28
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Laskin, Michael
  • Mnih, Volodymyr
  • Wang, Luyu
  • Baveja, Satinder Singh

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for controlling agents. In particular, an agent can be controlled using an action selection neural network that performs in-context reinforcement learning when controlling an agent on a new task.

37.

Image processing of an environment to select an action to be performed by an agent interacting with the environment

      
Application Number 17737544
Grant Number 11941088
Status In Force
Filing Date 2022-05-05
First Publication Date 2024-03-26
Grant Date 2024-03-26
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Mnih, Volodymyr
  • Kavukcuoglu, Koray

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing images using recurrent attention. One of the methods includes determining a location in the first image; extracting a glimpse from the first image using the location; generating a glimpse representation of the extracted glimpse; processing the glimpse representation using a recurrent neural network to update a current internal state of the recurrent neural network to generate a new internal state; processing the new internal state to select a location in a next image in the image sequence after the first image; and processing the new internal state to select an action from a predetermined set of possible actions.
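
The "glimpse" extraction step is just a crop around a selected location; the recurrent network that picks each location is omitted here. A minimal sketch (the patch size and toy image are illustrative assumptions):

```python
import numpy as np

def extract_glimpse(image, center, size=3):
    """Crop a size x size patch of the image around the given location."""
    r, c = center
    h = size // 2
    return image[r - h:r + h + 1, c - h:c + h + 1]

image = np.arange(64, dtype=float).reshape(8, 8)
glimpse = extract_glimpse(image, (4, 4))   # 3x3 patch centered on pixel (4, 4)
```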

IPC Classes

  • G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
  • G06F 18/2431 - Multiple classes
  • G06V 20/80 - Recognising image objects characterised by unique random patterns
  • G06V 30/194 - References adjustable by an adaptive method, e.g. learning
  • G06V 30/413 - Classification of content, e.g. text, photographs or tables

38.

ATTENTION NEURAL NETWORKS WITH SHORT-TERM MEMORY UNITS

      
Application Number 18275052
Status Pending
Filing Date 2022-02-07
First Publication Date 2024-03-21
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Banino, Andrea
  • Badia, Adrià Puigdomènech
  • Walker, Jacob Charles
  • Scholtes, Timothy Anthony Julian
  • Mitrovic, Jovana
  • Blundell, Charles

Abstract

A system for controlling an agent interacting with an environment to perform a task. The system includes an action selection neural network configured to generate action selection outputs that are used to select actions to be performed by the agent. The action selection neural network includes an encoder sub network configured to generate encoded representations of the current observations; an attention sub network configured to generate attention sub network outputs with the use of an attention mechanism; a recurrent sub network configured to generate recurrent sub network outputs; and an action selection sub network configured to generate the action selection outputs that are used to select the actions to be performed by the agent in response to the current observations.

IPC Classes

  • G06N 3/0442 - Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
  • G06N 3/092 - Reinforcement learning

39.

CONTROLLING INDUSTRIAL FACILITIES USING HIERARCHICAL REINFORCEMENT LEARNING

      
Application Number EP2023075295
Publication Number 2024/056800
Status In Force
Filing Date 2023-09-14
Publication Date 2024-03-21
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Wong, William
  • Dutta, Praneet
  • Luo, Jerry Jiayu

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for controlling a facility through hierarchical reinforcement learning. In particular, the facility is controlled using a high-level controller neural network that makes high-level decisions and a low-level controller neural network that makes low-level controller decisions.

IPC Classes

  • G05B 13/02 - Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric

40.

DATA-EFFICIENT REINFORCEMENT LEARNING WITH ADAPTIVE RETURN COMPUTATION SCHEMES

      
Application Number EP2023075512
Publication Number 2024/056891
Status In Force
Filing Date 2023-09-15
Publication Date 2024-03-21
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Jiang, Ray
  • Puigdomènech Badia, Adrià
  • Campos Camúñez, Víctor
  • Kapturowski, Steven James
  • Rakicevic, Nemanja

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for data-efficient reinforcement learning with adaptive return computation schemes.

IPC Classes

  • G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
  • G06N 3/045 - Combinations of networks
  • G06N 3/092 - Reinforcement learning
  • G06N 3/096 - Transfer learning
  • G06N 3/0464 - Convolutional networks [CNN, ConvNet]
  • G06N 3/084 - Backpropagation, e.g. using gradient descent
  • G06N 3/0985 - Hyperparameter optimisation; Meta-learning; Learning-to-learn

41.

TRAINING POLICY NEURAL NETWORKS IN SIMULATION USING SCENE SYNTHESIS MACHINE LEARNING MODELS

      
Application Number EP2023075514
Publication Number 2024/056892
Status In Force
Filing Date 2023-09-15
Publication Date 2024-03-21
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Byravan, Arunkumar
  • Humplik, Jan
  • Hasenclever, Leonard
  • Brussee, Arthur Karl
  • Nori, Francesco

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a policy neural network for use in controlling a robot. In particular, the policy neural network can be trained in simulation using images generated by a scene synthesis machine learning model.

IPC Classes

  • G06N 3/0895 - Weakly supervised learning, e.g. semi-supervised or self-supervised learning
  • G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
  • G06N 3/092 - Reinforcement learning
  • G06N 3/096 - Transfer learning
  • G06N 3/045 - Combinations of networks

42.

DETERMINING PRINCIPAL COMPONENTS USING MULTI-AGENT INTERACTION

      
Application Number 18275045
Status Pending
Filing Date 2022-02-07
First Publication Date 2024-03-14
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Gemp, Ian Michael
  • Mcwilliams, Brian

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining principal components of a data set using multi-agent interactions. One of the methods includes obtaining initial estimates for a plurality of principal components of a data set; and generating a final estimate for each principal component by repeatedly performing operations comprising: generating a reward estimate using the current estimate of the principal component, wherein the reward estimate is larger if the current estimate of the principal component captures more variance in the data set; generating, for each parent principal component of the principal component, a punishment estimate, wherein the punishment estimate is larger if the current estimate of the principal component and the current estimate of the parent principal component are not orthogonal; and updating the current estimate of the principal component according to a difference between the reward estimate and the punishment estimates.
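
The reward/punishment scheme can be sketched as an iterative update: each component's estimate ascends the variance it captures (the reward) while being pushed away from alignment with its parent components (the punishment). The specific gradient forms, learning rate, and the diagonal test covariance below are illustrative assumptions, not the exact update from the filing.

```python
import numpy as np

rng = np.random.default_rng(0)
C = np.diag([3.0, 2.0, 1.0, 0.5, 0.1])   # covariance with known principal axes

k, lr = 3, 0.1
V = rng.normal(size=(5, k))               # initial estimates, one per agent
V /= np.linalg.norm(V, axis=0)

for _ in range(2000):
    for i in range(k):
        v = V[:, i]
        reward_grad = 2 * C @ v           # grow variance captured, v' C v
        punish_grad = np.zeros_like(v)
        for j in range(i):                # punish overlap with parents j < i
            u = V[:, j]
            punish_grad += 2 * (v @ C @ u) / (u @ C @ u) * (C @ u)
        V[:, i] = v + lr * (reward_grad - punish_grad)
        V[:, i] /= np.linalg.norm(V[:, i])
```

At convergence the columns of `V` approximate the leading principal components, and the punishment terms have driven them to be mutually orthogonal.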

IPC Classes

  • G06N 7/01 - Probabilistic graphical models, e.g. probabilistic networks

43.

PREDICTING COMPLETE PROTEIN REPRESENTATIONS FROM MASKED PROTEIN REPRESENTATIONS

      
Application Number 18273594
Status Pending
Filing Date 2022-01-27
First Publication Date 2024-03-14
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Pritzel, Alexander
  • Ionescu, Catalin-Dumitru
  • Kohl, Simon

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for unmasking a masked representation of a protein using a protein reconstruction neural network. In one aspect, a method comprises: receiving the masked representation of the protein; and processing the masked representation of the protein using the protein reconstruction neural network to generate a respective predicted embedding corresponding to one or more masked embeddings that are included in the masked representation of the protein, wherein a predicted embedding corresponding to a masked embedding in a representation of the amino acid sequence of the protein defines a prediction for an identity of an amino acid at a corresponding position in the amino acid sequence, wherein a predicted embedding corresponding to a masked embedding in a representation of the structure of the protein defines a prediction for a corresponding structural feature of the protein.

IPC Classes

  • G16B 40/20 - Supervised data analysis
  • G16B 15/20 - Protein or domain folding
  • G16B 15/30 - Drug targeting using structural data; Docking or binding prediction
  • G16B 20/50 - Mutagenesis
  • G16B 30/00 - ICT specially adapted for sequence analysis involving nucleotides or amino acids

44.

CONTROLLING AGENTS USING STATE ASSOCIATIVE LEARNING FOR LONG-TERM CREDIT ASSIGNMENT

      
Application Number 18275542
Status Pending
Filing Date 2022-02-04
First Publication Date 2024-03-14
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Ritter, Samuel
  • Raposo, David Nunes

Abstract

A computer-implemented reinforcement learning neural network system that learns a model of rewards in order to relate actions by an agent in an environment to their long-term consequences. The model learns to decompose the rewards into components explainable by different past states. That is, the model learns to associate when being in a particular state of the environment is predictive of a reward in a later state, even when the later state, and reward, is only achieved after a very long time delay.

45.

ACTION ABSTRACTION CONTROLLER FOR FULLY ACTUATED ROBOTIC MANIPULATORS

      
Application Number EP2023067028
Publication Number 2024/051978
Status In Force
Filing Date 2023-06-22
Publication Date 2024-03-14
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Chen, Jose Enrique
  • Laurens, Antoine Marin Alix
  • Romano, Francesco
  • Scholz, Jonathan Karl
  • Fernandes Martins, Murilo
  • Nori, Francesco

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for controlling a robot manipulator that has a plurality of joints. One of the methods includes obtaining a control input that comprises one or more velocity values that specify a target velocity of a reference point in a given coordinate frame; determining a respective joint velocity for each of the plurality of joints by generating a solution to an optimization problem formulated from the control input; and controlling the robot manipulator, including causing the plurality of joints of the robot manipulator to move in accordance with the respective joint velocities to approximate the control input.
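
A common way to formulate such an optimization problem is damped least squares over the manipulator Jacobian: minimize the gap between the achieved and target reference-point velocities, with a damping term keeping joint speeds bounded. The Jacobian values and damping constant below are hypothetical stand-ins for the filing's formulation.

```python
import numpy as np

J = np.array([[0.8, 0.2, 0.0],
              [0.1, 0.9, 0.3]])       # hypothetical Jacobian: 2D task, 3 joints
v_target = np.array([0.5, -0.2])      # target reference-point velocity

# Solve min ||J qdot - v_target||^2 + damping * ||qdot||^2 in closed form.
damping = 1e-3
qdot = np.linalg.solve(J.T @ J + damping * np.eye(3), J.T @ v_target)

achieved = J @ qdot                   # joint velocities approximate the input
```

Commanding the joints with `qdot` then moves the reference point at (approximately) the requested velocity.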

IPC Classes

  • B25J 9/16 - Programme controls
  • G05B 19/427 - Teaching successive positions by tracking the position of a joystick or handle to control the positioning servo of the tool head, master-slave control

46.

CONTROLLING AGENTS USING AMBIGUITY-SENSITIVE NEURAL NETWORKS AND RISK-SENSITIVE NEURAL NETWORKS

      
Application Number EP2023074759
Publication Number 2024/052544
Status In Force
Filing Date 2023-09-08
Publication Date 2024-03-14
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Grau Moya, Jordi
  • Delétang, Grégoire
  • Kunesch, Markus
  • Ortega Caballero, Pedro Alejandro

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for controlling agents. In particular, an agent can be controlled using an action selection system that is risk-sensitive, ambiguity-sensitive, or both.

IPC Classes

  • G06N 3/044 - Recurrent networks, e.g. Hopfield networks
  • G06N 3/045 - Combinations of networks
  • G06N 3/092 - Reinforcement learning
  • G06N 3/0985 - Hyperparameter optimisation; Meta-learning; Learning-to-learn
  • G06N 3/0464 - Convolutional networks [CNN, ConvNet]
  • G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
  • G06N 7/01 - Probabilistic graphical models, e.g. probabilistic networks

47.

SELECTION-INFERENCE NEURAL NETWORK SYSTEMS

      
Application Number EP2023073796
Publication Number 2024/047108
Status In Force
Filing Date 2023-08-30
Publication Date 2024-03-07
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Creswell, Antonia Phoebe Nina
  • Shanahan, Murray

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a response to a query input using a selection-inference neural network.

IPC Classes

  • G06N 5/046 - Forward inferencing; Production systems
  • G06N 3/042 - Knowledge-based neural networks; Logical representations of neural networks
  • G06N 5/025 - Extracting rules from data
  • G06N 5/04 - Inference or reasoning models
  • G06N 5/045 - Explanation of inference; Explainable artificial intelligence [XAI]; Interpretable artificial intelligence

48.

PREDICTING EXCHANGE-CORRELATION ENERGIES OF ATOMIC SYSTEMS USING NEURAL NETWORKS

      
Application Number 18260182
Status Pending
Filing Date 2022-01-07
First Publication Date 2024-02-29
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Kirkpatrick, James
  • Mcmorrow, Brendan Charles
  • Turban, David Herbert Phlipp
  • Gaunt, Alexander Lloyd
  • Spencer, James
  • Matthews, Alexander Graeme De Garis
  • Cohen, Aron Jonathan

Abstract

Methods, computer systems, and apparatus, including computer programs encoded on computer storage media, for predicting an exchange-correlation energy of an atomic system. The system obtains respective electron-orbital features of the atomic system at each of a plurality of grid points; generates, for each of the plurality of grid points, a respective input feature vector for the electron-orbital features at the grid point; and processes the respective input feature vectors for the plurality of grid points using a neural network to generate a predicted exchange-correlation energy of the atomic system.

IPC Classes

  • G16C 20/30 - Prediction of properties of chemical compounds, compositions or mixtures
  • G06N 3/08 - Learning methods
  • G16C 20/70 - Machine learning, data mining or chemometrics

49.

RENDERING NEW IMAGES OF SCENES USING GEOMETRY-AWARE NEURAL NETWORKS CONDITIONED ON LATENT VARIABLES

      
Application Number 18275332
Status Pending
Filing Date 2022-02-04
First Publication Date 2024-02-29
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Kosiorek, Adam Roman
  • Strathmann, Heiko
  • Rezende, Danilo Jimenez
  • Zoran, Daniel
  • Moreno Comellas, Pol

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for rendering a new image that depicts a scene from a perspective of a camera at a new camera location. In one aspect, a method comprises: receiving a plurality of observations characterizing the scene; generating a latent variable representing the scene from the plurality of observations characterizing the scene; conditioning a scene representation neural network on the latent variable representing the scene, wherein the scene representation neural network conditioned on the latent variable representing the scene defines a geometric model of the scene as a three-dimensional (3D) radiance field; and rendering the new image that depicts the scene from the perspective of the camera at the new camera location using the scene representation neural network conditioned on the latent variable representing the scene.

50.

DATA-EFFICIENT REINFORCEMENT LEARNING FOR CONTINUOUS CONTROL TASKS

      
Application Number 18351440
Status Pending
Filing Date 2023-07-12
First Publication Date 2024-02-22
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Riedmiller, Martin
  • Hafner, Roland
  • Vecerik, Mel
  • Lillicrap, Timothy Paul
  • Lampe, Thomas
  • Popov, Ivaylo
  • Barth-Maron, Gabriel
  • Heess, Nicolas Manfred Otto

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for data-efficient reinforcement learning. One of the systems is a system for training an actor neural network used to select actions to be performed by an agent that interacts with an environment by receiving observations characterizing states of the environment and, in response to each observation, performing an action selected from a continuous space of possible actions, wherein the actor neural network maps observations to next actions in accordance with values of parameters of the actor neural network, and wherein the system comprises: a plurality of workers, wherein each worker is configured to operate independently of each other worker, wherein each worker is associated with a respective agent replica that interacts with a respective replica of the environment during the training of the actor neural network.

IPC Classes

  • G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
  • G06N 3/08 - Learning methods
  • G06N 3/088 - Non-supervised learning, e.g. competitive learning
  • G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
  • G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
  • G06N 3/045 - Combinations of networks

51.

DETERMINING FAILURE CASES IN TRAINED NEURAL NETWORKS USING GENERATIVE NEURAL NETWORKS

      
Application Number EP2023072617
Publication Number 2024/038114
Status In Force
Filing Date 2023-08-16
Publication Date 2024-02-22
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Gowal, Sven Adrian
  • Wiles, Olivia Anne
  • Carneiro De Albuquerque, Isabela Maria

Abstract

Methods, systems, and computer readable storage media for performing operations comprising: obtaining a plurality of initial network inputs that have been classified as belonging to a corresponding ground truth class; processing each of the plurality of initial network inputs using a trained target neural network to generate a respective predicted network output for each initial network input, the respective predicted network output comprising a respective score for each of a plurality of classes, the plurality of classes comprising the ground truth class; identifying, based on the respective predicted network outputs and the ground truth class, a subset of the initial network inputs as having been misclassified by the trained target neural network; and determining, based on the subset of initial network inputs, one or more failure case latent representations, wherein each failure case latent representation is a latent representation that characterizes network inputs that belong to the ground truth class but that are likely to be misclassified by the trained target neural network.

IPC Classes

  • G06N 5/045 - Explanation of inference; Explainable artificial intelligence [XAI]; Interpretable artificial intelligence
  • G06N 3/045 - Combinations of networks
  • G06N 3/0475 - Generative networks
  • G06N 3/09 - Supervised learning

52.

SOLVING MIXED INTEGER PROGRAMS USING NEURAL NETWORKS

      
Application Number 18267363
Status Pending
Filing Date 2021-12-20
First Publication Date 2024-02-22
Owner DeepMind Technologies Limited (USA)
Inventor
  • Bartunov, Sergey
  • Gimeno Gil, Felix Axel
  • Von Glehn, Ingrid Karin
  • Lichocki, Pawel
  • Lobov, Ivan
  • Nair, Vinod
  • O'Donoghue, Brendan Timothy
  • Sonnerat, Nicolas
  • Tjandraatmadja, Christian
  • Wang, Pengming

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for solving mixed integer programs (MIPs) using neural networks. One of the methods includes obtaining data specifying parameters of a MIP; generating, from the parameters of the MIP, an input representation; processing the input representation using an encoder neural network to generate a respective embedding for each of the integer variables; generating a plurality of partial assignments by selecting a respective second, proper subset of the integer variables; and for each of the variables in the respective second subset, generating, using at least the respective embedding for the variable, a respective additional constraint on the value of the variable; generating, for each of the partial assignments, a corresponding candidate final assignment that assigns a respective value to each of the plurality of variables; and selecting, as a final assignment for the MIP, one of the candidate final assignments.

IPC Classes

  • G06N 3/08 - Learning methods
  • G06F 17/11 - Complex mathematical operations for solving equations

53.

Selecting actions from large discrete action sets using reinforcement learning

      
Application Number 17131500
Grant Number 11907837
Status In Force
Filing Date 2020-12-22
First Publication Date 2024-02-20
Grant Date 2024-02-20
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Dulac-Arnold, Gabriel
  • Evans, Richard Andrew
  • Coppin, Benjamin Kenneth

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for selecting actions from large discrete action sets. One of the methods includes receiving a particular observation representing a particular state of an environment; and selecting an action from a discrete set of actions to be performed by an agent interacting with the environment, comprising: processing the particular observation using an actor policy network to generate an ideal point; determining, from the points that represent actions in the set, the k nearest points to the ideal point; for each nearest point of the k nearest points: processing the nearest point and the particular observation using a Q network to generate a respective Q value for the action represented by the nearest point; and selecting the action to be performed by the agent from the k actions represented by the k nearest points based on the Q values.
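The selection procedure in this abstract can be sketched as follows. This is an illustrative toy, not the patented system: the linear actor, the distance-based Q function, and the random action embeddings are all stand-ins for trained networks.

```python
import numpy as np

# Toy sketch: an actor proposes an "ideal point" in action-embedding space,
# the k nearest discrete action points are found, and a Q function re-ranks
# those k candidates. All components here are illustrative stand-ins.
rng = np.random.default_rng(0)
action_points = rng.normal(size=(1000, 4))   # embeddings of 1000 discrete actions

def actor_policy(observation):
    # Maps an observation to an ideal point in the action-embedding space.
    return np.tanh(observation[:4])

def q_network(point, observation):
    # Toy Q value for an action point given the observation.
    return -np.linalg.norm(point - observation[:4])

def select_action(observation, k=10):
    ideal = actor_policy(observation)
    dists = np.linalg.norm(action_points - ideal, axis=1)
    nearest = np.argsort(dists)[:k]          # k nearest points to the ideal point
    q_values = [q_network(action_points[i], observation) for i in nearest]
    return int(nearest[int(np.argmax(q_values))])

obs = rng.normal(size=8)
action_index = select_action(obs)
```

The re-ranking step is what makes the approach robust: the nearest neighbor of the ideal point is not necessarily the highest-value action, so the Q network arbitrates among the k candidates.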

54.

AUTOMATED DISCOVERY OF AGENTS IN SYSTEMS

      
Application Number EP2023071987
Publication Number 2024/033387
Status In Force
Filing Date 2023-08-08
Publication Date 2024-02-15
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Jebreel, Zachary Alex Kenton
  • Kumar, Ramana
  • Richens, Jonathan George
  • Everitt, Tom Åke Helmer
  • Farquhar, Aiken Sebastian
  • Macdermott, Matthew Joseph Tilley

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for identifying agents in a system. According to one aspect, a method comprises: generating data defining a causal model of the system, comprising transmitting instructions to cause a plurality of interventions to be applied to the system, wherein each intervention modifies one or more variable elements in the system; processing the model of the system to identify one or more of the variable elements in the system as being decision elements, wherein each decision element represents an action selected by a respective agent in the system; identifying one or more agents in the system based on the decision elements; and outputting data that identifies the agents in the system.

IPC Classes

  • G06N 3/092 - Reinforcement learning
  • G06N 5/045 - Explanation of inference; Explainable artificial intelligence [XAI]; Interpretable artificial intelligence
  • G06N 3/042 - Knowledge-based neural networks; Logical representations of neural networks

55.

GRAPH NEURAL NETWORK SYSTEMS FOR GENERATING STRUCTURED REPRESENTATIONS OF OBJECTS

      
Application Number 18144810
Status Pending
Filing Date 2023-05-08
First Publication Date 2024-02-15
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Li, Yujia
  • Dyer, Christopher James
  • Vinyals, Oriol

Abstract

There is described a neural network system for generating a graph, the graph comprising a set of nodes and edges. The system comprises one or more neural networks configured to represent a probability distribution over sequences of node generating decisions and/or edge generating decisions, and one or more computers configured to sample the probability distribution represented by the one or more neural networks to generate a graph.

IPC Classes

  • G06N 3/047 - Probabilistic or stochastic networks
  • G06F 16/901 - Indexing; Data structures therefor; Storage structures
  • G06F 17/18 - Complex mathematical operations for evaluating statistical data
  • G06N 3/08 - Learning methods
  • G06N 3/045 - Combinations of networks

56.

FINDING A STATIONARY POINT OF A LOSS FUNCTION BY AN ITERATIVE ALGORITHM USING A VARIABLE LEARNING RATE VALUE

      
Application Number EP2023072108
Publication Number 2024/033445
Status In Force
Filing Date 2023-08-09
Publication Date 2024-02-15
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Rosca, Mihaela
  • Dherin, Benoit Richard Umbert
  • Wu, Yan
  • Qin, Chongli

Abstract

A computer-implemented method for determining, for a loss function which is a function of a parameter vector comprising a plurality of parameters, values for the parameters for which the parameter vector is a stationary point of the loss function. The method comprises determining initial values for the parameters; and repeatedly updating the parameters by: (a) determining at least one drift value indicative of discretization drift for a discrete update to the parameters based on the loss function; (b) determining at least one learning rate value by evaluating a learning rate function based on, and having an inverse relationship with, the at least one drift value; (c) determining respective updates to the parameters based upon a product of the at least one learning rate value and a gradient of the loss function with respect to the respective parameter for current values of the parameters; and (d) updating the parameters based upon the determined respective updates.
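The inverse relationship between drift and learning rate can be illustrated with a toy one-dimensional example. The quartic loss and the use of the gradient magnitude as a drift proxy are assumptions for illustration only; the claimed method determines an actual discretization-drift value.

```python
# Toy sketch: gradient descent on f(x) = x**4 where the learning rate
# shrinks as a drift proxy grows. Here |gradient| stands in for the
# discretization-drift estimate (an illustrative assumption).
def grad(x):
    return 4.0 * x ** 3

x = 2.0
base_lr = 0.05
for _ in range(500):
    g = grad(x)
    drift = abs(g)                   # proxy for discretization drift
    lr = base_lr / (1.0 + drift)     # learning rate inversely related to drift
    x -= lr * g                      # update parameter with product lr * gradient

# x has moved toward the stationary point at x = 0
```

When the drift proxy is large the effective step is damped, which stabilizes the early, steep part of the trajectory; as the iterate approaches the stationary point the learning rate recovers toward its base value.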

57.

CONTROLLING AGENTS USING AUXILIARY PREDICTION NEURAL NETWORKS THAT GENERATE STATE VALUE ESTIMATES

      
Application Number 18230056
Status Pending
Filing Date 2023-08-03
First Publication Date 2024-02-08
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Zaheer, Muhammad
  • Modayil, Joseph Varughese

Abstract

Method, system, and non-transitory computer storage media for selecting actions to be performed by an agent to interact with an environment to perform a main task by for each time step in a sequence of time steps: receiving a set of features representing an observation; for each of one or more auxiliary prediction neural networks, generating a state value estimate for the current state of the environment relative to a corresponding auxiliary reward that measures values of a corresponding target feature from the set of features representing the observations for the sequence of time steps; processing an input comprising a respective intermediate output generated by each auxiliary neural network at the time step using an action selection neural network to generate an action selection output; and selecting the action to be performed by the agent at the time step using the action selection output.

58.

JOINTLY UPDATING AGENT CONTROL POLICIES USING ESTIMATED BEST RESPONSES TO CURRENT CONTROL POLICIES

      
Application Number 18275881
Status Pending
Filing Date 2022-02-07
First Publication Date 2024-02-08
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Marris, Luke Christopher
  • Muller, Paul Fernand Michel
  • Lanctot, Marc
  • Graepel, Thore Kurt Hartwig

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating control policies for controlling agents in an environment. One of the methods includes, at each of a plurality of iterations: obtaining a current joint control policy for a plurality of agents, the current joint control policy specifying a respective current control policy for each agent; and updating the current joint control policy, comprising, for each agent: generating a respective reward estimate for each of a plurality of alternate control policies that is an estimate of a reward received by the agent if the agent is controlled using the alternate control policy while the other agents are controlled using the respective current control policies; computing a best response for the agent from the respective reward estimates; and updating the respective current control policy for the agent using the best response for the agent.

59.

AUGMENTING ATTENTION-BASED NEURAL NETWORKS TO SELECTIVELY ATTEND TO PAST INPUTS

      
Application Number 18486060
Status Pending
Filing Date 2023-10-12
First Publication Date 2024-02-08
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Rae, Jack William
  • Potapenko, Anna
  • Lillicrap, Timothy Paul

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing a machine learning task on a network input that is a sequence to generate a network output. In one aspect, one of the methods includes, for each particular sequence of layer inputs: for each attention layer in the neural network: maintaining episodic memory data; maintaining compressed memory data; receiving a layer input to be processed by the attention layer; and applying an attention mechanism over (i) the compressed representation in the compressed memory data for the layer, (ii) the hidden states in the episodic memory data for the layer, and (iii) the respective hidden state at each of the plurality of input positions in the particular network input to generate a respective activation for each input position in the layer input.

IPC Classes

  • G06N 3/084 - Backpropagation, e.g. using gradient descent
  • G06N 3/08 - Learning methods
  • G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
  • G06N 3/047 - Probabilistic or stochastic networks

60.

MULTI-TASK NEURAL NETWORKS WITH TASK-SPECIFIC PATHS

      
Application Number 18487707
Status Pending
Filing Date 2023-10-16
First Publication Date 2024-02-08
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Wierstra, Daniel Pieter
  • Fernando, Chrisantha Thomas
  • Pritzel, Alexander
  • Banarse, Dylan Sunil
  • Blundell, Charles
  • Rusu, Andrei-Alexandru
  • Zwols, Yori
  • Ha, David

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using multi-task neural networks. One of the methods includes receiving a first network input and data identifying a first machine learning task to be performed on the first network input; selecting a path through the plurality of layers in a super neural network that is specific to the first machine learning task, the path specifying, for each of the layers, a proper subset of the modular neural networks in the layer that are designated as active when performing the first machine learning task; and causing the super neural network to process the first network input using (i) for each layer, the modular neural networks in the layer that are designated as active by the selected path and (ii) the set of one or more output layers corresponding to the identified first machine learning task.
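The routing idea in this abstract can be sketched with a tiny super network. The layer sizes, the random module weights, and the particular path below are illustrative assumptions, not the patented architecture or its path-selection procedure.

```python
import numpy as np

# Toy sketch of a super network routed by a task-specific path: each layer
# holds several modular sub-networks, and the path names the proper subset
# of modules that is active for the task at hand.
rng = np.random.default_rng(2)
num_layers, modules_per_layer, width = 3, 4, 5
modules = rng.normal(size=(num_layers, modules_per_layer, width, width)) * 0.1

path = [(0, 2), (1,), (0, 3)]        # active module indices per layer
x = np.ones(width)                   # network input
for layer, active in enumerate(path):
    # Only the modules designated active by the path contribute in this layer.
    x = np.tanh(sum(modules[layer, m] @ x for m in active))
# x would then be passed to the task-specific output layer(s)
```

A different task would select a different path through the same shared modules, which is what lets the super network serve multiple tasks without interference in the inactive modules.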

IPC Classes

  • G06N 3/086 - Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming
  • G06N 3/04 - Architecture, e.g. interconnection topology
  • G06N 3/044 - Recurrent networks, e.g. Hopfield networks
  • G06N 3/045 - Combinations of networks

61.

DATA-DRIVEN ROBOT CONTROL

      
Application Number 18331632
Status Pending
Filing Date 2023-06-08
First Publication Date 2024-02-08
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Cabi, Serkan
  • Wang, Ziyu
  • Novikov, Alexander
  • Konyushkova, Ksenia
  • Gomez Colmenarejo, Sergio
  • Reed, Scott Ellison
  • Denil, Misha Man Ray
  • Scholz, Jonathan Karl
  • Sushkov, Oleg O.
  • Jeong, Rae Chan
  • Barker, David
  • Budden, David
  • Vecerik, Mel
  • Aytar, Yusuf
  • Gomes De Freitas, Joao Ferdinando

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for data-driven robotic control. One of the methods includes maintaining robot experience data; obtaining annotation data; training, on the annotation data, a reward model; generating task-specific training data for the particular task, comprising, for each experience in a second subset of the experiences in the robot experience data: processing the observation in the experience using the trained reward model to generate a reward prediction, and associating the reward prediction with the experience; and training a policy neural network on the task-specific training data for the particular task, wherein the policy neural network is configured to receive a network input comprising an observation and to generate a policy output that defines a control policy for a robot performing the particular task.

62.

Reinforcement learning with scheduled auxiliary control

      
Application Number 16289531
Grant Number 11893480
Status In Force
Filing Date 2019-02-28
First Publication Date 2024-02-06
Grant Date 2024-02-06
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Riedmiller, Martin
  • Hafner, Roland

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for reinforcement learning with scheduled auxiliary tasks. In one aspect, a method includes maintaining data specifying parameter values for a primary policy neural network and one or more auxiliary neural networks; at each of a plurality of selection time steps during a training episode comprising a plurality of time steps: receiving an observation, selecting a current task for the selection time step using a task scheduling policy, processing an input comprising the observation using the policy neural network corresponding to the selected current task to select an action to be performed by the agent in response to the observation, and causing the agent to perform the selected action.

IPC Classes

  • G06N 3/08 - Learning methods
  • G06N 3/04 - Architecture, e.g. interconnection topology
  • G06N 7/01 - Probabilistic graphical models, e.g. probabilistic networks

63.

JOINTLY LEARNING EXPLORATORY AND NON-EXPLORATORY ACTION SELECTION POLICIES

      
Application Number 18334112
Status Pending
Filing Date 2023-06-13
First Publication Date 2024-01-25
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Badia, Adrià Puigdomènech
  • Sprechmann, Pablo
  • Vitvitskyi, Alex
  • Guo, Zhaohan
  • Piot, Bilal
  • Kapturowski, Steven James
  • Tieleman, Olivier
  • Blundell, Charles

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network that is used to select actions to be performed by an agent interacting with an environment. In one aspect, the method comprises: receiving an observation characterizing a current state of the environment; processing the observation and an exploration importance factor using the action selection neural network to generate an action selection output; selecting an action to be performed by the agent using the action selection output; determining an exploration reward; determining an overall reward based on: (i) the exploration importance factor, and (ii) the exploration reward; and training the action selection neural network using a reinforcement learning technique based on the overall reward.
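One way to realize the overall-reward computation in this abstract is an importance-weighted sum. The additive form below is an assumed sketch; the abstract only states that the overall reward is determined from the exploration importance factor and the exploration reward.

```python
# Sketch (assumed additive form): the overall reward depends on both the
# exploration importance factor and the exploration reward.
def overall_reward(extrinsic_reward, exploration_reward, exploration_importance):
    return extrinsic_reward + exploration_importance * exploration_reward

# A factor of 0 recovers a purely non-exploratory objective; larger factors
# weight novelty-seeking behavior more heavily.
r_exploit = overall_reward(1.0, 0.5, 0.0)   # extrinsic only
r_explore = overall_reward(1.0, 0.5, 2.0)   # exploration-weighted
```

Conditioning the action selection network on the factor itself, as the abstract describes, lets a single network jointly represent the whole family of policies from fully exploratory to fully exploitative.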

IPC Classes

  • G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
  • G06N 3/04 - Architecture, e.g. interconnection topology
  • G06N 3/084 - Backpropagation, e.g. using gradient descent
  • G06F 18/22 - Matching criteria, e.g. proximity measures
  • G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
  • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

64.

ACTION CLASSIFICATION IN VIDEO CLIPS USING ATTENTION-BASED NEURAL NETWORKS

      
Application Number 18375941
Status Pending
Filing Date 2023-10-02
First Publication Date 2024-01-25
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Carreira, Joao
  • Doersch, Carl
  • Zisserman, Andrew

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying actions in a video. One of the methods includes obtaining a feature representation of a video clip; obtaining data specifying a plurality of candidate agent bounding boxes in a key video frame; and, for each candidate agent bounding box, processing the feature representation through an action transformer neural network.

IPC Classes

  • G06V 20/40 - Scenes; Scene-specific elements in video content
  • G06V 10/25 - Determination of region of interest [ROI] or a volume of interest [VOI]
  • G06N 3/045 - Combinations of networks

65.

OPTIMIZING ALGORITHMS FOR TARGET PROCESSORS USING REPRESENTATION NEURAL NETWORKS

      
Application Number EP2023070308
Publication Number 2024/018065
Status In Force
Filing Date 2023-07-21
Publication Date 2024-01-25
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Michi, Andrea
  • Mankowitz, Daniel J.
  • Zhernov, Anton
  • Gelmi, Marco Oreste
  • Selvi, Marco
  • Paduraru, Cosmin
  • Leurent, Edouard
  • Mandhane, Amol Balkishan
  • Iqbal, Shariq Nadeem
  • Silver, David
  • Riedmiller, Martin
  • Kohli, Pushmeet
  • Vinyals, Oriol

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for optimizing a target algorithm using a state representation neural network.

66.

NEURAL NETWORKS IMPLEMENTING ATTENTION OVER OBJECT EMBEDDINGS FOR OBJECT-CENTRIC VISUAL REASONING

      
Application Number 18029980
Status Pending
Filing Date 2021-10-01
First Publication Date 2024-01-18
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Ding, Fengning
  • Santoro, Adam Anthony
  • Hill, Felix George
  • Botvinick, Matthew
  • Piloto, Luis

Abstract

A video processing system configured to analyze a sequence of video frames to detect objects in the video frames and provide information relating to the detected objects in response to a query. The query may comprise, for example, a request for a prediction of a future event, or of the location of an object, or a request for a prediction of what would happen if an object were modified. The system uses a transformer neural network subsystem to process representations of objects in the video.

IPC Classes

  • G06V 20/40 - Scenes; Scene-specific elements in video content
  • G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
  • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
  • G06V 10/776 - Validation; Performance evaluation

67.

Selecting reinforcement learning actions using a low-level controller

      
Application Number 17541186
Grant Number 11875258
Status In Force
Filing Date 2021-12-02
First Publication Date 2024-01-16
Grant Date 2024-01-16
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Heess, Nicolas Manfred Otto
  • Lillicrap, Timothy Paul
  • Wayne, Gregory Duncan
  • Tassa, Yuval

Abstract

Methods, systems, and apparatus for selecting actions to be performed by an agent interacting with an environment. One system includes a high-level controller neural network, low-level controller network, and subsystem. The high-level controller neural network receives an input observation and processes the input observation to generate a high-level output defining a control signal for the low-level controller. The low-level controller neural network receives a designated component of an input observation and processes the designated component and an input control signal to generate a low-level output that defines an action to be performed by the agent in response to the input observation. The subsystem receives a current observation characterizing a current state of the environment, determines whether criteria are satisfied for generating a new control signal, and based on the determination, provides appropriate inputs to the high-level and low-level controllers for selecting an action to be performed by the agent.

IPC Classes

  • G06N 3/08 - Learning methods
  • G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
  • G06N 3/044 - Recurrent networks, e.g. Hopfield networks
  • G06N 3/045 - Combinations of networks

68.

VOCABULARY SELECTION FOR TEXT PROCESSING TASKS USING POWER INDICES

      
Application Number 18038631
Status Pending
Filing Date 2021-11-22
First Publication Date 2024-01-11
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Gemp, Ian Michael
  • Bachrach, Yoram
  • Patel, Roma
  • Dyer, Christopher James

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for selecting an input vocabulary for a machine learning model using power indices. One of the methods includes computing a respective score for each of a plurality of text tokens in an initial vocabulary and then selecting the text tokens in the input vocabulary based on the respective scores.

IPC Classes

  • G10L 13/047 - Architecture of speech synthesisers
  • G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination

69.

MODEL-FREE REINFORCEMENT LEARNING WITH REGULARIZED NASH DYNAMICS

      
Application Number EP2023067491
Publication Number 2024/003058
Status In Force
Filing Date 2023-06-27
Publication Date 2024-01-04
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Perolat, Julien
  • De Vylder, Bart
  • Tuyls, Karl Paul

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a policy neural network that is used to control an agent. In particular, the policy neural network can be trained through model-free reinforcement learning with regularized Nash dynamics.

70.

RECURRENT NEURAL NETWORKS FOR DATA ITEM GENERATION

      
Application Number 18367305
Status Pending
Filing Date 2023-09-12
First Publication Date 2023-12-28
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Gregor, Karol
  • Danihelka, Ivo

Abstract

Methods and systems, including computer programs encoded on computer storage media, for generating data items. A method includes reading a glimpse from a data item using a decoder hidden state vector of a decoder for a preceding time step, providing, as input to an encoder, the glimpse and decoder hidden state vector for the preceding time step for processing, receiving, as output from the encoder, a generated encoder hidden state vector for the time step, generating a decoder input from the generated encoder hidden state vector, providing the decoder input to the decoder for processing, receiving, as output from the decoder, a generated decoder hidden state vector for the time step, generating a neural network output update from the decoder hidden state vector for the time step, and combining the neural network output update with a current neural network output to generate an updated neural network output.
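The per-time-step loop in this abstract can be traced with a toy implementation. The random linear maps below stand in for the trained recurrent encoder and decoder, which is an illustrative assumption.

```python
import numpy as np

# Toy trace of the read -> encode -> decode -> update loop: at each time
# step a glimpse is read from the data item using the previous decoder
# state, encoded, decoded, and the decoder state drives an additive update
# to the current output. Weights are random stand-ins for trained RNNs.
rng = np.random.default_rng(1)
D, H = 16, 8                                  # data-item and hidden sizes
W_read = rng.normal(size=(H, D + H)) * 0.1
W_enc = rng.normal(size=(H, 2 * H)) * 0.1
W_dec = rng.normal(size=(H, H)) * 0.1
W_out = rng.normal(size=(D, H)) * 0.1

data_item = rng.normal(size=D)
output = np.zeros(D)                          # current neural network output
h_dec = np.zeros(H)                           # decoder hidden state

for _ in range(10):                           # time steps
    # read a glimpse using the decoder state from the preceding time step
    glimpse = np.tanh(W_read @ np.concatenate([data_item, h_dec]))
    # encoder consumes the glimpse and the previous decoder state
    h_enc = np.tanh(W_enc @ np.concatenate([glimpse, h_dec]))
    # decoder input is generated from the encoder hidden state
    h_dec = np.tanh(W_dec @ h_enc)
    # combine the output update with the current output
    output += W_out @ h_dec
```

The additive combination means the output is built up incrementally across time steps rather than produced in a single pass.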

IPC Classes

  • G06N 3/04 - Architecture, e.g. interconnection topology
  • G06N 3/044 - Recurrent networks, e.g. Hopfield networks
  • G06N 3/045 - Combinations of networks

71.

SIMULATING INDUSTRIAL FACILITIES FOR CONTROL

      
Application Number EP2023067148
Publication Number 2023/247767
Status In Force
Filing Date 2023-06-23
Publication Date 2023-12-28
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Dutta, Praneet
  • Chervonyi, Iurii
  • Voicu, Octavian
  • Luo, Jerry Jiayu
  • Trochim, Piotr

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for simulating industrial facilities for control. One of the methods includes, at each of a plurality of time steps during a task episode: receiving, from a computer simulator of an industrial facility, measurements representing a current state of the facility; generating, from the measurements, an observation; providing the observation as input to a control policy for controlling the facility; receiving, as output, an action for controlling one or more setpoints of the facility; generating, from the action, one or more control inputs for the one or more setpoints of the facility; and providing, as input to the simulator, (i) the control inputs and (ii) current values for one or more configuration parameters of the simulator to cause the simulator to generate, as output, new measurements representing a new state of the facility.
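The control cycle in this abstract can be sketched as a simple loop. The first-order `Simulator`, the constant-setpoint `control_policy`, and the `gain` configuration parameter below are all hypothetical stand-ins for illustration.

```python
# Minimal sketch of the simulator-in-the-loop cycle: measurements ->
# observation -> policy -> setpoint action -> control inputs -> simulator.
# All names and dynamics here are hypothetical stand-ins.
class Simulator:
    def __init__(self):
        self.measurement = 20.0               # e.g. a temperature reading

    def step(self, control_inputs, config):
        # first-order response toward the commanded setpoint
        target = control_inputs["setpoint"]
        self.measurement += config["gain"] * (target - self.measurement)
        return self.measurement               # new state of the facility

def control_policy(observation):
    # action for controlling one setpoint of the facility
    return {"setpoint": 18.0}

sim = Simulator()
config = {"gain": 0.5}                        # simulator configuration parameter
for _ in range(20):                           # time steps in a task episode
    observation = sim.measurement             # observation from measurements
    action = control_policy(observation)
    sim.step(action, config)                  # simulator emits new measurements
```

Passing the configuration parameters to the simulator at every step, as the abstract specifies, allows the simulated facility's dynamics to be varied over the course of an episode.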

IPC Classes

  • G05B 13/02 - Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
  • G05B 19/418 - Total factory control, i.e. centrally controlling a plurality of machines, e.g. direct or distributed numerical control (DNC), flexible manufacturing systems (FMS), integrated manufacturing systems (IMS), computer integrated manufacturing (CIM)

72.

PREDICTING PROTEIN STRUCTURES USING PROTEIN GRAPHS

      
Application Number 18034989
Status Pending
Filing Date 2021-11-23
First Publication Date 2023-12-21
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Pritzel, Alexander
  • Figurnov, Mikhail
  • Jumper, John

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining a predicted structure of a protein. According to one aspect, there is provided a method comprising maintaining graph data representing a graph of the protein; obtaining a respective pair embedding for each edge in the graph; processing the pair embeddings using a sequence of update blocks, wherein each update block performs operations comprising, for each edge in the graph: generating a respective representation of each of a plurality of cycles in the graph that include the edge by, for each cycle, processing embeddings for edges in the cycle in accordance with the values of the update block parameters of the update block to generate the representation of the cycle; and updating the pair embedding for the edge using the representations of the cycles in the graph that include the edge.

73.

Simulating Physical Environments with Discontinuous Dynamics Using Graph Neural Networks

      
Application Number EP2023066187
Publication Number 2023/242378
Status In Force
Filing Date 2023-06-15
Publication Date 2023-12-21
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Allen, Kelsey Rebecca
  • Lopez Guevara, Tatiana
  • Pfaff, Tobias
  • Sanchez, Alvaro
  • Rubanova, Yulia
  • Stachenfeld, Kimberly
  • Battaglia, Peter William

Abstract

This specification describes a simulation system that performs simulations of physical environments using a graph neural network. At each of one or more time steps in a sequence of time steps in a given time interval, the system can process a representation of a current state of the physical environment at the current time step using the graph neural network to generate a prediction of a next state of the physical environment at the next time step. Generally, the environment has discontinuous dynamics at one or more time points during the time interval.

IPC Classes

  • G06F 30/20 - Design optimisation, verification or simulation
  • G06F 30/27 - Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
  • G06F 119/12 - Timing analysis or timing optimisation

74.

Distributional reinforcement learning for continuous control tasks

      
Application Number 18303117
Grant Number 11948085
Status In Force
Filing Date 2023-04-19
First Publication Date 2023-12-21
Grant Date 2024-04-02
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Budden, David
  • Hoffman, Matthew William
  • Barth-Maron, Gabriel

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network that is used to select actions to be performed by a reinforcement learning agent interacting with an environment. In particular, the actions are selected from a continuous action space and the system trains the action selection neural network jointly with a distribution Q network that is used to update the parameters of the action selection neural network.

75.

TRAINING CAMERA POLICY NEURAL NETWORKS THROUGH SELF PREDICTION

      
Application Number EP2023066186
Publication Number 2023/242377
Status In Force
Filing Date 2023-06-15
Publication Date 2023-12-21
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Grimes, Matthew Koichi
  • Mirowski, Piotr Wojciech
  • Modayil, Joseph Varughese

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a camera policy neural network.

IPC Classes

  • G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
  • G06N 3/045 - Combinations of networks
  • G06N 3/0895 - Weakly supervised learning, e.g. semi-supervised or self-supervised learning
  • G06N 3/092 - Reinforcement learning

76.

PREDICTING PROTEIN STRUCTURES OVER MULTIPLE ITERATIONS USING RECYCLING

      
Application Number 18034280
Status Pending
Filing Date 2021-11-23
First Publication Date 2023-12-14
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Jumper, John
  • Figurnov, Mikhail

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for predicting a structure of a protein comprising one or more chains. In one aspect, a method comprises, at each subsequent iteration after a first iteration in a sequence of iterations: obtaining a network input for the subsequent iteration that characterizes the protein; generating, from (i) structure parameters generated at a preceding iteration that precedes the subsequent iteration in the sequence, (ii) one or more intermediate outputs generated by the protein structure prediction neural network while generating the structure parameters at the preceding iteration, or (iii) both, features for the subsequent iteration; and processing the features and the network input for the subsequent iteration using the protein structure prediction neural network to generate structure parameters for the subsequent iteration that define another predicted structure for the protein.

IPC Classes

77.

HIERARCHICAL REINFORCEMENT LEARNING AT SCALE

      
Application Number EP2023065305
Publication Number 2023/237635
Status In Force
Filing Date 2023-06-07
Publication Date 2023-12-14
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Soyer, Hubert Josef
  • Behbahani, Feryal
  • Keck, Thomas Albert
  • Nikiforou, Kyriacos
  • Pires, Bernardo Avila
  • Baveja, Satinder Singh

Abstract

The invention describes a system and a method for controlling an agent interacting with an environment to perform a task, the method comprising, at each of a plurality of first time steps from a plurality of time steps: receiving an observation characterizing a state of the environment at the first time step; determining a goal representation for the first time step that characterizes a goal state of the environment to be reached by the agent; processing the observation and the goal representation using a low-level controller neural network to generate a low-level policy output that defines an action to be performed by the agent in response to the observation, wherein the low-level controller neural network comprises: a representation neural network configured to process the observation to generate an internal state representation of the observation, and a low-level policy head configured to process the internal state representation and the goal representation to generate the low-level policy output; and controlling the agent using the low-level policy output.
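A minimal sketch of the two-part low-level controller described above: a representation function encodes the observation, and a policy head combines that encoding with the goal representation to pick an action. All function bodies are toy stand-ins for neural networks, not the patented method:

```python
def represent(observation):
    # stand-in for the representation neural network
    return [o * 0.5 for o in observation]

def low_level_policy_head(state_repr, goal_repr, num_actions=3):
    # stand-in policy head: score the state/goal pair, map score to an action
    score = sum(s * g for s, g in zip(state_repr, goal_repr))
    return int(abs(score)) % num_actions

def select_action(observation, goal_repr):
    # the full low-level controller: representation net + policy head
    return low_level_policy_head(represent(observation), goal_repr)
```

The design point is the factorization: the representation network is goal-agnostic, so a higher-level controller can swap goals without re-encoding observations.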

IPC Classes

  • G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
  • G06N 3/045 - Combinations of networks
  • G06N 3/092 - Reinforcement learning
  • G06N 7/01 - Probabilistic graphical models, e.g. probabilistic networks
  • G06N 3/044 - Recurrent networks, e.g. Hopfield networks

78.

REINFORCEMENT LEARNING TO EXPLORE ENVIRONMENTS USING META POLICIES

      
Application Number EP2023065306
Publication Number 2023/237636
Status In Force
Filing Date 2023-06-07
Publication Date 2023-12-14
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Zintgraf, Luisa Maria
  • Magalhaes Marinho, Zita Alexandra
  • Kemaev, Iurii
  • Kirsch, Louis Michel
  • Oh, Junhyuk
  • Schaul, Tom

Abstract

The invention describes a method, performed by one or more computers, for training a base policy neural network that is configured to receive a base policy input comprising an observation of a state of an environment and to process the base policy input to generate a base policy output that defines an action to be performed by an agent in response to the observation, the method comprising: generating training data for training the base policy neural network by controlling an agent using (i) the base policy neural network and (ii) an exploration strategy that maps, in accordance with a set of one or more parameters, base policy outputs generated by the base policy neural network to actions performed by the agent to interact with an environment, the generating comprising, at each of a plurality of time points: determining that criteria for updating the exploration strategy are satisfied at the time point; and in response to determining that the criteria are satisfied: generating a meta policy input that comprises data characterizing a performance of the base policy neural network in controlling the agent at the time point; processing the meta policy input using a meta policy to generate a meta policy output that specifies respective values for each of the set of one or more parameters that define the exploration strategy; and controlling the agent using the base policy neural network and in accordance with the exploration strategy defined by the respective values for the set of one or more parameters specified by the meta policy output.
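An illustrative sketch of the pattern the abstract describes, with epsilon-greedy standing in for the exploration strategy and a hand-written heuristic standing in for the meta policy (both are assumptions, not the patented components):

```python
import random

def meta_policy(recent_mean_return):
    # toy meta policy: explore more when recent performance is poor
    return 0.5 if recent_mean_return < 1.0 else 0.05

def act(base_action, epsilon, num_actions, rng):
    # exploration strategy: with probability epsilon, override the base policy
    if rng.random() < epsilon:
        return rng.randrange(num_actions)
    return base_action

rng = random.Random(0)
# criteria satisfied -> meta policy resets the exploration parameter
epsilon = meta_policy(recent_mean_return=0.2)
actions = [act(base_action=1, epsilon=epsilon, num_actions=4, rng=rng)
           for _ in range(100)]
```

The key idea is the two-level loop: the base policy chooses actions, while the meta policy periodically re-parameterizes how those choices are perturbed for exploration.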

IPC Classes

  • G06N 3/008 - Artificial life, i.e. computing arrangements simulating life based on physical entities controlled by simulated intelligence so as to replicate intelligent life forms, e.g. based on robots replicating pets or humans in their appearance or behaviour
  • G06N 3/045 - Combinations of networks
  • G06N 3/092 - Reinforcement learning
  • G06N 3/0985 - Hyperparameter optimisation; Meta-learning; Learning-to-learn

79.

TRAINING A SPEAKER NEURAL NETWORK USING ONE OR MORE LISTENER NEURAL NETWORKS

      
Application Number 18199896
Status Pending
Filing Date 2023-05-19
First Publication Date 2023-12-14
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Singh, Aaditya K.
  • Ding, Fengning
  • Hill, Felix George
  • Lampinen, Andrew Kyle

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a speaker neural network using one or more listener neural networks.

IPC Classes

  • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
  • G06V 20/62 - Text, e.g. of license plates, overlay texts or captions on TV images

80.

Learning abstractions using patterns of activations of a neural network hidden layer

      
Application Number 16916939
Grant Number 11842270
Status In Force
Filing Date 2020-06-30
First Publication Date 2023-12-12
Grant Date 2023-12-12
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Lerchner, Alexander
  • Hassabis, Demis

Abstract

We describe an artificial neural network comprising: an input layer of input neurons, one or more hidden layers of neurons in successive layers of neurons above the input layer, and at least one further, concept-identifying layer of neurons above the hidden layers. The neural network includes an activation memory coupled to an intermediate, hidden layer of neurons between the input and concept-identifying layers to store a pattern of activation of the intermediate layer. The neural network further includes a system to determine an overlap between a plurality of the stored patterns of activation and to activate in the intermediate hidden layer an overlap pattern such that the concept-identifying layer of neurons is configured to identify features of the overlap patterns. We also describe related methods, processor control code, and computing systems for the neural network. Optionally, further, higher-level concept-identifying layers of neurons may be included.
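A toy sketch of the "activation memory" idea, under the assumption that activation patterns are binarized: stored hidden-layer patterns are intersected, and the units active in every pattern form the overlap that the concept-identifying layer would receive:

```python
def overlap(patterns):
    # element-wise AND across all stored binary activation patterns
    result = patterns[0]
    for p in patterns[1:]:
        result = [a & b for a, b in zip(result, p)]
    return result

stored = [
    [1, 1, 0, 1],  # hidden-layer activation pattern from input A
    [1, 0, 0, 1],  # hidden-layer activation pattern from input B
    [1, 1, 0, 1],  # hidden-layer activation pattern from input C
]
shared = overlap(stored)  # units common to all patterns
```

Intersecting activations in this way keeps only features shared by the stored examples, which is one plausible reading of how an "abstraction" is formed from the overlap.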

IPC Classes

81.

PREDICTING PROTEIN STRUCTURES USING AUXILIARY FOLDING NETWORKS

      
Application Number 18034006
Status Pending
Filing Date 2021-11-23
First Publication Date 2023-12-07
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Kohl, Simon
  • Ronneberger, Olaf
  • Figurnov, Mikhail
  • Pritzel, Alexander

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a structure prediction neural network that comprises an embedding neural network and a main folding neural network. According to one aspect, a method comprises: obtaining a training network input characterizing a training protein; processing the training network input using the embedding neural network and the main folding neural network to generate a main structure prediction; for each auxiliary folding neural network in a set of one or more auxiliary folding neural networks, processing at least a corresponding intermediate output of the embedding neural network to generate an auxiliary structure prediction; determining a gradient of an objective function that includes a respective auxiliary structure loss term for each of the auxiliary folding neural networks; and updating the current values of the embedding network parameters and the main folding parameters based on the gradient.
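A minimal sketch of the objective described above: the main structure loss is combined with one auxiliary structure loss per auxiliary folding head, each head reading an intermediate output of the embedding network. The weighting scheme here is an assumption:

```python
def objective(main_loss, aux_losses, aux_weight=0.25):
    # total loss = main structure loss + weighted sum of auxiliary
    # structure losses, one per auxiliary folding neural network
    return main_loss + aux_weight * sum(aux_losses)

# e.g. the main prediction loss plus losses from two auxiliary heads
loss = objective(main_loss=1.0, aux_losses=[0.8, 0.4])
```

Training against the auxiliary terms pushes intermediate embeddings to already carry structural information, rather than deferring everything to the main folding network.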

IPC Classes

82.

SIMULATING PHYSICAL ENVIRONMENTS USING FINE-RESOLUTION AND COARSE-RESOLUTION MESHES

      
Application Number EP2023063755
Publication Number 2023/227586
Status In Force
Filing Date 2023-05-23
Publication Date 2023-11-30
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Fortunato, Meire
  • Pfaff, Tobias
  • Wirnsberger, Peter
  • Pritzel, Alexander
  • Battaglia, Peter William

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for simulating a state of a physical environment. In one aspect, a method performed by one or more computers for simulating the state of the physical environment is provided. The method includes, for each of multiple time steps: obtaining data defining a fine-resolution mesh and a coarse-resolution mesh that each characterize the state of the physical environment at the current time step, where the fine-resolution mesh has a higher resolution than the coarse-resolution mesh; processing data defining the fine-resolution mesh and the coarse-resolution mesh using a graph neural network that includes: (i) one or more fine-resolution update blocks, (ii) one or more coarse-resolution update blocks, and (iii) one or more up-sampling update blocks; and determining the state of the physical environment at a next time step using updated node embeddings for nodes in the fine-resolution mesh.

IPC Classes

  • G06F 30/23 - Design optimisation, verification or simulation using finite element methods [FEM] or finite difference methods [FDM]
  • G06F 30/27 - Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
  • G06N 3/02 - Neural networks
  • G06T 17/20 - Wire-frame description, e.g. polygonalisation or tessellation
  • G06F 111/10 - Numerical modelling
  • G06F 113/08 - Fluids

83.

EXPLORATION BY BOOTSTRAPPED PREDICTION

      
Application Number EP2023063282
Publication Number 2023/222772
Status In Force
Filing Date 2023-05-17
Publication Date 2023-11-23
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Guo, Zhaohan
  • Altché, Florent
  • Tallec, Corentin
  • Pires, Bernardo Avila
  • Pîslar, Miruna
  • Thakoor, Shantanu Yogeshraj
  • Azar, Mohammad Gheshlaghi
  • Piot, Bilal

Abstract

An iterative method is proposed to train an action selection system of a reinforcement learning system, based on a reward function which defines a reward value for each action. The reward value includes an intrinsic reward term generated based on the outputs of two encoder models: an online encoder model and a target encoder model. The online encoder model is iteratively trained based on a loss function, and the target encoder model is updated to bring it closer to the online encoder model.
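An illustrative sketch of the two-encoder scheme in the abstract: the target encoder tracks the online encoder via an exponential moving average, and the intrinsic reward is the online model's error at matching the target embedding. The linear "encoders" and the specific update rule are stand-ins, not the patented system:

```python
def encode(weights, obs):
    # toy linear "encoder" standing in for a neural network
    return [w * obs for w in weights]

def intrinsic_reward(online_w, target_w, obs):
    # intrinsic reward term: squared error between online and target embeddings
    pred = encode(online_w, obs)
    targ = encode(target_w, obs)
    return sum((p - t) ** 2 for p, t in zip(pred, targ))

def ema_update(target_w, online_w, tau=0.99):
    # move the target encoder a small step closer to the online encoder
    return [tau * t + (1 - tau) * o for t, o in zip(target_w, online_w)]

reward = intrinsic_reward(online_w=[1.0], target_w=[0.5], obs=2.0)
```

States the online model predicts poorly yield high intrinsic reward, steering the agent toward unfamiliar parts of the environment; the slow target update keeps that reward signal stable.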

IPC Classes

84.

MACHINE LEARNING SYSTEMS WITH COUNTERFACTUAL INTERVENTIONS

      
Application Number EP2023063488
Publication Number 2023/222884
Status In Force
Filing Date 2023-05-19
Publication Date 2023-11-23
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Rabinowitz, Neil Charles
  • Roy, Nicholas Andrew
  • Kim, Junkyung

Abstract

Systems, methods, and computer programs, for training and using a machine learning system to control an agent to perform a task. The machine learning system is trained using counterfactual internal states so that it can provide an output that explains the behavior of the system in causal terms, e.g. in terms of aspects of its environment that cause the system to select particular actions for the agent.

IPC Classes

  • G06N 3/008 - Artificial life, i.e. computing arrangements simulating life based on physical entities controlled by simulated intelligence so as to replicate intelligent life forms, e.g. based on robots replicating pets or humans in their appearance or behaviour
  • G06N 3/044 - Recurrent networks, e.g. Hopfield networks
  • G06N 3/045 - Combinations of networks
  • G06N 3/084 - Backpropagation, e.g. using gradient descent

85.

LARGE-SCALE RETRIEVAL AUGMENTED REINFORCEMENT LEARNING

      
Application Number EP2023063492
Publication Number 2023/222885
Status In Force
Filing Date 2023-05-19
Publication Date 2023-11-23
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Humphreys, Peter Conway
  • Guez, Arthur Clement

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for controlling a reinforcement learning agent in an environment to perform a task. In one aspect, a method comprises: maintaining a retrieval dataset that stores a plurality of history observations and, for each history observation, a respective associated context; receiving a current observation characterizing a current state of the environment; selecting one or more history observations from the plurality of history observations; processing, using an encoder neural network and in accordance with current values of encoder network parameters, an encoder network input comprising (i) the current observation and (ii) the one or more selected history observations and their respective associated context to generate a latent state representation for the current state of the environment; and using the latent state representation to determine an action to be performed by the agent in response to the current observation.
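A sketch of the retrieval step only, under the assumption of a simple nearest-neighbour lookup over the history dataset (a large-scale system would use an approximate nearest-neighbour index; the names here are illustrative):

```python
def retrieve(current_obs, dataset, k=2):
    # select the k history observations closest to the current observation;
    # each dataset entry is (history_observation, associated_context)
    def dist(entry):
        obs, _context = entry
        return sum((a - b) ** 2 for a, b in zip(obs, current_obs))
    return sorted(dataset, key=dist)[:k]

history = [
    ([0.0, 0.0], "ctx-a"),
    ([1.0, 1.0], "ctx-b"),
    ([5.0, 5.0], "ctx-c"),
]
neighbours = retrieve([0.9, 1.1], history)
```

The retrieved observations and their contexts would then be concatenated with the current observation as the encoder network input described in the abstract.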

IPC Classes

  • G06F 16/908 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
  • G06N 3/02 - Neural networks
  • G06N 20/00 - Machine learning

86.

CONTRASTIVE LEARNING USING POSITIVE PSEUDO LABELS

      
Application Number EP2023063496
Publication Number 2023/222889
Status In Force
Filing Date 2023-05-19
Publication Date 2023-11-23
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Mitrovic, Jovana
  • Bosnjak, Matko
  • Richemond, Pierre
  • Tomasev, Nenad
  • Strub, Florian
  • Walker, Jacob Charles
  • Hill, Felix George
  • Buesing, Lars
  • Pascanu, Razvan
  • Blundell, Charles

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a neural network to perform a machine learning task on one or more received inputs by using a hybrid training dataset with a semi-supervised learning technique. The hybrid training dataset includes multiple unlabeled training inputs and multiple labeled training inputs and, in some cases, more unlabeled training inputs than labeled training inputs.
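One semi-supervised pattern consistent with the abstract, sketched minimally: a model trained on the labeled subset assigns "positive pseudo labels" to confident unlabeled inputs, which then augment the training set. The confidence-threshold scheme is an assumption, not the patented method:

```python
def pseudo_label(model, unlabeled, threshold=0.9):
    # keep only unlabeled inputs the model labels with high confidence
    augmented = []
    for x in unlabeled:
        label, confidence = model(x)
        if confidence >= threshold:
            augmented.append((x, label))
    return augmented

# toy model: label = sign of x, confident only far from the boundary
toy_model = lambda x: (int(x > 0), 0.95 if abs(x) > 1 else 0.5)
extra = pseudo_label(toy_model, [2.0, 0.3, -1.5])
```

The pseudo-labeled pairs let the many unlabeled inputs contribute supervised signal alongside the smaller labeled set.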

IPC Classes

  • G06N 3/045 - Combinations of networks
  • G06N 3/0464 - Convolutional networks [CNN, ConvNet]
  • G06N 3/084 - Backpropagation, e.g. using gradient descent
  • G06N 3/0895 - Weakly supervised learning, e.g. semi-supervised or self-supervised learning

87.

TRAINING REINFORCEMENT LEARNING AGENTS USING AUGMENTED TEMPORAL DIFFERENCE LEARNING

      
Application Number 18029979
Status Pending
Filing Date 2021-10-01
First Publication Date 2023-11-23
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Gulcehre, Caglar
  • Pascanu, Razvan
  • Gomez, Sergio

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network used to select actions performed by an agent interacting with an environment by performing actions that cause the environment to transition states. One of the methods includes maintaining a replay memory storing a plurality of transitions; selecting a plurality of transitions from the replay memory; and training the neural network on the plurality of transitions, comprising, for each transition: generating an initial Q value for the transition; determining a scaled Q value for the transition; determining a scaled temporal difference learning target for the transition; determining an error between the scaled temporal difference learning target and the scaled Q value; determining an update to the current values of the Q network parameters; and determining an update to the current value of the scaling term.
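A minimal numeric sketch of the scaled temporal-difference error described above: both the Q value and the TD target are divided by a scaling term before the error is taken. The tabular Q values and the scalar `sigma` are illustrative stand-ins for the Q network and the learned scaling term:

```python
def scaled_td_error(q, transition, sigma, gamma=0.99):
    # transition = (state, action, reward, next_state)
    s, a, r, s_next = transition
    scaled_q = q[(s, a)] / sigma
    # scaled TD target: bootstrap from the best next-state Q value
    target = (r + gamma * max(q[(s_next, b)] for b in (0, 1))) / sigma
    return target - scaled_q

q = {("s0", 0): 1.0, ("s0", 1): 0.0, ("s1", 0): 2.0, ("s1", 1): 0.0}
err = scaled_td_error(q, ("s0", 0, 1.0, "s1"), sigma=2.0)
```

In the full method this error drives updates to both the Q network parameters and the scaling term itself.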

IPC Classes

  • G06N 3/092 - Reinforcement learning
  • G06N 3/0442 - Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]

88.

TRAINING MACHINE LEARNING MODELS BY DETERMINING UPDATE RULES USING NEURAL NETWORKS

      
Application Number 18180754
Status Pending
Filing Date 2023-03-08
First Publication Date 2023-11-23
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Denil, Misha Man Ray
  • Schaul, Tom
  • Andrychowicz, Marcin
  • Gomes De Freitas, Joao Ferdinando
  • Colmenarejo, Sergio Gomez
  • Hoffman, Matthew William
  • Pfau, David Benjamin

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training machine learning models. One method includes obtaining a machine learning model, wherein the machine learning model comprises one or more model parameters, and the machine learning model is trained using gradient descent techniques to optimize an objective function; determining an update rule for the model parameters using a recurrent neural network (RNN); and applying a determined update rule for a final time step in a sequence of multiple time steps to the model parameters.
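A sketch of the learned-optimizer idea: instead of a hand-designed rule like plain SGD, an update function maps each gradient (plus a recurrent state) to a parameter update, unrolled over time steps. The linear recurrence here is a toy stand-in for the RNN:

```python
def learned_update(grad, hidden, scale=0.1, decay=0.9):
    # toy recurrent "update rule": momentum-like state, stands in for an RNN
    hidden = decay * hidden + grad
    return -scale * hidden, hidden  # (update to apply, new recurrent state)

theta, hidden = 1.0, 0.0
for _ in range(3):                  # unroll a few optimisation steps
    grad = 2 * theta                # gradient of f(theta) = theta**2
    step, hidden = learned_update(grad, hidden)
    theta += step
```

In the actual approach the update rule's own parameters are themselves trained, so that the unrolled optimizer makes fast progress on the objective.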

IPC Classes

  • G06N 3/084 - Backpropagation, e.g. using gradient descent
  • G06N 3/044 - Recurrent networks, e.g. Hopfield networks
  • G06N 3/045 - Combinations of networks

89.

RESOURCE NAVIGATION USING NEURAL NETWORKS

      
Application Number EP2023063486
Publication Number 2023/222882
Status In Force
Filing Date 2023-05-19
Publication Date 2023-11-23
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Marino, Kenneth Daniel
  • Zaheer, Manzil
  • Fergus, Robert David
  • Grathwohl, Will S.

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for resource navigation using neural networks.

IPC Classes

90.

DETERMINING GENERALIZED EIGENVECTORS USING MULTI-AGENT INTERACTIONS

      
Application Number EP2023063487
Publication Number 2023/222883
Status In Force
Filing Date 2023-05-19
Publication Date 2023-11-23
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Gemp, Ian Michael
  • Mcwilliams, Brian
  • Chen, Charlie Xiangyu

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining generalized eigenvectors that characterize a data set.

IPC Classes

91.

INTRA-AGENT SPEECH TO FACILITATE TASK LEARNING

      
Application Number EP2023063494
Publication Number 2023/222887
Status In Force
Filing Date 2023-05-19
Publication Date 2023-11-23
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Yan, Chen
  • Carnevale, Federico Javier
  • Georgiev, Petko Ivanov
  • Santoro, Adam Anthony
  • Guy, Aurelia Adrianna
  • Muldal, Alistair Michael
  • Hung, Chia-Chun
  • Abramson, Joshua Simon
  • Lillicrap, Timothy Paul
  • Wayne, Gregory Duncan

Abstract

Systems, methods, and computer programs for learning to control an embodied agent to perform tasks. The techniques use internal, "intra-agent" speech when learning, and are thus able to perform tasks involving new objects without any direct experience of interacting with those objects, i.e. zero-shot. Implementations of the techniques use an image captioning neural network system to generate natural language captions used when training an action selection neural network system.

IPC Classes

  • G06N 3/008 - Artificial life, i.e. computing arrangements simulating life based on physical entities controlled by simulated intelligence so as to replicate intelligent life forms, e.g. based on robots replicating pets or humans in their appearance or behaviour
  • G06N 3/0455 - Auto-encoder networks; Encoder-decoder networks
  • G06N 3/044 - Recurrent networks, e.g. Hopfield networks
  • G06N 5/045 - Explanation of inference; Explainable artificial intelligence [XAI]; Interpretable artificial intelligence
  • G06N 3/096 - Transfer learning

92.

SELECTION-INFERENCE NEURAL NETWORK SYSTEMS

      
Application Number 18317878
Status Pending
Filing Date 2023-05-15
First Publication Date 2023-11-16
Owner DeepMind Technologies Limited (United Kingdom)
Inventor Creswell, Antonia Phoebe Nina

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a response to a query input using a selection-inference neural network.

IPC Classes

  • B60W 50/06 - Improving the dynamic response of the control system, e.g. improving the speed of regulation or avoiding hunting or overshoot
  • G05B 13/02 - Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
  • G06F 40/20 - Natural language analysis
  • B60W 60/00 - Drive control systems specially adapted for autonomous road vehicles
  • B60W 50/02 - Ensuring safety in case of control system failures, e.g. by diagnosing, circumventing or fixing failures

93.

VARIABLE RESOLUTION VARIABLE FRAME RATE VIDEO CODING USING NEURAL NETWORKS

      
Application Number EP2023062431
Publication Number 2023/217867
Status In Force
Filing Date 2023-05-10
Publication Date 2023-11-16
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Assael, Ioannis Alexandros
  • Shillingford, Brendan

Abstract

Systems and methods for encoding video, and for decoding video at an arbitrary temporal and/or spatial resolution. The techniques use a scene representation neural network that, in implementations, is configured to represent frames of a 2D or 3D video as a 3D model encoded in the parameters of the neural network.

IPC Classes

  • G06N 3/04 - Architecture, e.g. interconnection topology
  • H04N 19/31 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the temporal domain
  • H04N 19/33 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the spatial domain

94.

NEGOTIATING CONTRACTS FOR AGENT COOPERATION IN MULTI-AGENT SYSTEMS

      
Application Number EP2023062432
Publication Number 2023/217868
Status In Force
Filing Date 2023-05-10
Publication Date 2023-11-16
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor
  • Bachrach, Yoram
  • Tacchetti, Andrea
  • Gemp, Ian Michael
  • Kramár, János
  • Malinowski, Mateusz
  • Mckee, Kevin Robert

Abstract

Methods, systems and apparatus, including computer programs encoded on computer storage media, for enabling agents to cooperate with one another in a way that improves their collective efficiency. The agents can modify their behavior by taking into account the behavior of other agents, so that a better overall result can be achieved than if each agent acted independently. This is done by enabling the agents to negotiate contracts with one another that restrict their respective actions.

IPC Classes

  • G06Q 10/04 - Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
  • G06N 3/02 - Neural networks

95.

CONSTRAINED REINFORCEMENT LEARNING NEURAL NETWORK SYSTEMS USING PARETO FRONT OPTIMIZATION

      
Application Number 18029992
Status Pending
Filing Date 2021-10-01
First Publication Date 2023-11-16
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Huang, Sandy Han
  • Abdolmaleki, Abbas

Abstract

A system and method that controls an agent to perform a task subject to one or more constraints. The system trains a preference neural network that learns which preferences produce constraint-satisfying action selection policies. Thus the system optimizes a hierarchical policy that is a product of a preference policy and a preference-conditioned action selection policy. Thus the system learns to jointly optimize a set of objectives relating to rewards and costs received during the task whilst also learning preferences, i.e. trade-offs between the rewards and costs, that are most likely to produce policies that satisfy the constraints.

IPC Classes

96.

SELECTION-INFERENCE NEURAL NETWORK SYSTEMS

      
Application Number EP2023062781
Publication Number 2023/218040
Status In Force
Filing Date 2023-05-12
Publication Date 2023-11-16
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor Creswell, Antonia Phoebe Nina

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a response to a query input using a selection-inference neural network.

IPC Classes

  • G06N 3/045 - Combinations of networks
  • G06F 40/20 - Natural language analysis
  • G06N 5/02 - Knowledge representation; Symbolic representation
  • G06N 3/042 - Knowledge-based neural networks; Logical representations of neural networks
  • G06F 40/35 - Discourse or dialogue representation
  • G06N 5/045 - Explanation of inference; Explainable artificial intelligence [XAI]; Interpretable artificial intelligence
  • G06N 3/096 - Transfer learning
  • G06N 5/046 - Forward inferencing; Production systems

97.

SIMULATING PHYSICAL ENVIRONMENTS USING GRAPH NEURAL NETWORKS

      
Application Number 18027174
Status Pending
Filing Date 2021-10-01
First Publication Date 2023-11-09
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Sanchez, Alvaro
  • Godwin, Jonathan William
  • Ying, Rex
  • Pfaff, Tobias
  • Fortunato, Meire
  • Battaglia, Peter William

Abstract

This specification describes a simulation system that performs simulations of physical environments using a graph neural network. At each of one or more time steps in a sequence of time steps, the system can process a representation of a current state of the physical environment at the current time step using the graph neural network to generate a prediction of a next state of the physical environment at the next time step. Some implementations of the system are adapted for hardware acceleration. As well as performing simulations, the system can be used to predict physical quantities based on measured real-world data. Implementations of the system are differentiable and can also be used for design optimization, and for optimal control tasks.
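The autoregressive simulation loop described above can be sketched minimally: a learned one-step model maps the current state to the next state and is applied repeatedly over time steps. The toy dynamics below stand in for the graph neural network:

```python
def step_model(state):
    # toy one-step dynamics: each node relaxes toward the mean of all nodes
    mean = sum(state) / len(state)
    return [x + 0.5 * (mean - x) for x in state]

def rollout(initial_state, num_steps):
    # apply the one-step model autoregressively to simulate a trajectory
    state = list(initial_state)
    trajectory = [state]
    for _ in range(num_steps):
        state = step_model(state)
        trajectory.append(state)
    return trajectory

traj = rollout([0.0, 2.0], num_steps=3)
```

Because each predicted state feeds the next prediction, one-step errors compound over the rollout, which is why such simulators are typically trained to be robust to their own outputs.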

IPC Classes

  • G06F 30/27 - Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model

98.

DATA COMPRESSION AND RECONSTRUCTION USING SPARSE META-LEARNED NEURAL NETWORKS

      
Application Number EP2023061711
Publication Number 2023/213903
Status In Force
Filing Date 2023-05-03
Publication Date 2023-11-09
Owner DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor Schwarz, Jonathan

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for compressing and decompressing data signals using sparse, meta-learned neural networks.

IPC Classes

  • G06N 3/045 - Combinations of networks
  • G06N 3/0495 - Quantised networks; Sparse networks; Compressed networks
  • G06N 3/082 - Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
  • G06N 3/084 - Backpropagation, e.g. using gradient descent
  • G06N 3/0985 - Hyperparameter optimisation; Meta-learning; Learning-to-learn

99.

TRAINING PROTEIN STRUCTURE PREDICTION NEURAL NETWORKS USING REDUCED MULTIPLE SEQUENCE ALIGNMENTS

      
Application Number 18025689
Status Pending
Filing Date 2021-08-12
First Publication Date 2023-11-09
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Evans, Richard Andrew
  • Jumper, John
  • Green, Timothy Frederick Goldie
  • Reiman, David

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training neural networks to predict the structure of a protein. In one aspect, a method comprises: obtaining, for each of a plurality of proteins, a full multiple sequence alignment for the protein; generating, for each of the plurality of proteins, target structure parameters characterizing a structure of the protein from the full multiple sequence alignment for the protein, comprising processing a representation of the full multiple sequence alignment for the protein using the structure prediction neural network to generate output structure parameters characterizing a structure of the protein, and determining the target structure parameters for the protein based on the output structure parameters for the protein; determining, for each of the plurality of proteins, a reduced multiple sequence alignment for the protein, comprising removing or masking data from the full multiple sequence alignment for the protein.
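The "reduced multiple sequence alignment" step can be sketched as dropping rows from the full alignment while keeping the query sequence; masking rows or columns with a sentinel would work similarly. The sampling scheme here is an assumption:

```python
import random

def reduce_msa(full_msa, keep_fraction, rng):
    # build a reduced MSA by removing rows from the full alignment;
    # the first row (the query sequence) is always retained
    query = full_msa[0]
    rest = full_msa[1:]
    keep = max(0, int(len(rest) * keep_fraction))
    return [query] + rng.sample(rest, keep)

msa = ["MKV", "MRV", "MKL", "MQV", "MKV"]
reduced = reduce_msa(msa, keep_fraction=0.5, rng=random.Random(0))
```

Training the network to reproduce, from the reduced alignment, the target structure parameters derived from the full alignment teaches it to cope with shallow alignments.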

IPC Classes

100.

LANGUAGE MODEL FOR PROCESSING A MULTI-MODE QUERY INPUT

      
Application Number 18141337
Status Pending
Filing Date 2023-04-28
First Publication Date 2023-11-02
Owner DeepMind Technologies Limited (United Kingdom)
Inventor
  • Alayrac, Jean-Baptiste
  • Donahue, Jeffrey
  • Lenc, Karel
  • Simonyan, Karen
  • Reynolds, Malcolm Kevin Campbell
  • Luc, Pauline
  • Mensch, Arthur
  • Barr, Iain
  • Miech, Antoine
  • Hasson, Yana Elizabeth
  • Millican, Katherine Elizabeth
  • Ring, Roman

Abstract

A query processing system is described which receives a query input comprising an input token string and also at least one data item having a second, different modality, and generates a corresponding output token string.

IPC Classes
